Computer exercise.

The following dataset comes from the Better Life Index published by the OECD in 2015 (see the OECD website). For the 34 countries of the OECD, plus Russia and Brazil, we have 22 measures, which can be divided into 5 large groups for quantifying 'quality of life': Material well-being, Jobs, Happiness, Health and safety, and Education. The data can be found here. For more details on the variables, please look at the definitions given by the OECD (in this folder).
We would like to compare quality of life in the 36 countries, taking into account the five large groups of quality of life measures.
When importing the data, use the argument check.names=FALSE (to prevent spaces in the variable names from being replaced by dots).
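As a small illustration of what check.names does, the sketch below reads a tiny made-up table from a string (the column names and values are invented for the example and are not the OECD data; the separator ";" is also an assumption about the file format).

```r
# Toy table with spaces in the column names (invented values, for illustration only)
csv_text <- "Country;Life expectancy;Job security\nA;1;2\nB;3;4"

# Default behaviour: names are made syntactically valid, spaces become dots
with_check <- read.table(text = csv_text, header = TRUE, sep = ";",
                         check.names = TRUE)

# check.names = FALSE keeps the names exactly as written in the file
without_check <- read.table(text = csv_text, header = TRUE, sep = ";",
                            check.names = FALSE)

names(with_check)
names(without_check)
```

Keeping the original names makes the variables plot easier to read, but columns with spaces must then be accessed with backticks or with quoted names.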

Q1) After thinking about the definition of the groups of variables, the status of each group (active or supplementary), and whether or not to normalize the data in each group, run the Multiple Factor Analysis and give the percentage of inertia in the 1-2 plane.
35.18
42.12
51.24
72.18
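To show where this percentage is read in the output, here is a hedged sketch that runs MFA() from the FactoMineR package on the wine dataset shipped with that package. It is a stand-in, not the OECD data: the group sizes, types and group names below belong to the wine example and would have to be replaced by your own choices for the 22 OECD measures. The object name res.wine is illustrative.

```r
library(FactoMineR)

data(wine)  # stand-in dataset: 21 wines, 31 variables in 6 consecutive groups

res.wine <- MFA(wine,
                group = c(2, 5, 3, 10, 9, 2),   # number of columns in each group
                type = c("n", rep("s", 5)),     # "n" = categorical, "s" = scaled continuous
                name.group = c("orig", "olf", "vis", "olfag", "gust", "ens"),
                num.group.sup = c(1, 6),        # groups 1 and 6 are supplementary
                graph = FALSE)

# Column 2 of res.wine$eig holds the percentage of inertia of each dimension
pct_plane_12 <- sum(res.wine$eig[1:2, 2])
round(pct_plane_12, 2)
```

The same value appears in the third column of res.wine$eig (cumulative percentage) on the second row.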

Q2) Which two groups of variables are the most related?
Education - Happiness
Education - Material well-being
Material well-being - Health and safety
Education - Health and safety
Material well-being - Jobs
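One way to compare groups pairwise is the RV coefficient, which FactoMineR stores in the MFA result. The sketch below again uses the package's wine dataset as a stand-in (groups and names are those of the wine example, not of the OECD data).

```r
library(FactoMineR)

data(wine)
res.wine <- MFA(wine, group = c(2, 5, 3, 10, 9, 2),
                type = c("n", rep("s", 5)),
                name.group = c("orig", "olf", "vis", "olfag", "gust", "ens"),
                num.group.sup = c(1, 6), graph = FALSE)

# Matrix of RV coefficients between groups: values close to 1 indicate
# groups that induce similar structures on the individuals
rv <- res.wine$group$RV
round(rv, 2)
```

The groups plot (choix = "group" in plot()) gives a complementary visual summary.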

Q3) Which group contributes the most to the construction of the 2nd dimension of the MFA?
Material well-being
Jobs
Happiness
Health and safety
Education
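Group contributions to each dimension are available in the MFA result. A minimal sketch, assuming FactoMineR is installed and using its wine dataset as a stand-in for the OECD data:

```r
library(FactoMineR)

data(wine)
res.wine <- MFA(wine, group = c(2, 5, 3, 10, 9, 2),
                type = c("n", rep("s", 5)),
                name.group = c("orig", "olf", "vis", "olfag", "gust", "ens"),
                num.group.sup = c(1, 6), graph = FALSE)

# Contributions (in %) of the active groups to the 2nd dimension
contrib_dim2 <- res.wine$group$contrib[, 2]
round(contrib_dim2, 2)

# Name of the active group with the largest contribution
names(which.max(contrib_dim2))
```

Only active groups appear in this table, since supplementary groups do not take part in building the dimensions.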

Q4) Tick all true statements (with the help of the individuals and variables plots if necessary):
Spain and Greece have high job security and high long-term unemployment rates
The first axis puts countries like Denmark, where quality of life is high, to the right, and countries like Brazil, where the quality of life is lower, to the left
France is an average country in terms of the quality of life criteria used in this survey
Poland and Italy, close to each other in the individuals plot, are similar with respect to the set of groups of variables

Q5) Plot the graph of partial points and show only the partial points for France and Austria. With the help of the variables plot, select which statements are true:
In terms of Material well-being, France and Austria are similar
In terms of Jobs, France and Austria are similar
In terms of Health and safety, France and Austria are similar
The Education indices are very good for France
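The partial argument of plot() restricts the partial points to chosen individuals. The sketch below uses the wine stand-in from FactoMineR and simply takes its first two individuals; for the exercise, the vector would instead contain the row names of the two countries as they appear in your data.

```r
library(FactoMineR)

data(wine)
res.wine <- MFA(wine, group = c(2, 5, 3, 10, 9, 2),
                type = c("n", rep("s", 5)),
                name.group = c("orig", "olf", "vis", "olfag", "gust", "ens"),
                num.group.sup = c(1, 6), graph = FALSE)

# Two individuals chosen arbitrarily for the illustration
two_ind <- rownames(wine)[1:2]

# Individuals plot with partial points drawn only for these two,
# colored by group
plot(res.wine, choix = "ind", partial = two_ind, habillage = "group")
```

Each partial point shows where the individual would sit if only one group of variables were taken into account, so two individuals are similar for a group when their partial points for that group are close.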

Q6) By playing with the colors, we can show the partial points of just one group. To do this, we select the 'transparent' color for all other groups (the color 'black' corresponds to the color of the average point). For example, if we want to color the partial points of the 1st group red, we write:
plot(res.afm,partial="all", hab="group", cex=.7, palette = palette(c("black", "red", "transparent", "transparent", "transparent", "transparent")), ylim=c(-4,4))
Using this, show only the partial points of the Happiness group for all the countries, then tick each of the following statements if they are true:

In Denmark, the people are generally happy
Koreans are less happy than we would expect, given their quality of life indices (Jobs, Health, Material well-being and Education)
The Spanish are less happy than we would expect, given their quality of life indices (Jobs, Health, Material well-being and Education)
Koreans are generally happier than Brazilians
Israelis have essentially the same level of happiness as Americans

Q7) If we ran a PCA for each group of variables, for which groups would the 1st dimension of the PCA have a positive correlation (and greater than 0.5) with the 1st dimension of the MFA?
Material well-being
Jobs
Happiness
Health and safety
Education
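The correlations between the dimensions of the separate analyses (partial axes) and the MFA dimensions are stored in the MFA result, so the separate PCAs do not have to be rerun by hand. A hedged sketch on the wine stand-in from FactoMineR follows; it assumes the partial axes are named "Dim1.<group name>", which is how they are labeled on the partial axes plot.

```r
library(FactoMineR)

data(wine)
res.wine <- MFA(wine, group = c(2, 5, 3, 10, 9, 2),
                type = c("n", rep("s", 5)),
                name.group = c("orig", "olf", "vis", "olfag", "gust", "ens"),
                num.group.sup = c(1, 6), graph = FALSE)

# Rows: partial axes of each group; columns: MFA dimensions
cor_axes <- res.wine$partial.axes$cor

# Keep the 1st partial axis of every group and its correlation with MFA dimension 1
is_first_axis <- grepl("^Dim1\\.", rownames(cor_axes))
cor_first <- cor_axes[is_first_axis, 1, drop = FALSE]
round(cor_first, 2)

# Groups whose 1st partial axis has a correlation above 0.5 with MFA dimension 1
rownames(cor_first)[cor_first[, 1] > 0.5]
```

The same information can be viewed with plot(res.wine, choix = "axes").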

Q8) Interpreting the data.
There is no correction provided for this question.
