rasa.sayange36

07-17-2018, 10:21 AM

Hi there,

I have a question about interpreting PCA plots from the Eurogenes site (Global 25). Is dividing the eigenvalue of PC1 by the sum of the eigenvalues (PC1 to PC25) equal to the proportion of variance explained by PC1 relative to the total variance?

Here is an example using the Global 25 data. From the site, the eigenvalues for PC1 to PC25 are as follows:

129.557, 103.13, 14.222, 10.433, 9.471, 7.778, 5.523, 5.325, 4.183, 3.321, 2.637, 2.246, 2.21, 1.894, 1.842, 1.758, 1.7, 1.605, 1.58, 1.564, 1.557, 1.529, 1.519, 1.452, 1.434

Total sum = 319.47

So, the proportion for PC1 is 129.557/319.47 = 0.4055 (40.55%), and for PC2 it is 103.13/319.47 = 0.3228 (32.28%). Hence, a PC1 vs. PC2 plot would explain 72.84% of the total variance, while a 3-dimensional PC1 vs. PC2 vs. PC3 plot would explain 77.29% of the total variance.
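To make the arithmetic concrete, here is a small Python sketch of what I'm computing. The variable names are my own, and it normalizes by the sum of the 25 listed values only, which is my assumption about what "total" means here.

```python
# Values copied from the list above (Global 25, PC1 to PC25).
pc_values = [
    129.557, 103.13, 14.222, 10.433, 9.471, 7.778, 5.523, 5.325,
    4.183, 3.321, 2.637, 2.246, 2.21, 1.894, 1.842, 1.758, 1.7,
    1.605, 1.58, 1.564, 1.557, 1.529, 1.519, 1.452, 1.434,
]

# Assumption: "total" is the sum over these 25 PCs only.
total = sum(pc_values)

# Share of each PC relative to that sum.
shares = [v / total for v in pc_values]

# Running (cumulative) share for PC1, PC1+PC2, PC1+PC2+PC3, ...
cumulative = []
running = 0.0
for s in shares:
    running += s
    cumulative.append(running)

for i in range(3):
    print(f"PC{i + 1}: share={shares[i]:.4f}, cumulative={cumulative[i]:.4f}")
```

If any further PCs exist beyond the 25 published ones, these shares would be relative to the 25-PC sum rather than to the full variance.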

For instance, in the PC1 vs. PC2 plot, all Papuan populations fall somewhere in between the East Asian and European clusters. This is obviously misleading, as Papuans are not a mixture of the two. It is not until PC3, PC4, PC5, and PC6 that the Papuan populations form their own separate cluster. Since the eigenvalues of PC3 to PC6 account for 13.12% of the sum, it follows that the distinction between Papuans and non-Africans is around 13.12% of the total variance in human populations.
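For reference, this is how I got the 13.12% figure, again as a rough Python sketch with my own variable names. It only reproduces the arithmetic; whether that share can be attributed to the Papuan distinction alone is exactly the part I'm unsure about.

```python
# Values copied from the list above (Global 25, PC1 to PC25).
pc_values = [
    129.557, 103.13, 14.222, 10.433, 9.471, 7.778, 5.523, 5.325,
    4.183, 3.321, 2.637, 2.246, 2.21, 1.894, 1.842, 1.758, 1.7,
    1.605, 1.58, 1.564, 1.557, 1.529, 1.519, 1.452, 1.434,
]

# PC3 to PC6 are list indices 2..5 (Python slices exclude the end index).
pc3_to_pc6 = sum(pc_values[2:6])

# Share relative to the sum of the 25 listed values (my assumption for "total").
share_pc3_to_pc6 = pc3_to_pc6 / sum(pc_values)

print(f"PC3 to PC6 share: {share_pc3_to_pc6:.4f}")
```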

Is my understanding correct?

Thanks.
