Dimensionality Reduction, the subtle art of Principal Component Analysis (PCA)
We are used to living in a 2D or 3D world, and we feel comfortable with it.
It's not difficult, even for a child, to read a 2D or 3D plot without any special training.
![](https://steemitimages.com/DQmao3qfNkqSsNp2mN1RVtipVMPVPb3DwfWXpFJPXQT99R3/image.png)
But how can we represent 5 or 10 dimensions?
Let's take an example, this flower called iris:
![](https://steemitimages.com/640x0/https://steemitimages.com/DQmSNnpYnjLeCZFaoVwxGHZeyMZiDdcuE2kQtHD1pnyf3tH/image.png)
We could measure multiple parameters: the height, width and color of each individual segment... The same goes for the leaves and for the whole plant. So it's easy to end up with a dataset of 100 samples (plants) and 10 dimensions (parameters for each plant).
How can we imagine a 10-dimensional plot?
Coordinates, X-Y-Z, those are 3...
We could use different colors for the 4th dimension.
A different temperature for the 5th.
Or a different texture (smooth vs rough) for the 6th dimension...
You can name the other 4 dimensions yourself (smell, moisture, solidity...).
But yes, I agree that it's impossible to do anything useful with such a representation.
How can we reduce the dimensionality?
Imagine the simple XY scatter-plot, like this one:
![](https://steemitimages.com/DQmczRwBUyZfeRS4kidG81zentz5Sf4Nh9GYBt9QT8WmEtb/image.png)
Each dot can be represented with two coordinates, X and Y.
But...
We could rotate the axes so that the new X-axis runs along the direction in which the dots are spread out the most.
In that case, we will have high variability along the new X-axis and very little variability along the new Y-axis.
In other words, if we keep only the new X-axis, the dimensionality is reduced from 2 dimensions to 1, and we lose very little information.
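To make the rotation idea concrete, here is a minimal sketch in Python with NumPy. The data are made up for illustration (dots scattered around a straight line; this is not the data from the plot above), and the variable names are my own. It rotates the axes using the eigenvectors of the covariance matrix, which is one common way to compute PCA, and then checks how much of the variability ends up on each new axis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data (an assumption for illustration): dots scattered around y = 0.5 * x
x = rng.normal(0.0, 3.0, size=200)
y = 0.5 * x + rng.normal(0.0, 0.3, size=200)
points = np.column_stack([x, y])          # shape (200, 2)

# Center the dots, then find the directions of largest variability
centered = points - points.mean(axis=0)
cov = np.cov(centered, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)   # eigh returns ascending order

# Sort so that the first new axis carries the most variability
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# "Rotate the axes": coordinates of every dot in the new axes
rotated = centered @ eigenvectors

# Share of the total variability along each new axis
explained = eigenvalues / eigenvalues.sum()
print("share of variability per new axis:", explained.round(3))

# Keep only the new X-axis: 2 dimensions -> 1 dimension
reduced = rotated[:, :1]
print("shape before:", points.shape, "shape after:", reduced.shape)
```

With data like these, almost all of the variability should land on the first new axis, so dropping the second one costs very little. With a round, shapeless cloud of dots the two shares would be close to each other, and reducing to 1 dimension would throw away a lot.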
Why do we need this?
Let's look at an example from my old paper.
We wanted to see how the elements are distributed in plants.
So we did XRF spectroscopy with imaging and got the data for various elements:
![](https://steemitimages.com/640x0/https://steemitimages.com/DQmZhQnaVvGdJokkzhsiNPJbaLY7rHacZFfSEhmPYfCbAia/image.png)
But how are those elements connected?
Do some elements appear together?
We can see that K and Cl appear together, as do P and Ca, and also Mn, Cu and Zn, while Fe is the outsider.
And we can observe this from a 2D plot, although initially we had an 8-dimensional dataset.
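Here is a rough sketch of how such a grouping can show up in the first two principal components. The numbers are entirely synthetic: I invent two hidden factors and give each element a made-up direction, chosen only so that the groups mentioned above sit together. It is not the data or the exact procedure from the paper. Only the idea is the same: 8 variables go in, and 2 coordinates per element come out.

```python
import numpy as np

rng = np.random.default_rng(1)
n_pixels = 500

# Made-up directions (in degrees) for each element; elements that share a
# direction are built to vary together. These values are assumptions.
angles_deg = {"K": 20, "Cl": 20, "P": 100, "Ca": 100,
              "Mn": 200, "Cu": 200, "Zn": 200, "Fe": 300}
elements = list(angles_deg)

# Two hypothetical hidden factors plus independent noise for every element
f1, f2 = rng.normal(size=(2, n_pixels))
columns = []
for name in elements:
    theta = np.deg2rad(angles_deg[name])
    signal = np.cos(theta) * f1 + np.sin(theta) * f2
    columns.append(signal + 0.3 * rng.normal(size=n_pixels))
data = np.column_stack(columns)           # shape (500, 8): 8-dimensional dataset

# Standardize every element, then do PCA through the SVD
z = (data - data.mean(axis=0)) / data.std(axis=0)
u, s, vt = np.linalg.svd(z, full_matrices=False)
explained = s**2 / np.sum(s**2)

# Loadings: the position of each element on the 2D plot of PC1 vs PC2
loadings = vt[:2].T * s[:2] / np.sqrt(n_pixels)   # shape (8, 2)

for name, (pc1, pc2) in zip(elements, loadings):
    print(f"{name:>2}: PC1 = {pc1:+.2f}, PC2 = {pc2:+.2f}")
print("variability kept by 2 components:", round(float(explained[:2].sum()), 3))
```

Elements that were built to vary together should end up with nearly the same pair of coordinates, so they would sit next to each other on a 2D scatter plot of the loadings, while Fe gets a spot of its own.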
Similar analyses?
- ICA
- MCR-ALS
- PARAFAC
- ICALab
References
Dučić, Tanja, et al. "Enhancement in statistical and image analysis for in situ µSXRF studies of elemental distribution and co-localization, using Dioscorea balcanica." Journal of Synchrotron Radiation 20.2 (2013): 339-346.
Kaiser, Henry F. "The varimax criterion for analytic rotation in factor analysis." Psychometrika 23.3 (1958): 187-200.