math::PCA(3tcl) Principal Components Analysis math::PCA(3tcl)
______________________________________________________________________________
NAME
math::PCA - Package for Principal Component Analysis
SYNOPSIS
package require Tcl ?8.6?
package require math::linearalgebra 1.0
::math::PCA::createPCA data ?args?
$pca using ?number?|?-minproportion value?
$pca eigenvectors ?option?
$pca eigenvalues ?option?
$pca proportions ?option?
$pca approximate observation
$pca approximatOriginal
$pca scores observation
$pca distance observation
$pca qstatistic observation ?option?
______________________________________________________________________________
DESCRIPTION
The PCA package provides a means to perform principal components
analysis in Tcl, using an object-oriented technique as facilitated by
TclOO. It defines a single public command, ::math::PCA::createPCA,
which constructs an object from the data passed to it. That object
performs the actual analysis.
The methods of the PCA objects created with this command allow one to
examine the principal components, to approximate (new) observations
using either all components or a selected number of them, and to
examine the properties of the components and the statistics of the
approximations.
The package has been modelled after the PCA example provided by the
original linear algebra package by Ed Hume.
COMMANDS
The math::PCA package provides one public command:
::math::PCA::createPCA data ?args?
Create a new object, based on the data that are passed via the
data argument. The principal components may be based on either
correlations or covariances. All observations will be nor-
malised according to the mean and standard deviation of the
original data.
list data
- A list of observations (see the example below).
list args
- A list of key-value pairs defining the options.
Currently there is only one key: -covariances. It
indicates whether covariances are to be used (value 1)
or correlations (value 0). The default is to use
correlations.
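As a minimal sketch of the constructor described above: the data values
are made up for illustration, and the layout (one observation per row,
one variable per column) and the package name math::PCA are assumptions
taken from the wording of this page.

```tcl
package require Tcl 8.6
package require math::PCA

# Made-up data: six observations of three variables, one row each.
set data {
    {7.0 4.0 3.0}
    {4.0 1.0 8.0}
    {6.0 3.0 5.0}
    {8.0 6.0 1.0}
    {8.0 5.0 7.0}
    {7.0 2.0 9.0}
}

# Default: principal components based on correlations.
set pcaCorr [::math::PCA::createPCA $data]

# Same data, components based on covariances instead.
set pcaCov [::math::PCA::createPCA $data -covariances 1]

puts "Components in use (correlations): [$pcaCorr using]"
puts "Components in use (covariances):  [$pcaCov using]"
```

Both objects are independent, so the two variants can be compared side
by side.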
The PCA object that is created has the following methods:
$pca using ?number?|?-minproportion value?
Set the number of components to be used in the analysis (the
number of retained components). Returns the number of retained
components; this also holds if no argument is given.
int number
- The number of components to be retained
double value
- Select the number of components based on the minimum
proportion of variation that is retained by them. Should
be a value between 0 and 1.
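The two ways of choosing the number of retained components might be
used as follows. This is a sketch only: the data are made up and the
threshold 0.9 is an arbitrary illustrative value.

```tcl
package require Tcl 8.6
package require math::PCA

# Made-up data: six observations of three variables, one row each.
set data {
    {7.0 4.0 3.0}
    {4.0 1.0 8.0}
    {6.0 3.0 5.0}
    {8.0 6.0 1.0}
    {8.0 5.0 7.0}
    {7.0 2.0 9.0}
}
set pca [::math::PCA::createPCA $data]

# Retain a fixed number of components.
set fixed [$pca using 2]
puts "Fixed choice: $fixed components"

# Retain as many components as needed to keep at least 90% of the variation.
set byProportion [$pca using -minproportion 0.9]
puts "By proportion: $byProportion components"

# Without arguments the current setting is merely reported.
puts "Currently retained: [$pca using]"
```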
$pca eigenvectors ?option?
Return the eigenvectors as a list of lists.
string option
- By default only the retained components are returned.
If all eigenvectors are required, use the option -all.
$pca eigenvalues ?option?
Return the eigenvalues as a list.
string option
- By default only the eigenvalues of the retained compo-
nents are returned. If all eigenvalues are required, use
the option -all.
$pca proportions ?option?
Return the proportions for all components, that is, the
proportion of the variation that each component can explain.
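The three inspection methods above can be combined as in the following
sketch. The data are made up, and the orientation of the eigenvector
list (which index runs over the components) is not stated on this page,
so the script only prints the raw results.

```tcl
package require Tcl 8.6
package require math::PCA

# Made-up data: six observations of three variables, one row each.
set data {
    {7.0 4.0 3.0}
    {4.0 1.0 8.0}
    {6.0 3.0 5.0}
    {8.0 6.0 1.0}
    {8.0 5.0 7.0}
    {7.0 2.0 9.0}
}
set pca [::math::PCA::createPCA $data]
$pca using 2

# Only the retained components are reported by default.
set retainedValues [$pca eigenvalues]
puts "Retained eigenvalues: $retainedValues"
puts "Retained eigenvectors: [$pca eigenvectors]"

# The option -all reports every component, retained or not.
set allValues [$pca eigenvalues -all]
puts "All eigenvalues: $allValues"
puts "All eigenvectors: [$pca eigenvectors -all]"

# Share of the variation explained by the components.
puts "Proportions: [$pca proportions]"
```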
$pca approximate observation
Return an approximation of the observation based on the retained
components.
list observation
- The values for the observation.
$pca approximatOriginal
Return an approximation of the original data, using the retained
components. It is a convenience method that works on the com-
plete set of original data.
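A sketch of approximating observations one at a time with the
approximate method; looping over the original data in this way is
meant to mirror what the convenience method is described as doing. The
data and the extra observation are made up.

```tcl
package require Tcl 8.6
package require math::PCA

# Made-up data: six observations of three variables, one row each.
set data {
    {7.0 4.0 3.0}
    {4.0 1.0 8.0}
    {6.0 3.0 5.0}
    {8.0 6.0 1.0}
    {8.0 5.0 7.0}
    {7.0 2.0 9.0}
}
set pca [::math::PCA::createPCA $data]
$pca using 2

# Approximate each original observation with the two retained components.
set approximations {}
foreach observation $data {
    lappend approximations [$pca approximate $observation]
}
foreach observation $data approximation $approximations {
    puts "$observation -> $approximation"
}

# A new observation, not part of the original data (hypothetical values).
set newObservation {5.0 3.0 6.0}
set newApproximation [$pca approximate $newObservation]
puts "New: $newObservation -> $newApproximation"
```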
$pca scores observation
Return the scores per retained component for the given observa-
tion.
list observation
- The values for the observation.
$pca distance observation
Return the distance between the given observation and its ap-
proximation. (Note: this distance is based on the normalised
vectors.)
list observation
- The values for the observation.
$pca qstatistic observation ?option?
Return the Q statistic, basically the square of the distance,
for the given observation.
list observation
- The values for the observation.
string option
- If the observation is part of the original data, you
may want to use the corrected Q statistic. This is
achieved with the option "-original".
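The diagnostic methods above might be used together as in this sketch.
The data and the new observation are made up; the script prints the
results without assuming particular values.

```tcl
package require Tcl 8.6
package require math::PCA

# Made-up data: six observations of three variables, one row each.
set data {
    {7.0 4.0 3.0}
    {4.0 1.0 8.0}
    {6.0 3.0 5.0}
    {8.0 6.0 1.0}
    {8.0 5.0 7.0}
    {7.0 2.0 9.0}
}
set pca [::math::PCA::createPCA $data]
$pca using 2

# A new observation (hypothetical values): scores, distance and Q statistic.
set newObservation {5.0 3.0 6.0}
set scores   [$pca scores $newObservation]
set distance [$pca distance $newObservation]
set q        [$pca qstatistic $newObservation]
puts "Scores:      $scores"
puts "Distance:    $distance"
puts "Q statistic: $q"

# An observation taken from the original data: corrected Q statistic.
set original   [lindex $data 0]
set qCorrected [$pca qstatistic $original -original]
puts "Corrected Q statistic: $qCorrected"
```

Comparing the Q statistic of a new observation with the values found
for the original data is one way to judge how well the retained
components describe it.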
EXAMPLE
TODO: NIST example
BUGS, IDEAS, FEEDBACK
This document, and the package it describes, will undoubtedly contain
bugs and other problems. Please report such in the category PCA of the
Tcllib Trackers [http://core.tcl.tk/tcllib/reportlist]. Please also
report any ideas for enhancements you may have for either package
and/or documentation.
When proposing code changes, please provide unified diffs, i.e. the
output of diff -u.
Note further that attachments are strongly preferred over inlined
patches. Attachments can be made by going to the Edit form of the
ticket immediately after its creation, and then using the left-most
button in the secondary navigation bar.
KEYWORDS
PCA, math, statistics, tcl
CATEGORY
Mathematics
tcllib 1.0 math::PCA(3tcl)