
Chapter 23
Partial Least Squares Methods: Partial Least Squares Correlation and Partial Least Square Regression
Hervé Abdi and Lynne J. Williams


Abstract

Partial least square (PLS) methods (also sometimes called projection to latent structures) relate the information present in two data tables that collect measurements on the same set of observations. These methods proceed by deriving latent variables which are (optimal) linear combinations of the variables of a data table. When the goal is to find the shared information between two tables, the approach is equivalent to a correlation problem and the technique is then called partial least square correlation (PLSC) (also sometimes called PLS-SVD). In this case there are two sets of latent variables (one set per table), and these latent variables are required to have maximal covariance. When the goal is to predict one data table from the other one, the technique is then called partial least square regression (PLSR). In this case there is one set of latent variables (derived from the predictor table) and these latent variables are required to give the best possible prediction. In this paper we present and illustrate PLSC and PLSR and show how these descriptive multivariate analysis techniques can be extended to deal with inferential questions by using cross-validation techniques such as the bootstrap and the jackknife.

Key words: Partial least square, Projection to latent structure, PLS correlation, PLS-SVD, PLS-regression, Latent variable, Singular value decomposition, NIPALS method, Tucker inter-battery analysis

Introduction

Partial least square (PLS) methods (also sometimes called projection to latent structures) relate the information present in two data tables that collect measurements on the same set of observations. These methods were first developed in the late 1960s to the 1980s by the economist Herman Wold (55, 56, 57), but their main early areas of development were chemometrics (initiated by Herman's son Svante (59)) and sensory evaluation (34, 35). The original approach of Herman Wold was to develop a least square algorithm (called NIPALS (56)) for estimating parameters in path analysis models (instead of the maximum likelihood approach used for structural equation modeling such as, e.g., LISREL). This first approach gave rise to partial least square path modeling (PLS-PM), which is still active today (see, e.g., (26, 48)) and can be seen as a least square alternative for structural equation modeling (which uses, in general, a maximum likelihood estimation approach). From a multivariate descriptive analysis point of view, however, most of the early developments of PLS were concerned with defining a latent variable approach to the analysis of two data tables describing one set of observations. Latent variables are new variables obtained as linear combinations of the original variables.

When the goal is to find the shared information between these two tables, the approach is equivalent to a correlation problem and the technique is then called partial least square correlation (PLSC) (also sometimes called PLS-SVD (31)). In this case there are two sets of latent variables (one set per table), and these latent variables are required to have maximal covariance. When the goal is to predict one data table from the other one, the technique is then called partial least square regression (PLSR, see (4, 16, 20, 42)). In this case there is one set of latent variables (derived from the predictor table) and these latent variables are computed to give the best possible prediction. The latent variables and associated parameters are often called dimensions.
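To make the distinction concrete, here is a minimal sketch of the PLSR side of the family, assuming scikit-learn's off-the-shelf PLSRegression and small randomly generated tables (the data, sizes, and variable names are illustrative, not the chapter's example): the latent variables are derived from the predictor table X alone and are used to predict Y.

    import numpy as np
    from sklearn.cross_decomposition import PLSRegression

    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 5))                                       # predictor table: 20 observations, 5 variables
    Y = X @ rng.normal(size=(5, 2)) + 0.1 * rng.normal(size=(20, 2))   # two variables to be predicted

    pls = PLSRegression(n_components=2)     # two latent variables ("dimensions")
    pls.fit(X, Y)
    Y_hat = pls.predict(X)                  # prediction of Y from the latent variables
    scores = pls.transform(X)               # the latent variables themselves, derived from X only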

So, for example, for PLSC the first set of latent variables is called the first dimension of the analysis. In this chapter we will present PLSC and PLSR and illustrate them with an example; the two techniques and their main goals are described in Fig. 1.

Notations

Data are stored in matrices which are denoted by upper case bold letters (e.g., X). The identity matrix is denoted I. Column vectors are denoted by lower case bold letters (e.g., x). Matrix or vector transposition is denoted by an uppercase superscript T (e.g., X^T). Two bold letters placed next to each other imply matrix or vector multiplication unless otherwise mentioned. The number of rows, columns, or sub-matrices is denoted by an uppercase italic letter (e.g., I) and a given row, column, or sub-matrix is denoted by a lowercase italic letter (e.g., i).

PLS methods analyze the information common to two tables. The first matrix is an I by J matrix denoted X whose generic element is x_{i,j} and where the rows are observations and the columns are variables. For PLSR the X matrix contains the predictor variables (i.e., independent variables). The second matrix is an I by K matrix, denoted Y, whose generic element is y_{i,k}. For PLSR, the Y matrix contains the variables to be predicted (i.e., dependent variables). In general, matrices X and Y are statistically preprocessed in order to make the variables comparable. Most of the time, the columns of X and Y will be rescaled such that the mean of each column is zero and its norm (i.e., the square root of the sum of its squared elements) is one.
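A minimal sketch of this rescaling in Python/NumPy (the function name, sizes, and random toy data are illustrative assumptions, not part of the chapter):

    import numpy as np

    def rescale_columns(M):
        """Center each column to mean zero and divide it by its norm
        (the square root of the sum of its squared elements)."""
        M = np.asarray(M, dtype=float)
        M = M - M.mean(axis=0)                  # each column now has mean zero
        return M / np.linalg.norm(M, axis=0)    # each column now has norm one

    # Illustrative tables: I = 10 observations, J = 4 and K = 3 variables
    rng = np.random.default_rng(0)
    X = rescale_columns(rng.normal(size=(10, 4)))
    Y = rescale_columns(rng.normal(size=(10, 3)))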

When we need to mark the difference between the original data and the preprocessed data, the original data matrices will be denoted X and Y, and the rescaled data matrices will be written with a separate notation.

The Main Tool: The Singular Value Decomposition

The main analytical tool for PLS is the singular value decomposition (SVD) of a matrix (see (3, 21, 30, 47) for details and tutorials). Recall that the SVD of a given J by K matrix Z decomposes it into three matrices as:

Z = U D V^T = sum_{l=1}^{L} d_l u_l v_l^T    (1)

where U is the J by L matrix of the normalized left singular vectors (with L being the rank of Z), V is the K by L matrix of the normalized right singular vectors, and D is the L by L diagonal matrix of the L singular values. Also, d_l, u_l, and v_l are, respectively, the l-th singular value, left singular vector, and right singular vector.
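Equation (1) can be checked numerically. The following Python/NumPy sketch, with an arbitrary small matrix standing in for Z, recovers Z both from the three-matrix product and from the sum of rank-one terms:

    import numpy as np

    rng = np.random.default_rng(1)
    Z = rng.normal(size=(5, 3))                  # stands in for a J by K matrix Z

    U, d, Vt = np.linalg.svd(Z, full_matrices=False)
    L = np.linalg.matrix_rank(Z)                 # L = rank of Z (here 3)

    # Three-matrix form: Z = U D V^T, with D = diag(d)
    assert np.allclose(Z, U @ np.diag(d) @ Vt)

    # Sum-of-rank-one form: Z = sum over l of d_l * u_l * v_l^T
    Z_rebuilt = sum(d[l] * np.outer(U[:, l], Vt[l, :]) for l in range(L))
    assert np.allclose(Z, Z_rebuilt)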

Matrices U and V are orthonormal matrices (i.e., U^T U = V^T V = I). The SVD is closely related to and generalizes the well-known eigendecomposition because U is also the matrix of the normalized eigenvectors of Z Z^T, V is the matrix of the normalized eigenvectors of Z^T Z, and the singular values are the square roots of the eigenvalues of Z Z^T and Z^T Z (these two matrices have the same eigenvalues). Key property: the SVD provides the best reconstitution (in a least squares sense) of the original matrix by a matrix with a lower rank (for more details see, e.g., (1-3, 47)).
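These properties can be verified numerically. This Python/NumPy sketch, again with an arbitrary illustrative matrix, checks the orthonormality of U and V, the link between singular values and the eigenvalues of Z^T Z, and the rank-R reconstitution obtained by keeping only the R largest singular values:

    import numpy as np

    rng = np.random.default_rng(2)
    Z = rng.normal(size=(6, 4))                  # arbitrary illustrative matrix

    U, d, Vt = np.linalg.svd(Z, full_matrices=False)

    # Orthonormality: U^T U = V^T V = I
    assert np.allclose(U.T @ U, np.eye(U.shape[1]))
    assert np.allclose(Vt @ Vt.T, np.eye(Vt.shape[0]))

    # Singular values are the square roots of the eigenvalues of Z^T Z
    eigvals = np.linalg.eigvalsh(Z.T @ Z)[::-1]  # sorted from largest to smallest
    assert np.allclose(d, np.sqrt(np.clip(eigvals, 0, None)))

    # Best rank-R reconstitution (least squares sense): keep the R largest singular values
    R = 2
    Z_R = U[:, :R] @ np.diag(d[:R]) @ Vt[:R, :]
    approx_error = np.linalg.norm(Z - Z_R)       # no rank-2 matrix comes closer to Z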

Partial Least Squares Correlation

PLSC generalizes the idea of correlation between two variables to two tables. It was originally developed by Tucker (51), and refined by Bookstein (14, 15, 46). This technique is particularly popular in brain imaging because it can handle the very large data sets generated by these techniques and can easily be adapted to handle sophisticated experimental designs (31, 38-41). For PLSC, both tables play a similar role (i.e., both are dependent variables) and the goal is to analyze the information common to these two tables. This is obtained by deriving two new sets of variables (one for each table) called latent variables that are obtained as linear combinations of the original variables. These latent variables, which describe the observations, are required to explain the largest portion of the covariance between the two tables.
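A bare-bones numerical sketch of this construction is given below (Python/NumPy). The convention R = X^T Y, the helper names, and the toy data are assumptions made for this illustration rather than the chapter's exact notation; X and Y are assumed to be centered and column-normalized as described in the Notations section.

    import numpy as np

    def plsc(X, Y):
        """Rudimentary PLSC: SVD of the between-table correlation matrix, then
        one set of latent variables per table, each a linear combination of
        that table's original variables."""
        R = X.T @ Y                                      # J by K matrix of correlations
        U, d, Vt = np.linalg.svd(R, full_matrices=False)
        Lx = X @ U                                       # latent variables for table X
        Ly = Y @ Vt.T                                    # latent variables for table Y
        return Lx, Ly, d

    def center_normalize(M):
        """Center each column and scale it to unit norm (see Notations)."""
        M = M - M.mean(axis=0)
        return M / np.linalg.norm(M, axis=0)

    # Toy example: the first pair of latent variables attains the largest possible
    # inner product (proportional to their covariance), which equals d[0].
    rng = np.random.default_rng(3)
    Lx, Ly, d = plsc(center_normalize(rng.normal(size=(10, 4))),
                     center_normalize(rng.normal(size=(10, 3))))
    assert np.isclose(Lx[:, 0] @ Ly[:, 0], d[0])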

