There are various ways to implement PLS, including NIPALS, SIMPLS, and the bidiagonalization method of Rolf Manne. Here I provide code based on Wold's 2001 paper: Chemometr. Intell. Lab. 58(2001)109-130.
X: data matrix of size n x p
Y: response variable of size n x 1
A: the number of PLS components to extract, which is usually optimized by cross validation.
B: a p-dimensional regression vector, where p equals the number of columns in X. If you want to add an intercept in your model, just add an additional column of ones to X.
T: PLS component or score matrix of size n x A. It can be thought of as a dimension-reduced representation of X, similar to the principal components in PCA but obtained in a different way.
Wstar: [Wstar1, Wstar2,…,WstarA], weight matrix to calculate T from original input X. Mathematically, T=XWstar.
W: [W1, W2,…,WA], weight matrix to calculate T from the residual of X at each iteration. Note that W differs from Wstar except in the first column, where W1=Wstar1.
P: Loading matrix. X=TP’+E
R2X: an A-dimensional vector recording the fraction of the variance of X explained by each PLS component
R2Y: an A-dimensional vector recording the fraction of the variance of Y explained by each PLS component
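The quantities listed above are tied together by a few identities from the NIPALS formulation in Wold et al. (2001). The block below collects them as a quick reference. Two symbols are assumptions that do not appear in the list above: Q is the Y-loading matrix whose columns are the vectors q computed in the code, and F is the residual of Y. X and Y are assumed to be centered, and t_a, p_a, q_a denote the a-th columns of T, P, Q.

```latex
\begin{aligned}
T &= X W^{*}, & W^{*} &= W \,(P^{\top} W)^{-1}, \\
X &= T P^{\top} + E, & Y &= T Q^{\top} + F = X B + F, \\
B &= W^{*} Q^{\top}, \\
\mathrm{R2X}_a &= \frac{(t_a^{\top} t_a)\,(p_a^{\top} p_a)}{\lVert X \rVert_F^{2}}, &
\mathrm{R2Y}_a &= \frac{(t_a^{\top} t_a)\,(q_a^{\top} q_a)}{\lVert Y \rVert_F^{2}}.
\end{aligned}
```

Because P'W is upper triangular with ones on its diagonal, the first column of Wstar equals the first column of W, which is why W1=Wstar1.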
Code: copy the whole block below and save it as a function.
%+++ The NIPALS algorithm for both PLS-1 (a single y) and PLS-2 (multiple Y)
%+++ The model is assumed to be: Y=XB+E, where E is the random error.
%+++ X: n x p matrix
%+++ Y: n x m matrix
%+++ A: number of latent variables
%+++ Code: Hongdong Li, firstname.lastname@example.org, Feb, 2014
%+++ reference: Wold, S., M. Sjöström, and L. Eriksson, 2001. PLS-regression: a basic tool of chemometrics,
% Chemometr. Intell. Lab. 58(2001)109-130.
while (error>1e-8 && niter<1000) % for convergence test
q=Y'*t/(t'*t); % regress Y against t
%+++ calculate explained variance
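For readers who want a complete, runnable version of the loop outlined in the comments above, here is a minimal sketch of NIPALS PLS following the steps described in Wold et al. (2001). It is not the author's original function: the name pls_nipals_sketch, the output order, the starting choice for u, and the variable err (used in place of error so the built-in function is not shadowed) are assumptions made for illustration. X and Y are assumed to be centered beforehand, and no safeguard is included for the case where Y is already fully explained before A components are extracted.

```matlab
function [B,Wstar,T,P,Q,W,R2X,R2Y] = pls_nipals_sketch(X,Y,A)
% Sketch of NIPALS for PLS-1 and PLS-2 (hypothetical name, not the original code).
% X: n x p (centered), Y: n x m (centered), A: number of latent variables.
[n,p] = size(X);
m = size(Y,2);
W = zeros(p,A); T = zeros(n,A); P = zeros(p,A); Q = zeros(m,A);
R2X = zeros(A,1); R2Y = zeros(A,1);
ssqX = sum(X(:).^2);            % total sum of squares of X
ssqY = sum(Y(:).^2);            % total sum of squares of Y
for a = 1:A
    % Assumption: start u from the Y column with the largest sum of squares.
    [~,j] = max(sum(Y.^2,1));
    u = Y(:,j);
    t = zeros(n,1);
    err = 1; niter = 0;
    while (err > 1e-8 && niter < 1000)   % convergence test on the scores
        w = X'*u/(u'*u);                 % X weights
        w = w/norm(w);                   % normalize to unit length
        tnew = X*w;                      % X scores
        q = Y'*tnew/(tnew'*tnew);        % regress Y against t
        u = Y*q/(q'*q);                  % Y scores
        err = norm(tnew - t)/norm(tnew); % relative change in t
        t = tnew;
        niter = niter + 1;
    end
    pa = X'*t/(t'*t);                    % X loadings
    X = X - t*pa';                       % deflate X
    Y = Y - t*q';                        % deflate Y
    W(:,a) = w; T(:,a) = t; P(:,a) = pa; Q(:,a) = q;
    % Explained variance of component a, as a fraction of the total.
    R2X(a) = (t'*t)*(pa'*pa)/ssqX;
    R2Y(a) = (t'*t)*(q'*q)/ssqY;
end
Wstar = W/(P'*W);   % weights that act on the original (undeflated) X
B = Wstar*Q';       % regression coefficients, p x m
end
```

Saved as pls_nipals_sketch.m, it can be called as [B,Wstar,T] = pls_nipals_sketch(Xc,Yc,2), where Xc and Yc are the column-centered data. For a single y the inner loop settles after two passes, since t no longer changes once w is fixed.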