There are various ways to implement PLS, including NIPALS, SIMPLS, and the bidiagonalization method of Rolf Manne. Here I provide code based on the 2001 paper by Wold et al.: Chemometr. Intell. Lab. 58 (2001) 109-130.
Input:
X: data matrix of size n x p
Y: response variable of size n x 1 (PLS-1). The code also accepts an n x m response matrix (PLS-2).
A: the number of PLS components to extract, which is usually optimized by cross validation.
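As a rough illustration of how A might be chosen by cross-validation, here is a minimal K-fold sketch for a single response. It assumes the pls_basic function from this post is saved on the MATLAB path; the synthetic data, K, Amax, and all variable names are made up for the example, and this is not the libPLS routine.

```matlab
% K-fold cross-validation sketch for choosing A (single-column y).
% Assumes pls_basic.m from this post is on the path; data, K and Amax are illustrative.
rng(3);                                  % fixed seed so the run is repeatable
n = 60; p = 12; K = 5; Amax = 8;
X = randn(n, p);
y = X(:, 1:4) * [2; -1; 0.5; 1] + 0.2 * randn(n, 1);

fold  = mod(randperm(n), K) + 1;         % random fold label 1..K for each sample
press = zeros(Amax, 1);                  % prediction error sum of squares per A

for k = 1:K
    te = (fold == k);                    % held-out samples
    tr = ~te;                            % training samples
    mx = mean(X(tr, :), 1);              % center with training means only
    my = mean(y(tr));
    Xtr = X(tr, :) - repmat(mx, sum(tr), 1);
    Xte = X(te, :) - repmat(mx, sum(te), 1);
    [~, Wstar, ~, ~, Q] = pls_basic(Xtr, y(tr) - my, Amax);
    for a = 1:Amax
        Ba  = Wstar(:, 1:a) * Q(1:a, :); % coefficients from the first a components
        res = y(te) - (Xte * Ba + my);
        press(a) = press(a) + sum(res .^ 2);
    end
end

rmsecv = sqrt(press / n);                % cross-validated RMSE for A = 1..Amax
[~, Aopt] = min(rmsecv);
fprintf('A with lowest RMSECV: %d\n', Aopt);
```

Fitting once with Amax components and truncating Wstar and Q works because the first a columns of Wstar do not depend on later components, so one fit per fold covers every candidate A.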
Output:
B: a p-dimensional regression vector (p x m if Y has m columns), where p equals the number of columns in X. If you want an intercept in your model, just add an additional column of ones to X.
T: PLS component or score matrix of size n x A. It can be thought of as a dimension-reduced representation of X, similar to the principal components in PCA but obtained in a different way.
Wstar: [Wstar1, Wstar2,…,WstarA], weight matrix to calculate T from original input X. Mathematically, T=XWstar.
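W: [W1, W2,…,WA], weight matrix to calculate T from the residual X at each iteration. Note that W differs from Wstar, except that W1=Wstar1.
P: X-loading matrix of size p x A. X=TP'+E, where E is the residual.
Q: Y-loading matrix, returned as A x m, so that Y=TQ plus a residual.
R2X: an A-dimensional vector recording the fraction of the total sum of squares of X explained by each PLS component (the explained variance if X is column-centered).
R2Y: an A-dimensional vector recording the fraction of the total sum of squares of Y explained by each PLS component (the explained variance if Y is column-centered).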
Code: copy the whole function below and save it as pls_basic.m.
function [B,Wstar,T,P,Q,W,R2X,R2Y]=pls_basic(X,Y,A)
%+++ The NIPALS algorithm for both PLS-1 (a single y) and PLS-2 (multiple Y)
%+++ The model is assumed to be: Y=XB+E,where E is random errors.
%+++ X: n x p matrix
%+++ Y: n x m matrix
%+++ A: number of latent variables
%+++ Code: Hongdong Li, lhdcsu@gmail.com, Feb, 2014
%+++ Reference: Wold, S., M. Sjöström, and L. Eriksson, 2001. PLS-regression: a basic tool of chemometrics,
%    Chemometr. Intell. Lab. 58(2001)109-130.
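[n,ncolX]=size(X);
ncolY=size(Y,2);
W=zeros(ncolX,A); T=zeros(n,A); P=zeros(ncolX,A); Q=zeros(ncolY,A); % preallocate
varX=sum(sum(X.^2)); % total sum of squares of X
varY=sum(sum(Y.^2)); % total sum of squares of Y
for i=1:A
    err=1; % relative change in u between iterations
    u=Y(:,1); % start from the first column of Y
    niter=0;
    while (err>1e-8 && niter<1000) % for convergence test
        w=X'*u/(u'*u);
        w=w/norm(w);
        t=X*w;
        q=Y'*t/(t'*t); % regress Y against t
        u1=Y*q/(q'*q);
        err=norm(u1-u)/norm(u);
        u=u1;
        niter=niter+1;
    end
    p=X'*t/(t'*t); % X loading for this component
    X=X-t*p'; % deflate X
    Y=Y-t*q'; % deflate Y
    %+++ store
    W(:,i)=w;
    T(:,i)=t;
    P(:,i)=p;
    Q(:,i)=q;
end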
%+++ calculate explained variance
R2X=diag(T'*T*P'*P)/varX;
R2Y=diag(T'*T*Q'*Q)/varY;
Wstar=W/(P'*W); % same as W*inv(P'*W), without forming the inverse
B=Wstar*Q';
Q=Q'; % return Q as A x m
%+++
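A short usage sketch may help in checking the outputs. It assumes the function above has been saved as pls_basic.m on the MATLAB path; the synthetic data and variable names are made up for illustration. Since pls_basic does no centering of its own, the data are column-centered first in this sketch.

```matlab
% Usage sketch; assumes pls_basic.m (the function above) is on the MATLAB path.
rng(1);                                   % fixed seed so the run is repeatable
n = 50; p = 10; A = 3;
X = randn(n, p);
y = X(:, 1:3) * [1; -2; 0.5] + 0.1 * randn(n, 1);

% Column-center X and y before fitting.
Xc = X - repmat(mean(X, 1), n, 1);
yc = y - mean(y);

[B, Wstar, T, P, Q, W, R2X, R2Y] = pls_basic(Xc, yc, A);

% Identities from the output list; each printed value should be close to zero.
fprintf('max |T - Xc*Wstar|        = %g\n', max(max(abs(T - Xc * Wstar))));
fprintf('max |W(:,1) - Wstar(:,1)| = %g\n', max(abs(W(:, 1) - Wstar(:, 1))));
offdiag = T' * T - diag(diag(T' * T));    % scores should be mutually orthogonal
fprintf('max off-diagonal of T''*T  = %g\n', max(abs(offdiag(:))));

% Cumulative fraction of X and y explained after each component.
disp([cumsum(R2X), cumsum(R2Y)]);

% Fitted values on the original scale of y.
yhat = Xc * B + mean(y);
```

The same fitted values can be obtained as T*Q + mean(y), because B = Wstar*Q' is formed before Q is transposed for output.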
Hi, since I recently upgraded to R 3.2.2, I can no longer use your CARSPLS package (as it was built prior to R 3.0.0). Will you be updating this (great!) package in the near future so it is usable again?
Thanks.
Sure, rebuilt. Check out:
http://www.libpls.net/codes/carspls_1.0.001.tgz
How do you predict a test sample?
See Section 1.2 at: http://www.libpls.net/usage.php
I hope this helps.
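As a rough sketch of the idea (not the libPLS routine itself), prediction with the pls_basic function from this post could look like the following. It assumes pls_basic.m is on the MATLAB path; Xtrain, ytrain, Xtest and the choice A = 3 are made-up illustrations. The key point is that the test sample is centered with the training means.

```matlab
% Prediction sketch; assumes pls_basic.m from this post is on the path.
% Xtrain, ytrain and Xtest are synthetic data for illustration only.
rng(2);                                   % fixed seed so the run is repeatable
Xtrain = randn(40, 8);
ytrain = Xtrain * (1:8)' + 0.1 * randn(40, 1) + 5;
Xtest  = randn(5, 8);

% Center the training data and keep the means for later.
mx = mean(Xtrain, 1);
my = mean(ytrain, 1);
Xc = Xtrain - repmat(mx, size(Xtrain, 1), 1);
yc = ytrain - repmat(my, size(ytrain, 1), 1);

A = 3;                                    % illustrative; normally chosen by cross-validation
B = pls_basic(Xc, yc, A);

% Center the test samples with the TRAINING means, then add the y mean back.
ntest = size(Xtest, 1);
ypred = (Xtest - repmat(mx, ntest, 1)) * B + repmat(my, ntest, 1);

% Equivalent form with an explicit intercept b0.
b0     = my - mx * B;
ypred2 = Xtest * B + repmat(b0, ntest, 1);
```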
Hi, I tried to install the package given in the above link in R on Windows, but I couldn't. Here is the error I got:
Error in read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :
cannot open the connection
In addition: Warning messages:
1: In unzip(zipname, exdir = dest) : error 1 in extracting from zip file
2: In read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :
cannot open compressed file ‘carspls_1.0.001.tgz/DESCRIPTION’, probable reason ‘No such file or directory’
Can you please tell me where I can find the correct package?
Hi Dixon, please use libPLS in MATLAB. The R package still needs development, and I haven't maintained it for a long time.