There are various ways to implement PLS, including NIPALS, SIMPLS, and the bidiagonalization method of Rolf Manne. Here I provide code based on Wold's 2001 paper: Chemometr. Intell. Lab. 58(2001)109-130.

**Input**:

**X**: data matrix of size n x p

**Y**: response variable of size n x 1 (the code also accepts an n x m matrix for multiple responses)

**A**: the number of PLS components to extract, which is usually optimized by cross-validation.
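One way to pick **A** by cross-validation is sketched below. This is only an illustration under stated assumptions: it assumes the `pls_basic` function from the Code section of this post is saved as `pls_basic.m` on the MATLAB path, it uses synthetic placeholder data, and the names `Amax`, `K`, `press`, and `Abest` are hypothetical helpers that are not part of the original code.

```matlab
% Sketch: choose A by K-fold cross-validation for a single response.
% Assumes pls_basic.m from this post is on the path; data are synthetic.
n = 30; Amax = 4; K = 5;
X = sin((1:n)' * [0.7 1.3 2.1 2.9 3.7]);
y = X * [1; -2; 0.5; 0; 0] + 0.1 * cos((1:n)');

fold = mod((0:n-1)', K) + 1;          % interleaved fold labels 1..K
press = zeros(Amax, 1);               % prediction error sum of squares per A
for k = 1:K
    isTest = (fold == k);
    Xtr = X(~isTest, :);  ytr = y(~isTest);
    mx = mean(Xtr, 1);  my = mean(ytr);           % training means only
    Xc  = Xtr - ones(size(Xtr, 1), 1) * mx;       % center training X
    Xte = X(isTest, :) - ones(sum(isTest), 1) * mx;
    for a = 1:Amax
        B = pls_basic(Xc, ytr - my, a);           % refit with a components
        res = y(isTest) - (Xte * B + my);         % held-out residuals
        press(a) = press(a) + sum(res .^ 2);
    end
end
[~, Abest] = min(press);              % A with the smallest cross-validated error
rmsecv = sqrt(press / n);             % root mean squared error of cross-validation
```

Taking the minimum of `press` is the simplest rule; in practice a more parsimonious choice (the smallest A whose error is close to the minimum) is often preferred.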

**Output:**

**B**: a p-dimensional regression vector, where p equals the number of columns in **X**. If you want to add an intercept to your model, just add an additional column of ones to **X**.

**T**: PLS component or score matrix of size n x A. It can be thought of as a dimension-reduced representation of **X**, similar to the principal components in PCA but obtained in a different way.

**Wstar**: [Wstar1, Wstar2, …, WstarA], the weight matrix used to calculate **T** from the original input **X**. Mathematically, **T=XWstar**.

**W**: [W1, W2, …, WA], the weight matrix used to calculate **T** from the *residual X* at each iteration. Note that **W** is different from **Wstar**, except that **W1=Wstar1**.

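**P**: loading matrix of **X**, of size p x A, such that **X=TP'+E**, where **E** is the residual.

**Q**: loading matrix of **Y**, returned by the function with size A x m.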
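The relationship between **W** and **Wstar** can be checked numerically. The following is a minimal sketch, not the function from this post: it assumes a single response (so the inner iteration is unnecessary), mean-centered synthetic placeholder data, and the hypothetical names `X0`, `y0`, and `pl`.

```matlab
% Sketch: compact PLS1 loop to compare W (residual weights) with Wstar.
% The data are synthetic and deterministic, not taken from the post.
n = 10; A = 3;
X0 = sin((1:n)' * [0.7 1.3 2.1 2.9]);          % n x 4 data matrix
y0 = X0 * [1; -2; 0.5; 0] + 0.1 * cos((1:n)');
X0 = X0 - ones(n, 1) * mean(X0, 1);            % center the columns of X
y0 = y0 - mean(y0);                            % center y

X = X0; y = y0;
W = zeros(size(X, 2), A); P = W; T = zeros(n, A);
for a = 1:A
    w = X' * y;  w = w / norm(w);   % weight computed from the residual X
    t = X * w;                      % score
    pl = X' * t / (t' * t);         % X loading
    q = y' * t / (t' * t);          % y loading (a scalar here)
    X = X - t * pl';                % deflate X
    y = y - t * q;                  % deflate y
    W(:, a) = w;  P(:, a) = pl;  T(:, a) = t;
end

Wstar = W / (P' * W);               % same as W * inv(P' * W)

disp(norm(T - X0 * Wstar));         % near zero: T = X * Wstar on the original X
disp(norm(W(:, 1) - Wstar(:, 1)));  % near zero: the first columns agree
disp(norm(W(:, 2) - Wstar(:, 2)));  % later columns generally differ
```

The first columns coincide because no deflation has happened yet when the first weight is computed; from the second component on, **W** refers to the deflated **X** while **Wstar** refers to the original one.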

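**R2X**: an A-dimensional vector recording the fraction of the total sum of squares of **X** explained by each PLS component (the explained variance when **X** is mean-centered)

**R2Y**: an A-dimensional vector recording the fraction of the total sum of squares of **Y** explained by each PLS component (the explained variance when **Y** is mean-centered)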
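Written out per component, and assuming the NIPALS scores are mutually orthogonal (so that T'T is diagonal), the quantities computed in the last lines of the function correspond to the expressions below. Here t_a, p_a, and q_a denote the a-th score vector and the a-th X and Y loading vectors, and x_ij, y_ik are the entries of **X** and **Y** as passed to the function, before any deflation.

```latex
\mathrm{R2X}_a = \frac{(t_a^{\top} t_a)\,(p_a^{\top} p_a)}{\sum_{i,j} x_{ij}^{2}},
\qquad
\mathrm{R2Y}_a = \frac{(t_a^{\top} t_a)\,(q_a^{\top} q_a)}{\sum_{i,k} y_{ik}^{2}},
\qquad a = 1, \dots, A
```

Each numerator is the squared Frobenius norm of the rank-one term t_a p_a' (or t_a q_a') removed during deflation, so the entries read as explained variance only when **X** and **Y** have been mean-centered beforehand.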

**Code**: copy the whole block below and save it as a function file named pls_basic.m.

function [B,Wstar,T,P,Q,W,R2X,R2Y]=pls_basic(X,Y,A)
%+++ The NIPALS algorithm for both PLS-1 (a single y) and PLS-2 (multiple Y)
%+++ The model is assumed to be: Y=XB+E, where E is random error.
%+++ X: n x p matrix
%+++ Y: n x m matrix
%+++ A: number of latent variables
%+++ Code: Hongdong Li, lhdcsu@gmail.com, Feb, 2014
%+++ reference: Wold, S., M. Sjöström, and L. Eriksson, 2001. PLS-regression: a basic tool of chemometrics,
%    Chemometr. Intell. Lab. 58(2001)109-130.

varX=sum(sum(X.^2)); % total sum of squares of X before deflation
varY=sum(sum(Y.^2)); % total sum of squares of Y before deflation

%+++ preallocate the outputs
W=zeros(size(X,2),A);
T=zeros(size(X,1),A);
P=zeros(size(X,2),A);
Q=zeros(size(Y,2),A);

for i=1:A
    err=1; % named err so the built-in error function is not shadowed
    u=Y(:,1);
    niter=0;
    while (err>1e-8 && niter<1000) % for convergence test
        w=X'*u/(u'*u);
        w=w/norm(w);
        t=X*w;
        q=Y'*t/(t'*t); % regress Y against t;
        u1=Y*q/(q'*q);
        err=norm(u1-u)/norm(u);
        u=u1;
        niter=niter+1;
    end
    p=X'*t/(t'*t);
    X=X-t*p'; % deflate X
    Y=Y-t*q'; % deflate Y
    %+++ store
    W(:,i)=w;
    T(:,i)=t;
    P(:,i)=p;
    Q(:,i)=q;
end

%+++ calculate explained variance
R2X=diag(T'*T*P'*P)/varX;
R2Y=diag(T'*T*Q'*Q)/varY;
Wstar=W/(P'*W); % equivalent to W*(P'*W)^(-1)
B=Wstar*Q';
Q=Q';
%+++
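A typical call, including prediction for new samples, might look like the sketch below. It assumes the function above is saved as `pls_basic.m` on the MATLAB path. The data are synthetic placeholders, and mean-centering with the training means is an assumption of this example rather than something the function does for you.

```matlab
% Sketch: fit on centered training data, then predict new samples.
% Assumes pls_basic.m from this post is on the path; data are synthetic.
n = 20;
Xtrain = sin((1:n)' * [0.7 1.3 2.1 2.9]);
ytrain = Xtrain * [1; -2; 0.5; 0] + 0.1 * cos((1:n)');
Xtest  = sin((n+1:n+5)' * [0.7 1.3 2.1 2.9]);

mx = mean(Xtrain, 1);  my = mean(ytrain);     % training means
Xc = Xtrain - ones(n, 1) * mx;                % centered training X
yc = ytrain - my;                             % centered training y

A = 2;
[B, Wstar, T, P, Q, W, R2X, R2Y] = pls_basic(Xc, yc, A);

% Center the test data with the training means, then add the y mean back.
Xtc   = Xtest - ones(size(Xtest, 1), 1) * mx;
ypred = Xtc * B + my;

% Scores of the test samples in the same latent space.
Ttest = Xtc * Wstar;

disp([R2X, R2Y]);   % fraction of the X and y sum of squares per component
```

Using the training means for the test data keeps the test samples in the same coordinate system as the model; recomputing means on the test set would shift the predictions.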

EC: Hi, since I recently upgraded to R 3.2.2 I can no longer use your CARSPLS package (it was built prior to R 3.0.0). Will you be updating this (great!) package in the near future so it is usable again?

Thanks.

admin (post author): Sure, rebuilt. Check out:

http://www.libpls.net/codes/carspls_1.0.001.tgz

Xu: How do you predict a test sample?

L (post author): See Section 1.2 at: http://www.libpls.net/usage.php

I hope this helps.

Dixon: Hi, I tried to install the package given in the link above in R on Windows, but I couldn't. Here is the error I got:

Error in read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :

cannot open the connection

In addition: Warning messages:

1: In unzip(zipname, exdir = dest) : error 1 in extracting from zip file

2: In read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :

cannot open compressed file 'carspls_1.0.001.tgz/DESCRIPTION', probable reason 'No such file or directory'

Can you please tell me where I can find the correct package?

L (post author): Hi Dixon, please use libPLS in MATLAB. The R package still needs development, and I haven't maintained it for a long time.