# Least angle regression (LARS)

LARS is one of the most effective algorithms for selecting linear regression models. It is less greedy than stepwise regression, and it reaches the least-squares fit in fewer steps than the Lasso.


## Create synthetic data

First we fix a model (see the code below); LARS will then try to recover it from a larger set of candidate features.

% Assign a model
f = @(w,x) w(1)*x.^0 + w(2)*x.^(1/2) + w(3)*x.^1; % synthetic model (anonymous function; inline is deprecated)
b = [1 -2 2]'; % parameters
x = [0.1:0.1:1]'; % independent variable
y = f(b,x); % dependent variable

% We use the following features to find the model we thought of
g = {'x.^0', 'x.^(1/2)', 'x.^1', 'x.^(3/2)', 'x.*log(x)'};
% make the sample set; columns are independent variables
X = eval([ '[' g{1} ' ' g{2} ' ' g{3} ' ' g{4} ' ' g{5} ']']);
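For reference, the same synthetic data can be built in Python/NumPy (a sketch mirroring the MATLAB variables above; `np.column_stack` plays the role of the `eval` trick):

```python
import numpy as np

x = np.arange(1, 11) * 0.1                      # independent variable, 0.1 .. 1.0
b = np.array([1.0, -2.0, 2.0])                  # true parameters
y = b[0] * x**0 + b[1] * np.sqrt(x) + b[2] * x  # dependent variable

# Candidate features; the true model uses only the first three.
X = np.column_stack([x**0, np.sqrt(x), x, x**1.5, x * np.log(x)])
print(X.shape)  # (10, 5)
```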


## Run LARS

[n,p] = size(X);
muA = zeros(n,1);  % estimation of the dependent variable
beta = zeros(p,1); % estimation of parameters
betaLtst = beta';  % storage for the parameter path

for i = 1:p
% correlation of each feature (column of X) with the current residual;
% note: the standard LARS formulation assumes centered, normalized columns
c = X'*(y-muA);
C = max(abs(c)); % maximal absolute correlation
A = find(abs(c) >= C - 1e-10); % active set: all columns tied with the maximum
Sj = sign(c(A)); % signs of the active correlations
XA = X(:,A).*(ones(n,1)*Sj'); % sign-adjusted active columns
G = XA'*XA; % Gram matrix of the active columns
oA = ones(1,length(A)); % vector of ones
AA = (oA*G^(-1)*oA')^(-0.5); % normalizing constant for the equiangular direction

wA = AA*G^(-1)*oA'; % weights for the equiangular direction
uA = XA*wA; % equiangular direction (unit bisector of the active columns)
a = X'*uA; % correlations of all features with the equiangular direction
if i<p && length(A) < p % while some columns are still inactive
notA = setdiff(1:p, A); % inactive columns compete to join the active set
M = [(C-c(notA))./(AA-a(notA)); (C+c(notA))./(AA+a(notA))];
M(M <= 0) = +Inf; % discard non-positive step candidates
gamma = min(M); % smallest positive step: the next feature becomes active
else
gamma = C/AA; % final step: go all the way to the least-squares fit
end

muA = muA + gamma*uA; % make new approximation of the dependent variable
beta(A) = beta(A) + gamma*wA.*Sj; % make new parameters
betaLtst = [betaLtst; beta']; % store the parameters after the i-th step
end
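For readers without MATLAB, the loop above can be sketched in Python/NumPy. This is a direct translation, not a reference implementation: the 1e-10 tie tolerance and the early-exit guard are our additions, and the data is rebuilt inline so the snippet runs on its own.

```python
import numpy as np

# Rebuild the synthetic problem from the previous section.
x = np.arange(1, 11) * 0.1
X = np.column_stack([x**0, np.sqrt(x), x, x**1.5, x * np.log(x)])
y = 1 - 2 * np.sqrt(x) + 2 * x

n, p = X.shape
mu = np.zeros(n)       # current estimate of the dependent variable
beta = np.zeros(p)     # current parameter estimates
path = [beta.copy()]   # parameter history, one row per step

for i in range(p):
    c = X.T @ (y - mu)               # correlations with the residual
    C = np.abs(c).max()              # maximal absolute correlation
    if C < 1e-12:                    # residual already uncorrelated: stop early
        break
    A = np.abs(c) >= C - 1e-10       # active set: all ties with the maximum
    s = np.sign(c[A])
    XA = X[:, A] * s                 # sign-adjusted active columns
    Ginv = np.linalg.inv(XA.T @ XA)  # inverse Gram matrix
    ones = np.ones(A.sum())
    AA = (ones @ Ginv @ ones) ** -0.5   # normalizing constant
    w = AA * (Ginv @ ones)
    u = XA @ w                       # equiangular (unit bisector) direction
    a = X.T @ u
    if i < p - 1 and not A.all():
        inact = ~A
        with np.errstate(divide='ignore'):
            M = np.concatenate([(C - c[inact]) / (AA - a[inact]),
                                (C + c[inact]) / (AA + a[inact])])
        gamma = M[M > 0].min()       # smallest positive step: next feature joins
    else:
        gamma = C / AA               # final step: go to the least-squares fit
    mu = mu + gamma * u
    beta[A] += gamma * w * s
    path.append(beta.copy())

print(np.round(beta, 4))
```

After the final step the residual is orthogonal to the active columns, so `mu` is the least-squares fit on the selected features.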


## Plot parameters at each step of LARS

s1 = sum(abs(betaLtst),2); % L1 norm of the parameter vector after each step
plot(s1, betaLtst);
xlabel('sum of absolute values of the parameters');
ylabel('parameters');
legend('\beta_1', '\beta_2', '\beta_3', '\beta_4', '\beta_5')


## Plot the results

idx = find(beta ~= 0);
disp('Recovered model contains')
g(idx)

% Recover LARS results using the last value of beta
y1 = X(:,idx)*beta(idx);

% Recover Least Squares result
Xa = X(:,idx);
y3 = Xa * pinv(Xa'*Xa)*Xa'*y;

plot(x,y,'*');
hold on
plot(x,y1,'r-');
plot(x,y3,'b-');
legend('Source data', 'LARS fit', 'Least-squares fit')

Recovered model contains

ans =

    'x.^0'    'x.^(1/2)'    'x.^1'

## Warning

% Note that ordinary least squares over all five features
w = pinv(X'*X)*X'*y
% gives the same result (for this particular example, of course)

w =

1.0000
-2.0000
2.0000
0.0000
-0.0000
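The same cross-check in NumPy; applying `pinv` to X directly is numerically gentler than forming X'*X, though for this small example both recover the true coefficients:

```python
import numpy as np

x = np.arange(1, 11) * 0.1
X = np.column_stack([x**0, np.sqrt(x), x, x**1.5, x * np.log(x)])
y = 1 - 2 * np.sqrt(x) + 2 * x

# Ordinary least squares over all five candidate features.
w = np.linalg.pinv(X) @ y
print(np.round(w, 4))   # recovers 1, -2, 2 and (numerically) zero for the rest
```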