clear
format compact
format short e
echo on
%Solution to Problem 22
%
%In this problem we are asked to do a
%combined linear and non-linear regression
%of a set of data.  We are trying to get
%two exponential growth factors and two
%preexponential constants.  The fitting
%function is of the form:
%
%  c = A exp(ka*t) + B exp(kb*t)
%
%We shall do the linear regression part in
%a separate function, and only do the non-linear
%regression (optimization) on the two rate
%constants.  This is much more efficient than
%doing the whole problem as a four parameter
%non-linear optimization problem.
pause

%We want to pass the arrays t, c, and the parameters
%A and B to the function linfit which does the linear
%part of the optimization.  Thus we use the global
%command:
global tpass cpass capass cbpass
pause

%First we read in the data:
load problem22.dat
t = problem22(:,1);
c = problem22(:,2);
tpass=t;
cpass=c;
%Note that the data should be saved under the
%name problem22.dat In the appropriate
%directory.  Let's plot this up:
figure(1)
plot(t,c,'o')
hold on
pause

%The routine for finding the optimum is very simple:
k=[1,-1];
k=fmins('linfit',k);
ca=capass;
cb=cbpass;
%The best fit values are thus:
[k(1),k(2),ca,cb]
pause

%We can plot this up:
plot(t,ca*exp(k(1)*t)+cb*exp(k(2)*t))
hold off
%which agrees very well.
pause

%It is interesting to see just how reliable this
%fit to the data is.  We can calculate the random
%error in the concentration data:
n=length(t);
cvar=linfit(k)/(n-4);
cstd=cvar^.5
%which is pretty small.
pause

%This does not really answer the question of how
%good the coefficients are, however, because it does
%not include the sensitivity of the coefficients
%to the error in the data.  The simplest way of
%calculating the matrix of covariance is to determine
%the sensitivity of the parameters to variations
%in the measured data points.  Effectively,
%this means that you are calculating grad F
%where F is the array of best fit parameters
%and the gradient is the gradient with respect
%to the measured value of each data point
%individually.  If the sensitivity to (or error
%in) each data point is small, then we can use
%our standard error propagation formula for
%arbitrary non-linear functions!
pause

%We must first calculate the matrix corresponding
%to grad F.  We do this numerically.
eps=0.1*cvar^.5;
%We have chosen eps to be scaled with the
%standard deviation of the data points.  Note that
%if we make it too small, we won't calculate the
%gradient properly due to the tolerance of fmins!
%We use this to compute the derivatives:
gradf=zeros(4,n);
for i=1:n
 cpass=c;
 cpass(i)=c(i)+eps;
 knew=fmins('linfit',k);
 gradf(1:2,i)=(knew-k)'/eps;
 gradf(3,i)=(capass-ca)/eps;
 gradf(4,i)=(cbpass-cb)/eps;
 echo off
end
echo on
pause

%We also need the variance in the measured
%concentrations.  We shall assume that this is
%constant, and that all of the measurements are
%independent.  We had gotten this already above
%as cvar.  Thus, we have:
varf=gradf*gradf'*cvar
%or the standard deviations:
[[k,ca,cb]',diag(varf).^.5]
%If the number of data points was very large,
%the time to compute grad F would have become
%prohibitive.  In that case it might be faster
%to use an alternative technique such as the
%bootstrap or undersampling.  In either case,
%it is much harder (computationally more
%expensive) to compute the error than it is to
%get the answer in the first place!
pause

%The 95% confidence intervals for the two growth
%rates are:
sig=diag(varf).^.5;
interval=[k'-2*sig(1:2),k'+2*sig(1:2)]
pause

%As a final note, we can examine (graphically) how
%variations in the two growth rates affect the fit to
%the data:
ka=[k(1)-1:.2:k(1)+1];
kb=[k(2)-1:.2:k(2)+1];
ni=length(ka);
nj=length(kb);
z=zeros(nj,nj);
for i=1:ni
 for j=1:nj
  z(i,j)=linfit([ka(i),kb(j)]);
  echo off
 end
end
echo on
figure(2)
contour(kb,ka,z,30)
pause

%Note that the sum of squares forms a trough
%rather than a sharp minimum.  For some data sets
%(particularly ones in which both growth rates are
%positive and not too different) the minimum is
%nearly impossible to find.  Such problems are
%close to degeneracy.
echo off


___________________________


function y=linfit(k)
%This function takes in the values of ka and kb
%from the non-linear optimization routine and
%computes the best fit values for the linear
%fitting parameters A and B.  The function returns
%the sum of the square of the deviation which
%we are trying to minimize.  We make use of the
%global command between this function and the
%main script so that we can pass the best fit
%values of A and B back and forth, and also pass
%in the data values.  The latter is done to minimize
%the data I/O.
%
%Thus:
global tpass cpass capass cbpass
clear a
t=tpass;
c=cpass;
a=[exp(k(1)*t),exp(k(2)*t)];
x=a\c;
capass=x(1);
cbpass=x(2);
y=norm(a*x-c)^2;




_____________________



         0    4.1077
    0.0500    3.8447
    0.1000    3.5597
    0.1500    3.2212
    0.2000    3.0120
    0.2500    2.8910
    0.3000    2.8429
    0.3500    2.3134
    0.4000    2.3515
    0.4500    2.5100
    0.5000    2.4260
    0.5500    2.3457
    0.6000    2.4561
    0.6500    2.3981
    0.7000    2.5175
    0.7500    2.8218
    0.8000    3.1638
    0.8500    3.3671
    0.9000    3.5864
    0.9500    4.1444
    1.0000    4.5336