clear
echo on
%Solution to problem 25

%In this problem we are asked to integrate
%data where some of the data points are missing.
%The first thing to do is to save the data points
%under the file name waste.dat.  We also replace
%all of the *****'s with Matlab's equivalent 'NaN'.
%This way we can load in all of the data:

load 'waste.dat'
time=waste(:,1);
conc=waste(:,2);
flow=waste(:,3);
pause

%We now plot this stuff up:
plot(time,conc,'o',time,flow,'*')
xlabel('time (hours)')
ylabel('concentration and flow rates')
legend('concentration','flow rate')
pause

%As may be seen, there is a significant variability
%in both concentration and flow rate.  In order to
%determine the total flux, it is necessary to fill
%in the missing points.  We are actually missing a
%number of data points - in addition to the three
%missing flow rates, we are also missing the concentration
%and flow rate at t=0, which would be needed if we
%wanted to use an integration rule such as trapezoidal
%or simpson's rule.
pause

%To estimate this missing information we can do many
%different approximations.  One way of doing it is to
%fit the data to a model and interpolate.  The data
%appear to be roughly sinusoidal with two periods per
%day, so we shall do it that way.  First we do the
%concentration since that isn't missing any data:

n=length(time);
ac=[ones(n,1),sin(2*pi/12*time),cos(2*pi/12*time)];
kc=inv(ac'*ac)*ac';
xc=kc*conc;
pause

%and then we do the flow rate.  We have to find the
%times where we actually have measurements:

i=find(finite(flow));
nf=length(i);
tf=time(i);
af=[ones(nf,1),sin(2*pi/12*tf),cos(2*pi/12*tf)];
kf=inv(af'*af)*af';
xf=kf*flow(i);
pause

%and plotting it up:
tp=[0:.1:24]';
ap=[tp.^0,sin(2*pi/12*tp),cos(2*pi/12*tp)];
hold on
plot(tp,[ap*xc,ap*xf])
hold off

pause

%From this plot we can estimate the variability in
%both the concentration and the flowrate.  We calculate
%this in the usual way:
sigconc=norm(ac*xc-conc)/(n-3)^.5
sigflow=norm(af*xf-flow(i))/(nf-3)^.5

%So the variability in the concentration is quite a bit
%higher than that of the flow rate.
pause

%OK, now for estimating the replacement data:
irep=find(isnan(flow));
frep=[ones(length(irep),1),sin(2*pi/12*time(irep)),cos(2*pi/12*time(irep))]*xf;
fall=flow;
fall(irep)=frep;

%And the values at zero:

tall=[0;time];
call=[[1 0 1]*xc;conc];
fall=[[1 0 1]*xf;fall];

%Which we can plot up:
hold on
plot(tall,call,'+b',tall,fall,'+g')
hold off
pause

%And now we do the integral itself.  We use Simpson's rule
%for simplicity, since we anticipate that the scatter in the
%data will lead to a far larger error than the algorithm error
%of this rule.
w=ones(1,n+1);
w(2:2:n)=4*w(2:2:n);
w(3:2:n-1)=2*w(3:2:n-1);
w=w/3;
%With the total release:
total=w*(call.*fall)
pause

%To complete the problem, we also need an estimate of the error.
%The algorithm error will be very small.  We must therefore obtain
%an estimate of the random error.  The simplest way of doing this
%is to use the vector form we have discussed in class.  To do this
%we need to take the gradient of the total release as a function of
%the variation in the concentration and the flow rate.  The easiest
%way to do this is to take all of the above executable code and
%dump it in a function totalwaste(t,c,f).  We then just vary the known
%concentrations and flow rates:

ep=0.001;
gradc=zeros(1,n);
for j=1:n
 c=conc;
 c(j)=c(j)+ep;
 gradc(j)=(totalwaste(time,c,flow)-total)/ep;
 echo off
end
echo on

gradf=zeros(1,nf);
for j=1:nf
 f=flow;
 f(i(j))=f(i(j))+ep;
 gradf(i(j))=(totalwaste(time,conc,f)-total)/ep;
 echo off
end
echo on
pause

%The variance can then be easily calculated from these
%gradients:
var=(norm(gradc)*sigconc)^2+(norm(gradf)*sigflow)^2;

%or, putting it all together:
%
echo off
disp('total release, error')
format compact
disp([total,var^.5])
format loose
echo on
%Which is a pretty small error - only around 4% or so.
%Your results may differ from this if you use a different
%algorithm, but it is a justifiable answer.
pause

%An alternative approach is to assume that the concentration
%and flow rate -is- purely sinusoidal, and to use that in
%analytically estimating the integral.  We can use the fitted
%values:
release=24*(xc(1)*xf(1)+0.5*xc(2)*xf(2)+0.5*xc(3)*xf(3))
%which is about half a percent higher than was estimated using
%Simpson's rule and direct integration of the data - pretty
%close!
pause

%We can get the error in the calculated release from the matrix
%of covariance of the fitting parameters:
varxf=sigflow^2*kf*kf';
varxc=sigconc^2*kc*kc';
varx=[[varxc,zeros(3,3)];[zeros(3,3),varxf]];

%and the gradient:
gradrelease=24*[xf(1),0.5*xf(2),0.5*xf(3),xc(1),0.5*xc(2),0.5*xc(3)];
varrelease=gradrelease*varx*gradrelease';

%Thus we get the final result:
echo off
disp('total release, error')
format compact
disp([release,varrelease^.5])
format loose
echo on

%Which is almost exactly the same as calculated via the earlier
%algorithm!  Note that this time we didn't interpolate the data,
%but we did assume that the data was purely sinusoidal over one
%day, evidently a pretty good assumption.
echo off



______________________

function total=totalwaste(t,c,f)
%This function takes all of the executable lines of the example
%code and uses them to compute the total waste released as a
%function of the arrays t, c, and f.  The code is copied without
%modification except for the first three lines, assigning time,
%conc, and flow to the input variables.
time=t;
conc=c;
flow=f;

%The rest of this is identical to the code in the main script.
n=length(time);
ac=[ones(n,1),sin(2*pi/12*time),cos(2*pi/12*time)];
xc=ac\conc;
i=find(finite(flow));
nf=length(i);
tf=time(i);
af=[ones(nf,1),sin(2*pi/12*tf),cos(2*pi/12*tf)];
xf=af\flow(i);
irep=find(isnan(flow));
frep=[ones(length(irep),1),sin(2*pi/12*time(irep)),cos(2*pi/12*time(irep))]*xf;
fall=flow;
fall(irep)=frep;
tall=[0;time];
call=[[1 0 1]*xc;conc];
fall=[[1 0 1]*xf;fall];
w=ones(1,n+1);
w(2:2:n)=4*w(2:2:n);
w(3:2:n-1)=2*w(3:2:n-1);
w=w/3;
%With the total release:
total=w*(call.*fall);


____________________________


Store the following data under the name waste.dat


        1       6.0674          4.2844
        2       5.9325          6.4692
        3       8.1253          8.0648
        4       7.8858          10.4524
        5       5.3535          NaN
        6       6.1909          NaN
        7       4.6892          8.5155
        8       2.3643          6.3340
        9       2.3273          4.4329
        10      2.5766          2.7618
        11      3.3133          2.5798
        12      5.7258          3.2963
        13      5.9117          4.5298
        14      9.7813          6.6405
        15      7.8636          8.2091
        16      7.7120          NaN
        17      7.5668          10.7833
        18      5.0593          10.1449
        19      3.4044          8.4611
        20      1.5696          6.5678
        21      2.2944          4.7942
        22      1.0657          2.5951
        23      4.2143          2.6029
        24      6.6236          2.5862