1. Not finding help here? Sign up for a free 30min tutor trial with Chegg Tutors
    Dismiss Notice
Dismiss Notice
Join Physics Forums Today!
The friendliest, high quality science and math community on the planet! Everyone who loves science is here!

Matlab data-analysis problems (systematic error, distributions)

  1. Oct 20, 2015 #1
    The problem statement, all variables and given/known data

    I have a few data-analysis problems due to Thursday,

    1. Assume that a sandbag is dropped at different heights and the observations are (z
    i;ti) pairs. Physical model for a free fall is z=½gt2. Assume that the height measurement z has an additive random error v. The observation model is then z=h(θ;t) +v,where θ is the unknown parameter vector. Write the observation models (withpaper and pen) for the following cases by treating systematic errors as model parameters:
    a) height z has a systematic error z0 (zobs=zreal+z0)
    b) falling time thas a systematic error t0 (tobs=treal+t0)
    c) both z and t have additive systematic errors. Simulate results by using Matlab for graphic presentation.

    2. Assume that a sandbag is dropped at different heights and the observations are (zi,ti)pairs. It’s known that the clock advances p percents, but the clock can be considered precise (no random error for t). In addition, assume that the height z has a systematic error z0. Write the (linear) observation model withpaper and pen. How would you solve a value for g by using the model?

    4.
    Simulate with
    Matlab the distributions of:
    y1=sin(x)
    y2=cos(x)
    y3=log(x)
    when x is uniformly distributed between [−π, π] in Eq. (1) and in Eq. (2), and uniformly distributed
    between [1,2] in Eq. (3).

    5. The second order polynomial function is of the form
    y=p(1)x2+(2)x+p(3). The exact roots of the polynomial are x1=1 and x2=2. For some reason, there is a uniformly distributed error in the observed polynomial coefficients p(1), p2) and p(3). Here, we examine the distribution of observed roots in that case. (Matlab simulation)

    a) Solve the exact polynomial coefficients p(1), p(2) and p(3) (command poly) and plot the polynomial y
    in range x∈[−1,...,4] (command polyval)
    b) Generate N= 5000 observations for polynomial coefficients with uniformly distributed noise.
    Solve the roots for each observation (command roots).
    c) Plot a histogram of observed roots x1 and x2

    The attempt at a solution

    1.
    a.)
    z=h(θ;t) +v +z0
    b.) z=h(θ;t+t0) +v
    c.) z=h(θ;t+t0) +v+z0

    %% Problem 1/a

    g=10; % gravity constant
    h_err=2; % systematic error of height
    t=[0.3 0.5 0.7 1 2 2.5 3 3.7 3.8 4]';
    tt=[0:.1:4]';

    h=0.5*g*t.^2 + h_err + 10*randn(size(t));

    H=[0.5*t.^2 ones(size(t))];

    th=H\h

    HH=[0.5*tt.^2 ones(size(tt))];

    figure(1),clf
    plot(t,h,'+',tt,HH*th)
    xlabel 't'
    ylabel 'h'

    %% Problem 1/b

    g=10; % gravity constant
    t_err=2; % systematic error of time
    t=[0.3 0.5 0.7 1 2 2.5 3 3.7 3.8 4]';
    tt=[0:.1:4]';

    h=0.5*g*(t+t_err).^2 + 10*randn(size(t));

    H=[0.5*t.^2 ones(size(t))];

    th=H\h;

    HH=[0.5*tt.^2 ones(size(tt))];

    figure(2),clf
    plot(t,h,'+',tt,HH*th)
    xlabel 't'
    ylabel 'h'

    %% Problem 1/c


    g=10; % gravity constant
    h_err=2; % systematic error of height
    t_err=0;5; % systematic error of time
    t=[0.3 0.5 0.7 1 2 2.5 3 3.7 3.8 4]';
    tt=[0:.1:4]';

    h=0.5*g*(t+t_err).^2 + h_err + 10*randn(size(t));

    H=[0.5*t.^2 ones(size(t))];

    th=H\h;

    HH=[0.5*tt.^2 ones(size(tt))];

    figure(3),clf
    plot(t,h,'+',tt,HH*th)
    xlabel 't'
    ylabel 'h'

    %% Problem 2

    g=10; % gravity constant
    t_err=0;01*t; % systematic error of time, e.g.clock is advancing one percent
    t=[0.3 0.5 0.7 1 2 2.5 3 3.7 3.8 4]';
    tt=[0:.1:4]';

    h=0.5*g*(t+t_err).^2 + h_err;

    H=[0.5*t.^2 ones(size(t))];

    th=H\h

    HH=[0.5*tt.^2 ones(size(tt))];

    figure(4),clf
    plot(t,h,'+',tt,HH*th)
    xlabel 't'
    ylabel 'h'

    %% Problem 4

    x_1=unifdist(-pi,pi,2,1);

    y_1=sin (x_1);
    y_2=cos (x_1);


    x_3=unifdist(1,2,2,1);
    y_3=log (x_3);

    I will appreciate any help as I am not very familiar with matlab yet \\ could sb please check what I've done so far
    p.s. not sure if this is the right forum section for matlab-based questions
     
    Last edited: Oct 20, 2015
  2. jcsd
  3. Oct 20, 2015 #2

    RUber

    User Avatar
    Homework Helper

    There is a lot in this post.
    Let me start with 1a.
    You have an additive random error in h, and a systematic (constant) error in h, right?
    h=0.5*g*t.^2 + h_err + 10*randn(size(t));
    You have 10 * randn, which gives you a normally distributed error with standard deviation of 10. Is that v? That seems really high for an additive random error, but I can work with it.

    In 1b, you are using (time + time_error) to compute height. This does not take into consideration that you have data collected for z, right? Is that supposed to be the case?
    ---
    I suppose I am confused about what your output is supposed to represent. Are the red curves your 2nd degree polynomial best fit curves for the data, z_obs, t_obs? That would make the most sense...Maybe adding a second curve to show the true relation might be helpful as well for visualizing the impact of these errors.

    The way I see this is:
    Given data ## (z_i, t_i), ## and model ##z =\frac12 g t^2 = h(\theta,t)## You have a model (based on accurate measurements of z and t): ##z_{obs} = \frac12 g t_{true}^2 + v## where v is your random error ##v \sim norm(0,10)##.
    I used the 10 as the standard deviation based on your model.

    If you add in measurement error in z_i, then your model becomes
    ##z_{true} =\frac12 g t^2##, and ##z_{obs} = z_{true} + v + z_0## so...
    ##z_{obs} = \frac12 g t_{true}^2 + v+ z_0##

    If you add in measurement error in t_i, then what happens?
    ##z_{true} =\frac12 g t_{true}^2##
    ##z_{obs} = \frac12 g t_{true}^2 + v+ z_0##
    But what is t_true? ##t_{obs} = t_{true}+t_0##
    ##z_{obs} = \frac12 g (t_{obs}-t_0 )^2 + v+ z_0##
    If you expand that out, you get:
    ##z_{obs} = \frac12 g (t_{obs}^2-2t_{obs}t_0+t_0^2 ) + v+ z_0##
    This should make sense, because your data for z should not be impacted by your error in measuring time.
    Make the plot of z_obs vs. t_obs, that should be your observational model.
    What should be changed is maybe your estimation of the curve, since all your times are shifted right by 2 (using your constant error of 2).
     
  4. Oct 20, 2015 #3
    Yes, v is 'random error' which is normally distributed. I can make it's deviation smaller so let it be 1.
    I think so.
    They should be. :oldsmile:

    How should I change it?
     
  5. Oct 20, 2015 #4

    RUber

    User Avatar
    Homework Helper

    Your t data set in Matlab is the true t, right?
    But your observational model should be based off of observed z and observed t.
    So, your observed t should b the true t plus your error.
    Your observed z will not be affected by any error in t, so you can calculate that with the true t data.
    Then you fit your curve to the observed data.

    Specifically, I would change your line for h to give you the observed z values ... this should be (for part b)
    h=0.5*g*(t).^2 + 10*randn(size(t));
    See how it is the true function of t plus the sampling error in measuring the height?

    Then your H data is the prediction from the model based on your observed time data...so this should be:
    H=[0.5*(t+t_err).^2 ones(size(t))];
    See how it is the function of the observed time?

    Then in your plots, I would change the + to appear at the observed time and height.
    plot(t+t_err,h,'+',tt,HH*th)

    But you will notice that your reference line for the fit will end early, so then add t_err to the end of your range for tt.
    tt=[0:.1:4+t_err]';

    These edits should help you understand how this model is actually working. Apply that to fix 1c.
     
  6. Oct 20, 2015 #5

    RUber

    User Avatar
    Homework Helper

    For problem 4, I do not have a function called unifdist in my Matlab version. What do the 2,1 at the end of the function refer to?
    I normally use rand() to call a uniform distribution between 0 and 1.
    You can adjust this by multiplying by the size of the interval and adding/subtracting to recenter. So for -pi to pi, you would have
    x1 = rand(N,1)*2*pi - pi
    N is the number of points you want to generate.

    Does your unifdist function allow you to specify how many points you want to generate?
    What does simulate the distribution mean to you? Is this something that you would sample from later? The hist(y) function will help to visualize the functions' effects on the uniform distribution if you generate enough points to see the pattern.
     
  7. Oct 21, 2015 #6

    RUber

    User Avatar
    Homework Helper

    Have you had any luck on this set of problems?
     
  8. Oct 21, 2015 #7
    Yes, it is a function I created to pick N uniformly distributed numbers. But now I changed it to
    x_1 = rand(N,1)*2*pi - pi
    x_2= rand(N,1)*2 - 1.5

    Well, not really. I am almost giving up :/

    matlab is giving error messages and only plots one graph
     
  9. Oct 21, 2015 #8

    RUber

    User Avatar
    Homework Helper

    Sorry to hear you are so frustrated. I will comment more explicitly on your first two codes to help you see what to do. 1a doesn't need to be changes, I just point out what the lines are doing.
    Essentially, your h is your observed z values and H is your estimated z based on t_obs. In 1a, your t_obs is accurate, so no problems. In 1b, you used t_obs in your line for h, and that was a problem.

    For 1a, You have:
    ----
    %% Problem 1/a
    g=10; % gravity constant
    h_err=2; % systematic error of height
    t=[0.3 0.5 0.7 1 2 2.5 3 3.7 3.8 4]';
    tt=[0:.1:4]'; %This is the range of times you want to plot the curve on

    h=0.5*g*t.^2 + h_err + 10*randn(size(t)); %This is creating your Z_obs

    H=[0.5*t.^2 ones(size(t))]; %This is creating your Z estimates based on t_obs

    th=H\h

    HH=[0.5*tt.^2 ones(size(tt))]; %This is your best fit curve for the data at points in tt

    figure(1),clf
    plot(t,h,'+',tt,HH*th) %+ is (observed t, observed z), line is best fit curve.
    xlabel 't'
    ylabel 'h'
    ----
    I see no real issues with this one.
    -----
    For 1b, you have:
    -----
    %% Problem 1/b
    g=10; % gravity constant
    t_err=2; % systematic error of time
    t=[0.3 0.5 0.7 1 2 2.5 3 3.7 3.8 4]'; % t + t_err is your observed time
    tt=[0:.1:4]'; % Add t_err to upper bound of this interval for better plot

    h=0.5*g*(t+t_err).^2 + 10*randn(size(t)); % This should be 0.5*g*(t).^2 + 10*randn(size(t)); since it is calculating Z_obs

    H=[0.5*t.^2 ones(size(t))]; % This should be =[0.5*(t+t_err).^2 ones(size(t))]; since it is calculating estimated z from t_obs.

    th=H\h;

    HH=[0.5*tt.^2 ones(size(tt))];

    figure(2),clf
    plot(t,h,'+',tt,HH*th) % Should be plot(t+t_err,h,'+',tt,HH*th) so that + represents (Z_obs, t_obs)
    xlabel 't'
    ylabel 'h'
    _____


    Do these edits make sense? You should at least get plots from both of these.
     
  10. Oct 21, 2015 #9
    yes, now 1 a and b works
     
    Last edited: Oct 21, 2015
  11. Oct 21, 2015 #10

    RUber

    User Avatar
    Homework Helper

    1c should just be adding back in the error in z_obs from 1a on top of the code for 1b.

    h=0.5*g*t.^2 + h_err + 10*randn(size(t)); %This is creating your Z_obs
     
  12. Oct 21, 2015 #11

    RUber

    User Avatar
    Homework Helper

    Also, in 1c, your code has the line:
    t_err=0;5; % systematic error of time
    That semicolon after 0 ends the line...so 5 is doing nothing.
     
  13. Oct 21, 2015 #12
    So now it looks like this...

    %% Problem 1/a
    g=10; % gravity constant
    h_err=2; % systematic error of height
    t=[0.3 0.5 0.7 1 2 2.5 3 3.7 3.8 4]';
    tt=[0:.1:4]'; %This is the range of times I want to plot the curve on

    h=0.5*g*t.^2 + h_err + 10*randn(size(t)); %This is creating Z_obs

    H=[0.5*t.^2 ones(size(t))]; %This is creating Z estimates based on t_obs

    th=H\h;

    HH=[0.5*tt.^2 ones(size(tt))]; %This is your best fit curve for the data at points in tt

    figure(1)
    plot(t,h,'+',tt,HH*th) %+ is (observed t, observed z), line is best fit curve.
    xlabel 't'
    ylabel 'h'

    %% Problem 1/b
    g=10; % gravity constant
    t_err=2; % systematic error of time
    t=[0.3 0.5 0.7 1 2 2.5 3 3.7 3.8 4]'; % t + t_err is observed time
    tt=[0:.1:4]'+t_err; % Added t_err to upper bound of this interval for better plot

    h=0.5*g*(t).^2 + 10*randn(size(t)); % This is calculating Z_obs

    H=[0.5*(t+t_err).^2 ones(size(t))]; % This is calculating estimated z from t_obs.

    th=H\h;

    HH=[0.5*tt.^2 ones(size(tt))];

    figure(2)
    plot(t+t_err,h,'+',tt,HH*th) %+ represents (Z_obs, t_obs)
    xlabel 't'
    ylabel 'h'

    %% Problem 1/c


    g=10; % gravity constant
    h_err=2; % systematic error of height
    t_err=0.5; % systematic error of time
    t=[0.3 0.5 0.7 1 2 2.5 3 3.7 3.8 4]';
    tt=[0:.1:4]'+t_err; %This is the range of times I want to plot the curve on

    h=0.5*g*(t).^2 + h_err + 10*randn(size(t)); %This is calculating Z_obs

    H=[0.5*(t+t_err).^2 ones(size(t))]; %This is creating Z estimates based on t_obs

    th=H\h;

    HH=[0.5*tt.^2 ones(size(tt))];

    figure(3)
    plot(t+t_err,h,'+',tt,HH*th)
    xlabel 't'
    ylabel 'h'

    %% Problem 2


    g=10; % gravity constant
    t_err=0.01*t; % systematic error of time, e.g. clock is advancing one percent
    t=[0.3 0.5 0.7 1 2 2.5 3 3.7 3.8 4]';
    tt=[0:.1:4]';%This is the range of times I want to plot the curve on

    h=0.5*g*(t).^2 + h_err;

    H=[0.5*(t+t_err).^2 ones(size(t))];

    th=H\h

    HH=[0.5*tt.^2 ones(size(tt))];

    figure(4)
    plot(t+t_err,h,'+',tt,HH*th)
    xlabel 't'
    ylabel 'h'

    %% Problem 4


    x=rand(50,1)*2*pi-pi
    y_1=sin (x);
    hist(y_1,50)

    x=rand(50,1)*2*pi-pi
    y_2=cos (x);
    hist(y_2,50)

    x_2=rand(50,1)*2-1.5;
    y_3=log (x_2);
    hist(y_3,50)

    what does 'clf' do?
    when i click run i get only figures 1,2,3,4 where number 4 is a histogram so there is something wrong with them?
    and in problem 2 how would I solve a value for g by using the model? least squares method?
     
    Last edited: Oct 21, 2015
  14. Oct 21, 2015 #13

    RUber

    User Avatar
    Homework Helper

    I have never used clf before.
    If you are running all the lines at the same time, you will want to use the
    figure( 5 ), figure(6), etc. before your plot command to declare which window you want to plot into. If you are doing them 1 by 1, there is no need.

    In this part of 4:
    x_2=rand(50,1)*2-1.5;
    y_3=log (x_2);
    hist(y_3,50)

    Your range for x_2 is off. This will generate a uniform distribution between -1.5 and .5, you are looking for uniform on [1,2], right?
    x_2=rand(50,1)
    Is uniform on [0,1], so just add 1 to shift that interval to the right.
    x_2=rand(50,1)+1.

    50 bins for your histogram is way too many for 50 data points. Either make much larger x vectors or reduce the bins in your hist(y, bins) command.
    g is one of the outputs in "th", check the other models to see.
     
  15. Oct 21, 2015 #14
    How about no. 5?

    %% Problem 5/a


    %y=p(1)x^2+(2)x+p(3) and two known roots are x_1=1, x_2=2
    r=[1,2];
    p=[2,1,0];

    p=poly(r);


    %% Problem 5/b

    p_1=1+rand(5000,1)
    p_2=-3+rand(5000,1)
    p_3=2+rand(5000,1)

    %% Problem 5/c
     
  16. Oct 22, 2015 #15

    RUber

    User Avatar
    Homework Helper

    You have the first step right for 5a. I don't know why you have a line that says p=[2,1,0] followed by another line defining p...this is unnecessary.
    %Step 1. Input parameters
    r = [1,2]; % you did this right.
    p = poly(r); % you did this right.
    %Step 2. Plot the polynomial
    x = linspace(-1,4,500); %linspace allows you to put in endpoints...the 500 is how many points you want, if you leave this blank it will default to 100.
    y = polyval(p,x); % plots polynomial with coefficients [p] on points [x].
    plot(x,y)

    For 5b, you are told to add uniformly distributed noise. Noise is normally centered on the actual values, so I would add 2*k*(rand()-0.5) to put it into the range [-k,k].
    You can also make all your p's in one matrix like:
    k = .5;
    P = ones(5000,1)*p+ 2*k*(rand(5000,3)-0.5);
    Make your roots collecting matrix:
    Roots = zeros(5000,2); % you will fill this up using the roots command.

    I don't know how to take roots on a full matrix, so you may have to use a for loop -- feed one row at a time from P into the roots function.
    I found that if your k is too large, you might start to get imaginary roots, so keep it relatively small.

    For 5c, the histogram is the same command you did before. Your Roots should come out in a 5000x2 matrix, so separately do hist(Roots(:,1)) and hist(Roots(:,2)).
    If you have imaginary roots, shrink k or use the real command to get the nearest real value to the root...hist(real(Roots(:,1))) and hist(real(Roots(:,2))).
     
  17. Oct 24, 2015 #16
    Thank you!
     
Know someone interested in this topic? Share this thread via Reddit, Google+, Twitter, or Facebook

Have something to add?
Draft saved Draft deleted



Similar Discussions: Matlab data-analysis problems (systematic error, distributions)
  1. MATLAB errors (Replies: 4)

  2. Matlab Percent Error (Replies: 1)

Loading...