
Problem at Work with Regression

  1. Apr 12, 2010 #1
    Hi

    I've been tasked with making a sort of sensitivity analysis tool. The goal was to get as many parameters as possible and then use them to build a model that would allow users to change variables a bit and see what happens to the one dependent variable.

    So I used a multi variable regression tool to come up with a single equation:

    y = c1*x1 + c2*x2 + ... + cn*xn

    Here c's are the calculated coefficients, x's are the independent variables and y is the one dependent value.
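A minimal sketch of how an equation of this form might be fit and then used for "what if" changes, assuming ordinary least squares with no intercept term, as the equation above is written. The names `fit_linear` and `predict` and the small data set are hypothetical and are not from the original post; the regression tool actually used is not named in the thread.

```python
def fit_linear(X, y):
    """Least-squares coefficients for y = c1*x1 + ... + cn*xn (no intercept).

    X is a list of rows, one row of n predictor values per observation.
    """
    n = len(X[0])
    # Build the normal equations (X^T X) c = X^T y.
    A = [[sum(row[i] * row[j] for row in X) for j in range(n)] for i in range(n)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(n)]

    # Solve by Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are linearly dependent")
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for k in range(col, n):
                A[r][k] -= factor * A[col][k]
            b[r] -= factor * b[col]

    # Back substitution.
    c = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(A[i][k] * c[k] for k in range(i + 1, n))
        c[i] = (b[i] - tail) / A[i][i]
    return c


def predict(c, x):
    """Evaluate y = c1*x1 + ... + cn*xn for one set of predictor values."""
    return sum(ci * xi for ci, xi in zip(c, x))


# Hypothetical, noise-free data generated from y = 2*x1 + 3*x2.
X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
y = [2 * x1 + 3 * x2 for x1, x2 in X]

coeffs = fit_linear(X, y)

# Sensitivity check: raise x1 by one unit and compare predictions.
base = predict(coeffs, [3, 4])
bumped = predict(coeffs, [4, 4])
delta = bumped - base
```

In a linear model like this, changing one variable by one unit while holding the others fixed changes y by exactly that variable's coefficient, so the sign of each coefficient decides whether y goes up or down.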

    Then I built a tool that lets users adjust the variables they choose and see how that changes y. The problem is that when some variables are adjusted up, y goes down, which does not make much sense in our business model.

    So I have a few questions
    1) Should I be using a different type of regression?
    2) Is there a better way to go about this?
    3) Are there ways to influence the coefficients I calculate?


    Additional info:
    Please let me know if additional info is needed.
    R^2 = .978
    F = 93.967
     
  3. Apr 14, 2010 #2
    So maybe I will try asking in a different way since I did not get any responses yet.

    I want to build a model based on variables to predict a value, but my variables have very little correlation with the value I want to predict. What I mean is that when I look at scatter plots of each variable versus the value I want to predict, there is no immediately obvious best-fit line; the graphs are truly scattered!
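One way to put a number on what the scatter plots show is the Pearson correlation coefficient between each variable and the value being predicted. The sketch below is only an illustration: `pearson_r` is a hypothetical helper name and the data are made up, not taken from the thread.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: one predictor, one perfectly linear response,
# and one response that is scattered with respect to the predictor.
xs = [1, 2, 3, 4, 5]
ys_line = [2, 4, 6, 8, 10]
ys_scatter = [3, 1, 4, 1, 5]

r_line = pearson_r(xs, ys_line)
r_scatter = pearson_r(xs, ys_scatter)
```

A value near +1 or -1 corresponds to points lying close to a straight line, while a value near 0 corresponds to the "truly scattered" picture described above. The coefficient only measures linear association for one variable at a time.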

    Is there even a way to build an accurate model?
     
  4. Apr 14, 2010 #3
    Lies, damn lies and statistics. You are touching a very sensitive subject. The main problem if you are asking physicists is this: We make a model and then we fit it to our data. If the fitted model yields wrong results we discard it, or say that it produces wrong results in variable a, but say that it explains some values for variable b better than other models.

    You have decided that you have some process that produces a variable y. Your model assumes that it can be expressed as a function of some variables which has a dominating linear component overshadowed by noise.

    In physics we usually know the noise, and we can measure whether it fits our model. And now we come to your job. You are the person who claims that your data can be modeled in the manner you stated. Your model seems to yield wrong results, so maybe you should discard it. Why is there noise in your data? Can you reduce it? Is it Gaussian?
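A rough sketch of how the "is it Gaussian?" question could be checked on the residuals (observed minus predicted values). It relies on the fact that for a normal distribution about 68% of values fall within one standard deviation of the mean and about 95% within two. The function name `residual_summary` and the synthetic data are assumptions for illustration only; this is a quick sanity check, not a formal normality test.

```python
import random
from math import sqrt


def residual_summary(observed, predicted):
    """Summarize residuals and the share that falls within 1 and 2 std devs."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    n = len(residuals)
    mean = sum(residuals) / n
    # Sample standard deviation of the residuals.
    std = sqrt(sum((r - mean) ** 2 for r in residuals) / (n - 1))
    within_1 = sum(abs(r - mean) <= std for r in residuals) / n
    within_2 = sum(abs(r - mean) <= 2 * std for r in residuals) / n
    return {
        "mean": mean,
        "std": std,
        "within_1_std": within_1,
        "within_2_std": within_2,
    }


# Synthetic example: model predictions plus Gaussian noise with std dev 2.
rng = random.Random(0)
predicted = [float(i % 10) for i in range(10000)]
observed = [p + rng.gauss(0.0, 2.0) for p in predicted]

summary = residual_summary(observed, predicted)
```

If the two shares are far from roughly 0.68 and 0.95, or the mean residual is far from zero, the noise is probably not behaving the way a plain least-squares fit assumes.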

    Of course there are more tools you could use, and maybe squeeze more from your data, but we don't even know what kind of data you have, and you are probably better advised to look at what other people are doing in your field.

    On the other hand your fit parameters don't look so bad. Maybe this site gives you some ideas: http://documents.wolfram.com/applications/eda/FittingDataToLinearModelsByLeast-SquaresTechniques.html [Broken]
     
    Last edited by a moderator: May 4, 2017