Problem at Work with Regression

Diffy · Apr 12, 2010

Hi

I've been tasked with making a sort of sensitivity analysis tool. The goal was to get as many parameters as possible and then use them to build a model that would allow users to change variables a bit and see what happens to the one dependent variable.

So I used a multivariable regression tool to come up with a single equation:

y = c1*x1 + c2*x2 + ... + cn*xn

Here c's are the calculated coefficients, x's are the independent variables and y is the one dependent value.
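For anyone who wants to reproduce this kind of fit, here is a minimal sketch. It assumes ordinary least squares with no intercept term (matching the equation as written), since the post does not say which regression tool was used. The names `fit_linear` and `predict` and the data are made up for illustration.

```python
def fit_linear(rows, ys):
    """Least-squares coefficients c for y ~ c1*x1 + ... + cn*xn (no intercept)."""
    n = len(rows[0])
    # Normal equations (X^T X) c = X^T y, stored as an augmented matrix.
    a = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * y for r, y in zip(rows, ys))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(a[k][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("variables are linearly dependent")
        a[col], a[pivot] = a[pivot], a[col]
        for k in range(n):
            if k != col:
                factor = a[k][col] / a[col][col]
                a[k] = [u - factor * v for u, v in zip(a[k], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


def predict(coeffs, xs):
    """Evaluate y = c1*x1 + ... + cn*xn for one set of inputs."""
    return sum(c * x for c, x in zip(coeffs, xs))


# Made-up data generated from y = 2*x1 + 3*x2, so the fit should recover (2, 3).
rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0)]
ys = [2 * x1 + 3 * x2 for x1, x2 in rows]
coeffs = fit_linear(rows, ys)

# Sensitivity check: raise x1 by 10% and compare the predicted y values.
base = predict(coeffs, (2.0, 1.0))
bumped = predict(coeffs, (2.2, 1.0))
print(coeffs, base, bumped)
```

In a model like this, the change in y for a change in one variable is just that variable's coefficient times the change, so the sign of each coefficient decides whether y moves up or down.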

Then I built a tool that lets users adjust whichever variables they want and see how that changes y. The problem is that when some variables are adjusted up, y goes down, which does not make much sense in our business model.

So I have a few questions
1) Should I be using a different type of regression?
2) Is there a better way to go about this?
3) Are there ways to influence the coefficients I calculate?

Additional info:
Please let me know if additional info is needed.
R^2 = .978
F = 93.967

Diffy · Apr 14, 2010

So maybe I will try asking in a different way since I did not get any responses yet.

I want to build a model based on variables to predict a value, but my variables have very little correlation with the value I want to predict. What I mean by this is that when I look at scatter plots of each variable versus the value I want to predict, there is no immediately obvious best-fit line; the graphs are truly scattered!
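To put a number on "very little correlation", one option is the Pearson correlation coefficient of each variable against y. The sketch below uses made-up data and a hypothetical helper `pearson_r`. The data are built so that y equals x1 + x2 exactly, yet x1 on its own is only weakly correlated with y. This is only an illustration that a scattered single-variable plot does not by itself rule out a good joint fit; it says nothing about the actual data in this thread.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up data: x2 mostly cancels x1, and y is exactly x1 + x2.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [0.0, -3.0, -4.0, -3.0, -4.0, -7.0]
y = [a + b for a, b in zip(x1, x2)]

r1 = pearson_r(x1, y)  # weak correlation, although y depends on x1
r2 = pearson_r(x2, y)
print(r1, r2)
```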

Is there even a way to build an accurate model?

0xDEADBEEF · Apr 14, 2010

Lies, damn lies and statistics. You are touching a very sensitive subject. The main problem if you are asking physicists is this: We make a model and then we fit it to our data. If the fitted model yields wrong results we discard it, or say that it produces wrong results in variable a, but say that it explains some values for variable b better than other models.

You have decided that you have some process that produces a variable y. Your model assumes that it can be expressed as a function of some variables which has a dominating linear component overshadowed by noise.

In physics we usually know the noise. We can measure whether it fits our model. And now we come to your job. You are the person who claims that your data can be modeled in the manner you stated. Your model seems to yield wrong results, so maybe you should discard it. Why is there noise in your data? Can you reduce it? Is it Gaussian?
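One rough way to look at the noise is to compute the residuals (measured y minus fitted y) and check their skewness and excess kurtosis, both of which are close to zero for Gaussian noise. The sketch below is a crude screen, not a formal normality test; the helper names, coefficients, and data are made up for illustration, and a real check needs far more than four points.

```python
def residuals(coeffs, rows, ys):
    """Measured y minus the model value c1*x1 + ... + cn*xn for each row."""
    return [y - sum(c * x for c, x in zip(coeffs, r))
            for r, y in zip(rows, ys)]


def skew_and_excess_kurtosis(values):
    """Sample skewness and excess kurtosis from central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


# Hypothetical fitted coefficients and made-up measurements.
coeffs = (2.0, 3.0)
rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0)]
ys = [8.5, 6.5, 18.5, 16.5]

res = residuals(coeffs, rows, ys)
skew, kurt = skew_and_excess_kurtosis(res)
print(res, skew, kurt)
```

Values far from zero suggest the noise is not Gaussian, in which case plain least squares may not be the best fitting criterion.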

Of course there are more tools you could use, and maybe squeeze more from your data, but we don't even know what kind of data you have, and you are probably better advised to look at what other people are doing in your field.

On the other hand your fit parameters don't look so bad. Maybe this site gives you some ideas: http://documents.wolfram.com/applications/eda/FittingDataToLinearModelsByLeast-SquaresTechniques.html
