Probability from an optimization problem

  • Context: Graduate
  • Thread starter: hosseinGhafari
  • Tags: Optimization, Probability

Discussion Overview

The discussion revolves around inferring probabilities from a cost function in the context of a linear regression problem. Participants explore the relationship between the cost function, probability distributions, and the computation of parameters such as covariance and Fisher information. The scope includes theoretical aspects of regression analysis and probabilistic modeling.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested

Main Points Raised

  • One participant presents a cost function comprising a sum of quadratic losses and a regularization term, questioning how to infer probabilities from it.
  • Another participant asks for clarification on what probability is being inferred, suggesting that more context is needed.
  • A participant describes the full problem involving linear regression, specifying the cost function and expressing the need to compute p((y,x)|w) without assuming a logistic regression distribution.
  • There is a request for clarification on the meaning of "y" in the context of the problem, indicating that the notation p(y|x,w) is not clear to some participants.
  • A participant clarifies that "y" represents a label in a classification problem, specifically +1 or -1, and that p(y|w,x) denotes the probability of these labels given w and x.
  • Concerns are raised about the need for a clear definition of the probability space and the random process that generates the outcomes associated with "y".
  • Another participant questions whether the weights must be chosen such that the sum of the weighted inputs equals +1 or -1, or if it is rounded in some manner.

Areas of Agreement / Disagreement

Participants express varying levels of understanding regarding the definitions and implications of the probability notations used in the problem. There is no consensus on how to proceed with the inference of probabilities or the interpretation of the cost function in relation to the probability distributions.

Contextual Notes

Participants highlight the need for clearer definitions and explanations regarding the variables and the probability space involved in the problem. The discussion reflects uncertainty about the relationships between the cost function, the parameters, and the probabilistic interpretations.

hosseinGhafari
I have a cost function which consists of a sum of quadratic losses plus a term which regularizes the function. My problem is: is there any way to infer a probability from such a cost function?
 
Probability of what?

It would be useful to see the full problem.
 
Here is the full problem: I have a linear regression, [itex]\langle w,x\rangle[/itex], with cost function [itex]\sum_i (y_i - \langle w,x_i\rangle)^2 + \lambda \|w\|^2[/itex]. I need to compute [itex]p((y,x)|w)[/itex]. I know that sometimes it is good to assume a logistic regression distribution, but if we cannot make such an assumption, is there any way to compute [itex]p(y|x,w)[/itex] or [itex]p((y,x)|w)[/itex]?
Also, is there any way to compute the covariance of the parameter [itex]w[/itex]? Obviously, it is related to the above probability via the Cramér-Rao bound and the Fisher information. I also want to compute the Fisher information.
Thanks
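One standard way to connect this cost function to a probability, which the thread itself never settles, is to assume Gaussian observation noise: if [itex]y_i = \langle w, x_i\rangle + \epsilon_i[/itex] with [itex]\epsilon_i \sim N(0,\sigma^2)[/itex], then minimizing the quadratic-plus-ridge cost is MAP estimation, and [itex]p(y|x,w)[/itex] is a normal density centered at [itex]\langle w,x\rangle[/itex]. The sketch below rests entirely on that assumption; the data, the noise scale `sigma`, and the helper name `p_y_given_xw` are illustrative, not from the thread.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (illustrative only): y = <w_true, x> + Gaussian noise.
n, d, sigma = 200, 3, 0.5
w_true = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(n, d))
y = X @ w_true + rng.normal(scale=sigma, size=n)

# Ridge minimizer of sum_i (y_i - <w, x_i>)^2 + lam * ||w||^2,
# via the normal equations (X^T X + lam I) w = X^T y.
lam = 0.1
w_hat = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def p_y_given_xw(y_i, x_i, w, sigma):
    """Density p(y | x, w) under the assumed Gaussian noise model."""
    mu = x_i @ w
    return np.exp(-(y_i - mu) ** 2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)
```

Without some such distributional assumption, the cost function alone pins down only a point estimate of [itex]w[/itex], not a likelihood.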
 
hosseinGhafari said:
Here is the full problem: I have a linear regression, [itex]\langle w,x\rangle[/itex], with cost function [itex]\sum_i (y_i - \langle w,x_i\rangle)^2 + \lambda \|w\|^2[/itex].

That is not a clear statement of a problem. If you can't explain the problem, perhaps you can give a link to an online example of a similar problem.

The simplest way to describe a regression problem would be to describe the format of the data. Explain which variable is to be predicted from which other variable(s).

I need to compute [itex]p((y,x)|w)[/itex].

What does "y" represent?
 
Hi Stephen, here is the required information about the problem.
Each data point [itex]x_i[/itex] is an n-dimensional vector; we have a set [itex]\{x_i\}, i = 1, 2, \ldots, n[/itex]. [itex]y = \langle w,x\rangle[/itex]; by [itex]\langle w,x\rangle[/itex] I mean [itex]w^T x[/itex], i.e. the inner product of w and x. We want to find w as an n-dimensional vector such that the above-mentioned cost function is minimized. y is to be predicted from x. I hope this makes the statement of the problem clear. Regarding the format of the data, we can only say that it is an n-dimensional vector.

Thank you
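On the covariance and Fisher-information part of the question: for the linear-Gaussian model [itex]y = \langle w,x\rangle + N(0,\sigma^2)[/itex] (again an assumption, since the thread fixes no distribution), both have simple closed forms, as the sketch below shows. The design matrix and noise scale here are illustrative placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, sigma = 500, 3, 0.5
X = rng.normal(size=(n, d))  # illustrative design matrix, rows are x_i

# Under y_i = <w, x_i> + N(0, sigma^2), the Fisher information matrix
# of w is I(w) = X^T X / sigma^2 (it does not depend on w itself).
fisher = X.T @ X / sigma**2

# Cramer-Rao bound: the covariance of any unbiased estimator of w is
# at least the inverse Fisher information, sigma^2 (X^T X)^{-1}.
crb = np.linalg.inv(fisher)
```

For ordinary least squares this bound is attained exactly; the ridge estimator is biased, so its covariance is not governed by the bound in the same way.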
 
hosseinGhafari said:
I hope this makes the statement of the problem clear.

What you said is clear. However, the meaning of p(y|x,w) or other notation involving "p(y..." is not clear. As mfb asked, what is the event "y"?

In your problem, presumably you have data as an array of vectors

y[1], x[1][1], x[1][2],...x[1][n],
y[2], x[2][1],x[2][2],... x[2][n],
...
y[m], x[m][1],x[m][2],...x[m][n]

And you have a vector of constants

[itex]w[1],w[2],...w[n][/itex]

And you have a model in the variables [itex]Y[/itex] and [itex]X[1],X[2]...X[n][/itex]

[itex]Y = \sum_{i=1}^n w_i X_i[/itex]


But the meaning of "p(y...)" is not clear.
 
Y is the label in this classification problem; it is +1 or -1. In fact, in this problem we are trying to find a hyperplane which discriminates samples with plus or minus labels. [itex]P(y|w,x)[/itex] is the probability of the label y = +1 or -1 given w and x.
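One conventional answer to the rounding question raised next, not something the poster confirms, is to predict [itex]\operatorname{sign}(\langle w,x\rangle)[/itex]: fit the real-valued regression to the ±1 labels and threshold at zero ("least-squares classification"). The sketch below uses made-up two-cluster data purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Two Gaussian clusters with labels +1 / -1 (illustrative data only).
n = 100
X_pos = rng.normal(loc=+1.0, size=(n, 2))
X_neg = rng.normal(loc=-1.0, size=(n, 2))
X = np.vstack([X_pos, X_neg])
y = np.concatenate([np.ones(n), -np.ones(n)])

# Ridge fit of the real-valued score <w, x> to the +/-1 labels.
lam = 1.0
w = np.linalg.solve(X.T @ X + lam * np.eye(2), X.T @ y)

# <w, x> is real-valued; a common convention is to take its sign,
# i.e. classify by which side of the hyperplane <w, x> = 0 a point falls on.
y_pred = np.sign(X @ w)
accuracy = (y_pred == y).mean()
```

The score [itex]\langle w,x\rangle[/itex] itself is generally not exactly ±1; only its sign is used as the predicted label.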
 
I think you should try to give a link to some online explanation of a similar problem.

hosseinGhafari said:
Y is the label in this classification problem; it is +1 or -1.

Are you saying the weights [itex]w[/itex] must be chosen so that [itex]\sum_{i=1}^n w_i X_i[/itex] is either exactly +1 or -1? Or is the sum rounded to the nearest integer? Or rounded in some other way to either +1 or -1?

P(y|w,x) is the probability of label y=1 or -1 given w and x.

This doesn't explain what y is. To use probability, you need a "probability space". The points in the space are "outcomes" of some process. The sets that are assigned probabilities are "events". What random process generates the outcome or event y? Your problem has several y's in it. There are observations [itex]y[1], y[2],...[/itex] and there are predictions [itex]Y[/itex].
 
