Minimax Error (decision theory problem)

  • Thread starter Thread starter greatm31
  • Start date Start date
  • Tags Tags
    Error Theory
Click For Summary
SUMMARY

The discussion focuses on solving a minimax decision theory problem involving one-dimensional Gaussian distributions, specifically using the minimax criterion to find the optimal decision point x* in terms of the means (μ_i) and variances (σ_i²) of the distributions. The user correctly identifies that the zero-one risk implies a loss function of 1 for incorrect decisions and 0 for correct ones. The derived formula for the optimal decision point x* is x = (σ_i⁴ * μ_j - σ_j⁴ * μ_i) / (σ_i⁴ - σ_j⁴), although concerns are raised regarding its validity when σ_i equals σ_j.

PREREQUISITES
  • Understanding of Decision Theory concepts, particularly minimax and Bayesian risk.
  • Familiarity with Gaussian distributions, specifically N(μ, σ²).
  • Knowledge of Bayes' Theorem and its application in decision-making.
  • Ability to perform calculus, particularly differentiation, to find optimal solutions.
NEXT STEPS
  • Explore the implications of equal variances (σ_i = σ_j) on the decision point x* in minimax problems.
  • Study the concept of expectimax and its differences from minimax in decision theory.
  • Investigate advanced topics in Bayesian decision theory, including prior probability impacts.
  • Review practical applications of minimax decision-making in machine learning and pattern classification.
USEFUL FOR

Students and professionals in statistics, data science, and machine learning, particularly those focusing on decision theory and pattern classification methodologies.

greatm31
Messages
1
Reaction score
0

Homework Statement



I've been trying to teach myself some Decision Theory using "Pattern Classification" by Duda, Heart, and Stork. It's a great book but the homework problems are giving me a lot of trouble. This one, concerning minimax, is particularly difficult:

"Assume we have one-dimensional Gaussian distributions [tex]p(x|\omega_{i})=N(\mu_{i},\sigma_{i}^{2})[/tex] for i=1,2, but completely unknown prior probabilities. Use the minimax criterion to find the optimal decision point x* in terms of [tex]\mu_{i}[/tex] and [tex]\sigma_{i}[/tex] under a zero-one risk."

Just to clarify, the zero-one risk means the loss function is 1 for an incorrect decision, and 0 for a correct decision.

2. The attempt at a solution

Now, my understanding of the minimax criterion is that we want to minimize the maximum Bayesian error for all possible priors. So I was thinking that the goal is to maximize the conditional Bayesian risk:

(key:
[tex]a_{i}[/tex] = action associated with selecting [tex]i[/tex]
[tex]\omega_{j}[/tex] = the so-called "state of nature" (tells whether [tex]j[/tex] is true)
[tex]x[/tex] = the data
[tex]c[/tex] = number of states of nature):

[tex]R(a_i|x) = \sum_{j}^c cost(a_{i}|\omega_{j}) \cdot P(\omega_{j}|x)[/tex]

Since the loss function is just 1 if [tex]i \neq j[/tex] and 0 if [tex]i=j[/tex],

[tex]R(a_{i}|x) = \sum_{j \neq i}^c P(\omega_{j}|x)[/tex]
[tex]= 1 - P(\omega_{i}|x)[/tex]

OK so this much is already developed in the book. I figure all I have to do is maximize the conditional error [tex]R(a_{i}|x)[/tex] and I'll have the "decision point" - the point at which you have the maximal Bayesian Risk, which should (I think) be independent of the prior probabilities [tex]P(\omega_{i})[/tex]. So that's what I did:

(here I'll use the abbreviation [tex]N_{i} = N(\mu_{i},\sigma_{i}^2)[/tex] = normal distribution with mean [tex]\mu_{i}[/tex] and variance [tex]\sigma_{i}^{2}[/tex]):

[tex]R(a_{i}|x) = 1 - P(\omega_{i}|x)[/tex]
[tex]= 1 - \frac{p(x|\omega_{i}) \cdot P(\omega_{i})}{p(x)}[/tex] (Bayes' Rule)
[tex]= 1 - \frac{N_{i} \cdot P(\omega_{i})}{N_{i} \cdot P(\omega_{i})+N_{j} \cdot P(\omega_{j})}[/tex]
[tex]= 1 - \frac{1}{1+\frac{P(\omega_{j})}{P(\omega_{i})} \cdot \frac{N_{j}}{N_{i}}}[/tex]

To minimize this, we maximize [tex]N_{j}/N_{i}[/tex] (thus independent of priors). I maximize this just by taking the first derivative and setting = 0:

[tex]0 = \frac{N_{i} \cdot N_{j}' - N_{j} \cdot N_{i}'}{N_{i}^2}[/tex]
[tex]\Rightarrow N_{i} \cdot N_{j}' = N_{j} \cdot N_{i}'[/tex]
[tex]\Rightarrow N_{i} \cdot \frac{-2 \cdot (x-\mu_{j})}{(2\sigma_{j}^{2})^{2}} \cdot N_{j} = N_{j} \cdot \frac{-2 \cdot (x-\mu_{i})}{(2\sigma_{i}^{2})^{2}} \cdot N_{i}[/tex] (deriv. of normal)
[tex]\Rightarrow \frac{x-\mu_{j}}{\sigma_{j}^{4}} = \frac{x-\mu_{i}}{\sigma_{i}^{4}}[/tex]
[tex]\Rightarrow x = \frac{\sigma_{i}^{4} \cdot \mu_{j} - \sigma_{j}^{4} \cdot \mu_{i}}{\sigma_{i}^4 - \sigma_{j}^{4}}[/tex]

So that's what I think x* is. The trouble is it doesn't make too much sense (what happens when [tex]\sigma_{i} = \sigma_{j}[/tex] ?). And also I'm not totally sure this is the right way of thinking about things.

Thanks for any help you can provide, even if it's just to tell me I'm completely wrong about my approach.
 
Physics news on Phys.org
you might want to look at expectimax since you are dealing with probability
 
I don't think that you use "the the minimax criterion" when solving this problem, so the answer cannot make sure that it is indepent of the prior probabilities...
 

Similar threads

Replies
1
Views
2K
  • · Replies 9 ·
Replies
9
Views
2K
  • · Replies 7 ·
Replies
7
Views
2K
  • · Replies 1 ·
Replies
1
Views
3K
  • · Replies 2 ·
Replies
2
Views
3K
  • · Replies 1 ·
Replies
1
Views
1K
Replies
2
Views
3K
Replies
3
Views
3K
  • · Replies 3 ·
Replies
3
Views
3K
  • · Replies 1 ·
Replies
1
Views
2K