Equation related to Linear Discriminant Analysis (LDA)?

  • Thread starter: zak100
  • Tags: Analysis, Linear
SUMMARY

The discussion centers on understanding the equation for Linear Discriminant Analysis (LDA), specifically the projection equation Y = W^T X. Participants clarify that W is the projection matrix, which is crucial for maximizing class separation while reducing dimensionality. The criterion involves the symmetric positive definite matrices S_W and S_B, and the goal is to choose W optimally; LDA can also be derived as a maximum likelihood method. Additionally, the distinction between μ and μ̂ is addressed, highlighting their roles as the means of the original samples and of their projections.

PREREQUISITES
  • Understanding of Linear Discriminant Analysis (LDA)
  • Familiarity with matrix notation and operations
  • Knowledge of symmetric positive definite matrices
  • Basic concepts of eigenvalues and eigenvectors
NEXT STEPS
  • Study the derivation of the LDA projection equation Y = W^T X
  • Explore the properties of symmetric positive definite matrices S_W and S_B
  • Learn about the optimization techniques for selecting the projection matrix W
  • Investigate the significance of eigenvalues and eigenvectors in dimensionality reduction
USEFUL FOR

Students and professionals in data science, machine learning practitioners, and statisticians seeking to deepen their understanding of Linear Discriminant Analysis and its mathematical foundations.

zak100

Homework Statement


I can't understand an equation related to LDA. The context is:
The objective of LDA is to perform dimensionality reduction while preserving as much of the class discriminatory information as possible.
Maybe the lecturer is trying to create a proof of the equation given below.

I understand from the above that LDA projects the points onto an axis so that we get maximum separation between the two classes, in addition to reducing dimensionality.

Homework Equations


I am not able to understand the following equation:
##Y = W^T X##

It says that:
Assume we have a set of D-dimensional samples ##{x_1, x_2,...x_N},## ##N_1## of belong to class
##\Omega_1## and ##N_2## to class ##\Omega_2##. We seek to obtain a scalar ##Y## by projecting the samples ##X## onto a line:
In the above there is no W. So I want to know what is W?

The Attempt at a Solution



My guess is that ##W## represents the projection line? But ##T## = transpose, which suggests ##W## is a matrix or a vector.
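If that guess is right, then for a single projection direction the equation just takes a dot product of each sample with that direction. Here is a minimal sketch of that reading; the data and the particular ##W## below are made up for illustration, not taken from the slides:

```python
import numpy as np

# Made-up two-class data: N1 + N2 samples, each D-dimensional,
# stacked as rows of X (the numbers are invented for illustration).
rng = np.random.default_rng(0)
D, N1, N2 = 2, 5, 5
X1 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(N1, D))  # class Omega_1
X2 = rng.normal(loc=[3.0, 3.0], scale=1.0, size=(N2, D))  # class Omega_2
X = np.vstack([X1, X2])

# If W is a single D-dimensional direction (a D x 1 column vector),
# then y_i = W^T x_i is one scalar per sample: the whole dataset
# gets squashed onto a line.
W = np.array([[1.0], [1.0]]) / np.sqrt(2.0)  # D x 1
Y = X @ W                                    # N x 1 vector of scalars
print(Y.ravel())
```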

Somebody please guide me. For the complete description, please see the attached file.

Zulfi.
 

If you are patient enough, we can step through this.

There are some severe notation and definitional roadblocks that will come up. From past threads, you know that a projection matrix satisfies ##P^2 = P##; idempotence implies it is square and diagonalizable, and it has full rank if and only if it is the identity matrix. Yet this contradicts slide 8 of your attachment. (I have a guess as to what's actually being said here, but the attachment is problematic. My guess, btw, is that ##W^T W = I## but ##WW^T = P##.)

Typically more than half the battle is clearly stating what is being asked; then I'd finish it off with something a bit esoteric like matrix calculus or majorization. The fact mentioned on page 9, that LDA can be interpreted / derived as a maximum likelihood method for certain normal distributions, is probably the most direct method.
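A quick numerical check of that conjecture may help; the particular ##W## below is invented, just some tall matrix with orthonormal columns:

```python
import numpy as np

# An invented tall matrix W with r = 2 orthonormal columns in D = 4
# dimensions, obtained via QR just to have something concrete.
rng = np.random.default_rng(1)
W, _ = np.linalg.qr(rng.normal(size=(4, 2)))

# With orthonormal columns, W^T W is the small r x r identity ...
print(np.allclose(W.T @ W, np.eye(2)))   # True

# ... while P = W W^T is NOT the identity, but it is idempotent:
# an orthogonal projection onto the column space of W.
P = W @ W.T
print(np.allclose(P @ P, P))             # True
print(np.allclose(P, np.eye(4)))         # False: rank 2, not full rank
```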
 
Hi,
Thanks for your reply. Do you mean that ##W## is the sample matrix?

Zulfi.
 
zak100 said:
Hi,
Thanks for your reply. Do you mean that ##W## is the sample matrix?

Zulfi.

Have you looked at pages 7 and 9 in detail? It seems fairly clear to me that ##W## is made up. Equivalently, you choose it, and you should choose it optimally (page 9).

- - - -
My belief, btw, is that page 8 shows

##J(W) = \frac{\det\big(W^T S_B W\big)}{\det\big(W^T S_W W\big)}##

where ##S_W## and ##S_B## are symmetric positive (semi?) definite matrices. However, since I've conjectured that ##W^T W = I## but ##WW^T = P##, my belief is that you select ##W## to be a rank-##r## matrix and hence

##J(W) = \frac{\det\big(W^T S_B W\big)}{\det\big(W^T S_W W\big)} = \frac{e_r\big(W^T S_B W\big)}{e_r\big(W^T S_W W\big)} = \frac{e_r\big(P S_B\big)}{e_r\big(P S_W\big)} = \frac{e_r\big(P S_B P\big)}{e_r\big(P S_W P\big)}##

where ##e_r## is the rth elementary symmetric function of the eigenvalues of the matrix inside. But these notes clearly are part of a much bigger sequence and are not standalone. There should be a notational lookup somewhere.
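For what it's worth, the standard recipe (which I'd guess these slides are heading toward, though I can't confirm that from the attachment) is to take the columns of ##W## to be the leading eigenvectors of ##S_W^{-1} S_B##. A rough sketch with made-up data, where every name is my own choice rather than the notes':

```python
import numpy as np

def lda_projection(X, labels, r):
    """Rough sketch of the usual LDA recipe: columns of W are the
    leading eigenvectors of S_W^{-1} S_B (names here are my own)."""
    classes = np.unique(labels)
    mu = X.mean(axis=0)
    D = X.shape[1]
    S_W = np.zeros((D, D))   # within-class scatter
    S_B = np.zeros((D, D))   # between-class scatter
    for c in classes:
        Xc = X[labels == c]
        mu_c = Xc.mean(axis=0)
        S_W += (Xc - mu_c).T @ (Xc - mu_c)
        diff = (mu_c - mu).reshape(-1, 1)
        S_B += len(Xc) * (diff @ diff.T)
    # Generalized eigenproblem S_B w = lambda * S_W w, solved naively.
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_W, S_B))
    order = np.argsort(eigvals.real)[::-1]
    return eigvecs[:, order[:r]].real     # D x r projection matrix

# Made-up two-class data in D = 3 dimensions, projected down to r = 1.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0.0, 1.0, size=(20, 3)),
               rng.normal(2.5, 1.0, size=(20, 3))])
labels = np.array([0] * 20 + [1] * 20)
W = lda_projection(X, labels, r=1)
Y = X @ W   # with samples as rows, this is y_i = W^T x_i for every i
```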
 
Hi,
Thanks. Do you mean that ##W## represents the matrix of eigenvectors?

Kindly tell me: what is the difference between ##\mu## and ##\hat{\mu}## in slide #3? ##\mu## represents the mean of the ##x## values, whereas ##\hat{\mu}## represents the mean of the ##y## values. If both are means, why do we use the ^ symbol with one and not with the other? We could have represented them with ##\mu_1## and ##\mu_2##. I can't understand this.
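If my reading is right, the two means are tied together by the projection itself, which may be why a hat is used rather than a separate subscript:

##\hat{\mu} = \frac{1}{N}\sum_i y_i = \frac{1}{N}\sum_i W^T x_i = W^T \mu##

So ##\hat{\mu}## would not be an independent quantity but the image of ##\mu## under the same projection. Is that the intended reading?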

Please guide me.

Zulfi.
 
