Line maximizing orthogonal projections

In summary, the video gives a formula for the quality of separation between two sets of points. You can parameterize your line and express the quality as function of those two parameters. Then you have to minimize/maximize it: Find points where both derivatives (in the line parameters) are 0.
  • #1
googleadbot
5
0
Say I have a set of points in 2D space. How would I find a line that maximizes the sum orthogonal projection of the points onto the line. The line does not have to go through the origin.
 
Physics news on Phys.org
  • #2
Sum of what?
Maximizes or minimizes?
 
  • #3
Untitled.png

Sorry, I did not formulate my question properly. I'll describe the application. Say that I have a set of points as shown in the photo. Some points are red and some are green. I want to find the line that separates the two groups with as much margin as possible. Support vector machines do this, but I want to know how the black line is selected. If the groups cannot be perfectly separated, then I want to select the line that separates the groups as best as possible.

This youtube video here:
http://www.youtube.com/watch?v=-Cn5...iHIyxJq0hw_9b8w2j_sWOVe&feature=results_video

At 17:00 shows the formula for how SVMs select the decision boundary, but it assumes that the decision boundary goes through the origin.
 
  • #4
Hey googleadbot.

Just shift all points so that the plane goes through the origin and do the calculation.

In other words, if you have a formula for the plane going through the origin (which should be ax + by + cz = 0) then shift all those points d units in the direction of the inverse normal (i.e d*n) and do the computation on the shifted points.
 
  • #5
@chiro: The problem is two-dimensional anyway.

@googleadbot: The video gives a formula for the quality of the separation. You can parameterize your line and express the quality as function of those two parameters. Then you have to minimize/maximize it: Find points where both derivatives (in the line parameters) are 0.
 
  • #6
mfb said:
@chiro: The problem is two-dimensional anyway.

@googleadbot: The video gives a formula for the quality of the separation. You can parameterize your line and express the quality as function of those two parameters. Then you have to minimize/maximize it: Find points where both derivatives (in the line parameters) are 0.

Hi mfb, I think you're right about taking the derivative on the minimization function. However, that's where I'm stuck. The minimization function has 3 parameters for 2D space: theta0, theta1, and theta2, no? How do I take the derivative of a funtion with 3 variables?
scan.jpg


I think the general solution is given here:

http://nlp.stanford.edu/IR-book/htm...r-machines-the-linearly-separable-case-1.html

At points (K) and (L). However, it's not clear to me how to maximize equation L. What are x(i) and x(j)?
 
Last edited:
  • #7
Why do you have three parameters?
(intersection with y-axis) and (gradient)
(intersection with y-axis) and (intersection with x-axis)
(oriented distance to the origin) and (gradient)
(distance to the origin) and (angle towards line)
or whatever. They all have some advantages and disadvantages, I think at least the first two will give reasonable expressions to minimize.
 
  • #9
Well, that approach introduces more parameters than required - you get multiple different ways to describe the same line:
<1+2t,2+t> is the same as <1+4t,2+2t> and <3+2t,3+t> and so on.
Therefore, I would use a different parametrization.
 
  • #10
mfb said:
@chiro: The problem is two-dimensional anyway.

@googleadbot: The video gives a formula for the quality of the separation. You can parameterize your line and express the quality as function of those two parameters. Then you have to minimize/maximize it: Find points where both derivatives (in the line parameters) are 0.

I think I can see how I can parameterize my line, but what matters, according to the video, is the vector theta that is normal to the line. Doesn't the line have an infinite number of normal vectors corresponding to it? If I want to parameterize the normal vector theta, then I would need at least 4 parameters as far as I can tell, because the vector would require both direction as well as magnitude (2 parameters) as well as the origin of theta (x/y coordinates of the base point). With that many parameters, I don't see how I can find a direct solution...I think this problem might require quadratic programming (that's what SVM uses to find the boundary line), problem is that I'm not sure how it does it and code to SVM Light or SVM Lib is pretty hard to follow.
 
  • #11
Note to the OP: SVM's don't require that the hyperplane pass through the origin, which should pretty much solve your problem.
 
  • #12
Doesn't the line have an infinite number of normal vectors corresponding to it?
They just differ in their length, not in their direction (up to the sign, and in 2D).
There is an infinite number of normal lines, but just one which passes through a specific point.If you express your line as ##px+qy=c## where ##p^2+q^2=1##, that is equivalent to ##\vec{x} \cdot \left( \begin{array}{}p \\ q\end{array} \right) - c= 0## for points p on the line. The nice feature of that expression: it allows to calculate the distance in a very similar way: $$d(\vec{x}) = \vec{x} \cdot \left( \begin{array}{c}p \\ q\end{array} \right) - c = \vec{x} \cdot \left( \begin{array}{c}p \\ \pm \sqrt{1-p^2}\end{array} \right) - c$$

You can add the distances (or the absolute value of those distances) of all points and find p,c where that sum is minimal.
 

1. What is a line maximizing orthogonal projection?

A line maximizing orthogonal projection is a mathematical concept in which a line is drawn through a set of points in such a way that the distance between the points and the line is minimized, thus creating the most accurate representation of the data.

2. How is a line maximizing orthogonal projection calculated?

A line maximizing orthogonal projection is calculated using linear regression techniques, such as the method of least squares. This involves finding the line that minimizes the sum of the squared distances between the points and the line.

3. What is the purpose of using a line maximizing orthogonal projection?

The purpose of using a line maximizing orthogonal projection is to find the best-fitting line through a set of data points, which can then be used to make predictions or identify patterns in the data. It is commonly used in statistical analysis and data visualization.

4. What are some common applications of line maximizing orthogonal projections?

Line maximizing orthogonal projections are commonly used in various fields, such as finance, economics, engineering, and biology, to analyze and interpret data. They can also be used in image processing, computer graphics, and machine learning algorithms.

5. Are there any limitations to using line maximizing orthogonal projections?

One limitation of using line maximizing orthogonal projections is that it assumes a linear relationship between the variables being analyzed. If the relationship is non-linear, the projection may not accurately represent the data. Additionally, outliers in the data can significantly affect the results of the projection.

Similar threads

  • Linear and Abstract Algebra
Replies
23
Views
1K
Replies
2
Views
937
  • Linear and Abstract Algebra
Replies
9
Views
380
  • Linear and Abstract Algebra
Replies
5
Views
3K
  • Linear and Abstract Algebra
Replies
13
Views
383
  • Linear and Abstract Algebra
Replies
14
Views
1K
Replies
26
Views
2K
Replies
2
Views
739
Replies
4
Views
607
  • Linear and Abstract Algebra
Replies
2
Views
868
Back
Top