
Line maximizing orthogonal projections

  1. Nov 3, 2012 #1
Say I have a set of points in 2D space. How would I find a line that maximizes the sum of the orthogonal projections of the points onto the line? The line does not have to go through the origin.
     
  3. Nov 3, 2012 #2

    mfb


    Staff: Mentor

    Sum of what?
    Maximizes or minimizes?
     
  4. Nov 3, 2012 #3
    [Attached image: Untitled.png — red and green points with a separating black line]
    Sorry, I did not formulate my question properly. I'll describe the application. Say that I have a set of points as shown in the attached image. Some points are red and some are green. I want to find the line that separates the two groups with as much margin as possible. Support vector machines do this, but I want to know how the black line is selected. If the groups cannot be perfectly separated, then I want to select the line that separates the groups as well as possible.

    This youtube video here:
    http://www.youtube.com/watch?v=-Cn5...iHIyxJq0hw_9b8w2j_sWOVe&feature=results_video

    At 17:00 it shows the formula for how SVMs select the decision boundary, but it assumes that the decision boundary goes through the origin.
     
  5. Nov 3, 2012 #4

    chiro

    Science Advisor

    Hey googleadbot.

    Just shift all points so that the plane goes through the origin and do the calculation.

    In other words, if you have a formula for the plane going through the origin (which should be ax + by + cz = 0), then shift all those points d units in the direction of the inverse normal (i.e. d*n) and do the computation on the shifted points.
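
    A minimal numpy sketch of that shift (the plane's unit normal, offset, and points below are made-up values for illustration):

```python
import numpy as np

# Plane with unit normal n and offset d, i.e. n . x = d.
n = np.array([1.0, 2.0, 2.0])
n = n / np.linalg.norm(n)          # normalize (a, b, c) to unit length
d = 3.0

points = np.array([[1.0, 0.5, 2.0],
                   [0.0, 1.0, 1.0],
                   [2.0, 2.0, 0.0]])

# Shift every point d units along the inverse normal, i.e. by -d*n.
# The plane n . x = d then corresponds to n . x' = 0 for the shifted
# points, so it passes through the origin.
shifted = points - d * n

print(shifted @ n)   # each value is the original n . x minus d
```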
     
  6. Nov 4, 2012 #5

    mfb


    Staff: Mentor

    @chiro: The problem is two-dimensional anyway.

    @googleadbot: The video gives a formula for the quality of the separation. You can parameterize your line and express the quality as a function of those two parameters. Then you have to minimize/maximize it: find points where both derivatives (with respect to the line parameters) are 0.
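
    A rough numerical sketch of that approach (the data, the choice of line parameters, and the "quality" used here are all assumptions for illustration: the line is parameterized by an angle and an offset, and the quality is taken to be the smallest signed distance of any point on its correct side, which is then maximized):

```python
import numpy as np
from scipy.optimize import minimize

# Made-up, linearly separable toy data with labels +1 / -1.
X = np.array([[1.0, 1.0], [2.0, 1.5], [1.5, 2.0],    # class +1
              [4.0, 4.0], [5.0, 3.5], [4.5, 5.0]])   # class -1
y = np.array([1, 1, 1, -1, -1, -1])

def quality(params):
    """Worst-case margin of the line cos(phi)*x + sin(phi)*y = c.

    Two parameters describe the line: the angle phi of its unit normal
    and the offset c.  The returned value is the smallest signed
    distance (label times distance) over all points.
    """
    phi, c = params
    normal = np.array([np.cos(phi), np.sin(phi)])
    signed_dist = X @ normal - c
    return np.min(y * signed_dist)

# Maximizing the quality = minimizing its negative.  The min() makes the
# objective non-smooth, so a derivative-free method is used instead of
# setting the two derivatives to zero analytically.
result = minimize(lambda p: -quality(p), x0=[0.5, 0.0], method="Nelder-Mead")
phi, c = result.x
print("normal:", (np.cos(phi), np.sin(phi)), "offset:", c, "margin:", quality(result.x))
```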
     
  7. Nov 4, 2012 #6
    Hi mfb, I think you're right about taking the derivative of the minimization function. However, that's where I'm stuck. The minimization function has 3 parameters in 2D space: theta0, theta1, and theta2, no? How do I take the derivative of a function with 3 variables?


    [Attached image: scan.jpg]

    I think the general solution is given here:

    http://nlp.stanford.edu/IR-book/htm...r-machines-the-linearly-separable-case-1.html

    At points (K) and (L). However, it's not clear to me how to maximize equation L. What are x(i) and x(j)?
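
    For what it is worth, in the standard SVM dual (which equation (L) appears to be), x(i) and x(j) are simply the i-th and j-th training points, and the term x(i)·x(j) is their dot product. A small sketch of evaluating that objective for a given vector of multipliers alpha (the data and alpha values are made up; actually maximizing over alpha under the constraints is the quadratic-programming step):

```python
import numpy as np

# Made-up training points and labels (+1 / -1).
X = np.array([[1.0, 1.0], [2.0, 1.5], [4.0, 4.0], [5.0, 3.5]])
y = np.array([1, 1, -1, -1])

# Gram matrix: entry (i, j) is the dot product x(i) . x(j).
K = X @ X.T

def dual_objective(alpha):
    # sum_i alpha_i  -  1/2 * sum_{i,j} alpha_i alpha_j y_i y_j (x_i . x_j)
    return alpha.sum() - 0.5 * (alpha * y) @ K @ (alpha * y)

alpha = np.array([0.10, 0.20, 0.15, 0.15])   # arbitrary example multipliers
print(dual_objective(alpha))
```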
     
    Last edited: Nov 4, 2012
  8. Nov 4, 2012 #7

    mfb


    Staff: Mentor

    Why do you have three parameters?
    (intersection with y-axis) and (gradient)
    (intersection with y-axis) and (intersection with x-axis)
    (oriented distance to the origin) and (gradient)
    (distance to the origin) and (angle towards line)
    or whatever. They all have some advantages and disadvantages; I think at least the first two will give reasonable expressions to minimize.
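
    As an illustration of the first option, here is a small sketch (with made-up points) of the signed distance from a point to the line y = m*x + b; this is the kind of expression one would then minimize or maximize over the two parameters (b, m):

```python
import numpy as np

def signed_distance(points, b, m):
    """Signed distance from each point to the line y = m*x + b.

    Written implicitly the line is m*x - y + b = 0, so the distance of
    (x0, y0) is (m*x0 - y0 + b) / sqrt(m^2 + 1); the sign indicates on
    which side of the line the point lies.
    """
    x0, y0 = points[:, 0], points[:, 1]
    return (m * x0 - y0 + b) / np.sqrt(m**2 + 1)

pts = np.array([[0.0, 0.0], [1.0, 3.0], [2.0, 1.0]])   # made-up points
print(signed_distance(pts, b=1.0, m=2.0))
```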
     
  9. Nov 4, 2012 #8
  10. Nov 5, 2012 #9

    mfb


    Staff: Mentor

    Well, that approach introduces more parameters than required - you get multiple different ways to describe the same line:
    <1+2t,2+t> is the same as <1+4t,2+2t> and <3+2t,3+t> and so on.
    Therefore, I would use a different parametrization.
     
  11. Nov 6, 2012 #10
    I think I can see how to parameterize my line, but what matters, according to the video, is the vector theta that is normal to the line. Doesn't the line have an infinite number of normal vectors corresponding to it?

    If I want to parameterize the normal vector theta, then I would need at least 4 parameters as far as I can tell: the vector needs a direction and a magnitude (2 parameters) plus the x/y coordinates of its base point. With that many parameters, I don't see how I can find a direct solution. I think this problem might require quadratic programming (that's what SVMs use to find the boundary line); the problem is that I'm not sure how it does it, and the code for SVMlight or LIBSVM is pretty hard to follow.
     
  12. Nov 6, 2012 #11
    Note to the OP: SVMs don't require that the hyperplane pass through the origin, which should pretty much solve your problem.
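
    If an existing library is acceptable, here is a minimal scikit-learn sketch (assuming scikit-learn is available; the data below is made up) that returns the max-margin line, intercept included:

```python
import numpy as np
from sklearn.svm import SVC

# Made-up two-class data, labels +1 / -1.
X = np.array([[1.0, 1.0], [2.0, 1.5], [1.5, 2.0],
              [4.0, 4.0], [5.0, 3.5], [4.5, 5.0]])
y = np.array([1, 1, 1, -1, -1, -1])

# A large C approximates a hard margin.  The intercept is fitted
# automatically, so the separating line need not pass through the origin.
clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

w = clf.coef_[0]        # normal vector of the separating line
b = clf.intercept_[0]   # offset
print("line: %.3f*x + %.3f*y + %.3f = 0" % (w[0], w[1], b))
```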
     
  13. Nov 6, 2012 #12

    mfb


    Staff: Mentor

    They just differ in their length, not in their direction (up to the sign, and in 2D).
    There is an infinite number of normal lines, but just one which passes through a specific point.


    If you express your line as ##px+qy=c## where ##p^2+q^2=1##, that is equivalent to ##\vec{x} \cdot \left( \begin{array}{c}p \\ q\end{array} \right) - c = 0## for points ##\vec{x}## on the line. The nice feature of that expression: it lets you calculate the distance in a very similar way: $$d(\vec{x}) = \vec{x} \cdot \left( \begin{array}{c}p \\ q\end{array} \right) - c = \vec{x} \cdot \left( \begin{array}{c}p \\ \pm \sqrt{1-p^2}\end{array} \right) - c$$

    You can add up the distances (or the absolute values of those distances) of all points and find p, c where that sum is minimal.
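
    A small sketch of that last step (the points are made up; the unit normal is written as (cos t, sin t) so that ##p^2+q^2=1## holds automatically and the ± branch is avoided):

```python
import numpy as np
from scipy.optimize import minimize

pts = np.array([[1.0, 1.0], [2.0, 1.5], [1.5, 2.0],   # made-up points
                [4.0, 4.0], [5.0, 3.5], [4.5, 5.0]])

def sum_abs_distance(params):
    """Sum of |p*x + q*y - c| over all points, with p = cos(t), q = sin(t)."""
    t, c = params
    normal = np.array([np.cos(t), np.sin(t)])
    return np.sum(np.abs(pts @ normal - c))

# The absolute value is not differentiable everywhere, so use a
# derivative-free method rather than setting derivatives to zero.
result = minimize(sum_abs_distance, x0=[0.5, 0.0], method="Nelder-Mead")
t, c = result.x
print("p =", np.cos(t), "q =", np.sin(t), "c =", c)
```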
     