Spectral Bisection of Simplest Graph Clearly Incorrect

In summary, the speaker is experimenting with Spectral Bisections and Fiedler vectors to bisect a graph using the sparsest cut. However, they have found that even simple graphs like a 4x4 square lattice are being bisected incorrectly. They are wondering what they may be missing in their approach. They provide an example of the Laplacian matrix for the 4x4 square lattice.
  • #1
maverick_starstrider
I'm playing around with Spectral Bisections and Fiedler vectors as ways of bisecting a graph using the sparsest cut. However, I'm finding that even the most trivially simple graphs are bisected incorrectly by this approach. What am I missing?

Take for example a graph that is just a 4x4 square lattice (no-periodicity). There are clearly two possible bisections: a horizontal cut right across the middle, and a vertical cut right down the middle. The Laplacian matrix of this is just:

2 -1 0 0 -1 0 0 0 0 0 0 0 0 0 0 0
-1 3 -1 0 0 -1 0 0 0 0 0 0 0 0 0 0
0 -1 3 -1 0 0 -1 0 0 0 0 0 0 0 0 0
0 0 -1 2 0 0 0 -1 0 0 0 0 0 0 0 0
-1 0 0 0 3 -1 0 0 -1 0 0 0 0 0 0 0
0 -1 0 0 -1 4 -1 0 0 -1 0 0 0 0 0 0
0 0 -1 0 0 -1 4 -1 0 0 -1 0 0 0 0 0
0 0 0 -1 0 0 -1 3 0 0 0 -1 0 0 0 0
0 0 0 0 -1 0 0 0 3 -1 0 0 -1 0 0 0
0 0 0 0 0 -1 0 0 -1 4 -1 0 0 -1 0 0
0 0 0 0 0 0 -1 0 0 -1 4 -1 0 0 -1 0
0 0 0 0 0 0 0 -1 0 0 -1 3 0 0 0 -1
0 0 0 0 0 0 0 0 -1 0 0 0 2 -1 0 0
0 0 0 0 0 0 0 0 0 -1 0 0 -1 3 -1 0
0 0 0 0 0 0 0 0 0 0 -1 0 0 -1 3 -1
0 0 0 0 0 0 0 0 0 0 0 -1 0 0 -1 2
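
For anyone who wants to reproduce this, the matrix above can be generated rather than typed in. A pure-Python sketch (not MATLAB; `grid_laplacian` is just an ad-hoc helper name):

```python
# Build the Laplacian L = D - A of an n-by-n square grid (no periodicity),
# nodes numbered row by row as in the matrix above (0-indexed here).
def grid_laplacian(n):
    size = n * n
    L = [[0] * size for _ in range(size)]
    for r in range(n):
        for c in range(n):
            i = r * n + c
            # add the edge to the right and the edge below, each once
            for r2, c2 in ((r, c + 1), (r + 1, c)):
                if r2 < n and c2 < n:
                    j = r2 * n + c2
                    L[i][j] = L[j][i] = -1
                    L[i][i] += 1
                    L[j][j] += 1
    return L

L4 = grid_laplacian(4)
# Corners get degree 2, edge nodes 3, interior nodes 4; every row sums to 0.
```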

So I go to MATLAB and run eig(Laplacian), and the four smallest eigenvalues are 0.0, 0.585, 0.585, and 1.1716. So far so good: the first eigenvalue is zero, as you expect, and the second eigenvalue has multiplicity two, corresponding to the two identical best cuts (horizontal or vertical).

But then when I look at the eigenvector of, say, the first non-zero eigenvalue, I get:

-0.0642
-0.1794
-0.3423
-0.4575
0.0886
-0.0266
-0.1895
-0.3047
0.3047
0.1895
0.0266
-0.0886
0.4575
0.3423
0.1794
0.0642

Which is wrong. The correct cut should have nodes 1-8 with one sign (negative, in this case) and nodes 9-16 with the other (positive). But instead the method thinks node 5 (and, by symmetry, node 12) should be on the other side, which makes for a cut crossed by 6 edges instead of the optimal 4.
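
The count is easy to verify with a short pure-Python check (not MATLAB; nodes are 0-indexed, row-major): sign-split the eigenvector above and count the crossing edges.

```python
# Sign-split the eigenvector quoted above (nodes 0-15, row-major) and
# count the edges of the 4x4 grid that cross the resulting cut.
fiedler = [-0.0642, -0.1794, -0.3423, -0.4575,
            0.0886, -0.0266, -0.1895, -0.3047,
            0.3047,  0.1895,  0.0266, -0.0886,
            0.4575,  0.3423,  0.1794,  0.0642]

def grid_edges(n):
    for r in range(n):
        for c in range(n):
            if c + 1 < n:
                yield (r * n + c, r * n + c + 1)    # edge to the right
            if r + 1 < n:
                yield (r * n + c, (r + 1) * n + c)  # edge below

cut = sum((fiedler[i] >= 0) != (fiedler[j] >= 0) for i, j in grid_edges(4))
print(cut)  # 6, versus 4 for the clean middle cut
```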

This isn't the only case; I've tried all kinds of rectangular grids, and the behaviour of this bisection method is unpredictable at best. Am I missing something, or is this how it is supposed to perform?
 
  • #2
maverick_starstrider said:
Which is wrong. The correct cut should have 1-8 as one sign (negative in this case) and 9-16 as the other sign (positive in this case). But instead the method thinks node 5 should be on the other cut (which makes for a partition of 6 edges, instead of the optimal of 4).
What do you mean when you say the result "is wrong"? And how do you know what the "correct cut" is? (And, more technically, what do you mean by "correct"?) The actual optimum sparsest-cut problem is NP-complete; using eigenvectors to effect a cut is an approximation algorithm.

For human readability, LaTeX is preferred. I've dropped your Laplacian in below.

$$\mathbf L =
\left[
\begin{array}{cccccccccccccccc}
2 & -1 & 0 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
-1 & 3 & -1 & 0 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & -1 & 3 & -1 & 0 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & -1 & 2 & 0 & 0 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
-1 & 0 & 0 & 0 & 3 & -1 & 0 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 & -1 & 4 & -1 & 0 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & -1 & 0 & 0 & -1 & 4 & -1 & 0 & 0 & -1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & -1 & 0 & 0 & -1 & 3 & 0 & 0 & 0 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & -1 & 0 & 0 & 0 & 3 & -1 & 0 & 0 & -1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & -1 & 0 & 0 & -1 & 4 & -1 & 0 & 0 & -1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 0 & -1 & 4 & -1 & 0 & 0 & -1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 0 & -1 & 3 & 0 & 0 & 0 & -1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 0 & 0 & 2 & -1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 0 & -1 & 3 & -1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 0 & -1 & 3 & -1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 0 & -1 & 2 \\
\end{array}
\right]$$
 
  • #3
Apologies for the lack of LaTeX. I know the correct cut because it's a 4x4 square: it's trivially easy to see visually that the cut through the middle, which only 4 edges cross, is the sparsest cut. It's like this image, without the connections going out the sides:

https://upload.wikimedia.org/wikipe...rid_graph.svg/768px-Square_grid_graph.svg.png

Even if that weren't the optimum, the cut produced by the bisection method has 6 edges crossing it and is thus clearly worse than the middle cut.
 
  • #4
The issue is still the use of the word "wrong".

Using eigenvectors to make sparse cuts is an approximation algorithm, with provable bounds on its deviation from optimal cuts. It is nothing more and nothing less than this. If you can show those provable bounds are being broken, then you have a case that it is doing something wrong. Otherwise it is performing in line with expectations.

I'm guessing you'll want to spend some time googling approximation algorithms for intractable problems. To be clear: if there is special structure you know of, use it. But in general, for large graphs, this sparsest-cut problem is intractable. If you have taken an algorithms course, this will probably make a lot of sense; if you haven't, it probably won't. (Something I read earlier said the problem is NP-complete, but I think it's actually NP-hard, which is at least as bad as NP-complete.)
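
To make "in line with expectations" concrete: both cuts halve the 16 vertices, so the sign cut loses only on edge count, not validity. A quick pure-Python comparison of the two cut quotients (`cut_quotient` and the 0-indexed node sets are my own shorthand; the sign cut is the one reported in post #1):

```python
# Compare the two candidate cuts by their cut quotient,
# |boundary edges| / min(|S|, |V \ S|); both halve the 16 vertices.
def grid_edges(n):
    for r in range(n):
        for c in range(n):
            if c + 1 < n:
                yield (r * n + c, r * n + c + 1)
            if r + 1 < n:
                yield (r * n + c, (r + 1) * n + c)

def cut_quotient(n, part):
    boundary = sum((i in part) != (j in part) for i, j in grid_edges(n))
    return boundary / min(len(part), n * n - len(part))

middle = set(range(8))                 # rows 0-1: the clean middle cut
sign = {4, 8, 9, 10, 12, 13, 14, 15}   # the sign cut from post #1 (0-indexed)
print(cut_quotient(4, middle), cut_quotient(4, sign))  # 0.5 0.75
```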
 
  • #5
I appreciate your feedback, but I have to say that if a correctly implemented algorithm gave such obviously poor answers in trivial cases, no one would use it. I would direct your attention to the rectangular example at the bottom here:

https://www.mathworks.com/examples/matlab/community/20901-splitting-the-vertex-set-of-a-graph

Now this MATLAB function "split" claims to just take the Fiedler vector of the Laplacian, but if I build a Laplacian of that structure and solve for the eigenvectors of the system using eig or eigs, I don't get the same cut as its output; I get worse ones. I can also get different cuts from the exact same matrix just by hitting enter again with eigs (which is a Lanczos-type solver based on a random initial start vector, vs. eig, which is a full solver). That is usually a sign that there's something mathematically funny about this particular matrix, making its numerical eigensolution twitchy.

I did my PhD in computational physics, so when I see a situation where MATLAB's split function gives a correct result while claiming to just diagonalize the Laplacian, but diagonalizing the Laplacian directly gives different (and fluctuating) results, my intuition screams that there are known "stability tricks" used in these algorithms, tricks that everyone who writes them just knows about and that are hidden inside the split function (which gives the correct answer).

So I have to say I don't agree that people are actually using an algorithm that gives blatantly poor results in even the simplest cases. Even granting the computational complexity of the problem, there's no shortage of other partitioning methods; and as MATLAB's split function demonstrates, a "proper" implementation of the Fiedler-vector approach SHOULD give the right answer in these simple situations. So my question is: what's the thing I'm missing in their implementation?

My assumption is that "the thing I'm missing" has a flavor like: "in systems with degenerate optimal solutions, you have a multiplicity of eigenvectors that are not uniquely determined and that span a subspace whose dimension equals the multiplicity; to ensure that the physically correct one is returned, you need to..." Or: "in the case of degenerate solutions, where a partition is chosen by the mean or median of the eigenvector, values will lie very close to the mean, and it is known that numerical rounding error can mix values across the median; to combat this, one can define a fixed cut-off radius around the mean and forcibly assign all such values to one partition." Or something like that.
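
That first guess can be made concrete for the 4x4 grid. A pure-Python sketch (the helper names and the brute-force angle scan are my own, not from any library; I'm using the standard fact that the grid is a Cartesian product of two 4-node paths, so its ##\lambda_2##-eigenspace is spanned by a path cosine mode copied along rows or along columns):

```python
import math

# The 4x4 grid's lambda_2 = 2 - sqrt(2) has multiplicity 2; its eigenspace
# is spanned by the path mode u(j) = cos((2j+1)*pi/8) applied per row (vh)
# or per column (vv).
n = 4
u = [math.cos((2 * j + 1) * math.pi / (2 * n)) for j in range(n)]
vh = [u[r] for r in range(n) for c in range(n)]  # varies with the row
vv = [u[c] for r in range(n) for c in range(n)]  # varies with the column

edges = [(r * n + c, r * n + c + 1) for r in range(n) for c in range(n - 1)] + \
        [(r * n + c, (r + 1) * n + c) for r in range(n - 1) for c in range(n)]

def cut_size(v):
    return sum((v[i] >= 0) != (v[j] >= 0) for i, j in edges)

# Scan directions inside the degenerate eigenspace: axis-aligned directions
# recover the 4-edge middle cut, while generic mixtures (what a numerical
# solver can hand back) give 6-edge cuts like the one in post #1.
best = min(cut_size([math.cos(t) * a + math.sin(t) * b for a, b in zip(vh, vv)])
           for t in (k * math.pi / 180 for k in range(180)))
print(best)  # 4
```

So the eigensolver isn't unstable here: any vector in the degenerate subspace is an equally valid Fiedler vector, and picking the best sign cut within that subspace is an extra selection step on top of the bare eigendecomposition.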

Or even better, maybe it's something far easier: maybe I'm missing something super basic about how a Laplacian is defined, and the one I gave above is just the wrong Laplacian for a 4x4 square grid. That'd be nice.
 
  • #6
Look, there are right ways to do these things, and then there's what you're doing here.

First, if you are asserting that the approximation algorithm you're using gives incorrect results, you must show that a result falls outside the guaranteed performance bounds. You still have not done this.

Second, perhaps your question is really this: suppose you have ##\mathbf L## with ordered eigenvalues ##\lambda_1 \geq \lambda_2 \geq \dots \geq \lambda_{n-2} \geq \lambda_{n-1} \gt \lambda_n = 0## (which must be the case for a connected undirected graph), and you are specifically interested in the case where ##\lambda_{n-2} = \lambda_{n-1}##, and perhaps ##\lambda_{n-3} = \lambda_{n-2} = \lambda_{n-1}##, etc. That is: how do you handle 'ties', when the eigenvector associated with the smallest non-zero eigenvalue of a connected graph is not unique because other eigenvalues tie with it? How to deal with ties is an interesting question, but it is quite different from what you originally asked. (It also seems to be a 'Programming and Computer Science' question, not so much a 'General Math' question.)

My understanding is that the proof underlying the accuracy bounds on cutting with eigenvectors does not require uniqueness of eigenvalues. Thus there may be multiple solutions when cutting with eigenvectors (when there are multiple eigenvalues equal to ##\lambda_{n-1}##), but if all of them lie within the guaranteed performance bounds, that does not make any one of them "wrong". Amongst the set of feasible approximate solutions, you'd like to choose one that (weakly) dominates the others; that is an interesting fine-tuning problem, but again quite different from the strictly dominated solutions being "incorrect".

- - - -
This is where I draw the line and exit the thread.
 

1. What is spectral bisection of a graph?

Spectral bisection is an algorithm used to divide a graph into two equal-sized subgraphs, with the goal of minimizing the number of edges between the two subgraphs. It uses the eigenvalues and eigenvectors of the graph's Laplacian matrix to choose the division.

2. How does spectral bisection work?

The spectral bisection algorithm works by first calculating the eigenvalues and eigenvectors of the graph's Laplacian matrix. The eigenvector associated with the second-smallest eigenvalue (the Fiedler vector) is then used to assign each vertex to one of the two subgraphs, typically by splitting its entries at their sign or at their median.
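
As a minimal illustration of the split step (a Python sketch with made-up Fiedler values, not output from any real solver):

```python
# Minimal split step: sort vertices by their Fiedler-vector entry and cut
# at the median, which forces two equal halves (values here are invented).
fiedler = [0.46, 0.34, 0.18, 0.06, -0.06, -0.18, -0.34, -0.46]
order = sorted(range(len(fiedler)), key=lambda i: fiedler[i])
half = len(order) // 2
part_a, part_b = set(order[:half]), set(order[half:])
print(sorted(part_a), sorted(part_b))  # [4, 5, 6, 7] [0, 1, 2, 3]
```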

3. What is the purpose of spectral bisection?

The purpose of spectral bisection is to find an efficient way to divide a graph into two equal-sized subgraphs. This is useful in various applications, such as network partitioning, image segmentation, and community detection in social networks.

4. Is spectral bisection always accurate?

No, spectral bisection may not always produce the optimal division of a graph. It depends on the structure of the graph and the choice of cut point. However, it is a commonly used algorithm that has been shown to be effective in many cases.

5. What are some limitations of spectral bisection?

One limitation of spectral bisection is that it may not work well for graphs with irregular or complex structures. It also relies heavily on the choice of cut point, which can greatly affect the results. Additionally, spectral bisection can be computationally expensive for large graphs.
