# Cheapest Route Question

• I
• NekotoKoara
In summary, this conversation discusses the complexities of finding the cheapest route through a set of destinations, known as the "Traveling Salesman Problem." The problem is a type of graph theory and has been proven to be NP-complete, meaning it requires exponential time to solve. Solutions often use approximation algorithms or simulate biological processes such as how ants find the shortest path. This problem has real-world applications in industries such as transportation and logistics.f

#### NekotoKoara

Hello, recently I was looking into visiting Tokyo and started looking into its complicated but wonderfully efficient railway system. I picked 6 destinations in the area and looked up how much it would cost to go from anyone of the locations to any other.

Then I started to wonder, if I theoretically wanted to go to each location in one day without going to any location more than once, what would be the cheapest path. After realizing there were 360 unique paths (paths that are just the reverse of another path I thought shouldn’t be included since they would be the same price) I realized how complex of a problem this could become.

After doing some research it seems like this is a type of problem that would be solved in graph theory. However, I personally have only self studied up to halfway through Calculus 1 and would not even begin to know how to mathematically approach this problem. A friend and I did eventually come up with the cheapest route but it wasn’t a terribly efficient method and I’m not sure how well our approach would work in a situation with more points and different values.

Anyone want to try to explain how this type of problem is generally dealt with and/or recommend some resources to get me on the right path of understanding it better? Thanks! :-)

Did you look up "Traveling salesman problem"?

NekotoKoara
After doing some research it seems like this is a type of problem that would be solved in graph theory. However, I personally have only self studied up to halfway through Calculus 1 and would not even begin to know how to mathematically approach this problem.
It is more a theoretical computer science approach: This problem is NP complete, which means that all known deterministic solutions require an algorithm, whose run time depends exponentially on the number of stations. One can guess solutions and will be faster, but then a random component is required. The best known algorithms use, as far as I know, neurological networks, which simulate how ants find their shortest way. It is actually a real world problem: How to staff airplanes? How to deliver in shortest time? etc.

This problem is NP complete

Organizations sometimes struggle to recognize such problems and invest lots of resources investigating why the optimal answer has not yet been provided for such an obviously simple problem statement (many NP complete problems are conceptually very simple to articulate and understand, leading one to intuit that the solution should also be simple).

Its very hard to explain the concept to someone without any background in algorithms - it sounds like one is just making excuses.

StoneTemplePython
Organizations sometimes struggle to recognize such problems and invest lots of resources investigating why the optimal answer has not yet been provided for such an obviously simple problem statement (many NP complete problems are conceptually very simple to articulate and understand, leading one to intuit that the solution should also be simple).

Just being able to spot NP Complete (or technically NP Hard in this case) problems is quite beneficial.

Its very hard to explain the concept to someone without any background in algorithms - it sounds like one is just making excuses.

To my knowledge the best approximation (as in not quite optimal but able to bound best score and show you a path that is close) algorithms for TSP are in the semidefinite programming family. If you really want to have some fun with people, start rambling on about the cone of positive semidefinite matrices. That'll show 'em.

The problem might be NP but 360 options is still something a computer does in a few microseconds, and there are reasonable and fast approximations for larger numbers. They are not guaranteed to find the ideal solution but 0.1% worse for 1000 destinations is not that bad.

The algorithms to staff flight personal or create a railway plan aren't done in microseconds. And these are the examples for which 0.1% worse translates into costs.

Hello, recently I was looking into visiting Tokyo and started looking into its complicated but wonderfully efficient railway system. I picked 6 destinations in the area and looked up how much it would cost to go from anyone of the locations to any other.

It's worth flagging that @mfb is right that 6 cities (not 360 cities) is a small problem that can be done very quickly (even just combing through permutations). However, this will grow really quite poorly... railway costs I believe (and worse: airline flights) are notoriously bad because their cost functions don't obey the triangle inequality. It turns out that arbitrary cost functions are much worse than metric based ones for bounding exercises including the nuts and bolts of a lot of approximation algorithms.

I'm not sure where that 0.1% error number came from for 1000 cities though, esp for 'bad' cost functions like this.

See e.g. D. Johnson and L. McGeoch: Experimental analysis of heuristics for the STSP
They discuss instances with 100,000 cities where they get "within fractions of a percent" in a few hours - with 2001 computing power. For 1000-city instances they needed 20 seconds to reach 0.2%. They used both Euclidean and non-Euclidean distances.

fresh_42
See e.g. D. Johnson and L. McGeoch: Experimental analysis of heuristics for the STSP
They discuss instances with 100,000 cities where they get "within fractions of a percent" in a few hours - with 2001 computing power. For 1000-city instances they needed 20 seconds to reach 0.2%.

well, having a glimpse:
opening paragraph said:
This chapter considers the symmetric TSP; the next considers the more general and less well-studied asymmetric case

My gut reaction is this is a "nice" cost function... and I explicitly pointed out in my last post that OP's questions about monetary costs for railroads (i.e. dollars, yen, etc. set by companies and govt's) are "bad" cost functions i.e. not metrics, and I think the authors are are saying the "bad" ones are less well-studied.

It didn't let me search by term, but I read a bit more:

page 5 said:
Distances are the Euclidean distances

Which clinches it. So yea, you are citing a writeup on 'nice' cost functions specifically this paper uses a euclidean metric. The problem mentioned by OP is fundamentally much worse (as it scales). But again 6 cities is sufficiently small to not worry about all this for OP.

Just one page further:
page 6 said:
As to applications with non-geometric distances, these tend to be asymmetric as well as non-geometric and hence will be covered in the next chapter

Just one page further:
I'll have to take a closer look tomorrow. There is a fundamental difference between metric distances and non-metric distances, with the latter being a much harder problem in general.

Btw, editing your post to say "They used both Euclidean and non-Euclidean distances," after I already responded about them using euclidean distances makes the line of conversation unintelligible to a 3rd party.

I'll have to take a closer look tomorrow. There is a fundamental difference between metric distances and non-metric distances, with the latter being a much harder problem in general.

Btw, editing your post to say "They used both Euclidean and non-Euclidean distances," after I already responded about them using euclidean distances makes the line of conversation unintelligible to a 3rd party.
Yes, indeed the mention of Euclidean and non-Euclidean distances has confused the issue for me. OP is interested in the cheapest route, therefore "distances" are expressed in currency units and there no x-y coordinates. However, there might be different kinds of considerations such as weekend rates, booking x number of trips within y days, etc.

StoneTemplePython
Btw, editing your post to say "They used both Euclidean and non-Euclidean distances," after I already responded about them using euclidean distances makes the line of conversation unintelligible to a 3rd party.
I changed it three minutes after I made the original post, 11 minutes before you posted. I didn't see it originally, but then I found it so I added it to the post. There was no reply at that point. I guess you opened the thread in the three minutes between my post and my edit.

I changed it three minutes after I made the original post, 11 minutes before you posted. I didn't see it originally, but then I found it so I added it to the post. There was no reply at that point. I guess you opened the thread in the three minutes between my post and my edit.

It's feeling like a glitch in the matrix to me. I believe what you say. It was peculiar my end thouh...

not only did I not see the edit, but from past experience, when I quote an entire response, if the edit comes in before I press "post reply" then the edit actually flows into my response as part of a system sync. Sometimes this feels initially strange as I seem to be responding to something I didn't notice, but the side effect is it gives me a time stamp of what was registered in the system at the time of my response -- or so I thought. Call this a prior belief. The fact that the edited portion seemed to directly relate to my response, flowed into my likelihood function for reading the tea leaves.

- - - -
The point I wanted to make was its common for graphs to have weights that are non euclidean. But in general that's not important for TSP approximation algorithms. What really matters is whether the weights (i.e. implicit underlying cost function) can be said to be a metric. I took a closer look at the chapter you linked and I think it's all metric stuff in that particular chapter. Right? I still think it is not referring to the problem at hand -- i.e. non-metric based TSP, in particular TSP problems involving bizarre cost functions made up by companies operating railroads (and airlines). People can still learn from reading the link, but they should know it refers to a milder problem.

I took a closer look at the chapter you linked and I think it's all metric stuff in that particular chapter. Right?
I didn't check it in detail.

When you click "reply" to add a quote to the textbox the website gets the most recent state of the post as far as I know. Edits after that are not considered.