# Solving a colour puzzle programmatically

Gold Member
Summary:
How complex might the algorithm end up being to solve this colour-matching puzzle in an ideally minimal number of moves?
Just for fun (read as: staring at the ceiling trying to sleep), trying to figure out if there is a programmatic way to solve this colour-matching puzzle app on my phone.

It's called... Here is a partially solved puzzle: The dotted tiles are immovable, and serve as references. The undotted tiles can be swapped two at a time.
This is not a time challenge; it is a minimum-move challenge. i.e.: the fewer moves the better.

I was idly wondering whether there is a theoretical "perfect" game, by solving it with the minimum possible moves for a given starting configuration with no wasted moves. (Note that every move you make involves the displacement of two tiles, but normally only one is put in its final location).

A couple steps are involved - some are more engineering that logic - although much of it can be easier done in one's head than with physical automation.

0. Establish conventions:
Columns: A-G, Rows:1-9.

1. Identify the initial configuration.
Capture the colour of tile A1, tag it with its unique RGB value. Repeat 62 times (for this puzzle) all the way to G9.
eg:
[A1:000,000,255], [B1:000,060,255], ...
[A2: 060,000,255], [B2: 060,060,255], ...
... [G9: 255,255,0]

2. Find neighbours.
??
At first I did not see a problem here. I assumed that a neighbour could be identified by subtracting the RG and B values to two tiles. A small reminder means close neighbours. The smallest remainder of all means this is a neighbor. But it doesn't yet tell us which neighbour it is. It could be one of four.

Notice too, that - depending on the difficulty of the puzzle - the difference in tint between neighbours can be arbitrarily small. The whole puzzle might be cool pastels - and that means the RGBs might be so close that neighbours might be arbitrarily hard to distinguish mathematically.

I think there's a lot more logic here than I first thought.

3. Sort.
Use a sorting algorithm that minimizes the number of swaps in the 63-cell dataset. Output a sequence of move instructions.
eg:
[F2 > E2],
[E3 > E9],
[E2 > ??]

I see a problem here too. In step 3, E2 is supposed to move - but its on longer there. It was inadvertently moved in step 1, to make room for F2.

Hope I don't fall asleep too quickly tonight or I'll never figure this out.

Ibix
2020 Award
I'd observe that you have three functions, ##R(x,y)##, ##G(x,y)##, and ##B(x,y)## and you've got the values of those functions at specified points (your dotted tiles). So I'd try to find simple polynomials that fit those fixed points - least-squares will probably do. Each polynomial model will provide predictions of the colours of the movable tiles. See how well the set of predicted colours matches the set of available tiles and pick the model that gives the best match for the lowest order polynomial (there are tools like the Akaike Information Criterion for controlling over-fitting).

Now you've got your predictions for the correct tile positions you first swap any pairs of tiles that are in each other's correct places. Then you swap triplets that are in each others' correct places, then quadruplets, etc.

Baluncore
Now you've got your predictions for the correct tile positions you first swap any pairs of tiles that are in each other's correct places. Then you swap triplets that are in each others' correct places, then quadruplets, etc.
Correct.
Assuming only two tile swaps are permitted.
First identify and tabulate the initial position, and the destination for each movable tile.
Identify cycles by following the linked loops in the list. An example of a cycle of three is; 1 → 5, 5 → 19, 19 → 1.

For n random placed tiles I would expect LOGe( n ) cycles or loops in the list. There may be one loop, there may be n loops if the problem is solved.

The problem requires solving the loops individually, in any loop order. Any departure fuses two economic loops into a more expensive loop.

The worst case is a single loop, one tile correct for one swap, cost = 100%.
The optimum swap sequences will be, for a cycle of;
1. Do not touch tiles that are correct.
2. Look for simple swaps = a cycle of two. Cost is 1 swap / to get 2 correct = 50%
3. Look for cycles of three, where two swaps set three tiles in correct positions. 2/3=66%.
4. Look for cycles of four, where 3 swaps correct 4 tiles. 3/4 = 75%.
5. 4/5 = 80%
6. 5/6 = 83% etc
Note that for a cycle of n, requires n-1 swaps. After the first swap it becomes a cycle of n-1.
This problem is akin to an indirect linked list, a shuffle order problem, or a cyclic cryptographic lookup table.

Gold Member
I'd observe that you have three functions, ##R(x,y)##, ##G(x,y)##, and ##B(x,y)## and you've got the values of those functions at specified points (your dotted tiles). So I'd try to find simple polynomials that fit those fixed points - least-squares will probably do. Each polynomial model will provide predictions of the colours of the movable tiles. See how well the set of predicted colours matches the set of available tiles and pick the model that gives the best match for the lowest order polynomial (there are tools like the Akaike Information Criterion for controlling over-fitting).

Now you've got your predictions for the correct tile positions you first swap any pairs of tiles that are in each other's correct places. Then you swap triplets that are in each others' correct places, then quadruplets, etc.
Wow. I am super out of my depth**. I might have learned about polynomials in HS, but I merely reading the definition of a polynomial, and I'm going cross-eyed.

a function that involves only non-negative integer powers or only positive integer exponents of a variable in an equation

Waitaminnit - this simply saying 'equations that only use positive integer powers', i.e. y=2x^0, y=2x^1, y=2x^2, y=2x^3, etc. OK.

**But I knew I would be as soon as I started writing my post here.

Gold Member
2. Look for simple swaps = a cycle of two. Cost is 1 swap / to get 2 correct = 50%
Yeah. I assumed this would be fairly rare.

3. Look for cycles of three, where two swaps set three tiles in correct positions. 2/3=66%.
I figured this would be exceedingly rare, hardly worth considering. Still, you're right, in that this is the right way.

And that means a lot of additional logic.

Ibix
2020 Award
Waitaminnit - this simply saying 'equations that only use positive integer powers', i.e. y=2x^0, y=2x^1, y=2x^2, y=2x^3, etc. OK.
Yes. So you could guess ##R(x,y)=R_0## (only terms like ##ax^0##) and estimate ##R_0##. If that's a poor match (because the ##R## channel isn't actually a constant) you could try a linear model, ##R(x,y)=R_0+R_{x}x+R_{y}y## (only terms like ##ax^0## and ##bx^1##). If that doesn't work too well either, try a quadratic one, ##R(x,y)=R_0+R_{x}x+R_yy+R_{x2}x^2+R_{xy}xy+R_{y2}y^2## (only terms like ##ax^0##, ##bx^1##, ##cx^2##). Etcetera. There's a standard technique called least-squares for estimating the constants ##R_0##, ##R_x## (etc) if you have a few data points.

The more terms you add the better your fit will be. If you have as many terms in your model as data points then it will fit perfectly - but it's likely to be a force-fit rather than a natural one. Actually, though, you have an independent test in that you can use your model to predict the colours of the scrambled tiles. So there's not much need to worry about things like the information criterion I mentioned. You fit your fixed tiles as best you can and stop adding to the model when the predicted colours of the scrambled tiles are "close enough" to the actual ones.

Staff Emeritus
I'm quite puzzled by the rules of the game and solution. Can any two movable (non-dotted) titles be swapped? Or is there some restriction (i.e. adjacency). If so, isn't it just a sort?

pinball1970
Gold Member
I'm quite puzzled by the rules of the game and solution. Can any two movable (non-dotted) titles be swapped? Or is there some restriction (i.e. adjacency). If so, isn't it just a sort?
Fewest moves?
They may be using gods algorithm @suremarc may be able to elaborate!

Baluncore
They may be using gods algorithm
That is not needed since the subsets of tiles linked in each closed chain, lead directly to an obvious minimum solution. You need only follow each link once for each shuffled tile to identify the loops. It is a trivial list.

Staff Emeritus
If I understand it correctly - and I probably don't - any non-dotted tile can be swapped, and the proper final locations can be determined before swapping anything.

You have N tiles. M are dotted or already in the correct position, so you need N-M-1 swaps.

Baluncore
so you need N-M-1 swaps.
That is a maximum. There will be less because of the economy gained from short chains. Until you see the discrete chains in the shuffled data you will not see the economy.

Staff Emeritus
Thanks. By short chains you mean 1 & 2 need to be swapped, but then (by happenstance) 2 is in position, so I can swap 3 & 4, and again by happenstance 4 is in position, etc? So the answer is (N-M-1)/2 at best and (N-M-1) at worst?

Gold Member
Yah, it's a straight sort. The trickiness lies in the fact that the only criteria is colour, and the colour gradients get more and more subtle.

It is confounded by contextual perception - eg. a yellow tile will look peach or green depending on context. And you have no way to compare them before committing to a swap.

eg: these discs are both the same yellow: Gold Member
That is a maximum. There will be less because of the economy gained from short chains. Until you see the discrete chains in the shuffled data you will not see the economy.
My experienced opinion (having solved several hundred of these) is that - while this may work for a computer that can directly compare two colours in virtual space before committing - it is, by any practical measure, impossible for a human. It is at the human capacity to do no more than simply place a single tile in its correct location without error - never mind two at a time.

IOW: it's harder than you think.

(Anyone is invited to try it. It's a free downloadable app on Google Play and presumably Apple Store).

pbuk
Gold Member
Thanks. By short chains you mean 1 & 2 need to be swapped, but then (by happenstance) 2 is in position, so I can swap 3 & 4, and again by happenstance 4 is in position, etc? So the answer is (N-M-1)/2 at best and (N-M-1) at worst?
I think the minimum is ## \left\lceil \frac{N - M}{2} \right\rceil ## which is 1 more than you have for e.g. ## N - M = 3 ##.

pbuk
Gold Member
eg: these discs are both the same yellow:
Actually they are not because of the resolution/compression (I guess the image has been reduced from its original size): the magenta/green bars 'bleed' into the yellow. But I take your point.

pbuk
Gold Member
It is at the human capacity to do no more than simply place a single tile in its correct location without error - never mind two at a time.
I don't think it matters: as long as you are not actually making a mistake then aren't the lengths of the chains fixed?

Gold Member
(I guess the image has been reduced from its original size)
Yeah. I just posted the thumbnail.

Gold Member
I don't think it matters: as long as you are not actually making a mistake then aren't the lengths of the chains fixed?
If all humans had to do was "not actually make mistakes", this would be a very different world.

I think you need to try it to understand what you're asking of a human.

Baluncore
The interesting thing about n shuffled links is that there will probably be Ln( n ) loops.
Each loop saves one swap, which suggests statistically; n – Ln( n ) swaps will be needed.

Consider an array indexed from 0 to 11.
Initialise each element with it's index, so it points to itself.
Shuffle the contents of the array.

Then start at 0, and follow the links along the chain until it returns to 0.
Then select the lowest untouched, and follow that new chain until it must also close.
Repeat until all elements have been identified as members of a chain.

A link that points to itself is resolved.
If a chain contains two elements, then a single swap will resolve two tiles.
If a chain contains three elements, then two swaps resolve three tiles. Etc.

Now think of the array as a chain locker that begins with a single loop of n links.
Randomly select two links and swap them. If the links are in the same loop, the loop is split into two smaller loops. If the links are in different loops, those two loops are fused into one. Continue the shuffling process.

Shuffling by random swaps will result most often in Ln( n ) loops.
The chance that there will be only one loop is 1 / n
The chance that there will be n loops is 1 / n!

Here is a random example for n = 12. Gold Member
I guess that's the key. A computer has the ability to:
1] store all its work virtually, and only make a move once it's processed all the options it sees fit.
2] know exactly what colour a tile is - and therefore an exact index as to where it belongs - without requiring any contextual comparison.

A human could use a pencil notebook for 1] but could not do 2] without artificial accessories.

Staff Emeritus
which is 1 more than you have

To be fair, it's only 1/2 more. In the words of my old gunnery sergeant "You three guys - half of you come with me!"

• pbuk
pbuk
Gold Member
I guess that's the key. A computer has the ability to:
1] store all its work virtually, and only make a move once it's processed all the options it sees fit.
But you don't need to process all the options, as soon as you know exactly where a tile goes you lose nothing by swapping it into that place straight away.

Gold Member
But you don't need to process all the options, as soon as you know exactly where a tile goes you lose nothing by swapping it into that place straight away.
1] You don't know exactly where a tile goes - because human colour perception is subjective. That's the real trick to the puzzle*.

*(And, incidentally, why I originally thought it would be trivial for a program to solve. If you have processed every tile and obtained absolute values for every one, the puzzle is 3/4ths solved.)

2] If you "swap [a tile] into place straight away", then you are only solving chains of length one - i.e. the second tile in a given swap is not solved at all.

To solve chains of length two or more, a solver would need to try many permutations before even finding a single valid swap-pair. It might quite likely go through every permutation of a given puzzle and not find a single pair of mutually-swappable tiles.

pbuk
Gold Member
1] You don't know exactly where a tile goes - because human colour perception is subjective. That's the real trick to the puzzle*.

Yes I think we are all agreed on this.

2] If you "swap [a tile] into place straight away", then you are only solving chains of length one - i.e. the second tile in a given swap is not solved at all.

First let's sort some terminology: @Baluncore's post #21 called a tile that was already in the right place a chain of length 1: I prefer to call that a chain of length 0 and it seems that you do too so let's go with that.

Now when we swap the first tile into the correct place then there are only two possibilities:
1. We complete a chain of length 1
2. We start a chain of length > 1.
To solve chains of length two or more, a solver would need to try many permutations before even finding a single valid swap-pair.
No, this is not the case. You don't have to search for chains because you can never break them by making a 'wrong' swap.

It might quite likely go through every permutation of a given puzzle and not find a single pair of mutually-swappable tiles.
It may well be the case that there are no chains of length 1, but there is nothing to be lost or gained from searching for them.