Is there a flaw in my longest common subsequence algorithm?

SlurrerOfSpeech · Feb 24, 2018

What I have is

Code:

    /// <summary>
    /// Provides a solution to the Common Child string problem
    /// https://www.hackerrank.com/challenges/common-child/problem
    /// </summary>
    public static class CommonChild
    {
        public static int Solve(string first, string second)
        {
            int max = 0;
            for (int i = 0; i < first.Length; ++i)
            {
                int current = 0;
                for (int j = i, k = 0; j < first.Length; ++j)
                {
                    int m = k;
                    bool found = false;
                    for (; k < second.Length && !found; ++k)
                    {
                        if (first[j] == second[k])
                        {
                            current += 1;
                            found = true;
                        }
                    }
                    if (!found)
                    {
                        k = m;
                    }
                }
                if (current > max)
                {
                    max = current;
                }
            }
            return max;
        }
    }

Which is passing the basic test cases of mine and the sites's:

Code:

        [TestMethod]
        public void SampleTest0()
        {
            ExecuteTest(first: "HARRY", second: "SALLY", expected: 2);
        }

        [TestMethod]
        public void SampleTest1()
        {
            ExecuteTest(first: "AA", second: "BB", expected: 0);
        }

        [TestMethod]
        public void SampleTest2()
        {
            ExecuteTest(first: "SHINCHAN", second: "NOHARAAA", expected: 3);
        }

        [TestMethod]
        public void SampleTest3()
        {
            ExecuteTest(first: "ABCDEF", second: "FBDAMN", expected: 2);
        }

        [TestMethod]
        public void MyTest0()
        {
            ExecuteTest(first: "", second: "", expected: 0);
        }

        [TestMethod]
        public void MyTest1()
        {
            ExecuteTest(first: "", second: "A", expected: 0);
        }

        [TestMethod]
        public void MyTest2()
        {
            ExecuteTest(first: "A", second: "A", expected: 1);
        }

        [TestMethod]
        public void MyTest3()
        {
            ExecuteTest(first: "A", second: "B", expected: 0);
        }

        [TestMethod]
        public void MyTest4()
        {
            ExecuteTest(first: "A", second: "AB", expected: 1);
        }

        [TestMethod]
        public void MyTest5()
        {
            ExecuteTest(first: "A", second: "BA", expected: 1);
        }
        [TestMethod]
        public void MyTest6()
        {
            ExecuteTest(first: "AB", second: "BA", expected: 1);
        }

        [TestMethod]
        public void MyTest7()
        {
            ExecuteTest(first: "AB", second: "AB", expected: 2);
        }

        [TestMethod]
        public void MyTest8()
        {
            ExecuteTest(first: "AC", second: "ABC", expected: 2);
        }

        [TestMethod]
        public void MyTest9()
        {
            ExecuteTest(first: "AC", second: "ABCD", expected: 2);
        }

        [TestMethod]
        public void MyTest10()
        {
            ExecuteTest(first: "ACE", second: "ABCDEE", expected: 3);
        }

        [TestMethod]
        public void MyTest11()
        {
            ExecuteTest(first: "ACEF", second: "ABCDEE", expected: 3);
        }

        [TestMethod]
        public void MyTest12()
        {
            ExecuteTest(first: "ACEJJKLMNPST", second: "ABCDEEGHHHIPQRRS", expected: 5);
        }

        [TestMethod]
        public void MyTest13()
        {
            ExecuteTest(first: "ACEF", second: "BCDEEA", expected: 2);
        }

        [TestMethod]
        public void MyTest14()
        {
            ExecuteTest(first: "WYZXY", second: "YOOX", expected: 2);
        }

        [TestMethod]
        public void MyTest15()
        {
            ExecuteTest(first: "XXXXXXYXXX", second: "XYXXXXXXXX", expected: 9);
        }

But it is failing this one

Code:

        [TestMethod]
        public void TestCase1()
        {
            ExecuteTest
            (
                first: "WEWOUCUIDGCGTRMEZEPXZFEJWISRSBBSYXAYDFEJJDLEBVHHKS",
                second: "FDAGCXGKCTKWNECHMRXZWMLRYUCOCZHJRRJBOAJOQJZZVUYXIC",
                expected: 15
            );
        }

and I can't figure out why because the inputs are too big for me to see what's going on. Mine is undercalculating (10).

SlurrerOfSpeech · Feb 24, 2018

Basically, I'm wondering if someone here has a smaller test case that can help me see my error

EngWiPy · Feb 24, 2018

It helps if you explain what you want to achieve in plain English. It is not clear to me from the examples.

Mark44 · Feb 24, 2018

S_David said:

It helps if you explain what you want to achieve in plain English. It is not clear to me from the examples.

I agree. Rather than start with "What I have is" and then a bunch of code it would be helpful to specify what you're looking for. I think I figured out from your examples that the longest matching subsequences in HARRY and SALLY are AY, but neither I nor other readers should have to parse your code to see what you're doing, especially since you have not a single comment other than the summary at the top. I shouldn't have to follow a hyperlink to get the gist of what you're doing.

If you're having trouble with your longer example, take the first 11 characters of each string and watch what is happening in a debugger. Watching what happens for the first 9 characters might be a bit tedious, but would probably enlighten you as to what is causing the bug.

willem2 · Feb 25, 2018

try
CYAT
CATY
in this order. (of course order shouldn't matter in a correct program)

QuantumQuest · Feb 25, 2018

SlurrerOfSpeech said:

... and I can't figure out why because the inputs are too big for me to see what's going on. Mine is undercalculating (10).

First, I'll totally agree to Mark44. It would be much more helpful for anyone reviewing your code but even more importantly for you to put the appropriate comments into your code. A summary comment doesn't tell much. So, the way it is, leaves you with two options. First, as has already been pointed out, you can use your debugger and see where the code fails. Second, you can write your program in some pseudocode form and see what operation(s) potentially fail and go from there. If you didn't write pseudocode first, including the algorithm you use and then wrote your code based on it as - although not recommended, happens to all programmers sometimes , you can do it now. I think that spotting the error(s) this way is preferable in order to get a complete view for any (feasible) input.

A second point is that, as I see reviewing your code, you 're using a brute force approach. This is not the optimal way to solve the longest common subsequence problem. It has ##O(2^n)## time complexity where ##n## is the number of characters of the first string. Why not use dynamic programming? It has ##O(2^n)## time complexity in the worst case i.e. when all characters of the two examined strings mismatch. Using memorization you can deal with the overlapping substructure problem and reduce it to ##O(mn)## where ##m##,##n## the length of the two strings respectively: ##[0...m - 1], [0...n - 1]##.

Is there a flaw in my longest common subsequence algorithm?

1. What is a longest common subsequence algorithm?

2. How do I know if there is a flaw in my longest common subsequence algorithm?

3. What are some common flaws in longest common subsequence algorithms?

4. How can I improve my longest common subsequence algorithm?

5. Are there any resources available to help me with my longest common subsequence algorithm?

Similar threads

Hot Threads

Recent Insights