Is there a flaw in my longest common subsequence algorithm?

Click For Summary
The discussion centers on a solution for the Common Child string problem, where the provided code aims to find the length of the longest common subsequence between two strings. The initial implementation passes several basic test cases but fails on a larger test case, undercalculating the expected result. Participants emphasize the need for clearer explanations and comments within the code to improve understanding. Suggestions include debugging smaller test cases to identify errors and considering a more efficient approach using dynamic programming instead of the current brute-force method, which has a high time complexity. Dynamic programming could optimize the solution, reducing the complexity significantly and addressing the overlapping substructure problem effectively.
SlurrerOfSpeech
Messages
141
Reaction score
11
What I have is

Code:
    /// <summary>
    /// Provides a solution to the Common Child string problem
    /// https://www.hackerrank.com/challenges/common-child/problem
    /// </summary>
    public static class CommonChild
    {
        public static int Solve(string first, string second)
        {
            int max = 0;
            for (int i = 0; i < first.Length; ++i)
            {
                int current = 0;
                for (int j = i, k = 0; j < first.Length; ++j)
                {
                    int m = k;
                    bool found = false;
                    for (; k < second.Length && !found; ++k)
                    {
                        if (first[j] == second[k])
                        {
                            current += 1;
                            found = true;
                        }
                    }
                    if (!found)
                    {
                        k = m;
                    }
                }
                if (current > max)
                {
                    max = current;
                }
            }
            return max;
        }
    }

Which is passing the basic test cases of mine and the sites's:

Code:
        [TestMethod]
        public void SampleTest0()
        {
            ExecuteTest(first: "HARRY", second: "SALLY", expected: 2);
        }

        [TestMethod]
        public void SampleTest1()
        {
            ExecuteTest(first: "AA", second: "BB", expected: 0);
        }

        [TestMethod]
        public void SampleTest2()
        {
            ExecuteTest(first: "SHINCHAN", second: "NOHARAAA", expected: 3);
        }

        [TestMethod]
        public void SampleTest3()
        {
            ExecuteTest(first: "ABCDEF", second: "FBDAMN", expected: 2);
        }

        [TestMethod]
        public void MyTest0()
        {
            ExecuteTest(first: "", second: "", expected: 0);
        }

        [TestMethod]
        public void MyTest1()
        {
            ExecuteTest(first: "", second: "A", expected: 0);
        }

        [TestMethod]
        public void MyTest2()
        {
            ExecuteTest(first: "A", second: "A", expected: 1);
        }

        [TestMethod]
        public void MyTest3()
        {
            ExecuteTest(first: "A", second: "B", expected: 0);
        }

        [TestMethod]
        public void MyTest4()
        {
            ExecuteTest(first: "A", second: "AB", expected: 1);
        }

        [TestMethod]
        public void MyTest5()
        {
            ExecuteTest(first: "A", second: "BA", expected: 1);
        }
        [TestMethod]
        public void MyTest6()
        {
            ExecuteTest(first: "AB", second: "BA", expected: 1);
        }

        [TestMethod]
        public void MyTest7()
        {
            ExecuteTest(first: "AB", second: "AB", expected: 2);
        }

        [TestMethod]
        public void MyTest8()
        {
            ExecuteTest(first: "AC", second: "ABC", expected: 2);
        }

        [TestMethod]
        public void MyTest9()
        {
            ExecuteTest(first: "AC", second: "ABCD", expected: 2);
        }

        [TestMethod]
        public void MyTest10()
        {
            ExecuteTest(first: "ACE", second: "ABCDEE", expected: 3);
        }

        [TestMethod]
        public void MyTest11()
        {
            ExecuteTest(first: "ACEF", second: "ABCDEE", expected: 3);
        }

        [TestMethod]
        public void MyTest12()
        {
            ExecuteTest(first: "ACEJJKLMNPST", second: "ABCDEEGHHHIPQRRS", expected: 5);
        }

        [TestMethod]
        public void MyTest13()
        {
            ExecuteTest(first: "ACEF", second: "BCDEEA", expected: 2);
        }

        [TestMethod]
        public void MyTest14()
        {
            ExecuteTest(first: "WYZXY", second: "YOOX", expected: 2);
        }

        [TestMethod]
        public void MyTest15()
        {
            ExecuteTest(first: "XXXXXXYXXX", second: "XYXXXXXXXX", expected: 9);
        }

But it is failing this one

Code:
        [TestMethod]
        public void TestCase1()
        {
            ExecuteTest
            (
                first: "WEWOUCUIDGCGTRMEZEPXZFEJWISRSBBSYXAYDFEJJDLEBVHHKS",
                second: "FDAGCXGKCTKWNECHMRXZWMLRYUCOCZHJRRJBOAJOQJZZVUYXIC",
                expected: 15
            );
        }

and I can't figure out why because the inputs are too big for me to see what's going on. Mine is undercalculating (10).
 
Technology news on Phys.org
Basically, I'm wondering if someone here has a smaller test case that can help me see my error
 
It helps if you explain what you want to achieve in plain English. It is not clear to me from the examples.
 
S_David said:
It helps if you explain what you want to achieve in plain English. It is not clear to me from the examples.
I agree. Rather than start with "What I have is" and then a bunch of code it would be helpful to specify what you're looking for. I think I figured out from your examples that the longest matching subsequences in HARRY and SALLY are AY, but neither I nor other readers should have to parse your code to see what you're doing, especially since you have not a single comment other than the summary at the top. I shouldn't have to follow a hyperlink to get the gist of what you're doing.

If you're having trouble with your longer example, take the first 11 characters of each string and watch what is happening in a debugger. Watching what happens for the first 9 characters might be a bit tedious, but would probably enlighten you as to what is causing the bug.
 
  • Like
Likes QuantumQuest
try
CYAT
CATY
in this order. (of course order shouldn't matter in a correct program)
 
SlurrerOfSpeech said:
... and I can't figure out why because the inputs are too big for me to see what's going on. Mine is undercalculating (10).

First, I'll totally agree to Mark44. It would be much more helpful for anyone reviewing your code but even more importantly for you to put the appropriate comments into your code. A summary comment doesn't tell much. So, the way it is, leaves you with two options. First, as has already been pointed out, you can use your debugger and see where the code fails. Second, you can write your program in some pseudocode form and see what operation(s) potentially fail and go from there. If you didn't write pseudocode first, including the algorithm you use and then wrote your code based on it as - although not recommended, happens to all programmers sometimes , you can do it now. I think that spotting the error(s) this way is preferable in order to get a complete view for any (feasible) input.

A second point is that, as I see reviewing your code, you 're using a brute force approach. This is not the optimal way to solve the longest common subsequence problem. It has ##O(2^n)## time complexity where ##n## is the number of characters of the first string. Why not use dynamic programming? It has ##O(2^n)## time complexity in the worst case i.e. when all characters of the two examined strings mismatch. Using memorization you can deal with the overlapping substructure problem and reduce it to ##O(mn)## where ##m##,##n## the length of the two strings respectively: ##[0...m - 1], [0...n - 1]##.
 
Learn If you want to write code for Python Machine learning, AI Statistics/data analysis Scientific research Web application servers Some microcontrollers JavaScript/Node JS/TypeScript Web sites Web application servers C# Games (Unity) Consumer applications (Windows) Business applications C++ Games (Unreal Engine) Operating systems, device drivers Microcontrollers/embedded systems Consumer applications (Linux) Some more tips: Do not learn C++ (or any other dialect of C) as a...

Similar threads

  • · Replies 9 ·
Replies
9
Views
2K
  • · Replies 2 ·
Replies
2
Views
2K
  • · Replies 8 ·
Replies
8
Views
2K
  • · Replies 36 ·
2
Replies
36
Views
3K
  • · Replies 3 ·
Replies
3
Views
2K
Replies
2
Views
2K
  • · Replies 1 ·
Replies
1
Views
4K
  • · Replies 2 ·
Replies
2
Views
2K
Replies
1
Views
2K
  • · Replies 2 ·
Replies
2
Views
2K