Is there a flaw in my longest common subsequence algorithm?

In summary: TestMethod] public void SampleTest3() { ExecuteTest(first: "NOHARAAA", second: "RAINSIDE", expected: 2); } }In summary, the solution to the Common Child string problem is to pass the basic test cases of the sites's.
  • #1
SlurrerOfSpeech
141
11
What I have is

Code:
    /// <summary>
    /// Provides a solution to the Common Child string problem
    /// https://www.hackerrank.com/challenges/common-child/problem
    /// </summary>
    public static class CommonChild
    {
        public static int Solve(string first, string second)
        {
            int max = 0;
            for (int i = 0; i < first.Length; ++i)
            {
                int current = 0;
                for (int j = i, k = 0; j < first.Length; ++j)
                {
                    int m = k;
                    bool found = false;
                    for (; k < second.Length && !found; ++k)
                    {
                        if (first[j] == second[k])
                        {
                            current += 1;
                            found = true;
                        }
                    }
                    if (!found)
                    {
                        k = m;
                    }
                }
                if (current > max)
                {
                    max = current;
                }
            }
            return max;
        }
    }

Which is passing the basic test cases of mine and the sites's:

Code:
        [TestMethod]
        public void SampleTest0()
        {
            ExecuteTest(first: "HARRY", second: "SALLY", expected: 2);
        }

        [TestMethod]
        public void SampleTest1()
        {
            ExecuteTest(first: "AA", second: "BB", expected: 0);
        }

        [TestMethod]
        public void SampleTest2()
        {
            ExecuteTest(first: "SHINCHAN", second: "NOHARAAA", expected: 3);
        }

        [TestMethod]
        public void SampleTest3()
        {
            ExecuteTest(first: "ABCDEF", second: "FBDAMN", expected: 2);
        }

        [TestMethod]
        public void MyTest0()
        {
            ExecuteTest(first: "", second: "", expected: 0);
        }

        [TestMethod]
        public void MyTest1()
        {
            ExecuteTest(first: "", second: "A", expected: 0);
        }

        [TestMethod]
        public void MyTest2()
        {
            ExecuteTest(first: "A", second: "A", expected: 1);
        }

        [TestMethod]
        public void MyTest3()
        {
            ExecuteTest(first: "A", second: "B", expected: 0);
        }

        [TestMethod]
        public void MyTest4()
        {
            ExecuteTest(first: "A", second: "AB", expected: 1);
        }

        [TestMethod]
        public void MyTest5()
        {
            ExecuteTest(first: "A", second: "BA", expected: 1);
        }
        [TestMethod]
        public void MyTest6()
        {
            ExecuteTest(first: "AB", second: "BA", expected: 1);
        }

        [TestMethod]
        public void MyTest7()
        {
            ExecuteTest(first: "AB", second: "AB", expected: 2);
        }

        [TestMethod]
        public void MyTest8()
        {
            ExecuteTest(first: "AC", second: "ABC", expected: 2);
        }

        [TestMethod]
        public void MyTest9()
        {
            ExecuteTest(first: "AC", second: "ABCD", expected: 2);
        }

        [TestMethod]
        public void MyTest10()
        {
            ExecuteTest(first: "ACE", second: "ABCDEE", expected: 3);
        }

        [TestMethod]
        public void MyTest11()
        {
            ExecuteTest(first: "ACEF", second: "ABCDEE", expected: 3);
        }

        [TestMethod]
        public void MyTest12()
        {
            ExecuteTest(first: "ACEJJKLMNPST", second: "ABCDEEGHHHIPQRRS", expected: 5);
        }

        [TestMethod]
        public void MyTest13()
        {
            ExecuteTest(first: "ACEF", second: "BCDEEA", expected: 2);
        }

        [TestMethod]
        public void MyTest14()
        {
            ExecuteTest(first: "WYZXY", second: "YOOX", expected: 2);
        }

        [TestMethod]
        public void MyTest15()
        {
            ExecuteTest(first: "XXXXXXYXXX", second: "XYXXXXXXXX", expected: 9);
        }

But it is failing this one

Code:
        [TestMethod]
        public void TestCase1()
        {
            ExecuteTest
            (
                first: "WEWOUCUIDGCGTRMEZEPXZFEJWISRSBBSYXAYDFEJJDLEBVHHKS",
                second: "FDAGCXGKCTKWNECHMRXZWMLRYUCOCZHJRRJBOAJOQJZZVUYXIC",
                expected: 15
            );
        }

and I can't figure out why because the inputs are too big for me to see what's going on. Mine is undercalculating (10).
 
Technology news on Phys.org
  • #2
Basically, I'm wondering if someone here has a smaller test case that can help me see my error
 
  • #3
It helps if you explain what you want to achieve in plain English. It is not clear to me from the examples.
 
  • #4
S_David said:
It helps if you explain what you want to achieve in plain English. It is not clear to me from the examples.
I agree. Rather than start with "What I have is" and then a bunch of code it would be helpful to specify what you're looking for. I think I figured out from your examples that the longest matching subsequences in HARRY and SALLY are AY, but neither I nor other readers should have to parse your code to see what you're doing, especially since you have not a single comment other than the summary at the top. I shouldn't have to follow a hyperlink to get the gist of what you're doing.

If you're having trouble with your longer example, take the first 11 characters of each string and watch what is happening in a debugger. Watching what happens for the first 9 characters might be a bit tedious, but would probably enlighten you as to what is causing the bug.
 
  • Like
Likes QuantumQuest
  • #5
try
CYAT
CATY
in this order. (of course order shouldn't matter in a correct program)
 
  • #6
SlurrerOfSpeech said:
... and I can't figure out why because the inputs are too big for me to see what's going on. Mine is undercalculating (10).

First, I'll totally agree to Mark44. It would be much more helpful for anyone reviewing your code but even more importantly for you to put the appropriate comments into your code. A summary comment doesn't tell much. So, the way it is, leaves you with two options. First, as has already been pointed out, you can use your debugger and see where the code fails. Second, you can write your program in some pseudocode form and see what operation(s) potentially fail and go from there. If you didn't write pseudocode first, including the algorithm you use and then wrote your code based on it as - although not recommended, happens to all programmers sometimes , you can do it now. I think that spotting the error(s) this way is preferable in order to get a complete view for any (feasible) input.

A second point is that, as I see reviewing your code, you 're using a brute force approach. This is not the optimal way to solve the longest common subsequence problem. It has ##O(2^n)## time complexity where ##n## is the number of characters of the first string. Why not use dynamic programming? It has ##O(2^n)## time complexity in the worst case i.e. when all characters of the two examined strings mismatch. Using memorization you can deal with the overlapping substructure problem and reduce it to ##O(mn)## where ##m##,##n## the length of the two strings respectively: ##[0...m - 1], [0...n - 1]##.
 

1. What is a longest common subsequence algorithm?

A longest common subsequence algorithm is a method used to find the longest subsequence that is common to two or more sequences. It is often used in bioinformatics, computer science, and other fields to compare strings of data and identify similarities.

2. How do I know if there is a flaw in my longest common subsequence algorithm?

There are a few ways to determine if there is a flaw in your longest common subsequence algorithm. You can test it with different input sequences and see if the results are accurate. You can also compare your results with other established algorithms to see if there are any discrepancies.

3. What are some common flaws in longest common subsequence algorithms?

Some common flaws in longest common subsequence algorithms include not accounting for edge cases, incorrect handling of special characters, and not considering all possible combinations of input sequences. It is important to thoroughly test and debug your algorithm to identify and fix any potential flaws.

4. How can I improve my longest common subsequence algorithm?

One way to improve your longest common subsequence algorithm is to optimize the code for efficiency. This could include reducing the number of steps or using more efficient data structures. Additionally, incorporating additional checks and error handling can help to improve the accuracy and reliability of your algorithm.

5. Are there any resources available to help me with my longest common subsequence algorithm?

Yes, there are many resources available to help with longest common subsequence algorithms. Searching online for tutorials, forums, and code examples can provide valuable insights and tips. There are also textbooks and academic papers that discuss the theory and implementation of these algorithms in depth.

Similar threads

  • Programming and Computer Science
Replies
9
Views
1K
  • Programming and Computer Science
Replies
2
Views
643
  • Programming and Computer Science
Replies
8
Views
1K
  • Programming and Computer Science
Replies
3
Views
1K
  • Programming and Computer Science
2
Replies
36
Views
2K
  • Programming and Computer Science
Replies
1
Views
3K
  • Programming and Computer Science
Replies
2
Views
2K
  • Programming and Computer Science
Replies
1
Views
1K
  • Programming and Computer Science
Replies
2
Views
2K
  • Programming and Computer Science
Replies
3
Views
3K
Back
Top