Sentences about Greatest Common Subsequence

  • MHB
  • Thread starter evinda
  • Start date
  • Tags
    Subsequence
In summary: Space!In summary, the conversation discusses the concept of greatest common subsequence (GCS) in sequences and its properties. A GCS is a subsequence that is common to both sequences and has the greatest possible length. If the last elements of the sequences are equal, then the last element is also part of the GCS and the GCS of the remaining sequences is also a GCS of the original sequences. If the last elements are not equal, then the GCS is the GCS of the remaining sequences and one of the sequences. The conversation also touches upon the difference between a subsequence and a substring and the terminology of "greatest" and "maximal" elements.
  • #1
evinda
Gold Member
MHB
3,836
0
Hello! (Wave)

Let $X=<x_1,x_2, \dots , x_m>$ and $Y=<y_1,y_2, \dots, y_n>$ be sequences and let $Z=<z_1, z_2, \dots, z_k>$ a greatest common subsequence(GCS) of $X$ and $Y$.Then:

  1. If $x_m=y_n$,then $z_k=x_m=y_n$ and $Z_{k-1}$ is a GCS of $X_{m-1}$ and $Y_{n-1}$.
  • If $x_m \neq y_n$ and $z_k \neq x_m$,then $Z_{k-1}$ is a GCS of $X_{m-1}$ and $Y$.

Could you explain me the above sentences? (Thinking)

If we conclude at the first sentence that $Z_{k-1}$ is a GCS of $X_{m-1}$ and $Y_{n-1}$,does this mean that we take into consideration the order of the elements of the sequences? (Thinking)

Also, at the second sentence,do we consider that $n<m$ and that $k=m$ ? (Thinking)
 
Physics news on Phys.org
  • #2
To be on the safe side, it would be best to have the definition of the greatest common subsequence.

evinda said:
If we conclude at the first sentence that $Z_{k-1}$ is a GCS of $X_{m-1}$ and $Y_{n-1}$,does this mean that we take into consideration the order of the elements of the sequences?
We take into account the order of elements in a sequence even if we don't conclude that $Z_{k-1}$ is a GCS of $X_{m-1}$ and $Y_{n-1}$. That's what sequence means.

evinda said:
Also, at the second sentence,do we consider that $n<m$ and that $k=m$ ?
I don't think this is relevant.
 
  • #3
Evgeny.Makarov said:
To be on the safe side, it would be best to have the definition of the greatest common subsequence.

We take into account the order of elements in a sequence even if we don't conclude that $Z_{k-1}$ is a GCS of $X_{m-1}$ and $Y_{n-1}$. That's what sequence means.

I don't think this is relevant.

According to my textbook:

"For two given sequences $X$ and $Y$,we say that a sequence $Z$ is a common subsequence of $X$ and $Y$,if it is a subsequence of $X$ and of $Y$.For example, if $X=<A,B,C,B,D,A,B>$ and $Y=<B,D,C,A,B,A>$,the sequence $<B,C,A>$ is a common subsequence of $X$ and $Y$.However, $<B,C,A>$ is not a GCS of $X$ and $Y$,because its length is $3$,but the sequence $<B,C,B,A>$,that is also common at $X$ and $Y$,has length $4$.The sequence $<B,C,B,A>$ is a GCS of $X$ and $Y$,as also the sequence $<B,D,A,B>$,because there is no common subsequence with length $5$ or with a greater length.

At the problem of the GCS,we are given two sequences $X=<x_1,x_2, \dots ,x_m>$ and $Y=<y_1, y_2, \dots, y_n>$ and we are asked to find a common subsequence of $X$ and $Y$,with the greatest possible length. "If we consider for example this GCS: $<B,C,B,A>$,we see that $x_m \neq y_n$ and $z_k \neq x_m$,so we are at the second case,right?
So..does this mean that $Z_{k-1}$ is a GCS of $X_{m-1}$ and $Y$ ?
But,the subsequence $<B,C,B>$ does not exist at $Y$ in this order..

So,do we not take the order into consideration? (Thinking)
 
  • #4
evinda said:
If we consider for example this GCS: $<B,C,B,A>$,we see that $x_m \neq y_n$ and $z_k \neq x_m$,so we are at the second case,right?
Yes.

evinda said:
But,the subsequence $<B,C,B>$ does not exist at $Y$ in this order.
So, you don't object to the textbook's claim that $\langle B,C,B,A\rangle$ is the GCS of $X=\langle A,B,C,B,D,A,B\rangle$ and $Y=\langle B,D,C,A,B,A\rangle$, but you do object that $\langle B,C,B\rangle$ is a subsequence of $Y$? Isn't the relation "is a subsequence of" transitive? I suggest you review the definition of subsequence in your textbook or in Wikipedia (see also the note at the bottom about the difference between a subsequence and a substring) and explain why you think that $\langle B,C,B\rangle$ is not a subsequence of $Y$.

evinda said:
  1. If $x_m=y_n$,then $z_k=x_m=y_n$ and $Z_{k-1}$ is a GCS of $X_{m-1}$ and $Y_{n-1}$.
  • If $x_m \neq y_n$ and $z_k \neq x_m$,then $Z_{k-1}$ is a GCS of $X_{m-1}$ and $Y$.
I am not sure I agree with 2. Let $X=\langle 1,2,3\rangle$ and $Y=\langle1,4,2,5\rangle$. Then the GCS of $X$ and $Y$ is $Z=\langle 1,2\rangle$. Also, the last elements of $X$ and $Y$ are different, and so are the last elements of $X$ and $Z$, so we are in a situation described by proposition 2. I also agree that $Z_{k-1}=\langle 1\rangle$ is a subsequence of $X_{m-1}$ and $Y$, but it is not the longest common subsequence: $Z$ is.

evinda said:
At the problem of the GCS,we are given two sequences $X=<x_1,x_2, \dots ,x_m>$ and $Y=<y_1, y_2, \dots, y_n>$ and we are asked to find a common subsequence of $X$ and $Y$,with the greatest possible length. "
Just a remark concerning terminology. There is a difference between "greatest" and "maximal". The greatest element is greater than everything else, while a maximal element is such that no greater element exists. The difference happens when the order is not linear and some elements are not comparable. For example, if sets are compared by inclusion, then the family of all subsets of some set $A$ has the greatest element: $A$ itself, since $A\subseteq A$. On the other hand, the family of proper subsets has many maximal subsets but no greatest subset. If you remove one element from $A$, you get a maximal proper subset: there is no greater proper subset. But maximal subsets obtained in this way are not comparable.

In this sense, "the greatest common divisor" is a correct term because that divisor is indeed greater than all other common divisors (and is even divisible by them). Note, however, that in a partial order, the greatest element is unique. Indeed, if $x$ and $y$ are two greatest elements, then $x\ge y$ and $y\ge x$, which implies $x=y$ by antisymmetry. Your textbook says that there can be two greatest common subsequences, which is possible because they should be more precisely called "common subsequences of greatest length". We don't compare subsequences themselves, in which case the greatest subsequence can be only one; we compare their lengths. In this sense, the name GCS is somewhat confusing. The Wikipedia link above calls a GCS "longest common subsequence", which makes more sense.

P.S. Hint: In typed text, every period and comma must be followed by a space.
 
  • #5
Evgeny.Makarov said:
Yes.

So, you don't object to the textbook's claim that $\langle B,C,B,A\rangle$ is the GCS of $X=\langle A,B,C,B,D,A,B\rangle$ and $Y=\langle B,D,C,A,B,A\rangle$, but you do object that $\langle B,C,B\rangle$ is a subsequence of $Y$? Isn't the relation "is a subsequence of" transitive? I suggest you review the definition of subsequence in your textbook or in Wikipedia (see also the note at the bottom about the difference between a subsequence and a substring) and explain why you think that $\langle B,C,B\rangle$ is not a subsequence of $Y$.

I am not sure I agree with 2. Let $X=\langle 1,2,3\rangle$ and $Y=\langle1,4,2,5\rangle$. Then the GCS of $X$ and $Y$ is $Z=\langle 1,2\rangle$. Also, the last elements of $X$ and $Y$ are different, and so are the last elements of $X$ and $Z$, so we are in a situation described by proposition 2. I also agree that $Z_{k-1}=\langle 1\rangle$ is a subsequence of $X_{m-1}$ and $Y$, but it is not the longest common subsequence: $Z$ is.

Just a remark concerning terminology. There is a difference between "greatest" and "maximal". The greatest element is greater than everything else, while a maximal element is such that no greater element exists. The difference happens when the order is not linear and some elements are not comparable. For example, if sets are compared by inclusion, then the family of all subsets of some set $A$ has the greatest element: $A$ itself, since $A\subseteq A$. On the other hand, the family of proper subsets has many maximal subsets but no greatest subset. If you remove one element from $A$, you get a maximal proper subset: there is no greater proper subset. But maximal subsets obtained in this way are not comparable.

In this sense, "the greatest common divisor" is a correct term because that divisor is indeed greater than all other common divisors (and is even divisible by them). Note, however, that in a partial order, the greatest element is unique. Indeed, if $x$ and $y$ are two greatest elements, then $x\ge y$ and $y\ge x$, which implies $x=y$ by antisymmetry. Your textbook says that there can be two greatest common subsequences, which is possible because they should be more precisely called "common subsequences of greatest length". We don't compare subsequences themselves, in which case the greatest subsequence can be only one; we compare their lengths. In this sense, the name GCS is somewhat confusing. The Wikipedia link above calls a GCS "longest common subsequence", which makes more sense.

P.S. Hint: In typed text, every period and comma must be followed by a space.

I found the theorem,that I wrote in my first post,in my notes.But..now I found the same theorem in my textbook and noticed that the former had some mistakes.. :eek:

According to my textbook:

Let $X=<x_1,x_2,…,x_m>$ and $Y=<y_1,y_2,…,y_n>$ be sequences and let $Z=<z_1,z_2,…,z_k>$ a longest common subsequence (LCS) of $X$ and $Y$.Then:

  • If $x_m=y_n$,then $z_k=x_m=y_n$ and $Z_{k-1}$ is a LCS of $X_{m-1}$ and $Y_{n-1}$
  • If $x_m \neq y_n$ and $z_k \neq x_m$,it implies that $Z$ is a LCS of $X_{m-1}$ and $Y$

  • I still haven't understood why $x_m=y_n \Rightarrow z_k=x_m=y_n$. (Thinking)
    If we have for example $X=<A,B,E,A,F,A>$ and $Y=<D,F,B,A>$, should the last element of $Z$ be $A$,or can it also be $F$ or $B$ and we could write $A$ at which position we want?
  • At the second sentence,why do we conclude that $Z$ is a LCS of $X_{m-1}$ and $Y$ ?

(Thinking)
 
  • #6
evinda said:
Let $X=<x_1,x_2,…,x_m>$ and $Y=<y_1,y_2,…,y_n>$ be sequences and let $Z=<z_1,z_2,…,z_k>$ a longest common subsequence (LCS) of $X$ and $Y$.Then:

  • If $x_m=y_n$,then $z_k=x_m=y_n$ and $Z_{k-1}$ is a LCS of $X_{m-1}$ and $Y_{n-1}$
  • If $x_m \neq y_n$ and $z_k \neq x_m$,it implies that $Z$ is a LCS of $X_{m-1}$ and $Y$
This seems more credible.

evinda said:
[*] I still haven't understood why $x_m=y_n \Rightarrow z_k=x_m=y_n$.
If the last elements of $X$ and $Y$ coincide, then why not include it in the longest common subsequence? Without that last element the common subsequence would not be the longest.

evinda said:
If we have for example $X=<A,B,E,A,F,A>$ and $Y=<D,F,B,A>$, should the last element of $Z$ be $A$,or can it also be $F$ or $B$ and we could write $A$ at which position we want?
The longest common subsequences of $X$ and $Y$ are $\langle F,A\rangle$ and $\langle B,A\rangle$. Indeed, looking at $Y$, we see that $D$ does not occur in $X$, so it can't be in an LCS. If an LCS starts with an $F$, then from $X$ it can only be $\langle F\rangle$ or $\langle F,A\rangle$, the second of which is longer. Finally, an LCS can start with a $B$, then it is $\langle B,A\rangle$.
 
  • #7
Evgeny.Makarov said:
If the last elements of $X$ and $Y$ coincide, then why not include it in the longest common subsequence? Without that last element the common subsequence would not be the longest.

I understand.. (Nod)
Evgeny.Makarov said:
The longest common subsequences of $X$ and $Y$ are $\langle F,A\rangle$ and $\langle B,A\rangle$. Indeed, looking at $Y$, we see that $D$ does not occur in $X$, so it can't be in an LCS. If an LCS starts with an $F$, then from $X$ it can only be $\langle F\rangle$ or $\langle F,A\rangle$, the second of which is longer. Finally, an LCS can start with a $B$, then it is $\langle B,A\rangle$.

Isn't the longest common subsequence of $X$ and $Y$: $<A,B,F>$ ? (Thinking) Or am I wrong?
 
  • #8
evinda said:
Isn't the longest common subsequence of $X$ and $Y$: $<A,B,F>$ ?
$\langle A,B,F\rangle$ is not a subsequence of $Y=\langle D,F,B,A\rangle$. In post #4 I recommended looking up the definition of the term "subsequence".
 
  • #9
Evgeny.Makarov said:
$\langle A,B,F\rangle$ is not a subsequence of $Y=\langle D,F,B,A\rangle$. In post #4 I recommended looking up the definition of the term "subsequence".

A ok..I got it now!Thank you very much! (Smile)
 

1. What is a Greatest Common Subsequence (GCS)?

A Greatest Common Subsequence (GCS) is a sequence of characters or elements that appear in the same order in two or more given sequences. It is not necessarily a contiguous subsequence, meaning there can be other elements in between the common elements.

2. How is the length of a GCS determined?

The length of a GCS is determined by the number of characters or elements that are present in the subsequence. It is possible for there to be multiple GCS with different lengths in the same given sequences.

3. What is the importance of finding the GCS in a sequence?

Finding the GCS in a sequence is important because it can help identify the similarities and differences between two or more sequences. This can be useful in fields such as bioinformatics, where comparing genetic sequences can provide insight into evolutionary relationships.

4. How is the GCS algorithmically calculated?

The GCS is typically calculated using dynamic programming algorithms, such as the Longest Common Subsequence (LCS) algorithm. This involves building a matrix to store the lengths of the common subsequences and then backtracking to find the actual GCS.

5. Can the GCS be used to solve other problems?

Yes, the GCS algorithm can be used to solve other problems such as the Shortest Common Supersequence problem, which involves finding the shortest sequence that contains all the given sequences as subsequences. The GCS algorithm can also be adapted for use in string searching and comparing algorithms.

Similar threads

  • Set Theory, Logic, Probability, Statistics
Replies
8
Views
1K
Replies
14
Views
1K
  • Calculus and Beyond Homework Help
Replies
7
Views
1K
  • Differential Equations
Replies
1
Views
1K
Replies
18
Views
2K
  • MATLAB, Maple, Mathematica, LaTeX
Replies
3
Views
1K
  • Linear and Abstract Algebra
Replies
5
Views
1K
  • Set Theory, Logic, Probability, Statistics
Replies
1
Views
869
  • Set Theory, Logic, Probability, Statistics
Replies
1
Views
929
  • Set Theory, Logic, Probability, Statistics
Replies
10
Views
3K
Back
Top