Find the context-free grammar,that produces a language

  • Context: MHB 
  • Thread starter Thread starter evinda
  • Start date Start date
  • Tags Tags
    Language
Click For Summary

Discussion Overview

The discussion revolves around finding a context-free grammar (CFG) that generates a specific language defined by the condition that the number of 'a's in a string is twice the number of 'b's. Participants explore the concept of CFGs, share examples, and discuss methods for proving properties of the grammar.

Discussion Character

  • Exploratory
  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • One participant requests clarification on how to find a CFG for a specific language, indicating a lack of familiarity with CFGs.
  • Another participant suggests that the forum is not a substitute for textbooks and recommends studying examples from published works before asking specific questions.
  • A participant proposes a CFG with rules that introduce one 'b' and two 'a's, noting that this approach may lead to ambiguity in the grammar.
  • There is a discussion about proving that the proposed grammar generates the required language, with participants suggesting methods such as induction on derivation.
  • Some participants express uncertainty about the correctness of their proofs and seek hints or guidance on how to proceed with their arguments.
  • Participants discuss the structure of strings in the language and how to break them down to prove properties related to the number of 'a's and 'b's.

Areas of Agreement / Disagreement

Participants generally agree on the need to establish a CFG for the specified language and the importance of proving that the grammar satisfies the language's conditions. However, there is no consensus on the best approach to proving the properties of the grammar, and multiple viewpoints on the CFG's structure and ambiguity remain present.

Contextual Notes

Participants mention the potential ambiguity of the proposed grammar and the challenge of generating all strings in the language. There are also references to the need for a more economical grammar that is less ambiguous and suitable for parsing.

Who May Find This Useful

This discussion may be useful for students and practitioners interested in context-free grammars, formal language theory, and mathematical proofs related to language generation.

evinda
Gold Member
MHB
Messages
3,741
Reaction score
0
Hello! :)
Could you explain me how I could find the context-free grammar,that produces a language?Because I am not familiar yet with context-free grammars :o
 
Technology news on Phys.org
Sorry, this forum is not a replacement for a textbook. It is possible to write some approximation to a book chapter (like Deveno is won't to do :)), but why would an explanation from an amateur helper be better than that from a published author and a professional mathematician? I assume you mean that you don't yet have enough intuition about CFG and are not comfortable with them, not that you don't know the definition. Even so, you should first go through several examples in a textbook. For example, the book by Lewis and Papadimitriou has five worked out examples of grammars. After you have some familiarity, feel free to post any questions about concrete languages or problems.
 
Evgeny.Makarov said:
Sorry, this forum is not a replacement for a textbook. It is possible to write some approximation to a book chapter (like Deveno is won't to do :)), but why would an explanation from an amateur helper be better than that from a published author and a professional mathematician? I assume you mean that you don't yet have enough intuition about CFG and are not comfortable with them, not that you don't know the definition. Even so, you should first go through several examples in a textbook. For example, the book by Lewis and Papadimitriou has five worked out examples of grammars. After you have some familiarity, feel free to post any questions about concrete languages or problems.

I have seen some examples from an other book I have,but I haven't understood how to find the context-free grammars. :confused: Could you explain me how I can find it,for example,for this language: \{w \epsilon{a,b}^{*}: (number of a in w)=2 * (number of b in w)} ?
 
evinda said:
Could you explain me how I can find it,for example,for this language: \{w \epsilon{a,b}^{*}: (number of a in w)=2 * (number of b in w)} ?
By trial and error, using common sense. (Smile)

First we need to cover the empty string, so there should be the rule
\[
S\to\epsilon
\]
Next, an obvious approach is to have all other rules introduce one $b$ and two $a$s. By induction on derivation, this condition guarantees that each generated word belongs to the language, but there are grammars that don't have this restriction. The challenge is to generate all strings in the language. The newly introduced $b$ may be located to the left of the newly introduced $a$'s, between them, or to the right. So we may define the following rules.
\[
\begin{aligned}
S&\to SbSaSaS\\
S&\to SaSbSaS\\
S&\to SaSaSbS
\end{aligned}
\]
By inserting $S$ between symbols we don't lose the ability to make the intermediate string (with nonterminal symbols) into an arbitrary string in the language. We don't want to commit, for example, to $baa$ with no way to insert something between $b$ and $a$.

It seems to me that these rules should cover the language, though this grammar is probably highly ambiguous, i.e., each terminal word has many derivations. In other words, inserting $S$ everywhere is probably an overkill. Anyway, the next step is proving that this grammar generates precisely the required language.

It is possible to try to provide a more economical grammar, which is less ambiguous and more suitable for parsing.
 
Evgeny.Makarov said:
By trial and error, using common sense. (Smile)

First we need to cover the empty string, so there should be the rule
\[
S\to\epsilon
\]
Next, an obvious approach is to have all other rules introduce one $b$ and two $a$s. By induction on derivation, this condition guarantees that each generated word belongs to the language, but there are grammars that don't have this restriction. The challenge is to generate all strings in the language. The newly introduced $b$ may be located to the left of the newly introduced $a$'s, between them, or to the right. So we may define the following rules.
\[
\begin{aligned}
S&\to SbSaSaS\\
S&\to SaSbSaS\\
S&\to SaSaSbS
\end{aligned}
\]
By inserting $S$ between symbols we don't lose the ability to make the intermediate string (with nonterminal symbols) into an arbitrary string in the language. We don't want to commit, for example, to $baa$ with no way to insert something between $b$ and $a$.

It seems to me that these rules should cover the language, though this grammar is probably highly ambiguous, i.e., each terminal word has many derivations. In other words, inserting $S$ everywhere is probably an overkill. Anyway, the next step is proving that this grammar generates precisely the required language.

It is possible to try to provide a more economical grammar, which is less ambiguous and more suitable for parsing.

I understand.But how can I prove that this grammar generates precisely the required language? :confused: :o
 
evinda said:
But how can I prove that this grammar generates precisely the required language?
I'll wait for your ideas and respond to them.
 
Evgeny.Makarov said:
I'll wait for your ideas and respond to them.

Do I have maybe to do the following:
-to prove that every string produced by the grammar has the property (number of a in w)=2 * (number of b in w)
-to prove that every string with this property is produced by the grammar.

If yes,could you give me a hint how to start? :confused:
 
evinda said:
Do I have maybe to do the following:
-to prove that every string produced by the grammar has the property (number of a in w)=2 * (number of b in w)
-to prove that every string with this property is produced by the grammar.
Certainly.

evinda said:
If yes,could you give me a hint how to start?
Concerning your first point:
Evgeny.Makarov said:
an obvious approach is to have all other rules introduce one $b$ and two $a$s. By induction on derivation, this condition guarantees that each generated word belongs to the language

Then

Evgeny.Makarov said:
The challenge is to generate all strings in the language.

Take an arbitrary string in the language. Suppose it starts with a $b$. Then somewhere in the string there are two $a$'s. (That's an understatement!) These three symbols break the string into four (possibly empty) parts. What can be said about them? That is, this direction uses induction on the string length.
 
Evgeny.Makarov said:
Certainly.Concerning your first point:Then
Take an arbitrary string in the language. Suppose it starts with a $b$. Then somewhere in the string there are two $a$'s. (That's an understatement!) These three symbols break the string into four (possibly empty) parts. What can be said about them? That is, this direction uses induction on the string length.

For the first step,I tried this:

-to prove that every string produced by the grammar has the property (number of a in w)=2 * (number of b in w), by induction on derivation:
Claim:For every natural number n, if w is a sentential form whose derivation is of length n, then w has the property: (number of a in w)=2*(number of b in w)
Base case: The claim holds for n=0,as the only sentential form with derivation length 0 is S,which satisfies the property. The claim also holds for n=1, as the only possible sentential forms are SbSaSaS, SaSbSaS, SaSaSbS, or the empty string.
Induction Hypothesis: Let the claim be true for n<=k. So, assume that for all sentential forms of length less than k satisfy the property.
Induction Step: Consider a sentential form w whose derivation is of length k+1.
Let the sentential form z have a derivation of length k, and by induction hypothesis it satisfies the property. We can write z=xSy, where x and y are sentential forms and we replace S in the (k+1)-step to obtain w. This S is one of the strings SbSaSaS, SaSbSaS, SaSaSbS, or ε. In any case the property is satisfied.

Is this right?
 
  • #10
evinda said:
Base case: The claim holds for n=0,as the only sentential form with derivation length 0 is S,which satisfies the property. The claim also holds for n=1, as the only possible sentential forms are SbSaSaS, SaSbSaS, SaSaSbS, or the empty string.

Up to this point everything is fine. A couple of remarks about the following.

evinda said:
Induction Hypothesis: Let the claim be true for n<=k. So, assume that for all sentential forms of length less than k satisfy the property.
First you say, for $\le k$ and then for $<k$.

evinda said:
Induction Step:
Stating induction hypothesis is a part of the induction step. That is, you should signal the beginning of the induction step before stating the hypothesis.

evinda said:
Consider a sentential form w whose derivation is of length k+1.
Let the sentential form z have a derivation of length k
From the following, $z$ is not an arbitrary sentential form with a derivation of length $k$, but a direct ancestor of $w$. This should be stated from the start.

evinda said:
and by induction hypothesis it satisfies the property. We can write z=xSy, where x and y are sentential forms and we replace S in the (k+1)-step to obtain w. This S is one of the strings SbSaSaS, SaSbSaS, SaSaSbS, or ε.
The last sentence has to be written more carefully. $S$ is not $SbSaSaS$.

evinda said:
In any case the property is satisfied.
Up to now, your reasoning is very detailed, which is fine, but this last statement is comparatively unsupported. How do you use $x$, $y$ and $z$ to prove it?
 
  • #11
Evgeny.Makarov said:
Up to this point everything is fine. A couple of remarks about the following.

First you say, for $\le k$ and then for $<k$.

Stating induction hypothesis is a part of the induction step. That is, you should signal the beginning of the induction step before stating the hypothesis.

From the following, $z$ is not an arbitrary sentential form with a derivation of length $k$, but a direct ancestor of $w$. This should be stated from the start.

The last sentence has to be written more carefully. $S$ is not $SbSaSaS$.

Up to now, your reasoning is very detailed, which is fine, but this last statement is comparatively unsupported. How do you use $x$, $y$ and $z$ to prove it?

x,y,z are strings that satisfy the property.Or can't I do it like that?
 
  • #12
evinda said:
x,y,z are strings that satisfy the property.Or can't I do it like that?
Why do $x$ and $y$ satisfy the property?

I just wanted to say that it would be nice to write in more detail why the fact that $z$ satisfies the property implies that $w$ does, too.
 
  • #13
Evgeny.Makarov said:
Why do $x$ and $y$ satisfy the property?

I just wanted to say that it would be nice to write in more detail why the fact that $z$ satisfies the property implies that $w$ does, too.

Can I explain it like that:
Since S can be replaced by one of the strings SbSaSaS, SaSbSaS, SaSaSbS, ε, then we obtain w cause z has now a derivation of length k+1.
Or am I wrong?
 
  • #14
evinda said:
Can I explain it like that:
Since S can be replaced by one of the strings SbSaSaS, SaSbSaS, SaSaSbS, ε, then we obtain w cause z has now a derivation of length k+1.
Or am I wrong?
Unfortunately, this is not even wrong. :( You are saying, "Since ..., we obtain $w$". But what kind of conclusion is this? It's as if "we obtain $w$" can be true or false; it can't at this point. In the induction step we started by considering a derivation of length $k+1$ and called its conclusion $w$. We already obtained $w$ and this fact needs no justification.

Second, $z$ is a fixed string, a direct predecessor of $w$ in the derivation. You can't say, "$z$ has now a derivation of length $k+1$" because $z$ has not changed, and the derivation we fixed in the induction step has not changed. There is $z$, and by one more rule application we obtain $w$.

Finally, you have not said anything about the number of $a$'s and $b$'s.
 
  • #15
Evgeny.Makarov said:
Unfortunately, this is not even wrong. :( You are saying, "Since ..., we obtain $w$". But what kind of conclusion is this? It's as if "we obtain $w$" can be true or false; it can't at this point. In the induction step we started by considering a derivation of length $k+1$ and called its conclusion $w$. We already obtained $w$ and this fact needs no justification.

Second, $z$ is a fixed string, a direct predecessor of $w$ in the derivation. You can't say, "$z$ has now a derivation of length $k+1$" because $z$ has not changed, and the derivation we fixed in the induction step has not changed. There is $z$, and by one more rule application we obtain $w$.

Finally, you have not said anything about the number of $a$'s and $b$'s.

I will think about it and I will tell you what I think that I could change to improve my answer..:o
 
Last edited by a moderator:

Similar threads

  • · Replies 6 ·
Replies
6
Views
2K
  • · Replies 3 ·
Replies
3
Views
2K
  • · Replies 8 ·
Replies
8
Views
3K
  • · Replies 2 ·
Replies
2
Views
1K
  • · Replies 2 ·
Replies
2
Views
2K
  • · Replies 15 ·
Replies
15
Views
4K
Replies
98
Views
3K
  • · Replies 6 ·
Replies
6
Views
5K
  • · Replies 26 ·
Replies
26
Views
2K
  • · Replies 13 ·
Replies
13
Views
3K