How can I modify my regex to allow for overlapping matches in Python?

In summary, the thread discusses a function that generates a regex from a template string, for use in describing rules for algebraic manipulations. The generated regex matches the intended form, but it returns only one way of dividing the input among its capture groups, while the poster wants every possible division. The poster is considering using lex for the task instead.
  • #1
TylerH
I'm writing a function to take a string like "aXYb" and return a regex in which the lowercase letters act as literal characters and the uppercase letters become free variables.

The regex generated from "aXYb" should match anything of the form a([a-z]+)([a-z]+)b. It does, but not as "freely" as I would like. For example, when I call re.compile("a([a-z]+)([a-z]+)b").match("aaxab").groups(), the only tuple I get back is ("ax", "a").
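A minimal sketch of the kind of function described above, to make the behavior concrete. The name template_to_regex is hypothetical and not from the original post, and the sketch assumes every uppercase letter becomes an independent ([a-z]+) group.

```python
import re


def template_to_regex(template: str) -> str:
    """Build a regex from a template such as "aXYb".

    Hypothetical helper: lowercase letters match themselves, and each
    uppercase letter becomes its own capture group of one or more
    lowercase letters.
    """
    parts = []
    for ch in template:
        if ch.isupper():
            parts.append("([a-z]+)")
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


pattern = template_to_regex("aXYb")
match = re.compile(pattern).match("aaxab")
print(pattern)         # a([a-z]+)([a-z]+)b
print(match.groups())  # ('ax', 'a')
```

The first group is greedy, so it takes as much as it can while still letting the rest of the pattern succeed. That is why only ("ax", "a") is reported.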

Preferably, I'd like both ("a", "xa") and ("ax", "a"). How can I change "a([a-z]+)([a-z]+)b" to get these results?

For a little background, it's intended to be used in describing rules for algebraic manipulations (in the abstract algebra sense). The final goal is to be able to do a breadth-first search of all manipulations until I get to the desired result. The free variables are used in describing the variable part of valid manipulations, like (AB=AC -> B=C). That's why overlapping is absolutely necessary.
 
  • #2
That's not how regular expressions work. They are hard enough as it is, for both users and implementers. You are the one who knows that ([a-z]+)([a-z]+) means something special. You can form a list of all possible matches from the one match you do get, by re-splitting the text that the adjacent groups captured between them.
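One way to get every split is to skip the regex engine and enumerate the assignments directly with a small backtracking search. This is a swapped-in technique, not something proposed in the thread. The function name all_bindings is hypothetical, the sketch requires the template to match the whole string (unlike re.match, which only anchors at the start), and it treats every uppercase letter as an independent variable, so a repeated letter is not forced to bind the same text.

```python
from typing import Iterator, Tuple


def all_bindings(template: str, text: str) -> Iterator[Tuple[str, ...]]:
    """Yield every tuple of variable values that makes template equal text.

    Hypothetical helper: lowercase letters in template are literals, and
    each uppercase letter stands for one or more characters in a-z.
    """
    if not template:
        # Succeed only if the text has been fully consumed as well.
        if not text:
            yield ()
        return
    head, rest = template[0], template[1:]
    if head.isupper():
        # Try every non-empty prefix of a-z characters for this variable.
        for end in range(1, len(text) + 1):
            if not "a" <= text[end - 1] <= "z":
                break
            for tail in all_bindings(rest, text[end:]):
                yield (text[:end],) + tail
    elif text.startswith(head):
        # Literal character: consume it and continue with the remainder.
        yield from all_bindings(rest, text[1:])


print(list(all_bindings("aXYb", "aaxab")))  # [('a', 'xa'), ('ax', 'a')]
```

Because it is a generator, a breadth-first search over manipulations could consume the bindings one at a time instead of building the full list first.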
 
  • #3
Yeah, I figured out that algebraic expressions are context-free rather than regular. I think lex is used for context-free languages. Would it be easier to get lex to do this than to write my own?
 

1. What are overlapping matches in Python?

Overlapping matches are matches of a pattern that share one or more characters of the input string, so that one match starts before the previous one has ended. For example, the pattern "aa" occurs in "aaaa" starting at positions 0, 1, and 2, and neighboring occurrences overlap.

2. How do I identify overlapping matches in Python?

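The re.finditer() function returns an iterator of match objects, but on its own it yields only non-overlapping matches: after each match, the scan resumes at the end of that match. To see overlapping matches, wrap the pattern in a zero-width lookahead containing a capture group, such as (?=(aa)). The lookahead consumes no characters, so the pattern is tried at every position, and each match object's start() gives the position while group(1) gives the matched text.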

3. Can I use regular expressions to find overlapping matches in Python?

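Yes, but not with a plain pattern. The re.findall() function returns only non-overlapping matches, so the pattern has to be wrapped in a lookahead with a capture group, as in (?=(aa)), to report matches that overlap. Note that this finds matches starting at different positions. It does not list the different ways a single match can be divided among its capture groups, which is what the original question asks for.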
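As a sketch of how the re module behaves on a string where matches could overlap, the example below compares a plain pattern with the same pattern wrapped in a lookahead. The sample string "aaaa" and the variable names are chosen for illustration only.

```python
import re

text = "aaaa"

# A plain pattern: the scan resumes after each match, so matches never overlap.
non_overlapping = re.findall(r"aa", text)

# A zero-width lookahead consumes nothing, so the pattern is tried at every
# position; the capture group records what would match there.
overlapping = re.findall(r"(?=(aa))", text)

# finditer with the same lookahead also exposes where each match starts.
positions = [m.start() for m in re.finditer(r"(?=(aa))", text)]

print(non_overlapping)  # ['aa', 'aa']
print(overlapping)      # ['aa', 'aa', 'aa']
print(positions)        # [0, 1, 2]
```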

4. How can I avoid overlapping matches in Python?

Nothing special is needed to avoid overlapping matches, because that is the default behavior. The functions re.findall(), re.finditer(), and re.sub() all scan left to right and resume after the end of each match, so the matches they process never overlap.

5. Are there any limitations to using overlapping matches in Python?

One limitation is cost. Finding overlapping matches means trying the pattern at every position instead of skipping past each match, which can be expensive on large strings. Listing every way to divide a match among capture groups is costlier still, because the number of divisions grows with the length of the matched text and the number of groups.
