
Reading regular expressions

  1. Nov 26, 2012 #1

    I came across a regular expression I couldn't get my head around.

    Code (Text):
    ' there \([^ ]*\)'
    Code (Text):
    echo "Howdy there neighbor" | sed 's/there \([^ ]*\)//'
    returns "Howdy " (with a trailing space).

    It's the subgroup that's a bit confusing.

    My reading: match any sentence which contains "there", then a space, and then a non-space character.

    Is this the correct way of interpreting this regular expression?
  3. Nov 26, 2012 #2
    So, basically, it matches "there " (the word "there" followed by a blank space) followed by as many consecutive non-space characters as it can find ("[^ ]*", which may also match zero characters), and replaces the whole match with nothing.

    You can check that it only replaces what I said by testing it with "Howdy there neighbor what up?"

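    Oh, the backslashes are there to turn the parentheses into a group: in sed's basic regular expressions, \( and \) mark a subgroup, while unescaped parentheses match literal parenthesis characters.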
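
For anyone who wants to try the test suggested above, here is a minimal sketch. The variable names `result` and `captured` are only illustrative, and the second command (using `\1` in the replacement) is an added illustration of what the `\( \)` group captures, not something from the original post.

```shell
# Same substitution as in the thread, run on the longer test sentence.
# Only "there neighbor" is removed; the spaces on either side of it remain.
result=$(echo "Howdy there neighbor what up?" | sed 's/there \([^ ]*\)//')
echo "[$result]"

# The \( \) pair captures the word after "there "; \1 reuses that capture
# in the replacement, which makes the captured text visible.
captured=$(echo "Howdy there neighbor what up?" | sed 's/there \([^ ]*\)/<\1>/')
echo "[$captured]"
```

The brackets in the `echo` lines are only there to make leading and trailing spaces visible in the output.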
  4. Nov 28, 2012 #3

    Is this the same as saying 'match as much as possible up until a white space is found'?
  5. Nov 28, 2012 #4

    "White space" would normally include tab characters. This will eat everything up until the first space character; that detail aside, yes.
  6. Nov 28, 2012 #5
    Sorry, I guess I need to be more precise, as Ibix says.

    The expression "[^ ]*" will consume one character after another until it finds a "blank space" character or reaches the end of the line.

    A "blank space" character is not the same as "white space" in general; it is just one kind of white space. If the regular expression is looking for "white space", then both "blank space" and "tab" characters qualify, but if it is looking for a "blank space", then a "tab" is a totally different character.
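
The space-versus-tab distinction can be checked with a short sketch. The input string and the variable names are made up for illustration, and the `[^[:space:]]*` variant is an assumption about how one might write "stop at any white space" using a POSIX character class; it does not appear in the original expression.

```shell
# Input: "there", one space, "neigh", a TAB, "bor", one space, "end".
input=$(printf 'Howdy there neigh\tbor end')

# [^ ]* stops only at a space character, so it runs straight through the tab
# and removes "neigh<TAB>bor" along with "there ".
space_only=$(printf '%s\n' "$input" | sed 's/there \([^ ]*\)//')

# [^[:space:]]* stops at any white space, so the tab ends the match
# and only "there neigh" is removed.
any_space=$(printf '%s\n' "$input" | sed 's/there \([^[:space:]]*\)//')

echo "[$space_only]"
echo "[$any_space]"
```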