Basic Shell Script with Sed Malfunctioning

  • Thread starter: sunmaz94
  • Tags: Shell
AI Thread Summary
The discussion revolves around a bash shell script intended to rename files by removing vowels from their names. The main issue is the regex used to strip file extensions: it uses an unescaped plus sign, which sed's basic regular expressions treat as a literal character rather than as a quantifier. Participants suggest escaping the plus sign, replacing it with an asterisk, or using the `-r` option with `sed` to enable extended regex syntax, in which the plus sign means "one or more". Shell parameter expansion is also proposed as a more efficient way to remove extensions, and the `-r` and `-E` options are noted to be non-portable across systems. The conversation emphasizes how regex behavior differs between environments and the pitfalls of relying on non-standard `sed` features.
sunmaz94

Homework Statement



Write a bash shell script to do the following:

# This shell script renames all files in the current
# directory, removing all vowels in the names.
# if the resultant name would lack any characters (excluding the extension)
# The file is not renamed. Also does not attempt to rename files that have no vowels.

Homework Equations



None.

The Attempt at a Solution


I do not understand why this line of code doesn't work:

Code:
new_name_no_ext=`echo $voweless_name | sed -e "s/\.[^\.]+$//"`

The above line should assign to $new_name_no_ext the string $voweless_name, without an extension. I am quite positive that the regex used is correct.

The entire program is:

Code:
#!/usr/bin/bash

# This shell script renames all files in the current
# directory, removing all vowels in the names.
# if the resultant name would lack any characters (excluding the extension)
# The file is not renamed. Also does not attempt to rename files that have no vowels.

for filename in *																#Traverse all files in the current directory
do        
       current_name=$filename													#Get the filename
       voweless_name=`echo $current_name | sed -e "s/[aeiou]//g"`   			#Substitute vowel for nothing using sed
	   new_name_no_ext=`echo $voweless_name | sed -e "s/\.[^\.]+$//"`
	   echo "$new_name_no_ext"
	   #if [ `echo $voweless_name | sed -e "s/\.([^\.]+)$//"` != "" -a "$current_name" != "$voweless_name"  ]	#Rename iff the new name is non-empty and is not identical to the current name
	   if [ "$new_name_no_ext" != "" -a "$current_name" != "$voweless_name"  ]	#Rename iff the new name is non-empty and is not identical to the current name
	   then
		mv "$current_name" "$voweless_name"                   					#Do the actual renaming
	   fi
done

Any idea as to what is wrong?

Thanks in advance.
 
sunmaz94: Instead of a plus sign, did you intend to use an asterisk (*)?
 
nvn said:
sunmaz94: Instead of a plus sign, did you intend to use an asterisk (*)?

Thanks for your response.

No I believe I want a "+", as the regex quantifier for "one or more". That behavior is desirable for extension removal. Any other idea as to why this isn't working?
 
sunmaz94: Can you echo $current_name, immediately after the first line in the do-block, and show us what it outputs?
 
nvn said:
sunmaz94: Can you echo $current_name, immediately after the first line in the do-block, and show us what it outputs?

It just outputs the name of the file being traversed (ex. apple.txt, hello.txt, a.txt, and b.txt ...).

Likewise, the vowel-less variable outputs the name of the file being traversed without vowels (ex. ppl.txt, hll.txt, .txt, and b.txt ...).
 
sunmaz94: Hint 1: Put a backslash before your plus sign, and see if that helps. Or, you can change your plus sign to an asterisk, if you wish.

By the way, remove your second backslash. It is incorrect, and is matching backslash. (Or did you intend to match backslash, in this design?)
 
nvn said:
sunmaz94: Hint 1: Put a backslash before your plus sign, and see if that helps. Or, you can change your plus sign to an asterisk, if you wish.

By the way, remove your second backslash. It is incorrect, and is matching backslash. (Or did you intend to match backslash, in this design?)

Thanks! Both of those were my mistakes.
 
I believe your original expression works if you use sed -r -e ... The "+" is part of the "extended" syntax, and you need the -r switch to use it.
 
I'm seeing sed living up to its reputation! :frown:

You can omit the -e almost universally here, I believe, so I have.

$ echo appl.txt |sed -r "s/\.[^\.]+$//"
appl
$ echo appl.txt |sed -r "s/\.[^\.]\+$//"
appl.txt
$ echo appl.txt |sed "s/\.[^\.]\+$//"
appl.txt
$ echo appl.txt |sed "s/\.[^\.]+$//"
appl.txt

So -r was definitely needed above, and it didn't require \+

But on another system, I find:

# echo appl.txt |sed "s/\.[^\.]+$//"
appl.txt
# echo appl.txt |sed "s/\.[^\.]\+$//"
appl
# echo appl.txt |sed -r "s/\.[^\.]+$//"
appl
# echo appl.txt |sed -r "s/\.[^\.]\+$//"
appl.txt

Here it works using \+, without the need for -r.

Such inconsistency! :frown:
 
  • #10
And none of them work on my computer (OSX 10.6) with the shell that I prefer to use (tcsh). I need to change the double quotes to single quotes, and change the -r option to -E.

This is the 21st century, not 1990. The arguments for using sed in lieu of more powerful and better suited tools such as perl or python are getting a bit weak. The only systems where you won't find perl, python, or both available from boot time will also have a very archaic and probably non-standard sed.
 
  • #11
NascentOxygen said:
You can omit the -e almost universally here.
Good catch. I missed that. I would say universally, unless you know of any exception. The -e is redundant, and unneeded, here. NascentOxygen, your first system seems weird. :biggrin: Your second system seems normal.
Dodo said:
Your original expression works if you use sed -r.
Good point. Thanks for mentioning that.
D H said:
I need to change the double quotes to single quotes, and change the -r option to -E.
A logical name change, or so it might seem, since the -r option invokes ERE (extended regular expressions). However, destroying portability for something that makes no difference is highly illogical.
 
  • #12
sunmaz94 said:
Code:
new_name_no_ext=`echo $voweless_name | sed -e "s/\.[^\.]+$//"`
If you are using bash or ksh, consider using the shell's much more efficient construct:

Code:
new_name_no_ext=`${voweless_name%.?*}`

You seemed to be insistent on the + for sed, so I think you'll need the ? here for the shell. (Try it with, and without.)
 
  • #13
nvn said:
A logical name change, or so it might seem, since the -r option invokes ERE (extended regular expressions). However, destroying portability for something that makes no difference is highly illogical.
Portability? What portability are you talking about? That -r option to sed is Linux-specific. The -E option on my computer is OSX-specific.

If you want to use sed portably you will only use the options -e, -f, and -n, and won't use extended regular expressions. Every other option, and the use of extended regular expressions, is non-standard.

True portability amongst UNIX systems such as OSX, AIX, and HP-UX and UNIX-like systems such as Linux is achieved by going to the lowest common denominator.
 
  • #14
D H: Thanks for the helpful information. I almost never use ERE. A BRE (basic regular expression) works just as well here.
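
For reference, here is a minimal sketch of how the extension could be stripped with a basic regular expression only, so that neither -r nor -E is needed. It assumes a POSIX-style sed; strip_ext_bre is a hypothetical helper name, not something from the thread. "One or more non-dot characters" is spelled [^.][^.]* instead of using a plus sign, and the expression is in single quotes so the shell leaves it alone.

```shell
# Hypothetical helper (not from the thread): strip the last extension using
# only basic regular expressions, so no -r or -E switch is required.
strip_ext_bre() {
    # [^.][^.]* is the BRE spelling of "one or more non-dot characters".
    # No backslash inside the brackets: there it would be a literal backslash.
    printf '%s\n' "$1" | sed 's/\.[^.][^.]*$//'
}

strip_ext_bre appl.txt         # prints: appl
strip_ext_bre archive.tar.gz   # prints: archive.tar
strip_ext_bre README           # prints: README (no extension to strip)
```

A name consisting only of an extension, such as .txt, comes back as an empty string, which is the case the script has to detect.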
 
  • #15
NascentOxygen said:
If you are using bash or ksh, consider using the shell's much more efficient construct:

Code:
new_name_no_ext=`${voweless_name%.?*}`

You seemed to be insistent on the + for sed, so I think you'll need the ? here for the shell. (Try it with, and without.)

There should not be backquotes ( ` ... ` ) around the parameter expansion; delete them. With the backquotes, the shell would try to run the expanded name as a command.
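
A rough sketch of how the corrected expansion might fit into the rename test, assuming the same variable names as the original script. vowelless_target is a hypothetical helper name, and the sketch only prints the new name instead of calling mv, so it can be tried safely.

```shell
# Sketch only: report the name a file would be renamed to, without calling mv.
# vowelless_target is a hypothetical helper name, not from the thread.
vowelless_target() (
    current_name=$1
    # Same vowel removal as in the original script.
    voweless_name=$(printf '%s\n' "$current_name" | sed 's/[aeiou]//g')
    # Plain parameter expansion, no backquotes: drop the shortest trailing
    # suffix made of a dot followed by at least one character.
    new_name_no_ext=${voweless_name%.?*}
    # Print only if something is left before the extension and the name changed.
    if [ -n "$new_name_no_ext" ] && [ "$current_name" != "$voweless_name" ]; then
        printf '%s\n' "$voweless_name"
    fi
)

vowelless_target apple.txt   # prints: ppl.txt
vowelless_target a.txt       # prints nothing: only ".txt" would be left
vowelless_target b.txt       # prints nothing: there are no vowels to remove
```

The ? matters for names that end in a dot: with f=name. the expansion ${f%.*} gives name, while ${f%.?*} leaves name. unchanged, which is closer to the "one or more" behavior asked for in the sed version.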
 