How can I use AWK to batch rename files in Unix?

In summary: Awk is the simpler tool for this kind of line-by-line renaming task, while some participants prefer Perl because it copes better when a task grows more complex.
  • #1
Ygggdrasil
TL;DR Summary
I'd like help writing a bash script to rename a list of files.
I'd like to rename a bunch of files in a directory based on data from a tab-delimited text file. I know how to do this in R:
Code:
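dir <- "~/user/folder/"
# Tab-delimited table with a header row and columns Lane, Sample, ID
fileNames <- read.table(paste0(dir, "fileNames.txt"), sep = "\t", header = TRUE)
for (i in seq_len(nrow(fileNames))) {
  # Original name: <Sample>_S<row number>_L<lane, zero-padded to 3 digits>_R1_001.fastq.gz
  oriFile <- paste0(dir, fileNames[i, "Sample"], "_S", i, sprintf("_L%03d", fileNames[i, "Lane"]), "_R1_001.fastq.gz")
  newFile <- paste0(dir, fileNames[i, "ID"], ".fastq.gz")
  # Hand the rename to the shell's mv command
  system(paste("mv", oriFile, newFile))
}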

Where the fileNames.txt file looks something like:
Code:
Lane	Sample	ID
1	xxx-xx-32S-pl1-J01	WT1.IN
1	xxx-xx-32S-pl1-J02	WT2.IN
1	xxx-xx-32S-pl1-J03	WT3.IN

In an effort to improve my knowledge of unix shell scripting, I'd like to know how one would approach writing a bash script that does this.
 
  • #2
Ygggdrasil said:
Where the fileNames.txt file looks something like

Is the intent that the second column is the old filename and the third column is the new filename? And the first column doesn't get used?
 
  • #3
PeterDonis said:
Is the intent that the second column is the old filename and the third column is the new filename? And the first column doesn't get used?

No, the string in the second column needs additional information appended to it.

For example, the first file to be renamed is xxx-xx-32S-pl1-J01_S1_L001_R1_001.fastq.gz, where the S# is the row number of the item in the table and the L### is the integer in the first column of the file, zero-padded to three digits.
 
  • #4
This is better done with an awk script or a python script rather than a pure bash script.

Because you are doing relatively simple line-by-line actions, awk seems to be the best option here, although Python wouldn’t be much more difficult.
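
A minimal sketch of the awk approach, assuming the table layout from post #1 (a header row, then tab-delimited Lane, Sample and ID columns) and the file naming described in post #3. The function name make_rename_cmds is hypothetical. It only prints mv commands so they can be checked before anything is renamed.

```shell
# Hypothetical helper: print one mv command per data row of the table.
# Assumes tab-delimited columns Lane, Sample, ID with a header line,
# and names that contain no whitespace or shell metacharacters.
make_rename_cmds() {
    awk -F '\t' 'NR > 1 {
        # NR - 1 is the data row number, which becomes the S# field
        printf "mv -- %s_S%d_L%03d_R1_001.fastq.gz %s.fastq.gz\n", $2, NR - 1, $1, $3
    }' "$1"
}

# Review the commands first:
#   make_rename_cmds fileNames.txt
# Then run them from the directory holding the files:
#   make_rename_cmds fileNames.txt | sh
```

Printing the commands instead of running them directly keeps the awk part purely about text, and the final pipe to sh is the only step that touches the files.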
 
  • #5
jedishrfu said:
This is better done with an awk script or a python script rather than a pure bash script.
Ok, then using an R script is probably the correct approach after all. Thanks!
 
  • #6
Doing filename edits in bash can be rather painful.
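
For comparison, here is a rough sketch of what a shell-only version might look like, under the same assumptions about the table (header row, tab-delimited Lane, Sample, ID) and file naming as in post #1. The function name rename_from_table is made up for illustration.

```shell
# Hypothetical function: rename files in directory $1 using table $2.
# Assumes a header line and tab-delimited columns Lane, Sample, ID,
# with Lane written as a plain integer without leading zeros.
rename_from_table() {
    dir=$1
    table=$2
    row=0
    # tail -n +2 skips the header line; the || test keeps a final line
    # that lacks a trailing newline from being dropped
    tail -n +2 "$table" |
    while IFS="$(printf '\t')" read -r lane sample id || [ -n "$lane" ]; do
        row=$((row + 1))
        old=$(printf '%s_S%d_L%03d_R1_001.fastq.gz' "$sample" "$row" "$lane")
        new="$id.fastq.gz"
        if [ -e "$dir/$old" ]; then
            mv -- "$dir/$old" "$dir/$new"
        else
            echo "missing: $dir/$old" >&2
        fi
    done
}
```

It would be called as rename_from_table ~/user/folder ~/user/folder/fileNames.txt. Most of the lines deal with reading and quoting rather than with the renaming itself, which is roughly the pain being referred to.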
 
  • #7
Perl was developed for these types of tasks. It is a superior scripting tool and is universally available on Unix machines.
 
  • #8
But awk is better. Just sayin...

As an aside, discovering awk helped rejuvenate my career as a programmer. At the time, in the 1980s, I wanted to learn Unix but was stuck with DOS. However, there were some DOS software packages that transformed DOS, and the one I selected also had an awk compiler.

I was amazed at how easy it became to write text processing programs with regular expressions and awk’s processing stanzas. I rewrote many of my C programs, enhanced them and went crazy adding features. It was a wild time and I was having a blast. There was just something about the language and the accompanying book that inspired me, and even today I will turn to awk when I have to do some quick one-off project.

My most recent program converted Matlab to Julia for a work project that never took off. As always the awk coding was a challenge and a lot more fun.
 
  • #9
jedishrfu said:
But awk is better. Just sayin...
Awk may be perfectly fine (and simpler) for this task. If a task has multiple steps that become alternating sed and awk steps, then a scripting language like Perl is far superior.
 
  • #10
Why would you use awk and sed when awk can do both?

I know it’s common to write simple awk text expressions but awk can do so much more whereas sed is somewhat limited.
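
To illustrate the point, here is a small sketch of a sed-style substitution next to awk equivalents. The function names are hypothetical, and the third variant assumes tab-delimited input.

```shell
# sed-style substitution: replace every "-" with "_" on each line
dash_to_underscore_sed() { sed 's/-/_/g'; }

# The same substitution in awk: gsub edits $0 in place, and the
# trailing 1 is an always-true pattern that prints the line
dash_to_underscore_awk() { awk '{ gsub(/-/, "_") } 1'; }

# awk can also limit the substitution to one field (here the second
# tab-delimited column), which takes more effort in sed
dash_to_underscore_field2() {
    awk 'BEGIN { FS = OFS = "\t" } { gsub(/-/, "_", $2) } 1'
}
```

For example, printf 'a-b-c\n' | dash_to_underscore_awk prints a_b_c, the same result as the sed version.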

Anyway, we shouldn’t sidetrack the thread any further. It was my fault for starting this.
 
  • #11
FactChecker said:
Perl was developed for these types of tasks. It is a superior scripting tool and is universally available on Unix machines.

I second that. I found that small functions and utilities that could have been done with awk or similar "narrow purpose" tools quickly grew, for me, to require more general coding (like access to dictionaries when parsing and matching up data), and Perl just has all the needed bits in one package.
 
  • #12
So many languages are developed when a programmer gets frustrated with the limitations of an existing one. For text processing you can start with SNOBOL, then TeX... then AWK, then Perl, then Python, then Ruby, then Groovy, then Kotlin, and the list will continue until we have some major paradigm shift in programming that changes forever the way we do things.

I just liked AWK for its uniqueness, clean design, novelty and usefulness. Other languages addressed its limitations later on by extending regular expression parsing, or by giving up on its filtering template, but for me that detracted from its elegance.

There was one issue that was a pain though in that the early implementations on Unix would throw an error and it was devilishly hard to find what happened and where. However, my DOS version being a compiler fixed that issue and made it much easier to code with.

At one time, I tried designing an AWK++, built on AWK, to use in place of C++. I found C++ frustrating to program in, especially with the non-intuitive STL inclusion, but wanted a means to explore OO design principles more simply. It was a fool's errand, as you really need to follow compiler parsing rules with a language grammar to do it right.
 

1. How do I rename multiple files at once in Unix using batch commands?

To rename multiple files at once in Unix, combine the mv command with a shell loop. The shell expands wildcards before mv runs, so mv *.txt *.doc does not work: mv simply receives a list of file names and treats the last one as the destination, which fails unless that last name happens to be a directory. To change every .txt extension to .doc, loop over the files instead: for f in *.txt; do mv -- "$f" "${f%.txt}.doc"; done.
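
A minimal sketch of an extension change written as a reusable function. The name change_ext is hypothetical, and the sketch assumes a simple extension with no dots or wildcard characters in it.

```shell
# Hypothetical helper: change the extension of every matching file in
# the current directory, e.g. change_ext txt doc
change_ext() {
    old_ext=$1
    new_ext=$2
    for f in *."$old_ext"; do
        # If nothing matches, the pattern stays literal; skip it
        [ -e "$f" ] || continue
        # ${f%.$old_ext} strips the old extension from the end of the name
        mv -- "$f" "${f%.$old_ext}.$new_ext"
    done
}
```

Quoting "$f" keeps names with spaces intact, and the -- stops mv from reading a name that starts with a dash as an option.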

2. Can I use regular expressions in batch file renaming in Unix?

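Yes, but it depends on which rename command is installed. The Perl-based rename (as packaged on Debian and Ubuntu, sometimes under the name prename or file-rename) takes a Perl expression, so rename 's/ /_/g' * replaces all spaces in file names with underscores. The rename from util-linux, found on many other distributions, uses a different syntax based on plain string replacement rather than regular expressions, so check man rename before relying on either form.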
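
Where the installed rename does not accept expressions, a loop around sed is one alternative. The following is only a sketch: rename_with_sed is a hypothetical name, and it assumes file names without embedded newlines.

```shell
# Hypothetical helper: apply a sed expression to each file name given
# after it, e.g. rename_with_sed 's/ /_/g' *
rename_with_sed() {
    sed_expr=$1
    shift
    for f in "$@"; do
        new=$(printf '%s\n' "$f" | sed "$sed_expr")
        # Skip unchanged names and never overwrite an existing file
        if [ "$new" != "$f" ] && [ ! -e "$new" ]; then
            mv -- "$f" "$new"
        fi
    done
}
```

Note that sed uses basic regular expressions by default, so an expression written for the Perl-based rename may need adjusting.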

3. How can I preview the changes before actually renaming the files?

The -n flag of mv does not preview anything; it only prevents mv from overwriting existing files. To preview a batch rename, either pass -n to the Perl-based rename, which prints the renames it would perform without making them, or put echo in front of mv inside your loop so that each command is printed instead of run. Once the output looks right, remove the -n or the echo and run the command again to actually rename the files.
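
One way to build a preview into your own loops is a small wrapper around mv. This is a sketch only; safe_mv and the DRY_RUN variable are names invented for the example.

```shell
# Hypothetical helper: with DRY_RUN=1, print each rename instead of
# performing it; otherwise call mv
safe_mv() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf 'would rename: %s -> %s\n' "$1" "$2"
    else
        mv -- "$1" "$2"
    fi
}

# Preview, then rename for real:
#   DRY_RUN=1; for f in *.txt; do safe_mv "$f" "${f%.txt}.doc"; done
#   DRY_RUN=0; for f in *.txt; do safe_mv "$f" "${f%.txt}.doc"; done
```

Because the loop itself is identical in both runs, what gets previewed is exactly what later gets executed.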

4. Is it possible to batch rename files with a specific prefix or suffix?

Yes, it is possible to batch rename files with a specific prefix or suffix, although the common rename implementations do not offer -p or -s options for this. With the Perl-based rename, a substitution does the job: rename 's/^/prefix_/' * adds the prefix "prefix_" to all file names, and rename 's/$/_suffix/' * appends a suffix. In plain shell, for f in *; do mv -- "$f" "prefix_$f"; done adds the same prefix.
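
A short sketch of both operations as shell functions. The names add_prefix and add_suffix are hypothetical, and the sketch assumes it is given plain file names in the current directory (no directory components, no hidden files).

```shell
# Hypothetical helper: add_prefix pre_ a.txt  ->  pre_a.txt
add_prefix() {
    prefix=$1
    shift
    for f in "$@"; do
        mv -- "$f" "$prefix$f"
    done
}

# Hypothetical helper: insert the suffix before the last extension,
# e.g. add_suffix _old a.txt  ->  a_old.txt
add_suffix() {
    suffix=$1
    shift
    for f in "$@"; do
        case $f in
            # ${f%.*} is the name without its last extension,
            # ${f##*.} is that extension
            *.*) mv -- "$f" "${f%.*}$suffix.${f##*.}" ;;
            *)   mv -- "$f" "$f$suffix" ;;
        esac
    done
}
```

Placing the suffix before the extension is a design choice made here so that the file type stays recognizable; appending after the extension would only need the second case.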

5. Can I use batch file renaming in Unix to change file extensions?

Yes, you can use batch file renaming in Unix to change file extensions, but not with mv and wildcards alone: mv *.txt *.doc fails because the shell expands both patterns before mv runs. Use a loop such as for f in *.txt; do mv -- "$f" "${f%.txt}.doc"; done, or, with the Perl-based rename, rename 's/\.txt$/.doc/' *.txt. Either one changes all files with the .txt extension to have the .doc extension.
