Batch file renaming in Unix

Ygggdrasil

Summary: I'd like help writing a bash script to rename a list of files.
I'd like to rename a bunch of files in a directory based on data from a tab-delimited text file. I know how to do this in R:
Code:
# Directory containing the files and fileNames.txt
dir <- "~/user/folder/"
# Tab-delimited table with a header row and columns Lane, Sample, ID
fileNames <- read.table(paste0(dir, "fileNames.txt"), sep = "\t", header = TRUE)
for (i in 1:nrow(fileNames)) {
  # Original name: <Sample>_S<row number>_L<zero-padded Lane>_R1_001.fastq.gz
  oriFile <- paste0(dir, fileNames[i, "Sample"], "_S", i, sprintf("_L%03d", fileNames[i, "Lane"]), "_R1_001.fastq.gz")
  # New name: <ID>.fastq.gz
  newFile <- paste0(dir, fileNames[i, "ID"], ".fastq.gz")
  system(paste("mv", oriFile, newFile))
}
Where the fileNames.txt file looks something like:
Code:
Lane	Sample	ID
1	xxx-xx-32S-pl1-J01	WT1.IN
1	xxx-xx-32S-pl1-J02	WT2.IN
1	xxx-xx-32S-pl1-J03	WT3.IN
In an effort to improve my knowledge of unix shell scripting, I'd like to know how one would approach writing a bash script that does this.
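For concreteness, here is the sort of untested sketch I have in mind (assuming the same fileNames.txt layout and directory as above), though I'm not sure it's the idiomatic way to handle the tab-delimited input:
Code:
#!/bin/bash
# Untested sketch -- assumes fileNames.txt is tab-delimited with a header row
# and columns Lane, Sample, ID, and that everything lives in ~/user/folder/.
dir="$HOME/user/folder"

i=0
# Skip the header line, then read the three tab-separated fields of each row.
tail -n +2 "$dir/fileNames.txt" | while IFS=$'\t' read -r lane sample id; do
    i=$((i + 1))
    oriFile=$(printf '%s/%s_S%d_L%03d_R1_001.fastq.gz' "$dir" "$sample" "$i" "$lane")
    newFile="$dir/$id.fastq.gz"
    mv "$oriFile" "$newFile"
done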
 
Where the fileNames.txt file looks something like
Is the intent that the second column is the old filename and the third column is the new filename? And the first column doesn't get used?
 

Ygggdrasil

Is the intent that the second column is the old filename and the third column is the new filename? And the first column doesn't get used?
No, the string in the second column needs additional information appended to it.

For example, the first file to be renamed is xxx-xx-32S-pl1-J01_S1_L001_R1_001.fastq.gz, where the S# is the row number of the item in the table and the L### is the integer in the first column of the file, padded with leading zeros to three digits.
 
This is better done with an awk script or a python script rather than a pure bash script.

Because you are doing relatively simple line-by-line actions, awk seems to be the best option here, although Python wouldn't be much more difficult.
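For example, something along these lines (untested, assuming the same tab-delimited layout with a header row, and run from the directory containing the files) would generate the mv commands so you can inspect them before piping them to the shell:
Code:
# Untested sketch: print one mv command per data row so it can be checked first.
awk -F'\t' 'NR > 1 {
    printf "mv \"%s_S%d_L%03d_R1_001.fastq.gz\" \"%s.fastq.gz\"\n", $2, NR-1, $1, $3
}' fileNames.txt

# Once the printed commands look correct, append "| sh" to actually run them.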
 

Ygggdrasil

This is better done with an awk script or a python script rather than a pure bash script.
Ok, then using an R script is probably the correct approach after all. Thanks!
 

FactChecker

Perl was developed for these types of tasks. It is a superior scripting tool and is universally available on Unix machines.
 
But awk is better. Just sayin...

As an aside, discovering awk helped rejuvenate my career as a programmer. At the time, in the 1980s, I wanted to learn Unix but was stuck with DOS. However, there were some software packages that transformed DOS, and the one I selected also had an awk compiler.

I was amazed at how easy it became to write text-processing programs with regular expressions and awk's processing stanzas. I rewrote many of my C programs, enhanced them, and went crazy adding features. It was a wild time and I was having a blast. There was just something about the language and the accompanying book that inspired me; even today I will turn to it when I have some quick one-off project.

My most recent program converted MATLAB code to Julia for a work project that never took off. As always, the awk coding was a challenge and a lot more fun.
 

FactChecker

But awk is better. Just sayin...
Awk may be perfectly fine (and simpler) for this task. If a task has multiple steps that become alternating sed and awk steps, then a scripting language like Perl is far superior.
 
Why would you use awk and sed when awk can do both?

I know it’s common to write simple awk text expressions but awk can do so much more whereas sed is somewhat limited.
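For instance (foo, bar, and input.txt are just placeholders here), a substitution that might otherwise be a separate sed stage can be folded straight into the awk program with gsub:
Code:
# Instead of:  sed 's/foo/bar/g' input.txt | awk -F'\t' '{ print $3 }'
# the substitution can be done inside awk itself:
awk -F'\t' '{ gsub(/foo/, "bar"); print $3 }' input.txt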

Anyway, we shouldn't sidetrack the thread any further. It was my fault for starting this.
 

Filip Larsen

Perl was developed for these types of tasks. It is a superior scripting tool and is universally available on Unix machines.
I second that. I found that small functions and utilities that could have been done with awk or similar "narrow purpose" tools quickly grew, for me, to require more general coding (like access to dictionaries when parsing and matching up data), and Perl just has all the needed bits in one package.
 
So many languages are developed when a programmer gets frustrated with an existing language's limitations. For text processing you can start with SNOBOL, then TeX, then AWK, then Perl, then Python, then Ruby, then Groovy, then Kotlin, and the list will continue until we have some major paradigm shift in programming that changes forever the way we do things.

I just liked AWK for its uniqueness, clean design, novelty and usefulness. Other languages later addressed its limitations by extending regular expression parsing or by giving up on its filtering template, but for me that detracted from its elegance.

There was one issue that was a pain, though: the early implementations on Unix would throw an error and it was devilishly hard to find what had happened and where. However, my DOS version, being a compiler, fixed that issue and made it much easier to code with.

At one time, I tried designing an AWK++ to use in place of C++, written in AWK, as I found C++ frustrating to program in (especially with the non-intuitive STL) but wanted a means to explore OO design principles more simply. It was a fool's errand, as you really need to follow compiler parsing rules with a language grammar to do it right.
 
