Grep strings from a .txt file (linux/mac)

  • Thread starter dRic2
  • #1
dRic2
Gold Member
818
198
I have a DATA.txt file which contains lots of useless info. I get the information I want by typing in my terminal
Bash:
grep freq DATA.txt
and the output is the following:

freq ( 1) = -0.193719 [THz] = -6.461768 [cm-1]
freq ( 2) = -0.193719 [THz] = -6.461768 [cm-1]
freq ( 3) = -0.193719 [THz] = -6.461768 [cm-1]
freq ( 4) = 5.968261 [THz] = 199.079745 [cm-1]
freq ( 5) = 5.968261 [THz] = 199.079745 [cm-1]
freq ( 6) = 5.968261 [THz] = 199.079745 [cm-1]

I would like to create a bash script which extracts only the numbers of the last column and stores them in a new .txt file as follows:

1 -6.461768
2 -6.461768
3 -6.461768
4 199.079745
5 199.079745
6 199.079745

I thought of something along the lines of

Bash:
FILEIN=DATA.txt
FILEOUT=OUT.txt

while ...
freq = `grep freq $FILEIN | ... something ...`

echo ... ${freq} >> $FILEOUT
done

I don't really know how to program in bash... I just copy and modify scripts that were handed to me on similar occasions. But I don't know how to implement this one.

Thanks in advance!
Ric
 

Answers and Replies

  • #3
FactChecker
Science Advisor
Gold Member
6,208
2,405
If you have Perl installed (it is often installed by default), you can try this program.

Perl:
open(IN, "<DATA.txt");
open(OUT, ">new.txt");
while( $line = <IN> ){
    if( $line =~ /freq\s*\(\s*(\d+)\).*?(\S+) \[cm/ ){
        print OUT "$1 $2\n";
    }
}
close IN;
close OUT;
 
  • #4
dRic2
Thanks a lot! awk works like a charm and I got everything I needed done. I'll be looking into Perl as well since I don't know what that is.
 
  • #5
phyzguy
Science Advisor
4,789
1,744
I do all of these kinds of things in Python. It is much easier and more versatile than bash scripts or Perl.
 
  • #6
dRic2
I do all of these kinds of things in Python. It is much easier and more versatile than bash scripts or Perl.
1) I work remotely via an ssh connection on a machine and I'm not sure if Python is installed on it...
2) I don't know Python... sad but true. I will try to learn the basics, though.
 
  • #7
FactChecker
I disagree. Perl is significantly easier than Python for a scripting task like this. I have re-coded several Python scripts done by Python advocates who are excellent programmers and ended up with much simpler programs. For example, Perl makes it much easier to capture the STDOUT output of a called application in a string and parse it.
That being said, I would currently recommend that a person learn Python rather than Perl because of several other advantages it has.

Another thing I like about Perl is how easy and natural it is to include error checking and warnings. Here is the code above with some error checking:

Perl with some simple error checking and warnings:
open(IN, "<DATA.txt") or die "ERROR: Can't open DATA.txt to read";
open(OUT, ">new.txt") or die "ERROR: Can't open new.txt to write" ;
while( $line = <IN> ){
    $lineNum++;
    if( $line =~ /freq\s*\(\s*(\d+)\).*?(\S+) \[cm/ ){
        print OUT "$1 $2\n";
    }elsif( $line =~ /freq/ ){
        warn "Line $lineNum wasn't parsed correctly:\n$line\n";
    }
}
close IN;
close OUT;
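
For comparison, roughly the same checks can be sketched in bash with awk, which is what the original poster ended up using. This is only an illustration, not a translation of the Perl above: the function name `extract_freqs_checked` is made up, and the pattern assumes the line layout shown in post #1.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<index> <cm-1 value>" for each well-formed freq
# line, and warn on stderr about lines that mention "freq" but do not match.
# Assumed layout (from post #1): freq ( N) = x [THz] = y [cm-1]
extract_freqs_checked() {
    local infile=$1
    if [ ! -r "$infile" ]; then
        echo "ERROR: Can't open $infile to read" >&2
        return 1
    fi
    awk '
        /freq *\( *[0-9]+\) *=.*= *[-+0-9.]+ *\[cm-1\] *$/ {
            gsub(/[()]/, " ")        # make the index its own field
            print $2, $(NF - 1)      # index, then the value before "[cm-1]"
            next
        }
        /freq/ {
            # NR is the current line number, like $lineNum in the Perl version
            print "Line " NR " was not parsed correctly: " $0 > "/dev/stderr"
        }
    ' "$infile"
}

# Usage with the file names from the Perl example:
#   extract_freqs_checked DATA.txt > new.txt
```

Because the warnings go to stderr, redirecting stdout to new.txt still leaves them visible in the terminal.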
 
  • #8
phyzguy
1) I work remotely via an ssh connection on a machine and I'm not sure if Python is installed on it...
2) I don't know Python... sad but true. I will try to learn the basics, though.
Most Unix systems have Python installed by default. I just find Python more capable and easier to use than Perl or shell scripts, and once you've learned Python you can do many more things that you can't do with scripting languages. But these things are a matter of taste, so use whichever makes the job easier for you. I think the most important thing is not how simple or elegant the program is. The important thing is how long it takes you to get the program coded up and working.
 
  • #9
FactChecker
Most Unix systems have Python installed by default. I just find Python more capable and easier to use than Perl or shell scripts, and once you've learned Python you can do many more things that you can't do with scripting languages. But these things are a matter of taste, so use whichever makes the job easier for you. I think the most important thing is not how simple or elegant the program is. The important thing is how long it takes you to get the program coded up and working.
IMHO, the great advantage now of Python over Perl (which I still think is a better OS scripting language) is the great interest and thriving community of users. There are so many scientific hobbies, like robotics, artificial intelligence, gaming, etc., where Python works within an entire environment for that hobby. In many (most?) of those hobbies, the Python code is a small fraction of the learning and most of the effort is in learning how to use the specific tools for the hobby (which is the fun part).
 
  • #10
12,747
6,621
A slight adjustment shows Awk to be very versatile too:
Bash:
awk '/freq/ { print $8 }' DATA.txt
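
Applied to the original question, a minimal sketch along the same lines might look like the following. It assumes every matching line has the layout shown in post #1 (index in parentheses, the cm-1 value as the second-to-last field); the function name `extract_freqs` is only illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<index> <frequency in cm-1>" for each freq line.
# Assumes the line layout from post #1: freq ( N) = x [THz] = y [cm-1]
extract_freqs() {
    awk '/freq/ {
        gsub(/[()]/, " ")      # replace the parentheses so the index is its own field
        print $2, $(NF - 1)    # mode index, then the value just before "[cm-1]"
    }' "$1"
}

# Usage with the file names from the question:
#   extract_freqs DATA.txt > OUT.txt
```

Replacing the parentheses first means the index is picked up whether it appears as "( 1)" or as "(100)" with no space.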
 
  • #11
FactChecker
A slight adjustment shows Awk to be very versatile too:
Bash:
awk '/freq/ { print $8 }' DATA.txt
People who do a lot of work with combinations of sed and awk are likely to go crazy. There is a good reason that Perl was developed and completely transformed that kind of work.
 
  • #12
12,747
6,621
I seldom used sed. While working on PC DOS, I happened upon the Thompson toolkit, which provided Unix commands for the PC world. In that collection was the Awk compiler, which could generate exe files. I was in a slump at the time doing C utilities for our group, and Awk transformed my productivity so much that I began writing a lot of tools with it exclusively.

After a while, though, there came a point where some new functionality didn’t really fit the Awk programming pattern, and I had to recode in another language or suffer through the added complexity of using counters and flags... to get what I wanted. This usually came up when more than one file of a different format was being processed by the Awk script.
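
To illustrate the kind of bookkeeping meant here, a common awk idiom for reading two files in one script is to compare `NR` (lines read overall) with `FNR` (lines read in the current file). The sketch below is hypothetical: the function name `join_labels` and both file layouts are invented for the example.

```shell
#!/usr/bin/env bash
# Hypothetical example: the first file holds "<index> <label>", the second
# holds "<index> <value>"; print "<label> <value>" for indices found in both.
join_labels() {
    awk '
        NR == FNR {            # true only while reading the first file
            label[$1] = $2     # remember the label for this index
            next
        }
        $1 in label {          # second file: only indices seen in the first
            print label[$1], $2
        }
    ' "$1" "$2"
}

# Usage (file names are placeholders):
#   join_labels labels.txt values.txt
```

Each additional input format tends to need another condition or flag of this kind, which is where such scripts start to get hard to follow.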

I tried Perl once and ran into some strange error and decided that awk suited me better. Also I don’t recall Perl having an implementation on PC DOS at the time. One thing I did like was the enhanced regular expressions in Perl that awk didn’t implement where you could identify words or classes of characters.

More recently, I’ve recoded some of my awk scripts in Python with good results and better scalability. But I still view Awk fondly and do one-offs with it when I can.
 
  • #13
FactChecker
My largest single use of awk and sed was in the conversion of a lot of FORTRAN code to Ada. I used a several-step sequence of alternating sed and awk steps to perform a lot of the routine conversions before I went in for the final conversion by hand. A while later, another task was to automate code generation to access data in memory locations in the languages of FORTRAN, C, and Ada. Starting from any of the three languages, the other two had to be generated. The data structures allowed were very flexible. By that time, I had learned Perl and was able to completely automate the process. I don't believe that I could have attempted it using sed and awk without ending up in an asylum.
 
  • #14
12,747
6,621
Program transcription is a painful art. I've done it a few times.

Once, years ago, I had a Timesharing Fortran program in a dialect different from the Honeywell Fortran-Y that we had installed on our mainframe. I noticed that I was not being consistent when converting similar lines of code by hand, and realized that an automated approach could reduce the errors introduced by misunderstanding what a line of code was doing. So I wrote a conversion tool in the Text Executive programming language to facilitate the conversion.

I left that job when it was half finished but going well. My successor decided instead to take what I had converted and go manual from there. The end result was it never got implemented and the project failed.

More recently, I wrote a small awk script to do a MATLAB to Julia conversion, which worked well, but I never got to the point of actually testing the converted code as there was little interest in our group in investing in Julia.

I also considered a Fortran to Julia conversion to break away from legacy Fortran code and make it more accessible as a Julia program. The one area that gave me trouble was Fortran's use of common blocks and Julia's lack of global variables. However there was some Julia structure that could approximate a global but I stopped at that point too for lack of management interest.
 
  • #15
dRic2
I think the most important thing is not how simple or elegant the program is. The important thing is how long it takes you to get the program coded up and working.
Yes, I agree. Here I'm working with software that interfaces well with simple bash scripts, and the online community seems to use only bash scripts. That's why I'm sticking to it... I don't have preferences since I'm a newbie... I'll look into Python, as suggested, because it seems a valuable programming tool to know, but I'll keep using bash scripts for this one.
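
For anyone who wants to stay entirely in bash, the loop outlined in post #1 could be filled in roughly as follows. This is a sketch under the assumption that the lines look exactly like the sample output; `freq_loop` is a made-up name, and the regular expression is a guess at the format based only on that sample.

```shell
#!/usr/bin/env bash
# Hypothetical pure-bash version of the loop outlined in post #1.
# Reads the input line by line and prints "<index> <cm-1 value>".
freq_loop() {
    local filein=$1 line
    # Assumed layout: freq ( N) = <THz value> [THz] = <cm-1 value> [cm-1]
    local re='freq[[:space:]]*\([[:space:]]*([0-9]+)\).*=[[:space:]]*([-+0-9.]+)[[:space:]]*\[cm-1\]'
    while IFS= read -r line; do
        if [[ $line =~ $re ]]; then
            # BASH_REMATCH[1] is the index, BASH_REMATCH[2] the cm-1 value
            echo "${BASH_REMATCH[1]} ${BASH_REMATCH[2]}"
        fi
    done < "$filein"
}

# Usage with the file names from post #1:
#   freq_loop DATA.txt > OUT.txt
```

Redirecting the function's output once with `>` avoids appending to OUT.txt on every iteration, so rerunning the script does not duplicate lines.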
 
  • #16
FactChecker
I think the most important thing is not how simple or elegant the program is. The important thing is how long it takes you to get the program coded up and working.
A simple program should normally be much easier to code and get working. KISS: Keep It Simple, Stupid. And it is hard to beat a language that is made especially for the specific type of application. IMHO, for most scripting, Perl has advantages that Python cannot match.
 