Creating a Subset of Data with Awk Command in Unix

  • Thread starter RJLiberator
  • Tags
    Unix
In summary, to create a subset of a data file with awk, you can combine conditions with the && operator so that a record must satisfy all of them. To select a range of values in a specific column, use a pattern such as '$1>=value1&&$1<=value2{print}'. This prints every record whose first column lies between value1 and value2, inclusive; redirecting the output with > writes those records to a new file.
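A minimal sketch of that range filter, assuming a whitespace-separated file whose first column is the longitude. The file name sample.xy, the sample values, and the variable names lo and hi are made up for illustration and do not come from the thread.

```shell
# Build a small hypothetical data file: longitude latitude
printf '%s\n' '-115.2 40.1' '-112 39.5' '-107.75 38.0' '-102 37.2' '-99.9 36.4' > sample.xy

# Pass the bounds in with -v so the pattern itself stays generic.
# A record is printed only when BOTH comparisons are true.
awk -v lo=-112 -v hi=-102 '$1 >= lo && $1 <= hi { print }' sample.xy > subset.gmt

cat subset.gmt
```

Both bounds are inclusive here, so the records at exactly -112 and -102 are kept; use > and < instead if the endpoints should be excluded.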
  • #1
RJLiberator

Homework Statement


Use awk to create a subset of the original data file called subset.gmt that only contains records with longitudes between -112 and -102. Hint: you can use the operator && to tell awk to look for more than one pattern at once.

Homework Equations

The Attempt at a Solution



What I need is an awk command that reads the original data file, looks at column 1 (longitude), and keeps only the records with values between -112 and -102.

So, I understand we start the command as such:

awk 'NR>1{print $1...}' originalfile.txt >! newoutput.ps

The column that I want to pull from is column 1 of the file.
But how do I write the command so that it pulls only the records between -112 and -102?
Is it an if statement? The professor mentions the Boolean operator &&, but I've searched the internet with no luck...

Thank you.
 
  • #2
Presumably, the values in column 1 will not be all neat integers?

I think your first move should be to invent your own sample data file containing a representative selection of records, including a couple where the relevant field is exactly -112. Then focus on devising the code that selects those records where the desired field equals that value, -112.

Once you have this working, you can build on it to select for a range of values, and not necessarily integers. Looking at short sample awk scripts in your class notes or online is the best way to learn how this can be achieved.
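One way that two-step approach might look, as a rough sketch. The file name test_data.txt, its header line, and its values are invented here purely for testing; the real data file may be laid out differently.

```shell
# Hypothetical test data: longitude latitude depth (first line is a header)
cat > test_data.txt <<'EOF'
lon lat depth
-120.5 35.0 10
-112 36.0 12
-111.999 36.5 9
-102.0 37.0 15
-101.5 38.0 20
EOF

# Step 1: select only the records whose first field equals -112.
# NR > 1 skips the header line.
awk 'NR > 1 && $1 == -112' test_data.txt

# Step 2: widen the test to a range; non-integer values work the same way.
awk 'NR > 1 && $1 >= -112 && $1 <= -102' test_data.txt
```

Step 1 should pick out the single -112 record, and step 2 should also keep -111.999 and -102.0 while dropping the two records outside the range.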

You have an awk utility at home that you can keep testing your script on? Study manuals all you like, but nothing is as valuable to learning as the immediacy of being able to "guess & test".
 
  • #3
Yes, I have a virtual machine which allows me to use linux to study unix and the awk commands here.

I can't confirm if they are neat integers or not, they may be, I will be able to confirm that tomorrow. :)

I will check through some sample class code, but after doing so earlier, I didn't have much luck in finding a way to isolate part of the column results in such a manner.
What could the && operator be used for?
 
  • #4
Google will find plenty of resources, e.g., search on "unix awk tutorial".
 
  • #5
A friend of mine suggested something along the lines of:

awk 'NR>1{print $1 -112&&-102}' originalfile.txt >! newoutput.ps

but I'm not sure how to make it 'search' for the values in between -112 and -102.

Perhaps:

awk -F'[:}]' '$(NF-1) >= -112 && $(NF-1) <= -102' file > output.txt
 
  • #6
Boom.
Got it.
Sample code:
Code:
awk '$1>=-112&&$1<=-102{print}' EX1.csh > subset11.gmt

What this does is check column 1 of each record and print every record whose value is between -112 and -102.
 
  • #7
RJLiberator said:
Sample code:
Code:
awk '$1>=-112&&$1<=-102{print}' EX1.csh > subset11.gmt

What this does is check column 1 of each record and print every record whose value is between -112 and -102.
Looks good. You will find that you can even omit the {print} command because printing the whole record is the default action for AWK.
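
A quick way to check that claim, sketched with a made-up input file (demo.txt and its values are hypothetical; in the thread the real input was EX1.csh):

```shell
# Hypothetical input: longitude and a record id
printf '%s\n' '-113 1' '-110 2' '-105.5 3' '-101 4' > demo.txt

# Same pattern, once with an explicit {print} and once with no action at all.
awk '$1>=-112&&$1<=-102{print}' demo.txt > with_print.gmt
awk '$1>=-112&&$1<=-102' demo.txt > without_print.gmt

# cmp exits with status 0 when the two files are byte-for-byte identical.
cmp with_print.gmt without_print.gmt && echo "identical output"
```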
 

1. What is the purpose of the Awk command in Unix?

The Awk command is a powerful text-processing tool in Unix that allows users to manipulate and extract data from text files. It is primarily used for searching and filtering data, performing calculations, and generating reports.

2. How do I use the Awk command in Unix?

To use the Awk command, you need to specify a pattern to match, an action to perform on the matched pattern, and the input file(s) to process. The basic syntax is "awk 'pattern {action}' input_file". You can also use command-line options and built-in functions to customize your Awk command.
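
As an illustration of that syntax, here is a small sketch that uses one command-line option (-F, which sets the field separator) and one built-in function (toupper). The file scores.csv and its contents are invented for the example.

```shell
# Hypothetical comma-separated input: name,score
printf '%s\n' 'ann,91' 'bob,58' 'cy,77' > scores.csv

# Pattern: $2 >= 60 selects records whose second field is at least 60.
# Action: print the name in upper case followed by the score.
awk -F',' '$2 >= 60 { print toupper($1), $2 }' scores.csv
```

Only the records that satisfy the pattern reach the action, so the line for bob is skipped.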

3. Can I combine multiple Awk commands in a single line?

Yes. Within a single action you can put several statements on one line by separating them with semicolons, and a single Awk program can contain several pattern-action rules. This lets you perform multiple operations on the same input in one pass instead of running Awk several times.
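
A short sketch of several operations in one Awk program, assuming a made-up file fruit.txt with a count in the first column:

```shell
# Hypothetical input: count and item name
printf '%s\n' '3 apples' '5 pears' '2 plums' > fruit.txt

# The first rule has two statements separated by a semicolon.
# The END rule runs once, after the last record has been read.
awk '{ total += $1; print NR ": " $2 } END { print "total", total }' fruit.txt
```

The running total and the per-line output are both produced in a single pass over the file.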

4. How do I save the output of an Awk command to a file?

You can save the output of an Awk command to a file by using the redirection operator ">", followed by the name of the output file. For example, "awk '{print $1}' input_file > output_file" will save the first column of data from the input file to the output file.
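
For instance, with hypothetical file names input.txt and col1.txt, the difference between overwriting and appending might be sketched like this:

```shell
# Hypothetical two-column input
printf '%s\n' '-110 40' '-95 41' > input.txt

# ">" creates the output file, or replaces its contents if it already exists.
awk '{ print $1 }' input.txt > col1.txt

# ">>" appends to the file instead of replacing it.
awk '{ print $2 }' input.txt >> col1.txt

cat col1.txt
```

Redirection is handled by the shell rather than by Awk, so the exact operators available (such as the >! form used earlier in the thread, which is csh syntax) depend on which shell you are running.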

5. Are there any alternatives to the Awk command in Unix?

Yes, there are several alternatives to the Awk command in Unix, such as sed, Perl, and Python. Each of these tools has its own strengths and can handle many of the same tasks as Awk. Which one to use depends on the specific task you are trying to accomplish.
