How to Import and Manipulate Data from a Text File in Python?

The discussion centers on extracting specific data from a text file containing numerical values arranged in rows and columns. The goal is to create a matrix from every p'th row, specifically taking the second and third columns. A Python solution is presented using NumPy, where the file is read into memory, and the relevant data is extracted by selecting every p'th row and the specified columns. The code demonstrates how to achieve this efficiently, while also noting that if the file is too large, line-by-line reading is an alternative. The conversation touches on the possibility of using Pandas for similar tasks, although the original poster has not yet learned it. The importance of knowing the values of m and p in advance is emphasized, and a potential issue with non-integer results from m/p is acknowledged.
Avatrin
Hi

Let's say I have a txt file with m rows and n columns of numbers of the form:

1 .12222E+01 .27300E+01 -.41442E+01 -.49391E+00
1 .80375E+00 .15953E+00 .47715E+00 -.10432E+01
2 .11046E+01 .79376E-01 -.12639E+00 -.10389E+00
2 -.95291E-02 -.54210E+01 .36199E+00 -.13710E+01
2 .16524E+00 -.27779E+01 .74098E+00 -.34125E+00

Let's say I want to take every p'th row, keep the second and third columns, and turn the result into a $\frac{m}{p}\times 2$ matrix. How would I go about doing that?
 
These things are very easy in Python. Below is the way I would do this, but there are many other ways. I'm assuming that you know m and p in advance. You must know p, but if you don't know m, you can read this file first and get m from len(lines). I'm also assuming that the text file is small enough that you can read the whole thing into memory, which is the easiest thing to do. If the file is too big for that, there are ways to read it line by line.
Code:
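import numpy as np
textfile = <whatever>  # path to the text file
m = <whatever>         # number of rows in the file
p = <whatever>         # keep every p'th row
# Rows 0, p, 2p, ... are kept, so round m/p up to get the row count.
data = np.zeros([(m + p - 1) // p, 2])
with open(textfile, 'r') as file:
    lines = file.readlines()
for i, line in enumerate(lines):
    if i % p == 0:
        items = line.split()
        # Integer division (//) is needed for array indices in Python 3.
        data[i // p, 0] = float(items[1])
        data[i // p, 1] = float(items[2])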
 
My approach would be the following:
Code:
import numpy as np
# Import the data as a 2-D numpy array (a matrix). Each element of the outer array is one row of the matrix.
raw_data = np.genfromtxt('data.txt', delimiter=' ', dtype=float)

# Select every p'th row, starting from the first
p_data = raw_data[::p]

# Select the 2nd and 3rd columns
data = p_data[:,[1,2]]

Of course, don't forget to set a value for p. With p=2, data evaluates to
Code:
array([[ 1.2222  ,  2.73    ],
       [ 1.1046  ,  0.079376],
       [ 0.16524 , -2.7779  ]])

I don't know whether this is what you were looking for: with m=5 and p=2, m/p is not an integer, and I obtain a 3x2 matrix because the slice starts at the first row and keeps rows 1, 3 and 5. You could easily condense the above code into three lines if you don't mind dropping the comments (bad coding practice, though) and losing some readability.
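As a rough sketch of the condensed form, and of why the row count comes out as m/p rounded up, something like the following should work. It reads the five sample rows from an in-memory string instead of data.txt (an assumption made only to keep the example self-contained) and uses the usecols argument of np.genfromtxt to pick the columns while reading.

```python
import io
import math
import numpy as np

# The five sample rows from the question, held in memory instead of data.txt.
sample = io.StringIO(
    "1 .12222E+01 .27300E+01 -.41442E+01 -.49391E+00\n"
    "1 .80375E+00 .15953E+00 .47715E+00 -.10432E+01\n"
    "2 .11046E+01 .79376E-01 -.12639E+00 -.10389E+00\n"
    "2 -.95291E-02 -.54210E+01 .36199E+00 -.13710E+01\n"
    "2 .16524E+00 -.27779E+01 .74098E+00 -.34125E+00\n"
)
m, p = 5, 2

# Read only the 2nd and 3rd columns, then keep every p'th row.
data = np.genfromtxt(sample, usecols=(1, 2))[::p]

# The slice starts at row 0, so the number of rows is m/p rounded up.
assert data.shape == (math.ceil(m / p), 2)
```

If exactly m/p rows are wanted and the first row should be skipped, the slice could be changed to [p-1::p], but that is a different selection from the one shown above.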
 
ChrisVer said:
Can't you use pandas (http://pandas.pydata.org/) to read tables from a file?

Well, Pandas is on my list of things I have to learn. However, I am not quite there yet. :)
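For anyone curious, a pandas version of the same selection might look roughly like this. It is only a sketch: the sample rows are again read from an in-memory string rather than a real file, and it assumes the columns are separated by whitespace and that the file has no header row.

```python
import io
import pandas as pd

# The five sample rows from the question, held in memory for illustration.
sample = io.StringIO(
    "1 .12222E+01 .27300E+01 -.41442E+01 -.49391E+00\n"
    "1 .80375E+00 .15953E+00 .47715E+00 -.10432E+01\n"
    "2 .11046E+01 .79376E-01 -.12639E+00 -.10389E+00\n"
    "2 -.95291E-02 -.54210E+01 .36199E+00 -.13710E+01\n"
    "2 .16524E+00 -.27779E+01 .74098E+00 -.34125E+00\n"
)
p = 2

# Whitespace-separated columns, no header line in the file.
df = pd.read_csv(sample, sep=r"\s+", header=None)

# Every p'th row, 2nd and 3rd columns, converted to a plain NumPy array.
data = df.iloc[::p, [1, 2]].to_numpy()
```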
 