[Python] Importing csv files and creating numpy arrays

AI Thread Summary
The discussion centers around loading CSV files into Python, specifically using NumPy, to extract and sum data from a file containing Multi-channel analyzer properties. The user is trying to load counts from a CSV file that has unnecessary header rows and columns. Key points include the need to ensure the CSV file is in the same directory as the Python script unless a full path is provided. It is advised not to modify the CSV file by removing the unwanted rows or columns but rather to use NumPy's capabilities to skip the initial rows and selectively read the relevant data. The user is encouraged to familiarize themselves with the `genfromtxt()` function and consider using options to skip lines, as well as to iterate through the resulting array to calculate the total counts. While there is mention of using pandas for data handling, the user prefers to stick with NumPy for now.
dcrisci
So I am fairly new to Python and feel somewhat comfortable with it. I am trying to do something that I think is fairly simple, but I am not very experienced with loading files into Python (or any programming language, for that matter).

I have csv files that contain a bunch of multi-channel analyzer (MCA) properties in the first 21 rows and 7 columns (which I do not need loaded into my arrays). Below these properties there are three columns: channel number, energy level, and number of counts. I would like to load the counts column into one numpy array and sum it to find the total number of counts. The code I have so far is:

Python:
"""
Created on Wed Apr 1 16:25:25 2015
@author: dariocrisci
"""

import numpy as np

csv = np.genfromtxt('al_five_degrees.csv', delimiter= ",")

Counts = csv[:,2]

print(Counts)

I understand this will only print the array of counts; I haven't done the summing yet (which is just a simple numpy command), but I am getting an error saying the file cannot be found. So, my questions are:
1. Where do the csv files need to be saved for python to find them?
2. To create an array containing only the counts, should I go into my csv files, remove the MCA properties, and save them with only the three columns of values?
3. Does the code I have for loading the csv file look correct?
I have read many threads saying to use pandas, but I am not familiar with that module and am trying to stick with modules I am comfortable with.
 
If you provide only a file name rather than a full path, then you need to run your Python script from the directory where your csv file is located.
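A minimal sketch of how to check this, assuming the file name from the original post. The helper `resolve_data_path` is a hypothetical name used only for illustration; it shows which absolute path a bare file name turns into, and whether a file actually exists there.

```python
from pathlib import Path


def resolve_data_path(filename, base_dir=None):
    """Return the absolute path a file name resolves to.

    With no base_dir, a bare file name is resolved against the current
    working directory, which is where Python looks for it.
    """
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    return (base / filename).resolve()


# 'al_five_degrees.csv' is the file name from the original post.
path = resolve_data_path("al_five_degrees.csv")
print("Working directory:", Path.cwd())
print("Looking for:", path)
print("Found:", path.exists())
```

If the last line prints `False`, either move the csv file into the working directory, run the script from the folder that holds the file, or pass the full path to `np.genfromtxt`.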

Other than that, I don't want to speculate from your description about exactly what your csv file looks like. If you need more help, post the file and highlight the data you need. Having said that, you should read the documentation for genfromtxt and look at a couple of its other options, such as skipping lines, so that you can get to where you need to be.
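As a rough sketch of those options: `np.genfromtxt` accepts `skip_header` (number of lines to skip at the top) and `usecols` (which columns to keep). The example below does not use the real file; it builds a made-up stand-in with 21 property rows of 7 columns followed by a few three-column data rows, so the values are purely illustrative.

```python
import io

import numpy as np

N_HEADER_ROWS = 21  # from the post: MCA properties fill the first 21 rows

# Synthetic stand-in for the real file (made-up values, illustration only).
header = "\n".join(
    ",".join(f"prop{r}_{c}" for c in range(7)) for r in range(N_HEADER_ROWS)
)
data = "0,0.0,5\n1,0.5,12\n2,1.0,7\n3,1.5,3"
fake_csv = io.StringIO(header + "\n" + data + "\n")

# Skip the property rows and keep only the third column (index 2, the counts).
counts = np.genfromtxt(
    fake_csv, delimiter=",", skip_header=N_HEADER_ROWS, usecols=2
)
total_counts = counts.sum()

print(counts)
print(total_counts)
```

With the real data, the `io.StringIO` object would be replaced by the file name (or full path) of the csv file; the rest stays the same, assuming the layout matches the description in the original post.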
 
dcrisci said:
1. Where do the csv files need to be saved for python to find them?
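Most likely in the same directory as your Python file, unless you provide the full path and filename to a different directory. (Strictly speaking, a bare file name is looked up in the directory the script is run from, which is usually the same place.)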
dcrisci said:
2. To create an array containing only the counts, should I go into my csv files, remove the MCA properties, and save them with only the three columns of values?
No, that's not a good idea. Just read and discard (i.e., don't store into variables) the values you don't want in the first 21 rows. For the following rows, read and discard the channel number and energy level, but store the counts number in your array.
When you have read all of the counts values, iterate through your array to find the sum of the counts.
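The read-and-discard approach described above could be sketched roughly as follows, using only the standard library. The function name `sum_counts`, its parameters, and the sample data are all assumptions made for illustration; the real file would have 21 header rows, while the sample here has only 2.

```python
import csv
import io


def sum_counts(lines, n_header_rows=21, counts_column=2):
    """Skip the header rows, keep only the counts column, and sum it."""
    reader = csv.reader(lines)
    counts = []
    for row_number, row in enumerate(reader):
        if row_number < n_header_rows:
            continue  # discard the MCA property rows
        if len(row) <= counts_column or not row[counts_column].strip():
            continue  # discard blank or short rows
        # Channel number and energy level are read but never stored.
        counts.append(float(row[counts_column]))

    # Iterate through the stored counts to find the total.
    total = 0.0
    for value in counts:
        total += value
    return counts, total


# Made-up sample: 2 header rows, then channel, energy, counts.
sample = io.StringIO("name,value\nlive time,100\n0,0.0,5\n1,0.5,12\n2,1.0,7\n")
counts, total = sum_counts(sample, n_header_rows=2)
print(counts, total)
```

For a real file, the `lines` argument can be the object returned by `open(path, newline="")`, ideally inside a `with` block so the file is closed afterwards.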
dcrisci said:
3. Does the code I have for loading the csv file look correct?
I don't think so, but I'm not familiar with the numpy library. Typically when you're getting input from a file you have to open the file (using the open() function), and then you can read through the file, using read() (reads the whole file) or readline() (reads a single line - probably what you want).

Here is some documentation about the genfromtxt() function, if you haven't already seen it.
dcrisci said:
I have read many threads saying to use pandas, but I am not familiar with that module and am trying to stick with modules I am comfortable with.
 