Python workflow for experimental data analysis


Discussion Overview

The discussion revolves around the challenges and experiences of using Python for data analysis and plotting in experimental physics. Participants share their workflows, tools, and preferences, particularly in comparison to MATLAB and OriginPro, and seek advice on improving their processes.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested

Main Points Raised

  • One participant expresses dissatisfaction with their current Python workflow for plotting, noting difficulties in saving and retrieving plots and data, which contrasts with their experience using OriginPro.
  • Another participant suggests that the issue may stem from not saving Python scripts, advocating for a practice of documenting and saving work similar to their MATLAB approach.
  • A different participant acknowledges saving scripts but highlights the challenges of interactive programming and the complexity of recreating specific workspaces from scratch.
  • One suggestion involves programmatically logging selected options and saving accompanying plot images in a timestamped folder, along with the Python code used for generating the plots.
  • Another participant shares their method of documenting code changes in MATLAB, emphasizing the importance of keeping a record of all lines of code for future reference.
  • Several participants recommend using Jupyter notebooks for their advantages in plotting and documentation, with one noting the ability to embed live plots and the convenience of having code and visualizations together.
  • A participant mentions the potential of Jupyter Lab as a future IDE, expressing interest in its development and features.

Areas of Agreement / Disagreement

There is no clear consensus on the best workflow for data analysis and plotting in Python. Participants present various approaches and tools, with some advocating for Jupyter notebooks while others prefer traditional IDEs or MATLAB practices. The discussion remains open-ended with multiple competing views.

Contextual Notes

Participants express varying levels of comfort with different tools and workflows, indicating that their preferences may depend on specific tasks or types of analysis. There are also references to limitations in current software capabilities and personal habits in coding and documentation.

franciobr
Hello!

I have been using Python for my data analysis and processing needs as an experimental physicist for a year now. I have used MATLAB and OriginPro before, and Python provides everything I need.
But I am not satisfied with my workflow, specifically for plotting. I often find myself looking for old plots (images) that I made and having to rewrite code to make them again, which is counterproductive.

I am currently using a combination of Sublime Text 3 (plus Anaconda plugins) and the Spyder IDE for data visualization and for running code in its IPython interpreter (I like the variable explorer, and the IPython interpreter is much superior to SublimeREPL). It works great for data processing and general coding, but saving plots and processed data in a tidy way has been a hassle. If I could save my whole workspace like in MATLAB it would be much better (Spyder's implementation of this is limited).

OriginPro, in contrast, automatically saves all the data and plots in the same project, which is a great workflow for plotting and saving your work. I must say that even with a decent background in Python, I find myself more productive with OriginPro for plotting and light data processing.

I am aware that there are alternatives such as Jupyter and PyCharm. Maybe I should give the Jupyter notebook another shot.

I would like some insight from you guys. What is your workflow for data analysis and plotting with Python? Do you have a hard time going back and editing plots and datasets? How do you save your work?
 
From your post, you appear not to be saving your Python scripts. Sure, in MATLAB you can save the workspace, but I never do. I write everything as a .m file and rerun it later if I have to. I also work on my imagery/plots and then save a copy to refer back to later and include in presentations and reports, so why aren't you doing the same with your Python?
 
Well, I do save most of my Python scripts, but some of my work is interactive programming (in the IPython interpreter), especially when I am exploring the data and testing different visualizations. Sometimes I make batches of plots that involve running a combination of custom functions several times, and it is time-consuming to document everything in such a way that I can recreate that specific workspace by running a single script. I wish I could just pick up where I left off and easily edit the work instead of having to recreate it every time.
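
One partial way to approximate a MATLAB-style "save workspace" in plain Python is to pickle the variables of the interactive session. The sketch below is only an illustration under stated assumptions: `save_workspace` and `load_workspace` are hypothetical helper names (not part of Spyder or IPython), and it assumes the data of interest can be pickled. Modules, callables, and private names are left out on purpose, and anything else that fails to pickle is reported back.

```python
import pickle
import types


def save_workspace(path, namespace):
    """Pickle the plain-data variables found in `namespace` to `path`.

    Private names (leading underscore), modules, and callables are ignored.
    Returns the list of names that were skipped because pickling failed.
    """
    saved, skipped = {}, []
    for name, value in list(namespace.items()):
        if name.startswith("_") or isinstance(value, types.ModuleType) or callable(value):
            continue
        try:
            # Pickle each value separately so one bad object does not
            # prevent the rest of the workspace from being saved.
            saved[name] = pickle.dumps(value)
        except Exception:
            skipped.append(name)
    with open(path, "wb") as f:
        pickle.dump(saved, f)
    return skipped


def load_workspace(path):
    """Return a dict {name: value} restored from a file written by save_workspace."""
    with open(path, "rb") as f:
        saved = pickle.load(f)
    return {name: pickle.loads(blob) for name, blob in saved.items()}
```

At the end of an interactive session this might be called as `skipped = save_workspace("session.pkl", globals())`, and later `ws = load_workspace("session.pkl")` gives back a dictionary from which individual variables can be pulled (for example `ws["spectrum"]`, where `spectrum` is a made-up variable name). Pickle files should only be loaded if you created them yourself, since unpickling can execute arbitrary code.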
 
Can you programmatically create a log of your selected options [as a text file],
together with accompanying plot images, possibly dumped into a folder with a timestamp label?
You might consider outputting in that text file the python code of your options [e.g. function calls with parameters] that you could copy-paste to reproduce the plot.
(Later, you could write a script that will [say] create an html page that can show all of the plot images (sitting in the various folders) so that you can survey the variations.)
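
The logging idea above could be sketched roughly as follows. Everything here is an assumption for illustration: the helper names `save_plot_run` and `build_gallery`, the `plot_log` folder, and the file names are made up, and `fig` is taken to be any object with a `savefig(path)` method, such as a matplotlib `Figure`.

```python
import html
import json
from datetime import datetime
from pathlib import Path


def save_plot_run(fig, call_text, params, root="plot_log"):
    """Save a figure together with a text record of how it was made.

    fig       : object with a savefig(path) method (e.g. a matplotlib Figure)
    call_text : the Python call that reproduces the plot, as a string
    params    : dict of the options used (must be JSON-serializable)
    Returns the timestamped folder that was created.
    """
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
    run_dir = Path(root) / stamp
    run_dir.mkdir(parents=True, exist_ok=False)

    fig.savefig(str(run_dir / "plot.png"))
    # The call is stored as plain text so it can be copy-pasted later.
    (run_dir / "call.py").write_text(call_text + "\n")
    (run_dir / "params.json").write_text(json.dumps(params, indent=2, sort_keys=True))
    return run_dir


def build_gallery(root="plot_log"):
    """Write root/index.html listing every saved plot next to its call."""
    root = Path(root)
    sections = []
    for run_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        call = (run_dir / "call.py").read_text().strip()
        sections.append(
            f"<h3>{run_dir.name}</h3>\n"
            f"<pre>{html.escape(call)}</pre>\n"
            f'<img src="{run_dir.name}/plot.png" width="480">'
        )
    index = root / "index.html"
    index.write_text("<html><body>\n" + "\n".join(sections) + "\n</body></html>\n")
    return index
```

With matplotlib, a call might look like `save_plot_run(fig, "plot_spectrum(run=12, smooth=5)", {"run": 12, "smooth": 5})`, where `plot_spectrum` stands in for one of your own plotting functions. Running `build_gallery()` afterwards produces a single HTML page for surveying all the variations.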
 
When I do a new analysis in MATLAB, I comment out lines, document, and add to the file until I get what I want. I don't really ever delete lines of code. If you did that in your scripts, you'd have your log of keystrokes...
 
I suggest using an ipython notebook. I typically use the nohup command so that the ipython server persists if the terminal is closed

nohup jupyter notebook &

You can embed live plots (that is, they can be zoomed, panned, saved to disc, etc) using the %matplotlib magic command

%matplotlib notebook
 
I was on vacation and away from data analysis and science for a while, so sorry for taking so long to reply. Thank you robphy and Dr. Transport for the thoughtful answers. I could use my command history or thoroughly document everything, but that seems like too much trouble to me.

Daverz said:
I suggest using an ipython notebook. I typically use the nohup command so that the ipython server persists if the terminal is closed

nohup jupyter notebook &

You can embed live plots (that is, they can be zoomed, panned, saved to disc, etc) using the %matplotlib magic command

%matplotlib notebook

Turns out the Jupyter notebook is really good for plotting! I did not know about %matplotlib notebook (just the inline magic command), and it is very helpful. I had tried the notebook before for general programming and felt it lacked some IDE features for heavy coding. Having all the plots and their code right next to each other is great for editing and documenting my work. I can also add some code and LaTeX equations in the notebook and share the analysis. I am still not ditching Sublime for "serious" coding, but for light coding and plotting Jupyter is awesome! It also supports R, which I use from time to time. Thanks for the tip, Daverz.

It seems the Jupyter team is working on a project called JupyterLab to turn the Jupyter notebook into a "full featured" IDE: http://blog.jupyter.org/2016/07/14/jupyter-lab-alpha/ It looks promising, and I will definitely check it out once version 1.0 comes out.
 
