Can't Open .ipynb file in Jupyter: Too Large

  • Thread starter: WWGD
SUMMARY

The forum discussion addresses the issue of opening a large Jupyter Notebook (.ipynb) file, specifically one that is 99.3 MB in size, which fails to load due to excessive output generated by an incorrectly iterated loop. Users recommend changing the file extension to .txt or .json and using Notepad++ for editing, as it can handle large files and display JSON structure. Additionally, a Python script is provided to strip outputs from the notebook, allowing for successful reopening. The discussion emphasizes the importance of maintaining well-formed JSON during manual edits.

PREREQUISITES
  • Understanding of Jupyter Notebook file structure and .ipynb format
  • Familiarity with JSON data format
  • Basic knowledge of Python programming
  • Experience using text editors like Notepad++ for code editing
NEXT STEPS
  • Learn how to use Notepad++ for editing large JSON files
  • Explore the functionality of the nbstripout tool for cleaning Jupyter notebooks
  • Practice writing Python scripts to manipulate JSON data
  • Investigate Jupyter Notebook extensions that enhance file management and editing
USEFUL FOR

Data scientists, Python developers, and anyone working with Jupyter Notebooks who needs to manage large files and optimize their workflow.

WWGD
Hi All,
I may have asked this question earlier. Apologies if I did.

I have a Jupyter Notebook (.ipynb) file whose size is 99.3 MB. The thing is, most of it is not code:
I wrote a loop incorrectly and it iterated way too many times. I saved the file, which included the output of this repeated loop. Now I just get "File failed to load" error messages. Someone suggested (it may have been here on PF) changing the extension to that of a text file, opening it, and deleting the repeated output, but I don't see how to do this.
Any ideas?
 
Change the .ipynb extension to .txt and open it with notepad++.
 
Borg said:
Change the .ipynb extension to .txt and item or with notepad++.
Thank you, I thought of that, but I am not sure how to do it. The format of the notebook does not seem to allow for that. These files seem to be stored in the back-end server of Jupyter. Edit: And they are not accessible from the Windows file manager. Maybe I can use "Save as" and store them in a path using the .txt extension. Will give it a try.
 
Sorry about the typo earlier (phone autocorrect). Notepad++ should be able to handle a 90 MB file without too much difficulty. BTW, you don't really even have to change the extension - just open it directly with Notepad++.

Once you open it, you'll see that the notebook is just a JSON file. Notepad++ will show the line numbers on the left side and should highlight an entire section when you click in it. Just look for one that's really big. When you remove the extra junk, you'll have to make sure that it's still well-formed JSON when you're done - no dangling brackets like this { that don't have the ending bracket (or vice versa). I would advise two things with this:
  • Practice manually removing content from a smaller file first to make sure that it opens when you're finished.
  • Make a backup of the file that you're trying to fix and only do this to copies of the original.
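One way to check the "still well-formed JSON" requirement after a manual edit is to let Python's standard `json` module try to parse the file. This is only a minimal sketch: the function name `check_notebook_json` is made up for illustration, and it only tests that the file parses as JSON, not that Jupyter will accept it as a valid notebook.

```python
import json


def check_notebook_json(path):
    """Return (True, None) if the file parses as JSON, else (False, message)."""
    try:
        with open(path, encoding="utf-8") as f:
            json.load(f)
    except json.JSONDecodeError as err:
        # lineno and colno point at the position where parsing failed,
        # which is usually at or just after the dangling bracket.
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"
    return True, None


# Example call (the filename is a placeholder for your edited copy):
# print(check_notebook_json("my-notebook-copy.ipynb"))
```

The reported line number matches the line numbers Notepad++ shows on the left, so it should help locate the spot that needs fixing.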
 
WWGD said:
I have a Jupyter Notebook (.ipynb) file whose size is 99.3 MB. The thing is, most of it is not code:
I wrote a loop incorrectly and it iterated way too many times. I saved the file, which included the output of this repeated loop.
Why do you want to open this file? Searching manually through a 99.3 MB file seems like looking for a needle in an array of haystacks.
 
nbstripout was made for this task.

If you want to do it manually:

WWGD said:
Thank you, I thought of that, but I am not sure of how to do it. The format of the notebook does not seem to allow for that. These files seem to be stored in the back end server of Jupyter.
No, the 'back end server of Jupyter' just stores them in the file system, wherever you tell it to, defaulting to your home directory which on Windows is something like C:\Users\WWGD. Don't you recognise the path when you browse files from the Jupyter window?

WWGD said:
Edit: And are not accessible from the Windows file manager.
Either you are looking in the wrong place or you have some daft setting in File Manager. If it is the first, type
Code:
cd C:\
dir *.ipynb /S
and wait to find all the notebooks on your C: drive (and repeat for other drives if necessary).
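If you would rather search from Python than from the command prompt, something along these lines should do a similar job and also show which notebook is the oversized one. It is a rough sketch using only the standard library; `largest_notebooks` is a name invented here, and the starting folder in the example is just a guess at where the notebooks live.

```python
from pathlib import Path


def largest_notebooks(root, top=10):
    """Return up to `top` (size_in_bytes, path) pairs for .ipynb files under root, largest first."""
    found = []
    for path in Path(root).rglob("*.ipynb"):
        try:
            found.append((path.stat().st_size, path))
        except OSError:
            # Skip files that disappear or cannot be read while scanning.
            continue
    found.sort(key=lambda item: item[0], reverse=True)
    return found[:top]


# Example (assumes the notebooks are somewhere under your home directory):
# for size, path in largest_notebooks(Path.home()):
#     print(f"{size / 1e6:8.1f} MB  {path}")
```

Note that this will also list the autosaved copies Jupyter keeps in `.ipynb_checkpoints` folders.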

WWGD said:
Maybe I can use "Save as" and store them in a path using the .txt extension. Will give it a try.
Unless you have a Jupyter plugin for Notepad++, it would be better to change the extension to .json; then you should get code folding, which will help you. If you do have a Jupyter plugin then just leave the extension.
 
Mark44 said:
Why do you want to open this file? Searching manually through a 99.3 MB file seems like looking for a needle in an array of haystacks.
Unless the file is corrupted there is no need to search manually, any code folding editor will quickly separate the wheat from the chaff.
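If you want to know which cell holds the chaff before opening an editor at all, a few lines of Python can measure it. This is a sketch under the assumption that the notebook is well-formed JSON with a top-level 'cells' list (the current notebook format); `output_sizes` is a hypothetical helper name, and the sizes are only approximate character counts.

```python
import json


def output_sizes(path):
    """Return a list of (cell_index, approx_output_chars) pairs, largest first."""
    with open(path, encoding="utf-8") as f:
        notebook = json.load(f)
    sizes = []
    for index, cell in enumerate(notebook.get("cells", [])):
        outputs = cell.get("outputs", [])
        if outputs:
            # Re-serialising the outputs gives a rough measure of the space they occupy.
            sizes.append((index, len(json.dumps(outputs))))
    sizes.sort(key=lambda item: item[1], reverse=True)
    return sizes


# Example (the filename is a placeholder):
# for index, size in output_sizes("my-notebook.ipynb")[:5]:
#     print(f"cell {index}: about {size / 1e6:.1f} MB of output")
```

Cell indices count from 0 and include markdown cells, so they will not match the In [n] execution numbers shown in Jupyter.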
 
Mark44 said:
Why do you want to open this file? Searching manually through a 99.3 MB file seems like looking for a needle in an array of haystacks.
Ctrl+F should help.
 
Now I have found the file path (indeed in C:\Users\...). I've tried to open the file (with a .pdf extension) in Thonny, and Thonny shuts down each time (3 times so far).
 
  • #10
WWGD said:
Now I have found the file path (indeed in C:\Users\...). I've tried to open the file (with a .pdf extension)
If it has a pdf extension it is probably a pdf file, not a Jupyter notebook.

WWGD said:
in Thonny and Thonny shuts down each time ( 3 times so far).
Ditch Thonny; you need a proper code editor. For most purposes I recommend Visual Studio Code, but it barfs on really big files, so for this purpose on Windows I would use Notepad++. But it won't be any use for a pdf file; that is not going to contain your code.
 
  • #11
pbuk said:
If it has a pdf extension it is probably a pdf file, not a Jupyter notebook. Ditch Thonny; you need a proper code editor. For most purposes I recommend Visual Studio Code, but it barfs on really big files, so for this purpose on Windows I would use Notepad++. But it won't be any use for a pdf file; that is not going to contain your code.
Thanks; will try resaving as an .ipynb and see if I can open it.
 
  • #12
This short script will strip the outputs from a Jupyter notebook (and will run as a Jupyter notebook itself).

[CODE lang="python" title="Strip output from Jupyter notebook"]import json

# Enter your filename here.
filename = 'your-file-name-without-extension'

ext = '.ipynb'
outfilename = filename + '-stripped'

# Parse the notebook file into a dict and reference its 'cells'.
# Notebooks are stored as UTF-8, so say so explicitly (matters on Windows).
with open(filename + ext, encoding='utf-8') as infile:
    notebook = json.load(infile)

cells = notebook['cells']

print('Removing output from', len(cells), 'cells')
for cell in cells:
    # Only code cells carry an 'outputs' list; other cells are left alone.
    if 'outputs' in cell:
        cell['outputs'] = []

# Dump the modified dict as prettyprinted json (= ipynb).
print('Writing to', outfilename + ext)
with open(outfilename + ext, 'w', encoding='utf-8') as outfile:
    json.dump(notebook, outfile, indent=2)[/CODE]
 
  • #13
Thanks, @pbuk , just a follow-up. It worked well. Thanks again.
 
