Can't Open .ipynb file in Jupyter: Too Large

  • Thread starter: WWGD
  • Tags: File

Discussion Overview

The discussion revolves around opening a large Jupyter Notebook file (.ipynb) that fails to load because of excessive output generated by an incorrectly implemented loop. Participants explore various methods to access and modify the file to remove the problematic content.

Discussion Character

  • Technical explanation
  • Debate/contested
  • Exploratory

Main Points Raised

  • One participant describes their issue with a 99.3 MB .ipynb file that fails to load due to excessive output from a loop.
  • Several participants suggest changing the file extension to .txt or .json to open it in Notepad++, which can handle large files.
  • There is a discussion about the accessibility of the file in the Jupyter environment and whether it can be found using the Windows file manager.
  • Some participants express skepticism about manually searching through such a large file, suggesting that code folding editors could simplify the process.
  • A participant shares a script designed to strip outputs from a Jupyter notebook, allowing for easier handling of large files.
  • The original poster reports that Thonny crashed each time they tried to open the file (saved with a .pdf extension); a reply recommends Notepad++ for large files instead.

Areas of Agreement / Disagreement

Participants generally agree on the methods to access and modify the file, but there are differing opinions on the best tools to use and the necessity of manual searching through the file.

Contextual Notes

Some participants mention the importance of ensuring the JSON format remains well-formed after editing, and there are concerns about the limitations of certain code editors when handling large files.

WWGD
Hi All,
I may have asked this question earlier. Apologies if I did.

I have a Python .ipynb file in Jupyter Notebook whose size is 99.3 MB. The thing is, most of it is not code:
I wrote a loop incorrectly and it iterated far too many times. I saved the file, which included the repeated output of this loop. Now I just get "File failed to load" error messages. Someone suggested (it may have been here on PF) changing the extension to that of a text file, opening it, and deleting the repeated output. But I don't see how to do this.
Any ideas?
 
Change the .ipynb extension to .txt and open it with notepad++.
 
Borg said:
Change the .ipynb extension to .txt and item or with notepad++.
Thank you, I thought of that, but I am not sure how to do it. The format of the notebook does not seem to allow for that. These files seem to be stored in the back end server of Jupyter. Edit: And they are not accessible from the Windows file manager. Maybe I can use Save As and store them in a path, using the .txt extension. Will give it a try.
 
Sorry about the typo earlier (phone autocorrect). Notepad++ should be able to handle a 90 MB file without too much difficulty. BTW, you don't even have to change the extension; just open it directly with Notepad++.

Once you open it, you'll see that the notebook is just a JSON file. Notepad++ will show the line numbers on the left side and should highlight an entire section when you click in it. Just look for one that's really big. When you remove the extra junk, you'll have to make sure that it's still well-formed JSON when you're done: no dangling brackets like this { that don't have the ending bracket (or vice versa). I would advise two things with this:
  • Practice manually removing content from a smaller file first to make sure that it opens when you're finished.
  • Make a backup of the file that you're trying to fix and only do this to copies of the original.
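
Both pieces of advice can be scripted. The following is only a minimal sketch using Python's standard library; the helper names `backup` and `check_json` are made up for illustration and are not part of Jupyter or Notepad++. It copies the notebook before you touch it and, after a manual edit, reports where the JSON parser first gives up.

```python
import json
import shutil
from pathlib import Path


def backup(path):
    """Copy the notebook next to itself as '<name>.bak' and return the copy's path."""
    src = Path(path)
    dst = src.with_name(src.name + ".bak")
    shutil.copy2(src, dst)
    return dst


def check_json(path):
    """Return None if the file parses as JSON, else a description of the first error."""
    try:
        with open(path, encoding="utf-8") as f:
            json.load(f)
    except json.JSONDecodeError as err:
        # lineno/colno point at the place where the parser gave up, which is
        # usually close to (not always exactly at) the dangling bracket.
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None
```

Calling `check_json` on the edited copy returns `None` when the file is still well-formed; otherwise the line number gives a place to start looking in Notepad++. Note that this only checks JSON syntax, not whether Jupyter will accept the notebook's structure.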
 
WWGD said:
I have a Python .ipynb file in Jupyter Notebook whose size is 99.3 MB. The thing is, most of it is not code:
I wrote a loop incorrectly and it iterated far too many times. I saved the file, which included the repeated output of this loop.
Why do you want to open this file? Searching manually through a 99.3 MB file seems like looking for a needle in an array of haystacks.
 
nbstripout was made for this task.

If you want to do it manually:

WWGD said:
Thank you, I thought of that, but I am not sure of how to do it. The format of the notebook does not seem to allow for that. These files seem to be stored in the back end server of Jupyter.
No, the 'back end server of Jupyter' just stores them in the file system, wherever you tell it to, defaulting to your home directory which on Windows is something like C:\Users\WWGD. Don't you recognise the path when you browse files from the Jupyter window?

WWGD said:
Edit: And are not accessible from the Windows file manager.
Either you are looking in the wrong place or you have some daft setting in File Manager. If it is the first, type
Code:
cd C:\
dir *.ipynb /S
and wait to find all the notebooks on your C: drive (and repeat for other drives if necessary).
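
If Python is handy, the same search can be done from a script or another notebook. This is a rough sketch, not a replacement for the `dir` command above; `find_notebooks` is a name invented for this example, and where to start searching is your choice. Sorting by size makes a 99 MB notebook easy to spot.

```python
from pathlib import Path


def find_notebooks(root):
    """Return (size_in_bytes, path) pairs for every .ipynb under root, largest first."""
    found = []
    for path in Path(root).rglob("*.ipynb"):
        try:
            found.append((path.stat().st_size, path))
        except OSError:
            # Skip files that vanish or cannot be read (permissions, broken links).
            continue
    found.sort(key=lambda item: item[0], reverse=True)
    return found


# Example (assumed starting point): the ten largest notebooks under your home folder.
# for size, path in find_notebooks(Path.home())[:10]:
#     print(f"{size / 1e6:8.1f} MB  {path}")
```

Autosave copies inside `.ipynb_checkpoints` folders will show up in the list too.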

WWGD said:
Maybe I can use Save As and store them in a path, using the .txt extension. Will give it a try.
Unless you have a Jupyter plugin for Notepad++, it would be better to change the extension to .json; then you should get code folding, which will help you. If you do have a Jupyter plugin, then just leave the extension.
 
Mark44 said:
Why do you want to open this file? Searching manually through a 99.3 MB file seems like looking for a needle in an array of haystacks.
Unless the file is corrupted, there is no need to search manually; any code-folding editor will quickly separate the wheat from the chaff.
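
As an alternative to folding in an editor, a few lines of Python can report which cells hold the bulk of the file. This is a sketch that assumes the usual notebook layout (a top-level "cells" list whose code cells carry an "outputs" list); `output_sizes` is a hypothetical helper name.

```python
import json


def output_sizes(path):
    """Return (cell_index, approx_chars) for each cell with output, largest first."""
    with open(path, encoding="utf-8") as f:
        notebook = json.load(f)
    sizes = []
    for index, cell in enumerate(notebook.get("cells", [])):
        outputs = cell.get("outputs")
        if outputs:
            # Re-serialising the outputs gives a rough measure of how much
            # of the file this cell is responsible for.
            sizes.append((index, len(json.dumps(outputs))))
    sizes.sort(key=lambda item: item[1], reverse=True)
    return sizes
```

The index at the top of the returned list is the cell to look at first; the sizes are approximate character counts, not exact bytes on disk.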
 
Mark44 said:
Why do you want to open this file? Searching manually through a 99.3 MB file seems like looking for a needle in an array of haystacks.
Ctrl+F should help.
 
Now, I found the file path (indeed in C:\Users\...). I've tried to open the file (with a .pdf extension) in Thonny, and Thonny shuts down each time (3 times so far).
 
  • #10
WWGD said:
Now, I found the file path (indeed in C:\Users\...). I've tried to open the file (with a .pdf extension)
If it has a pdf extension it is probably a pdf file, not a Jupyter notebook.
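
If in doubt, the first few bytes of the file settle it regardless of the extension: a PDF begins with `%PDF-`, while a notebook is a JSON object and so begins with `{`. A small sketch (the function name `sniff` is made up for this example):

```python
def sniff(path):
    """Guess from the first bytes whether a file is a PDF, JSON-like text, or something else."""
    with open(path, "rb") as f:
        head = f.read(64)
    if head.startswith(b"%PDF-"):
        return "pdf"
    # Drop an optional UTF-8 byte-order mark and leading whitespace before looking for '{'.
    if head.startswith(b"\xef\xbb\xbf"):
        head = head[3:]
    if head.lstrip().startswith(b"{"):
        return "json-like"
    return "unknown"
```

A result of "json-like" only means the file starts like a JSON object; it does not prove the file is a valid notebook.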

WWGD said:
in Thonny and Thonny shuts down each time ( 3 times so far).
Ditch Thonny, you need a proper code editor. For most purposes I recommend Visual Studio Code, but it barfs on really big files, so for this purpose on Windows I would use Notepad++. But it won't be any use for a pdf file; that is not going to contain your code.
 
  • #11
pbuk said:
If it has a pdf extension it is probably a pdf file, not a Jupyter notebook. Ditch Thonny, you need a proper code editor. For most purposes I recommend Visual Studio Code, but it barfs on really big files, so for this purpose on Windows I would use Notepad++. But it won't be any use for a pdf file; that is not going to contain your code.
Thanks; will try resaving as a .ipynb and see if I can open it.
 
  • #12
This short script will strip the outputs from a Jupyter notebook (and will run as a Jupyter notebook itself).

[CODE lang="python" title="Strip output from Jupyter notebook"]import json

# Enter your filename here.
filename = 'your-file-name-without-extension'

ext = '.ipynb'
outfilename = filename + '-stripped'

# Parse the notebook file into a dict and reference its 'cells'.
# Notebooks are UTF-8 JSON, so state the encoding explicitly.
with open(filename + ext, encoding='utf-8') as infile:
    notebook = json.load(infile)

cells = notebook['cells']

print('Removing output from', len(cells), 'cells')
for cell in cells:
    # Only code cells have an 'outputs' key; other cells are left untouched.
    if 'outputs' in cell:
        cell['outputs'] = []

# Dump the modified dict as prettyprinted json (= ipynb).
print('Writing to', outfilename + ext)
with open(outfilename + ext, 'w', encoding='utf-8') as outfile:
    json.dump(notebook, outfile, indent=2)[/CODE]
 
  • #13
Thanks, @pbuk , just a follow-up. It worked well. Thanks again.
 
