Writing modules that require supplementary files

In summary, Eclair_de_XII is working on a Python program that accesses supplementary files, but is having trouble deciding how to make the files available to the user. After jedishrfu raises startup costs and user options, Eclair_de_XII decides that option 1 is the best course of action.
  • #1
Eclair_de_XII
TL;DR Summary
Let's say we have a module that allows the user to view sunset data, rent rates, and economic data for various regions of the United States. Let's say I have that data compiled. Would it be better for that data to be downloaded automatically when the program is run, or should it be included with the module with instructions on where to put it?
I'm working with Python modules at the mo', and I am having trouble deciding how best to include supplementary material that is accessed by my module, but not necessarily a part of it. The program works alright when I access it from the IDE, but it fails to find the files when the program is run from the shell, because these files are referenced with relative paths. Let's say I want an end-user to access these files. Do I trust him or her to follow my instructions on where to place the files, or do I write more script to automatically download the files to the right locations for him or her?

1. On one hand, the user will not always know where to place the files, and they might trick the module by placing a file with the same name, but incorrect contents, in the right location.

2. On the other hand, the user might not be trusting of programs that automatically download things to their computer, even if they are prompted to confirm the download.

I feel like option 1 is better, because the user can view the contents of the file(s) beforehand, and the instructions are explicit enough:

Python:
from os import getcwd

# Tell the user where the data folder is expected (the shell's current directory)
folder = 'sunset_data'
message = 'Please place the "%s" folder into %s' % (folder, getcwd())
print(message)
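
One way to address the concern in point 1 (a file with the right name but the wrong contents) would be to check each file against a known checksum before using it. This is only a minimal sketch and not the poster's code: the names `sha256_of` and `find_bad_files` and the `expected` mapping are hypothetical, and the real digests would have to be computed from the author's own data files.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large data files do not have to fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_bad_files(folder, expected):
    """Return names from `expected` that are missing or have the wrong digest.

    `expected` maps a file name to its known-good SHA-256 hex digest
    (hypothetical values that the module author would supply).
    """
    folder = Path(folder)
    bad = []
    for name, digest in expected.items():
        path = folder / name
        if not path.is_file() or sha256_of(path) != digest:
            bad.append(name)
    return bad
```

If `find_bad_files` returns a non-empty list, the program could print the placement instructions again and name the offending files instead of failing later with a confusing error.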
 
  • #2
These are all good questions that only you can answer based on feedback from users.

I usually consider startup costs as a key decision point. You don’t want the user to wait too long for your program to start up. Similarly, during program execution you have to decide how long the user might wait for you to do a download. If you can do downloads in the background, that’s better.

For deciding where files are saved, you could tell the user they will be saved in a subdirectory under the home directory, in a subdirectory under the current directory, or in a subdirectory under temp. Each has pros and cons. For a single user, it makes sense to create it under home or under tmp. For a multiuser computer, saving per user under home is best, unless the data would be the same for all users, in which case tmp would be a good place.

Saving under the current directory can lead to your program pooping data files everywhere, cluttering up the filesystem, so it's not the best way to go.

You could provide user options to ask each time or just once when you need permission to place files on their machine. You could provide options of where and when to save the data and when to expire it.
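
The choice between home and temp could be sketched roughly as follows. This is an illustration only: the function name `choose_data_dir`, the `shared` flag, and the default directory name (borrowed from the `sunset_data` folder in the first post) are assumptions, not anything prescribed in this thread.

```python
import tempfile
from pathlib import Path


def choose_data_dir(app_name="sunset_data", shared=False):
    """Return a directory for the data files, creating it if needed.

    shared=False: per-user storage in a hidden subdirectory of the home directory.
    shared=True:  storage under the system temp directory, for data that is
                  the same for every user.
    """
    if shared:
        base = Path(tempfile.gettempdir()) / app_name
    else:
        base = Path.home() / ("." + app_name)
    base.mkdir(parents=True, exist_ok=True)
    return base
```

Keep in mind that many systems clear the temp directory periodically, so anything stored there may have to be fetched again.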
 
  • #3
jedishrfu said:
You don’t want the user to wait too long for your program to start up. Similarly, during program execution you have to decide how long the user might wait for you to do a download. If you can do downloads in the background, that’s better.

My program has to read a collection of twenty-something files, all of my creation. It does not produce any additional file output; it only shows information calculated from the data files. Since it depends entirely on these supplementary files, I do not think it could even run until the files are available.
 
  • #4
@jedishrfu

You know, I think I had already made up my mind before I made this topic. So I appreciate the input and I will keep it in mind for future coding projects. But for this particular one, I do not think it would be a great fit. Thank you for your help and insight.
 
  • #5
You could read in the files in the background and display a progress bar and partial data while they are being read. Giving the user some feedback prevents a user panic that the program is hung or something else is wrong.
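
A rough sketch of that idea using a worker thread is shown here. The function names and the `on_progress(done, total)` and `on_done(data)` callbacks are hypothetical; a real program would hook them up to whatever progress bar or display it uses.

```python
import threading
from pathlib import Path


def load_files(paths, on_progress):
    """Read each file in turn, reporting progress after every file.

    `on_progress(done, total)` is a hypothetical callback; a GUI could use
    it to update a progress bar.
    """
    data = {}
    total = len(paths)
    for done, path in enumerate(paths, start=1):
        data[Path(path).name] = Path(path).read_bytes()
        on_progress(done, total)
    return data


def load_in_background(paths, on_progress, on_done):
    """Start loading on a worker thread and return the thread immediately."""
    def worker():
        on_done(load_files(paths, on_progress))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

The callbacks run on the worker thread, so a GUI toolkit that is not thread-safe would need them to hand the update over to its main loop rather than touch widgets directly.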
 
  • #6
Is this for Windows? For Windows, you can define an .msi file that will install things (from a .zip file?) where they should go. I don't know how .exe files do the same thing, but they can.
 
  • #7
Perhaps have a command-line option that lets the user specify a path to the data file (as a single zipped archive; your program can unzip it as part of the load process), defaulting to the current directory if the option is missing.
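
A minimal sketch of such an option, using the standard argparse and zipfile modules, might look like this. The option name `--data`, the function names, and the `extract_to` parameter are assumptions made for illustration.

```python
import argparse
import zipfile
from pathlib import Path


def parse_args(argv=None):
    """Parse a hypothetical --data option; defaults to the current directory."""
    parser = argparse.ArgumentParser(description="View regional data.")
    parser.add_argument(
        "--data",
        default=".",
        help="zip archive of data files, or a directory that already contains them",
    )
    return parser.parse_args(argv)


def prepare_data_dir(data_arg, extract_to):
    """Return a directory containing the data files.

    If `data_arg` is a zip archive, unpack it into `extract_to` first;
    otherwise treat `data_arg` as the data directory itself.
    """
    source = Path(data_arg)
    if source.is_file() and zipfile.is_zipfile(source):
        with zipfile.ZipFile(source) as archive:
            archive.extractall(extract_to)
        return Path(extract_to)
    return source
```

The rest of the program would then read its files from the directory returned by `prepare_data_dir(parse_args().data, some_directory)`.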
 
  • #8
Let's say I want to let the user specify where he or she wants the data files. My code is hard-wired to look for the files in question in the paths relative to the shell's cwd. Would I have to change every line of code that reads these files from the relative paths, or is there an easier way? I ask out of pure curiosity; I'm still ambivalent on whether or not I wish to let the user decide where to save the supplementary files.
 
  • #9
You could use an environment variable to hold the base directory where you store the files and, if it is not set, fall back to a directory relative to your current directory or relative to where your program resides.
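
As for changing every line that reads a file: one option is to route every file access through a single helper, so the lookup rule lives in one place. This sketch assumes a hypothetical environment variable name, `SUNSET_DATA_DIR`, and a `data` folder next to the source file as the fallback.

```python
import os
from pathlib import Path

# Hypothetical environment variable name; pick one that suits your module.
ENV_VAR = "SUNSET_DATA_DIR"


def data_dir():
    """Return the base directory for data files.

    Uses the environment variable if it is set; otherwise falls back to a
    'data' folder next to this source file (not the shell's cwd).
    """
    override = os.environ.get(ENV_VAR)
    if override:
        return Path(override)
    return Path(__file__).resolve().parent / "data"


def data_path(name):
    """Build the full path to one data file, e.g. data_path('file1.dat')."""
    return data_dir() / name
```

Each existing `open('some/relative/path')` call would become `open(data_path('...'))`, which is a one-time edit; after that, changing where the files live means changing only `data_dir`.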

You should play with these ideas and let your users try them out. We really can't answer this question well here.

As an example, you could see how other systems do this. One is golang, aka Go, which has two defined environment variables, GOROOT and GOPATH, for its go command.

GOROOT locates the program and its related code and data files, usually defaulting to /usr/local/go. GOPATH locates where it will store vendor code downloaded from the internet, usually defaulting to ~/go.

If a user types "go get github.com/gizak/termui" the go command will download termui code from github into the ~/go/src (GOPATH) directory stored as ~/go/src/github.com/gizak/termui.
 
  • #10
Based on your description, it appears these are data files that belong with your module and should be distributed with it. In that case, place them within a subdirectory relative to your module. For example:

Code:
/package/mymodule.py
/package/data/file1.dat
/package/data/file2.dat
/package/data/file...dat

From within your module, you can get the path to the data using something like:

Python:
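import os

# 'data' subdirectory next to this module, independent of the shell's cwd
DATA_DIR = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'data')
data1_filename = os.path.join(DATA_DIR, 'file1.dat')
...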

If you distribute your module, you can specify data files within the setup.py file using the package_data argument of setup() to make sure these files are installed in the appropriate location.

This is the recommended approach if your data files do not change frequently for a given module version. But if you are using different data files each time the application runs, this approach won't work, and you will have to download them each time, either to the user's current working directory (not a good idea, since they may not have write permission to it) or to a temporary directory that you create (see https://docs.python.org/3/library/tempfile.html).
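
For the temporary-directory case, a minimal sketch with the tempfile module could look like the following. The download itself is left out: `fetch(name)` is a hypothetical callable that returns a file's bytes, and the function name and the `mymodule_` prefix are likewise assumptions.

```python
import tempfile
from pathlib import Path


def stage_files(names, fetch):
    """Write freshly fetched data files into a new temporary directory.

    `fetch(name)` is a hypothetical callable returning the file's bytes
    (for example, a wrapper around a download). Returns the
    TemporaryDirectory object; the directory is deleted when its
    cleanup() method is called or when it is used as a context manager.
    """
    tmp = tempfile.TemporaryDirectory(prefix="mymodule_")
    for name in names:
        (Path(tmp.name) / name).write_bytes(fetch(name))
    return tmp
```

The caller reads the files from `tmp.name` and calls `tmp.cleanup()` when finished, so nothing is left behind on the user's machine.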

One other point, if you do a lot of pre-processing of data files that is the same from instance to instance, it would be wise to run this in advance and save the processed files rather than the raw data files, to speed up load time. Also consider using a binary memory-mapped file for saving the data, as this can speed up load times dramatically.
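
To illustrate the memory-mapped idea with only the standard library, here is a small sketch that stores pre-processed numbers as raw 8-byte floats and maps them back in. The function names and the flat file layout are assumptions; a real project might prefer a library format instead.

```python
import mmap
from array import array


def save_processed(values, path):
    """Save pre-processed numbers as raw 8-byte floats (native byte order)."""
    with open(path, "wb") as f:
        array("d", values).tofile(f)


def open_processed(path):
    """Memory-map the file and return (view, mapping, file).

    `view` behaves like a read-only sequence of floats; data is paged in
    by the OS on demand instead of being parsed up front.
    """
    f = open(path, "rb")
    mapping = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    view = memoryview(mapping).cast("d")
    return view, mapping, f
```

When finished, release the view before closing the mapping (`view.release()`, then `mapping.close()` and `f.close()`). Because the file uses the machine's native byte order, it is not portable between architectures, and an empty file cannot be mapped.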
 

FAQ: Writing modules that require supplementary files

1. How do I create a module that requires supplementary files?

To create a module that requires supplementary files, first determine which files are needed. Python has no "require" or "include" function for data files; instead, open them with the usual file functions, building each path from the module's own location (for example, from __file__) rather than from the current working directory. Make sure the path is correct so that the files load regardless of where the program is started.

2. What is the purpose of using supplementary files in a module?

Supplementary files are used in a module to provide additional functionality or resources that are not included in the main code. This can include images, data files, or other external resources. By using supplementary files, you can keep your main code clean and organized, and easily add or update resources as needed.

3. How do I organize my supplementary files within a module?

The organization of supplementary files within a module will depend on the specific needs and structure of your code. However, it is generally good practice to create a separate folder for your supplementary files and to access them with paths relative to the module's location, not the current working directory. This will help keep your main code organized and make it easier to locate and update the supplementary files.

4. Can I use external resources as supplementary files in my module?

Yes, you can use external resources such as images or data files as supplementary files in your module. However, it is important to ensure that these resources are properly referenced and accessible in your code. You may need to use absolute paths or include the necessary external libraries to use these resources in your module.

5. How can I ensure that my module and its supplementary files are compatible with different systems?

To ensure compatibility with different systems, reference your supplementary files with paths built relative to the module's own location, and join path components with os.path.join or pathlib rather than hard-coding separators. This lets the files be found regardless of the directory layout or the working directory on a given system. Additionally, it is helpful to thoroughly test your module on different systems to identify and address any potential compatibility issues.
