Understanding Python Module Installation with Pip

  • Context: Python
  • Thread starter: fog37
  • Tags: Modules, Python

Discussion Overview

The discussion revolves around the installation of Python modules using pip, the structure of Python modules and packages, and the differences between various Python distributions such as Anaconda. Participants explore the concepts of module importation, the role of the PYTHONPATH environment variable, and the organization of the Python standard library.

Discussion Character

  • Exploratory
  • Technical explanation
  • Conceptual clarification
  • Debate/contested
  • Homework-related

Main Points Raised

  • Some participants explain that modules are Python files containing functions and classes, and that they must be imported to be used in programs.
  • Others note that if a module is not available in the standard library, it can be installed using pip, which downloads the module from the internet.
  • A participant clarifies that a module does not need to be in a package to be importable, as long as it is in a directory listed in sys.path.
  • There is a discussion about the definition of a package in Python, with some arguing that a package is synonymous with a library, while others assert that packages are defined by their role in structuring the module namespace.
  • One participant shares their experience installing the numpy module using pip and inquires about the installation location of the module and the pip program itself.
  • Another participant suggests that the location of installed packages can be found by importing the package and checking its __path__ attribute.
  • There is mention of the Python documentation being a valuable resource for answering questions about modules and packages.

Areas of Agreement / Disagreement

Participants express differing views on the definitions and relationships between modules, packages, and libraries. While some points are clarified, there remains no consensus on the terminology and structure of Python's module system.

Contextual Notes

Participants reference specific Python versions and environments, indicating that the discussion may be influenced by individual setups and experiences. There are also mentions of the need for an internet connection for pip installations and the organization of Python installations on different operating systems.

Who May Find This Useful

This discussion may be useful for Python programmers, especially those new to module installation and management, as well as those seeking to understand the structure of Python's module system and the differences between various Python distributions.

fog37
TL;DR
Correct procedure to install modules that are not available
Hello,

I understand that modules are essentially Python files saved as .py. These files contain functions and/or classes. To use them in our programs, we must use the keyword import.

However, this works only if the module is available, i.e. already installed in the standard Python library, correct? A library, called a package, is a folder containing lots of related modules and a file called __init__.py.

On the other hand, a "distribution" like Anaconda is a full collection of libraries together with other useful tools.

If the single module that we need is not available in the standard library, we must use the Python library manager pip and install it the same way we install an entire library. I guess an internet connection must be available, since pip is practically downloading the module from the web...

Thanks for any validation/correction...
 
There are other schemes besides pip, most notably the Anaconda distribution and its smaller sibling, Miniconda.

Anaconda comes with a host of commonly used modules preinstalled so you pretty much can get started without an internet connection.

When developing your code, you might write your own modules, and then you can extend the PYTHONPATH environment variable with the path to your modules so that Python can find them. You can find details here:

https://www.tutorialspoint.com/What-is-PYTHONPATH-environment-variable-in-Python

and for building your own modules:

https://www.digitalocean.com/community/tutorials/how-to-write-modules-in-python-3
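
A minimal sketch of the PYTHONPATH idea, under some assumptions: the module name my_utils, its greet function, and the helper run_with_pythonpath are all made up for illustration. The script saves a one-function module into a temporary directory, then starts a fresh interpreter with PYTHONPATH pointing at that directory and imports the module from there.

```python
import os
import subprocess
import sys
import tempfile

# Source for a hypothetical one-function module (name and content are made up).
MODULE_SOURCE = "def greet(name):\n    return 'Hello, ' + name\n"


def run_with_pythonpath(module_dir, code):
    """Run `code` in a fresh interpreter whose PYTHONPATH is module_dir."""
    env = dict(os.environ, PYTHONPATH=module_dir)
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


with tempfile.TemporaryDirectory() as module_dir:
    # Save the module as my_utils.py in a directory Python does not search by default.
    with open(os.path.join(module_dir, "my_utils.py"), "w") as handle:
        handle.write(MODULE_SOURCE)
    # The child interpreter can import it because PYTHONPATH adds the directory to sys.path.
    greeting = run_with_pythonpath(
        module_dir, "import my_utils; print(my_utils.greet('fog37'))"
    )

print(greeting)
```

In everyday use the variable would normally be set in the shell or the system settings rather than from a script; the subprocess is only there to keep the sketch self-contained.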

Python programmers also build virtual environments, which control what modules (and which versions of them) a given program can see:

https://realpython.com/python-virtual-environments-a-primer/
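
A rough sketch of what creating a virtual environment looks like from Python itself, using the standard library's venv module. Virtual environments are usually described as a way to give each project its own set of installed packages. The directory name demo_env is an arbitrary choice, and with_pip=False is used here only so the sketch runs quickly and offline.

```python
import os
import tempfile
import venv

with tempfile.TemporaryDirectory() as scratch:
    env_dir = os.path.join(scratch, "demo_env")  # arbitrary name for this sketch
    # with_pip=False keeps the sketch fast and offline; a real project would
    # usually pass with_pip=True so the environment gets its own copy of pip.
    venv.EnvBuilder(with_pip=False).create(env_dir)
    # Every environment created this way has a pyvenv.cfg marker file at its top level.
    has_config = os.path.isfile(os.path.join(env_dir, "pyvenv.cfg"))
    top_level = sorted(os.listdir(env_dir))

print(has_config)
print(top_level)
```

The more common route is the command line form, python -m venv followed by a directory name, which does the same job.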

As a Python programmer you must become comfortable with these topics.

PyCharm is arguably the best IDE for Python development so you should check it out. MS VS Code Editor looks pretty good too. Both have debugging features. VS Code seems to load faster. Both have community editions (free, free free free!)

I'm a bad example of a Python programmer in that I have yet to use virtual environments for anything. But my code is mostly useful utilities that I alone use.
 
fog37 said:
this works only if the module is available, i.e. already installed in the standard Python library, correct? A library, called a package, is a folder containing lots of related modules and a file called __init__.py.

A module does not have to be in a package to be importable. It can be a single module, not inside a package, and will be importable as long as its .py file is in a directory that is in sys.path.

The Python standard library itself is not a package (although it contains packages). It is just a directory that is in sys.path (because it is set up that way when Python is installed on your computer) and contains modules and packages that are shipped with the Python interpreter.
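
A small sketch of the point about sys.path, assuming a made-up module name (pf_standalone_demo) and a temporary directory standing in for "a directory that is in sys.path". The module is a single .py file, not inside any package, and it becomes importable as soon as its directory is on sys.path.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as plain_dir:
    # A single .py file, with no package directory and no __init__.py around it.
    with open(os.path.join(plain_dir, "pf_standalone_demo.py"), "w") as handle:
        handle.write("def answer():\n    return 42\n")

    sys.path.append(plain_dir)  # make the directory visible to the import system
    try:
        importlib.invalidate_caches()
        standalone = importlib.import_module("pf_standalone_demo")
    finally:
        sys.path.remove(plain_dir)  # leave sys.path as we found it

value = standalone.answer()
# Packages carry a __path__ attribute; a standalone module does not.
is_package = hasattr(standalone, "__path__")
print(value)
print(is_package)
```

Printing sys.path in an interactive session shows the directories that are searched on a given machine, including the one that holds the standard library.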
 
Thank you PeterDonis. So a package seems to be a synonym of library which is a directory (folder) of modules.

Why wouldn't the Python standard library be a package if library = package = directory with modules?
 
fog37 said:
a package seems to be a synonym of library which is a directory (folder) of modules.

No. What you describe is the default way that packages are stored in the filesystem (and you've missed some parts--the directory has to be a subdirectory of a directory that's in sys.path and has to have an __init__.py file in it), but that's not what defines what a package is in Python.

Here is what a package is in Python:

https://docs.python.org/3/tutorial/modules.html#packages

The first sentence of this section is:

Packages are a way of structuring Python’s module namespace by using “dotted module names”.

In other words, packages, in and of themselves, are defined by the way they structure Python's module namespace. The way in which this structuring of Python's module namespace connects with how things are stored in the filesystem is, properly speaking, an implementation detail; there is nothing in the Python language specification that requires a package to correspond to a directory in the filesystem. And, in fact, since Python's import mechanism itself is programmable, it is possible to have the import statement look for packages and modules in other ways besides the default way of looking through the directories in sys.path for a subdirectory with the given name that meets the requirements for a package.
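
To make the "programmable import mechanism" remark concrete, here is a hedged sketch of a package that exists only in memory, with no directory behind it. The names inmem_pkg, inmem_pkg.greet, SOURCES and InMemoryFinder are invented for this illustration; the hooks used (sys.meta_path, importlib.abc.MetaPathFinder, importlib.abc.Loader, importlib.util.spec_from_loader) are part of the standard library.

```python
import importlib.abc
import importlib.util
import sys

# Hypothetical source code for a package and one submodule, held in a dict
# instead of in files on disk.
SOURCES = {
    "inmem_pkg": "",
    "inmem_pkg.greet": "def hello():\n    return 'hello from memory'\n",
}


class InMemoryFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Finds and loads the modules listed in SOURCES."""

    def find_spec(self, fullname, path=None, target=None):
        if fullname not in SOURCES:
            return None  # let the normal finders handle everything else
        # A name counts as a package here if some other entry lives "under" it.
        is_pkg = any(name.startswith(fullname + ".") for name in SOURCES)
        return importlib.util.spec_from_loader(fullname, self, is_package=is_pkg)

    def create_module(self, spec):
        return None  # use the default module creation

    def exec_module(self, module):
        exec(SOURCES[module.__name__], module.__dict__)


sys.meta_path.append(InMemoryFinder())

import inmem_pkg.greet  # dotted module name, resolved without touching the filesystem

message = inmem_pkg.greet.hello()
print(message)
print(inmem_pkg.__path__)
```

The dotted name works exactly as it would for a directory-based package, which is the sense in which the filesystem layout is an implementation detail.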

fog37 said:
Why wouldn't the Python standard library be a package

Because it isn't. See above.

Btw, a general note: the Python documentation is actually very detailed, and it is highly probable that it already contains answers to a lot of questions you might have. I strongly recommend taking some time to read through it and become familiar with the topics it covers. The top-level page for the current release candidate of Python (3.8rc1) is here:

https://docs.python.org/3/index.html

You can use the version dropdown at the top to get to the corresponding top-level page for the documentation of any extant version of Python.
 
Hello,
Today I successfully installed the numpy module (which wasn't installed) using pip, but I am not sure exactly what was going on. On the command line, I typed python -m pip install numpy

1600195540290.png

  • Where did the pip program manager install the numpy module? In which folder can I find the numpy module (or package)?
  • Also, the pip program itself is not installed in the system32 folder. The folder system32 is the current folder, correct? How can I run pip from there?
  • pip is installed in the subfolder Scripts in the folder Python37 where I see the program python which is not installed in the subfolder system32...Is that the python interpreter? If so, can I use the interpreter directly if I click on it instead of using the IDLE or the command line?
1600196293165.png


However, it seems that the shortcuts to the IDLE and to the interpreter (Python 3.7 (64-bit)) are installed elsewhere in the Start Menu folder (see below). Is this a good practice?

1600195933820.png


Thanks!
 

fog37 said:
  • Where did the pip program manager install the numpy module? In which folder can I find the numpy module (or package)?

The simplest way to find out where a Python package (numpy is a package) is installed is to start an interactive Python interpreter session, import the package, and then look at its __path__ attribute. Here's what happens on my machine when I do that (my machine is Linux, not Windows, but the interpreter commands would be the same):

Python:
Python 3.5.2 (default, Jul 17 2020, 14:04:10)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> numpy.__path__
['/usr/local/lib/python3.5/dist-packages/numpy']
>>>
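
The same check can be done without importing the package first. The sketch below uses two standard library names (json, which is a package, and textwrap, which is a single-file module) so that it runs even where numpy is not installed; the helper name locate is made up for this example.

```python
import importlib.util
import json      # a package in the standard library
import textwrap  # a single-file module in the standard library


def locate(name):
    """Return where the import system would load `name` from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None  # nothing by that name can be imported
    if spec.submodule_search_locations is not None:
        # Packages report the location(s) searched for their submodules.
        return list(spec.submodule_search_locations)
    return spec.origin  # plain modules report a single origin


json_location = locate("json")
textwrap_location = locate("textwrap")
missing_location = locate("pf_no_such_module_demo")
textwrap_is_package = hasattr(textwrap, "__path__")

print(json_location)
print(textwrap_location)
print(missing_location)
print(textwrap_is_package)
```

Calling locate("numpy") on a machine where numpy is installed should report the same directory that numpy.__path__ shows.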

fog37 said:
  • Also, the pip program itself is not installed in the system32 folder. The folder system32 is the current folder, correct? How can I run pip from there?

The system32 folder on Windows is not for programs; it's for system libraries. Programs are not installed there. I have no idea why you would need to run any program from that folder. As for pip, it doesn't matter which folder you run it from; it doesn't care what your current folder is.

Also, pip is not a standalone program, it's a Python module (actually it's a package, see note below). When you issue the command python -m pip install numpy, the program you are running is python; that program then loads the pip module and runs it as a script. See the Python documentation (remember what I said about the answers to your questions already being in the Python documentation?) here:

https://docs.python.org/3/tutorial/modules.html#executing-modules-as-scripts

Note that pip is actually a package, so when you run it as a script this way, the module that gets executed as a script is actually the __main__.py file in the package. See the Python docs (there's the docs again...) here:

https://docs.python.org/3/library/__main__.html#module-__main__
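
A sketch of that mechanism with a throwaway package, so it can be tried without touching pip. The package name demo_pkg and the message it prints are invented for the example. The script writes a package containing a __main__.py file and then runs it with python -m, the same way pip is run.

```python
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as root:
    pkg_dir = os.path.join(root, "demo_pkg")  # hypothetical package name
    os.mkdir(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as handle:
        handle.write("")
    # __main__.py is the file that "python -m demo_pkg" executes.
    with open(os.path.join(pkg_dir, "__main__.py"), "w") as handle:
        handle.write("import sys\nprint('demo_pkg.__main__ ran with', sys.argv[1:])\n")

    # Run from the directory that contains the package, so -m can find it.
    result = subprocess.run(
        [sys.executable, "-m", "demo_pkg", "install", "something"],
        cwd=root, capture_output=True, text=True, check=True,
    )
    output = result.stdout.strip()

print(output)
```

The arguments after the package name arrive in sys.argv, which is how install numpy reaches pip's code.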

fog37 said:
  • pip is installed in the subfolder Scripts in the folder Python37 where I see the program python which is not installed in the subfolder system32...Is that the python interpreter? If so, can I use the interpreter directly if I click on it instead of using the IDLE or the command line?

The python program in the Python37/Scripts folder is the python interpreter executable, yes. Python scripts that can be run as standalone programs also get installed there. So instead of typing the command python -m pip install numpy, you could just type pip install numpy and it should work. If you look at the source code of the pip script in the Scripts folder, all it really does is import the pip package and run the same function the __main__.py file in that package runs if you run it as a script with python -m.

As for using the interpreter directly, I don't know if clicking on it in Windows will bring up a command prompt window with the interpreter running, but you can always type "python" in a command prompt window and get an interactive interpreter session.

fog37 said:
it seems that the shortcuts to the IDLE and to the interpreter (Python 3.7 (64-bit)) are installed elsewhere in the Start Menu folder (see below). Is this a good practice?

If by "good practice" you mean "what Windows always does with shortcuts", yes. Whether that is actually a good practice as far as usability is a different question. :wink:
 
Thank you PeterDonis. I really appreciate your input. It is true that the documentation has everything, but sometimes it gets overwhelming to read; after you explain a topic, I can dig into the documentation with a different spirit.
For instance, in regards to pip, you mentioned that it is both a Python library/package installation management tool as well as a Python module...

In some cases, pip is used with the statement:
Code:
pip install modulename

while in other instances as
Code:
python -m pip install modulename
The first part means to run pip as a module...
Code:
-m

Why would we want to run pip, which is a package installation tool, as a module? Also, when I don't use -m, things seem to work just fine...

Thanks
 
fog37 said:
in regards to pip, you mentioned that it is both a Python library/package installation management tool as well as a Python module...

That's not quite what I said.

The package pip is what contains the actual Python code that does the work.

Inside that package, there is a module named __main__.py. That is what gets run if you type python -m pip install <something> at the command line, since that's how the python -m command works when you give it the name of a package. (If you give it the name of a standalone module that's not in a package, it just runs that module.) The module __main__.py inside the pip package is just a little bit of stub code that calls a function elsewhere in the pip package to do the work.

There is also a standalone script called pip, which is what gets run if you type pip install <something> at the command line. (This assumes that the script has been installed somewhere on your command line's search path.) This script is also just a little bit of stub code that calls a function in the pip package (the same one the module __main__.py inside the pip package calls) to do the work.

So python -m pip install <something> and pip install <something> are just two different ways of calling the same code to do the same work. Which one you use is really just a matter of personal preference.
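
One practical difference shows up when more than one Python is installed: the python -m form uses whichever interpreter is named on the command line, while the bare pip command is whatever script the shell finds first on its search path. The sketch below only builds and inspects the commands without running an install; the helper name pip_command_for_this_interpreter is made up for the example.

```python
import shutil
import sys


def pip_command_for_this_interpreter(*args):
    """Build a pip command tied to the interpreter running this script."""
    return [sys.executable, "-m", "pip", *args]


cmd = pip_command_for_this_interpreter("install", "numpy")
print(cmd)

# Where a bare "pip" would come from on this machine (None if it is not on the search path).
standalone_pip = shutil.which("pip")
print(standalone_pip)
```

If the folder reported for the standalone script belongs to a different Python installation than sys.executable, the two commands would install into different places.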
 
