{"id":335,"date":"2021-03-31T21:04:20","date_gmt":"2021-03-31T21:04:20","guid":{"rendered":"https:\/\/andrejacobs.org\/?p=335"},"modified":"2022-04-11T20:22:59","modified_gmt":"2022-04-11T20:22:59","slug":"100-days-of-learning-day-23-more-exploring-of-the-python-importing-process","status":"publish","type":"post","link":"https:\/\/andrejacobs.org\/100-days-challenge\/100-days-of-learning-day-23-more-exploring-of-the-python-importing-process\/","title":{"rendered":"100 Days of Learning: Day 23 \u2013 More exploring of the Python importing process"},"content":{"rendered":"\n
Photo by Maxwell Nelson<\/a> on Unsplash<\/a><\/p>\n\n\n\n Here is my Log book<\/a><\/p>\n Code for today<\/a>.<\/p>\n The Definitive Guide to Python import Statements<\/a> is a really good article to read.<\/p>\n According to the docs<\/a> when a module is imported the interpreter will first search for a built-in module with that name. If not found it will then search for a file in the list of directories specified by Let’s explore What does Key points:<\/strong><\/p>\n Let’s explore some of these key points. Create a new file named hello.py inside package1<\/p>\n Modify example.py and run it.<\/p>\n Here we can see that the Take note, the code at a module’s global scope is only executed the first time the module is imported. We had to add a function to module1.py since all the print statements in that module would have already been executed.<\/p>\n Let’s modify Thus Based on the docs, it would mean that if we had a module name that matches the name of a built-in module, then our module will never be imported.<\/p>\n Let’s explore that. First let’s get the names of built-in modules<\/p>\n We can see from sys.builtin_module_names that there is a module named time. Create a file named time.py.<\/p>\n Modify example.py<\/p>\n Let’s make sure. Rename time.py to mytime.py and adjust the import to be import mytime.<\/p>\n Again we have barely scratched the surface on the module loading and importing process. To be honest it seems quite complex and I am pretty sure it evolved that way for good reasons over the years.<\/p>\n The key takeaway<\/strong> for me on this is that I need to stop thinking in terms of where my files are located but rather start thinking about how the loading system works.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":" Photo by Maxwell Nelson on Unsplash […]<\/p>\nQuick recap on importing<\/h2>\n
\n
__init__.py<\/code> file.<\/li>\n
__name__<\/code> specifies the current module’s name. This can be changed by the loading system (e.g.
__main__<\/code> )<\/li>\n
__package__<\/code> specifies the package the current module belongs to.<\/li>\n
How does importing work?<\/h2>\n
sys.path<\/code>. According to the guide<\/a> the interpreter will also look for a package matching the name.<\/p>\n
sys.path<\/code> is initialized with the directory that contains the script being run, followed by the environment variable
PYTHONPATH<\/code> (same syntax as
PATH<\/code>), then by the installation-dependent defaults (the Python standard library and site-packages).<\/p>\n
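The search order described above can be inspected directly. The following is a minimal sketch, not part of the original example project; the helper name describe_search_path is made up for illustration. It lists each sys.path entry in the order the interpreter would try it and notes whether the directory (or zip file) actually exists.

```python
import os
import sys


def describe_search_path():
    """Return (position, directory, exists) for each sys.path entry, in search order."""
    rows = []
    for position, entry in enumerate(sys.path):
        # An empty string entry stands for the current working directory.
        directory = entry if entry else os.getcwd()
        rows.append((position, directory, os.path.exists(directory)))
    return rows


if __name__ == "__main__":
    for position, directory, exists in describe_search_path():
        marker = "" if exists else "  (missing)"
        print(f"{position}: {directory}{marker}")
```

Entries that do not exist on disk (such as the python39.zip entry on many installs) are simply skipped by the import system, which is why they are harmless.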
sys.path<\/h3>\n
sys.path<\/code>. Modify the example.py file and run it.<\/p>\n
# example.py\nimport sys\n...\nif __name__ == '__main__':\n ...\n print(f'sys.path: {sys.path}')\n<\/code><\/pre>\n
$ python example.py\n...\nsys.path: ['\/Users\/andre\/...\/project', '\/Users\/andre\/.pyenv\/versions\/3.9.1\/lib\/python39.zip', '\/Users\/andre\/.pyenv\/versions\/3.9.1\/lib\/python3.9', '\/Users\/andre\/.pyenv\/versions\/3.9.1\/lib\/python3.9\/lib-dynload', '\/Users\/andre\/.pyenv\/versions\/3.9.1\/lib\/python3.9\/site-packages']\n# Ok that checks out, the first directory is the one containing the script being run.\n\n# What happens when we add paths to PYTHONPATH?\n$ export PYTHONPATH=${PYTHONPATH}:${HOME}\/temp\n$ python example.py\nsys.path: ['\/Users\/andre\/...\/project', '\/Users\/andre\/temp', ...]\n\n# Ok that also checks out, script directory followed by PYTHONPATH followed by the rest.\n<\/code><\/pre>\n
sys.path<\/code> look like when we are technically not running a script? Running the Python REPL means we are not running a script.<\/p>\n
# Start a new terminal session to be sure that PYTHONPATH is not going to interfere\n$ echo $PYTHONPATH\n# nothing\n\n$ python\n>>> import sys\n>>> print(sys.path)\n['', '\/Users\/andre\/.pyenv\/versions\/3.9.1\/lib\/python39.zip', '\/Users\/andre\/.pyenv\/versions\/3.9.1\/lib\/python3.9', '\/Users\/andre\/.pyenv\/versions\/3.9.1\/lib\/python3.9\/lib-dynload', '\/Users\/andre\/.pyenv\/versions\/3.9.1\/lib\/python3.9\/site-packages']\n\n# Ok so there is no script directory; the first entry is an empty string instead\n<\/code><\/pre>\n
\n
sys.path<\/code> does not automatically contain the current working directory when running a script; its first entry is the directory that contains the script. This should not be confused with running the REPL, where sys.path[0] == '' (an empty string) actually means the current working directory.<\/li>\n
sys.path<\/code> can be modified.<\/li>\n
sys.path<\/code> is shared across all the imported modules.<\/li>\n<\/ul>\n
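To illustrate the last key point, here is a small sketch (an assumption-laden demo, not code from the project) that writes a throwaway module into a temporary directory, imports it, and checks that the module sees the very same sys.path list object as the importing code. The module name path_probe and the helper path_is_shared are made up for this example.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def path_is_shared():
    """Import a throwaway module and check that it sees the same sys.path list."""
    with tempfile.TemporaryDirectory() as tmp:
        # 'path_probe' is a hypothetical module name used only for this demo.
        Path(tmp, "path_probe.py").write_text("import sys\nseen_path = sys.path\n")
        sys.path.append(tmp)
        try:
            importlib.invalidate_caches()
            probe = importlib.import_module("path_probe")
            # Identity check: both names refer to one and the same list object.
            return probe.seen_path is sys.path
        finally:
            # Clean up so the demo leaves no trace behind.
            sys.path.remove(tmp)
            sys.modules.pop("path_probe", None)


if __name__ == "__main__":
    print(path_is_shared())
```

Because there is only one list, a change made to sys.path anywhere in the program affects every import that happens afterwards.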
\u251c\u2500\u2500 package1\n \u00a0 \u00a0 \u251c\u2500\u2500 __init__.py\n \u2514\u2500\u2500 hello.py\n<\/code><\/pre>\n
# hello.py\nimport sys\nprint(f'{__name__} sys.path: {sys.path}')\n<\/code><\/pre>\n
# example.py\n...\nimport package1.hello\n...\n<\/code><\/pre>\n
$ python example.py\n...\npackage1.hello sys.path: ['\/Users\/andre\/...\/project', '\/Users\/andre\/.pyenv\/versions\/3.9.1\/lib\/python39.zip', '\/Users\/andre\/.pyenv\/versions\/3.9.1\/lib\/python3.9', '\/Users\/andre\/.pyenv\/versions\/3.9.1\/lib\/python3.9\/lib-dynload', '\/Users\/andre\/.pyenv\/versions\/3.9.1\/lib\/python3.9\/site-packages']\n<\/code><\/pre>\n
sys.path<\/code> did indeed carry across to the hello module inside of package1 as is. So in theory it means that hello.py could import modules from the same directory that the script (example.py) lives in.<\/p>\n
# Add the following function to module1.py\ndef say_something():\n print(f'{__name__} says something')\n\n# Then modify hello.py\n# hello.py\n...\nprint('hello.py will import module1')\nimport module1\nmodule1.say_something()\n<\/code><\/pre>\n
$ python example.py\n...\nhello.py will import module1\nmodule1 says something\n<\/code><\/pre>\n
sys.path<\/code>. Edit example.py<\/p>\n
# example.py\n...\nsys.path.append('\/Users\/andre\/temp')\nimport tempmodule\n\n# Create a file named tempmodule.py in \/Users\/andre\/temp\n# tempmodule.py\nprint('This module lives no where near the project')\n<\/code><\/pre>\n
$ python example.py\n...\nThis module lives no where near the project\n...\nsys.path: [ ..., '\/Users\/andre\/temp']\n<\/code><\/pre>\n
sys.path<\/code> can be modified and thus alter the importing process.<\/p>\n
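Since a change to sys.path stays in effect for the rest of the program, one option is to limit how long the extra directory is present. The sketch below is only one possible approach and not something the original example does; extra_import_path, import_from_elsewhere and faraway_module are hypothetical names introduced for illustration.

```python
import contextlib
import importlib
import sys
import tempfile
from pathlib import Path


@contextlib.contextmanager
def extra_import_path(directory):
    """Append `directory` to sys.path for the duration of a with-block."""
    entry = str(directory)
    sys.path.append(entry)
    try:
        yield entry
    finally:
        # Remove the last matching entry, normally the one appended above.
        for index in range(len(sys.path) - 1, -1, -1):
            if sys.path[index] == entry:
                del sys.path[index]
                break


def import_from_elsewhere():
    """Create a throwaway module outside the project and import it."""
    with tempfile.TemporaryDirectory() as tmp:
        # 'faraway_module' is a hypothetical name used only for this demo.
        Path(tmp, "faraway_module.py").write_text("GREETING = 'hello from elsewhere'\n")
        with extra_import_path(tmp):
            importlib.invalidate_caches()
            module = importlib.import_module("faraway_module")
        sys.modules.pop("faraway_module", None)  # forget the demo module again
        return module.GREETING


if __name__ == "__main__":
    print(import_from_elsewhere())
```

Note that a module imported inside the with-block stays usable afterwards; only later imports stop seeing the extra directory.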
Built-in modules take precedence<\/h3>\n
$ python\n>>> import sys\n>>> print(sys.builtin_module_names)\n('_abc', '_ast', '_codecs', '_collections', '_functools', '_imp', '_io', '_locale', '_operator', '_peg_parser', '_signal', '_sre', '_stat', '_string', '_symtable', '_thread', '_tracemalloc', '_warnings', '_weakref', 'atexit', 'builtins', 'errno', 'faulthandler', 'gc', 'itertools', 'marshal', 'posix', 'pwd', 'sys', 'time', 'xxsubtype')\n<\/code><\/pre>\n
project\n \u251c\u2500\u2500 example.py\n \u251c\u2500\u2500 module1.py\n \u251c\u2500\u2500 package1\n \u2502\u00a0\u00a0 \u251c\u2500\u2500 __init__.py\n \u2514\u2500\u2500 time.py\n<\/code><\/pre>\n
# time.py\nprint('THIS SHOULD NOT BE IMPORTED')\n<\/code><\/pre>\n
...\n# This should import the built-in and not the time.py module\nimport time\n<\/code><\/pre>\n
$ python example.py\n...\n# Output as would be expected. Did not get the print from time.py\n<\/code><\/pre>\n
$ python example.py\n...\nTHIS SHOULD NOT BE IMPORTED\n# Well in this case it should have been imported as we expected.\n<\/code><\/pre>\n
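The same experiment can be done without creating files, by asking the import system where it would load a name from. This is a minimal sketch using importlib.util.find_spec from the standard library; the helper names module_origin and shadows_builtin are made up for illustration, and which modules count as built-in depends on how the interpreter was compiled.

```python
import importlib.util
import sys
from pathlib import Path


def module_origin(name):
    """Return where the import system would load `name` from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def shadows_builtin(filename):
    """True if a file such as 'time.py' shares its name with a built-in module."""
    return Path(filename).stem in sys.builtin_module_names


if __name__ == "__main__":
    print(module_origin("sys"))         # built-in modules report the origin 'built-in'
    print(shadows_builtin("time.py"))   # result depends on the interpreter build
    print(shadows_builtin("mytime.py"))
```

This protection only applies to the names in sys.builtin_module_names. Standard library modules that live as ordinary files on sys.path can still be shadowed by a file of the same name in the script directory, because that directory is searched first.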
Wrap up for tonight<\/h2>\n