-
-
Notifications
You must be signed in to change notification settings - Fork 33.8k
gh-69605: Check for already imported modules in PyREPL module completion #139461
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
gh-69605: Check for already imported modules in PyREPL module completion #139461
Conversation
|
Ok, one of the new test fails on Windows x64/32 (but pass on arm) I just setup and built Python on a Win 11 x64 to try to reproduce, the test pass 😵💫 (and the fix works) |
|
I'm going to convert it into a draft until you deem this PR ready. That will also reduce the probability of someone clicking on the PR to review it (like me...) while it's not yet ready for. |
Thanks, I just fixed the failing test, this should be ready for review! Sorry again for the noise, I was dealing with what turned out to be a importer cache issue, which I couldn't reproduce outside of the CI buildbots 😓 [edit] I just noticed I can convert to draft / mark ready myself, will do it next times! |
…ules' of https://github.com/loic-simon/cpython into pyrepl-module-completion-check-for-already-imported-modules
…ready-imported-modules
Lib/_pyrepl/_module_completer.py
Outdated
| if os.path.basename(imported_path) == "__init__.py": # package | ||
| imported_path = os.path.dirname(imported_path) | ||
| import_location = os.path.dirname(imported_path) | ||
| modules = list(pkgutil.iter_modules([import_location])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this is correct, it's going to find submodules but we actually want this to be the top-level modules
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it's OK, because os.path.dirname make us search in the folder containing the module origin, not on the origin itself (that's why we need to do it twice for a package):
>>> import os, pkgutil, typing, concurrent
>>>
>>> typing.__spec__.origin # single-file module
'<venv>/lib/python3.14/typing.py'
>>> concurrent.__spec__.origin # package
'<venv>/lib/python3.14/concurrent/__init__.py'
>>>
>>> loc = os.path.dirname(typing.__spec__.origin) # or do it twice for concurrent
>>> [mod for mod in pkgutil.iter_modules([loc]) if mod.name in ("typing", "concurrent")]
[ModuleInfo(module_finder=FileFinder('<venv>/lib/python3.14'), name='concurrent', ispkg=True),
ModuleInfo(module_finder=FileFinder('<venv>/lib/python3.14'), name='typing', ispkg=False)]While refactoring this into a separate function I ended up rewriting the whole thing, it should be more explicit now (and it checks module.__package__ instead of the os.path.basename(imported_path) == "__init__.py" hack)
Lib/_pyrepl/_module_completer.py
Outdated
| imported_path = os.path.dirname(imported_path) | ||
| import_location = os.path.dirname(imported_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be safer to check __spec__.has_location before treating the origin as a path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I didn't saw this attribute, will do!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That made me realize completion didn't work anymore for already imported frozen modules (for which has_location=False and origin='frozen'), quite a niche issue but fixed it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That made me realize completion didn't work anymore for already imported frozen modules (for which
has_location=Falseandorigin='frozen'), quite a niche issue but fixed it!
Might want to add a test for that as well :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added one yes (plus one in the non-imported case), coverage is still 100% on this branch 👌
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also realized we will need the same logic for my other PR about attributes completion, so I factored the logic into a private method, I suggest we merge this one first so I can rebase merge main into the other and call it!
EDIT: ok that's not true, since attributes completion checks directly the module object, I forgot 😅 but the refactor is a good thing anyway I think
Lib/_pyrepl/_module_completer.py
Outdated
| imported_path = os.path.dirname(imported_path) | ||
| import_location = os.path.dirname(imported_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I didn't saw this attribute, will do!
Lib/_pyrepl/_module_completer.py
Outdated
| if os.path.basename(imported_path) == "__init__.py": # package | ||
| imported_path = os.path.dirname(imported_path) | ||
| import_location = os.path.dirname(imported_path) | ||
| modules = list(pkgutil.iter_modules([import_location])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it's OK, because os.path.dirname make us search in the folder containing the module origin, not on the origin itself (that's why we need to do it twice for a package):
>>> import os, pkgutil, typing, concurrent
>>>
>>> typing.__spec__.origin # single-file module
'<venv>/lib/python3.14/typing.py'
>>> concurrent.__spec__.origin # package
'<venv>/lib/python3.14/concurrent/__init__.py'
>>>
>>> loc = os.path.dirname(typing.__spec__.origin) # or do it twice for concurrent
>>> [mod for mod in pkgutil.iter_modules([loc]) if mod.name in ("typing", "concurrent")]
[ModuleInfo(module_finder=FileFinder('<venv>/lib/python3.14'), name='concurrent', ispkg=True),
ModuleInfo(module_finder=FileFinder('<venv>/lib/python3.14'), name='typing', ispkg=False)]While refactoring this into a separate function I ended up rewriting the whole thing, it should be more explicit now (and it checks module.__package__ instead of the os.path.basename(imported_path) == "__init__.py" hack)
Lib/_pyrepl/_module_completer.py
Outdated
| imported_path = os.path.dirname(imported_path) | ||
| import_location = os.path.dirname(imported_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That made me realize completion didn't work anymore for already imported frozen modules (for which has_location=False and origin='frozen'), quite a niche issue but fixed it!
| return None | ||
| if not spec.has_location: | ||
| if spec.origin == "frozen": # See Tools/build/freeze_modules.py | ||
| return os.path.join(self._stdlib_path, f"{spec.name}.py") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not always going to be a real path right? e.g. _frozen_importlib.. in any case it does not seem that useful to create the full path when we strip it away with os.path.dirname right after.
Perhaps we can have a smaller diff if instead of reverse-engineering the location, we simply look for the right ModuleInfo in the global cache. After all if the module has already been imported, it should be visible to pkgutil.iter_modules. Something like this:
modules: Iterable[pkgutil.ModuleInfo] = self.global_cache
imported_module = sys.modules.get(path.split('.')[0])
if imported_module is not None:
# Filter modules to those whose name and spec matches the imported module
modules = ...We can compare the module name and origin (maybe with something like mod_info.module_finder.find_spec(mod_info.name)) to make sure we have the right one.
This PR handles an edge-case when suggesting module imports in PyREPL (see this comment):
If a module is already imported, the import machinery will only get submodules from it, even if a module of the same name is higher in the path. This can happen if eg. sys.path has been updated since the module was originaly imported.
The easiest way to reproduce this issue (and how I stumbled on it) is with stdlib modules imported during interpreter startup (before site customisation, so ignoring potential shadowing):
This PR add a special case in
_pyrepl.ModuleCompleter._find_modulesthatIt was a little tricky to get right, I'm not sure about everything 😅 but it pass all tests I could think off!
cc @tomasr8