gh-139899: Introduce MetaPathFinder.discover and PathEntryFinder.discover #139900

FFY00 · 2025-10-10T13:09:31Z

Issue: Introduce MetaPathFinder.discover and PathEntryFinder.discover #139899

📚 Documentation preview 📚: https://cpython-previews--139900.org.readthedocs.build/

FFY00 · 2025-10-10T13:14:30Z

Still needs tests, but I'll wait to see the feedback on the issue.

FFY00 · 2025-12-10T12:51:58Z

I've updated the method to take a module spec, instead of a module object, as in some cases the parent might have failed to import.

In a follow-up, I will add a protocol for finders implementing .discover(), as @brettcannon suggested.

gpshead · 2025-12-13T21:14:32Z

Lib/importlib/_bootstrap_external.py

+        if parent is None:
+            path = sys.path
+        else:
+            path = parent.submodule_search_locations


I think this can be None when parent is a non-package module? Should we raise a nicer error message, or should this situation use sys.path?

How can it be a non-package module if it has a child? It could be a namespace package, but that's still a package, and parent.submodule_search_locations should be an iterable object when the spec is fully initialized, which it should always be at this point.

Unless I am missing something? Do we support package-like module extensions?

I think @gpshead is referring to the error case where parent.submodule_search_locations is None because the caller supplied a "parent" spec from a non-package module. Instead of the default "'NoneType' object is not iterable" message from the failed iteration attempt, we should raise something more specific.

gpshead · 2025-12-13T21:17:44Z

Lib/importlib/_bootstrap_external.py

        return path_hook_for_FileFinder

+    def _find_children(self):
+        for entry in _os.scandir(self.path):


use a with statement to explicitly close the iterator - https://docs.python.org/3/library/os.html#os.scandir.close

also consider handling exceptions similar to what _fill_cache does, which would also be needed around the is_dir and is_file bits within the loop below - https://docs.python.org/3/library/os.html#os.scandir.close

document the exception handling semantics of discover. I doubt callers ever expect to need to be prepared for those?
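A rough sketch of what the two suggestions above could look like together. The function name `iter_entries` is hypothetical, and treating every OSError as "skip this directory or entry" is an assumption; the PR may choose narrower exception types.

```python
import os


def iter_entries(path):
    """Yield (name, is_dir) pairs for *path*, skipping unreadable entries."""
    try:
        scanner = os.scandir(path)
    except OSError:
        # Missing or unreadable directory: treat it as empty (an assumption).
        return
    # The with statement closes the scandir iterator even if the caller
    # stops consuming this generator early.
    with scanner:
        for entry in scanner:
            try:
                is_dir = entry.is_dir()
            except OSError:
                # The entry vanished or cannot be inspected; skip it.
                continue
            yield entry.name, is_dir
```

Swallowing errors this way means callers see an empty or partial result rather than an exception, which is exactly the kind of behavior the documentation would need to spell out.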

gpshead · 2025-12-13T21:18:26Z

Lib/importlib/_bootstrap_external.py

        return path_hook_for_FileFinder

+    def _find_children(self):
+        for entry in _os.scandir(self.path):


should this be cached or not? we should document the cache interaction behavior either way regardless.

I don't think so? Because it could change at runtime, no?

Getting back to this, I don't think we should touch anything in the cache if we keep using os.scandir. Alternatively, we can rewrite _find_children to use the cache / os.listdir.

gpshead · 2025-12-13T21:26:05Z

Lib/importlib/_bootstrap_external.py

+            # files
+            if entry.is_file():
+                yield from [
+                    entry.name.removesuffix(suffix)


entry.name could exist with multiple loader suffixes on the filesystem. dedupe to avoid redundant specs?
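One possible way to dedupe, sketched under assumptions: `module_names` is a hypothetical helper that works on plain file names rather than on specs, and it remembers which names it has already produced.

```python
from importlib.machinery import all_suffixes


def module_names(filenames, suffixes=None):
    """Yield each candidate module name once, in first-seen order."""
    if suffixes is None:
        suffixes = all_suffixes()
    seen = set()
    for filename in filenames:
        for suffix in suffixes:
            if filename.endswith(suffix):
                name = filename.removesuffix(suffix)
                # Skip names already produced by another suffix.
                if name and name not in seen:
                    seen.add(name)
                    yield name
                # One file maps to at most one module name.
                break
```

For example, a directory holding both spam.py and spam.pyc would yield "spam" only once.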

gpshead · 2025-12-13T21:31:06Z

Doc/library/importlib.rst

+      module spec. If *parent* is *None*, :meth:`MetaPathFinder.discover` will
+      search for top-level modules.
+
+      Returns an iterable of possible specs.


would this be the first time we have a public API yielding things from importlib? do we actually want to do that, or should this return a list?

callers consuming the results might be writing code that makes changes that could impact future results... yielding could get messy.

what are the intended use cases for the API? if it's something we expect callers to short circuit and stop iterating on after the first match maybe yield makes sense, but then we should probably just have an explicit direct discover_first API for that instead.

This is something I considered.

The main use-case is finding a similar-named module to show as a hint on ModuleNotFoundError (eg. "Did you mean numpy?", when trying to import numby), so I think it would make sense to make this new API a generator, or at least some kind of lazy container.

For cases such as the one you describe, the user could just consume the full generator into a list to avoid any issue. That still leaves room for code that can benefit from this being a lazy API: scanning directories with a lot of files can take a while, not to mention the other exotic finders out there that may operate over the network or something like that.

I am not fundamentally opposed to making this method return a list, but I can't see the value in the trade-off if we document it properly.

callers consuming the results might be writing code that makes changes that could impact future results... yielding could get messy.

While this is technically possible, I would find it extremely uncommon. And I think it should be reasonable to assume that people who are knowledgeable enough to do that would probably be aware of the downsides of making changes to the import machinery while consuming the API.

but then we should probably just have an explicit direct discover_first API for that instead

And what would that look like? Would it take a predicate function and return the first entry that matches?

So, would having a warning in the documentation regarding your concern be a good enough compromise?

True, for the "nearest matching module name" error message case, you just want to keep the closest match, so even though all the names need to be checked, you only need to keep a reference to one of them (the closest so far).

Since the potential number of results may be absurdly high in some situations, and the iterable form provides more options for consumers to handle that appropriately, I do think it makes sense to make it an iterable. The warning in the docs should explain why it's an iterable, since consumers that unconditionally convert the result to a list can run into problems regardless.

As far as prior examples of iterable import APIs go, importlib.resources.contents was the first example I found in a quick look (covering a very similar situation, just for non-module resources rather than submodules).
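To illustrate the "keep only the closest so far" point, here is a small sketch that consumes a lazy iterable without ever building a list. The helper `closest_name` is hypothetical, it works on plain names rather than specs, and difflib's similarity ratio is used only as a stand-in for whatever metric the suggestion code ends up using.

```python
from difflib import SequenceMatcher


def closest_name(wanted, candidates):
    """Return the candidate most similar to *wanted*, or None if there are none."""
    best_name, best_ratio = None, 0.0
    for name in candidates:  # consumed lazily, one name at a time
        ratio = SequenceMatcher(None, wanted, name).ratio()
        if ratio > best_ratio:
            # Only the best match so far is kept in memory.
            best_name, best_ratio = name, ratio
    return best_name
```

Passing a generator as `candidates` keeps memory use constant no matter how many names a finder reports.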

gpshead · 2025-12-13T21:33:17Z

Lib/importlib/_bootstrap_external.py

+        else:
+            path = parent.submodule_search_locations
+
+        for entry in path:


path can contain duplicate entries. should we dedupe?

Yeah, sure 👍
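A minimal sketch of order-preserving deduplication for the search path, assuming the entries are hashable strings. The helper name `unique_entries` is hypothetical.

```python
def unique_entries(path):
    """Return the entries of *path* without duplicates, preserving order."""
    # dict keys keep insertion order, so the first occurrence wins.
    return list(dict.fromkeys(path))
```

Keeping the first occurrence matches how the import system resolves a name against the earliest matching path entry.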

ncoghlan

This is definitely looking promising, but I do agree with a couple of @gpshead's requested tidy ups.

ncoghlan · 2026-01-01T15:09:36Z

Lib/importlib/_bootstrap_external.py

+        if parent is None:
+            path = sys.path
+        else:
+            path = parent.submodule_search_locations


I think @gpshead is referring to the error case where parent.submodule_search_locations is None because the caller supplied a "parent" spec from a non-package module. Instead of the default "'NoneType' object is not iterable" message from the failed iteration attempt, we should raise something more specific.

Misc/NEWS.d/next/Library/2025-10-10-14-08-58.gh-issue-139899.09leRY.rst

FFY00 requested review from brettcannon, ericsnowcurrently, ncoghlan and warsaw as code owners October 10, 2025 13:09

bedevere-app bot mentioned this pull request Oct 10, 2025

Introduce MetaPathFinder.discover and PathEntryFinder.discover #139899

Open

bedevere-app bot added the awaiting core review label Oct 10, 2025

FFY00 mentioned this pull request Oct 10, 2025

Add import suggestions for ModuleNotFoundError #134872

Open

pythongh-139899: Introduce MetaPathFinder.discover and PathEntryFinde…

8099249

…r.discover Signed-off-by: Filipe Laíns <lains@riseup.net>

FFY00 force-pushed the gh-139899 branch from 29215a0 to 8099249 Compare December 10, 2025 12:45

FFY00 mentioned this pull request Dec 10, 2025

GH-134872: add ModuleNotFoundError suggestions #142512

Draft

FFY00 added 5 commits December 10, 2025 12:54

Fix doc reference

6816705

Signed-off-by: Filipe Laíns <lains@riseup.net>

Remove specific doc references

31d1a8f

Signed-off-by: Filipe Laíns <lains@riseup.net>

Fix docstrings

051cd1e

Signed-off-by: Filipe Laíns <lains@riseup.net>

Revert "Remove specific doc references"

c343d32

This reverts commit 31d1a8f. Signed-off-by: Filipe Laíns <lains@riseup.net>

Fix news references

a324d96

Signed-off-by: Filipe Laíns <lains@riseup.net>

gpshead reviewed Dec 13, 2025

View reviewed changes

ncoghlan reviewed Jan 1, 2026

View reviewed changes

FFY00 and others added 8 commits January 20, 2026 00:35

Add docs warning

dc6faa2

Signed-off-by: Filipe Laíns <lains@riseup.net>

Raise ValueError on invalid parent

3be757f

Signed-off-by: Filipe Laíns <lains@riseup.net>

Dedupe __path__ in PathFinder.discover

0da477f

Signed-off-by: Filipe Laíns <lains@riseup.net>

Use context manager and add error handling to os.scandir

282bef7

Signed-off-by: Filipe Laíns <lains@riseup.net>

Raise ValueError on invalid parent

469dc2a

Signed-off-by: Filipe Laíns <lains@riseup.net>

Dedupe when package exists with multiple suffixes

c09a12e

Signed-off-by: Filipe Laíns <lains@riseup.net>

Apply suggestions from code review

41cd071

Co-authored-by: Alyssa Coghlan <ncoghlan@gmail.com>

Add tests

5488e93

Signed-off-by: Filipe Laíns <lains@riseup.net>
