Getting docstrings from python’s ast is far simpler and more reliable than any method of regex or brute force searching. It’s also much less intimidating than I originally thought.
First you need to load in some python code as a string, and parse it with ast.parse. This gives you a tree like object, like an html dom.
py_file = Path("plugins/auto_publish.py") raw_tree = py_file.read_text() tree = ast.parse(raw_tree)
Getting the Docstring #
You can then use ast.get_docstring to get the docstring of the node you are currently looking at. In the case of freshly loading in a file, this will be the module level doctring that is at the very top of a file.
module_docstring = ast.get_docstring(tree)