How to get a list of classes and functions from a Python file without importing it

I have a Python file with some classes and functions defined in it:

class A(object):
    def __init__(self, an_arg, a_default_arg=None):
        pass

def doStuff(an_other_arg, an_other_default_arg=None):
    pass

I want to get a list of all classes and functions in this file. Their names and parameter definitions are enough.

I know you can do this with __import__(module_descriptor) and inspect, but that is not an option because the file I'm scanning comes from an untrusted source.

My first reaction was to try to create a safe environment to import the file in, but that seems to be impossible according to other Stack Overflow questions.

You can use the ast module to parse the source file, without actually executing any code. Then you can traverse the node tree to get the function and class names/parameters.

import ast

def show_info(function_node):
    # Print a function's name and the names of its positional parameters.
    print("Function name:", function_node.name)
    print("Args:")
    for arg in function_node.args.args:
        print("\tParameter name:", arg.arg)


filename = "untrusted.py"
with open(filename) as file:
    node = ast.parse(file.read())

functions = [n for n in node.body if isinstance(n, ast.FunctionDef)]
classes = [n for n in node.body if isinstance(n, ast.ClassDef)]

for function in functions:
    show_info(function)

for class_ in classes:
    print("Class name:", class_.name)
    methods = [n for n in class_.body if isinstance(n, ast.FunctionDef)]
    for method in methods:
        show_info(method)

Result:

Function name: doStuff
Args:
        Parameter name: an_other_arg
        Parameter name: an_other_default_arg
Class name: A
Function name: __init__
Args:
        Parameter name: self
        Parameter name: an_arg
        Parameter name: a_default_arg
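
The question also asks for parameter definitions, while the listing above prints parameter names only. Here is a minimal sketch of one way to render full signatures, defaults included. It assumes Python 3.9 or later, where ast.unparse is available; the helper name signatures and the inline SOURCE string are illustrative and not part of the answer above.

```python
import ast

# Sample source taken from the question; it is parsed, never executed.
SOURCE = '''
class A(object):
    def __init__(self, an_arg, a_default_arg=None):
        pass

def doStuff(an_other_arg, an_other_default_arg=None):
    pass
'''

FUNCTION_TYPES = (ast.FunctionDef, ast.AsyncFunctionDef)


def signatures(source):
    """Return 'name(params)' strings for top-level functions and methods."""
    found = []
    for node in ast.parse(source).body:
        if isinstance(node, FUNCTION_TYPES):
            # ast.unparse (Python 3.9+) turns the arguments node back into text.
            found.append(f"{node.name}({ast.unparse(node.args)})")
        elif isinstance(node, ast.ClassDef):
            for item in node.body:
                if isinstance(item, FUNCTION_TYPES):
                    found.append(f"{node.name}.{item.name}({ast.unparse(item.args)})")
    return found


for signature in signatures(SOURCE):
    print(signature)
```

Because the defaults are turned back into source text rather than evaluated, an expression used as a default value in the untrusted file is never run.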

Nothing short of actually executing the file can give you a 100% accurate answer to this question. There are just too many ways in Python to dynamically affect the namespace: importing names from elsewhere, conditionally executing definitions, manipulating the namespace directly by modifying its __dict__, etc.
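
To make that limitation concrete, here is a small sketch with a made-up module text (the names plain, only_on_windows and made_at_runtime are hypothetical). A static scan sees only def statements. It cannot tell whether the conditional definition would actually run, and it does not report the imported name or the one injected through globals() as functions at all.

```python
import ast

# Hypothetical module text; parsed only, never executed.
SOURCE = '''
import sys
from os.path import join

def plain():
    pass

if sys.platform == "win32":
    def only_on_windows():
        pass

globals()["made_at_runtime"] = lambda: None
'''

tree = ast.parse(SOURCE)

# Definitions that sit directly in the module body.
top_level_defs = [n.name for n in tree.body if isinstance(n, ast.FunctionDef)]

# Definitions found anywhere in the tree, including inside the `if` block.
all_defs = sorted(n.name for n in ast.walk(tree) if isinstance(n, ast.FunctionDef))

print(top_level_defs)
print(all_defs)
```

After a real import, join and made_at_runtime would both be callable attributes of the module, yet neither shows up in either list.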

If you can live with only the static definitions, Python's built-in ast (Abstract Syntax Tree) module is probably the simplest solution. You can safely parse the file into an AST, then walk its top level looking for def and class statements. (In the case of classes, you'd then walk the class body looking for a def __init__. Don't forget the possibility that a class has no __init__ of its own, but just inherits one from a superclass!)
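
A minimal sketch of that class-body lookup, including the inherited-constructor case. The helper name describe_class and the Base/Child sample are assumptions made for illustration, and ast.unparse requires Python 3.9 or later.

```python
import ast

# Hypothetical sample: Child defines no __init__ of its own.
SOURCE = '''
class Base(object):
    def __init__(self, an_arg, a_default_arg=None):
        pass

class Child(Base):
    def helper(self):
        pass
'''


def describe_class(class_node):
    """Return (own __init__ parameter names or None, base class expressions)."""
    bases = [ast.unparse(base) for base in class_node.bases]  # Python 3.9+
    for item in class_node.body:
        if isinstance(item, ast.FunctionDef) and item.name == "__init__":
            return [arg.arg for arg in item.args.args], bases
    # No __init__ in this class body: the constructor is inherited.
    return None, bases


classes = {
    node.name: describe_class(node)
    for node in ast.parse(SOURCE).body
    if isinstance(node, ast.ClassDef)
}
for name, (params, bases) in classes.items():
    print(name, params, bases)
```

When the result is None, the base class expressions are the only static hint about where the constructor comes from. If the base lives in another module, its parameters cannot be recovered from this file alone.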

The accepted solution is incomplete. Consider the following file:

def regular_function():
    def nested_function():
        pass

async def async_function():
    pass

The accepted solution will only print:

Function name: regular_function
Args:

To get all functions, we need to make two changes:

  1. Walk the entire AST, rather than just top level nodes
  2. Handle async functions as well as regular functions

Here is the corrected code, for finding functions:

import ast

from pathlib import Path

# Path(__file__) parses this script itself; substitute the path of the file to scan.
parsed_ast = ast.parse(Path(__file__).read_text())

functions = [
    node
    for node in ast.walk(parsed_ast)
    if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
]

for function in functions:
    print(f"Function name: {function.name}")
    print(f"Args: {', '.join([arg.arg for arg in function.args.args])}")

Note that this is pushing up against the bounds of what a plain AST walk should be used for. For anything more complicated, consider using ast.NodeVisitor.
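
As a rough sketch of the NodeVisitor approach, the class below records each definition together with the definitions that enclose it, which a flat ast.walk cannot do. The class name DefinitionCollector and its attributes are illustrative choices, not an established API.

```python
import ast

# The example file from above; parsed only, never executed.
SOURCE = '''
def regular_function():
    def nested_function():
        pass

async def async_function():
    pass
'''


class DefinitionCollector(ast.NodeVisitor):
    """Collect dotted names of functions and classes, tracking nesting."""

    def __init__(self):
        self.scope = []   # names of the enclosing definitions
        self.names = []   # dotted names in source order

    def _record(self, node):
        self.scope.append(node.name)
        self.names.append(".".join(self.scope))
        self.generic_visit(node)  # descend into the body
        self.scope.pop()

    # NodeVisitor dispatches on the node class name: visit_<ClassName>.
    visit_FunctionDef = _record
    visit_AsyncFunctionDef = _record
    visit_ClassDef = _record


collector = DefinitionCollector()
collector.visit(ast.parse(SOURCE))
print(collector.names)
```

The dotted names here are a simplification: Python's own __qualname__ would insert <locals> between a function and anything nested inside it.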
