Python: Is it possible to access inner functions and classes via code objects?

Say there's a function func:

def func():
    class a:
        def method(self):
            return 'method'
    def a(): return 'function'
    lambda x: 'lambda'

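that I need to examine.

As part of the examination, I want to "retrieve" the source code or the objects of all nested classes and functions (if any). I do realize that they don't exist yet, and that there is no direct or clean way to access them without running func or defining them outside (before) func. Unfortunately, the most I can do is import a module containing func to obtain the func function object.

I discovered that functions have a __code__ attribute containing the code object, which in turn has a co_consts attribute, so I wrote this: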

In [11]: [x for x in func.__code__.co_consts if iscode(x) and x.co_name == 'a']
Out[11]:
[<code object a at 0x7fe246aa9810, file "<ipython-input-6-31c52097eb5f>", line 2>,
 <code object a at 0x7fe246aa9030, file "<ipython-input-6-31c52097eb5f>", line 4>]

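Those code objects look awfully similar, and I don't think they contain the data needed to distinguish between the types of objects they represent (e.g. type and function).

Q1: Am I right?

Q2: Is there any way to access classes/functions (ordinary and lambdas) defined within the function body?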


A1: Things that can help you

Constants of the code object

From the documentation:

If a code object represents a function, the first item in co_consts is
the documentation string of the function, or None if undefined.

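In addition, if a code object represents a class, the first item of co_consts is always the qualified name of that class. You can try to use this information.

The following solution will work correctly in most cases, but you'll have to skip the code objects Python creates for list/set/dict comprehensions and generator expressions: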

from inspect import iscode

for x in func.__code__.co_consts:
    if iscode(x):
        # Skip <setcomp>, <dictcomp>, <listcomp> or <genexp>
        if x.co_name.startswith('<') and x.co_name != '<lambda>':
            continue
        firstconst = x.co_consts[0]
        # Compute the qualified name for the current code object
        # Note that we don't know its "type" yet
        qualname = '{func_name}.<locals>.{code_name}'.format(
                        func_name=func.__name__, code_name=x.co_name)
        if firstconst is None or firstconst != qualname:
            print(x, 'represents a function {!r}'.format(x.co_name))
        else:
            print(x, 'represents a class {!r}'.format(x.co_name))

This prints:

<code object a at 0x7fd149d1a9c0, file "<ipython-input>", line 2> represents a class 'a'
<code object a at 0x7fd149d1ab70, file "<ipython-input>", line 5> represents a function 'a'
<code object <lambda> at 0x7fd149d1aae0, file "<ipython-input>", line 6> represents a function '<lambda>'
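The loop above only looks one level deep, so a method defined inside a nested class is not reported. A minimal sketch of a recursive walk over co_consts follows; iter_nested_code and sample are hypothetical names introduced for illustration, not part of the original answer.

```python
from inspect import iscode


def iter_nested_code(code, depth=1):
    """Yield (depth, code_object) for every code object nested in `code`."""
    for const in code.co_consts:
        if iscode(const):
            yield depth, const
            # Nested code objects keep their own children in co_consts
            yield from iter_nested_code(const, depth + 1)


def sample():
    class Inner:
        def method(self):
            return 'method'

    def helper():
        return 'function'

    return Inner, helper


# Print an indented tree of the nested code object names
for depth, code in iter_nested_code(sample.__code__):
    print('  ' * depth + code.co_name)
```

The same skip rule for comprehensions and generator expressions can be applied inside the loop if those entries are not wanted.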

Code flags

There's a way to get the required information from co_flags. Quoting the documentation mentioned above:

The following flag bits are defined for co_flags: bit 0x04 is set if
the function uses the *arguments syntax to accept an arbitrary number
of positional arguments; bit 0x08 is set if the function uses the
**keywords syntax to accept arbitrary keyword arguments; bit 0x20 is set if the function is a generator.

Other bits in co_flags are reserved for internal use.
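As a small illustration of the documented bits, the sketch below tests them through the named constants in the inspect module rather than raw hex values. describe_flags and the three sample functions are hypothetical names used only for this example.

```python
import inspect


def describe_flags(func):
    """Return the names of the documented co_flags bits set on func's code."""
    flags = func.__code__.co_flags
    names = []
    if flags & inspect.CO_VARARGS:       # 0x04: accepts *args
        names.append('CO_VARARGS')
    if flags & inspect.CO_VARKEYWORDS:   # 0x08: accepts **kwargs
        names.append('CO_VARKEYWORDS')
    if flags & inspect.CO_GENERATOR:     # 0x20: generator function
        names.append('CO_GENERATOR')
    return names


def plain(a, b):
    return a + b


def variadic(*args, **kwargs):
    return args, kwargs


def counter(n):
    for i in range(n):
        yield i


print(describe_flags(plain))     # []
print(describe_flags(variadic))  # ['CO_VARARGS', 'CO_VARKEYWORDS']
print(describe_flags(counter))   # ['CO_GENERATOR']
```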

Flags are manipulated in compute_code_flags (Python/compile.c):

static int
compute_code_flags(struct compiler *c)
{
    PySTEntryObject *ste = c->u->u_ste;
    ...
    if (ste->ste_type == FunctionBlock) {
        flags |= CO_NEWLOCALS | CO_OPTIMIZED;
        if (ste->ste_nested)
            flags |= CO_NESTED;
        if (ste->ste_generator)
            flags |= CO_GENERATOR;
        if (ste->ste_varargs)
            flags |= CO_VARARGS;
        if (ste->ste_varkeywords)
            flags |= CO_VARKEYWORDS;
    }

    /* (Only) inherit compilerflags in PyCF_MASK */
    flags |= (c->c_flags->cf_flags & PyCF_MASK);

    n = PyDict_Size(c->u->u_freevars);
    ...
    if (n == 0) {
        n = PyDict_Size(c->u->u_cellvars);
        ...
        if (n == 0) {
            flags |= CO_NOFREE;
        }
    }
    ...
}

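There are two code flags (CO_NEWLOCALS and CO_OPTIMIZED) that won't be set for classes. You can use them to check the type (which doesn't mean you should: poorly documented implementation details may change in the future):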

from inspect import iscode

for x in func.__code__.co_consts:
    if iscode(x):
        # Skip <setcomp>, <dictcomp>, <listcomp> or <genexp>
        if x.co_name.startswith('<') and x.co_name != '<lambda>':
            continue
        flags = x.co_flags
        # CO_OPTIMIZED = 0x0001, CO_NEWLOCALS = 0x0002
        if flags & 0x0001 and flags & 0x0002:
            print(x, 'represents a function {!r}'.format(x.co_name))
        else:
            print(x, 'represents a class {!r}'.format(x.co_name))

The output is exactly the same.
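The same check can be written with the named constants from the inspect module instead of the literal values 0x0001 and 0x0002. This is only a sketch of the flag test described above and relies on the same CPython implementation detail; kind_of and outer are hypothetical names.

```python
import inspect
from inspect import iscode

# Both bits are set for function code objects, per compute_code_flags
FUNCTION_FLAGS = inspect.CO_OPTIMIZED | inspect.CO_NEWLOCALS


def kind_of(code):
    """Guess whether a nested code object is a function or a class body."""
    if (code.co_flags & FUNCTION_FLAGS) == FUNCTION_FLAGS:
        return 'function'
    return 'class'


def outer():
    class Inner:
        pass

    def helper():
        return 'function'

    square = lambda x: x * x
    return Inner, helper, square


kinds = {c.co_name: kind_of(c) for c in outer.__code__.co_consts if iscode(c)}
print(kinds)
```

Comprehension and generator-expression code objects would be reported as functions here, so the name-based skip from the loops above still applies.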

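Bytecode of the outer function

It's also possible to get the object type by inspecting the bytecode of the outer function.

Search the bytecode instructions for blocks containing LOAD_BUILD_CLASS, which signals the creation of a class (LOAD_BUILD_CLASS pushes builtins.__build_class__() onto the stack; it is later called to construct the class).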

from dis import Bytecode
from inspect import iscode
from itertools import groupby

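def _group(i):
    # Remember the most recent instruction that starts a source line.
    # starts_line is a line number (or None) in older Pythons and a bool
    # from 3.13 on, so a truthiness test covers both.
    if i.starts_line:
        _group.starts = i
    return _group.starts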

bytecode = Bytecode(func)

for _, iset in groupby(bytecode, _group):
    iset = list(iset)
    try:
        code = next(arg.argval for arg in iset if iscode(arg.argval))
        # Skip <setcomp>, <dictcomp>, <listcomp> or <genexp>
        if code.co_name.startswith('<') and code.co_name != '<lambda>':
            raise TypeError
    except (StopIteration, TypeError):
        continue
    else:
        # LOAD_BUILD_CLASS on the same source line means a class is being built
        if any(x.opname == 'LOAD_BUILD_CLASS' for x in iset):
            print(code, 'represents a class {!r}'.format(code.co_name))
        else:
            print(code, 'represents a function {!r}'.format(code.co_name))

The output is the same (again).
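An alternative to grouping instructions by source line is a single sequential scan: the class body is the first code object loaded after LOAD_BUILD_CLASS. The sketch below assumes that ordering holds for ordinary class statements in CPython; classify_nested and outer are hypothetical names, and this is not the original author's method.

```python
import dis
from inspect import iscode


def classify_nested(func):
    """Return [(name, kind)] for code objects loaded directly by func."""
    kinds = []
    pending_class = False
    for ins in dis.get_instructions(func):
        if ins.opname == 'LOAD_BUILD_CLASS':
            # The next code object loaded is assumed to be the class body
            pending_class = True
        elif iscode(ins.argval):
            name = ins.argval.co_name
            # Skip <setcomp>, <dictcomp>, <listcomp> or <genexp>
            if name.startswith('<') and name != '<lambda>':
                continue
            kinds.append((name, 'class' if pending_class else 'function'))
            pending_class = False
    return kinds


def outer():
    class Inner:
        pass

    def helper():
        return 'function'

    square = lambda x: x * x
    return Inner, helper, square


print(classify_nested(outer))
```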

A2: Sure.

Source code

To get the source code for code objects, you can use inspect.getsource or an equivalent:

from inspect import iscode, ismethod, getsource
from textwrap import dedent


def nested_sources(ob):
    if ismethod(ob):
        ob = ob.__func__
    try:
        code = ob.__code__
    except AttributeError:
        raise TypeError('Can\'t inspect {!r}'.format(ob)) from None
    for c in code.co_consts:
        if not iscode(c):
            continue
        name = c.co_name
        # Skip <setcomp>, <dictcomp>, <listcomp> or <genexp>
        if not name.startswith('<') or name == '<lambda>':
            yield dedent(getsource(c))

For example, nested_sources(complex_func) (see below)

import abc  # needed by class c below


def complex_func():
    lambda x: 42

    def decorator(cls):
        return lambda: cls()

    @decorator
    class b():
        def method():
            pass

    class c(int, metaclass=abc.ABCMeta):
        def method():
            pass

    {x for x in ()}
    {x: x for x in ()}
    [x for x in ()]
    (x for x in ())

must yield source code for the first lambda, decorator, b (including @decorator), and c:

In [41]: nested_sources(complex_func)
Out[41]: <generator object nested_sources at 0x7fd380781d58>

In [42]: for source in _:
   ....:     print(source, end='=' * 30 + '\n')
   ....:    
lambda x: 42
==============================
def decorator(cls):
    return lambda: cls()
==============================
@decorator
class b():
    def method():
        pass
==============================
class c(int, metaclass=abc.ABCMeta):
    def method():
        pass
==============================

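Function and type objects

If you still need a function/class object, you can eval/exec the source code.

Example (sources here is assumed to be list(nested_sources(complex_func)))

  • for lambda functions: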

    In [39]: source = sources[0]

    In [40]: eval(source, func.__globals__)
    Out[40]: <function __main__.<lambda>>
  • for regular functions:

    In [21]: source, local = sources[1], {}

    In [22]: exec(source, func.__globals__, local)

    In [23]: local.popitem()[1]
    Out[23]: <function __main__.decorator>
  • for classes:

    In [24]: source, local = sources[3], {}

    In [25]: exec(source, func.__globals__, local)

    In [26]: local.popitem()[1]
    Out[26]: __main__.c
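The exec pattern above can be wrapped in a small helper that looks the object up by name instead of relying on popitem(). This is a minimal sketch: build_from_source and the two source strings are hypothetical, and it assumes the source does not depend on local variables of the enclosing function. Note that exec runs whatever code the source contains.

```python
def build_from_source(source, name, globals_=None):
    """Execute `source` and return the object it binds to `name`."""
    namespace = {}
    # Definitions made by the source land in `namespace` (the locals dict)
    exec(source, {} if globals_ is None else globals_, namespace)
    return namespace[name]


function_source = "def decorator(cls):\n    return lambda: cls()\n"
class_source = "class c(int):\n    def method(self):\n        return 'method'\n"

decorator = build_from_source(function_source, 'decorator')
c = build_from_source(class_source, 'c')

print(decorator(c)().method())  # method
```

In the session above, func.__globals__ would be passed as globals_ so that module-level names used by the source resolve.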

Disassemble the x object. x can denote either a module, a class, a method, a function, a generator, an asynchronous generator, a coroutine, a code object, a string of source code or a byte sequence of raw bytecode. For a module, it disassembles all functions. For a class, it disassembles all methods (including class and static methods). For a code object or sequence of raw bytecode, it prints one line per bytecode instruction. It also recursively disassembles nested code objects (the code of comprehensions, generator expressions and nested functions, and the code used for building nested classes). Strings are first compiled to code objects with the compile() built-in function before being disassembled. If no object is provided, this function disassembles the last traceback.

The disassembly is written as text to the supplied file argument if provided and to sys.stdout otherwise.

The maximal depth of recursion is limited by depth unless it is None. depth=0 means no recursion.

Changed in version 3.4: Added file parameter.

Changed in version 3.7: Implemented recursive disassembling and added depth parameter.

Changed in version 3.7: This can now handle coroutine and asynchronous generator objects.

https://docs.python.org/3/library/dis.html#dis.dis
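A short sketch of the file and depth parameters described in the quoted documentation (Python 3.7 or later). The disassemble helper and the outer function are hypothetical names for this example.

```python
import dis
import io


def outer():
    def inner():
        return 1
    return inner


def disassemble(obj, depth=None):
    """Return the text dis.dis would print, limited to `depth` levels."""
    buffer = io.StringIO()
    dis.dis(obj, file=buffer, depth=depth)
    return buffer.getvalue()


shallow = disassemble(outer, depth=0)  # outer only, no recursion
full = disassemble(outer)              # also disassembles inner

# Each nested code object gets its own "Disassembly of ..." header
print('Disassembly of' in shallow)  # False
print('Disassembly of' in full)     # True
```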