Python: Function local name binding from an outer scope

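I need a way to "inject" names into a function from an outer code block, so that they are accessible locally and do not need to be handled specifically by the function's code (defined as function parameters, loaded from *args, etc.).

The simplified scenario: providing a framework within which users can define (with as little syntax as possible) custom functions that manipulate other objects of the framework (which are not necessarily global).

Ideally, the user defines: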

def user_func():
    Mouse.eat(Cheese)
    if Cat.find(Mouse):
        Cat.happy += 1

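Here, Cat, Mouse and Cheese are framework objects that, for good reasons, cannot be bound to the global namespace.

I want to write a wrapper for this function so that it behaves like this: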

def framework_wrap(user_func):
    # this is a framework internal and has name bindings to Cat, Mouse and Cheese
    def f():
        # inject() is the hypothetical helper this question is asking for
        inject(user_func, {'Cat': Cat, 'Mouse': Mouse, 'Cheese': Cheese})
        user_func()
    return f

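This wrapper could then be applied to all user-defined functions (as a decorator, by the user himself or automatically, although I plan to use a metaclass).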

@framework_wrap
def user_func():
    ...
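The automatic route mentioned above (a metaclass applying the wrapper) could look roughly like the sketch below. `AutoWrap`, `UserScript`, `wrapped_names` and the stand-in `framework_wrap` (which only records what it wrapped and does no injection) are hypothetical names for illustration, not part of any real framework.

```python
import types
from functools import wraps

wrapped_names = []  # hypothetical bookkeeping, just to make the effect visible

def framework_wrap(func):
    """Stand-in wrapper: records the function name, then delegates."""
    wrapped_names.append(func.__name__)

    @wraps(func)
    def f(*args, **kwargs):
        return func(*args, **kwargs)
    return f

class AutoWrap(type):
    """Metaclass that applies framework_wrap to every plain function in a class body."""
    def __new__(mcls, name, bases, ns):
        for key, value in list(ns.items()):
            if isinstance(value, types.FunctionType) and not key.startswith("__"):
                ns[key] = framework_wrap(value)
        return super().__new__(mcls, name, bases, ns)

class UserScript(metaclass=AutoWrap):
    def user_func(self):
        return "ran user_func"

    def other(self):
        return "ran other"

print(UserScript().user_func())
print(wrapped_names)
```

The user writes only the class body; the wrapping happens once, at class creation time.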

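I am aware of Python 3's nonlocal keyword, but I still find it ugly (from the framework user's perspective) to have to add an additional line:

nonlocal Cat, Mouse, Cheese

and to have to worry about listing every object he needs on that line.

Any suggestions are greatly appreciated.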

Related discussion

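The more I mess around with the stack, the more I wish I hadn't. Don't hack globals to do what you want. Hack bytecode instead. I can think of two ways to do this.

1) Add cells wrapping the references that you want to f.func_closure. You have to reassemble the function's bytecode to use LOAD_DEREF instead of LOAD_GLOBAL and generate a cell for each value. You then pass a tuple of the cells and the new code object to types.FunctionType and get a function with the appropriate bindings. Different copies of the function can have different local bindings, so it should be as thread safe as you want to make it.

2) Add arguments for your new locals at the end of the function's argument list. Replace the appropriate occurrences of LOAD_GLOBAL with LOAD_FAST. Then construct a new function with types.FunctionType, passing in the new code object and a tuple of the bindings you want as the default values. This is limited in the sense that Python limits function arguments to 255, and it can't be used on functions that take variable arguments. Nonetheless, it struck me as the more challenging of the two, so that's the one I implemented (plus there is other stuff that can be done with this one). Again, you can either make different copies of the function with different bindings, or call the function with the bindings you want from each call location. So it too can be as thread safe as you want to make it.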
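The binding half of option 2 can be tried in isolation on Python 3 without touching any bytecode. The sketch below assumes the injected name is already declared as a trailing parameter, which is what the bytecode rewrite would arrange automatically. `bind_defaults`, this toy `user_func` and the string bindings are hypothetical stand-ins.

```python
import types

def user_func(x, Cat):
    # 'Cat' is an ordinary trailing parameter here; the bytecode rewrite
    # described above would add it without the user writing it.
    return "{0} sees {1}".format(x, Cat)

def bind_defaults(func, bindings):
    """Return a copy of func whose trailing parameters default to `bindings`."""
    return types.FunctionType(
        func.__code__, func.__globals__, func.__name__, tuple(bindings)
    )

# Two copies of the same code object, each with its own binding.
tom_version = bind_defaults(user_func, ("Tom",))
felix_version = bind_defaults(user_func, ("Felix",))

print(tom_version("Mouse"))
print(felix_version("Mouse"))
# A call site can still override the binding explicitly.
print(felix_version("Mouse", "Garfield"))
```

Because each copy carries its own defaults tuple, no shared state is modified when a binding changes.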

# Python 2 only: relies on the 3-byte instruction format (opcode + 16-bit
# argument) and on the Python 2 signature of types.CodeType.
import types
import opcode

# Opcode constants used for comparison and replacement
LOAD_FAST = opcode.opmap['LOAD_FAST']
LOAD_GLOBAL = opcode.opmap['LOAD_GLOBAL']
STORE_FAST = opcode.opmap['STORE_FAST']

DEBUGGING = True

def append_arguments(code_obj, new_locals):
    co_varnames = code_obj.co_varnames   # Old locals
    co_names = code_obj.co_names         # Old globals
    co_argcount = code_obj.co_argcount   # Argument count
    co_code = code_obj.co_code           # The actual bytecode as a string

    # Make one pass over the bytecode to identify names that should be
    # left in code_obj.co_names.
    not_removed = set(opcode.hasname) - set([LOAD_GLOBAL])
    saved_names = set()
    for inst in instructions(co_code):
        if inst[0] in not_removed:
            saved_names.add(co_names[inst[1]])

    # Build co_names for the new code object. This should consist of
    # globals that were only accessed via LOAD_GLOBAL
    names = tuple(name for name in co_names
                  if name not in set(new_locals) - saved_names)

    # Build a dictionary that maps the indices of the entries in co_names
    # to their entry in the new co_names
    name_translations = dict((co_names.index(name), i)
                             for i, name in enumerate(names))

    # Build co_varnames for the new code object. This should consist of
    # the entirety of co_varnames with new_locals spliced in after the
    # arguments
    new_locals_len = len(new_locals)
    varnames = (co_varnames[:co_argcount] + new_locals +
                co_varnames[co_argcount:])

    # Build the dictionary that maps indices of entries in the old co_varnames
    # to their indices in the new co_varnames
    range1, range2 = xrange(co_argcount), xrange(co_argcount, len(co_varnames))
    varname_translations = dict((i, i) for i in range1)
    varname_translations.update((i, i + new_locals_len) for i in range2)

    # Build the dictionary that maps indices of deleted entries of co_names
    # to their indices in the new co_varnames
    names_to_varnames = dict((co_names.index(name), varnames.index(name))
                             for name in new_locals)

    if DEBUGGING:
        print "injecting: {0}".format(new_locals)
        print "names: {0} -> {1}".format(co_names, names)
        print "varnames: {0} -> {1}".format(co_varnames, varnames)
        print "names_to_varnames: {0}".format(names_to_varnames)
        print "varname_translations: {0}".format(varname_translations)
        print "name_translations: {0}".format(name_translations)

    # Now we modify the actual bytecode
    modified = []
    for inst in instructions(code_obj.co_code):
        # If the instruction is a LOAD_GLOBAL, we have to check to see if
        # it's one of the globals that we are replacing. Either way,
        # update its arg using the appropriate dict.
        if inst[0] == LOAD_GLOBAL:
            if DEBUGGING:
                print "LOAD_GLOBAL: {0}".format(inst[1])
            if inst[1] in names_to_varnames:
                if DEBUGGING:
                    print "replacing with {0}:".format(names_to_varnames[inst[1]])
                inst[0] = LOAD_FAST
                inst[1] = names_to_varnames[inst[1]]
            elif inst[1] in name_translations:
                inst[1] = name_translations[inst[1]]
            else:
                raise ValueError("a name was lost in translation")
        # If it accesses co_varnames or co_names then update its argument.
        elif inst[0] in opcode.haslocal:
            inst[1] = varname_translations[inst[1]]
        elif inst[0] in opcode.hasname:
            inst[1] = name_translations[inst[1]]
        modified.extend(write_instruction(inst))

    code = ''.join(modified)
    # Done modifying codestring - make the code object

    return types.CodeType(co_argcount + new_locals_len,
                          code_obj.co_nlocals + new_locals_len,
                          code_obj.co_stacksize,
                          code_obj.co_flags,
                          code,
                          code_obj.co_consts,
                          names,
                          varnames,
                          code_obj.co_filename,
                          code_obj.co_name,
                          code_obj.co_firstlineno,
                          code_obj.co_lnotab)

def instructions(code):
    # Yield [opcode, argument] pairs; argument is None for opcodes without one.
    code = map(ord, code)
    i, L = 0, len(code)
    extended_arg = 0
    while i < L:
        op = code[i]
        i += 1
        if op < opcode.HAVE_ARGUMENT:
            yield [op, None]
            continue
        oparg = code[i] + (code[i+1] << 8) + extended_arg
        extended_arg = 0
        i += 2
        if op == opcode.EXTENDED_ARG:
            extended_arg = oparg << 16
            continue
        yield [op, oparg]

def write_instruction(inst):
    op, oparg = inst
    if oparg is None:
        return [chr(op)]
    elif oparg < 65536L:
        # Fits in 16 bits (strictly less than 2**16; 65536 itself does not fit)
        return [chr(op), chr(oparg & 255), chr((oparg >> 8) & 255)]
    elif oparg < 4294967296L:
        # Needs an EXTENDED_ARG prefix carrying the upper 16 bits
        return [chr(opcode.EXTENDED_ARG),
                chr((oparg >> 16) & 255),
                chr((oparg >> 24) & 255),
                chr(op),
                chr(oparg & 255),
                chr((oparg >> 8) & 255)]
    else:
        raise ValueError("Invalid oparg: {0} is too large".format(oparg))

if __name__=='__main__':
    import dis

    class Foo(object):
        y = 1

    z = 1
    def test(x):
        foo = Foo()
        foo.y = 1
        foo = x + y + z + foo.y
        print foo

    code_obj = append_arguments(test.func_code, ('y',))
    f = types.FunctionType(code_obj, test.func_globals, argdefs=(1,))
    if DEBUGGING:
        dis.dis(test)
        print '-'*20
        dis.dis(f)
    f(1)

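Note that one whole branch of this code (the one relating to EXTENDED_ARG) is untested, but for common cases it seems to be pretty solid. I'll keep hacking on it, and I'm currently writing some code to validate the output. Then (when I get around to it) I'll run it against the whole standard library and fix any bugs.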
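On Python 3, the standard `dis` module already decodes instruction arguments (including EXTENDED_ARG prefixes), so checking which names a function still loads as globals does not need a hand-written instruction reader. The sketch below is not the validation code mentioned above, just one possible check; `global_names_read` is a hypothetical helper name.

```python
import dis

def global_names_read(func):
    """Return, sorted, the names that func loads via LOAD_GLOBAL."""
    return sorted({
        ins.argval
        for ins in dis.get_instructions(func)
        if ins.opname == "LOAD_GLOBAL"
    })

def user_func():
    Mouse.eat(Cheese)
    if Cat.find(Mouse):
        Cat.happy += 1

# The function is never called, only inspected, so the names need not exist.
print(global_names_read(user_func))
```

Run before and after a rewrite, this shows whether the injected names have actually stopped being global lookups.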

I'll probably implement the first option as well.
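The cell-binding half of the first option can be sketched on Python 3.8+ without any bytecode rewriting, assuming the function's code already treats the name as a free variable. Here that is arranged by defining the function inside a throwaway enclosing scope; the rewrite from LOAD_GLOBAL to LOAD_DEREF is not shown. `make_template` and `bind_cells` are hypothetical names.

```python
import types

def make_template():
    Cat = None  # placeholder so that 'Cat' compiles as a free variable below
    def user_func():
        return "found " + Cat
    return user_func

template = make_template()

def bind_cells(func, **bindings):
    """Build a copy of func with a fresh closure cell for each free variable."""
    cells = tuple(
        types.CellType(bindings[name]) for name in func.__code__.co_freevars
    )
    return types.FunctionType(
        func.__code__, func.__globals__, func.__name__, func.__defaults__, cells
    )

tom_version = bind_cells(template, Cat="Tom")
felix_version = bind_cells(template, Cat="Felix")

print(template.__code__.co_freevars)
print(tom_version())
print(felix_version())
```

Each copy owns its cells, so rebinding one copy does not affect the other.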

Related discussion

Edited answer: restores the namespace dict after calling user_func()

Tested using Python 2.7.5 and 3.3.2

File framework.py:

# framework objects
class Cat: pass
class Mouse: pass
class Cheese: pass

_namespace = {'Cat': Cat, 'Mouse': Mouse, 'Cheese': Cheese}  # names to be injected

# framework decorator
from functools import wraps
def wrap(f):
    func_globals = f.func_globals if hasattr(f, 'func_globals') else f.__globals__
    @wraps(f)
    def wrapped(*args, **kwargs):
        # determine which names in framework's _namespace collide and don't
        preexistent = set(name for name in _namespace if name in func_globals)
        nonexistent = set(name for name in _namespace if name not in preexistent)
        # save any preexistent name's values
        f.globals_save = {name: func_globals[name] for name in preexistent}
        # temporarily inject framework's _namespace
        func_globals.update(_namespace)

        retval = f(*args, **kwargs)  # call function and save return value

        # clean up function's namespace
        for name in nonexistent:
            del func_globals[name]  # remove those that didn't exist
        # restore the values of any names that collided
        func_globals.update(f.globals_save)
        return retval

    return wrapped

Example usage:

from __future__ import print_function
import framework

class Cat: pass  # name that collides with framework object

@framework.wrap
def user_func():
    print('in user_func():')
    print(' Cat:', Cat)
    print(' Mouse:', Mouse)
    print(' Cheese:', Cheese)

user_func()

print()
print('after user_func():')
for name in framework._namespace:
    if name in globals():
        print(' {} restored to {}'.format(name, globals()[name]))
    else:
        print(' {} not restored, does not exist'.format(name))

Output:

in user_func():
 Cat: <class 'framework.Cat'>
 Mouse: <class 'framework.Mouse'>
 Cheese: <class 'framework.Cheese'>

after user_func():
 Cheese not restored, does not exist
 Mouse not restored, does not exist
 Cat restored to <class '__main__.Cat'>
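One gap worth noting in the decorator above: if the wrapped function raises, the cleanup code never runs and the injected names stay in the module globals. A possible Python 3 variant using try/finally is sketched below. `wrap_with` and `_MISSING` are hypothetical names, and a plain string stands in for the framework object.

```python
from functools import wraps

_MISSING = object()  # sentinel: the name did not exist before injection

def wrap_with(namespace):
    """Decorator factory: temporarily inject `namespace` into f's globals."""
    def decorator(f):
        func_globals = f.__globals__

        @wraps(f)
        def wrapped(*args, **kwargs):
            # Remember the previous value (or absence) of every injected name.
            saved = {name: func_globals.get(name, _MISSING) for name in namespace}
            func_globals.update(namespace)
            try:
                return f(*args, **kwargs)
            finally:
                # Runs on normal return and on exceptions alike.
                for name, old in saved.items():
                    if old is _MISSING:
                        func_globals.pop(name, None)
                    else:
                        func_globals[name] = old
        return wrapped
    return decorator

@wrap_with({"Cat": "framework cat"})
def user_func(fail=False):
    if fail:
        raise RuntimeError("boom")
    return Cat

print(user_func())
```

This still mutates shared module globals for the duration of the call, so it is no more thread safe than the original.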

Related discussion

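It sounds like you may want to use exec code in dict, where code is the user's function and dict is a dictionary you provide, which can:

- be pre-filled with references to the objects the user code should be able to use
- store any functions or variables declared by the user's code, for later use by your framework.

Docs for exec: http://docs.python.org/reference/simple_stmts.html (the exec statement)

However, I'm pretty sure this only works if the user's code comes in as a string and you need to exec it. If the function is already compiled, its global bindings are already set. So doing something like exec "user_func(*args)" in framework_dict won't work, because user_func's globals are already set to the module in which it was defined.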
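In Python 3, where exec is a function, the idea might look like the sketch below, assuming the user's code arrives as a string. `make_namespace`, `USER_SOURCE` and the toy classes are hypothetical stand-ins for the framework's objects.

```python
def make_namespace():
    """Build the dict of names the user code may use (toy framework objects)."""
    class Cheese:
        pass

    class Mouse:
        eaten = []
        @classmethod
        def eat(cls, food):
            cls.eaten.append(food)

    class Cat:
        happy = 0
        @classmethod
        def find(cls, prey):
            return bool(prey.eaten)

    return {"Cat": Cat, "Mouse": Mouse, "Cheese": Cheese}

USER_SOURCE = """
def user_func():
    Mouse.eat(Cheese)
    if Cat.find(Mouse):
        Cat.happy += 1
"""

namespace = make_namespace()
# Executing the *declaration* in the dict makes the dict the function's globals.
exec(compile(USER_SOURCE, "<user code>", "exec"), namespace)

user_func = namespace["user_func"]
user_func()
print(namespace["Cat"].happy)
print(user_func.__globals__ is namespace)
```

The last line shows the point made above: the binding is fixed when the def statement runs, not when the function is called.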

Since func_globals is read-only, I think you'll have to do something like what martineau suggests in order to modify the function's globals.
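A different technique from the one martineau's answer uses: although the globals attribute cannot be reassigned, a new function object can be built from the same code object with a different globals dict. A Python 3 sketch, where `rebind_globals` is a hypothetical helper and a string stands in for the framework object:

```python
import types

def rebind_globals(func, extra):
    """Return a copy of func that resolves globals in a copy extended with `extra`."""
    new_globals = dict(func.__globals__)  # snapshot; the original dict is untouched
    new_globals.update(extra)
    new_func = types.FunctionType(
        func.__code__, new_globals, func.__name__,
        func.__defaults__, func.__closure__,
    )
    new_func.__kwdefaults__ = func.__kwdefaults__
    new_func.__dict__.update(func.__dict__)
    return new_func

def user_func(arg1, arg2):
    return "{0} got {1}".format(Cat, arg1 + arg2)

bound = rebind_globals(user_func, {"Cat": "framework cat"})
print(bound(1, 2))
```

Because the new globals dict is a snapshot, later changes to the defining module's globals are not visible to the rebound copy; that is a limitation of this sketch. Wrapping a code object in a function this way is also one way to call it with arguments from pure Python.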

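I think (unless you're doing something unprecedentedly awesome, or I'm missing some critical subtlety) you would be better off putting your framework objects into a module and having the user code import that module. Once the module has been imported, its variables can readily be reassigned, mutated or accessed by code defined outside that module.

I think this would also be better for code readability, because user_func would end up with explicit namespacing for Cat, Dog, etc., instead of leaving readers unfamiliar with your framework to wonder where those names came from. For example, animal_farm.Mouse.eat(animal_farm.Cheese), or perhaps lines like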

from animal_farm import Goat
cheese = make_cheese(Goat().milk())
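To make the module approach concrete, here is a self-contained sketch that builds a stand-in animal_farm module in memory so it can run as a single file; in a real framework it would be an ordinary .py file. The bodies of `Goat`, `make_cheese` and `SoyGoat` are invented for illustration.

```python
import sys
import types

# --- framework side: a stand-in for animal_farm.py, built in memory ---
animal_farm = types.ModuleType("animal_farm")

class Goat:
    def milk(self):
        return "goat milk"

def make_cheese(milk):
    return "cheese made from " + milk

animal_farm.Goat = Goat
animal_farm.make_cheese = make_cheese
sys.modules["animal_farm"] = animal_farm

# --- user side: explicit namespacing, no injection needed ---
import animal_farm

def user_func():
    return animal_farm.make_cheese(animal_farm.Goat().milk())

print(user_func())

# The framework can rebind a module attribute later; user code sees the change.
class SoyGoat:
    def milk(self):
        return "soy milk"

animal_farm.Goat = SoyGoat
print(user_func())
```

Note that a name pulled in with from animal_farm import Goat is copied at import time and would not follow such a later rebinding.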

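If you really are doing something unprecedentedly awesome, I think you'll need to use the C API to pass arguments to a code object. It looks like PyEval_EvalCodeEx is the function you want.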

Related discussion

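I like how clean your approach is. However, there are a couple of issues: to avoid the additional code compilation, I would like to exec user_func.func_code (the code object), but I couldn't find any way to pass additional parameters to the user_func call (if the function definition requires them). Another potential issue is handling globals in some cases, but that is not really a problem for now.
If you add code to dict, you can then exec "code(parameters)" in dict.
But of course, in that case you don't avoid the extra compilation, my bad. But if you have performance(?) concerns about compiling a simple function call, a (mostly) interpreted language is not the best choice.
Hmm.. do you mean you want to pass some parameters to a function declared under a known name in the user code? You can call it from the resulting dictionary: take the dict that was passed to exec, look up the user function's name in it, and call the result with (*args, **kwargs).
Also: if it is more appropriate, you can load the user code into a module; see here for details.
If I have a reference to a function, I know how to call it with parameters. What I don't know is how to invoke a function (or a func_code object) via exec and pass it additional parameters (other than Zooba's string-eval solution).
That's what I'm saying. Use exec to run the code that contains the function declaration, then call the declared function from your main code after the exec. Alternatively, pass the arguments into the exec block kwarg-style by storing them under dictionary keys. Both approaches are a bit clumsy, since they rely on a convention of giving the function or variables particular names. If you want to avoid that, and you know that only one function is declared in the exec block, you can inspect the dict's values looking for a callable instead of looking up a specific key.
Maybe I'm missing something... if you could post a code example that you want to parameterize this way, I might understand better. For the second code sample in the question, it's just a matter of translating your inject directly into exec and rearranging the syntax appropriately.
Oh, wait, you want to pass arguments to an already-declared wrapped function. Sorry, I think I missed the point of the question the first time around.
If the user defines userf(arg1, arg2), then I want something like exec userf.func_code with_args (arg1, arg2) in dic. I know I can render a string s = "userf(arg1, arg2)" and then exec s in dic, but that requires additional compilation and... gives me an unpleasant feeling :)
Hmm, this turned out to be interesting. I'm pretty sure you can't pass arguments to a code object with straight Python code; you would need to use the C API. I think the function you'd want to call is PyEval_EvalCodeEx. exec calls that function indirectly, via the simplified C function PyEval_EvalCode, which does not take arguments for the code object as parameters. I looked through Python's C source a bit, and it appears that no Python-level function passes args to a code object.
I think it would almost certainly be better to organize the framework differently. For example, putting the various animals into a module and having the user import that module to access them would probably accomplish what you want in a cleaner way. You would then end up with user functions that do framework.Cat.meow("rrowr") or whatever.
Also, I don't think the exec approach does what you need in this case, because to get the user function's globals bound to your dictionary you have to exec the function declaration, not a function call. I thought there might be some way to change which module the function thinks it is in, which could have the effect of rebinding its globals, and that might work. But the function object's func_globals attribute is read-only, so probably not.
I think I was originally under the impression that the user code was coming in as a string, in which case you could exec it in a dictionary to determine its bindings.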
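The suggestion from the comments (exec the declaration, then find and call the single declared callable) could be sketched in Python 3 as follows. `load_user_function` and `USER_SOURCE` are hypothetical names, and the sketch assumes the user source declares exactly one callable.

```python
def load_user_function(source, framework_names):
    """Exec `source` with framework_names available; return the one callable it declares."""
    namespace = dict(framework_names)
    exec(source, namespace)
    declared = [
        value for name, value in namespace.items()
        if name not in framework_names   # skip the pre-filled framework objects
        and name != "__builtins__"       # key inserted by exec itself
        and callable(value)
    ]
    if len(declared) != 1:
        raise ValueError(
            "expected exactly one callable, found {0}".format(len(declared))
        )
    return declared[0]

USER_SOURCE = """
def userf(arg1, arg2):
    return Cat + " got " + str(arg1 + arg2)
"""

userf = load_user_function(USER_SOURCE, {"Cat": "Tom"})
# Arguments are passed with an ordinary call; no string rendering is needed.
print(userf(1, 2))
```

Filtering out the pre-filled names matters because framework objects such as classes are callable too.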

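If your application is strictly Python 3, I don't see how using Python 3's nonlocal is any worse than writing a decorator to manipulate a function's local namespace. I say give the nonlocal solution a try, or rethink this strategy.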
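For context on what the nonlocal route involves: the declaration only works when the user function is defined inside the scope that holds the names, and it is only required when a name is rebound. Reading a name, or mutating an object through it as Cat.happy += 1 does, needs no declaration. A minimal sketch, with `framework_scope` and `calls` as made-up names:

```python
def framework_scope():
    class Cat:
        happy = 0

    calls = 0

    def user_func():
        nonlocal calls      # required: 'calls' itself is rebound below
        Cat.happy += 1      # no declaration needed: only an attribute changes
        calls += 1
        return Cat.happy, calls

    return user_func

user_func = framework_scope()
print(user_func())
print(user_func())
```

So for the function in the question, which only reads names and mutates attributes, the extra nonlocal line would not even be needed if the function could be defined inside the framework's scope.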

Related discussion