An efficient Pythonic way of mapping values?
I have some values that need to be remapped, because there are two ways of specifying a rule. The simple approach:
```python
if mod == "I":
    mod = "+"
elif mod == "E":
    mod = "-"
elif mod == "D":
    mod = ":"
elif mod == "M":
    mod = "."
```
Not a very efficient way of mapping:
```python
mod = {"I": "+", "E": "-", "D": ":", "M": "."}.get(mod, mod)
```
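The mapping gives you an O(1) lookup, while the conditional chain is O(n) in the number of branches. Of course, if you want to get really nitpicky about it, the mapping version also involves an extra function call that you need to take into account.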
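One detail about the dict one-liner in the question: written as a literal inside the expression, the dict is rebuilt every time the line runs. A minimal sketch, assuming Python 3 and the hypothetical names `MOD_MAP` and `remap` (neither appears in the original), that builds the table once and keeps the fall-through behaviour of `.get(mod, mod)`:

```python
# Hypothetical module-level table: built once at import time instead of
# on every evaluation of the expression.
MOD_MAP = {"I": "+", "E": "-", "D": ":", "M": "."}


def remap(mod):
    """Return the mapped symbol, or `mod` unchanged if it has no mapping."""
    return MOD_MAP.get(mod, mod)


print([remap(m) for m in "IEDMX"])  # ['+', '-', ':', '.', 'X']
```

Whether hoisting the table matters in practice depends on how hot the call site is; it is worth measuring before relying on it.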
Rather than theorize, I ran a quick benchmark. Results (removing …):
```
         2 function calls in 0.025 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.025    0.025    0.025    0.025 {range}


         4 function calls in 0.035 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        2    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        2    0.035    0.018    0.035    0.018 {range}
```
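Admittedly, this only uses the cases given. My assumption is that the running time of the conditional grows linearly with the number of cases, while the mapping lookup stays constant.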
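That assumption can be checked without a timer by counting equality comparisons instead. The sketch below assumes Python 3; `CountingKey`, `chain_lookup` and `count_comparisons` are hypothetical helpers made up for this illustration, and the linear scan only emulates an if/elif chain rather than being one:

```python
class CountingKey:
    """Hypothetical key wrapper that counts equality comparisons."""
    comparisons = 0

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return hash(self.name)

    def __eq__(self, other):
        CountingKey.comparisons += 1
        return self.name == other.name


def chain_lookup(pairs, key):
    # Emulates an if/elif chain: test each case in order until one matches.
    for candidate, value in pairs:
        if key == candidate:
            return value
    return key


def count_comparisons(n):
    """Return (chain comparisons, dict comparisons) for n cases."""
    names = ["case%d" % i for i in range(n)]
    pairs = [(CountingKey(name), i) for i, name in enumerate(names)]
    table = dict(pairs)
    probe = CountingKey(names[-1])  # worst case for the chain: last branch

    CountingKey.comparisons = 0
    chain_lookup(pairs, probe)
    chain_count = CountingKey.comparisons

    CountingKey.comparisons = 0
    table.get(probe, probe)
    dict_count = CountingKey.comparisons
    return chain_count, dict_count


for n in (4, 40, 400):
    print(n, count_comparisons(n))
```

When the match is the last branch, the chain count equals the number of cases, while the dict count should stay small and roughly flat. This measures comparisons only, so it says nothing about constant factors such as hashing or call overhead.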
Keep in mind that a multi-level …
The moral of the story is:
Compare:
```
import dis

def f(mod):
    if mod == "I": return "+"
    elif mod == "E": return "-"
    elif mod == "D": return ":"
    elif mod == "M": return "."

dis.dis(f)
  4           0 LOAD_FAST                0 (mod)
              3 LOAD_CONST               1 ('I')
              6 COMPARE_OP               2 (==)
              9 POP_JUMP_IF_FALSE       16
             12 LOAD_CONST               2 ('+')
             15 RETURN_VALUE

  5     >>   16 LOAD_FAST                0 (mod)
             19 LOAD_CONST               3 ('E')
             22 COMPARE_OP               2 (==)
             25 POP_JUMP_IF_FALSE       32
             28 LOAD_CONST               4 ('-')
             31 RETURN_VALUE

  6     >>   32 LOAD_FAST                0 (mod)
             35 LOAD_CONST               5 ('D')
             38 COMPARE_OP               2 (==)
             41 POP_JUMP_IF_FALSE       48
             44 LOAD_CONST               6 (':')
             47 RETURN_VALUE

  7     >>   48 LOAD_FAST                0 (mod)
             51 LOAD_CONST               7 ('M')
             54 COMPARE_OP               2 (==)
             57 POP_JUMP_IF_FALSE       64
             60 LOAD_CONST               8 ('.')
             63 RETURN_VALUE
        >>   64 LOAD_CONST               0 (None)
             67 RETURN_VALUE

d = {"I": "+", "E": "-", "D": ":", "M": "."}

def g(mod):
    return d.get(mod, mod)

dis.dis(g)
 12           0 LOAD_GLOBAL              0 (d)
              3 LOAD_ATTR                1 (get)
              6 LOAD_FAST                0 (mod)
              9 LOAD_FAST                0 (mod)
             12 CALL_FUNCTION            2
             15 RETURN_VALUE
```
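The listing above comes from Python 2, and opcode names and offsets differ between interpreter versions. To repeat the comparison on a current interpreter, a rough sketch (assuming Python 3; the names `chain`, `lookup`, `MAPPING` and `instruction_count` are illustrative and not from the original) is to count instructions rather than read them:

```python
import dis

MAPPING = {"I": "+", "E": "-", "D": ":", "M": "."}


def chain(mod):
    # Mirrors f above: falls off the end (returns None) when nothing matches.
    if mod == "I":
        return "+"
    elif mod == "E":
        return "-"
    elif mod == "D":
        return ":"
    elif mod == "M":
        return "."


def lookup(mod):
    # Mirrors g above: unknown values are returned unchanged.
    return MAPPING.get(mod, mod)


def instruction_count(func):
    # dis.get_instructions yields one Instruction per bytecode operation.
    return sum(1 for _ in dis.get_instructions(func))


print("chain :", instruction_count(chain))
print("lookup:", instruction_count(lookup))
```

The exact counts depend on the interpreter version, and a shorter bytecode sequence is not automatically faster, since the cost of each instruction (the method call in particular) is not uniform.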
If you have a fixed and small number of cases, then …
If the set of cases is dynamic, then …
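Additionally, a third, unorthodox approach would be macro metaprogramming. This is not possible with plain Python, but there are libraries, e.g. https://github.com/lihaoyi/macropy, that let you do this in an (arguably) clean way (most likely not approved by the Python community or Guido).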
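On the other hand, the macro approach probably would not work in Python the way it does in Lisp, because most macro implementations for Python try to stick to the syntax of plain Python; that is, generating … from a macro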
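This is not really an answer, but a lot of the comments focused on performance, since I did ask about it. So I ran some performance tests on the answers: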
```python
# Python 2 script (print statement, xrange, string.maketrans).
from datetime import datetime
from string import maketrans

tr_table = maketrans('IEDM', '+-:.')
dictionary = {"I": "+", "E": "-", "D": ":", "M": "."}
if_else_val = "E"
N_OPS = 100000

now = datetime.now


def time(func):
    s = now()
    func()
    # timedelta.microseconds is in microseconds, so the "ms" label below
    # actually reports microseconds.
    print "%s took %d ms for %d operations" % (func.__name__, (now() - s).microseconds, N_OPS)


def translation_table():
    for i in xrange(N_OPS):
        "I".translate(tr_table)
        "E".translate(tr_table)
        "D".translate(tr_table)
        "M".translate(tr_table)


def dict_lookup():
    for i in xrange(N_OPS):
        dictionary.get("I")
        dictionary.get("E")
        dictionary.get("D")
        dictionary.get("M")


def if_else():
    # Only one value ("E") is tested per iteration here, versus four
    # lookups per iteration in the other two functions.
    for i in xrange(N_OPS):
        if if_else_val == "I":
            pass
        elif if_else_val == "E":
            pass
        elif if_else_val == "D":
            pass
        elif if_else_val == "M":
            pass


time(if_else)
time(translation_table)
time(dict_lookup)
```
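The results are as follows (the script prints timedelta.microseconds, so the values labelled "ms" are actually microseconds):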
```
if_else took 12474 ms for 100000 operations
translation_table took 81650 ms for 100000 operations
dict_lookup took 66385 ms for 100000 operations
```
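These numbers compare slightly different workloads: the if/else loop tests a single value per iteration, while the other two perform four lookups. A possible way to put all three on the same footing is sketched below, assuming Python 3 and the standard `timeit` module; the per-value functions and the `benchmark` helper are hypothetical rewrites, not the script used for the figures above.

```python
import timeit

MAPPING = {"I": "+", "E": "-", "D": ":", "M": "."}
TABLE = str.maketrans("IEDM", "+-:.")
INPUTS = ("I", "E", "D", "M")


def if_else(mod):
    if mod == "I":
        return "+"
    elif mod == "E":
        return "-"
    elif mod == "D":
        return ":"
    elif mod == "M":
        return "."
    return mod


def dict_lookup(mod):
    return MAPPING.get(mod, mod)


def translation_table(mod):
    return mod.translate(TABLE)


def benchmark(number=100000):
    """Return the best-of-three time in seconds for each candidate."""
    results = {}
    for func in (if_else, dict_lookup, translation_table):
        def run(func=func):
            # Every candidate sees the same four inputs per iteration.
            for mod in INPUTS:
                func(mod)
        results[func.__name__] = min(timeit.repeat(run, number=number, repeat=3))
    return results
```

Calling `benchmark()` returns a dict that maps each function name to its best time in seconds. Each candidate goes through the same wrapper, so the function-call overhead is shared; the absolute values will vary by machine and interpreter, so only the relative ordering on a given setup is meaningful.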
Strings have a translate method available. You have to build a translation table that is 256 characters long. Here is the snippet I used:
```python
translationTable = ' ' * 256
translationTable = translationTable[:68] + ':' + translationTable[69:]  # D to :
translationTable = translationTable[:69] + '-' + translationTable[70:]  # E to -
translationTable = translationTable[:73] + '+' + translationTable[74:]  # I to +
translationTable = translationTable[:77] + '.' + translationTable[78:]  # M to .
print 'EIDM'.translate(translationTable)
```
Output:

```
-+:.
```
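The 256-character table is a Python 2 idiom, and because it starts out as all spaces it also turns every unmapped character into a space. For comparison, a small sketch assuming Python 3, where `str.maketrans` builds the table from two equal-length strings and unmapped characters pass through unchanged (the variable name `table` is just illustrative):

```python
# Python 3: build the translation table from two equal-length strings.
table = str.maketrans("IEDM", "+-:.")

print("EIDM".translate(table))  # -+:.
print("X".translate(table))     # X (unmapped characters are left as-is)
```

That pass-through behaviour matches the `.get(mod, mod)` fallback of the dict version, at least for single-character values.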