Is there a faster way to convert an arbitrarily large integer to a big-endian sequence of bytes?
I have this Python code to do it:
```python
# Python 2 code (note the long literal with the L suffix).
from struct import pack as _pack

def packl(lnum, pad = 1):
    if lnum < 0:
        # RangeError is not a Python built-in; ValueError is used instead.
        raise ValueError("Cannot use packl to convert a negative integer "
                         "to a string.")
    count = 0
    l = []
    # Split the number into 64-bit chunks, least significant first.
    while lnum > 0:
        l.append(lnum & 0xffffffffffffffffL)
        count += 1
        lnum >>= 64
    if count <= 0:
        return '\0' * pad
    elif pad >= 8:
        lens = 8 * count % pad
        pad = ((lens != 0) and (pad - lens)) or 0
        l.append('>' + 'x' * pad + 'Q' * count)
        l.reverse()
        return _pack(*l)
    else:
        l.append('>' + 'Q' * count)
        l.reverse()
        s = _pack(*l).lstrip('\0')
        lens = len(s)
        if (lens % pad) != 0:
            return '\0' * (pad - lens % pad) + s
        else:
            return s
```
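On my machine, [...]
Is there anything I can do to speed this up? This will be used to convert large prime numbers used in cryptography, as well as some (but not many) smaller numbers.
Edit
This is currently the fastest option in Python < 3.2; it takes about half the time of the accepted answer: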
```python
import binascii

def packl(lnum, padmultiple=1):
    """Packs the lnum (which must be convertible to a long) into a
       byte string 0 padded to a multiple of padmultiple bytes in size. 0
       means no padding whatsoever, so that packing 0 results in an empty
       string.  The resulting byte string is the big-endian two's
       complement representation of the passed in long."""

    if lnum == 0:
        return b'\0' * padmultiple
    elif lnum < 0:
        raise ValueError("Can only convert non-negative numbers.")
    s = hex(lnum)[2:]
    s = s.rstrip('L')  # Python 2 appends 'L' to the hex form of a long
    if len(s) & 1:
        s = '0' + s    # unhexlify needs an even number of hex digits
    s = binascii.unhexlify(s)
    if (padmultiple != 1) and (padmultiple != 0):
        filled_so_far = len(s) % padmultiple
        if filled_so_far != 0:
            s = b'\0' * (padmultiple - filled_so_far) + s
    return s

def unpackl(bytestr):
    """Treats a byte string as a sequence of base 256 digits
    representing an unsigned integer in big-endian format and converts
    that representation into a Python integer."""

    return int(binascii.hexlify(bytestr), 16) if len(bytestr) > 0 else 0
```
In Python 3.2, [...]
Here is a solution via ctypes, using NumPy for the output buffer:
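```python
import numpy
import ctypes

# _PyLong_AsByteArray is a private CPython function. The arguments used here
# are: the int, the output buffer, its size, little_endian, is_signed.
PyLong_AsByteArray = ctypes.pythonapi._PyLong_AsByteArray
PyLong_AsByteArray.argtypes = [ctypes.py_object,
                               numpy.ctypeslib.ndpointer(numpy.uint8),
                               ctypes.c_size_t,
                               ctypes.c_int,
                               ctypes.c_int]

def packl_ctypes_numpy(lnum):
    # bit_length()//8 + 1 bytes always leaves room for the sign bit.
    a = numpy.zeros(lnum.bit_length()//8 + 1, dtype=numpy.uint8)
    PyLong_AsByteArray(lnum, a, a.size, 0, 1)  # big-endian, signed
    return a
```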
On my machine, this is 15 times faster than your approach.
Edit: Here is the same code using only ctypes:
```python
import ctypes

# Same private CPython function as above, with a plain ctypes buffer.
PyLong_AsByteArray = ctypes.pythonapi._PyLong_AsByteArray
PyLong_AsByteArray.argtypes = [ctypes.py_object,
                               ctypes.c_char_p,
                               ctypes.c_size_t,
                               ctypes.c_int,
                               ctypes.c_int]

def packl_ctypes(lnum):
    a = ctypes.create_string_buffer(lnum.bit_length()//8 + 1)
    PyLong_AsByteArray(lnum, a, len(a), 0, 1)  # big-endian, signed
    return a.raw
```
This is another factor of two faster, for a total speed-up factor of 30 on my machine.
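The speed-ups quoted in this thread depend on the machine and the Python version, so it may be worth measuring them yourself. Here is a minimal sketch of such a measurement, assuming Python 3. The helper names `pack_hex`, `pack_to_bytes` and `compare` are made up for illustration and do not come from any of the answers.

```python
import binascii
import timeit

def pack_hex(n):
    """Big-endian bytes of a non-negative int via the hex/unhexlify route."""
    s = '%x' % n
    if len(s) & 1:
        s = '0' + s  # unhexlify needs an even number of hex digits
    return binascii.unhexlify(s)

def pack_to_bytes(n):
    """Same result via int.to_bytes (Python 3.2+); zero packs to one byte."""
    return n.to_bytes((n.bit_length() + 7) // 8 or 1, 'big')

def compare(n, number=1000):
    """Return the seconds each packer needs for `number` calls on n."""
    return {f.__name__: timeit.timeit(lambda: f(n), number=number)
            for f in (pack_hex, pack_to_bytes)}

if __name__ == '__main__':
    # A 2048-bit value, roughly the size used in cryptographic code.
    print(compare(2**2048 - 1))
```

The absolute numbers matter less than the ratio between the two entries, and that ratio can shift with the size of the integer.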
For completeness and for future readers:
As of Python 3.2, there are the functions [...]
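The function names are missing from the sentence above. Presumably it refers to `int.to_bytes` and `int.from_bytes`, which were added in Python 3.2. Under that assumption, a sketch of the whole conversion could look like this. The names `pack_int` and `unpack_int` are hypothetical, and `padmultiple` is assumed to be at least 1.

```python
def pack_int(n, padmultiple=1):
    """Big-endian bytes of a non-negative int, left-padded with zero bytes
    to a multiple of padmultiple (assumed >= 1)."""
    if n < 0:
        raise ValueError("Can only convert non-negative numbers.")
    # Minimal byte count; zero still takes one byte.
    length = max((n.bit_length() + 7) // 8, 1)
    # Round up to the next multiple of padmultiple.
    length += -length % padmultiple
    return n.to_bytes(length, 'big')

def unpack_int(data):
    """Inverse of pack_int: read bytes as an unsigned big-endian integer."""
    return int.from_bytes(data, 'big')

print(pack_int(1, 4))                  # b'\x00\x00\x00\x01'
print(unpack_int(pack_int(2**64, 8)))  # 18446744073709551616
```

Because the byte count is computed up front, no stripping or re-padding of the result is needed.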
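Just wanted to post a follow-up to Sven's answer (which works great). The opposite operation, going from an arbitrarily long bytes object to a Python integer object, requires the following (because there is no PyLong_FromByteArray() C API function that I can find):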
```python
import binascii

def unpack_bytes(stringbytes):
    #binascii.hexlify will be obsolete in python3 soon
    #They will add a .tohex() method to bytes class
    #Issue 3532 bugs.python.org
    return int(binascii.hexlify(stringbytes), 16)
```
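The method anticipated in that comment ended up being called `bytes.hex()` and is available from Python 3.5. A small sketch of the same conversion using it, assuming Python 3.5 or later (the name `unpack_bytes_hex` is made up here):

```python
def unpack_bytes_hex(data):
    """Read bytes as an unsigned big-endian integer via their hex form."""
    # int('', 16) raises ValueError, so empty input is treated as zero.
    return int(data.hex(), 16) if data else 0

print(unpack_bytes_hex(b'\x01\x00'))  # 256
```

On Python 3.2 and later, `int.from_bytes(data, 'big')` gives the same result without going through a hex string.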
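I think you really should just be using numpy, which I'm sure has something built in for this. Using [...]
IMX, creating a generator and using a list comprehension and/or built-in summation is faster than a loop that appends to a list, because the appending can be done internally. Oh, and 'lstrip' on a large string has got to be costly.
Also, some style points: special cases aren't special enough; and you appear not to have gotten the memo about the new [...]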
```python
from struct import pack as _pack

Q_size = 64
Q_bitmask = (1 << Q_size) - 1

def quads_gen(a_long):
    # Yield 64-bit chunks, least significant first.
    while a_long:
        yield a_long & Q_bitmask
        a_long >>= Q_size

def pack_long_big_endian(a_long, pad = 1):
    if a_long < 0:
        raise ValueError("Cannot use packl to convert a negative integer "
                         "to a string.")
    # reversed() needs a sequence, so build the list first.
    qs = list(quads_gen(a_long))
    qs.reverse()
    if not qs:
        # Zero yields no chunks; match the original packl behaviour.
        return b'\x00' * pad
    # Pack the first one separately so we can lstrip nicely.
    first = _pack('>Q', qs[0]).lstrip(b'\x00')
    rest = _pack('>%dQ' % (len(qs) - 1), *qs[1:])
    count = len(first) + len(rest)
    # A little math trick that depends on Python's behaviour of modulus
    # for negative numbers - but it's well-defined and documented
    return b'\x00' * (-count % pad) + first + rest
```
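The modulus trick in the last line is easy to misread, so here it is on its own. In Python, `%` with a positive right operand always returns a non-negative result, which makes `-count % pad` the number of bytes needed to round `count` up to the next multiple of `pad`. The helper names below are hypothetical and only serve the illustration.

```python
def pad_length(count, pad):
    """Zero bytes needed to round count up to a multiple of pad (pad >= 1)."""
    return -count % pad

def left_pad(data, pad):
    """Left-pad data with zero bytes to a multiple of pad bytes."""
    return b'\x00' * pad_length(len(data), pad) + data

print(pad_length(5, 4))                # 3
print(pad_length(8, 4))                # 0
print(left_pad(b'\x01\x02\x03', 2))    # b'\x00\x01\x02\x03'
```

This avoids the separate "is it already a multiple" branch that the code in the question needs.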