Writing to numpy array from dictionary
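I have a dictionary of file header values (time, number of frames, year, month, etc.) that I would like to write into a numpy array. The code I currently have is as follows: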

    arr=np.array([(k,)+v for k,v in fileheader.iteritems()],dtype=["a3,a,i4,i4,i4,i4,f8,i4,i4,i4,i4,i4,i4,a10,a26,a33,a235,i4,i4,i4,i4,i4,i4"])

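But I get an error: 'can only concatenate tuple (not "int") to tuple'.
Basically, the end result needs to be arrays storing the overall file header information (512 bytes) and each frame's data (header and data, 49408 bytes per frame). Is there an easier way to do this?
Edit: To clarify (for myself as well), I need to write the data from each frame of the file into an array. I was given MATLAB code as a base. Here is a rough idea of the code given to me: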

    data.frame=zeros([512 96])
    frame=uint8(fread(fid,[data.numbeams,512],'uint8'))
    data.frame=frame

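How do I translate the "frame" part into Python?
You're probably better off keeping the header data in a dict. Do you really need it as an array? (If so, why? There are some advantages to having the header in a numpy array, but it is more complex than a simple dict.)
Unless there is a good reason to put it into a numpy array, you may not want to.
A structured array will preserve the order of the header and make it easier to write a binary representation of it to disk, but it is inflexible in other ways.
If you do want to make the header an array, you could do something like this: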

    import numpy as np

    # Lists can be modified, but preserve order. That's important in this case.
    names = ['Name1', 'Name2', 'Name3']
    # It's "S3" instead of "a3" for a string field in numpy, by the way
    formats = ['S3', 'i4', 'f8']

    # It's often cleaner to specify the dtype this way instead of as a giant string
    dtype = dict(names=names, formats=formats)

    # Don't rely on the dict's iteration order here: before Python 3.7
    # a plain dict may yield its keys in any order.
    header = dict(Name1='abc', Name2=456, Name3=3.45)

    # Therefore, we'll be sure to pass things in in order...
    # Also, np.array will expect a tuple instead of a list for a structured array...
    values = tuple(header[name] for name in names)
    header_array = np.array(values, dtype=dtype)

    # We can access a field in the array like this...
    print(header_array['Name2'])

    # And dump it to disk (similar to a C struct) with
    header_array.tofile('test.dat')

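As a rough check of the "similar to a C struct" point, the sketch below writes a header record to disk and reads it back with `np.fromfile`. The field names and values are the same placeholders used above, and the temporary file path is an assumption made only for this illustration.

```python
import os
import tempfile

import numpy as np

names = ['Name1', 'Name2', 'Name3']
formats = ['S3', 'i4', 'f8']
dtype = np.dtype(dict(names=names, formats=formats))

# One record, stored as a one-element structured array
header_array = np.array([(b'abc', 456, 3.45)], dtype=dtype)

# Hypothetical scratch location, used only for this demonstration
path = os.path.join(tempfile.mkdtemp(), 'test.dat')
header_array.tofile(path)

# The fields are packed back to back: 3 + 4 + 8 bytes per record
file_size = os.path.getsize(path)

# Reading with the same dtype recovers the record
restored = np.fromfile(path, dtype=dtype, count=1)
```

Note that `tofile` writes only the raw bytes. The dtype itself is not stored in the file, so the reader has to know the same field layout.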
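On the other hand, if you just want access to the values in the header, just keep it as a dict.
Based on what it sounds like you're doing, I'd do something like this. I'm using numpy arrays to read in the header, but the header values are actually stored as class attributes (as well as in the header array).
This looks more complicated than it actually is.
I'm just defining two new classes, one for the parent file and one for a frame. You could do the same thing with a bit less code, but this gives you a foundation for more complex things.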

    import numpy as np

    class SonarFile(object):
        # These define the format of the file header
        header_fields = ('num_frames', 'name1', 'name2', 'name3')
        # '>u4' is a big-endian unsigned 32-bit integer
        header_formats = ('i4', 'f4', 'S10', '>u4')

        def __init__(self, filename):
            # Open in binary mode: the file holds raw bytes, not text
            self.infile = open(filename, 'rb')
            dtype = dict(names=self.header_fields, formats=self.header_formats)

            # Read in the header as a numpy array (count=1 is important here!)
            self.header = np.fromfile(self.infile, dtype=dtype, count=1)

            # Store the position so we can "rewind" to the end of the header
            self.header_length = self.infile.tell()

            # You may or may not want to do this (if the field names can have
            # spaces, it's a bad idea). It will allow you to access things with
            # sonar_file.name1 instead of sonar_file.header['name1'], though.
            # The [0] pulls the scalar out of the one-element header array.
            for field in self.header_fields:
                setattr(self, field, self.header[field][0])

        # __iter__ is a special function that defines what should happen when we
        # try to iterate through an instance of this class.
        def __iter__(self):
            """Iterate through each frame in the dataset."""
            # Rewind to the end of the file header
            self.infile.seek(self.header_length)

            # Iterate through frames...
            for _ in range(self.num_frames):
                yield Frame(self.infile)

        def close(self):
            self.infile.close()

    class Frame(object):
        header_fields = ('width', 'height', 'name')
        header_formats = ('i4', 'i4', 'S20')
        data_format = 'f4'

        def __init__(self, infile):
            dtype = dict(names=self.header_fields, formats=self.header_formats)
            self.header = np.fromfile(infile, dtype=dtype, count=1)

            # See discussion above...
            for field in self.header_fields:
                setattr(self, field, self.header[field][0])

            # I'm assuming that the size of the frame is in the frame header...
            ncols, nrows = self.width, self.height

            # Read the data in
            self.data = np.fromfile(infile, self.data_format, count=ncols * nrows)

            # And reshape it into a 2d array.
            # I'm assuming C-order, instead of Fortran order.
            # If it's Fortran order, just do "data.reshape((ncols, nrows)).T"
            self.data = self.data.reshape((nrows, ncols))

You could use it like this:

    dataset = SonarFile('input.dat')

    for frame in dataset:
        im = frame.data
        # Do something...

    dataset.close()

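For the MATLAB snippet in the question specifically, a minimal sketch of an equivalent read might look like the following. It assumes the frame data is stored as `uint8` and that the array should be filled column by column, as MATLAB's `fread` does with a `[rows, cols]` size argument. The helper name `read_frame` and its `nsamples` parameter are made up for this illustration.

```python
import io

import numpy as np

def read_frame(fid, numbeams, nsamples=512):
    """Read one block of uint8 samples from an open binary file.

    Roughly mirrors MATLAB's fread(fid, [numbeams, nsamples], 'uint8').
    """
    nbytes = numbeams * nsamples
    raw = fid.read(nbytes)
    if len(raw) != nbytes:
        raise EOFError("file ended before a full frame was read")
    # MATLAB fills matrices column by column, so reshape in Fortran order.
    # np.frombuffer returns a read-only view; call .copy() to modify it.
    return np.frombuffer(raw, dtype=np.uint8).reshape((numbeams, nsamples), order='F')

# Tiny in-memory stand-in for a file: six bytes with values 0..5
fake_file = io.BytesIO(bytes(range(6)))
frame = read_frame(fake_file, numbeams=2, nsamples=3)
```

With a real file, `fid` would be the object returned by `open(filename, 'rb')`, positioned just past the frame header.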
The problem seems to be that v is an int rather than a tuple. Try:

    arr=np.array([(k,v) for k,v in fileheader.iteritems()],dtype=["a3,a,i4,i4,i4,i4,f8,i4,i4,i4,i4,i4,i4,a10,a26,a33,a235,i4,i4,i4,i4,i4,i4"])

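To see where the error comes from, here is a small sketch with a made-up `fileheader` (the keys and values are hypothetical, and it uses Python 3's `.items()` rather than `.iteritems()`). It also shows one way to build a single structured record from the dict by listing the field names in a fixed order, which is an assumption about what the header array should look like.

```python
import numpy as np

# Hypothetical header values, for illustration only
fileheader = {'nframes': 120, 'year': 2011, 'month': 7}

# (k,) is a tuple but v is an int, so "+" cannot concatenate them
try:
    rows = [(k,) + v for k, v in fileheader.items()]
except TypeError as exc:
    error_message = str(exc)

# Wrapping both in one tuple gives (key, value) pairs instead
pairs = sorted((k, v) for k, v in fileheader.items())

# For one record with a field per header value, fix the field order
# explicitly and pull the values out in that order
names = ['nframes', 'year', 'month']
record = np.array(tuple(fileheader[name] for name in names),
                  dtype=dict(names=names, formats=['i4', 'i4', 'i4']))
```

Note that a list of `(key, value)` pairs has two items per row, so it would only line up with a two-field dtype. A dtype with one field per header value needs a single tuple holding all the values, as in `record` above.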