Altering the format of a list of strings
I have to analyze earthquake data, and before I can start the analysis I need to change the format in which the data is listed. I have to go from this format:

```
14km WSW of Willow, Alaska$2.4
4km NNW of The Geysers, California$0.9
13km ESE of Coalinga, California$2.1
...
```

to:

```
["2.4, 14km WSW of Willow, Alaska","0.9, 4km NNW of The Geysers, California",
"2.1, 13km ESE of Coalinga, California", ...]
```
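The code I have for the original format (URL omitted) is:

```python
def fileToList(url):
    alist = []
    source = urllib2.urlopen(url)
    for line in source:
        items = line.strip()
        alist.append(items)
    return alist
```

I am trying to create the variables magnitude and earthquakeloc to rearrange the format of alist, but I don't know where to start. I am very new to coding. Any suggestions would be great, thanks.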
It seems you are just trying to reorder how each string is formatted, so if you have your initial data in a multi-line string like this:

```python
earthquake_data = """14km WSW of Willow, Alaska$2.4
4km NNW of The Geysers, California$0.9
13km ESE of Coalinga, California$2.1"""
```
Then you can split it on newlines to get a list of strings:

```python
lines = earthquake_data.split('\n')
# ['14km WSW of Willow, Alaska$2.4', '4km NNW of The Geysers, California$0.9',
#  '13km ESE of Coalinga, California$2.1']
```
Split each item in that list on the `$` sign, which leaves you with a list of lists like this:

```python
split_lines = [l.split('$') for l in lines]
# [['14km WSW of Willow, Alaska', '2.4'], ['4km NNW of The Geysers, California', '0.9'], ['13km ESE of Coalinga, California', '2.1']]
```
Then, in a list comprehension, use the `str.join()` string method on each item to rejoin each of these lists into a string, with the magnitude first:

```python
reformatted_data = [", ".join([l[1], l[0]]) for l in split_lines]
# ['2.4, 14km WSW of Willow, Alaska', '0.9, 4km NNW of The Geysers, California', '2.1, 13km ESE of Coalinga, California']
```
Here it is all wrapped up in a function:

```python
def reformatStrings(data):
    lines = data.split("\n")
    split_lines = [l.split('$') for l in lines]
    reformatted_data = [", ".join([l[1], l[0]]) for l in split_lines]

    return reformatted_data


print(reformatStrings(earthquake_data))
```
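Data read from a file or URL often ends with a trailing newline or contains blank lines, and in that case indexing `l[1]` raises an `IndexError`. The following is a minimal sketch of a more defensive variant, assuming the magnitude always follows the last `$` on a line. The name `reformat_lines` is made up for illustration and is not part of the answer above.

```python
def reformat_lines(data):
    """Turn 'location$magnitude' lines into 'magnitude, location' strings."""
    result = []
    for line in data.splitlines():
        line = line.strip()
        if "$" not in line:
            # Skip blank or malformed lines instead of raising IndexError.
            continue
        # rsplit with maxsplit=1 splits only on the last '$'.
        location, magnitude = line.rsplit("$", 1)
        result.append("{}, {}".format(magnitude, location))
    return result


# Sample with a blank line in the middle and a trailing newline.
sample = "14km WSW of Willow, Alaska$2.4\n\n4km NNW of The Geysers, California$0.9\n"
print(reformat_lines(sample))
```

Whether to skip malformed lines silently or report them depends on how much you trust the feed; skipping is just the simplest choice for a sketch.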
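Assuming your data looks like this:

```
14km WSW of Willow, Alaska$2.4
4km NNW of The Geysers, California$0.9
13km ESE of Coalinga, California$2.1
```

then in the simplest case you could use:

```python
import urllib2  # Python 2 only


def fileToList(url=''):  # URL omitted, as in the question
    # read() returns the whole response as one string, which can then be split
    source = urllib2.urlopen(url).read()
    return [', '.join(l.split('$')[::-1]) for l in source.split('\n') if l.strip()]


print(fileToList())
```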
The output is:

```
['2.4, 14km WSW of Willow, Alaska', '0.9, 4km NNW of The Geysers, California', '2.1, 13km ESE of Coalinga, California']
```
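Note that `urllib2` exists only on Python 2. On Python 3 the same idea might look like the sketch below, which assumes the feed is UTF-8 text. The names `lines_to_list` and `file_to_list` are illustrative, and the parsing is kept separate from the download so it can be tried on any iterable of byte lines without network access.

```python
from urllib.request import urlopen


def lines_to_list(raw_lines, encoding="utf-8"):
    """Reformat an iterable of 'location$magnitude' byte lines."""
    result = []
    for raw in raw_lines:
        line = raw.decode(encoding).strip()
        if line:
            # Split on '$', reverse the two parts, and join with ', '.
            result.append(", ".join(line.split("$")[::-1]))
    return result


def file_to_list(url):
    """Download the feed at `url` and reformat each non-empty line."""
    with urlopen(url) as source:  # the HTTP response iterates over byte lines
        return lines_to_list(source)


# Offline check, with a list of bytes standing in for the HTTP response:
fake_response = [b"14km WSW of Willow, Alaska$2.4\n", b"\n",
                 b"13km ESE of Coalinga, California$2.1\n"]
print(lines_to_list(fake_response))
```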
Hint:

```python
>>> a = "14km WSW of Willow, Alaska$2.4"
>>> a = a.split("$")   # split the string on `$`
>>> a
['14km WSW of Willow, Alaska', '2.4']
>>> a = a[::-1]        # reverse the list
>>> a
['2.4', '14km WSW of Willow, Alaska']
>>> ",".join(a)        # join on `,`
'2.4,14km WSW of Willow, Alaska'
```
As a one-liner (with `a` being the original string again):

```python
>>> ",".join(a.split("$")[::-1])
'2.4,14km WSW of Willow, Alaska'
```
A Pythonic way to get your expected output:

```python
>>> myString = """14km WSW of Willow, Alaska$2.4
... 4km NNW of The Geysers, California$0.9
... 13km ESE of Coalinga, California$2.1"""
>>> # list() makes this work on Python 3 too, where map() returns an iterator
>>> list(map(lambda x: ",".join(x.split("$")[::-1]), myString.strip().split("\n")))
['2.4,14km WSW of Willow, Alaska', '0.9,4km NNW of The Geysers, California', '2.1,13km ESE of Coalinga, California']
```
If you are concerned about formatting, then I would use a `namedtuple`:

```python
from collections import namedtuple

Data = namedtuple('Data', ['position', 'magnitude'])

mystr = """14km WSW of Willow, Alaska$2.4
4km NNW of The Geysers, California$0.9
13km ESE of Coalinga, California$2.1"""

list_of_data = []

for line in mystr.split('\n'):  # equivalent to your "for line in source"
    list_of_data.append(Data(*line.split('$')))
```
This gives you:

```python
>>> list_of_data
[Data(position='14km WSW of Willow, Alaska', magnitude='2.4'),
 Data(position='4km NNW of The Geysers, California', magnitude='0.9'),
 Data(position='13km ESE of Coalinga, California', magnitude='2.1')]
```
which is easy to manipulate:

```python
>>> ['{x.magnitude}, {x.position}'.format(x=x) for x in list_of_data]
['2.4, 14km WSW of Willow, Alaska',
 '0.9, 4km NNW of The Geysers, California',
 '2.1, 13km ESE of Coalinga, California']
```
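or to sort by magnitude (converted to `float` in the key, because the magnitudes are stored as strings and would otherwise be compared character by character):

```python
>>> sorted(list_of_data, key=lambda x: float(x.magnitude))
[Data(position='4km NNW of The Geysers, California', magnitude='0.9'),
 Data(position='13km ESE of Coalinga, California', magnitude='2.1'),
 Data(position='14km WSW of Willow, Alaska', magnitude='2.4')]
```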
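If the magnitudes will feed into further analysis (filtering, maximum, averages), it may be simpler to convert them to numbers once, while parsing. The sketch below assumes every magnitude parses as a float; `Quake` and `parse_quake` are hypothetical names chosen for illustration.

```python
from collections import namedtuple

Quake = namedtuple("Quake", ["position", "magnitude"])


def parse_quake(line):
    """Parse one 'location$magnitude' line into a Quake with a float magnitude."""
    position, magnitude = line.rsplit("$", 1)
    return Quake(position, float(magnitude))


raw = """14km WSW of Willow, Alaska$2.4
4km NNW of The Geysers, California$0.9
13km ESE of Coalinga, California$2.1"""

quakes = [parse_quake(line) for line in raw.splitlines()]

# Numeric comparisons now behave as expected.
strongest = max(quakes, key=lambda q: q.magnitude)
at_least_2 = [q.position for q in quakes if q.magnitude >= 2.0]

print(strongest.position)
print(at_least_2)
```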
Finally, if the data set is large, it might make more sense to use a regex. But with