Python list splitting, sorting by date, then joining

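OK, I've been at this for hours now; I admit defeat and ask for your mercy.

Goal: I have multiple files (bank statement downloads) that I want to merge, sort, and remove duplicates from.

The downloads are in this format: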

"08/04/2015","Balance","5,804.30","Current Balance for account 123S14"
"08/04/2015","Balance","5,804.30","Available Balance for account 123S14"
"02/03/2015","241.25","Transaction description","2,620.09"
"02/03/2015","-155.49","Transaction description","2,464.60"
"03/03/2015","82.00","Transaction description","2,546.60"
"03/03/2015","243.25","Transaction description","2,789.85"
"03/03/2015","-334.81","Transaction description","2,339.12"
"04/03/2015","-25.05","Transaction description","2,314.07"

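Apart from having no idea what I'm doing, one of my main problems is that the values contain commas. I managed to write code that removes these "hidden" commas and then strips the quotes, so that I end up with a CSV line.

So I now have the data in this format: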

['02/03/2015', ' \t ', '241.25\t ', ' \t ', 'Transaction Details\n',
 '02/03/2015', ' \t ', ' \t ', '-155.49\t ', 'Transaction Details\n',
 '03/03/2015', ' \t ', '82.00\t ', ' \t ', 'Transaction Details\n',
 '03/03/2015', ' \t ', '243.25\t ', ' \t ', 'Transaction Details\n',
 '02/03/2015', ' \t ', '241.25\t ', ' \t ', 'Transaction Details\n']

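I believe this gets me close to being able to sort the elements, but I think it is now one long flat list rather than a list of lists.

I read up on sorting and found lambda functions, so I started with: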

new_file_data = sorted(new_file_data, key=lambda item: item[0])

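But element [0] was just the first character at the beginning of the line (BOL).

I also noticed that I need to specify the date format, since the dates might otherwise be interpreted incorrectly, which led me to this construct: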

sorted(new_file_data, key=lambda d: datetime.strptime(d, '%d/%m/%Y'))

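Loosely speaking, I get the "map" construct, but I don't know how to combine the two so that I can reference element [0] and have it compared as a date.

So here I am, hoping someone can push me over this hurdle. I think I need to split the list better so that each line is one element. At one point I did get a sorted result, but all the fields were sorted globally: values, dates, words, and so on.

So I would appreciate any advice on my failed list manipulation and on how to construct this sorting lambda.

Thanks to those who have the time and the know-how to answer these beginner questions.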

If I understand correctly, you want to read the contents of the CSV and sort them by date.

Given the contents of data.csv (the same eight lines shown in the question), I would use the csv module to read the data.

import csv

# csv.reader honors the double quotes, so "5,804.30" stays one field
with open('data.csv', newline='') as f:
    data = [row for row in csv.reader(f)]

This gives:

>>> data
[['08/04/2015', 'Balance', '5,804.30', 'Current Balance for account 123S14'],
['08/04/2015', 'Balance', '5,804.30', 'Available Balance for account 123S14'],
['02/03/2015', '241.25', 'Transaction description', '2,620.09'],
['02/03/2015', '-155.49', 'Transaction description', '2,464.60'],
['03/03/2015', '82.00', 'Transaction description', '2,546.60'],
['03/03/2015', '243.25', 'Transaction description', '2,789.85'],
['03/03/2015', '-334.81', 'Transaction description', '2,339.12'],
['04/03/2015', '-25.05', 'Transaction description', '2,314.07']]
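
A side effect worth pointing out: because csv.reader honors the double quotes, the commas inside values such as "5,804.30" stay inside their field, so there should be no need to strip them by hand before splitting. If you later want real numbers, here is a minimal sketch. The helper name parse_amount is my own invention, and I am assuming the comma is always a thousands separator in your downloads:

```python
import csv
import io
from decimal import Decimal

# Two sample lines in the same format as the bank download.
sample = (
    '"02/03/2015","241.25","Transaction description","2,620.09"\n'
    '"02/03/2015","-155.49","Transaction description","2,464.60"\n'
)

def parse_amount(text):
    """Hypothetical helper: turn '2,620.09' into Decimal('2620.09').

    Assumes ',' is only ever a thousands separator.
    """
    return Decimal(text.replace(",", ""))

# io.StringIO stands in for an open file so the example is self-contained.
rows = list(csv.reader(io.StringIO(sample)))

# The fourth column of a transaction row is the running balance.
balances = [parse_amount(row[3]) for row in rows]
```

I picked Decimal rather than float because these are money values, but that is a design choice, not something your data requires.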

You can then use the datetime module to provide a key for sorting.

import datetime

# Parse the first column as day/month/year and sort on the resulting datetime
sorted_data = sorted(data, key=lambda row: datetime.datetime.strptime(row[0], "%d/%m/%Y"))

This gives:

>>> sorted_data
[['02/03/2015', '241.25', 'Transaction description', '2,620.09'],
['02/03/2015', '-155.49', 'Transaction description', '2,464.60'],
['03/03/2015', '82.00', 'Transaction description', '2,546.60'],
['03/03/2015', '243.25', 'Transaction description', '2,789.85'],
['03/03/2015', '-334.81', 'Transaction description', '2,339.12'],
['04/03/2015', '-25.05', 'Transaction description', '2,314.07'],
['08/04/2015', 'Balance', '5,804.30', 'Current Balance for account 123S14'],
['08/04/2015', 'Balance', '5,804.30', 'Available Balance for account 123S14']]
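
In case it is not obvious why the strptime key matters: sorting the raw strings compares them character by character, so with a day/month/year layout the day decides the order first. A small illustration, using made-up dates rather than your data:

```python
from datetime import datetime

# Illustrative dates only, in the same dd/mm/yyyy layout as the statements.
dates = ["08/04/2015", "02/03/2015", "01/05/2015"]

# Plain string comparison looks at the day first, so 1 May lands before 2 March.
as_text = sorted(dates)

# Parsing with the day/month/year format gives true chronological order.
as_dates = sorted(dates, key=lambda d: datetime.strptime(d, "%d/%m/%Y"))
```

Here as_text starts with "01/05/2015" while as_dates starts with "02/03/2015", which is the order you actually want.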

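You also mentioned merging several files and removing duplicates. I have not tried this against your real downloads, so treat the following as a rough sketch: the function names load_rows and merge_statements are hypothetical, and I am assuming that two rows count as duplicates only when every field matches exactly.

```python
import csv
from datetime import datetime

def load_rows(paths):
    """Read every file in paths and return all rows as one list of lists."""
    rows = []
    for path in paths:
        # newline="" is what the csv documentation recommends for csv.reader
        with open(path, newline="") as f:
            # Skip empty lines, which csv.reader yields as empty lists.
            rows.extend(row for row in csv.reader(f) if row)
    return rows

def merge_statements(rows):
    """Drop exact duplicate rows, then sort what is left by date."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # lists are not hashable, tuples are
        if key not in seen:
            seen.add(key)
            unique.append(row)
    # sorted() is stable, so rows sharing a date keep their original order.
    return sorted(unique, key=lambda row: datetime.strptime(row[0], "%d/%m/%Y"))
```

Usage would be something like merge_statements(load_rows(['jan.csv', 'feb.csv'])), where the file names are placeholders. Note that the "Balance" rows are kept and sorted along with the transactions, and that two genuinely separate transactions with identical fields would be collapsed into one; the running balance column makes that unlikely, but it is worth checking.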