Python: Accessing csv header white space and case insensitive

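I am overriding the csv.DictReader.fieldnames property as shown below, so that all headers are read from the CSV file with surrounding white space stripped and in lower case.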

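import csv

class MyDictReader(csv.DictReader):

    @property
    def fieldnames(self):
        # The base property returns None for an empty file, so guard before normalizing
        names = super().fieldnames
        return None if names is None else [field.strip().lower() for field in names]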
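
As a quick check of what this override does on its own, here is a minimal sketch. It assumes Python 3 and uses a made-up in-memory sample (`sample`) in place of a real file. The row keys come out normalized, but a lookup with the raw header text still fails, which is the gap the question is about.

```python
import csv
import io

class MyDictReader(csv.DictReader):
    @property
    def fieldnames(self):
        names = super().fieldnames
        # None means the file had no header row
        return None if names is None else [n.strip().lower() for n in names]

# Hypothetical sample data standing in for csv-file.csv
sample = io.StringIO(" Column_A ,Column_B\n1,2\n")
reader = MyDictReader(sample)
rows = list(reader)

print(reader.fieldnames)          # ['column_a', 'column_b']
print(rows[0]['column_a'])        # 1
print(' Column_A ' in rows[0])    # False: the raw header is not a key
```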

Now my question is: how can I have strip() and lower() applied automatically to the field name I use in a lookup?

This is how I do it manually:

with open('csv-file.csv', newline='') as csvFile:
    csvDict = MyDictReader(csvFile)

    for lineDict in csvDict:
        # normalize the query by hand before every lookup
        query = ' Column_A'.strip().lower()
        print(lineDict[query])

Any ideas?


Following Pedro Romano's suggestion, I wrote the following example.

import csv

class DictReaderInsensitive(csv.DictReader):
    # This class overrides the csv.DictReader.fieldnames property.
    # All field names are stripped of white space and lower-cased.

    @property
    def fieldnames(self):
        names = super().fieldnames
        # fieldnames is None when the file is empty
        return None if names is None else [field.strip().lower() for field in names]

    def __next__(self):
        # Get the row from the original __next__ and copy it into a DictInsensitive.
        # The keys are already normalized by the fieldnames property above.
        return DictInsensitive(super().__next__())

class DictInsensitive(dict):
    # This class overrides __getitem__ to automatically strip() and lower() the lookup key

    def __getitem__(self, key):
        return dict.__getitem__(self, key.strip().lower())

For a file whose header is written in any of these ways

  • "Column_A"
  • " Column_A"
  • "column_a"
  • " column_a"

you can access the column like this:

with open('csv-file.csv', newline='') as csvFile:
    csvDict = DictReaderInsensitive(csvFile)

    for lineDict in csvDict:
        print(lineDict[' Column_A']) # or
        print(lineDict['Column_A']) # or
        print(lineDict[' column_a']) # all return the same value
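
One caveat worth checking: the built-in dict methods `get()` and `in` do not go through an overridden `__getitem__`, so only `[]` lookups are normalized. The sketch below shows this and one possible way to cover those cases. `DictInsensitiveFull` is a hypothetical name introduced only for illustration, not part of the code above.

```python
class DictInsensitive(dict):
    def __getitem__(self, key):
        return dict.__getitem__(self, key.strip().lower())

row = DictInsensitive({'column_a': '1'})
print(row[' Column_A'])       # 1 (normalized by __getitem__)
print(' Column_A' in row)     # False (membership test bypasses __getitem__)
print(row.get(' Column_A'))   # None (get() bypasses __getitem__ too)

class DictInsensitiveFull(DictInsensitive):
    # Hypothetical extension: normalize the key for `in` and get() as well
    def __contains__(self, key):
        return dict.__contains__(self, key.strip().lower())

    def get(self, key, default=None):
        return dict.get(self, key.strip().lower(), default)

full = DictInsensitiveFull(row)
print(' Column_A' in full)    # True
print(full.get(' Column_A'))  # 1
```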


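You have to do this in two steps:

  • Create a dict specialization whose __getitem__ method applies .strip().lower() to its key argument.
  • Override __next__ on your MyDictReader specialized class so that it returns one of these special dictionaries, initialized with the dictionary returned by the __next__ method of the csv.DictReader superclass.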
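
The two steps above can be sketched roughly as follows, assuming Python 3. `NormalizedKeyDict` and the in-memory `sample` are hypothetical names chosen for this illustration, and the `fieldnames` override is carried over from the question so that the stored keys are already normalized.

```python
import csv
import io

class NormalizedKeyDict(dict):
    # Step 1: a dict specialization that normalizes the key on lookup
    def __getitem__(self, key):
        return super().__getitem__(key.strip().lower())

class MyDictReader(csv.DictReader):
    @property
    def fieldnames(self):
        names = super().fieldnames
        return None if names is None else [n.strip().lower() for n in names]

    def __next__(self):
        # Step 2: wrap the dict produced by csv.DictReader.__next__
        return NormalizedKeyDict(super().__next__())

# Hypothetical sample data standing in for a real CSV file
sample = io.StringIO(" Column_A ,Column_B\n1,2\n3,4\n")
values = [row[" COLUMN_a"] for row in MyDictReader(sample)]
print(values)  # ['1', '3']
```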