Weird output from .itemgetter for list sorting by values python
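I am working through Google's Python class and attempting the word count (wordcount.py) exercise. The goal is to build a dictionary of words (keys) sorted by word count (values) and return it as tuples for printing.
I wrote a helper function to create the dictionary: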
    def dict_creator(filename): #helper function to create a dictionary each 'word' is a key and the 'wordcount' is the value
        input_file = open(filename, 'r') #open file as read
        for line in input_file: #for each line of text in the input file
            words = line.split() #split each line into individual words
            for word in words: #for each word in the words list(?)
                word = word.lower() #make each word lower case.
                if word not in word_count: #if the word hasn't been seen before
                    word_count[word] = 1 #create a dictionary key with the 'word' and assign a value of 1
                else: word_count[word] += 1 #if 'word' seen before, increase value by 1
        return word_count #return word_count dictionary
        word_count.close()
I am now using the .itemgetter approach outlined in this post (link) to produce the dictionary sorted by value. Here is my code:
    def print_words(filename):
        word_count = dict_creator(filename) #run dict_creator on input file (creating dictionary)
        print sorted(word_count.iteritems(), key=operator.itemgetter(1), reverse=True)
        #print dictionary in total sorted descending by value. Values have been doubled compared to original dictionary?
        for word in sorted(word_count.iteritems(), key=operator.itemgetter(1), reverse=True):
            #create sorted list of tuples using operator module functions sorted in an inverse manner
            a = word
            b = word_count[word]
            print a, b #print key and value
However, when I run the code on the test file and on a smaller file, it throws a KeyError (shown below).
    Traceback (most recent call last):
      File "F:\Misc\google-python-exercises\basic\wordcount_edited.py", line 74, in <module>
        print_words(lorem_ipsum) #run input file through print_words
      File "F:\Misc\google-python-exercises\basic\wordcount_edited.py", line 70, in print_words
        b = word_count[word]
    KeyError: ('in', 3)
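I have printed both the original dictionary and the sorted one, and it looks as though every value has doubled after sorting. I have read several threads on this kind of problem and checked the .itemgetter documentation, but I cannot find anyone else with a similar issue.
Can anyone point out what is causing my code to iterate over the dictionary a second time in the word_count function, which is what increases the values?
Thanks!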
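(1) You never define
    word_count = {}
at the start of dict_creator. That means that no matter how many times you call dict_creator, it keeps writing into the same word_count defined outside the function, so counts from earlier calls are still there and the values pile up (doubling on a second call over the same file).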
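To make point (1) concrete, here is a small sketch of the accumulation effect. It is written for Python 3 (the question uses Python 2), it counts words from an in-memory list instead of a file, and the names shared_count, count_into_global, count_into_local and sample are made up for illustration; they are not from your code.

```python
# Hypothetical stand-in for a dictionary defined at module level.
shared_count = {}

def count_into_global(lines):
    """Buggy pattern: every call adds to the same module-level dict."""
    for line in lines:
        for word in line.lower().split():
            shared_count[word] = shared_count.get(word, 0) + 1
    return shared_count

def count_into_local(lines):
    """Fixed pattern: a fresh dict is created on each call."""
    word_count = {}
    for line in lines:
        for word in line.lower().split():
            word_count[word] = word_count.get(word, 0) + 1
    return word_count

sample = ["In the end", "in the middle"]  # made-up sample text

first = dict(count_into_global(sample))   # snapshot after one call
second = dict(count_into_global(sample))  # same dict again: counts have grown

print(first["in"], second["in"])
print(count_into_local(sample)["in"], count_into_local(sample)["in"])
```

The first pair of numbers differs because the second call starts from the counts left behind by the first; the local version returns the same counts on every call.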
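(2) For the KeyError:
    for word in sorted(word_count.iteritems(), key=operator.itemgetter(1), reverse=True):
        #create sorted list of tuples using operator module functions sorted in an inverse manner
        a = word
        b = word_count[word]
Here iteritems() yields (word, count) tuples, so word is a tuple such as ('in', 3), not a key of word_count. That is exactly the value shown in the KeyError. Unpack the tuple in the loop instead of looking the count up again.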
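One way the loop for (2) could be written is sketched below, assuming Python 3, where items() replaces Python 2's iteritems() and print is a function. The dictionary contents and the name ranked are made-up for illustration.

```python
import operator

word_count = {"in": 3, "the": 5, "lorem": 1}  # made-up sample data

# items() yields (word, count) tuples; itemgetter(1) sorts them by the count.
ranked = sorted(word_count.items(), key=operator.itemgetter(1), reverse=True)

# Unpack each tuple in the loop header instead of indexing the dict again.
for word, count in ranked:
    print(word, count)
```

Since the count already travels with the word in each tuple, no second lookup into word_count is needed.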
(3) In this part:
    return word_count #return word_count dictionary
    word_count.close()
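the close() call comes after return, so it never runs, and word_count is a dictionary, not the file. I think you mean:
    with open(filename) as input_file:
        code_goes_here = True
    return word_count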
Here the file is closed automatically.
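Putting (1) and (3) together, the helper might look roughly like this. It is a sketch for Python 3, not a drop-in copy of your file; the temporary file and its contents exist only so the example can run on its own.

```python
import os
import tempfile

def dict_creator(filename):
    """Return a dict mapping each lowercased word in the file to its count."""
    word_count = {}  # fresh dictionary on every call
    with open(filename) as input_file:  # closed automatically when the block ends
        for line in input_file:
            for word in line.lower().split():
                word_count[word] = word_count.get(word, 0) + 1
    return word_count

# Usage on a throwaway file (the contents are made-up sample text).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("In the end\nin the middle\n")
    path = tmp.name

counts = dict_creator(path)
os.remove(path)
print(counts)
```

Using dict.get with a default of 0 replaces the if/else branch from the question; the explicit branch works just as well if you find it clearer.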
After making the changes above, your code seems to work for me.