Using the values of one dictionary as the keys in another dictionary, with the frequency of each key as the value (Python 3.6)

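I have a dictionary:

CodonDict = {'ATT':'I', 'ATC':'I', 'ATA':'I', 'CTT':'L', 'CTC':'L',...}, which goes on to cover 64 unique triplets.

I am iterating over a text file that is essentially one huge string. My code currently fills an empty dictionary with 64 keys, numbered 0 to 63:

TripletCount = {0: 18626, 1: 9187, 2: 9273, 3: 9154, 4: 37129, 5: 36764, 6: 18468,...}, where the values are the triplet frequencies (but the keys are integers).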

TripletCount = {}

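I want to use the values of CodonDict as the keys in TripletCount, with the frequency of each key as the value in TripletCount.

I have programmed in Python before, but formatting dictionaries has never been my strong suit.

For reference, the data file I am iterating over essentially looks like this: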

'GTGGCTTCTCTTCTCCACTCCTCTTTTTATTCCTTCCCAAACAAGAAGGTTAGTTATTATTATTTCCAGA...'

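Edit:

An example of what I would like to get:

TripletCount = {'I': 18626, 'V': 9187, 'L': 9273, 'Y': 9154, 'E': 37129,...}

Edit 2:

As requested: I plan to resolve key collisions by appending the counts to a list, because different base triplets can code for the same amino acid, so {'I': [18626, 9187, 9154], ...}.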

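Related discussion

Keys in a dictionary are unique, so in TripletCount every value would equal 1.
Correct me if I have misunderstood your question.

The following code may solve your problem by using a defaultdict of defaultdicts: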

from collections import defaultdict as ddict

CodonDict = {'ATT':'I', 'ATC':'I', 'ATA':'I', 'CTT':'L', 'CTC':'L'}
# Outer keys are amino acids; inner keys are the codons that map to them.
TripletCount = ddict(lambda: ddict(int))

for key, value in CodonDict.items():
    TripletCount[value][key] += 1

The values in TripletCount are defaultdicts; you can convert them to lists with something like map.
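As a minimal sketch of that conversion: the version below uses a dict comprehension rather than map, and the name as_lists is made up for illustration. Note that the counts here come from CodonDict itself, not from the sequence data, so every count is 1.

```python
from collections import defaultdict

CodonDict = {'ATT': 'I', 'ATC': 'I', 'ATA': 'I', 'CTT': 'L', 'CTC': 'L'}

# Nested counter: amino acid -> codon -> count
TripletCount = defaultdict(lambda: defaultdict(int))
for codon, amino_acid in CodonDict.items():
    TripletCount[amino_acid][codon] += 1

# Flatten each inner defaultdict into a plain list of its counts.
# as_lists is a hypothetical name, not part of the original answer.
as_lists = {amino_acid: list(counts.values())
            for amino_acid, counts in TripletCount.items()}

print(as_lists)
# {'I': [1, 1, 1], 'L': [1, 1]}
```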

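You can iterate over your data, looking at three consecutive characters at a time, and check whether each three-character string is a key in your CodonDict dictionary. If it is, you can increment the count of its value by 1.

For example, using the sample data from the question: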

CodonDict = {'ATT':'I', 'ATC':'I', 'ATA':'I', 'CTT':'L', 'CTC':'L'}
TripletCount = {}
data = 'GTGGCTTCTCTTCTCCACTCCTCTTTTTATTCCTTCCCAAACAAGAAGGTTAGTTATTATTATTTCCAGA'

for i in range(3, len(data) + 1):  # slides a 3-character window over your data string (+ 1 so the final triplet is included)
    triplet = CodonDict.get(data[i-3:i])  # None unless these 3 characters in a row are a key in CodonDict
    if triplet:  # if it is a key: increment the count of its value by one
        TripletCount[triplet] = TripletCount.get(triplet, 0) + 1

print(TripletCount)
# {'L': 8, 'I': 4}
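If the list form from Edit 2 of the question is what is wanted, one possible approach is to count each codon separately first and then group the counts by amino acid. This is only a sketch under the same assumptions as above (every overlapping 3-character window is counted); codon_counts and counts_by_amino_acid are names made up for illustration.

```python
from collections import Counter, defaultdict

CodonDict = {'ATT': 'I', 'ATC': 'I', 'ATA': 'I', 'CTT': 'L', 'CTC': 'L'}
data = 'GTGGCTTCTCTTCTCCACTCCTCTTTTTATTCCTTCCCAAACAAGAAGGTTAGTTATTATTATTTCCAGA'

# Count every overlapping 3-character window in the data.
codon_counts = Counter(data[i:i + 3] for i in range(len(data) - 2))

# Group the per-codon counts under the amino acid each codon maps to.
# A codon that never occurs contributes 0 (Counter returns 0 for missing keys).
counts_by_amino_acid = defaultdict(list)
for codon, amino_acid in CodonDict.items():
    counts_by_amino_acid[amino_acid].append(codon_counts[codon])

print(dict(counts_by_amino_acid))
# {'I': [4, 0, 0], 'L': [4, 4]}
```

If only in-frame codons should be counted rather than every overlapping window, stepping the range by 3 (range(0, len(data) - 2, 3)) would be one way to do that; which of the two the question needs is not stated.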