pandas series containing arrays
I have a pandas DataFrame column that looks a bit like this:
```
Out[67]:
0    ["cheese","milk...
1    ["yogurt","cheese...
2    ["cheese","cream"...
3    ["milk","cheese"...
```
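Now, I ultimately want this to be one simple list, but while trying to flatten it I noticed that pandas treats each value as a str rather than a list.
How do I flatten it so that I end up with: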
```
["cheese","milk","yogurt","cheese","cheese"...]
```
[Edit] So the answer given below is:
```python
s = s.str.strip("[]")
df = s.str.split(',', expand=True)
df = df.applymap(lambda x: x.replace("'", '').strip())
l = df.values.flatten()
print(l.tolist())
```
This is great: the question is answered and the answer accepted, but I feel it is a rather inelegant solution.
You can use `values` and `flatten`, followed by a list comprehension:
```python
# df has a column `a` whose values are Python lists
print(df)
#                   a
# 0    [cheese, milk]
# 1  [yogurt, cheese]
# 2   [cheese, cream]

print(df.a.values)
# [[['cheese', 'milk']]
#  [['yogurt', 'cheese']]
#  [['cheese', 'cream']]]

l = df.a.values.flatten()
print(l)
# [['cheese', 'milk'] ['yogurt', 'cheese'] ['cheese', 'cream']]

# Flatten the nested lists with a list comprehension
print([item for sublist in l for item in sublist])
# ['cheese', 'milk', 'yogurt', 'cheese', 'cheese', 'cream']
```
Edit:
You can try:
```python
import pandas as pd

s = pd.Series(["['cheese', 'milk']", "['yogurt', 'cheese']", "['cheese', 'cream']"])

# Remove the surrounding [ and ]
s = s.str.strip('[]')
print(s)
# 0      'cheese', 'milk'
# 1    'yogurt', 'cheese'
# 2     'cheese', 'cream'
# dtype: object

df = s.str.split(',', expand=True)
# Remove the quotes and strip surrounding whitespace
# (pandas 2.1+ prefers DataFrame.map over applymap)
df = df.applymap(lambda x: x.replace("'", '').strip())
print(df)
#         0       1
# 0  cheese    milk
# 1  yogurt  cheese
# 2  cheese   cream

l = df.values.flatten()
print(l.tolist())
# ['cheese', 'milk', 'yogurt', 'cheese', 'cheese', 'cream']
```
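The strip-and-split approach above relies on no item containing a comma or a quote. As a hedged alternative (not part of the answer above), and assuming every value is a valid Python list literal, `ast.literal_eval` can parse each string into a real list, after which `itertools.chain` flattens the result. The sample data below is illustrative only:

```python
import ast
from itertools import chain

import pandas as pd

# Illustrative data: each value is a string that looks like a Python list.
s = pd.Series(["['cheese', 'milk']", "['yogurt', 'cheese']", "['cheese', 'cream']"])

# Parse each string into a real list; literal_eval only accepts Python literals.
parsed = s.apply(ast.literal_eval)

# Chain the per-row lists into one flat list.
flat = list(chain.from_iterable(parsed))
print(flat)
# ['cheese', 'milk', 'yogurt', 'cheese', 'cheese', 'cream']
```

This avoids hand-stripping brackets and quotes, but it will raise an error on any value that is not a well-formed literal, so it only fits data that really was produced by printing Python lists.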
You can convert the Series to a DataFrame and then call stack:
```python
s.apply(pd.Series).stack().tolist()
```
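To see the one-liner in context, here is a small sketch that assumes the Series already holds real Python lists (not strings). The sample data is made up for illustration, and `Series.explode` (available since pandas 0.25) is shown as a possible alternative rather than as part of the answer:

```python
import pandas as pd

# Illustrative data: each value is an actual Python list, not a string.
s = pd.Series([["cheese", "milk"], ["yogurt", "cheese"], ["cheese", "cream"]])

# Expand each list into columns, then stack the columns into one long Series.
flat_stack = s.apply(pd.Series).stack().tolist()

# Series.explode produces one row per list element directly.
flat_explode = s.explode().tolist()

print(flat_stack)
# ['cheese', 'milk', 'yogurt', 'cheese', 'cheese', 'cream']
print(flat_explode)
# ['cheese', 'milk', 'yogurt', 'cheese', 'cheese', 'cream']
```

If the values are still strings, both calls will treat each string as a single item, so the str-to-list conversion has to happen first.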
To convert the column values from str to list, you can use