How do I create a list in a separate DataFrame column from specific values within a list of dictionaries?
I have a DataFrame named 'new_api_df' with a column, new_api_df['Categories'], that contains a list of dictionaries:
[{'CallId': 22143866, 'BucketId': 1953, 'SectionId': 1256, 'BucketFullName': 'Categories.Filters.No Sale Made', 'Weight': 1.0}
, {'CallId': 22143866, 'BucketId': 2016, 'SectionId': 1255, 'BucketFullName': 'Categories.Imported.Objections', 'Weight': 3.0}
, {'CallId': 22143866, 'BucketId': 2017, 'SectionId': 1255, 'BucketFullName': 'Categories.Imported.Touting Benefits', 'Weight': 1.0}
]
I want to take each 'BucketFullName' and put those values in a list in a separate column, new_api_df['category_list'], like this:
['Categories.Filters.No Sale Made', 'Categories.Imported.Objections', 'Categories.Imported.Touting Benefits']
I tried a list comprehension, for example:
new_api_df['category_list'] =[item['BucketFullName'] for dictionary in new_api_df['Categories'] for item in dictionary]
but I get the error: ValueError: Length of values does not match length of index
I also tried apply with a list comprehension: new_api_df['category_list'] = new_api_df['Categories'].apply([item['BucketFullName'] for dictionary in new_api_df['Categories'] for item in dictionary])
but I get the following error: AttributeError: 'Categories.Filters.No Sale Made' is not a valid function for 'Series' object
I also tried: new_api_df['category_list'] = df['Categories'].apply(lambda x: x['BucketFullName'])
but I get the error: TypeError: list indices must be integers or slices, not str
A slice of new_api_df:
new_api_df.loc[0]:
Contact {'Id': 22143866, 'Type': 'Call', 'WavPath': '\...
RecordInfo {'Id': 22143866, 'RowNumber': 1, 'TotalRowCoun...
Measures {'ID': 22143866, 'TotalHoldCount': 0, 'Agitati...
Others {'ConfidenceAverage': 69, 'SequenceID': None, ...
Sections [{'CallId': 22143866, 'SectionId': 1041, 'Sect...
Categories [{'CallId': 22143866, 'BucketId': 1953, 'Secti...
Scores [{'CallId': 22143866, 'ScoreId': 399, 'ScoreNa...
ScoreComponents [{'CallId': 22143866, 'ScoreComponentId': 4497...
I think you want:
import pandas as pd
df=pd.DataFrame({'Categories':[{'CallId': 22143866, 'BucketId': 1953, 'SectionId': 1256, 'BucketFullName': 'Categories.Filters.No Sale Made', 'Weight': 1.0}
, {'CallId': 22143866, 'BucketId': 2016, 'SectionId': 1255, 'BucketFullName': 'Categories.Imported.Objections', 'Weight': 3.0}
, {'CallId': 22143866, 'BucketId': 2017, 'SectionId': 1255, 'BucketFullName': 'Categories.Imported.Touting Benefits', 'Weight': 1.0}
]})
# each cell of this example frame is a single dict, so index it by key
df['category_list']=df['Categories'].apply(lambda x: x['BucketFullName'])
print(df)
# Categories \
#0 {'CallId': 22143866, 'BucketId': 1953, 'Sectio...
#1 {'CallId': 22143866, 'BucketId': 2016, 'Sectio...
#2 {'CallId': 22143866, 'BucketId': 2017, 'Sectio...
#
# category_list
#0 Categories.Filters.No Sale Made
#1 Categories.Imported.Objections
#2 Categories.Imported.Touting Benefits
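Update
Note that this creates a list for each cell. It assumes each cell holds a list of dictionaries, as new_api_df['Categories'] does in the question: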
df['category_list']=df['Categories'].apply(lambda x: [m_dict['BucketFullName'] for m_dict in x])
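As a self-contained check of the line above, here is a minimal sketch. It assumes a one-row frame whose 'Categories' cell is a list of dictionaries, as in the question; the sample dictionaries are trimmed to three keys for brevity, and the names `cell` and `d` are only illustrative.

```python
import pandas as pd

# Hypothetical one-row frame shaped like the question's data:
# the single 'Categories' cell holds a list of dictionaries.
new_api_df = pd.DataFrame({
    'Categories': [[
        {'CallId': 22143866, 'BucketId': 1953, 'BucketFullName': 'Categories.Filters.No Sale Made'},
        {'CallId': 22143866, 'BucketId': 2016, 'BucketFullName': 'Categories.Imported.Objections'},
        {'CallId': 22143866, 'BucketId': 2017, 'BucketFullName': 'Categories.Imported.Touting Benefits'},
    ]]
})

# apply() passes one cell (a list of dicts) at a time to the lambda,
# so the result has exactly one list per row and the lengths match the index.
new_api_df['category_list'] = new_api_df['Categories'].apply(
    lambda cell: [d['BucketFullName'] for d in cell]
)

print(new_api_df['category_list'][0])
```

The question's first attempt flattens every row into one long list, which is why its length no longer matches the index; building the list inside apply keeps one result per row.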
You can then use DataFrame.explode:
df = df.explode('category_list')
#(df['Categories'].apply(lambda x: [m_dict['BucketFullName'] for m_dict in x])
# .explode()) # check the exploded Series
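To illustrate what explode does to a list column, here is a small sketch. The frame below is a hypothetical stand-in with a 'CallId' column added only to show how the other columns are repeated; it is not the asker's full data.

```python
import pandas as pd

# Hypothetical frame: one row whose 'category_list' cell is a list of names.
df = pd.DataFrame({
    'CallId': [22143866],
    'category_list': [[
        'Categories.Filters.No Sale Made',
        'Categories.Imported.Objections',
        'Categories.Imported.Touting Benefits',
    ]],
})

# explode() turns each list element into its own row and repeats the
# other columns; the original index label (0) is repeated as well.
exploded = df.explode('category_list')

# Optional: replace the repeated index labels with a fresh 0..n-1 index.
exploded = exploded.reset_index(drop=True)

print(exploded)
```

Whether to explode depends on the goal: the question asks for one list per row, in which case the apply step alone is enough, and explode is only useful if one row per category is wanted.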