Reshaping pandas dataframe with a column containing lists
Let's say I have a dataframe that looks like this:
import pandas as pd
data = [
    {"Name": "Project A", "Feedback": ['we should do x', 'went well']},
    {"Name": "Project B", "Feedback": ['eat pop tarts', 'boo']},
    {"Name": "Project C", "Feedback": ['bar', 'baz']},
]
df = pd.DataFrame(data)
df = df[['Name','Feedback']]
df
   Name       Feedback
0  Project A  ['we should do x', 'went well']
1  Project B  ['eat pop tarts', 'boo']
2  Project C  ['bar', 'baz']
What I would like to do is reshape the dataframe so that Name is the key and each element of the list in the Feedback column gets its own row, like so:
   Name       Feedback
0  Project A  'we should do x'
1  Project A  'went well'
2  Project B  'eat pop tarts'
3  Project B  'boo'
4  Project C  'bar'
5  Project C  'baz'
What would be an efficient way to do this?
One option is to reconstruct the data frame by flattening column Feedback and repeating column Name once per list element:
pd.DataFrame({
'Name': df.Name.repeat(df.Feedback.str.len()),
'Feedback': [x for s in df.Feedback for x in s]
})
#   Feedback        Name
#0  we should do x  Project A
#0  went well       Project A
#1  eat pop tarts   Project B
#1  boo             Project B
#2  bar             Project C
#2  baz             Project C
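For readers on pandas 0.25 or newer, the same flatten-and-repeat idea is also available as the built-in DataFrame.explode. The following is only a sketch under that version assumption, with the sample frame rebuilt so the block runs on its own; the name long_df is introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Project A", "Project B", "Project C"],
    "Feedback": [
        ["we should do x", "went well"],
        ["eat pop tarts", "boo"],
        ["bar", "baz"],
    ],
})

# explode() emits one row per list element and repeats the other columns.
# The original row labels are repeated too (0, 0, 1, 1, 2, 2), so
# reset_index(drop=True) is used to get a clean 0..5 index.
long_df = df.explode("Feedback").reset_index(drop=True)
print(long_df)
```

Unlike the manual reconstruction, this keeps the column order of the original frame and does not require the lists to have equal lengths.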
Here's another method:
# Separate out values (NOTE- this assumes you'll always have two strings in list)
df['pos_0'] = df['Feedback'].str[0]
df['pos_1'] = df['Feedback'].str[1]
df
   Name       Feedback                     pos_0           pos_1
0  Project A  [we should do x, went well]  we should do x  went well
1  Project B  [eat pop tarts, boo]         eat pop tarts   boo
2  Project C  [bar, baz]                   bar             baz
Desired output:
pd.melt(df, id_vars='Name', value_vars=['pos_0', 'pos_1'], var_name='Feedback').drop('Feedback', axis=1)
   Name       value
0  Project A  we should do x
1  Project B  eat pop tarts
2  Project C  bar
3  Project A  went well
4  Project B  boo
5  Project C  baz
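The melt approach above depends on every list holding exactly two strings. If the lists can differ in length, one possible variation is to spread them into a wide frame and stack it back into long form. This is a sketch rather than the answerer's method: the ragged sample data (including 'qux') and the names wide and long_df are made up for illustration.

```python
import pandas as pd

# Hypothetical sample with lists of different lengths.
df = pd.DataFrame({
    "Name": ["Project A", "Project B", "Project C"],
    "Feedback": [
        ["we should do x", "went well"],
        ["eat pop tarts"],
        ["bar", "baz", "qux"],
    ],
})

# One column per list position; shorter lists are padded with missing values.
wide = pd.DataFrame(df["Feedback"].tolist(), index=df["Name"])

long_df = (
    wide.stack()                          # wide -> long, indexed by (Name, position)
        .dropna()                         # discard padding if stack() kept it
        .reset_index(level=1, drop=True)  # drop the list-position level
        .reset_index(name="Feedback")     # turn Name back into a column
)
print(long_df)
```

Rows come out grouped by Name, in the order the elements appear in each list, rather than grouped by position as in the melt output.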