How to split a list of dictionaries in a row into multiple rows of a pandas DataFrame?
I have the following DataFrame:
terms periods
0 [741880, 3764106] [{"name":"2010 год", "date":"31.12.2010", "value":"6621"},{"name":"2000 год", "date":"31.12.2000", "value":"17913"},{"name":"2006 год", "date":"31.12.2006", "value":"5849"},{"name":"2003 год", "date":"31.12.2003", "value":"9211"},{"name":"2012 год", "date":"31.12.2012", "value":"7647"},{"name":"2011 год", "date":"31.12.2011", "value":"8382"},{"name":"2014 год", "date":"31.12.2014", "value":"7388"},{"name":"2004 год", "date":"31.12.2004", "value":"8851"}]
As you can see, the periods cell in each row holds a list of dictionaries. Now I want something like:
terms date value
0 [741880, 3764106] 31.12.2010 6621
1 [741880, 3764106] 31.12.2000 17913
2 [741880, 3764106] 31.12.2006 5849
etc
So, the list of dictionaries must be split into rows, one row per element in the list.
How can I do that?
Try using apply() and explode():
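import pandas as pd
# Keep only date and value from each dictionary, then give each pair its own row
df2 = (df['periods'].apply(lambda x: [[i['date'], i['value']] for i in x])
                    .explode()
                    .apply(pd.Series, index=['date', 'value']))
# join aligns on the repeated row index, so terms is copied onto every new row
df = df[['terms']].join(df2).reset_index(drop=True)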
Output:
print(df)
terms date value
0 [741880, 3764106] 31.12.2010 6621
1 [741880, 3764106] 31.12.2000 17913
2 [741880, 3764106] 31.12.2006 5849
3 [741880, 3764106] 31.12.2003 9211
4 [741880, 3764106] 31.12.2012 7647
5 [741880, 3764106] 31.12.2011 8382
6 [741880, 3764106] 31.12.2014 7388
7 [741880, 3764106] 31.12.2004 8851
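For readers who want to run the idea end to end, here is a minimal, self-contained sketch of the explode approach. It assumes the periods cells already hold Python lists of dictionaries (not strings), and the sample frame is a shortened copy of the question's data. The names exploded, fields and result are only illustrative.

```python
import pandas as pd

# Sample data shaped like the question's frame (shortened to three periods)
df = pd.DataFrame({
    "terms": [[741880, 3764106]],
    "periods": [[
        {"name": "2010 год", "date": "31.12.2010", "value": "6621"},
        {"name": "2000 год", "date": "31.12.2000", "value": "17913"},
        {"name": "2006 год", "date": "31.12.2006", "value": "5849"},
    ]],
})

# One row per dictionary; the index is reset so the pieces line up by position
exploded = df.explode("periods").reset_index(drop=True)

# Expand each dictionary into columns and keep only the wanted keys
fields = pd.DataFrame(exploded["periods"].tolist())[["date", "value"]]

result = pd.concat([exploded[["terms"]], fields], axis=1)
print(result)
```

Resetting the index before pd.concat matters here: explode repeats the original row label, and aligning on repeated labels is what tends to cause trouble. Rows whose list is empty would come out of explode as NaN and would need separate handling, which this sketch does not attempt.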
Just explode the periods column and apply pd.Series to it. You can skip the ast.literal_eval step if the data in column periods is already a list of dictionaries. Use set_index and reset_index to keep the terms column.
import ast
import pandas as pd
# Parse the string in each cell into a list of dictionaries
df['periods'] = df['periods'].apply(ast.literal_eval)
# One row per dictionary, then expand each dictionary into columns
df.set_index('terms').explode('periods').apply(lambda row: pd.Series(row['periods']), axis=1).reset_index()
Output:
terms name date value
0 [741880, 3764106] 2010 год 31.12.2010 6621
1 [741880, 3764106] 2000 год 31.12.2000 17913
2 [741880, 3764106] 2006 год 31.12.2006 5849
3 [741880, 3764106] 2003 год 31.12.2003 9211
4 [741880, 3764106] 2012 год 31.12.2012 7647
5 [741880, 3764106] 2011 год 31.12.2011 8382
6 [741880, 3764106] 2014 год 31.12.2014 7388
7 [741880, 3764106] 2004 год 31.12.2004 8851
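Whether the parsing step is needed depends on how the periods column was loaded. The sketch below shows one way to handle cells that may already be lists, may be JSON text (the sample data uses double quotes, so json.loads can read it), or may be Python-literal text. The helper name to_records is hypothetical and not part of pandas.

```python
import ast
import json

def to_records(cell):
    """Return a list of dicts from a cell that is a list, JSON text, or Python-literal text."""
    if isinstance(cell, list):
        return cell  # already parsed, nothing to do
    try:
        return json.loads(cell)
    except json.JSONDecodeError:
        # Fall back for single-quoted, repr-style strings
        return ast.literal_eval(cell)

raw = '[{"name":"2010 год", "date":"31.12.2010", "value":"6621"}]'
print(to_records(raw))
print(to_records("[{'date': '31.12.2000', 'value': '17913'}]"))
```

It could be applied with df['periods'] = df['periods'].apply(to_records) before exploding.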
Try this:
periods = [{"name":"2010 год", "date":"31.12.2010", "value":"6621"},{"name":"2000 год", "date":"31.12.2000", "value":"17913"},{"name":"2006 год", "date":"31.12.2006", "value":"5849"},{"name":"2003 год", "date":"31.12.2003", "value":"9211"},{"name":"2012 год", "date":"31.12.2012", "value":"7647"},{"name":"2011 год", "date":"31.12.2011", "value":"8382"},{"name":"2014 год", "date":"31.12.2014", "value":"7388"},{"name":"2004 год", "date":"31.12.2004", "value":"8851"}]
print(f"{'Terms': <8}{'Name': <13}{'Date': <14}{'Value': <10}")
# enumerate supplies the running row number; each dictionary is printed once
for i, mem in enumerate(periods):
    print(f"{i: <8}{mem['name']: <13}{mem['date']: <14}{mem['value']: <10}")
Output:
Terms   Name         Date          Value
0       2010 год     31.12.2010    6621
1       2000 год     31.12.2000    17913
2       2006 год     31.12.2006    5849
etc.
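The loop above only prints text. If a DataFrame is the goal and the list of dictionaries is available as a plain Python list, one possible sketch is to pass it straight to pd.DataFrame and then attach the terms value to every row. The variable names terms and table are assumptions for illustration, and the data is shortened to three entries.

```python
import pandas as pd

terms = [741880, 3764106]
periods = [
    {"name": "2010 год", "date": "31.12.2010", "value": "6621"},
    {"name": "2000 год", "date": "31.12.2000", "value": "17913"},
    {"name": "2006 год", "date": "31.12.2006", "value": "5849"},
]

# Each dictionary becomes one row; columns= keeps only the wanted keys
table = pd.DataFrame(periods, columns=["date", "value"])

# Put the same terms list in front of every row
table.insert(0, "terms", [terms] * len(table))
print(table)
```

This skips explode entirely, so it only fits the case of a single source row, as in the question's example.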