Convert dataframe of list in columns to rows
I have a pandas DataFrame of this type:
col1 col2 col3
1 [blue] [in,out]
2 [green, green] [in]
3 [green] [in]
I need to convert it to a DataFrame that keeps the first column and spreads all the values from the other columns across rows:
col1 value
1 blue
1 in
1 out
2 green
2 green
2 in
3 green
3 in
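To reproduce the example, the input can be built as in the sketch below. It assumes the cells of col2 and col3 hold real Python lists of strings (not string representations of lists); the name `expected` is only introduced here for comparison.

```python
import pandas as pd

# Sample frame from the question; col2/col3 cells are assumed to be Python lists.
df = pd.DataFrame({
    'col1': [1, 2, 3],
    'col2': [['blue'], ['green', 'green'], ['green']],
    'col3': [['in', 'out'], ['in'], ['in']],
})

# Desired result, written out by hand (hypothetical helper name).
expected = pd.DataFrame({
    'col1': [1, 1, 1, 2, 2, 2, 3, 3],
    'value': ['blue', 'in', 'out', 'green', 'green', 'in', 'green', 'in'],
})
```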
Use DataFrame.stack with Series.explode to convert the lists, then clean up the result with DataFrame.reset_index:
df1 = (df.set_index('col1')
         .stack()
         .explode()
         .reset_index(level=1, drop=True)
         .reset_index(name='value'))
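To see what each step of the chain above contributes, here is the same pipeline split into intermediate variables. This is only a sketch on the sample data; the names `stacked` and `exploded` are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': [1, 2, 3],
    'col2': [['blue'], ['green', 'green'], ['green']],
    'col3': [['in', 'out'], ['in'], ['in']],
})

# Series with a (col1, column name) MultiIndex; each value is still a list.
stacked = df.set_index('col1').stack()

# One row per list element; the index entries are repeated as needed.
exploded = stacked.explode()

# Drop the column-name level, then turn col1 back into a regular column.
df1 = (exploded.reset_index(level=1, drop=True)
               .reset_index(name='value'))
print(df1)
```

Because stack walks the frame row by row, the items of col2 come before the items of col3 for every col1 value, so no sorting is needed.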
An alternative with DataFrame.melt and DataFrame.explode:
df1 = (df.melt('col1')
         .explode('value')
         # a stable sort keeps col2 items ahead of col3 items within each col1
         .sort_values('col1', kind='mergesort')[['col1','value']]
         .reset_index(drop=True)
)
print (df1)
col1 value
0 1 blue
1 1 in
2 1 out
3 2 green
4 2 green
5 2 in
6 3 green
7 3 in
Or a list comprehension solution:
L = [(k, x) for k, v in df.set_index('col1').to_dict('index').items()
            for k1, v1 in v.items()
            for x in v1]
df1 = pd.DataFrame(L, columns=['col1','value'])
print (df1)
col1 value
0 1 blue
1 1 in
2 1 out
3 2 green
4 2 green
5 2 in
6 3 green
7 3 in
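The list comprehension above goes through to_dict('index'), which needs the col1 values to be unique. A possible variant that iterates over the rows directly and so avoids that requirement is sketched below; it assumes the frame has exactly the columns col1, col2, col3, and the name `rows` is introduced here.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': [1, 2, 3],
    'col2': [['blue'], ['green', 'green'], ['green']],
    'col3': [['in', 'out'], ['in'], ['in']],
})

# Each row unpacks into the key and the remaining list-valued cells.
rows = [(key, item)
        for key, *lists in df[['col1', 'col2', 'col3']].itertuples(index=False)
        for lst in lists      # col2 list first, then col3 list
        for item in lst]      # one output row per list element

df1 = pd.DataFrame(rows, columns=['col1', 'value'])
print(df1)
```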
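Another solution could be to build col1 with its values repeated as needed, and to concatenate the lists in df['col2'] and df['col3'] to make the value column. The code is below: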
df_final = pd.DataFrame(
    {
        'col1': [
            i for i, sublist in zip(df['col1'], (df['col2'] + df['col3']).values)
            for val in range(len(sublist))
        ],
        'value': sum((df['col2'] + df['col3']).values, [])
    }
)
print(df_final)
col1 value
0 1 blue
1 1 in
2 1 out
3 2 green
4 2 green
5 2 in
6 3 green
7 3 in
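Flattening with sum(..., []) copies the growing list on every addition, which can get slow on large frames. A sketch of the same idea using itertools.chain and numpy.repeat follows; it is a swapped-in technique, not the answer's own code, and the name `combined` is introduced here.

```python
from itertools import chain

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': [1, 2, 3],
    'col2': [['blue'], ['green', 'green'], ['green']],
    'col3': [['in', 'out'], ['in'], ['in']],
})

# Adding two object columns of lists concatenates the lists row by row.
combined = df['col2'] + df['col3']

df_final = pd.DataFrame({
    # Repeat each col1 value once per item in that row's combined list.
    'col1': np.repeat(df['col1'].to_numpy(), combined.map(len).to_numpy()),
    # Flatten the per-row lists into one long list in a single pass.
    'value': list(chain.from_iterable(combined)),
})
print(df_final)
```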
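Or with plain loops:
d = []
c = []
# positional lookups like df['col2'][i] assume the default 0..n-1 index
for i in range(len(df)):
    # collect the items of both list columns for row i
    d.append([j for j in df['col2'][i]])
    d.append([j for j in df['col3'][i]])
    # repeat the col1 value once per collected item
    c.append([df['col1'][i]] * (len(df['col2'][i]) + len(df['col3'][i])))
# flatten the nested lists
d = [i for sublist in d for i in sublist]
c = [i for sublist in c for i in sublist]
df1 = pd.DataFrame()
df1['col1'] = c
df1['value'] = d
df = df1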