How to merge several columns into one column with several records using Python and pandas?
I have data that I need to transform so that it has 2 columns instead of 4:
data = [['123', 'Billy', 'Bill', 'Bi'],
['234', 'James', 'J', 'Ji'],
['543', 'Floyd', 'Flo', 'F'],
]
processed_data = ?
needed_df = pandas.DataFrame(processed_data, columns=['Number', 'Name'])
I expect the following result:
['123', 'Billy']
['123', 'Bill']
['123', 'Bi']
['234', 'James']
['234', 'J']
['234', 'Ji']
I tried a nested for loop, but I get the wrong result:
for row in df.iterrows():
    for col in df.columns:
        new_row = ...
        processed_df = pandas.concat(df, new_row)
This construction produces a result that is far too large.
A similar question using SQL:
How to split several columns into one column with several records in SQL?
Let's use a list comprehension to create pairs of Number and Name, then build a new DataFrame from them:
pd.DataFrame([[x, z] for x, *y in data for z in y], columns=['Number', 'Name'])
  Number   Name
0    123  Billy
1    123   Bill
2    123     Bi
3    234  James
4    234      J
5    234     Ji
6    543  Floyd
7    543    Flo
8    543      F
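For readers less familiar with starred unpacking, the comprehension above can be unrolled into explicit loops. This is only a sketch of the same idea; the names `pairs`, `number`, `names`, and `name` are illustrative and not from the original answer.

```python
import pandas as pd

data = [['123', 'Billy', 'Bill', 'Bi'],
        ['234', 'James', 'J', 'Ji'],
        ['543', 'Floyd', 'Flo', 'F']]

pairs = []
# "number, *names" puts the first item of each row into number
# and collects the remaining items into the list names.
for number, *names in data:
    for name in names:
        pairs.append([number, name])

needed_df = pd.DataFrame(pairs, columns=['Number', 'Name'])
print(needed_df)
```

Because the pairs are built in plain Python before the DataFrame is created, nothing is concatenated inside the loop, which avoids the ever-growing result described in the question.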
Or, you can convert your existing data into a DataFrame and then reshape it with melt:
import pandas as pd
data = [['123', 'Billy', 'Bill', 'Bi'],
['234', 'James', 'J', 'Ji'],
['543', 'Floyd', 'Flo', 'F'],
]
df = pd.DataFrame(data)
df.melt(0).sort_values(0)
Output:
     0 variable  value
0  123        1  Billy
3  123        2   Bill
6  123        3     Bi
1  234        1  James
4  234        2      J
7  234        3     Ji
2  543        1  Floyd
5  543        2    Flo
8  543        3      F
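The melt output still carries the helper column `variable` and the original index, so it is not yet the two-column Number/Name frame the question asks for. One possible way to get there is sketched below. The column labels `n1`, `n2`, `n3` are hypothetical placeholders, and `kind='mergesort'` is chosen here only because it is a stable sort that keeps the names in their original order within each number.

```python
import pandas as pd

data = [['123', 'Billy', 'Bill', 'Bi'],
        ['234', 'James', 'J', 'Ji'],
        ['543', 'Floyd', 'Flo', 'F']]

# n1..n3 are made-up labels for the three name columns.
df = pd.DataFrame(data, columns=['Number', 'n1', 'n2', 'n3'])

needed_df = (
    df.melt(id_vars='Number', value_name='Name')  # long format: Number, variable, Name
      .sort_values('Number', kind='mergesort')    # stable sort groups rows by Number
      .drop(columns='variable')                   # drop the helper column added by melt
      .reset_index(drop=True)                     # renumber rows 0..8
)
print(needed_df)
```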