Join data from one column into another column as separate rows
I have a pandas DataFrame like this:
Year1 Year2 Total
0 2010 2011 2500
1 2012 2013 3000
2 2014 2015 4000
I want to take the data in the Year2 column and merge it into the Year1 column, keeping the Total value associated with each row. The result should look like this:
Year1 Total
0 2010 2500
1 2011 2500
2 2012 3000
3 2013 3000
4 2014 4000
5 2015 4000
I first considered duplicating the rows of df, so that the 'Total' value appears a second time for 2011, 2013 and 2015:
df = pd.DataFrame(np.repeat(df.values, 2, axis=0))
df.columns = ['Year1', 'Year2', 'Total']
but I'm still unsure how to merge the column data from Year2 into Year1.
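You can achieve the desired output with pd.concat (the replacement for DataFrame.append, which was removed in pandas 2.0) after a few preparatory steps:
import pandas as pd
df = pd.read_csv('df.txt')
# Pair each Year2 value with its Total, relabelled as Year1 so the columns line up
newDf = df[["Year2", "Total"]].rename(columns={"Year2": "Year1"})
df = df.drop(columns=["Year2"])
# Stack the Year1 rows and the relabelled Year2 rows, then order by year
resultDf = pd.concat([df, newDf])
resultDf.sort_values("Year1")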
Year1 Total
2010 2500
2011 2500
2012 3000
2013 3000
2014 4000
2015 4000
You could melt it:
out = (pd.melt(df, id_vars=['Total']).rename(columns={'value':'Year1'})
.drop(columns='variable')[['Year1', 'Total']]
.sort_values(by='Year1').reset_index(drop=True))
or use set_index with "Total" + unstack:
out = (df.set_index('Total').unstack().droplevel(0)
.reset_index(name='Year1')[['Year1', 'Total']]
.sort_values(by='Year1').reset_index(drop=True))
Output:
Year1 Total
0 2010 2500
1 2011 2500
2 2012 3000
3 2013 3000
4 2014 4000
5 2015 4000
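The np.repeat idea from the question can also be carried through to the end. The following is only a sketch, assuming the sample data shown in the question (built inline here rather than read from a file): the two year columns are flattened row by row, and each Total is repeated twice so that it lines up with both of its years.

```python
import numpy as np
import pandas as pd

# Sample data from the question, built inline so the sketch is self-contained.
df = pd.DataFrame({
    "Year1": [2010, 2012, 2014],
    "Year2": [2011, 2013, 2015],
    "Total": [2500, 3000, 4000],
})

out = pd.DataFrame({
    # Flatten the two year columns row by row: Year1, Year2, Year1, Year2, ...
    "Year1": df[["Year1", "Year2"]].to_numpy().ravel(),
    # Repeat each Total twice so it pairs with both years of its original row.
    "Total": np.repeat(df["Total"].to_numpy(), 2),
})
print(out)
```

Unlike the approaches above, this keeps the original row order instead of sorting by year, so add sort_values("Year1") if the input is not already in chronological order.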