Append column to pandas dataframe
This is probably easy, but I have the following data:
In data frame 1:
index dat1
0 9
1 5
In data frame 2:
index dat2
0 7
1 6
I want a data frame with the following form:
index dat1 dat2
0 9 7
1 5 6
I've tried using the append method, but I get a cross join (i.e. Cartesian product).
What's the right way to do this?
It seems in general you're just looking for a join:
> dat1 = pd.DataFrame({'dat1': [9,5]})
> dat2 = pd.DataFrame({'dat2': [7,6]})
> dat1.join(dat2)
dat1 dat2
0 9 7
1 5 6
You can also use:
dat1 = pd.concat([dat1, dat2], axis=1)
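As a rough illustration of why the `axis=1` argument matters here, the sketch below compares the default row-wise stacking with column-wise concatenation. It assumes pandas is imported as `pd`; the names `stacked` and `side_by_side` are made up for this example.

```python
import pandas as pd

dat1 = pd.DataFrame({'dat1': [9, 5]})
dat2 = pd.DataFrame({'dat2': [7, 6]})

# Default axis=0 stacks rows: 4 rows, with NaN where a column is missing.
stacked = pd.concat([dat1, dat2])

# axis=1 places the frames side by side, aligned on the index: 2 rows.
side_by_side = pd.concat([dat1, dat2], axis=1)

print(stacked.shape)       # (4, 2)
print(side_by_side.shape)  # (2, 2)
```

The row-wise result is likely what the append attempt in the question produced: more rows than expected, half of them filled with NaN.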
Both join() and concat() can solve the problem. However, there is one caveat I have to mention: reset the index before you join() or concat() if your DataFrames were produced by selecting rows from another DataFrame, because both methods align rows on the index, not on position.
The example below shows some interesting behavior of join and concat:
dat1 = pd.DataFrame({'dat1': range(4)})
dat2 = pd.DataFrame({'dat2': range(4,8)})
dat1.index = [1,3,5,7]
dat2.index = [2,4,6,8]
# Way 1: join the two DataFrames
print(dat1.join(dat2))
# output
dat1 dat2
1 0 NaN
3 1 NaN
5 2 NaN
7 3 NaN
# Way 2: concat the two DataFrames
print(pd.concat([dat1,dat2],axis=1))
# output
dat1 dat2
1 0.0 NaN
2 NaN 4.0
3 1.0 NaN
4 NaN 5.0
5 2.0 NaN
6 NaN 6.0
7 3.0 NaN
8 NaN 7.0
# Reset the index
dat1 = dat1.reset_index(drop=True)
dat2 = dat2.reset_index(drop=True)
# Both ways now give the same result
print(dat1.join(dat2))
dat1 dat2
0 0 4
1 1 5
2 2 6
3 3 7
print(pd.concat([dat1,dat2],axis=1))
dat1 dat2
0 0 4
1 1 5
2 2 6
3 3 7
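The caveat usually shows up after filtering, because row selection keeps the original index labels. Here is a minimal sketch of that situation, assuming pandas is imported as `pd`; the frame `source` and its `value` column are invented for illustration.

```python
import pandas as pd

source = pd.DataFrame({'value': [10, 11, 12, 13, 14, 15]})

# Row selection keeps the original index labels.
evens = source[source['value'] % 2 == 0].rename(columns={'value': 'even'})  # index 0, 2, 4
odds = source[source['value'] % 2 == 1].rename(columns={'value': 'odd'})    # index 1, 3, 5

# The two frames share no index labels, so the 'odd' column is all NaN.
misaligned = evens.join(odds)

# Resetting both indexes pairs the rows by position instead.
aligned = evens.reset_index(drop=True).join(odds.reset_index(drop=True))
print(aligned)
```

Note that resetting the index pairs rows purely by position, so it is only appropriate when the two frames are already in matching order.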
Perhaps too simple, but anyway...
dat1 = pd.DataFrame({'dat1': [9,5]})
dat2 = pd.DataFrame({'dat2': [7,6]})
dat1['dat2'] = dat2 # Uses indices from dat1
Result:
dat1 dat2
0 9 7
1 5 6
You can assign a new column. Use indices to align corresponding rows:
df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [10, 20, 30]}, index=[0, 1, 2])
df2 = pd.DataFrame({'C': [100, 200, 300]}, index=[1, 2, 3])
df1['C'] = df2['C']
Result:
A B C
0 1 10 NaN
1 2 20 100.0
2 3 30 200.0
Ignore indices:
df1['C'] = df2['C'].reset_index(drop=True)
Result:
A B C
0 1 10 100
1 2 20 200
2 3 30 300
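The `reset_index(drop=True)` trick works here because `df1` already has the index 0, 1, 2; with any other index on `df1`, the assignment would still align on labels. One possible alternative, sketched below under the assumption that both frames have the same number of rows, is to assign the raw values, which carry no index at all:

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [10, 20, 30]}, index=[0, 1, 2])
df2 = pd.DataFrame({'C': [100, 200, 300]}, index=[1, 2, 3])

# A NumPy array carries no index, so values are assigned by position.
# The lengths must match, otherwise pandas raises a ValueError.
df1['C'] = df2['C'].to_numpy()
print(df1)
```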
Just a matter of the right google search:
# DataFrame.append was removed in pandas 2.0; pd.concat stacks the rows the same way
data = pd.concat([dat_1, dat_2])
data = data.groupby(data.index).sum()
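A self-contained sketch of this stack-then-group approach, assuming `dat_1` and `dat_2` hold the data from the question and using `pd.concat` for the stacking step:

```python
import pandas as pd

dat_1 = pd.DataFrame({'dat1': [9, 5]})
dat_2 = pd.DataFrame({'dat2': [7, 6]})

# Stack the rows; each frame gets NaN in the column it lacks.
data = pd.concat([dat_1, dat_2])

# Collapse rows that share an index label; sum() skips the NaN entries.
data = data.groupby(data.index).sum()
print(data)
```

Two things to keep in mind with this route: the NaN padding turns the columns into floats, and an index label missing from one frame comes out as 0 rather than NaN, so it only suits numeric data where that is acceptable.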