Adding a column to a dataframe in pandas with proper indexing

I have two data frames, each with a single column. One data frame is "contained" in the other, in the sense that its index values and the corresponding data also appear in the other. In effect, the sub-data frame is just a filtered version of the first.

I want to add the data from the sub-data frame as a new column on the super-data frame, so that index values the two have in common carry the shared data and index values missing from the sub-data frame get np.nan.

So far I have df2 "contained" in df1 (i.e. df2 is the filtered data), and this is my attempt:

column1 = df1[0]
column2 = df2[0]

indx1 = df1.index
indx2 = df2.index

n = len(df1[column1])
m = len(df2[column2])

a = np.zeros(n)
a[:] = np.nan

df1['filtered'] = pd.Series(a, index=indx1)

i = 0
j = 0

while i < n:
    while j < m:
        if indx1.values[i] == indx2.values[j]:
            df1['filtered'].set_value(indx1[i], df2[column2].get_value(indx2[j]))
        t = j
        break

    i = i+1
    j = t+1

However, this is not working for me. It is syntactically correct (assuming I copied it here correctly), but it just runs forever. Any advice would be much appreciated.

Thanks

I'm guessing at what you want to achieve, but a left merge on the index looks like it does the job:

df1
Out[927]: 
          A
0  0.077544
1  0.450615
2  0.427897
3  0.729260
4  0.679355
5  0.275869
6  0.441755
7  0.996711
8  0.358979
9  0.552371

df2
Out[928]: 
          A
4  0.679355
5  0.275869
6  0.441755
7  0.996711

pd.merge(df1,df2,how='left',left_index=True,right_index=True)
Out[929]: 
        A_x       A_y
0  0.077544       NaN
1  0.450615       NaN
2  0.427897       NaN
3  0.729260       NaN
4  0.679355  0.679355
5  0.275869  0.275869
6  0.441755  0.441755
7  0.996711  0.996711
8  0.358979       NaN
9  0.552371       NaN
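
If only one extra column is wanted, the merge above can probably be simplified. The following is a minimal sketch under assumed data: the values and the names `filtered` and `A_filtered` are made up for illustration and do not come from the post. It repeats the left merge with explicit suffixes, then shows plain column assignment, which also aligns on the index.

```python
import pandas as pd

# Hypothetical data: df2 is a filtered subset of df1 (index labels 4..7).
df1 = pd.DataFrame({"A": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]})
df2 = df1.loc[4:7]

# Left merge on the index; explicit suffixes replace the default _x / _y.
merged = pd.merge(
    df1, df2,
    how="left", left_index=True, right_index=True,
    suffixes=("", "_filtered"),
)

# Assigning a Series to a new column aligns on the index, so labels
# that are missing from df2 end up as NaN without any explicit loop.
result = df1.copy()
result["filtered"] = df2["A"]
```

Both `merged` and `result` should keep every row of df1, with NaN wherever df2 has no matching index label.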

You can use concat in this way:

df = pd.concat([df1, df2], axis=1)

df1:
  Column 1
1     0001
2     0001
3     0002
4     0002
5     0003
6     0003
7     0003

df2:
  Column 2
3     0001
4     0001
5     0002

output:
  Column 1 Column 2
1     0001      NaN
2     0001      NaN
3     0002     0001
4     0002     0001
5     0003     0002
6     0003      NaN
7     0003      NaN
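
As a runnable sketch of the concat approach, the frames shown above can be rebuilt by hand (the constructor calls here are an assumption about how the data was created, with the values stored as strings):

```python
import pandas as pd

# Rebuild the two example frames; values are kept as strings.
df1 = pd.DataFrame(
    {"Column 1": ["0001", "0001", "0002", "0002", "0003", "0003", "0003"]},
    index=range(1, 8),
)
df2 = pd.DataFrame({"Column 2": ["0001", "0001", "0002"]}, index=[3, 4, 5])

# axis=1 places the frames side by side and aligns rows on the index.
# The default is an outer join, so unmatched labels get NaN.
df = pd.concat([df1, df2], axis=1)
print(df)
```

Because concat defaults to an outer join, any index label that exists only in df2 would also show up in the result. Here df2's index is a subset of df1's, so the output has exactly df1's rows. If that is not guaranteed, df1.join(df2) keeps only df1's index.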
