
Pandas: Copy column-wise values to rows in second dataframe (rephrased)

Rephrasing this a bit, sorry it was a bit unclear earlier.

Consider this data:

import pandas as pd

d = {
    "id": [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3],
    "col_to_fill": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
}

df = pd.DataFrame(d)

df

    id  col_to_fill
0   1   0
1   1   0
2   1   0
3   1   0
4   1   0
5   2   0
6   2   0
7   2   0
8   2   0
9   2   0
10  3   0
11  3   0
12  3   0
13  3   0
14  3   0  



d1 = {
    "id": [1, 2, 3],
    "val1": [23, 23, 0],
    "val2": [42, 44, 9],
    "val3": [12, 8, 55],
    "val4": [2, 88, 21],
    "val5": [53, 2, 33]
}

df2 = pd.DataFrame(d1)

df2

   id  val1  val2  val3  val4  val5
0   1    23    42    12     2    53
1   2    23    44     8    88     2
2   3     0     9    55    21    33

..............

In df, each unique ID is repeated down N rows (N=5 here, but I want it to work for N=21 in the real use case, or for any N if possible).

In df2 I have the same unique IDs in the first column, and some values laid out column-wise (val1, val2, etc.).

Goal:

  • I want the values of val1, val2, val3, val4, val5 written down the rows of df, one value per row, for each ID (instrument).

For IDs 1 and 2:

id  col_to_fill
1   23
1   42
1   12
1   2
1   53
2   23
2   44
2   8
2   88
2   2

And so on......

This won't work:

df2.melt(id_vars=['id'])

Because melt orders the result by variable rather than by id:

id  variable
1   val1
2   val1
3   val1
1   val2
2   val2
3   val2
1   val3
2   val3
3   val3
1   val4

I need:

id  variable
1   val1
1   val2
1   val3
1   val4
1   val5
2   val1
2   val2
2   val3
2   val4
2   val5

(along with the actual values, of course; what I want in the end are the values of those variables)

Use DataFrame.set_index with DataFrame.stack:

df2 = df2.set_index('id').stack().rename_axis(['id','new']).reset_index().drop(0, axis=1)
print(df2)
    id   new
0    1  val1
1    1  val2
2    1  val3
3    1  val4
4    1  val5
5    2  val1
6    2  val2
7    2  val3
8    2  val4
9    2  val5
10   3  val1
11   3  val2
12   3  val3
13   3  val4
14   3  val5
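
Note that the snippet above drops the stacked values (column 0) and keeps only the variable names. If the values themselves are what should end up in col_to_fill, as the question describes, a minimal sketch could keep them instead. The names long_df, variable and value are illustrative assumptions, not part of the original answer:

```python
import pandas as pd

df2 = pd.DataFrame({
    "id": [1, 2, 3],
    "val1": [23, 23, 0],
    "val2": [42, 44, 9],
    "val3": [12, 8, 55],
    "val4": [2, 88, 21],
    "val5": [53, 2, 33],
})

# stack() walks each row left to right, so the result is grouped by id
# with val1..val5 in their original column order.
long_df = (
    df2.set_index("id")
       .stack()
       .rename_axis(["id", "variable"])
       .reset_index(name="value")   # keep the values, name the column
)
print(long_df)
```

Here long_df has one row per (id, variable) pair, with the number in the value column, so both the name and the value are available for later steps.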

Then, if you need to add the result to another DataFrame of a different size, it is possible to use GroupBy.cumcount with a DataFrame.merge trick:

d = {
    "id": [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3],
    "col_to_fill": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
}

df = pd.DataFrame(d)

df['g'] = df.groupby('id').cumcount()
df2['g'] = df2.groupby('id').cumcount()

df = df.merge(df2, on=['id','g'])
print(df)
    id  col_to_fill  g   new
0    1            0  0  val1
1    1            0  1  val2
2    1            0  2  val3
3    1            0  3  val4
4    2            0  0  val1
5    2            0  1  val2
6    2            0  2  val3
7    2            0  3  val4
8    3            0  0  val1
9    3            0  1  val2
10   3            0  2  val3
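
The same cumcount and merge idea can be sketched end to end so that col_to_fill receives the numbers rather than the variable names. This is only a sketch under assumptions: long_df, out, variable and value are illustrative names, and how="left" is chosen here so that rows of df without a partner in df2 are kept (the default inner merge would drop them):

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3],
    "col_to_fill": [0] * 11,
})
df2 = pd.DataFrame({
    "id": [1, 2, 3],
    "val1": [23, 23, 0],
    "val2": [42, 44, 9],
    "val3": [12, 8, 55],
    "val4": [2, 88, 21],
    "val5": [53, 2, 33],
})

# Long format: one row per (id, variable), keeping the values.
long_df = (
    df2.set_index("id")
       .stack()
       .rename_axis(["id", "variable"])
       .reset_index(name="value")
)

# Position of each row within its id group: 0, 1, 2, ...
df["g"] = df.groupby("id").cumcount()
long_df["g"] = long_df.groupby("id").cumcount()

# Match the n-th row of each id in df with the n-th value of that id.
out = df.merge(long_df, on=["id", "g"], how="left")
out["col_to_fill"] = out["value"]
out = out.drop(columns=["g", "variable", "value"])
print(out)
```

Because the match is on position within each id, df may have fewer rows per id than df2 has value columns; the surplus values are simply not used.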

