The column label 'call_id' is not unique. For a multi-index, the label must be a tuple with elements corresponding to each level

Question

I have two pandas DataFrames that I need to merge on call_id. I have done this before with other DataFrames. However, this time when I try

df = pd.merge(labels, sequences, on = "call_id")

I get

The column label 'call_id' is not unique.
For a multi-index, the label must be a tuple with elements corresponding to each level.

In [231]: labels
Out[231]: 
                      call_id                        confidences
1    6081bdea52c838000aaa53d3   {'1': 0.27, '2': 0.68, '0': 0.5}
2    6081c27bde933a000a4384b0             {'1': 0.73, '2': 0.27}
3    6081c54dd12abf000ab3c6f5             {'0': 0.66, '1': 0.67}
4    6081c666d7a1f7001cecce98             {'0': 0.22, '1': 0.82}
5    6081d8576eb5530043e3401f  {'2': 0.33, '1': 0.66, '0': 0.23}
..                        ...                                ...
480              transcript96             {'0': 0.38, '1': 0.73}
481              transcript97             {'0': 0.78, '2': 0.31}
482              transcript98             {'1': 0.65, '0': 0.46}
483              transcript99             {'2': 0.29, '1': 0.79}
484                     trsc1  {'0': 0.42, '2': 0.27, '1': 0.44}

[484 rows x 2 columns]

In [232]: sequences
Out[232]: 
                      call_id                                         sentiments
1    6081c27bde933a000a4384b0                     PENNNNNEENNPNPEPNPPNNNNNNNNNNN
2    6081c54dd12abf000ab3c6f5                                    NNPNNNPNNNPPNNN
3    6081c666d7a1f7001cecce98                                            NNNNNPP
4    6081d8576eb5530043e3401f  NNNNPNNNNNNNNNNNNNNNNNNPPNNNNNNNNNENNNNNNENNNN...
5    6081d8fb0ef716000a2ef933                 NNNNENNNPNEEENNNNNNNNNNNNNNNNNNPNE
..                        ...                                                ...
465              transcript96                                                NPN
466              transcript97      NNNNNEENNNNENPNNNNENNNNNPNNPNNNNNNNNPENNNPPPP
467              transcript98                 NNNNNNNNENNNPPNNNENNENNENNNENENNNP
468              transcript99                                              PENNN
469                     trsc1                                        NPNPEENEPPN

[469 rows x 2 columns]

Answer 1

Try calling merge as a DataFrame method and stating the join type explicitly:

labels.merge(sequences, how='inner', on='call_id')

Please read about the how= parameter here: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.merge.html to be sure you understand the different options (keep all rows, keep only rows present in the left or right DataFrame, etc.)
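To make the how= options concrete, here is a minimal sketch on two tiny made-up frames. The data below is hypothetical and only mimics the shape of labels and sequences from the question; it is not the asker's data. The last loop is an extra check that is only an assumption about where to look next: the error text suggests pandas found more than one column matching 'call_id', so it may be worth confirming that each frame's column labels are unique and single-level.

```python
import pandas as pd

# Hypothetical miniature versions of the two frames from the question.
labels = pd.DataFrame({
    "call_id": ["a1", "b2", "c3"],
    "confidences": [{"1": 0.27}, {"1": 0.73}, {"0": 0.66}],
})
sequences = pd.DataFrame({
    "call_id": ["b2", "c3", "d4"],
    "sentiments": ["PENNN", "NNPNN", "NNNNE"],
})

# Same key, four join types.
inner = labels.merge(sequences, how="inner", on="call_id")  # ids present in both
left = labels.merge(sequences, how="left", on="call_id")    # every row of labels
right = labels.merge(sequences, how="right", on="call_id")  # every row of sequences
outer = labels.merge(sequences, how="outer", on="call_id")  # ids from either frame

print(len(inner), len(left), len(right), len(outer))  # 2 3 3 4

# Assumed diagnostic: duplicated or multi-level column labels can make
# a merge key ambiguous, so inspect both frames before merging.
for name, frame in [("labels", labels), ("sequences", sequences)]:
    print(name, frame.columns.is_unique, frame.columns.nlevels)
```

Rows without a match in the other frame get NaN in the columns that come from it, which is why the left, right and outer results are longer than the inner one. Note that inner is also the join type pandas uses when how= is not given.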

solution1
0 2022-01-17 07:44:06
