
The column label 'call_id' is not unique. For a multi-index, the label must be a tuple with elements corresponding to each level

I have two pandas DataFrames that I need to merge on call_id. I have done this before with other DataFrames. However, this time when I try

df = pd.merge(labels, sequences, on = "call_id")

I get

The column label 'call_id' is not unique.
For a multi-index, the label must be a tuple with elements corresponding to each level.

The two DataFrames look like this:

In [231]: labels
Out[231]: 
                      call_id                        confidences
1    6081bdea52c838000aaa53d3   {'1': 0.27, '2': 0.68, '0': 0.5}
2    6081c27bde933a000a4384b0             {'1': 0.73, '2': 0.27}
3    6081c54dd12abf000ab3c6f5             {'0': 0.66, '1': 0.67}
4    6081c666d7a1f7001cecce98             {'0': 0.22, '1': 0.82}
5    6081d8576eb5530043e3401f  {'2': 0.33, '1': 0.66, '0': 0.23}
..                        ...                                ...
480              transcript96             {'0': 0.38, '1': 0.73}
481              transcript97             {'0': 0.78, '2': 0.31}
482              transcript98             {'1': 0.65, '0': 0.46}
483              transcript99             {'2': 0.29, '1': 0.79}
484                     trsc1  {'0': 0.42, '2': 0.27, '1': 0.44}

[484 rows x 2 columns]

In [232]: sequences
Out[232]: 
                      call_id                                         sentiments
1    6081c27bde933a000a4384b0                     PENNNNNEENNPNPEPNPPNNNNNNNNNNN
2    6081c54dd12abf000ab3c6f5                                    NNPNNNPNNNPPNNN
3    6081c666d7a1f7001cecce98                                            NNNNNPP
4    6081d8576eb5530043e3401f  NNNNPNNNNNNNNNNNNNNNNNNPPNNNNNNNNNENNNNNNENNNN...
5    6081d8fb0ef716000a2ef933                 NNNNENNNPNEEENNNNNNNNNNNNNNNNNNPNE
..                        ...                                                ...
465              transcript96                                                NPN
466              transcript97      NNNNNEENNNNENPNNNNENNNNNPNNPNNNNNNNNPENNNPPPP
467              transcript98                 NNNNNNNNENNNPPNNNENNENNENNNENENNNP
468              transcript99                                              PENNN
469                     trsc1                                        NPNPEENEPPN

[469 rows x 2 columns]

Try calling merge as a method of the DataFrame instead:

labels.merge(sequences, how='inner', on='call_id')
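
If the same error still appears with this call, the column labels themselves may be the cause: pandas raises this message when 'call_id' matches more than one column, and it adds the multi-index sentence when the columns form a MultiIndex. The following is a minimal sketch of how one might check for that. `describe_columns` is a hypothetical helper and the frames are made-up data, not the data from the question.

```python
import pandas as pd


def describe_columns(df: pd.DataFrame, key: str) -> dict:
    """Report column properties that can make `key` ambiguous in a merge.

    Hypothetical helper for illustration; it is not part of pandas.
    """
    return {
        "is_multiindex": isinstance(df.columns, pd.MultiIndex),
        "has_duplicates": not df.columns.is_unique,
        # Number of columns whose first-level label equals `key`
        "key_matches": int((df.columns.get_level_values(0) == key).sum()),
    }


# Made-up frame whose columns contain 'call_id' twice
bad = pd.DataFrame([["a", "a", 1]], columns=["call_id", "call_id", "x"])
report = describe_columns(bad, "call_id")
print(report)

# One possible cleanup: keep only the first column for each label
fixed = bad.loc[:, ~bad.columns.duplicated()]
print(list(fixed.columns))

# Made-up frame with MultiIndex columns, as often produced by groupby/agg
multi = pd.DataFrame(
    [["a", 1]],
    columns=pd.MultiIndex.from_tuples([("call_id", ""), ("x", "mean")]),
)
multi_report = describe_columns(multi, "call_id")
print(multi_report)
```

If `is_multiindex` is True, the key would have to be passed as a tuple such as `("call_id", "")`, as the error message says, or the columns would need to be flattened before merging.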

See the how parameter in the documentation at https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.merge.html to make sure you understand the different options (keep all rows, keep only rows from the left or the right DataFrame, and so on).
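
Since labels has 484 rows and sequences has 469, the choice of how affects which call_ids survive the merge. A small sketch with made-up rows (the ids and values below are assumptions for illustration only) shows the difference:

```python
import pandas as pd

# Toy stand-ins for the two frames in the question
labels = pd.DataFrame(
    {"call_id": ["a", "b", "c"], "confidences": [0.1, 0.2, 0.3]}
)
sequences = pd.DataFrame(
    {"call_id": ["b", "c", "d"], "sentiments": ["NP", "PEN", "NN"]}
)

# inner: only call_ids present in both frames
inner = labels.merge(sequences, how="inner", on="call_id")

# left: every row of labels, with NaN where sequences has no match
left = labels.merge(sequences, how="left", on="call_id")

# outer: every call_id from either frame; indicator=True adds a
# '_merge' column saying where each row came from
outer = labels.merge(sequences, how="outer", on="call_id", indicator=True)

print(len(inner), len(left), len(outer))
print(outer)
```

The `_merge` column from `indicator=True` can be a quick way to see which call_ids are missing from one side.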

