简体   繁体   English

如何将 Pandas 中的多个列合并为一个系列?

[英]How to merge multiple columns in Pandas with a single series?

I have a dataframe (t) of codes for patients surgeries.我有一个 dataframe (t) 患者手术代码。 On any hospital admission they can have 5 surgeries or combination of surgeries, - the index on the left column is the indiviudal patient.在任何一次入院时,他们都可以进行 5 次手术或组合手术,- 左栏中的索引是个体患者。 I want to add the text description for all 5 surgeries to indivisual new columns.我想将所有 5 个手术的文本描述添加到单独的新列中。


        | OPERTN_01 | OPERTN_02 | OPERTN_03 | OPERTN_04 | OPERTN_05 |
| ------ | --------- | --------- | --------- | --------- | --------- |
| 85     | B041      | Y766      | Z943      | NaN       | NaN       |
| 144    | B041      | Y766      | Y539      | NaN       | NaN       |
| 260    | B041      | Y766      | NaN       | NaN       | NaN       |
| 276    | B041      | Y766      | NaN       | NaN       | NaN       |
| 345    | B041      | Y766      | NaN       | NaN       | NaN       |
| ...    | ...       | ...       | ...       | ...       | ...       |
| 557445 | B041      | Y461      | L714      | Z954      | Z942      |
| 557525 | B041      | Y766      | NaN       | NaN       | NaN       |
| 557533 | B041      | Y766      | E158      | Y766      | Y261      |
| 557765 | B041      | Y766      | NaN       | NaN       | NaN       |
| 557832 | B041      | Y766      | U051      | Y973      | Y981      |


I want to merge the code from all 5 columns ( also handing the null values) with the text description from this dataframe (opcs_short)我想将所有 5 列的代码(也传递 null 值)与来自这个 dataframe (opcs_short) 的文本描述合并

|      | opcs_4.9str | Description                                       |
| 0    | A011        | A01.1: Hemispherectomy                            |
| 1    | A012        | A01.2: Total lobectomy of brain                   |
| 2    | A013        | A01.3: Partial lobectomy of brain                 |
| 3    | A018        | A01.8: Other specified major excision of tissu... |
| 4    | A019        | A01.9: Unspecified major excision of tissue of... |
| ...  | ...         | ...                                               |
| 9673 | O439        | O43.9: Part of heart NEC                          |
| 9674 | O451        | O45.1: Bifurcation of aorta                       |
| 9675 | O452        | O45.2: Juxtarenal abdominal aorta                 |
| 9676 | O458        | O45.8: Specified other aorta NEC                  |
| 9677 | O459        | O45.9: Other aorta NEC                            |

I tried to do this using this code,我试着用这段代码来做到这一点,

t2 = t.merge(opcs_short, how = 'left',  left_on =['OPERTN_01','OPERTN_02', 'OPERTN_03', 'OPERTN_04', 'OPERTN_05'],  
              right_on =['opcs_4.9str','opcs_4.9str','opcs_4.9str','opcs_4.9str','opcs_4.9str'])

This produces这产生


|      | OPERTN_01 | OPERTN_02 | OPERTN_03 | OPERTN_04 | OPERTN_05 | opcs_4.9str | Description |
| 0    | B041      | Y766      | Z943      | NaN       | NaN       | NaN         | NaN         |
| 1    | B041      | Y766      | Y539      | NaN       | NaN       | NaN         | NaN         |
| 2    | B041      | Y766      | NaN       | NaN       | NaN       | NaN         | NaN         |
| 3    | B041      | Y766      | NaN       | NaN       | NaN       | NaN         | NaN         |
| 4    | B041      | Y766      | NaN       | NaN       | NaN       | NaN         | NaN         |
| ...  | ...       | ...       | ...       | ...       | ...       | ...         | ...         |
| 6410 | B041      | Y461      | L714      | Z954      | Z942      | NaN         | NaN         |
| 6411 | B041      | Y766      | NaN       | NaN       | NaN       | NaN         | NaN         |
| 6412 | B041      | Y766      | E158      | Y766      | Y261      | NaN         | NaN         |
| 6413 | B041      | Y766      | NaN       | NaN       | NaN       | NaN         | NaN         |
| 6414 | B041      | Y766      | U051      | Y973      | Y981      | NaN         | NaN         |

So nothing has merged.所以什么都没有合并。 I am not sure why but I know I haven't handled the null values.我不确定为什么,但我知道我没有处理 null 值。 Some patients have only one simple surgery and the rest of the columns are empty so I don't want to drop them.有的病人只有一个简单的手术,rest栏是空的,我不想掉。 TBH I am not sure using merge is the right approach here but don't have enough knowledge to know if a eg dictionary technique would be a better way. TBH 我不确定在这里使用合并是正确的方法但是没有足够的知识来知道例如字典技术是否是更好的方法。 The code description dataframe though has 19,000 records.代码描述 dataframe 虽然有 19,000 条记录。

I think a dictionary-based replace method is what you're after.我认为基于字典的替换方法就是您所追求的。 Does the following achieve your desired result?以下是否达到您想要的结果?

code_map = {i[1]: i[2] for i in opcs_short.to_records()}

for col in t.columns:
    t[col + " Description"] = t[col].replace(code_map)

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM