Searching a filtered dataframe for a specific string and then creating a new column based on the results (Python/Pandas)

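I am trying to filter my dataframe (hospital) for cases where the "brain bleeding" column is True. Then I want to search the brain_info column for a specific word ("cancer") and create a new column containing that word ("cancer").

I have done this before without the filtering step, but I am having trouble in this case.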

#What I have (final diagnosis is the column I want to add)

| brain bleeding | brain info   | final diagnosis |
|----------------|--------------|-----------------|
| True           | BlahBlahBlah |                 |
| True           | Cancer       | Cancer          |
| False          | Cancer       |                 |


#Creating an empty column in my dataframe for the final diagnosis.
hospital["final_diagnosis"] = ""

#Filter cases where brain bleeding is True
filt = (hospital["brain_bleeding"] == True)

#Search for the filtered cases if the diagnosis contains "cancer" and adds it to the corresponding "final_diagnosis" cell, if it is there. This is where my error is?
hospital.loc[filt, 'brain_info'].str.contains("cancer", case=False, na=False), "final diagnosis"] = "cancer"

Can someone help me? Thanks.

Assuming your file is:

brain_bleeding  brain_info
True            BlahBlahBlah
True            Cancer
False           Cancer

You can try the following:

#!/usr/bin/python
# -*- coding: utf-8 -*-
import pandas as pd

hospital = pd.read_csv('file.csv', sep='\t')

# add True to final_diagnosis column if brain is bleeding and brain info is cancer
hospital.loc[(hospital['brain_bleeding'] == True) & 
             (hospital['brain_info'] == 'Cancer'), 'final_diagnosis'] = True
# replace NaN with empty strings (assign back rather than using a chained inplace call)
hospital['final_diagnosis'] = hospital['final_diagnosis'].fillna('')

print(hospital)

Output:


   brain_bleeding    brain_info final_diagnosis
0            True  BlahBlahBlah                
1            True        Cancer            True
2           False        Cancer        

Note: I added two conditions based on the final_diagnosis column in your example. It looks like you may only need one of them, so feel free to remove either one.
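If the goal is the case-insensitive substring match from the question, with the word "cancer" written into the new column, one possible variant is to build both boolean masks on the full dataframe and combine them with `&` inside a single `.loc` call. This is only a minimal sketch: it assumes the column names `brain_bleeding`, `brain_info`, and `final_diagnosis`, and it uses an inline sample dataframe in place of file.csv.

```python
import pandas as pd

# Sample data mirroring the example in the question (assumed, stands in for file.csv)
hospital = pd.DataFrame({
    "brain_bleeding": [True, True, False],
    "brain_info": ["BlahBlahBlah", "Cancer", "Cancer"],
})

# Start with an empty string in every row of the new column
hospital["final_diagnosis"] = ""

# Mask 1: rows where brain_bleeding is True (the column is already boolean)
bleeding = hospital["brain_bleeding"]

# Mask 2: rows whose brain_info mentions "cancer";
# case=False ignores case, na=False treats missing values as no match
has_cancer = hospital["brain_info"].str.contains("cancer", case=False, na=False)

# Both masks share the dataframe's index, so they can be combined with &
hospital.loc[bleeding & has_cancer, "final_diagnosis"] = "cancer"

print(hospital)
```

Both masks are computed on the whole dataframe rather than on a filtered slice, which keeps the row selector and the column label together inside one pair of `.loc[...]` brackets.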
