
Populate a new pandas dataframe column with names of other columns based on their row value

I want to add a new column to a dataframe whose values are the names of other columns, chosen by a condition.

import pandas as pd
data = pd.DataFrame({
'customer': ['bob', 'jerry', 'alice', 'susan'],
'internet_bill': ['paid', 'past_due', 'due_soon', 'past_due'],
'electric_bill': ['past_due', 'due_soon', 'past_due', 'paid'],
'water_bill': ['paid', 'past_due', 'paid', 'paid']})

Here's the dataframe.

    customer    internet_bill   electric_bill   water_bill
0   bob         paid            past_due        paid
1   jerry       past_due        due_soon        past_due
2   alice       due_soon        past_due        paid
3   susan       past_due        paid            paid

I want to add a new column summarizing which bills are 'past_due'. Here's the desired result:

    customer    internet_bill   electric_bill   water_bill  past_due
0   bob         paid            past_due        paid        electric_bill
1   jerry       past_due        due_soon        past_due    internet_bill, water_bill
2   alice       due_soon        past_due        paid        electric_bill
3   susan       past_due        paid            paid        internet_bill

I was able to do this in Excel with the following formula:

=TEXTJOIN(","&CHAR(10),TRUE,
IF(B2=Values!$A$1,$K$1,""),
IF(C2=Values!$A$1,$L$1,""),
IF(D2=Values!$A$1,$M$1,""))

Ultimately, my output will be an Excel file for some nurses and hospital workers to follow up with patients (not bill collecting; this is for patient care). I have thought about using an Excel writer library to just create an .xlsx and insert formulas.

I was also able to do this for a single column, but my gut tells me there's a much better way. Here's what I used:

data['past_due'] = [
    'internet_bill' if x == 'past_due'
    else 'None' for x in data['internet_bill']]

Ideally, this would check each targeted column in a row for 'past_due' and, where it is found, add that column's name to the new value before moving on to the next column.
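The check described above can be written out as an explicit loop. This is only a minimal sketch that generalizes the single-column attempt: the `target_cols` list and the `summary` variable are names assumed here for illustration, and the `'None'` fallback simply mirrors the attempt above.

```python
import pandas as pd

data = pd.DataFrame({
    'customer': ['bob', 'jerry', 'alice', 'susan'],
    'internet_bill': ['paid', 'past_due', 'due_soon', 'past_due'],
    'electric_bill': ['past_due', 'due_soon', 'past_due', 'paid'],
    'water_bill': ['paid', 'past_due', 'paid', 'paid']})

# Assumption: these are the columns to inspect (hypothetical name).
target_cols = ['internet_bill', 'electric_bill', 'water_bill']

summary = []
for _, row in data.iterrows():
    # Collect the name of every targeted column whose value is 'past_due'.
    names = [col for col in target_cols if row[col] == 'past_due']
    # Fall back to the string 'None' when nothing is past due.
    summary.append(', '.join(names) if names else 'None')

data['past_due'] = summary
print(data[['customer', 'past_due']])
```

Row-by-row iteration is easy to read but slow on large frames, so it is best treated as a reference for the logic rather than a recommended approach.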

I have had no success finding anything close to this with searches, probably because I am struggling to phrase a good query. I haven't found any questions where someone is trying to pull in other column names as a value based on a condition.

Thanks for any help!

>>> data['past_due'] = data.apply(
...     lambda x: tuple(x[x == 'past_due'].index), axis=1)
>>> data
  customer             ...                                  past_due
0      bob             ...                          (electric_bill,)
1    jerry             ...               (internet_bill, water_bill)
2    alice             ...                          (electric_bill,)
3    susan             ...                          (internet_bill,)
[4 rows x 5 columns]
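The answer above produces tuples, whereas the question asks for a comma-separated string. One possible variation, sketched here under the assumption that every column except `customer` holds a bill status (the `bill_cols` name is introduced for illustration), joins the matching column names instead:

```python
import pandas as pd

data = pd.DataFrame({
    'customer': ['bob', 'jerry', 'alice', 'susan'],
    'internet_bill': ['paid', 'past_due', 'due_soon', 'past_due'],
    'electric_bill': ['past_due', 'due_soon', 'past_due', 'paid'],
    'water_bill': ['paid', 'past_due', 'paid', 'paid']})

# Assumption: every column other than 'customer' is a bill status column.
bill_cols = [c for c in data.columns if c != 'customer']

# For each row, keep the entries equal to 'past_due' and join their
# column names; a row with no match yields an empty string.
data['past_due'] = data[bill_cols].apply(
    lambda row: ', '.join(row[row == 'past_due'].index), axis=1)

print(data[['customer', 'past_due']])
```

Restricting the `apply` call to `bill_cols` keeps the `customer` column out of the comparison. To mimic the line break that the Excel formula adds with `CHAR(10)`, the separator could be changed to `',\n'`.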
