
Access a row from a df based on a column value

I am trying to find the rows where reliability is < 0.70, but the output seems to include rows where reliability is 0.70 as well. What could be wrong?


Original DF:


     po_id                 po_name             product  year                measure        rate  denominator  numerator  is_reported  reliability
0  1051408  Aberdeen Care Alliance  Commercial HMO/POS    18               CHLAMSCR   67.740000         62.0       42.0         True          NaN
1  1051408  Aberdeen Care Alliance  Commercial HMO/POS    19                AMROV64   80.000000         20.0       16.0         True          NaN
2  1051408  Aberdeen Care Alliance  Commercial HMO/POS    19             CISCOMBO10   17.650000         34.0        6.0         True          NaN
3  1051408  Aberdeen Care Alliance  Commercial HMO/POS    19               OFCSTAFF   69.440000          NaN        NaN         True         0.76
4  1051408  Aberdeen Care Alliance  Commercial HMO/POS    18                BCS5274   86.420000        302.0      261.0         True          NaN
5  1051408  Aberdeen Care Alliance  Commercial HMO/POS    19                   SPD1   57.810000         64.0       37.0         True          NaN
6  1051408  Aberdeen Care Alliance  Commercial HMO/POS    19                   PDCS   79.530000        127.0      101.0         True          NaN
7  1051408  Aberdeen Care Alliance  Commercial HMO/POS    19  TCOC_250K_GEO_RISKADJ  289.281096          NaN        NaN        False          NaN
8  1051408  Aberdeen Care Alliance  Commercial HMO/POS    19                  CBPD4   67.440000        129.0       87.0         True          NaN
9  1051408  Aberdeen Care Alliance  Commercial HMO/POS    18            COORDINATE3   55.370000          NaN        NaN         True         0.74


Code added to locate the rows where reliability is less than 0.70:

awards_test_df.loc[awards_test_df['reliability'] < 0.70]


Output:

po_id   po_name product year    measure rate    denominator numerator is_reported   reliability

191 1008200 Advancements Physicians Medical Center  Commercial HMO/POS  18  ACCESS3 58.13   NaN NaN True    0.60
515 1021102 Baird Medical Group Commercial HMO/POS  18  COORDINATE3 60.02   NaN NaN True    0.70
... ... ... ... ... ... ... ... ... ... ...
8606    1038400 Vf Healthcare   Commercial HMO/POS  18  OFCSTAFF    68.78   NaN NaN True    0.70
8620    1038400 Vf Healthcare   Commercial HMO/POS  18  MDINTERACT3 79.57   NaN NaN True    0.70
8800    1006001 Viva Physicians Commercial HMO/POS  18  ACCESS3 66.25   NaN NaN True    0.70
8869    1017708 Waltz Hospital  Commercial HMO/POS  19  MDINTERACT3 81.01   NaN NaN True    0.70
9142    1028100 Zeke Medical Group  Commercial HMO/POS  18  ACCESS3 56.37   NaN NaN True    0.70

Your code seems perfectly fine when I reproduce it:

import pandas as pd
data = [
    {"po_id": 191, "po_name": 1008200, "product": "Advancements Physicians Medical Center  Commercial HMO/POS", "year": 18, "measure": "ACCESS3", "rate": 58.13, "denominator": "NaN", "numerator": "NaN", "is_reported": True, "reliability": 0.6},
    {"po_id": 515, "po_name": 1021102, "product": "Baird Medical Group Commercial HMO/POS", "year": 18, "measure": "COORDINATE3", "rate": 60.02, "denominator": "NaN", "numerator": "NaN", "is_reported": True, "reliability": 0.7},
    {"po_id": 8606, "po_name": 1038400, "product": "Vf Healthcare   Commercial HMO/POS", "year": 18, "measure": "OFCSTAFF", "rate": 68.78, "denominator": "NaN", "numerator": "NaN", "is_reported": True, "reliability": 0.7},
    {"po_id": 8620, "po_name": 1038400, "product": "Vf Healthcare   Commercial HMO/POS", "year": 18, "measure": "MDINTERACT3", "rate": 79.57, "denominator": "NaN", "numerator": "NaN", "is_reported": True, "reliability": 0.7},
    {"po_id": 8800, "po_name": 1006001, "product": "Viva Physicians Commercial HMO/POS", "year": 18, "measure": "ACCESS3", "rate": 66.25, "denominator": "NaN", "numerator": "NaN", "is_reported": True, "reliability": 0.7},
    {"po_id": 8869, "po_name": 1017708, "product": "Waltz Hospital  Commercial HMO/POS", "year": 19, "measure": "MDINTERACT3", "rate": 81.01, "denominator": "NaN", "numerator": "NaN", "is_reported": True, "reliability": 0.7},
    {"po_id": 9142, "po_name": 1028100, "product": "Zeke Medical Group  Commercial HMO/POS", "year": 18, "measure": "ACCESS3", "rate": 56.37, "denominator": "NaN", "numerator": "NaN", "is_reported": True, "reliability": 0.7},
]
awards_test_df = pd.DataFrame(data)
awards_test_df.loc[awards_test_df['reliability'] <0.70]

Output:

|    |   po_id |   po_name | product                                                    |   year | measure   |   rate |   denominator |   numerator | is_reported   |   reliability |
|---:|--------:|----------:|:-----------------------------------------------------------|-------:|:----------|-------:|--------------:|------------:|:--------------|--------------:|
|  0 |     191 |   1008200 | Advancements Physicians Medical Center  Commercial HMO/POS |     18 | ACCESS3   |  58.13 |           nan |         nan | True          |           0.6 |
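If the real data still returns rows that print as 0.70, it may be worth checking how many digits pandas is showing. The sketch below uses made-up values (they are not from the original data) and the `display.precision` option to render the same filtered rows at two and at six decimal places. It assumes the two-decimal output in the question comes from a display setting, which the question does not confirm.

```python
import pandas as pd

# Hypothetical values chosen for illustration; not taken from the original data.
df = pd.DataFrame({"reliability": [0.60, 0.6951, 0.6999, 0.70, 0.76]})

# Same filter as in the question: strictly less than 0.70.
below = df.loc[df["reliability"] < 0.70]

# Render the filtered rows with only two decimals of display precision.
with pd.option_context("display.precision", 2):
    two_dp = below.to_string()

# Render the same rows with six decimals of display precision.
with pd.option_context("display.precision", 6):
    six_dp = below.to_string()

print(two_dp)
print(six_dp)
```

The filter itself is unchanged between the two prints; only the rendering differs, so any row that looks like 0.70 in the first print but not in the second was never equal to 0.70 in the first place.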

It's just the display format: the stored values are slightly below 0.70 but are shown rounded to two decimal places. Try the snippet below to check:

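import numpy as np
import pandas as pd
# 100 random values between 0.65 and 0.75
df = pd.DataFrame({"reliability":np.random.uniform(.65,.75, 100)})
# keep rows strictly below 0.7, then show those that round to 0.7 at two decimals
df = df.loc[df.reliability.lt(.7)].assign(twodp=df.reliability.round(2)).query("twodp.eq(.7)")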


reliability  twodp
0.695661     0.7
0.699588     0.7
0.698993     0.7
0.697933     0.7
0.698356     0.7
0.699906     0.7
0.695279     0.7
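If what you actually want is to drop every row that is reported as 0.70, one option is to round to the reporting precision first and compare afterwards. This is only a sketch with hypothetical values, and it assumes two decimal places is the precision you care about:

```python
import pandas as pd

# Hypothetical values for illustration only; not taken from the original data.
df = pd.DataFrame({"reliability": [0.60, 0.6951, 0.6999, 0.70, 0.76]})

# Round to the precision used for reporting, then apply the strict comparison.
rounded = df["reliability"].round(2)
strictly_below = df.loc[rounded < 0.70]

print(strictly_below)
```

Here 0.6951 and 0.6999 both round to 0.70 and are excluded, so only the 0.60 row remains. The original column is left untouched; rounding is applied only to the mask.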

