Counting the number of integer values in a column depending on the value of another column in pandas
Hi, I have a column that contains both integer and text values. I am trying to write a Python function that counts only the integer values corresponding to each value in another column.
Here is a sample of the data:
constructorId positionText
1 3
1 4
1 R
4 6
4 5
4 N
4 9
and I want the result to look like this:
constructorID positionText_count
1 2
4 3
Here is the code:
def not_finished(c):
    r = 0
    for c in hybrid_era_results['constructorId']:
        y = hybrid_era_results['positionText']
        if isinstance(y, int):
            r = r+1
    return r
This code does not throw an error, but when I call the function it always returns 0. What am I doing wrong?
IIUC, you can use to_numeric with errors='coerce' to flag the numeric values, then groupby.sum to count them per constructorId:
(pd.to_numeric(df['positionText'], errors='coerce')
.notna()
.groupby(df['constructorId'])
.sum()
.reset_index()
)
output:
constructorId positionText
0 1 2
1 4 3
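For anyone who wants to try this end to end, here is a minimal self-contained sketch. It assumes the question's sample data is loaded into a DataFrame called df (that name and the string dtype of positionText are assumptions, since the question does not show how the data is stored). Note that to_numeric also accepts decimals such as "3.5", so this counts numeric values rather than strictly integers.

```python
import pandas as pd

# Sample data copied from the question; the name `df` and the
# string dtype of positionText are assumptions for illustration.
df = pd.DataFrame({
    "constructorId": [1, 1, 1, 4, 4, 4, 4],
    "positionText": ["3", "4", "R", "6", "5", "N", "9"],
})

# With errors="coerce", non-numeric strings such as "R" and "N" become NaN,
# so notna() yields True only for the numeric entries.
is_numeric = pd.to_numeric(df["positionText"], errors="coerce").notna()

# Summing booleans per group counts the True values in each group.
counts = is_numeric.groupby(df["constructorId"]).sum().reset_index()
print(counts)
```

Grouping the boolean mask by df["constructorId"] works because both Series share the same index, so no intermediate column has to be added to the DataFrame.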
To set a custom column name, use .agg(positionText_count='sum') in place of .sum() (or .agg(**{'positionText_count': 'sum'}) if the name contains spaces or special characters):
constructorId positionText_count
0 1 2
1 4 3
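As a sketch of the named-aggregation variant, the full chain could look like the following. The DataFrame name df and the sample values are taken from the question and are assumptions about how the real data is stored.

```python
import pandas as pd

# Sample data copied from the question; `df` is an assumed name.
df = pd.DataFrame({
    "constructorId": [1, 1, 1, 4, 4, 4, 4],
    "positionText": ["3", "4", "R", "6", "5", "N", "9"],
})

counts = (
    pd.to_numeric(df["positionText"], errors="coerce")
    .notna()                              # True where the value is numeric
    .groupby(df["constructorId"])
    .agg(positionText_count="sum")        # named aggregation sets the column name
    .reset_index()
)
print(counts)
```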
As there are no clear details about the data, here is a simple example that checks which integer values from col1 also exist in col2, where both columns are of str type.
import pandas as pd
# simple example df
df = pd.DataFrame({"col1": ["value1", "4", "5", "3"], "col2": ["3", "6", "value2", "4"]})
# boolean masks marking the rows that hold an integer string in each column
idx_col1 = df.col1.str.isdigit()
idx_col2 = df.col2.str.isdigit()
# retrieve integer values
df_col1_int = df.loc[idx_col1, "col1"]
df_col2_int = df.loc[idx_col2, "col2"]
# keep only the values from col1 that also exist in col2 (they need not share the same index)
idx_exists = df_col1_int.isin(df_col2_int)
df_exists = df_col1_int[idx_exists]
# get the number of integers in common
len(df_exists)
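The same str.isdigit check can also be applied to the data from the question, which gives a stricter "integers only" count per group. This is only a sketch: the DataFrame name df is an assumption, and astype(str) is added in case the real column mixes actual ints with strings. Keep in mind that str.isdigit returns False for negative numbers and decimals.

```python
import pandas as pd

# Sample data copied from the question; `df` is an assumed name.
df = pd.DataFrame({
    "constructorId": [1, 1, 1, 4, 4, 4, 4],
    "positionText": ["3", "4", "R", "6", "5", "N", "9"],
})

# True only for strings made up entirely of digits ("R" and "N" give False).
is_int = df["positionText"].astype(str).str.isdigit()

# Count the True values per constructorId and name the resulting column.
counts = (
    is_int.groupby(df["constructorId"])
    .sum()
    .reset_index(name="positionText_count")
)
print(counts)
```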