Count the number of rows in a dataframe with conditions

I have an issue in some code where I want to fill one dataframe based on another. To explain: in one dataframe I have component replacements, classified with codes that identify their specific locations. I want to count how many replacements I have and put this number in another dataframe. This part of my code looks like this:

import plotly.express as px
import pandas as pd
import numpy as np

# Import the Excel export from the database (read_excel already returns a DataFrame)
df = pd.read_excel("replacements.xlsx")

# We create 3 dataframes to hold, respectively, the number of replacements, the percentages and the failure rates. Here we focus on the number of replacements, because filling the other two is a separate process.

tab_nb_replacements=pd.DataFrame(columns=['electrical auxiliary power supply','process monitoring','wind turbine system','generator system','transmission of electrical energy','structures connected to production','auxiliary systems'], index=['falaise_nb_replacements',...,'quittebeuf_nb_replacements'])
As you can see, only some of the rows are shown. Below, I fill the row 'falaise_nb_replacements' with 0 (I did the same for all the other rows).
tab_nb_replacements['electrical auxiliary power supply']['falaise_nb_replacements']=np.where(((df['RDSPP code'].str[:1]=='B') & (df['WTName']=='Falaise')),tab_nb_replacements['electrical auxiliary power supply']['falaise_nb_replacements']+1,tab_nb_replacements['electrical auxiliary power supply']['falaise_nb_replacements'])

########### I tried different ways to obtain the number of replacements ######

## NOTE: for the site Falaise, we want to select a row when the value in the column 'RDSPP code' starts with 'B' and the value in the column 'WTName' is 'Falaise'.

## first method

tab_nb_replacements['electrical auxiliary power supply']['falaise_nb_replacements']=(df[df['RDSPP code'].str[:1]=='B' and df['WTName']=='Falaise']).count()

## second method

"ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()."

## third method

tab_nb_replacements['electrical auxiliary power supply']['falaise_nb_replacements']=(df[df['RDSPP code'].str[:1]=='B' and df['WTName']=='Falaise']).count()

None of these methods gave me a result. Instead, with each of them I get:

 "ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()."

If anybody has a solution or some advice, it would be really helpful!

Best,

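In the third method, the error does not come from df['WTName']=='Falaise' itself: comparing a pandas Series with a string is valid and returns a boolean Series. The ValueError is raised because the two boolean Series are joined with Python's and, which tries to reduce each Series to a single truth value. Use the element-wise operator & instead, and wrap each comparison in parentheses. Casting with astype(str) is only needed if the columns may hold non-string values:

(df['RDSPP code'].astype(str).str[:1] == 'B') & (df['WTName'].astype(str) == 'Falaise')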
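
To make the counting step concrete, here is a minimal sketch. The sample rows and the reduced two-column target table are made up for illustration; only the column and index labels come from the question. It combines the two conditions with & (rather than and), counts the matching rows by summing the boolean mask, and writes the result into a single cell with .loc.

```python
import pandas as pd

# Made-up sample data; only the column names come from the question.
df = pd.DataFrame({
    "RDSPP code": ["BA01", "BB12", "MD03", "BA07"],
    "WTName": ["Falaise", "Falaise", "Falaise", "Quittebeuf"],
})

# Each comparison yields a boolean Series; & combines them element-wise.
# The parentheses are required because & binds tighter than ==.
mask = (df["RDSPP code"].str[:1] == "B") & (df["WTName"] == "Falaise")

# True counts as 1, so summing the mask gives the number of matching rows.
n_falaise_b = int(mask.sum())

# Hypothetical, reduced target table, zero-filled so counts can be stored.
tab_nb_replacements = pd.DataFrame(
    0,
    index=["falaise_nb_replacements", "quittebeuf_nb_replacements"],
    columns=["electrical auxiliary power supply", "process monitoring"],
)

# .loc[row, column] writes one cell without chained indexing.
tab_nb_replacements.loc[
    "falaise_nb_replacements", "electrical auxiliary power supply"
] = n_falaise_b

print(n_falaise_b)
```

Note that df[mask].count() would return one non-null count per column rather than a single number, so mask.sum() or len(df[mask]) is probably closer to what the question is after.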
