Need warning message if count of each country code is less than 5

I am trying to get a warning or a printed message if the count (frequency) of a particular country code is less than 5.

QuoteID
1500759-BE
1500759-BE
1500759-BE
1500759-BE
1605101-FR
1605101-FR
1605101-FR
1605119-FR
1605119-FR
1605119-FR
1605119-FR
1605119-FR
1600896-NL
1600896-NL
1600896-NL
1600898-NL
1600898-NL
1600898-NL
1600898-NL
1600898-NL
1600898-NL

I tried the code below:

chars=('BE','FR','NL')
check_string=OutputData['QuoteID']

for char in chars:
  count = check_string.count(char)
  if count < 5:
    print('count is less than 5')

Expected result: "warning 'category BE' has less than 5 records"

OutputData - data set name
QuoteID - variable name

Values such as 1500759-BE are the values of the variable. The frequency (count) of 'BE', 'FR' and 'NL' has to be computed, and a warning message is required if a count is less than 5.

Many thanks in advance.

You can use the Counter class from Python's collections module to count the occurrences of the elements in a list. To extract the country codes from your sample data, split the text into lines and take the last two characters of each line (which are the country code).

All in all, I would suggest something like this:

from collections import Counter

data = """1500759-BE
1500759-BE
1500759-BE
1500759-BE
1605101-FR
1605101-FR
1605101-FR
1605119-FR
1605119-FR
1605119-FR
1605119-FR
1605119-FR
1600896-NL
1600896-NL
1600896-NL
1600898-NL
1600898-NL
1600898-NL
1600898-NL
1600898-NL
1600898-NL
"""

codes = [l[-2:] for l in data.splitlines()]

c = Counter(codes)

for k,v in c.items():
    if v < 5:
        print('less than 5 items for {}'.format(k))

Since you tagged your question with python-2.7, note that the code above is written in Python 3 style. It should also run unchanged on Python 2.7, because print(...) with a single argument and .items() both work there; the idiomatic Python 2 forms would be print output instead of print(output) and .iteritems() instead of .items().
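Since the question asks for a warning rather than a plain print, the Counter approach above can be combined with the standard warnings module. The following is only a sketch: the helper name warn_low_counts, its minimum parameter, and the choice to split on the last hyphen (instead of slicing the last two characters) are assumptions made for illustration, not part of the original answer.

```python
import warnings
from collections import Counter


def warn_low_counts(quote_ids, expected_codes, minimum=5):
    """Warn for each expected code with fewer than `minimum` records.

    Hypothetical helper; returns the list of codes that triggered a warning.
    """
    # The country code is whatever follows the last hyphen: "1500759-BE" -> "BE".
    counts = Counter(q.rsplit("-", 1)[-1] for q in quote_ids)
    low = []
    for code in expected_codes:
        n = counts[code]  # a Counter returns 0 for keys it has never seen
        if n < minimum:
            warnings.warn(
                "category '{}' has less than {} records ({} found)".format(code, minimum, n)
            )
            low.append(code)
    return low


# Same frequencies as the sample data in the question: BE 4, FR 8, NL 9.
quote_ids = (
    ["1500759-BE"] * 4
    + ["1605101-FR"] * 3 + ["1605119-FR"] * 5
    + ["1600896-NL"] * 3 + ["1600898-NL"] * 6
)
low_codes = warn_low_counts(quote_ids, ("BE", "FR", "NL"))
```

Iterating over the expected codes rather than over the Counter means a code that never appears at all is also reported, with a count of 0.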

What is the type of QuoteID? If it is a string, your approach works fine:

alist = "1500759-BE1500759-BE1500759-BE1500759-BE1605101-FR1605101-FR1605101-FR1605119-FR1605119-FR1605119-FR1605119-FR1605119-FR1600896-NL1600896-NL1600896-NL1600898-NL1600898-NL1600898-NL1600898-NL1600898-NL1600898-NL"

chars = ('BE', 'FR', 'NL')

for char in chars:
    # str.count returns the number of non-overlapping occurrences of the substring
    count = alist.count(char)
    if count < 5:
        print('count is less than 5')
        print(char)
        print("\n")

It works fine for me.
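The type matters because count means different things on different containers. As a small illustration (the three-element ids list is made up for this sketch), str.count searches for substrings, whereas list.count only matches whole elements, so a bare country code is never found in a list of full IDs. A pandas Series has yet another count method, which counts non-missing values rather than occurrences of a given string.

```python
ids = ["1500759-BE", "1500759-BE", "1605101-FR"]

# list.count compares whole elements, so a bare country code never matches.
whole_element_matches = ids.count("BE")

# str.count searches for substrings, so it works once the IDs are joined.
substring_matches = "".join(ids).count("BE")

# Checking the suffix avoids joining and only looks at the end of each ID.
suffix_matches = sum(1 for i in ids if i.endswith("-BE"))

print(whole_element_matches, substring_matches, suffix_matches)
```

The suffix check is slightly stricter than the substring search: it would not be confused by an ID whose leading part happened to contain the same two letters.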

You could use str.extract to extract the country codes from each QuoteID string as follows:

In [16]: df['CountryCode'] = df['QuoteID'].str.extract('(?P<letter>BE|FR|NL)', expand=True)

In [17]: df
Out[17]: 
       QuoteID CountryCode
0   1500759-BE          BE
1   1500759-BE          BE
2   1500759-BE          BE
3   1500759-BE          BE
4   1605101-FR          FR
5   1605101-FR          FR
6   1605101-FR          FR
7   1605119-FR          FR
8   1605119-FR          FR
9   1605119-FR          FR
10  1605119-FR          FR
11  1605119-FR          FR
12  1600896-NL          NL
13  1600896-NL          NL
14  1600896-NL          NL
15  1600898-NL          NL
16  1600898-NL          NL
17  1600898-NL          NL
18  1600898-NL          NL
19  1600898-NL          NL
20  1600898-NL          NL

Using value_counts to compute the counts of unique values, you can then convert the resulting Series to a dictionary by calling to_dict() and use a list comprehension to get your desired result.

In [18]: ["count of %s is %d" % (key, value) if value >= 5 else  \
         "WARN!: count of category %s is less than 5" % (key)    \
         for key, value in df['CountryCode'].value_counts().to_dict().items()]
Out[18]: 
['WARN!: count of category BE is less than 5',
 'count of NL is 9',
 'count of FR is 8']
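One case the value_counts approach does not cover on its own is a country code that has no records at all, because it simply does not appear in the result. A possible way to handle that, sketched below, is to reindex the counts against a list of expected codes. The expected list (including the extra code 'DE'), the variable names, and the use of str.rsplit instead of str.extract are assumptions for illustration only.

```python
import pandas as pd

# Same frequencies as the sample data: BE 4, FR 8, NL 9.
df = pd.DataFrame(
    {"QuoteID": ["1500759-BE"] * 4 + ["1605119-FR"] * 8 + ["1600898-NL"] * 9}
)

# Take the text after the last hyphen as the country code.
codes = df["QuoteID"].str.rsplit("-", n=1).str[-1]

# reindex adds expected codes that never occur, with a count of 0.
expected = ["BE", "FR", "NL", "DE"]  # 'DE' is a made-up code with no records
counts = codes.value_counts().reindex(expected, fill_value=0)

# Boolean indexing keeps only the categories below the threshold.
low = counts[counts < 5]
messages = [
    "warning 'category %s' has less than 5 records" % code for code in low.index
]
for message in messages:
    print(message)
```

Because the result follows the order of expected, the messages come out in a predictable order regardless of how value_counts sorted the data.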
