Count how many times a substring from a list appears in a string

I have a string:

seq = '01234567890123456789'

that I first want to break up into chunks of 4 characters. I found a previous answer that does this:

n = 4
[seq[i:i+n] for i in range(0, len(seq), n)]

which gives me

['0123', '4567', '8901', '2345', '6789']
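The slicing idiom above can be wrapped in a small helper. This is only a sketch: the name chunk_string is hypothetical and not part of the original question, and it assumes that a shorter final chunk is acceptable when the length of the string is not a multiple of n.

```python
def chunk_string(seq, n):
    """Split seq into consecutive pieces of length n (the last may be shorter)."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    # Step through the string n characters at a time and slice out each piece.
    return [seq[i:i + n] for i in range(0, len(seq), n)]


seq = '01234567890123456789'
chunks = chunk_string(seq, 4)
print(chunks)  # ['0123', '4567', '8901', '2345', '6789']
```

If the string length is not divisible by n, the trailing piece is simply shorter, for example chunk_string('abcde', 2) gives ['ab', 'cd', 'e'].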

And now I want to compare my chunked-up string to every entry in a list of substrings:

mylist = ['0123', '1111', '2345']

and return an array that counts how many times each substring appeared in the original string. I see a lot of examples of finding one substring in a larger string, but I'm confused about how to do this with a list of substrings.

generated = ['0123', '4567', '8901', '2345', '6789']
mylist = ['0123', '1111', '2345']
result = [generated.count(element) for element in mylist ]

list.count(e) returns the number of occurrences of e in the list, so we can call it for each item in mylist.

In this case result is [1, 0, 1], which means '0123' appeared in generated once, '1111' did not appear, and '2345' appeared once as well.
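For longer inputs, calling list.count once per substring rescans the chunk list each time. A possible alternative, sketched here with the standard library's collections.Counter, tallies the chunks once and then looks each substring up. The function name count_in_chunks is made up for illustration and does not come from the answer above.

```python
from collections import Counter


def count_in_chunks(seq, substrings, n):
    """Count how often each substring equals one of the n-character chunks of seq."""
    chunks = [seq[i:i + n] for i in range(0, len(seq), n)]
    # Tally every chunk in a single pass.
    chunk_counts = Counter(chunks)
    # A Counter returns 0 for keys it has never seen, so absent substrings get 0.
    return [chunk_counts[s] for s in substrings]


seq = '01234567890123456789'
mylist = ['0123', '1111', '2345']
result = count_in_chunks(seq, mylist, 4)
print(result)  # [1, 0, 1]
```

The result keeps the order of mylist, so it lines up with the list comprehension shown above.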

I couldn't figure out from your description which way around you want to compare occurrences. If I got it the wrong way around, just say so and I'll be happy to edit.

chunked = ['0123', '4567', '8901', '2345', '6789']
mylist = ['0123', '1111', '2345']
my_count = {}
for m in mylist:
    for c in chunked:
        if m == c:
            try:
                my_count[m] += 1
            except KeyError:
                my_count[m] = 1

>>> chunked
['0123', '4567', '8901', '2345', '6789']
>>> mylist
['0123', '1111', '2345']
>>> my_count
{'2345': 1, '0123': 1}
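
Another option is to count each chunk directly in the original string with str.count:

seq = '01234567890123456789'
chunks = ['0123', '4567', '8901', '2345', '6789']
# Named counts rather than dict, so the built-in dict type is not shadowed.
counts = {}
for elt in chunks:
    # str.count counts non-overlapping matches at any offset in seq,
    # not only those that start on a chunk boundary.
    counts[elt] = seq.count(elt)

print(counts)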
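Note that seq.count(elt) and counting within the chunk list can give different answers, because str.count matches at any position in the string. The sketch below contrasts the two. The helper names count_anywhere and count_aligned are hypothetical, chosen only for this illustration.

```python
def count_anywhere(seq, substrings):
    """Count non-overlapping occurrences of each substring at any offset in seq."""
    return {s: seq.count(s) for s in substrings}


def count_aligned(seq, substrings, n):
    """Count only occurrences that coincide with an n-character chunk of seq."""
    chunks = [seq[i:i + n] for i in range(0, len(seq), n)]
    return {s: chunks.count(s) for s in substrings}


seq = '01234567890123456789'
subs = ['0123', '8901', '1111']
anywhere = count_anywhere(seq, subs)
aligned = count_aligned(seq, subs, 4)
print(anywhere)  # {'0123': 2, '8901': 1, '1111': 0}
print(aligned)   # {'0123': 1, '8901': 1, '1111': 0}
```

Here '0123' occurs twice in the full string (at offsets 0 and 10) but only once as a chunk, so which approach is right depends on whether chunk boundaries matter for the question being asked.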

Build a list of tuples, each containing a chunk and its count. Here the list is called counts.

In [37]: seq = '01234567890123456789'

In [38]: chunks = ['0123', '4567', '8901', '2345', '6789']

In [39]: mylist = ['0123', '1111', '2345']

In [40]: counts = [(i, mylist.count(i)) for i in chunks]

In [41]: counts
Out[41]: [('0123', 1), ('4567', 0), ('8901', 0), ('2345', 1), ('6789', 0)]
