Defining a function to count the number of lines in a file that contain a certain substring

I'm fairly new to Python. I'm trying to define a function that counts the number of lines in a file that contain a particular substring. A line that contains the substring more than once should still count as just 1.

Here's my code:

def CLT(filename):
    with open(filename,'r') as f:
        pattern='ing'
        count=a=0
        k=f.readlines()
        for line in k:
            if pattern in k[a:]:
                count += 1
        return count

print( CLT('random_file.txt') )

Assume that my file has 25 occurrences of the string 'str', but 2 of its lines each contain 'str' twice. The ideal output for this problem is therefore 23.

But it's returning 0 as the number of lines. I also recognize that my code doesn't handle the part where a line with multiple occurrences of the substring is counted just once. What can I do to improve this code?

Here is code you might want to try:

def CLT(filename):
    with open(filename, 'r') as f:
        pattern = 'ing'
        count = 0
        for line in f:
            if pattern in line:
                count += 1
        return count


print(CLT('random_file.txt'))
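
Because `pattern in line` is only ever True or False for a given line, a line with several occurrences still adds exactly 1 to the count, which is the behaviour the question asks for. A minimal sketch of the same idea with the pattern passed in as an argument follows; the names count_lines_containing, count_lines_in_file and sample are made up for illustration and are not from the original code.

```python
def count_lines_containing(lines, pattern):
    """Count the lines that contain pattern at least once."""
    # The membership test is a yes/no answer per line, so a line with
    # several occurrences still contributes exactly 1 to the sum.
    return sum(1 for line in lines if pattern in line)


def count_lines_in_file(filename, pattern):
    """Same count, reading the lines from a file (hypothetical helper)."""
    # Iterating over a file object yields its lines one at a time.
    with open(filename, 'r') as f:
        return count_lines_containing(f, pattern)


# Made-up sample data standing in for the contents of a file.
sample = ["singing in the rain\n", "nothing\n", "hello world\n"]
print(count_lines_containing(sample, 'ing'))  # 2
```

The first line contains 'ing' twice but is counted once, so the result is 2 rather than 3.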

Hope this helps you!

You've got a slight error in your code:

if pattern in k[a:]:

should be:

if pattern in line[a:]:
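
A short sketch of why the original test likely returned 0: k is the list of lines, so `pattern in k[a:]` asks whether some whole line is equal to 'ing', not whether any line contains it. The list below is a made-up stand-in for what f.readlines() returns.

```python
# Made-up stand-in for the list returned by f.readlines().
lines = ["singing in the rain\n", "hello world\n"]
pattern = 'ing'

# On a list, `in` tests whether some element is equal to 'ing'.
in_list = pattern in lines[0:]

# On a string, `in` tests whether 'ing' occurs as a substring.
in_line = pattern in lines[0][0:]

print(in_list)  # False
print(in_line)  # True
```

Since lines read from a file normally carry other text and a trailing newline, no element is exactly equal to the pattern, and the count never increases.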

It looks like you're setting up a to keep track of where you've already found the string in the line, so that you can then look for a further occurrence. If that isn't the plan, remove a, since it only complicates the logic.

Otherwise, if a holds the index where you last found the string in the line, make sure the next search starts at index a + 1. If you search from a again, you'll keep finding the same occurrence, and the loop you add to check for further occurrences in the same line will never terminate.
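
One way the a + 1 approach could look, as a sketch rather than a definitive implementation: str.find returns the index of the next occurrence or -1, and each search resumes one position past the previous hit. The function names count_occurrences and summarize and the sample data are assumptions made for illustration.

```python
def count_occurrences(line, pattern):
    """Count occurrences of pattern in line, overlapping ones included."""
    count = 0
    a = line.find(pattern)  # index of the first occurrence, or -1 if none
    while a != -1:
        count += 1
        # Resume at a + 1; searching from a again would return the same
        # index every time and the loop would never end.
        a = line.find(pattern, a + 1)
    return count


def summarize(lines, pattern):
    """Return (number of lines containing pattern, total occurrences)."""
    per_line = [count_occurrences(line, pattern) for line in lines]
    matching_lines = sum(1 for n in per_line if n > 0)
    return matching_lines, sum(per_line)


# Made-up sample: the first line contains 'str' twice.
sample = ["str and str\n", "one str\n", "none here\n"]
print(summarize(sample, 'str'))  # (2, 3)
```

Here there are 3 occurrences on 2 lines, which mirrors the question's case of 25 occurrences on 23 lines. If only the line count is needed, the plain `pattern in line` test is enough and a can be dropped entirely.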
