
Read line in file, print line if it contains string

I have working code that opens a file, looks for a string, and prints the line if it contains that string. I'm doing this so that I can decide, manually, whether the line should be removed from my dataset or not.

But it would be much better if I could tell the program to print only the part of the line, between two commas, that contains the string.

The code I have now (see below):

with open("dvd.txt") as f:
    for num, line in enumerate(f, 1):
        if " arnold " in line:
            num = str(num)
            print line + '' + num

It prints each line like this:

77.224998664,2014-10-19,386.5889,the best arnold ***** ,81,dvd-action,Cheese 5gr,online-dvd-king93,0.19976,18,/media/removable/backup/2014-10-19/all_items/cheese-5gr?feedback_page=1.html,    ships from: Germany    ships to: Worldwide  ,2014-07-30,online-dvd-king,93 1

I'd like it to print this instead:

,the best arnold ***** , 1

or

the best arnold *****  1

I read this question, but I hope to avoid using CSV.

If for whatever reason it is tricky to find the text between commas, or any other specific characters, it would be useful to print the 3 words before and after the string I'm looking for.

This is very simple to do with str.split(). Modifying your code as follows will produce the output you want.

with open("dvd.txt") as f:
    for num, line in enumerate(f, 1):
        if " arnold " in line:
            num = str(num)
            print line.split(',')[3] + '' + num 

str.split splits a string into a list on the specified separator. To access the list entry you want, supply the appropriate index. List indices start at 0, so the fourth field (the one you want here) is index 3.
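To make the indexing concrete, here is a small sketch that runs str.split on a shortened copy of the sample row from the question. It uses Python 3 print() syntax rather than the Python 2 print statement above, and the variable names (fields, title) are only illustrative.

```python
# Sample row from the question, shortened after the sixth field.
line = "77.224998664,2014-10-19,386.5889,the best arnold ***** ,81,dvd-action"

fields = line.split(",")   # list of the comma-separated fields
title = fields[3]          # indices are zero-based, so this is the fourth field

print(repr(title))         # 'the best arnold ***** ' (note the trailing space)
print(repr(title.strip())) # 'the best arnold *****'
print(len(fields))         # 6
```

Note that this approach assumes the field you want is always in the same position and that no field contains a comma of its own.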

As an aside, you can produce your output with the str.format() method to make it a little nicer:

print "{} {}".format(line.split(',')[3], num)

This also lets you remove num = str(num), since str.format() converts its arguments to strings itself, whereas concatenating a string and an integer with + raises a TypeError.
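The difference between formatting and concatenation can be seen in a short sketch like the one below. It is written with Python 3 print() syntax, and the title value is just the field from the question's sample row.

```python
title = "the best arnold *****"
num = 1  # an int, as produced by enumerate()

# str.format() converts num to text on its own.
formatted = "{} {}".format(title, num)
print(formatted)  # the best arnold ***** 1

# Concatenating str and int with + is not allowed.
concat_error = None
try:
    title + " " + num
except TypeError as exc:
    concat_error = exc
    print("concatenation failed:", exc)
```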

As an alternative, you could use a regular expression as follows:

import re

with open("dvd.txt") as f:
    for num, line in enumerate(f, 1):
        # Capture the comma-delimited field that contains "arnold"
        re_arnold = re.search(r',\s*([^,]*?arnold[^,]*?)\s*,', line)

        if re_arnold:
            print '{} {}'.format(re_arnold.group(1), num)

This extracts the whole entry (between the commas) regardless of which field it is in. Because the pattern requires a comma on both sides, it will not match an entry in the first or last field of a line.
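As a rough check of how the pattern behaves, the sketch below applies it to a few in-memory lines instead of a file. Only the first sample line comes from the question (shortened); the other two are made up for illustration, and find_field is a hypothetical helper name. It uses Python 3 print() syntax.

```python
import re

# Same pattern as above: a comma, optional whitespace, a field containing
# "arnold", optional whitespace, then a comma.
FIELD_RE = re.compile(r',\s*([^,]*?arnold[^,]*?)\s*,')

def find_field(line):
    """Return the comma-delimited field containing 'arnold', or None."""
    match = FIELD_RE.search(line)
    return match.group(1) if match else None

sample_lines = [
    "77.224998664,2014-10-19,386.5889,the best arnold ***** ,81,dvd-action",
    "1.0,2014-10-19,arnold box set,12.5,dvd-action",  # invented row
    "2.0,2014-10-19,no match here,3,dvd-drama",       # invented row
]

for num, line in enumerate(sample_lines, 1):
    field = find_field(line)
    if field is not None:
        print("{} {}".format(field, num))
```

The first two lines print their matching field with surrounding whitespace stripped, even though the field sits in a different column in each; the third line has no such field, so nothing is printed for it.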
