Extracting information from a text file using Python
I have a text file with information that looks like this:
# found importantstuffhere
found request could not find identifier. Please check the name and try again.
I also have lines that look like this:
# found importantstuffhere
finding (identifier here) with blah blah blah.
I want to write Python code that will go through the text file and extract the following:
A. When the search failed, as in the first example, I want to extract 'importantstuffhere' and the phrase 'found request could not find identifier'.
B. When it worked, as in the second example, I want to extract 'importantstuffhere' and the phrase 'finding (identifier here)'.
Is this possible with Python, and if so, how?
Bonus point:
Can the extracted values be placed in columns in a CSV or Excel file? For example, column A would hold importantstuffhere, and column B would hold either 'found request could not find identifier' or 'finding (identifier here)'.
Thank you for your time!
Note: the # characters are part of the text file itself; I did not add them here just for clarification.
Essentially, I want to extract the needed values and add them to lists so that I can later make them columns in a dataframe. Perhaps list one holds importantstuffhere and list two holds the results.
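The two-list extraction described above can be sketched as follows. This is a minimal sketch, not a definitive solution: it assumes every record is a "# found <name>" header followed by exactly one result line, that the name contains no spaces, and that a successful result line contains the word "with" after the identifier. The names HEADER, FAILURE, SUCCESS, and extract_pairs are hypothetical and chosen only for illustration.

```python
import re

# Assumption: a header line looks like "# found <name>", where <name> has no spaces.
HEADER = re.compile(r"^#\s*found\s+(\S+)")
# The failure phrase quoted in the question.
FAILURE = "found request could not find identifier"
# Assumption: a success line reads "finding <identifier> with ...".
SUCCESS = re.compile(r"^(finding .+?) with\b")


def extract_pairs(lines):
    """Return two parallel lists: header names and the result phrase for each."""
    names, results = [], []
    pending = None  # name from the most recent header still waiting for a result
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines between a header and its result
        header = HEADER.match(line)
        if header:
            pending = header.group(1)
            continue
        if pending is None:
            continue  # a result-like line with no header before it
        if line.startswith(FAILURE):
            result = FAILURE
        else:
            success = SUCCESS.match(line)
            result = success.group(1) if success else None
        if result is not None:
            names.append(pending)
            results.append(result)
        pending = None
    return names, results


# In-memory sample mirroring the two cases from the question.
sample_lines = [
    "# found importantstuffhere\n",
    "found request could not find identifier. Please check the name and try again.\n",
    "# found importantstuffhere\n",
    "finding (identifier here) with blah blah blah.\n",
]
names, results = extract_pairs(sample_lines)
```

With a real file, the same function could be called as extract_pairs(open('sampletext.txt')), since it only needs an iterable of lines. The two lists can then be passed to a dataframe constructor or written out as CSV columns.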
script.py:
# Collect every line that reports a failed lookup, together with its line number.
with open('sampletext.txt', 'r') as f:
    lines = f.readlines()

# Each entry has the shape {'line_number': int, 'line_text': str}.
important_stuff = []
for line_number, text in enumerate(lines):
    if 'found request could not find identifier' in text:
        important_stuff.append({'line_number': line_number, 'line_text': text})

print(important_stuff)
The following will read a file, gather the lines into one string, and write them to a CSV file separated by commas:
with open('sampletext.txt', 'r') as f:
    lines = f.readlines()

# Strip the newline from each line before joining. Calling strip('\n') on the
# joined string would only remove newlines at its two ends.
text_without_line_breaks = ", ".join(line.strip('\n') for line in lines)

with open('fileName.csv', 'w') as csv_file:
    csv_file.write(text_without_line_breaks)
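For the two-column layout asked about in the question, the standard library's csv module handles quoting and row endings, which manual string joining does not. The following is a sketch under assumptions: the function name write_columns and the header labels are hypothetical, and an in-memory buffer stands in for a real file so the example is self-contained.

```python
import csv
import io


def write_columns(csv_file, names, results):
    """Write two parallel lists as two CSV columns to an open text file."""
    writer = csv.writer(csv_file)
    writer.writerow(["column A", "column B"])  # header labels are illustrative
    writer.writerows(zip(names, results))      # one row per (name, result) pair


# io.StringIO stands in for a file opened with open(path, 'w', newline='').
buffer = io.StringIO()
write_columns(buffer, ["importantstuffhere"], ["finding (identifier here)"])
print(buffer.getvalue())
```

When writing to disk, the csv documentation recommends opening the file with newline='' so that the writer controls line endings itself.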
To check for a string and then write the next line to a CSV file, I have this:
f = open('sampletext.txt', 'r')
lines = f.readlines()
csv_lines_to_write = []
SEARCH_TEXT = 'importantstuffhere'
for line_number, text in enumerate(lines):
    if text.find(SEARCH_TEXT) != -1:
        next_line_index = line_number + 1
        next_line_text = lines[next_line_index]
        assert type(SEARCH_TEXT) is str
        assert type(next_line_text) is str
        csv_line_to_write = SEARCH_TEXT, + ', ' + lines[next_line_index]
        csv_lines_to_write.append(csv_line_to_write)
with open('fileName.csv', 'w') as csv_file:
    for line in csv_lines_to_write:
        csv_file.write(text_without_line_breaks)
I'm getting this error:
csv_line_to_write = SEARCH_TEXT, + ', ' + lines[next_line_index]
TypeError: bad operand type for unary +: 'str'
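The error comes from the stray comma after SEARCH_TEXT. Python parses that line as a tuple whose second element is + ', ' + lines[next_line_index], so the leading + is applied to a string as a unary operator, which strings do not support. Removing the comma turns the expression back into ordinary string concatenation. The write loop in the same script also refers to text_without_line_breaks, which that script never defines, so it would fail next with a NameError; it should write line instead. A possible corrected version is sketched below. The function name pair_with_next_line is hypothetical, and the bounds check is an added safeguard for a match on the last line of the file.

```python
def pair_with_next_line(lines, search_text):
    """Return 'search_text, <next line>' for every line containing search_text."""
    rows = []
    for index, text in enumerate(lines):
        # Skip a match on the final line, which has no following line to pair with.
        if search_text in text and index + 1 < len(lines):
            rows.append(search_text + ", " + lines[index + 1].strip())
    return rows


# In-memory sample standing in for f.readlines().
sample = [
    "# found importantstuffhere\n",
    "finding (identifier here) with blah blah blah.\n",
]
rows = pair_with_next_line(sample, "importantstuffhere")
print(rows)
```

Each returned row can then be written with csv_file.write(row + '\n') inside the with block.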