How can I extract specific data from a .txt file?
I'm trying to extract a few values from a txt file. It is actually a PDF file, but I couldn't find a way to extract data from a PDF, so I first converted the .pdf to .txt. My current approach, shown below, is a bit confusing. Is there a better way to do this, maybe a module or something?
with open("example.txt","r") as f:
    for i in f.readlines():
        strings = i.split(" ")
        for item in strings:
            if item == "Price":
                order=strings.index("Price") #i found the index of price
                real_price = strings[order+1] #then i took the info that i look for
                print(f"Price is {real_price}")
#Price 12,90 that's how looks like in file
I used a regular expression to extract what you want. Check this out.
import os
import re

fname = 'example.txt'
path = './'
fpath = os.path.join(path, fname)
regex = r'[pP]rice ([\d,]+)'

# read file
with open(fpath, mode='r') as txt_file:
    for line in txt_file.readlines():
        # remove leading/trailing characters
        line = line.strip()
        result = re.search(regex, line)
        # if result is not None
        if result:
            price = result.groups()[0].strip(',')
            print(f'Price is {price}')
This is the input text file:
This is a new document
the price of this is high
The specific price 12,90
Hello. Price 20,00.
A new price 30,40, is really high
This is the output:
./extract_price.py
Price is 12,90
Price is 20,00
Price is 30,40
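If you want the prices as numbers rather than strings, here is a minimal variation on the same regex idea. It is only a sketch: the names `PRICE_RE`, `extract_prices` and `SAMPLE` are made up for illustration, and it assumes every price is written as digits, a decimal comma, and exactly two digits (as in `12,90`). The sample text is kept in a string so the snippet runs without a file; in practice you would pass it the contents of your .txt file.

```python
import re
from decimal import Decimal

# Assumption: a price is digits, a decimal comma, then exactly two digits.
# Requiring that shape means trailing punctuation ("30,40,") is never captured.
PRICE_RE = re.compile(r'price\s+(\d+,\d{2})', re.IGNORECASE)


def extract_prices(text):
    """Return every price found in text as a Decimal (decimal comma -> dot)."""
    return [Decimal(match.replace(',', '.')) for match in PRICE_RE.findall(text)]


# Same sample lines as the input file above (hypothetical in-memory copy).
SAMPLE = """This is a new document
the price of this is high
The specific price 12,90
Hello. Price 20,00.
A new price 30,40, is really high
"""

if __name__ == '__main__':
    for price in extract_prices(SAMPLE):
        print(f'Price is {price}')
```

Using `Decimal` instead of `float` keeps the values exact, which usually matters for money. If your file also contains prices in other formats (thousands separators, currency symbols), the pattern would need to be adjusted.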