
Finding text that spans multiple lines in a file

I need to extract the field names between the opening and closing parentheses of the unique index definition. The definition can span two, three, or more lines.

Here is the content of the file:

create index "informix".be_ach_detail_1_ix1 on "informix".be_ach_detail_1 (association,bank_number,batch_date) using btree ; 
create unique index "informix".bank_info_pk on "informix" .bank_info 
(merchant,bank_number,batch_date,sequence_number, association,transaction_code,ach_table) using btree ;  

Expected output:

(merchant,bank_number,batch_date,sequence_number, association,transaction_code,ach_table)

I have tried several findall patterns, but none of them work:

myfile=re.findall(r'unique index\s.*\S*\)',myfile)[0]

You can use pattern = r"unique index[^(]*\(([^)]*)\)", which breaks down as follows:

  • unique index matches exactly this substring in the text
  • [^(]* matches any run of characters other than ( (newlines included), i.e. everything up to the opening bracket
  • \( matches the opening bracket (it has to be escaped with a backslash)
  • ([^)]*) is the capture group: it matches any run of characters other than ), i.e. everything up to the closing bracket
  • \) matches the closing bracket (it has to be escaped with a backslash)
import re

text = """create index "informix".be_ach_detail_1_ix1 on "informix".be_ach_detail_1 (association,bank_number,batch_date) using btree ; create unique index "informix".bank_info_pk on "informix" .bank_info (merchant,bank_number,batch_date,sequence_number, association,transaction_code,ach_table) using btree ;"""

pattern = r"unique index[^(]*\(([^)]*)\)"
print(re.findall(pattern, text))

This prints:

['merchant,bank_number,batch_date,sequence_number, association,transaction_code,ach_table']
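The attempt in the question most likely fails because . does not match a newline by default, so .* cannot get from the "unique index" line to the bracket on the next line. The negated character classes in the pattern above have no such limit. Here is a minimal sketch that runs the same pattern on text split across lines, as in the question's file, and then splits the captured group into individual field names. The helper name unique_index_fields is only illustrative and not part of the original answer.

```python
import re

# Sample text with the unique index definition split across two lines,
# mirroring the file content shown in the question.
text = '''create index "informix".be_ach_detail_1_ix1 on "informix".be_ach_detail_1 (association,bank_number,batch_date) using btree ;
create unique index "informix".bank_info_pk on "informix" .bank_info
(merchant,bank_number,batch_date,sequence_number, association,transaction_code,ach_table) using btree ;
'''

# Same pattern as in the answer; [^(]* and [^)]* also match newlines.
pattern = r"unique index[^(]*\(([^)]*)\)"

def unique_index_fields(sql):
    """Return one list of field names per unique index found in sql.

    Hypothetical helper for illustration only.
    """
    return [
        [field.strip() for field in group.split(",")]
        for group in re.findall(pattern, sql)
    ]

fields = unique_index_fields(text)
print(fields)
```

If the brackets are needed in the output, as in the expected output of the question, wrapping each captured group as "(" + group + ")" should be enough.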
