I am going to use this regular expression on my RTF file:
((?:^|\s)[^\s\\]+(?:\\(?!line)[A-Za-z]+\n?(?:-?\d+)?[ ]?)+)(\b[^\s\\])
As you can see on https://regexr.com/, xxx\par\fi-240\li720 is not matched completely when it is followed by "-->" in my RTF file. The regex only matches " xxx\par\fi-".
Do you have any idea how to solve this?
This is my rtf file:
{\rtf1\ansi\ansicpg1252\cocoartf2513
\cocoatextscaling0\cocoaplatform0{\fonttbl\f0\froman\fcharset0 Times-Roman;}
{\colortbl;\red255\green255\blue255;}
{\*\expandedcolortbl;;}
\paperw15000\paperh15840\margl1440\margt1440\margr1440\margb1440\deftab1134\widowctrl\lytexcttp\formshade\headery720\footery720\pgwsxn15000\pghsxn15840\marglsxn1440\margtsxn1440\margrsxn1440\margbsxn1440\pgbrdropt32\pard\pard\fi-240\li720\tx1200\tx1920\tx2640\tx3360\tx4080\tx4800\tx5520\tx6240\tx6960\tx7680\tx8400\tx9120\tx9840\tx10560\itap0\nowidctlpar\plain\f2\fs20\b\chshdng0\chcfpat0{XX, XX XX\plain\f2\fs20\chshdng0\chcfpat0\par\fi-240\li720\tx1200\tx1920\tx2640\tx3360\tx4080\tx4800\tx5520\tx6240\tx6960\tx7680\tx8400\tx9120\tx9840\tx10560 URN: xxx DOB: xx Sex: XX\par\fi-240\li720\tx1200\tx1920\tx2640\tx3360\tx4080\tx4800\tx5520\tx6240\tx6960\tx7680\tx8400\tx9120\tx9840\tx10560 Home address: 3 xxx xx, xxxxx 3134\par\pard\fi-240\li720\pard\pard\fi-240\li720\itap0\nowidctlpar Home Phone: Mobile Phone:}
xxxx\par\fi-240\li720 swab xxx\par\fi-240\li720 to d/w xxxx\par\fi-240\li720 -->case x/ XX\par\fi-240\li720 to x/x xxx}
The current pattern captures (\b[^\s\\]) in the last group, which starts with a word boundary and expects to match a single non-whitespace char except \. In the example data, the next char after the whitespace char is a -, and there is no word boundary between a whitespace char and -.
What you might do is use an alternation which also accepts a - after it: (\b[^\s\\]|-)
The pattern would then look like
((?:^|\s)[^\s\\]+(?:\\(?!line)[A-Za-z]+\n?(?:-?\d+)?[ ]?)+)(\b[^\s\\]|-)
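To see the difference, here is a minimal sketch in Python that runs both patterns against a shortened piece of the RTF from the question. The names OLD, NEW and sample are only for illustration, and it assumes Python's re module treats these constructs the same way as the engine used on regexr, which should hold for a pattern like this one.

```python
import re

# Original pattern: the last group requires a word boundary.
OLD = re.compile(
    r'((?:^|\s)[^\s\\]+(?:\\(?!line)[A-Za-z]+\n?(?:-?\d+)?[ ]?)+)(\b[^\s\\])'
)

# Adjusted pattern: the last group may also be a literal "-".
NEW = re.compile(
    r'((?:^|\s)[^\s\\]+(?:\\(?!line)[A-Za-z]+\n?(?:-?\d+)?[ ]?)+)(\b[^\s\\]|-)'
)

# Shortened sample taken from the RTF in the question (raw string,
# so the backslashes are literal).
sample = r"to d/w xxxx\par\fi-240\li720 -->case x/ XX"

old_match = OLD.search(sample)
new_match = NEW.search(sample)

# The old pattern backtracks until it finds a word boundary,
# which is between "fi" and "-240".
print("old:", old_match.group(0))

# The new pattern keeps the whole run of control words in group 1
# and captures the "-" of "-->" in group 2.
print("new group 1:", new_match.group(1))
print("new group 2:", new_match.group(2))
```

With the old pattern the match stops at " xxxx\par\fi-", because the only place where \b succeeds is between "fi" and "-240". With the alternation, group 1 holds " xxxx\par\fi-240\li720 " and group 2 holds the "-". Text that starts with a word character after the control words, such as "to" in "\li720 to", still goes through the \b branch as before.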