
How do I prevent a regex from matching when I have a single quote preceded by whitespace?

I do not want a match when there is an odd number of quotes and the opening quote is preceded by whitespace. In the case of nested quotations, only the outermost quote is considered. For example:

please don "t turn on a "light". ->this should not match

please don"t turn on a "light". -> this should match "light"

So far I have only got as far as matching the quotes:

((?!^)(\s)".*?[\s]*"+)|(^".*?[\s]*"+)

sample test cases.

turn on "Light A" and "Light B" -> matches light A and light B

"Light A " was turned on -> matches Light A

She replied"as you say" -> does not match

She replied "as you say" -> matches "as you say"

please don 't turn on a "light". ->this should not match

please don "t turn on a 'light'. ->this should not match

She replied "please turn on 'Light A'" -> matches please turn on light A

please don "t turn on a "light". ->this should not match

To replace a "..." substring in JS that is neither preceded nor followed by a non-whitespace character, you may use

.replace(/(\s|^)".*?"(?!\S)/g, '$1<REPLACEMENT_HERE>')

Or, to match any char including line break chars:

.replace(/(\s|^)"[^]*?"(?!\S)/g, '$1<REPLACEMENT_HERE>')

Or, if you only target the latest ECMAScript compatible JS environments, use

.replace(/(?<!\S)".*?"(?!\S)/g, '<REPLACEMENT_HERE>')

See the regex demo

Details

  • (\s|^) - Group 1: whitespace or start of string
  • " - a "
  • .*? - any 0+ chars other than line break chars, as few as possible
  • " - a "
  • (?!\S) - whitespace or end of string should follow the current position.
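As a rough illustration of how the capture-group pattern behaves on some of the sample inputs from the question, here is a minimal sketch. The helper names findQuoted and replaceQuoted are hypothetical and only wrap the regex shown above.

```javascript
// Pattern from the answer: a "..." run delimited by whitespace or string boundaries.
const QUOTED = /(\s|^)".*?"(?!\S)/g;

// Hypothetical helper: list the quoted substrings, dropping the leading
// whitespace that Group 1 captured.
function findQuoted(text) {
  return Array.from(text.matchAll(QUOTED), (m) => m[0].slice(m[1].length));
}

// Hypothetical helper: replace each quoted substring while keeping the
// leading whitespace (the same effect as '$1<REPLACEMENT_HERE>').
function replaceQuoted(text, replacement) {
  return text.replace(QUOTED, (match, lead) => lead + replacement);
}

console.log(findQuoted('turn on "Light A" and "Light B"'));
// [ '"Light A"', '"Light B"' ]
console.log(findQuoted('She replied"as you say"'));
// []
console.log(findQuoted('please don "t turn on a "light".'));
// []
console.log(replaceQuoted('She replied "as you say"', '<X>'));
// She replied <X>
```

Note that the trailing (?!\S) also rejects a closing quote that is directly followed by punctuation, so "light". at the end of a sentence is not matched by this pattern.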

The (?<!\S) in the last example is a negative lookbehind that matches a location not immediately preceded by a non-whitespace char. At the time of writing, lookbehinds were not supported by the majority of browsers.
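If the code has to run on engines with and without lookbehind support, one possible approach is to probe for the feature at runtime and fall back to the capture-group form. This is only a sketch: supportsLookbehind and replaceQuotedPortable are hypothetical names, and the patterns are built with the RegExp constructor so that the script still parses on engines that reject lookbehind syntax.

```javascript
// Returns true when the engine can compile a lookbehind assertion.
function supportsLookbehind() {
  try {
    new RegExp('(?<!\\S)a');
    return true;
  } catch (e) {
    return false; // engines without lookbehind throw a SyntaxError here
  }
}

// Hypothetical wrapper: use the lookbehind pattern when available,
// otherwise the capture-group pattern that re-inserts the leading whitespace.
function replaceQuotedPortable(text, replacement) {
  if (supportsLookbehind()) {
    const re = new RegExp('(?<!\\S)".*?"(?!\\S)', 'g');
    return text.replace(re, () => replacement);
  }
  const fallback = new RegExp('(\\s|^)".*?"(?!\\S)', 'g');
  return text.replace(fallback, (match, lead) => lead + replacement);
}

console.log(replaceQuotedPortable('She replied "as you say"', '<X>'));
// She replied <X>
```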

I do not know JavaScript, but I know some regex. The following code was written in IPython, using Python's re module.

I divided the problem into parts:

  1. Remove unwanted double quotes, i.e. a double quote that is both preceded and followed by a word character, by replacing it with an empty string.
    re.sub(r'(?<=\w)"(?=\w)', "", text)
  2. Find the substrings enclosed in double quotes.
    re.findall(r'(?<!\w)"(.*?)"(?!\w)', re.sub(r'(?<=\w)"(?=\w)', "", text))
  3. At this point, no extracted substring should contain a double quote. If one does, a double quote is misplaced, e.g. in: please don "t turn on a "light". Filter out the substrings that contain a double quote. Complete function:
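    import re

    def filter_quotes(text):
        # p1 captures text between quotes that do not touch a word character on the outside
        p1 = r'(?<!\w)"(.*?)"(?!\w)'
        # p2 matches a double quote squeezed between two word characters, as in don"t
        p2 = r'(?<=\w)"(?=\w)'
        matches = re.findall(p1, re.sub(p2, "", text))
        kept = []
        for m in matches:
            if '"' in m:
                print("filtered", m)  # a leftover quote means a misplaced quotation
            else:
                print("kept", m)
                kept.append(m)  # build a new list instead of removing while iterating
        return kept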
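Since the question is about JavaScript, the same three steps could be ported roughly as follows. This is a sketch under the assumption that the target engine supports lookbehind; the function name filterQuotes is hypothetical.

```javascript
// Hypothetical JavaScript port of the Python steps above.
function filterQuotes(text) {
  // Step 1: drop a double quote squeezed between two word characters (don"t).
  const cleaned = text.replace(/(?<=\w)"(?=\w)/g, '');
  // Step 2: capture the text between quotes that do not touch a word
  // character on the outside.
  const candidates = Array.from(
    cleaned.matchAll(/(?<!\w)"(.*?)"(?!\w)/g),
    (m) => m[1]
  );
  // Step 3: a leftover quote inside a candidate signals a misplaced quote,
  // so that candidate is discarded.
  return candidates.filter((s) => !s.includes('"'));
}

console.log(filterQuotes('turn on "Light A" and "Light B"'));
// [ 'Light A', 'Light B' ]
console.log(filterQuotes('please don"t turn on a "light".'));
// [ 'light' ]
console.log(filterQuotes('please don "t turn on a "light".'));
// []
```

Unlike the whitespace-based pattern from the other answer, this variant accepts a closing quote followed by punctuation, because it only checks for word characters around the quotes.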
