
Perl regex to match double-quoted ("...") string syntax

I am new to Perl and regex and I need to extract all the strings from a text file. A string is identified by anything that is wrapped by double quotes.

Examples of strings:

"This is string"
"1!=2"
"This is \"string\""
"string1"."string2"
"S
t
r
i
n
g"

The code:

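open(my $fh, '<', 'text.txt') or die "$!";

# Slurp the whole file so that multi-line strings can be matched.
my $text = do { local $/; <$fh> };
close($fh);

# Greedy .* returns the outermost pair of quotes in example 4.
my @strings = $text =~ m/".*"/g;
# Fixes the issue above, but does not handle example 3.
my @strings2 = $text =~ m/"[^"]*"/g;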

Edit: I want to match (1) a double quote, followed by (2) zero or more occurrences of either a character that is neither a double quote nor a backslash, or a backslash followed by any character, followed by (3) a double quote. In other words, (2) can be anything except an unescaped ".

The regex provided below, m/"(?:\\.|[^"])*"/g, works for the examples; however, when there is a line with "string1".string2."string3", it returns "string1" string2 "string3".

Is there any way to skip the previously matched word?

Can anyone please help?

One possible approach:

/"(?:\\.|[^"])*"/


... that reads as:

  • match double quotation mark,
  • followed by any number of...

    --- either any escaped character (any character preceded by a backslash, written \\ in the regex)

    --- or any character that's not a double quotation mark

  • followed by double quotation mark

The key trick here is the alternation: its first branch consumes any escaped character, including an escaped double quotation mark, so that quote is not taken as the end of the string.
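As a rough check, the pattern can be run against the examples from the question. This is only a sketch: the sample text is inlined in a heredoc rather than read from text.txt, and the variable names are chosen for illustration.

```perl
use strict;
use warnings;

# Sample input modelled on the question's examples (inlined here for
# illustration; the question reads it from text.txt instead).
my $text = <<'END';
"This is string"
"1!=2"
"This is \"string\""
"string1"."string2"
"string1".string2."string3"
"S
t
r"
END

# In list context with /g and no capturing groups, the match operator
# returns every complete match. [^"] also matches newlines, so the
# multi-line string is picked up as a single match.
my @strings = $text =~ /"(?:\\.|[^"])*"/g;

print "[$_]\n" for @strings;
```

In this sketch the unquoted string2 on the fifth line is not part of any match, because each match has to start at a double quote and the scan resumes after the previous closing quote.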

