Regular expression to parse multi-line string

I'm pulling a notes log from a ClearQuest record (yes, I know, CQ is incredibly old). It contains a few items that I need wrapped in double quotes and separated by semicolons, so that each entry can be imported into Jira as a comment. Here is an example of what the notes log could contain. It is stored in a Perl variable.

===== State: In_Work by:user1 at 12/13/2010 10:47:23 =====

Generic notes log entry that can span multiple lines depending on length
of sentence.

===== State: In_Work by:user2 at 06/04/2010 17:34:42 =====

Another generic notes log entry.

From the above example, what I need as an end result is something like the following:

my @notes_log_entries = ("\"12/13/2010 10:47:23;user1;Generic notes log entry that can span multiple lines depending on length\r\nof sentence.\"", "\"06/04/2010 17:34:42;user2;Another generic notes log entry.\"");

The following code works, but only for variables containing exactly two notes log entries, because the pattern spells out two headers explicitly:

$Notes_Log = $resultset->GetColumnValue(7);
print "Notes Log Before: $Notes_Log\n";
$Notes_Log =~ s/\R//g;
$Notes_Log =~ s/^===== State: .* by:(.*) at (.*) =====(.*)===== State: .* by:(.*) at (.*) =====(.*)/rtx-$1;$2;$3\nrtx-$4;$5;$6/g;
print "Notes Log After:\n$Notes_Log\n";

Here is some example output from the above code:

Notes Log Before:
===== State: In_Work by:user1 at 12/13/2010 10:47:23 =====

Generic notes log entry that can span multiple lines depending on length
of sentence.

===== State: In_Work by:user2 at 06/04/2010 17:34:42 =====

Another generic notes log entry.

Notes Log After:
rtx-user1;12/13/2010 10:47:23;Generic notes log entry that can span multiple lines depending on length
of sentence.
rtx-user2;06/04/2010 17:34:42;Another generic notes log entry.

Of course, there are likely many other ways to solve this problem. Here is mine:

my @notes_log_entries = ();

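# Strip one entry off $Notes_Log per iteration until no header is left.
# Note: the body class [\w\s.] only accepts word characters, whitespace and periods.
while ($Notes_Log =~ s/===== State: .* by:(.*) at (.*) =====\R\R([\w\s.]*)\R?\R?//) {
    my ($user, $date, $text) = ($1, $2, $3);
    $text =~ s/\s+\z//;    # drop the blank lines that trail each entry
    my $single_entry = '"'.$date.';'.$user.';'.$text.'"';
    push(@notes_log_entries, $single_entry);
}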

A while loop often helps me when building a regex, because the pattern only has to describe a single entry and the loop takes care of the repetition. That keeps the regex itself much simpler.
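The same one-entry-per-iteration idea can also be written without modifying $Notes_Log, by matching with the /g flag and using a lookahead to stop at the next header. What follows is only a sketch under a few assumptions: the sub name parse_notes_log is made up for illustration, user names are assumed to contain no whitespace, line breaks inside an entry are rewritten as \r\n to match the desired result in the question, and double quotes inside an entry are not escaped. It requires Perl 5.10 or later for \R and \h.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns one
# "timestamp;user;text" string, wrapped in double quotes, per log entry.
sub parse_notes_log {
    my ($log) = @_;

    my $entry_re = qr{
        ^=====\ State:\ [^\r\n]*?\ by:(\S+)\ at\ ([^\r\n]+?)\ =====\h*\R  # header: user, timestamp
        (.*?)                        # body, possibly spanning several lines
        (?= ^=====\ State: | \z )    # stop before the next header or at end of input
    }msx;

    my @entries;
    # /g in scalar context resumes after the previous match on each
    # iteration, so $log itself is never changed.
    while ($log =~ /$entry_re/g) {
        my ($user, $timestamp, $text) = ($1, $2, $3);
        $text =~ s/\A\s+|\s+\z//g;   # trim the blank lines around the body
        $text =~ s/\R/\r\n/g;        # assumption: keep inner line breaks as CRLF
        push @entries, qq{"$timestamp;$user;$text"};
    }
    return @entries;
}

# Sample data copied from the question; in the real script the string
# would come from $resultset->GetColumnValue(7).
my $Notes_Log = <<'END_LOG';
===== State: In_Work by:user1 at 12/13/2010 10:47:23 =====

Generic notes log entry that can span multiple lines depending on length
of sentence.

===== State: In_Work by:user2 at 06/04/2010 17:34:42 =====

Another generic notes log entry.
END_LOG

my @notes_log_entries = parse_notes_log($Notes_Log);
print "$_\n" for @notes_log_entries;
```

Because the body is captured with a lazy .*? rather than a character class, entries containing commas, parentheses or other punctuation are handled as well, and the number of entries in the log does not matter.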
