
Regular expression to parse multi-line string

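I am extracting the notes log from ClearQuest records (yes, I know, CQ is ancient). It contains entries that I need to wrap in double quotes and separate with semicolons, basically so they can be imported into Jira as comments. Below is an example of what the notes log can contain. It is stored in a variable in Perl.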

===== State: In_Work by:user1 at 12/13/2010 10:47:23 =====

Generic notes log entry that can span multiple lines depending on length
of sentence.

===== State: In_Work by:user2 at 06/04/2010 17:34:42 =====

Another generic notes log entry.

From the example above, the end result I need looks like this:

my @notes_log_entries = ("\"12/13/2010 10:47:23;user1;Generic notes log entry that can span multiple lines depending on length\r\nof sentence.\"", "\"06/04/2010 17:34:42;user2;Another generic notes log entry.\"");

The following code works, but only for a variable that contains exactly two notes log entries:

$Notes_Log = $resultset->GetColumnValue(7);
print "Notes Log Before: $Notes_Log\n";
$Notes_Log =~ s/\R//g;
$Notes_Log =~ s/^===== State: .* by:(.*) at (.*) =====(.*)===== State: .* by:(.*) at (.*) =====(.*)/rtx-$1;$2;$3\nrtx-$4;$5;$6/g;
print "Notes Log After:\n$Notes_Log\n";

Here is some sample output from the code above:

Notes Log Before:
===== State: In_Work by:user1 at 12/13/2010 10:47:23 =====

Generic notes log entry that can span multiple lines depending on length
of sentence.

===== State: In_Work by:user2 at 06/04/2010 17:34:42 =====

Another generic notes log entry.

Notes Log After:
rtx-user1;12/13/2010 10:47:23;Generic notes log entry that can span multiple lines depending on length
of sentence.
rtx-user2;06/04/2010 17:34:42;Another generic notes log entry.
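
To illustrate why the substitution above is tied to two entries, here is a small sketch that runs the same two statements against a log with three entries. The sample text and the third entry (user3 and its timestamp) are invented for illustration; only the two `s///` statements come from the code above.

```perl
use strict;
use warnings;

# Hypothetical three-entry log; the third entry is invented for illustration.
my $Notes_Log = join "\n",
    '===== State: In_Work by:user1 at 12/13/2010 10:47:23 =====', '',
    'First note.', '',
    '===== State: In_Work by:user2 at 06/04/2010 17:34:42 =====', '',
    'Second note.', '',
    '===== State: In_Work by:user3 at 01/02/2010 08:00:00 =====', '',
    'Third note.';

$Notes_Log =~ s/\R//g;   # same first step as above: remove every line break
$Notes_Log =~ s/^===== State: .* by:(.*) at (.*) =====(.*)===== State: .* by:(.*) at (.*) =====(.*)/rtx-$1;$2;$3\nrtx-$4;$5;$6/g;

print "$Notes_Log\n";
```

The pattern spells out the header exactly twice, and the first greedy `.*` is free to swallow anything in front of the last two headers. With this sample input the output should contain only the user2 and user3 entries, while the user1 entry disappears into the uncaptured `.*`.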

Of course, there is probably more than one way to solve this; here is mine:

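my @notes_log_entries = ();

# Each successful substitution deletes the first remaining header and its body
# from $Notes_Log, so the loop ends once no header is left.
while ($Notes_Log =~ s/===== State: .* by:(.*) at (.*) =====\R\R([\w\s.]*)\R?\R?//) {
    # $1 = user, $2 = timestamp, $3 = entry text (word characters, whitespace and dots only)
    my $single_entry = '"'.$2.';'.$1.';'.$3.'"';
    # Drop the blank lines captured at the end of the text, keeping line breaks inside it.
    $single_entry =~ s/\s+"\z/"/;
    push(@notes_log_entries, $single_entry);
}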

A while loop often helps me when building regular expressions, because it reduces the complexity of the actual regex.
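
As a variation on the same idea, the loop can also be written with a `/g` match instead of a substitution, so that `$Notes_Log` is left untouched. This is only a sketch under a few assumptions: the state name and the user name contain no spaces, every entry starts with a `===== State: ` header at the beginning of a line, and the heredoc below stands in for the value read from ClearQuest. The variable names `$user`, `$timestamp` and `$text` are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical sample input standing in for the ClearQuest notes log.
my $Notes_Log = <<'END';
===== State: In_Work by:user1 at 12/13/2010 10:47:23 =====

Generic notes log entry that can span multiple lines depending on length
of sentence.

===== State: In_Work by:user2 at 06/04/2010 17:34:42 =====

Another generic notes log entry.
END

my @notes_log_entries;

# A /g match in scalar context resumes after the previous match, so each
# iteration picks up the next header; $Notes_Log itself is not modified.
# The lookahead stops the lazy body capture at the next header or at the end.
while ($Notes_Log =~ /^===== State: \S+ by:(\S+) at (.+?) =====\R+(.*?)(?=^===== State: |\z)/msg) {
    my ($user, $timestamp, $text) = ($1, $2, $3);
    $text =~ s/\s+\z//;   # drop the blank lines that trail each entry
    push @notes_log_entries, qq{"$timestamp;$user;$text"};
}

print "$_\n" for @notes_log_entries;
```

Because the body is captured with `.*?` under `/s` rather than a character class, punctuation such as commas or colons in the notes text is kept. Line breaks inside an entry are passed through exactly as they appear in the input, so a log with `\r\n` line endings keeps `\r\n` inside the quoted entry.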

