
RegEx: Match everything but X, then X

I'm sick of trying, so here is an example that should make the problem easy to understand:

  <!-- <BASIC_INFO> KOREAN = ¼®À¯ ENGLISH = OIL CODE = AA01 ACTIVE = FALSE LABEL = 0 </BASIC_INFO> <OPTION> ANIMATION = ¿©±â¿¡ ¼³¸í </OPTION> <BUY_INFO> BUYABLE = FALSE BUYTYPE = 9 BUYOPTION = 0 COST = 0 ADD_DINAR = 0 REQ_BP = 0 REQ_LVL = 1 RANDOM_NUM = 0 </BUY_INFO> <USE_INFO> APPLY_TARGET = 0 APPLY_OPTION = 0 ADD_POING = 0 DURATIONTIME = 0 </USE_INFO> <ABILITY_INFO> </ABILITY_INFO> //--> <!-- <BASIC_INFO> KOREAN = Âü³ª¹« ENGLISH = OAK CODE = AB01 ACTIVE = FALSE LABEL = 0 </BASIC_INFO> <OPTION> ANIMATION = ¿©±â¿¡ ¼³¸í </OPTION> <BUY_INFO> BUYABLE = FALSE BUYTYPE = 9 BUYOPTION = 0 COST = 0 ADD_DINAR = 0 REQ_BP = 0 REQ_LVL = 1 RANDOM_NUM = 0 </BUY_INFO> <USE_INFO> APPLY_TARGET = 0 APPLY_OPTION = 0 ADD_POING = 0 DURATIONTIME = 0 </USE_INFO> <ABILITY_INFO> </ABILITY_INFO> //--> 

I want to match everything between <!-- and //-->, but I can't come up with a regex for it. The first match should look like this:

  <BASIC_INFO> KOREAN = ¼®À¯ ENGLISH = OIL CODE = AA01 ACTIVE = FALSE LABEL = 0 </BASIC_INFO> <OPTION> ANIMATION = ¿©±â¿¡ ¼³¸í </OPTION> <BUY_INFO> BUYABLE = FALSE BUYTYPE = 9 BUYOPTION = 0 COST = 0 ADD_DINAR = 0 REQ_BP = 0 REQ_LVL = 1 RANDOM_NUM = 0 </BUY_INFO> <USE_INFO> APPLY_TARGET = 0 APPLY_OPTION = 0 ADD_POING = 0 DURATIONTIME = 0 </USE_INFO> <ABILITY_INFO> </ABILITY_INFO> 
<!--(?<NodeContent>[^//\-\-\>]*)//-->

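This is what I tried, but the negated character class is applied to each character individually! That means the match fails as soon as a /, - or > appears between <!-- and //-->. Does anyone know how to fix this?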
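To make the failure concrete, here is a minimal sketch in Python (an assumption for illustration; the pattern in the question is .NET flavour, and Python spells the named group (?P<NodeContent>...)). The sample strings are shortened, made-up stand-ins for the real data. The class [^//\-\-\>] only means "any single character other than /, - or >", so the first > of <BASIC_INFO> already stops it.

```python
import re

# Shortened, hypothetical stand-in for one record from the question.
sample = "<!-- <BASIC_INFO> CODE = AA01 </BASIC_INFO> //-->"

# The attempted pattern: the class excludes the characters '/', '-' and '>'
# one at a time; it does not mean "anything except the sequence //-->".
attempt = re.compile(r"<!--(?P<NodeContent>[^//\-\-\>]*)//-->")

# The class stops at the first '>' (the end of <BASIC_INFO>), and what
# follows is not "//-->", so no match is found.
failed = attempt.search(sample)

# With content free of '/', '-' and '>', the same pattern does match.
ok = attempt.search("<!-- CODE = AA01 //-->")

print(failed)
print(repr(ok.group("NodeContent")))
```

So the pattern only works as long as the record body contains none of the three excluded characters, which is never the case for the tagged data shown above.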

Edit

This is what the whole document structure looks like: http://pastebin.com/cyESrLTB - my goal is to convert it to XML.

Try:

<!--(?<NodeContent>.*?)//-->

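The ? marks the match as "lazy", so it will try to match as few characters as possible. Broken down:

  • <!-- matches <!--
  • (?<NodeContent>.*?) matches .*? lazily and gives it the group name NodeContent
  • //--> matches //-->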
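The breakdown above can be tried out with a small sketch, written in Python as an assumption (the answer's pattern is .NET flavour; Python writes the group as (?P<NodeContent>...)). The two-record input is a shortened, made-up version of the question's data. One point worth checking against the real file: . does not match newlines by default, so multi-line records need re.DOTALL here (RegexOptions.Singleline in .NET).

```python
import re

# Shortened, hypothetical two-record input modelled on the question's data.
text = (
    "<!-- <BASIC_INFO>\nCODE = AA01\n</BASIC_INFO> //--> "
    "<!-- <BASIC_INFO>\nCODE = AB01\n</BASIC_INFO> //-->"
)

# Lazy: .*? stops at the first //--> it can reach.
lazy = re.compile(r"<!--(?P<NodeContent>.*?)//-->", re.DOTALL)
# Greedy, for comparison: .* runs on to the last //--> in the input.
greedy = re.compile(r"<!--(?P<NodeContent>.*)//-->", re.DOTALL)

# One entry per comment, with surrounding whitespace trimmed.
records = [m.group("NodeContent").strip() for m in lazy.finditer(text)]
# A single entry that swallows both records.
greedy_hits = greedy.findall(text)

for record in records:
    print(repr(record))
```

The lazy version yields one entry per comment, while the greedy version yields a single match spanning both records, which is why the ? matters here.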

You don't need regex here; use an HTML parser such as HtmlAgilityPack.

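// Requires the HtmlAgilityPack NuGet package and: using System.Linq;
var doc = new HtmlAgilityPack.HtmlDocument();
doc.Load(fname); // fname: path to the input file
// Select every comment node in the document and collect its text.
var comments = doc.DocumentNode.SelectNodes("//comment()")
                .Select(n => n.InnerText)
                .ToList();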
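For readers without .NET at hand, the same parser-based idea can be sketched with Python's standard-library html.parser, which calls handle_comment once per comment. This is only an illustrative analogue, not HtmlAgilityPack; the class name CommentCollector and the sample input are made up. Since the trailing // is part of the comment text, the sketch strips it by hand.

```python
from html.parser import HTMLParser


class CommentCollector(HTMLParser):
    """Collects the text of every HTML comment (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        # data is the text between <!-- and -->; drop the trailing "//"
        # that this file format puts before the closing delimiter.
        body = data.strip()
        if body.endswith("//"):
            body = body[:-2].rstrip()
        self.comments.append(body)


# Shortened, made-up input modelled on the question's data.
source = (
    "<!-- <BASIC_INFO> CODE = AA01 </BASIC_INFO> //--> "
    "<!-- <BASIC_INFO> CODE = AB01 </BASIC_INFO> //-->"
)

collector = CommentCollector()
collector.feed(source)
collector.close()

for comment in collector.comments:
    print(comment)
```

A parser takes care of finding the comment boundaries, so the only format-specific step left is removing the // marker.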

