RegEx: Match everything but X, then X
I'll try to make this easy to understand:
<!-- <BASIC_INFO> KOREAN = ¼®À¯ ENGLISH = OIL CODE = AA01 ACTIVE = FALSE LABEL = 0 </BASIC_INFO> <OPTION> ANIMATION = ¿©±â¿¡ ¼³¸í </OPTION> <BUY_INFO> BUYABLE = FALSE BUYTYPE = 9 BUYOPTION = 0 COST = 0 ADD_DINAR = 0 REQ_BP = 0 REQ_LVL = 1 RANDOM_NUM = 0 </BUY_INFO> <USE_INFO> APPLY_TARGET = 0 APPLY_OPTION = 0 ADD_POING = 0 DURATIONTIME = 0 </USE_INFO> <ABILITY_INFO> </ABILITY_INFO> //--> <!-- <BASIC_INFO> KOREAN = Âü³ª¹« ENGLISH = OAK CODE = AB01 ACTIVE = FALSE LABEL = 0 </BASIC_INFO> <OPTION> ANIMATION = ¿©±â¿¡ ¼³¸í </OPTION> <BUY_INFO> BUYABLE = FALSE BUYTYPE = 9 BUYOPTION = 0 COST = 0 ADD_DINAR = 0 REQ_BP = 0 REQ_LVL = 1 RANDOM_NUM = 0 </BUY_INFO> <USE_INFO> APPLY_TARGET = 0 APPLY_OPTION = 0 ADD_POING = 0 DURATIONTIME = 0 </USE_INFO> <ABILITY_INFO> </ABILITY_INFO> //-->
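I want to match everything between <!-- and //-->, but I can't come up with a regular expression for it. The first match should look like this: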
<BASIC_INFO> KOREAN = ¼®À¯ ENGLISH = OIL CODE = AA01 ACTIVE = FALSE LABEL = 0 </BASIC_INFO> <OPTION> ANIMATION = ¿©±â¿¡ ¼³¸í </OPTION> <BUY_INFO> BUYABLE = FALSE BUYTYPE = 9 BUYOPTION = 0 COST = 0 ADD_DINAR = 0 REQ_BP = 0 REQ_LVL = 1 RANDOM_NUM = 0 </BUY_INFO> <USE_INFO> APPLY_TARGET = 0 APPLY_OPTION = 0 ADD_POING = 0 DURATIONTIME = 0 </USE_INFO> <ABILITY_INFO> </ABILITY_INFO>
<!--(?<NodeContent>[^//\-\-\>]*)//-->
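This is what I tried, but the negated character class works per character rather than on the whole sequence! That means it fails as soon as a /, - or > appears anywhere between <!-- and //-->. Does anyone know how to solve this?
This is what the whole document structure looks like: http://pastebin.com/cyESrLTB. My goal is to convert it to XML.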
Try:
<!--(?<NodeContent>.*?)//-->
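The ? marks the match as "lazy", so it tries to match as few characters as possible. Broken down:
<!-- - matches the literal <!--
(?<NodeContent>.*?) - lazily matches any characters and captures them in a group named NodeContent
//--> - matches the literal //-->
Note that . does not match a newline by default, so if a comment spans several lines you will also need single-line mode (RegexOptions.Singleline in .NET).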
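As a rough sketch of how the pattern above could be applied in C#: the input string below is a shortened, made-up stand-in for the data in the question, and the variable names (input, pattern, contents) are only for illustration. It assumes C# 9 or later for top-level statements, and it trims each capture because the text between the delimiters is padded with spaces.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Shortened stand-in for the input from the question (hypothetical sample data).
string input =
    "<!-- <BASIC_INFO> ENGLISH = OIL CODE = AA01 </BASIC_INFO> //--> " +
    "<!-- <BASIC_INFO> ENGLISH = OAK CODE = AB01 </BASIC_INFO> //-->";

// The attempt from the question: the class excludes '/', '-' and '>' one character
// at a time, so it cannot get past the first '>' inside the comment.
bool originalMatches = Regex.IsMatch(input, @"<!--(?<NodeContent>[^//\-\-\>]*)//-->");

// Lazy quantifier: stop at the first //--> after each <!--.
// RegexOptions.Singleline lets . match newlines, in case a comment spans lines.
var pattern = new Regex(@"<!--(?<NodeContent>.*?)//-->", RegexOptions.Singleline);

var contents = pattern.Matches(input)
    .Cast<Match>()
    .Select(m => m.Groups["NodeContent"].Value.Trim())
    .ToList();

Console.WriteLine($"Original pattern matches: {originalMatches}");
foreach (var content in contents)
    Console.WriteLine(content);
```

With this sample, the original pattern finds nothing, while the lazy pattern yields one entry per comment, each holding only the text between <!-- and //-->.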
You don't need a regex here; use an HTML parser such as HtmlAgilityPack:
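using System.Linq;

// fname: path to the input file
var doc = new HtmlAgilityPack.HtmlDocument();
doc.Load(fname);
// Select every comment node in the document and collect its text
var comments = doc.DocumentNode.SelectNodes("//comment()")
                  .Select(n => n.InnerText)
                  .ToList();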