Java: need regex to extract week numbers from a string
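I have a list of magazine names that may contain one or more week numbers.
Examples:
Soccer International wk43
National Geographic (wk50)
Schoolpaper wk39/wk43
Some magazine week12 until 16
Another magazine wk36_38
Another magazine wk36_wk38
And so on. What I want is the last part. So:
Soccer International week 43
National Geographic week 50
Schoolpaper week 39 - week 43
Some magazine week 12 - week 16
Another magazine week 36 - week 38
Another magazine week 36 - week 38
I started with: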
Pattern pat = Pattern.compile("(wk|week)[\\(\\_]?([0-9]{1,2}\\-?[0-9]{0,2})");
But this does not work for:
"(some wk36 tm 42)", "(some wk36/wk37)", "(some wk36_wk37)", "some wk36_37", "some wk36_wk37"
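To make the failure concrete, here is a minimal sketch that runs the pattern above against those inputs and prints what group 2 captures. The class name `InitialPatternCheck` and the helper `firstCapture` are hypothetical names introduced only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class InitialPatternCheck {
    // The pattern from the question, unchanged.
    static final Pattern PAT =
            Pattern.compile("(wk|week)[\\(\\_]?([0-9]{1,2}\\-?[0-9]{0,2})");

    // Hypothetical helper: group 2 of the first match, or null if nothing matches.
    static String firstCapture(String input) {
        Matcher m = PAT.matcher(input);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String[] inputs = {
            "(some wk36 tm 42)", "(some wk36/wk37)", "(some wk36_wk37)",
            "some wk36_37", "some wk36_wk37"
        };
        for (String in : inputs) {
            // Each input yields only "36": the pattern allows "-" between the
            // two numbers, but not " tm ", "/" or "_", nor a repeated "wk".
            System.out.println(in + " -> " + firstCapture(in));
        }
    }
}
```

In every case only the first week number is captured, so the second number of the range is lost.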
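I tried to do the following:
Read up to the first occurrence of week or wk (wk|week), then take everything after it.
Replace every occurrence of wk with week.
Somehow replace all non-numeric characters (such as / _ -).
But I'm stuck. Any ideas? Thanks in advance.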
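You can use Matcher#appendReplacement with the following regex (shown as it appears inside a Java string literal, so each backslash is doubled):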
(?i)w(?:e{2})?k(\\d+)(?:(?:\\s*until\\s*|[ _\\/])(?:w(?:e{2})?k)?(\\d+))?
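The regex is dense, so it may help to see it assembled piece by piece. The sketch below builds the same string by concatenation and comments each part; the names `WeekRegexParts`, `WEEK_WORD`, `RX` and `PATTERN` are made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WeekRegexParts {
    // Matches "wk" or "week": the "ee" is optional via (?:e{2})?
    static final String WEEK_WORD = "w(?:e{2})?k";

    static final String RX =
            "(?i)"                            // case-insensitive matching
            + WEEK_WORD + "(\\d+)"            // group 1: first week number
            + "(?:"                           // start of the optional second part
            +   "(?:\\s*until\\s*|[ _\\/])"   // separator: "until", a space, "_" or "/"
            +   "(?:" + WEEK_WORD + ")?"      // optional repeated "wk"/"week"
            +   "(\\d+)"                      // group 2: second week number
            + ")?";                           // the whole second part may be absent

    static final Pattern PATTERN = Pattern.compile(RX);

    public static void main(String[] args) {
        Matcher m = PATTERN.matcher("Another magazine wk36_wk38");
        if (m.find()) {
            System.out.println(m.group(1) + " / " + m.group(2)); // prints "36 / 38"
        }
    }
}
```

Because group 2 sits inside an optional group, `m.group(2)` returns `null` when only a single week number is present, which is what the replacement logic checks for.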
Here is a code demo:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String rx = "(?i)w(?:e{2})?k(\\d+)(?:(?:\\s*until\\s*|[ _\\/])(?:w(?:e{2})?k)?(\\d+))?";
String s = "Soccer International wk43\nNational Geographic (wk50)\nSchoolpaper wk39/wk43\nSome magazine week12 until 16\nAnother magazine wk36_38\nAnother magazine wk36_wk38";
StringBuffer result = new StringBuffer();
Matcher m = Pattern.compile(rx).matcher(s);
while (m.find()) {
    String replacement = m.group(2) == null          // Check whether Group 2 matched
        ? "week " + m.group(1)                       // If not, use just Group 1
        : "week " + m.group(1) + " - week " + m.group(2); // If yes, append Group 2
    m.appendReplacement(result, replacement);        // Append text up to the match, then the replacement
}
m.appendTail(result);                                // Append whatever follows the last match
System.out.println(result.toString());
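If you are on Java 9 or later, the same replacement can also be written with `Matcher.replaceAll(Function<MatchResult, String>)` instead of the explicit `appendReplacement`/`appendTail` loop. This is an alternative sketch rather than part of the demo above; the class name `WeekNormalizer` and the method `normalize` are hypothetical.

```java
import java.util.regex.Pattern;

class WeekNormalizer {
    // Same regex as in the demo above.
    static final Pattern WEEKS = Pattern.compile(
            "(?i)w(?:e{2})?k(\\d+)(?:(?:\\s*until\\s*|[ _\\/])(?:w(?:e{2})?k)?(\\d+))?");

    // Rewrites every week reference in the input as "week N" or "week N - week M".
    // Requires Java 9+ for Matcher.replaceAll(Function).
    static String normalize(String input) {
        return WEEKS.matcher(input).replaceAll(mr ->
                mr.group(2) == null
                        ? "week " + mr.group(1)
                        : "week " + mr.group(1) + " - week " + mr.group(2));
    }

    public static void main(String[] args) {
        System.out.println(normalize("Schoolpaper wk39/wk43"));        // Schoolpaper week 39 - week 43
        System.out.println(normalize("Some magazine week12 until 16")); // Some magazine week 12 - week 16
        System.out.println(normalize("National Geographic (wk50)"));    // National Geographic (week 50)
    }
}
```

Note that surrounding characters such as the parentheses are left untouched, because only the matched part is replaced. The returned replacement string is still interpreted for `$` and `\`, which is harmless here since it contains only letters, digits, spaces and hyphens.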
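Update for more complex scenarios (optional whitespace after wk, an optional four-digit year, and "tm" as a separator):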
String rx = "(?i)w(?:e{2})?k\\s*(\\d+)(?: +(\\d{4})\\b)?(?:(?:\\s*(?:until|tm)\\s*|[ _/])(?:w(?:e{2})?k)?(\\d+)(?: +(\\d{4})\\b)?)?";
String s = "wk 1 2016\n(wk 47 2015 tm 9 2016)\nSoccer International wk43\nNational Geographic (wk50)\nSchoolpaper wk39/wk43\nSome magazine week12 until 16\nAnother magazine wk36_38\nAnother magazine wk36_wk38";
// Replacements for the two new inputs:
//   week 1 (2016)
//   week 47 (2015) - week 9 (2016)
StringBuffer result = new StringBuffer();
Matcher m = Pattern.compile(rx).matcher(s);
while (m.find()) {
    String replacement = "";
    String prt1 = "";                       // year suffix for the first week, if any
    String prt2 = "";                       // year suffix for the second week, if any
    if (m.group(2) != null) {
        prt1 += " (" + m.group(2) + ")";
    }
    if (m.group(4) != null) {
        prt2 += " (" + m.group(4) + ")";
    }
    if (m.group(3) == null) {               // no second week number: single week
        replacement = "week " + m.group(1) + prt1;
    } else {                                // range of weeks
        replacement = "week " + m.group(1) + prt1 + " - week " + m.group(3) + prt2;
    }
    m.appendReplacement(result, replacement);
}
m.appendTail(result);
System.out.println(result.toString());
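As a quick check, the updated regex can be wrapped in a method and run against the inputs from the question that the original pattern could not handle. This is only a sketch: `WeekYearNormalizer`, `normalize` and `year` are hypothetical names, and the logic is the same loop as above, slightly condensed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WeekYearNormalizer {
    // Updated regex: groups 1/3 are week numbers, groups 2/4 are optional years.
    static final Pattern WEEKS = Pattern.compile(
            "(?i)w(?:e{2})?k\\s*(\\d+)(?: +(\\d{4})\\b)?"
            + "(?:(?:\\s*(?:until|tm)\\s*|[ _/])(?:w(?:e{2})?k)?(\\d+)(?: +(\\d{4})\\b)?)?");

    // Formats an optional year as " (YYYY)", or as an empty string when absent.
    static String year(String y) {
        return y == null ? "" : " (" + y + ")";
    }

    static String normalize(String input) {
        Matcher m = WEEKS.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String r = "week " + m.group(1) + year(m.group(2));
            if (m.group(3) != null) {
                r += " - week " + m.group(3) + year(m.group(4));
            }
            m.appendReplacement(out, r);
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("(some wk36 tm 42)"));      // (some week 36 - week 42)
        System.out.println(normalize("(some wk36/wk37)"));       // (some week 36 - week 37)
        System.out.println(normalize("some wk36_wk37"));         // some week 36 - week 37
        System.out.println(normalize("(wk 47 2015 tm 9 2016)")); // (week 47 (2015) - week 9 (2016))
    }
}
```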