Regex to find all words after a specific word?
I have a string like the following:
Features: -Includes hanging accessories. -Artist: William-Adolphe Bouguereau. -Made with 100pct cotton canvas. -100pct Anti-shrink pine wood bars and Epson anti-fade ultra chrome inks. -100pct Hand-made and inspected in the U.S.A. -Orientation: Horizontal. **Subject: -Figures/Nautical and beach.** Gender: -Unisex/Both. Size: -Mini 17'' and under/Small 18''-24''/Medium 25''-32''/Large 33''-40''/Oversized 41'' and above. Style: -Fine art. Color: -Blue. Country of Manufacture: -United States. Product Type: -Print of painting. Region: -Europe. Primary Art Material: -Canvas. Dimensions: -8'' H x 12'' W x 0.75'' D: 0.72 lb. -12'' H x 18'' W x 0.75'' D: 1.14 lbs. -12'' H x 18'' W x 1.5'' D: 2.45 lbs. -18'' H x 26'' W x 0.75'' D: 1.44 lbs. Paintings Prints Tori White Wildon Photography Photos Posters Abstract Black D cor Designs Framed Hazelwood Hokku Home Landscape Oil Accent 075 12 15 18 26 40 60 8 D H W x 1 1017 1824 2532 holidays, christmas gift gifts for girls boys
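I have to find the words that come after a specific word.
I want to extract the words after the word "Subject" in the example above.
The output should look like this: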
Subject: -Figures/Nautical and beach.
I tried the regular expression below:
re.compile('(?<=subject)(.{30}(?:\s|.))',re.I)
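However, there is no fixed number of words after the Subject keyword, so I cannot specify an exact count.
How can I stop at a period or a space? There is no specific stopping criterion.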
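Your (?<=subject)(.{30}(?:\s|.)) regex asserts the position right after subject. It then grabs 30 characters other than line breaks, followed by one more character that is either whitespace or any character except a line break. This does not meet your requirements because the substring can be of any length.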
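To make the fixed-length problem concrete, here is a minimal sketch that runs the pattern from the question against a shortened, hypothetical input (the `sample` string is an assumption made up for illustration, not the question's data). Whatever the field length, the capture is always 31 characters, so a short field spills into the next one.

```python
import re

# Hypothetical shortened input: the Subject field here is much shorter
# than in the question's string.
sample = "Subject: -Cats. Gender: -Unisex/Both. Size: -Mini."

# The pattern from the question: lookbehind for "subject", exactly 30
# non-newline characters, then one more character.
fixed = re.compile(r'(?<=subject)(.{30}(?:\s|.))', re.I)

match = fixed.search(sample)
captured = match.group(1)

print(len(captured))   # the capture length does not depend on the field
print(repr(captured))  # runs past "-Cats." into the Gender field
```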
You can use an alternation-based regex with a capturing group:
subject:\s*([^.]+|\S+)
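See the regex demo.
Details:
subject: - the literal string subject:
\s* - 0+ whitespace characters
([^.]+|\S+) - Group 1 capturing 1 or more characters other than a dot, or 1 or more non-whitespace characters
Note: the order of the alternatives matters here, because [^.]+ can match whitespace while \S+ cannot. If the substring after \s* starts with a dot, [^.]+ cannot match there, so \S+ matches that substring up to the next whitespace.
Python demo: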
import re
p = re.compile(r'subject:\s*([^.]+|\S+)', re.IGNORECASE)
s = "Features: -Includes hanging accessories. -Artist: William-Adolphe Bouguereau. -Made with 100pct cotton canvas. -100pct Anti-shrink pine wood bars and Epson anti-fade ultra chrome inks. -100pct Hand-made and inspected in the U.S.A. -Orientation: Horizontal. **Subject: -Figures/Nautical and beach.** Gender: -Unisex/Both. Size: -Mini 17'' and under/Small 18''-24''/Medium 25''-32''/Large 33''-40''/Oversized 41'' and above. Style: -Fine art. Color: -Blue. Country of Manufacture: -United States. Product Type: -Print of painting. Region: -Europe. Primary Art Material: -Canvas. Dimensions: -8'' H x 12'' W x 0.75'' D: 0.72 lb. -12'' H x 18'' W x 0.75'' D: 1.14 lbs. -12'' H x 18'' W x 1.5'' D: 2.45 lbs. -18'' H x 26'' W x 0.75'' D: 1.44 lbs. Paintings Prints Tori White Wildon Photography Photos Posters Abstract Black D cor Designs Framed Hazelwood Hokku Home Landscape Oil Accent 075 12 15 18 26 40 60 8 D H W x 1 1017 1824 2532 holidays, christmas gift gifts for girls boys"
m = p.search(s)
if m:
    print(m.group())   # this includes Subject:
    print(m.group(1))  # this does not include Subject:
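The note about alternation order can be checked with a small sketch. The shortened `sample` and the `dotted` string are hypothetical inputs chosen for illustration; the first pattern is the one from this answer and the second is the same pattern with the alternatives swapped.

```python
import re

# Shortened, hypothetical excerpt of the question's string.
sample = "**Subject: -Figures/Nautical and beach.** Gender: -Unisex/Both."

# Pattern from the answer: try "everything up to a dot" first.
dot_first = re.compile(r'subject:\s*([^.]+|\S+)', re.IGNORECASE)
# Swapped alternatives: \S+ wins and stops at the first space.
nonspace_first = re.compile(r'subject:\s*(\S+|[^.]+)', re.IGNORECASE)

good = dot_first.search(sample).group(1)
short = nonspace_first.search(sample).group(1)
print(good)   # -Figures/Nautical and beach
print(short)  # -Figures/Nautical

# Hypothetical value starting with a dot: [^.]+ cannot match at the dot,
# so the \S+ alternative takes over.
dotted = dot_first.search("Subject: .hidden value").group(1)
print(dotted)  # .hidden
```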
Regex:
(Subject:.+)\*\*
Matches Subject: and the content after it, up to the '**'.
Code:
import re

# Double quotes keep the '' inch marks intact; inside a single-quoted literal
# they would silently split the string. "text" avoids shadowing the built-in str.
text = "Features: -Includes hanging accessories. -Artist: William-Adolphe Bouguereau. -Made with 100pct cotton canvas. -100pct Anti-shrink pine wood bars and Epson anti-fade ultra chrome inks. -100pct Hand-made and inspected in the U.S.A. -Orientation: Horizontal. **Subject: -Figures/Nautical and beach.** Gender: -Unisex/Both. Size: -Mini 17'' and under/Small 18''-24''/Medium 25''-32''/Large 33''-40''/Oversized 41'' and above. Style: -Fine art. Color: -Blue. Country of Manufacture: -United States. Product Type: -Print of painting. Region: -Europe. Primary Art Material: -Canvas. Dimensions: -8'' H x 12'' W x 0.75'' D: 0.72 lb. -12'' H x 18'' W x 0.75'' D: 1.14 lbs. -12'' H x 18'' W x 1.5'' D: 2.45 lbs. -18'' H x 26'' W x 0.75'' D: 1.44 lbs. Paintings Prints Tori White Wildon Photography Photos Posters Abstract Black D cor Designs Framed Hazelwood Hokku Home Landscape Oil Accent 075 12 15 18 26 40 60 8 D H W x 1 1017 1824 2532 holidays, christmas gift gifts for girls boys"
# Capture from "Subject:" up to (not including) the '**' that follows it.
a = re.search(r'(Subject:.+)\*\*', text)
if a:
    print(a.group(1))
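One caveat with this pattern is that .+ is greedy, so it only stops at the right place when no other '**' appears later in the string. The sketch below uses a hypothetical input with a second '**' pair (not the question's data) to compare it with a lazy .+? variant; the lazy form is a suggestion, not part of the original answer.

```python
import re

# Hypothetical input with a second '**' pair later in the string.
sample = ("**Subject: -Figures/Nautical and beach.** Gender: -Unisex/Both. "
          "**Style: -Fine art.**")

# Greedy: .+ runs to the LAST '**' in the string.
greedy = re.search(r'(Subject:.+)\*\*', sample).group(1)
# Lazy: .+? stops at the FIRST '**' after "Subject:".
lazy = re.search(r'(Subject:.+?)\*\*', sample).group(1)

print(greedy)
print(lazy)  # Subject: -Figures/Nautical and beach.
```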