Pattern matching in a file

Question

I am trying to find multiple matches in a file. I am using the following code:

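import re

# Read the whole file into a single string
with open('/home/evi.nastou/Documenten/filename') as f:
    text = f.read()
# print(text)
urls = re.findall(r"_8o _8r lfloat\" href=\"(.+?)\" onclick=", text)
for url in urls:
    # Strip the backslashes from each captured URL
    print(url.replace('\\', ''))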

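But it does not return any results.

On the other hand, when I pass the entire text in a variable, it does find the pattern. Can someone help me?

P.S. Part of the text in the file: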

Answer 1

The problem appears to be your regular expression.

Use this one instead:

r'href\s*=\s*(.+?)\s+onclick\s*='

Code:

import re

# test.txt contains your string
with open('test.txt') as f:
    text = f.read()

urls = re.findall(r'href\s*=\s*(.+?)\s+onclick\s*=', text)
for url in urls:
    # Strip the backslashes from each captured value
    print(url.replace('\\', ''))

Output:

"http://www.facebook.com/name"

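The output above still carries the surrounding double quotes, because the capture group takes in everything between `=` and the whitespace before `onclick`. A minimal sketch of one way to capture only the URL is given here. It assumes the file stores quotes and slashes in backslash-escaped form, which the `replace('\\', '')` call suggests but the question does not confirm. The `sample` string is hypothetical and only stands in for the real file contents.

```python
import re

# Hypothetical sample; the actual file contents are not shown in the question.
sample = r'<a class=\"_8o _8r lfloat\" href=\"http:\/\/www.facebook.com\/name\" onclick=\"return false;\">'

# \\?" matches a double quote with an optional backslash in front of it,
# so the group captures only what sits between the two quotes.
pattern = re.compile(r'href\s*=\s*\\?"(.+?)\\?"\s+onclick\s*=')

# Remove the remaining backslashes (for example in "\/") from each match.
urls = [match.replace('\\', '') for match in pattern.findall(sample)]
print(urls)
```

Because the backslash is optional in the pattern, the same expression should also handle text whose quotes are not escaped.
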
Explanation of my regular expression:

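href    # match the literal text href
\s*     # match zero or more whitespace characters
=       # match =
\s*     # match zero or more whitespace characters
(.+?)   # capture one or more of any character (non-greedy)
\s+     # match one or more whitespace characters
onclick # match the literal text onclick
\s*     # match zero or more whitespace characters
=       # match =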
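
The breakdown above can be kept next to the pattern itself by compiling it with `re.VERBOSE`, which ignores whitespace in the pattern and allows `#` comments. The sketch below also compares the non-greedy group with a greedy `(.+)`. The `sample` string, with its two links on one line, is made up for illustration.

```python
import re

# Same expression as in the answer, written with inline comments.
pattern = re.compile(r"""
    href      # the literal text href
    \s*       # zero or more whitespace characters
    =         # the equals sign
    \s*       # zero or more whitespace characters
    (.+?)     # capture group: one or more characters, as few as possible
    \s+       # one or more whitespace characters
    onclick   # the literal text onclick
    \s*       # zero or more whitespace characters
    =         # the equals sign
""", re.VERBOSE)

# Hypothetical input with two links on the same line.
sample = 'href="http://www.facebook.com/name" onclick="x" href="http://example.com/b" onclick="y"'

non_greedy = pattern.findall(sample)
greedy = re.findall(r'href\s*=\s*(.+)\s+onclick\s*=', sample)

print(non_greedy)  # one entry per link
print(greedy)      # a single entry running from the first href to the last onclick
```

With several links on one line, the greedy form merges them into one match, which is why the non-greedy `(.+?)` is the safer choice here.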

Answer 1 was accepted on 2013-04-04 11:38:06.
