
Selecting text with a regular expression in Python

I have a string that looks like this:

"contributors_enabled": false, "geo_enabled": false, "created_at": "Fri Nov 11 15:38:06 +0000 2016"}, "text": "Facts On Managed Forex Trading htps:////t.co////E4cxCvvjD #forex #binaryoptions #cryptocurrency #stockmarket", "timestamp_ms": "1509073455803",.

I want to use a regular expression to select this text:

Facts On Managed Forex Trading htps:////t.co////E4cxCvvjD #forex #binaryoptions #cryptocurrency #stockmarket

that is, everything after "text": " and before "timestamp_ms".

Is it possible to extract this text?

Is it possible? Yes.

def text_scrap(text, start, end):
    """This function returns the data between start and end."""
    _,_,rest = text.partition(start)
    result,_,_ = rest.partition(end)
    return result

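# Single quotes make this a valid Python string literal, since the data itself contains double quotes.
my_text = '"contributors_enabled": false, "geo_enabled": false, "created_at": "Fri Nov 11 15:38:06 +0000 2016"}, "text": "Facts On Managed Forex Trading htps:////t.co////E4cxCvvjD #forex #binaryoptions #cryptocurrency #stockmarket", "timestamp_ms": "1509073455803",.'

# The end marker includes the closing quote and comma so they are not left on the result.
data_scrapped = text_scrap(my_text, start=' "text": "', end='", "timestamp_ms"') # use our new shiny function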
print(data_scrapped)

Is it a good idea? Probably not.

Your data looks like a dictionary, so you can access its "text" key much more easily. It is worth reading up on dictionaries.
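As a rough sketch of the dictionary approach: the question shows only a fragment of the record, so the example below assumes the complete record is valid JSON. The outer braces, the "user" key, and the shortened text are hypothetical and stand in for the real data.

```python
import json

# Hypothetical complete record. The question only shows a fragment, so the
# outer braces and the "user" key are assumptions made for illustration.
raw = (
    '{"user": {"contributors_enabled": false, "geo_enabled": false, '
    '"created_at": "Fri Nov 11 15:38:06 +0000 2016"}, '
    '"text": "Facts On Managed Forex Trading #forex", '
    '"timestamp_ms": "1509073455803"}'
)

record = json.loads(raw)      # parse the JSON text into a Python dict
tweet_text = record["text"]   # plain dictionary lookup, no pattern matching
print(tweet_text)
```

Unlike string slicing, this does not depend on which key happens to follow "text" in the record.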

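Judging from the string, the whole thing could probably be parsed, since it appears to be JSON. However, since you are looking for a regex-based solution, I hope the following is useful to you.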

import re

# Non-greedy group; the closing quote sits outside it, so it is not captured.
pattern = r'"text": "(.*?)", "timestamp_ms"'

# Named `data` rather than `str` to avoid shadowing the built-in type.
data = """
"contributors_enabled": false, "geo_enabled": false, "created_at": "Fri Nov 11 15:38:06 +0000 2016"}, "text": "Facts On Managed Forex Trading htps:////t.co////E4cxCvvjD #forex #binaryoptions #cryptocurrency #stockmarket", "timestamp_ms": "1509073455803",.
"""

print(re.findall(pattern, data)[0])

Output:

Facts On Managed Forex Trading htps:////t.co////E4cxCvvjD #forex #binaryoptions #cryptocurrency #stockmarket
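Indexing the result of re.findall with [0] raises IndexError when nothing matches. A minimal sketch of a more defensive variant is shown here; the helper name extract_text and the shortened sample string are hypothetical, and it still assumes that "timestamp_ms" directly follows "text" in the input.

```python
import re

# The lazy group stops at the first closing quote that is followed by the
# "timestamp_ms" key, so the quote itself is not captured.
TEXT_PATTERN = re.compile(r'"text": "(.*?)", "timestamp_ms"')


def extract_text(blob):
    """Return the captured text, or None when the pattern does not match."""
    match = TEXT_PATTERN.search(blob)
    return match.group(1) if match else None


# Shortened, hypothetical sample in the same shape as the string in the question.
sample = (
    '"geo_enabled": false}, "text": "Facts On Managed Forex Trading #forex", '
    '"timestamp_ms": "1509073455803",'
)

print(extract_text(sample))
print(extract_text("no such key here"))
```

Returning None lets the caller decide how to handle input that lacks the expected keys.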

