
Python Regular Expression problem with ""

I have a list of strings like this:

['"country":"UK","email":"abc@vip.com","x_id":123,"level":0',
'"country":"UK","email":"bcd@vip.com","x_id":234,"level":1',
'"country":"UK","email":"efg@vip.com","x_id":456']

I want to extract x_id and level and convert them into a DataFrame, like this:

x_id  level
123   0
234   1
456   NaN

I am using re in Python, but I can't get the result. Here is my code:

data_raw=['"country":"UK","email":"abc@vip.com","x_id":123,"level":0',
'"country":"UK","email":"bcd@vip.com","x_id":234,"level":1',
'"country":"UK","email":"efg@vip.com","x_id":456']
data=pd.DataFrame(data_raw)
data['x_id']=data.apply(lambda x:re.search(r'(\"x_id\":)\d{1-10}',x))

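You can use the pandas str.extract() method. It takes a regular expression as its argument, applies it to every element of a Series, and returns the capture groups of the first match in each element (NaN where nothing matches):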

import pandas as pd

data_raw = ['"country":"UK","email":"abc@vip.com","x_id":123,"level":0',
            '"country":"UK","email":"bcd@vip.com","x_id":234,"level":1',
            '"country":"UK","email":"efg@vip.com","x_id":456']
data = pd.Series(data_raw)

# Raw strings keep \d from being read as a string escape; \d+ requires at least one digit.
x_id = data.str.extract(r'"x_id":(\d+)')
level = data.str.extract(r'"level":(\d+)')

results = pd.concat([x_id, level], axis=1)
results.columns = ['x_id', 'level']
print(results)  # display() is only available in IPython/Jupyter

Output:

    x_id    level
0   123     0
1   234     1
2   456     NaN
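
As a variation on the same idea, both columns can probably be pulled out in a single str.extract() call by using named groups, which become the column names. This is only a sketch: it assumes that "level", when present, always comes immediately after "x_id" as in the sample data, and the conversion to the nullable Int64 dtype is an optional extra that is not part of the answer above.

```python
import pandas as pd

data_raw = [
    '"country":"UK","email":"abc@vip.com","x_id":123,"level":0',
    '"country":"UK","email":"bcd@vip.com","x_id":234,"level":1',
    '"country":"UK","email":"efg@vip.com","x_id":456',
]
data = pd.Series(data_raw)

# Named groups become the column names. The "level" part is wrapped in an
# optional non-capturing group, so rows without it still match and get a
# missing value in that column.
# Assumption: "level" directly follows "x_id", as in the sample strings.
pattern = r'"x_id":(?P<x_id>\d+)(?:,"level":(?P<level>\d+))?'
results = data.str.extract(pattern)

# str.extract() returns strings; convert them to numbers, then to the
# nullable integer dtype so a missing level does not turn the column into floats.
results = results.apply(pd.to_numeric).astype("Int64")
print(results)
```

If the key order can vary between rows, the two separate extract calls shown above are the safer choice, since each pattern is matched independently.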

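As for why the attempt in the question fails: \d{1-10} is not a valid quantifier (a range is written {1,10}), so Python's re treats {1-10} as literal text and the pattern never matches. In addition, DataFrame.apply passes whole columns to the lambda by default, and re.search returns a match object rather than the digits. A minimal sketch of the same idea with plain re, where the helper name first_int is made up for illustration:

```python
import re

data_raw = [
    '"country":"UK","email":"abc@vip.com","x_id":123,"level":0',
    '"country":"UK","email":"bcd@vip.com","x_id":234,"level":1',
    '"country":"UK","email":"efg@vip.com","x_id":456',
]

# {1,10} (comma, not hyphen) means "between 1 and 10 digits".
# The parentheses capture only the digits, not the key.
X_ID = re.compile(r'"x_id":(\d{1,10})')
LEVEL = re.compile(r'"level":(\d{1,10})')


def first_int(pattern, text):
    """Hypothetical helper: first captured group as an int, or None if no match."""
    match = pattern.search(text)
    return int(match.group(1)) if match else None


# Apply the patterns to each string individually, one row per string.
rows = [(first_int(X_ID, s), first_int(LEVEL, s)) for s in data_raw]
print(rows)  # [(123, 0), (234, 1), (456, None)]
```

The resulting list of tuples can then be handed to pd.DataFrame(rows, columns=['x_id', 'level']) if a DataFrame is needed.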
