
Replacing one string with another string using Regex in Python: Error: re.error: bad escape \w at position 0

I am trying to replace occurrences such as "word one" with "word_one", i.e. replace the whitespace between two words with "_".

Here is my code:

import re

labels_ls = ['word <= 0.01', 'word_two <= 0.23', 'word three <= 0.01']

regex_whitespace = r'\w+\s+\w+\b'
new_regex = r'\w+\_+\w+\b'
pattern = re.compile(regex_whitespace) # this I just added after reviewing other related questions

# Loop through labels_ls to find any whitespace-separated n-gram labels (e.g. gilt maximal)

for i in labels_ls:
    if re.match(regex_whitespace, i):
        # replace the whitespace with a '_' to form gilt_maximal
        new_string = re.sub(pattern, new_regex, i)
        print('new string: ', new_string)

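I tested my regex on https://pythex.org and it works as required, but when I run this code I get the following error:

re.error: bad escape \w at position 0

I have looked at all the related answered questions:

How to fix - error: bad escape \u at position 0

Regex: replacing one pattern with another

I have tried removing the r in front of the regex, as suggested in the questions above, but it still does not work.

I also tried using compile(), but that did not solve the problem either.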

labels_ls = ['internal_punctuation <= 0.042', 'darf <= 0.717', 'formal_global_yes <= 0.5', 'wert <= 0.272', 'signal <= 0.5', 'Flesch_Index <= 0.813', 'zulass <= 0.379', 'polarity <= 0.713', 'Nb_of_auxiliary <= 0.071', 'gini = 0.0', 'polarity <= 0.375', 'gini = 0.0', 'Nb_of_verbs <= 0.094', 'weakwords_nb <= 0.143', 'passive_global_yes <= 0.5', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'Nb_of_verbs <= 0.094', 'passive_global_yes <= 0.5', 'WPS <= 0.062', 'measurement_values_no <= 0.5', 'gini = 0.0', 'SPW <= 0.575', 'weird_words <= 0.042', 'weakwords_nb <= 0.036', 'SPW <= 0.272', 'gini = 0.0', 'words_nb <= 0.033', 'gini = 0.5', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'Flesch_Index <= 0.774', 'SPW <= 0.331', 'gini = 0.0', 'gini = 0.0', 'Comp_conj <= 0.375', 'SPW <= 0.111', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'Sub_Conj <= 0.25', 'weird_words <= 0.208', 'zsdf <= 0.5', 'signal <= 0.297', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'words_nb <= 0.164', 'Aux_Start_no <= 0.5', 'gini = 0.0', 'Nb_of_Umsetzbarkeit_conj <= 0.167', 'werden <= 0.125', 'darf <= 0.297', 'polarity <= 0.925', 'SPW <= 0.376', 'WPS <= 0.11', 'numerical_values <= 0.091', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'WPS <= 0.11', 'gini = 0.0', 'gini = 0.0', 'polarity <= 0.25', 'gini = 0.0', 'Flesch_Index <= 0.663', 'words_nb <= 0.033', 'SPW <= 0.475', 'gini = 0.0', 'gini = 0.0', 'Comp_conj <= 0.125', 'gini = 0.56', 'gini = 0.0', 'Flesch_Index <= 0.75', 'gini = 0.444', 'gini = 0.0', 'Aux_Start_yes <= 0.5', 'darf <= 0.241', 'Nb_of_verbs <= 0.156', 'gini = 0.0', 'SPW <= 0.246', 'polarity <= 0.675', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'Sub_Conj <= 0.25', 'numerical_values <= 0.227', 'funktion <= 0.348', 'internal_punctuation <= 0.458', 'polarity <= 0.375', 'gini = 0.0', 'Nb_of_verbs <= 0.031', 
'gini = 0.0', 'Flesch_Index <= 0.409', 'gini = 0.0', 'numerical_values <= 0.136', 'WPS <= 0.065', 'darf <= 0.359', 'Nb_of_Umsetzbarkeit_conj <= 0.167', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'formal_global_no <= 0.5', 'WPS <= 0.164', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gilt randbeding <= 0.181', 'fahrzeug <= 0.352', 'gini = 0.0', 'zulass <= 0.082', 'gini = 0.0', 'gini = 0.0', 'fur <= 0.194', 'weakwords_nb <= 0.321', 'gini = 0.444', 'gini = 0.0', 'gini = 0.0', 'Nb_of_Umsetzbarkeit_conj <= 0.167', 'Nb_of_verbs <= 0.344', 'gini = 0.0', 'gini = 0.0', 'words_nb <= 0.178', 'gini = 0.0', 'words_nb <= 0.224', 'gini = 0.0', 'gini = 0.0']

You need to use

regex_whitespace = r'(\w+)\s+(\w+)\b'

and then:

new_string = re.sub(pattern, r'\1_\2', i)
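Putting the two lines above together, a minimal runnable sketch might look like the following. The shortened `labels_ls` sample and the helper name `join_ngrams` are assumptions made for illustration and are not part of the original question.

```python
import re

# Shortened, illustrative sample of the labels from the question.
labels_ls = ['word <= 0.01', 'word_two <= 0.23',
             'word three <= 0.01', 'gilt randbeding <= 0.181']

# Two capture groups: the word before and the word after the whitespace.
pattern = re.compile(r'(\w+)\s+(\w+)\b')

def join_ngrams(label):
    """Replace the whitespace between two adjacent words with '_'."""
    # \1 and \2 are backreferences to the two captured words.
    return pattern.sub(r'\1_\2', label)

new_labels = [join_ngrams(label) for label in labels_ls]
print(new_labels)
```

Labels without a whitespace-separated word pair are returned unchanged, so the `re.match` check from the question is not strictly needed here. Note that `sub` replaces every non-overlapping match in the string, not only one at the start.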


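The point is that you need to capture the word chars matched by the first regex into capturing groups, and then use backreferences to the matched group values in the replacement. new_regex = r'\w+\_+\w+\b' is redundant because you cannot use a regex as the replacement: a replacement pattern can only contain backreferences and escape sequences (backslashes must be escaped there).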
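As a small sketch of what a replacement template accepts, the snippet below contrasts a regex used as the replacement with backreferences and an escaped backslash. It assumes Python 3.7 or later, where an unknown escape made of a backslash and an ASCII letter in the replacement raises `re.error`. The variable names are illustrative only.

```python
import re

pattern = re.compile(r'(\w+)\s+(\w+)\b')
label = 'word three <= 0.01'

# A regex as the replacement: \w is not a valid template escape, so sub() raises.
try:
    pattern.sub(r'\w+\_+\w+\b', label)
    error_message = None
except re.error as exc:
    error_message = str(exc)
print(error_message)

# Backreferences are valid, in numbered (\1) or explicit (\g<1>) form.
with_backrefs = pattern.sub(r'\1_\2', label)
with_g_syntax = pattern.sub(r'\g<1>_\g<2>', label)

# A literal backslash in the replacement must itself be escaped.
with_backslash = pattern.sub(r'\1\\\2', label)

# A function replacement bypasses template parsing altogether.
with_function = pattern.sub(lambda m: m.group(1) + '_' + m.group(2), label)

print(with_backrefs, with_g_syntax, with_backslash, with_function, sep='\n')
```

This also suggests why dropping the `r` prefix or calling `compile()` does not help: neither changes the fact that the second argument of `re.sub` is parsed as a replacement template, not as a regex.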

