
Python - tokenizing, replacing words

I'm trying to create sentences with random words inserted into them. Specifically, I'd have something like:

"The weather today is [weather_state]."

and I'd like to find all tokens in [square brackets] and swap each one for a random counterpart from a dictionary or list, so that I end up with:

"The weather today is warm."
"The weather today is bad."

or

"The weather today is mildly suiting for my old bones."

Keep in mind that the [bracket] token won't always be in the same position, and there will be multiple bracketed tokens in a string, for example:

"[person] is feeling really [how] today, so he's not going [where]."

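I really don't know where to start, or whether tokenizing or the token modules are even the right tool for this. Any hints that point me in the right direction are greatly appreciated!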

EDIT: To clarify, I don't really need to use square brackets; any non-standard character will do.

You're looking for re.sub with a callback function:

words = {
    'person': ['you', 'me'],
    'how': ['fine', 'stupid'],
    'where': ['away', 'out']
}

import re, random

def random_str(m):
    return random.choice(words[m.group(1)])


text = "[person] is feeling really [how] today, so he's not going [where]."
print(re.sub(r'\[(.+?)\]', random_str, text))

# Example output: me is feeling really stupid today, so he's not going away.

Note that, unlike the format method, this allows more sophisticated processing of placeholders, e.g.

[person:upper] got $[amount if amount else 0] etc

Basically, you can build your own "template engine" on top of this.
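As a rough sketch of that idea, the callback can split each placeholder into a name and an optional modifier. The `[name:modifier]` syntax, the `MODIFIERS` table, and the `render` helper below are made up for illustration; they are not part of `re` or any library:

```python
import random
import re

# Illustrative word lists, same shape as in the answer above.
words = {
    'person': ['you', 'me'],
    'how': ['fine', 'stupid'],
    'where': ['away', 'out'],
}

# Hypothetical modifiers: modifier name -> function applied to the chosen word.
MODIFIERS = {
    'upper': str.upper,
    'title': str.title,
}

def render(template, words):
    """Replace [name] or [name:modifier] with a random word from words[name]."""
    def replace(m):
        name, _, modifier = m.group(1).partition(':')
        if name not in words:
            return m.group(0)  # leave unknown placeholders untouched
        word = random.choice(words[name])
        # A missing or unrecognized modifier falls back to str, i.e. no change.
        return MODIFIERS.get(modifier, str)(word)
    return re.sub(r'\[(.+?)\]', replace, template)

print(render("[person:upper] is feeling really [how] today.", words))
```

Leaving unknown placeholders in place is just one design choice; raising an error instead would make typos in the template easier to spot.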

You can use the format method.

>>> a = 'The weather today is {weather_state}.'
>>> a.format(weather_state = 'awesome')
'The weather today is awesome.'
>>>

Also:

>>> b = '{person} is feeling really {how} today, so he\'s not going {where}.'
>>> b.format(person = 'Alegen', how = 'wacky', where = 'to work')
"Alegen is feeling really wacky today, so he's not going to work."
>>>

Of course, this approach only works if you can switch from square brackets to curly braces.
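If curly braces are acceptable, one way to get random words without passing each keyword by hand is `str.format_map` with a mapping that picks a word on every lookup. The `RandomWords` class below is a hypothetical helper, a sketch rather than an established recipe:

```python
import random

class RandomWords(dict):
    """Maps a placeholder name to a list of words; lookup returns one at random."""
    def __getitem__(self, key):
        return random.choice(super().__getitem__(key))

# Illustrative word lists.
choices = RandomWords(
    person=['Alegen', 'Buster'],
    how=['wacky', 'sleepy'],
    where=['to work', 'camping'],
)

b = "{person} is feeling really {how} today, so he's not going {where}."
print(b.format_map(choices))
```

A placeholder with no entry in `choices` raises `KeyError`, the same as a missing keyword argument to `format`.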

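If you use curly braces instead of square brackets, your string can be used as a string formatting template. You can then fill it with many combinations of substitutions using itertools.product: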

import itertools as IT

text = "{person} is feeling really {how} today, so he's not going {where}."
persons = ['Buster', 'Arthur']
hows = ['hungry', 'sleepy']
wheres = ['camping', 'biking']

for person, how, where in IT.product(persons, hows, wheres):
    print(text.format(person=person, how=how, where=where))

yields

Buster is feeling really hungry today, so he's not going camping.
Buster is feeling really hungry today, so he's not going biking.
Buster is feeling really sleepy today, so he's not going camping.
Buster is feeling really sleepy today, so he's not going biking.
Arthur is feeling really hungry today, so he's not going camping.
Arthur is feeling really hungry today, so he's not going biking.
Arthur is feeling really sleepy today, so he's not going camping.
Arthur is feeling really sleepy today, so he's not going biking.

To generate random sentences, you can use random.choice:

import random

for i in range(5):
    person = random.choice(persons)
    how = random.choice(hows)
    where = random.choice(wheres)
    print(text.format(person=person, how=how, where=where))

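If you must use square brackets, and your text does not otherwise contain curly braces, you can replace the brackets with braces and then proceed as above: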

text = "[person] is feeling really [how] today, so he's not going [where]."
text = text.replace('[','{').replace(']','}')
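Putting the two steps together might look like the sketch below. The `words` dictionary is an illustrative assumption, and the conversion assumes the text contains no literal `{` or `}`, since `format` would treat those as placeholders too:

```python
import random

# Illustrative word lists, keyed by placeholder name.
words = {
    'person': ['Buster', 'Arthur'],
    'how': ['hungry', 'sleepy'],
    'where': ['camping', 'biking'],
}

text = "[person] is feeling really [how] today, so he's not going [where]."
template = text.replace('[', '{').replace(']', '}')

# Pick one random word per placeholder name, then fill the template.
picked = {name: random.choice(options) for name, options in words.items()}
print(template.format(**picked))
```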

