If between two different characters in a text file, Python
I am basically trying to use Python for a find and replace, but make it only apply to strings between "{s:" and the following "},". I have a long text file containing many entries like the following:
["c", "DashedSentence", {s: "Yo limpio mi cuarto todos los sábados."},
"Question", {q: "¿Cuándo limpio mi cuarto?",
as: ["Todos los sábados.",
"Todos los domingos."]}],
["c", "DashedSentence", {s: "Nosotros contestamos el correo cada semana."},
"Question", {q: "¿Con qué frecuencia contestamos el correo?",
as: ["Cada semana.",
"Cada dos semanas."]}],
In the end, I want phrases grouped together by underscores within the "s:" sections, by replacing " mi " with " mi_" to yield "mi_cuarto", and similarly with "los", "el", and many more that aren't in the given examples.
All I have so far is:
s = open("stimuli.txt").read()
word = [' mi ','los ']
phrase = [' mi_',' los_']
for i in range(len(word)):
    if BETWEEN "{s:" and "},":
        s = s.replace(word[i],phrase[i])
f = open("stimuli_phrases.txt", 'w')
f.write(s)
Of course, BETWEEN isn't real, that's what I'm looking for. I might not be approaching the problem the right way, so I'm also open to any alternative ideas! I appreciate the help, thanks!
edit: The desired output groups noun phrases and prepositional phrases within the {s:} sections, like so:
["c", "DashedSentence", {s: "Yo limpio mi_cuarto todos_los_sábados."},
"Question", {q: "¿Cuándo limpio mi cuarto?",
as: ["Todos los sábados.",
"Todos los domingos."]}],
["c", "DashedSentence", {s: "Nosotros contestamos el_correo cada_semana."},
"Question", {q: "¿Con qué frecuencia contestamos el correo?",
as: ["Cada semana.",
"Cada dos semanas."]}],
The file you gave is close to JSON. If it is valid JSON (note that the json module requires quoted keys, so unquoted keys like s: in the sample would have to be quoted first), it can easily be parsed with the built-in Python json library:
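import json

with open("/path/to/your/file", "r") as f:
    data = json.load(f)  # raises json.JSONDecodeError if the file is not valid JSON

for item in data:
    # each item is a list that mixes plain strings and dicts
    for element in item:
        try:
            s = element['s']  # the sentence to process
        except (TypeError, KeyError):
            pass  # a plain string, or a dict without an "s" key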
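As a rough sketch of how the JSON route could be completed, the snippet below rewrites only the "s" values and leaves "q" and "as" alone. It assumes the file has already been turned into valid JSON (keys quoted, entries wrapped in one outer list); the names REPLACEMENTS and group_phrases_in_items are hypothetical, and the replacement pairs are only examples to be extended.

```python
import json

# Hypothetical replacement pairs; extend with the other words you need.
REPLACEMENTS = [
    (" mi ", " mi_"),
    (" todos los ", " todos_los_"),
    (" el ", " el_"),
    (" cada ", " cada_"),
]

def group_phrases_in_items(data):
    """Apply the replacements to every "s" value, in place."""
    for item in data:
        for element in item:
            # Only dicts carrying an "s" key are touched.
            if isinstance(element, dict) and "s" in element:
                for old, new in REPLACEMENTS:
                    element["s"] = element["s"].replace(old, new)
    return data

# Assumption: a valid-JSON version of one entry from the question.
valid_json = '''[["c", "DashedSentence", {"s": "Yo limpio mi cuarto todos los sábados."},
 "Question", {"q": "¿Cuándo limpio mi cuarto?",
              "as": ["Todos los sábados.", "Todos los domingos."]}]]'''

data = group_phrases_in_items(json.loads(valid_json))
# ensure_ascii=False keeps accented characters readable in the output.
print(json.dumps(data, ensure_ascii=False, indent=2))
```

Note that json.dumps writes quoted keys, so this only fits if the program consuming the file accepts that form.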
Of course, if you do not want to, or cannot, parse this file as JSON, you could use the re library:
import re
# Capture the sentence inside each {s: "..."} section (non-greedy, so one match per section)
to_process = re.findall(r'\{s:\s*"(.+?)"\}', yourtext)
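Extracting the sentences is only half of the job; to write the changes back, one option is re.sub with a function as the replacement, so that the word substitutions run only on the text matched inside each {s: "..."} section. The following is a minimal sketch under that assumption; REPLACEMENTS, S_SECTION, join_phrases and group_phrases are hypothetical names, and the replacement pairs are examples to be extended.

```python
import re

# Hypothetical replacement pairs; extend with the other words you need.
REPLACEMENTS = [
    (" mi ", " mi_"),
    (" todos los ", " todos_los_"),
    (" el ", " el_"),
    (" cada ", " cada_"),
]

# Three groups: the opening {s: ", the sentence itself, and the closing "}.
S_SECTION = re.compile(r'(\{s:\s*")(.*?)("\})')

def join_phrases(match):
    """Called once per {s: "..."} section; rewrites only the sentence."""
    sentence = match.group(2)
    for old, new in REPLACEMENTS:
        sentence = sentence.replace(old, new)
    return match.group(1) + sentence + match.group(3)

def group_phrases(text):
    # Text outside the matched sections (q: and as:) is left untouched.
    return S_SECTION.sub(join_phrases, text)

sample = '''["c", "DashedSentence", {s: "Yo limpio mi cuarto todos los sábados."},
"Question", {q: "¿Cuándo limpio mi cuarto?",
as: ["Todos los sábados.",
"Todos los domingos."]}],'''

print(group_phrases(sample))
```

To process the real file, read stimuli.txt into a string, pass it to group_phrases, and write the result to stimuli_phrases.txt. The pattern assumes each sentence sits on one line and contains no escaped double quotes.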
To learn or practice regex, have a look at https://regexr.com/