Replacing [[Words]] with other [[Words]] from a reference file in Notepad++ using Javascript
I have a translation file that looks like this:
Apple=Apfel
Apple pie=Apfelkuchen
Banana=Banane
Bananaisland=Bananen Insel
Cherry=Kirsche
Train=Zug
... 500+ more lines like that
Now I have a file with text that I need to work on. Only certain parts of the text need to be replaced, for example:
The [[Apple]] was next to the [[Banana]]. Meanwhile the [[Cherry]] was chilling by the [[Train]].
The [[Apple pie]] tastes great on the [[Bananaisland]].
The result needs to be:
The [[Apfel]] was next to the [[Banane]]. Meanwhile the [[Kirsche]] was chilling by the [[Zug]].
The [[Apfelkuchen]] tastes great on the [[Bananen Insel]].
There are way too many occurrences to copy/paste manually. What is an easy way to search for [[XXX]] and replace it from another file as mentioned?
I tried getting help for this for many hours but to no avail. The closest I have gotten was this script:
import re
separators = "=", "\n"
def custom_split(sepr_list, str_to_split):
    # create regular expression dynamically
    regular_exp = '|'.join(map(re.escape, sepr_list))
    return re.split(regular_exp, str_to_split)
with open('D:/_working/paired-search-replace.txt') as f:
    for l in f:
        s = custom_split(separators, l)
        editor.replace(s[0], s[1])
However, this replaces too much, or replaces inconsistently. E.g. [[Apple]] gets correctly replaced by [[Apfel]], but [[File:Apple.png]] gets wrongly replaced by [[File:Apfel.png]] and [[Apple pie]] gets replaced by [[Apfel pie]], so I tried tweaking the regular expression for hours on end to no avail. Does anyone have any info (in very simple terms, please) on how I can fix this and achieve my goal?
This is a little tricky because [ is a metacharacter in regex.
I'm sure there is a more efficient way to do it, but this works:
replaces="""Apple=Apfel
Apple pie=Apfelkuchen
Banana=Banane
Bananaisland=Bananen Insel
Cherry=Kirsche
Train=Zug"""
text = """
The [[Apple]] was next to the [[Banana]]. Meanwhile the [[Cherry]] was chilling by the [[Train]].
The [[Apple pie]] tastes great on the [[Bananaisland]].
"""
if __name__ == '__main__':
    import re
    for replace in replaces.split('\n'):
        english, german = replace.split('=')
        text = re.sub(rf'\[\[{english}\]\]', f'[[{german}]]', text)
    print(text)
Output:
The [[Apfel]] was next to the [[Banane]]. Meanwhile the [[Kirsche]] was chilling by the [[Zug]].
The [[Apfelkuchen]] tastes great on the [[Bananen Insel]].
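The loop above interpolates each English key straight into the pattern, so a key containing regex-special characters (for example `.`, `(` or `+`) would be read as pattern syntax. A minimal sketch of one way around that, assuming the same English=German format: `re.escape` makes the key literal, and a function replacement keeps any backslashes in the German text literal too. The name `replace_bracketed` and the sample entry `C++=Cpp` are made up for illustration.

```python
import re

def replace_bracketed(text, pairs):
    """Replace [[english]] with [[german]] for each (english, german) pair."""
    for english, german in pairs:
        # re.escape makes every character of the key match literally
        pattern = r'\[\[' + re.escape(english) + r'\]\]'
        # a function replacement skips backslash/group-reference processing
        text = re.sub(pattern, lambda m, g=german: '[[' + g + ']]', text)
    return text

# "C++=Cpp" is a hypothetical entry, added only to show the escaping
replaces = """Apple=Apfel
Apple pie=Apfelkuchen
C++=Cpp"""
pairs = [line.split('=', 1) for line in replaces.split('\n')]

sample = "The [[Apple]] and the [[Apple pie]], see [[File:Apple.png]] and [[C++]]."
result = replace_bracketed(sample, pairs)
print(result)
```

Because the pattern includes the surrounding brackets, [[File:Apple.png]] is not touched and [[Apple pie]] is only matched by its own entry.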
First, read in the file with translations:
translations={}
with open('file/with/translations.txt', 'r', encoding='utf-8') as f:
    for line in f:
        items = line.strip().split('=', 1)
        translations[items[0]] = items[1]
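Note that the loop above raises an IndexError on any line without an = sign, such as a blank line at the end of the file. A hedged sketch of a more forgiving reader follows; the function name `load_translations` is hypothetical, and an in-memory `io.StringIO` stands in for the real file so the example runs on its own.

```python
import io

def load_translations(lines):
    """Build an {english: german} dict from 'English=German' lines."""
    translations = {}
    for line in lines:
        line = line.strip()
        if '=' not in line:
            continue  # skip blank or malformed lines
        english, german = line.split('=', 1)  # split on the first '=' only
        translations[english] = german
    return translations

# StringIO stands in for open('file/with/translations.txt', encoding='utf-8')
sample_file = io.StringIO(
    "Apple=Apfel\nApple pie=Apfelkuchen\n\nBananaisland=Bananen Insel\n"
)
translations = load_translations(sample_file)
print(translations)
```

Any iterable of lines works here, so an open file object can be passed in the same way.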
I assume the phrases/words are unique in the file. Then, you need to match all substrings between [[ and ]], capture the text in between (with a regex like \[\[(.*?)]]), check if there is a key with the group 1 value in the translations dictionary, and replace with [[ + dictionary value + ]] if there is such a key, or return the whole match if there is no such translation:
text = """The [[Apple]] was next to the [[Banana]]. Meanwhile the [[Cherry]] was chilling by the [[Train]].
The [[Apple pie]] tastes great on the [[Bananaisland]]."""
import re
translated_text = re.sub(r"\[\[(.*?)]]", lambda x: f'[[{translations[x.group(1)]}]]' if x.group(1) in translations else x.group(), text)
Output:
>>> translated_text
'The [[Apfel]] was next to the [[Banane]]. Meanwhile the [[Kirsche]] was chilling by the [[Zug]].\nThe [[Apfelkuchen]] tastes great on the [[Bananen Insel]].'
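The same idea can be wrapped in a function, which also shows that bracketed text without a translation, such as the [[File:Apple.png]] case from the question, is left alone. This is only a sketch; `translate_links`, `LINK` and the small inline dictionary are illustrative names, not part of the answer above.

```python
import re

# Non-greedy capture of everything between [[ and the next ]]
LINK = re.compile(r'\[\[(.*?)\]\]')

def translate_links(text, translations):
    """Translate the content of each [[...]] if it is a known key."""
    def swap(match):
        key = match.group(1)
        if key in translations:
            return '[[' + translations[key] + ']]'
        return match.group()  # unknown key: keep the original text
    return LINK.sub(swap, text)

translations = {'Apple': 'Apfel', 'Apple pie': 'Apfelkuchen'}
sample = "[[Apple]] [[Apple pie]] [[File:Apple.png]] [[Pear]]"
result = translate_links(sample, translations)
print(result)
```

Since each [[...]] is looked up as a whole, the order of entries in the translation file does not matter, and the text is scanned once rather than once per entry. Inside Notepad++ with the PythonScript plugin, I believe a callback like `swap` can be passed to `editor.rereplace` to apply this to the open document, but that is untested here, and the plugin's Python version may not support f-strings.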