
Replacing text in a .nfo file

I have a colors.nfo file in which I want to replace variables and generate a new .nfo file without losing the template.

The file contains ASCII-art characters that I don't know how to handle. Every time I load the file with open, replace the variables, and write the result to a new file, strange characters appear and the template is destroyed.

Here is an image of the file: https://i.imgur.com/8lqqXpg.png

Here is the uploaded file to work with: Click to download. I hope that's okay; otherwise I will delete it.

I hope the problem is clear. I want to replace "%REPLACE1%", "%REPLACE2%" and "%REPLACE3%" with, for example, "BLACKGREY", "REDWHITE", ...

  • I tried loading the file into a string with open
  • then I replaced the placeholder with string.replace("%REPLACE1%", "BLACKGREY")
  • then I wrote a new file with f.write
  • the new file is destroyed: the ASCII characters are unreadable and the template no longer looks like it did before

Code Example:

replaceString = []
f = open("colors.nfo")
for line in f:
    replaceString.append(line.rstrip())
f.close()
replaceColors = "\n".join(replaceString)
print(replaceColors.replace("%REPLACE1%", "BLACKGREY"))

Output:

ÛÛ³    [x] Yellow      [ ] Yellow       [ ] Yellow      ³ÛÛ
ÛÛ³    [x] Pink        [ ] Pink         [ ] %REPLACE3%  ³ÛÛ
ÛÛ³    [ ] Green       [ ] green        [ ] Green       ³ÛÛ
ÛÛ³    [ ] Red         [ ] red          [ ] Red         ³ÛÛ
ÛÛ³    [ ] Blue        [ ] blue         [ ] Blue        ³ÛÛ
ÛÛ³    [ ] Black       [ ] %REPLACE2%   [ ] black       ³ÛÛ
ÛÛ³    [ ] White       [ ] white        [ ] white       ³ÛÛ
ÛÛ³    [ ] grey        [ ] grey         [ ] grey        ³ÛÛ
ÛÛ³    [ ] brown       [ ] brown        [ ] brown       ³ÛÛ
ÛÛ³    [ ] BLACKGREY  [ ] orange       [ ] orange      ³ÛÛ
ÛÛ³    [ ] purple      [ ] purple       [ ] purple      ³ÛÛ

How it should be:

██│    [x] Yellow      [ ] Yellow       [ ] Yellow      │██
██│    [x] Pink        [ ] Pink         [ ] %REPLACE3%  │██
██│    [ ] Green       [ ] green        [ ] Green       │██
██│    [ ] Red         [ ] red          [ ] Red         │██
██│    [ ] Blue        [ ] blue         [ ] Blue        │██
██│    [ ] Black       [ ] %REPLACE2%   [ ] black       │██
██│    [ ] White       [ ] white        [ ] white       │██
██│    [ ] grey        [ ] grey         [ ] grey        │██
██│    [ ] brown       [ ] brown        [ ] brown       │██
██│    [ ] BLACKGREY   [ ] orange       [ ] orange      │██
██│    [ ] purple      [ ] purple       [ ] purple      │██

I don't want these "ÛÛ" in my newly created file. I want the "black boxes" shown in the screenshot. Replacing is not the problem; the problem is the structure after loading the file into a string. When I write that string to a new file, the template does not look like the one shown in the screenshot.

There are two problems here:

  1. reading and writing the data correctly
  2. preserving the structure

Reading and Writing

According to Wikipedia, .nfo files of this type are encoded with the cp437 text encoding. Therefore this encoding must be specified when reading and writing the file.

with open('colors.nfo', 'r', encoding='cp437') as f:
    ...

with open('colors.nfo', 'w', encoding='cp437') as f:
    ...

Alternatively, the file may be opened in binary mode, and all operations carried out using bytes rather than text.

with open('colors.nfo', 'rb') as f:
    ...

If no encoding is specified, Python uses the system default, probably cp1252 in this case. An 8-bit encoding like cp1252 is able to decode the file without raising an error, but where cp437 decodes b'\xdb' as 'FULL BLOCK' (█), cp1252 decodes it as 'LATIN CAPITAL LETTER U WITH CIRCUMFLEX' (Û), corrupting the data as seen in the question.
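
The difference between the two decodings can be reproduced in a few lines. This is a minimal sketch; the three bytes are inferred from the "ÛÛ³" at the start of each line in the question's output, not read from the actual file.

```python
# Bytes inferred from the question's output ("ÛÛ³"); not read from the real file.
raw = b'\xdb\xdb\xb3'

as_cp437 = raw.decode('cp437')    # the intended box-drawing characters
as_cp1252 = raw.decode('cp1252')  # what a cp1252 default produces instead

print(as_cp437)   # ██│
print(as_cp1252)  # ÛÛ³

# Encoding with the same codec used for decoding restores the original bytes.
assert as_cp437.encode('cp437') == raw
```

The same bytes are valid in both encodings, which is why no exception is raised and the damage only shows up when the text is displayed or written back out.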

Preserving Structure

The data has a fixed-width format, so care must be taken when replacing text that the target and replacement strings are both of equal length, otherwise the columns will not be aligned correctly.

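target = '%REPLACE1%'
replacement = 'BLACKGREY'

# Positive when the placeholder is longer than the replacement.
delta = len(target) - len(replacement)

padding = ' ' * abs(delta)
if delta > 0:
    # Shorter replacement: pad it to fill the placeholder's slot.
    replacement += padding
else:
    # Longer replacement: extend the target so it also consumes
    # the spaces that follow the placeholder in the template.
    target += padding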
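
To see the effect on a single line, the padding logic can be wrapped in a small helper. This is only a sketch: the name pad_pair, the sample lines, and the 'LIGHTYELLOW' value are made up for illustration and do not come from the original file.

```python
def pad_pair(target, replacement):
    """Return (target, replacement) padded with spaces to equal length."""
    delta = len(target) - len(replacement)
    padding = ' ' * abs(delta)
    if delta > 0:
        replacement += padding   # shorter replacement fills the slot
    else:
        target += padding        # longer replacement eats trailing spaces
    return target, replacement

# Hypothetical template lines, not taken from the real file.
short_case = '[ ] %REPLACE1%  [ ] orange'
long_case = '[ ] %REPLACE2%   [ ] black'

target, replacement = pad_pair('%REPLACE1%', 'BLACKGREY')
new_short = short_case.replace(target, replacement)

target, replacement = pad_pair('%REPLACE2%', 'LIGHTYELLOW')
new_long = long_case.replace(target, replacement)

# Both lines keep their original width, so the columns stay aligned.
assert len(new_short) == len(short_case)
assert len(new_long) == len(long_case)
print(repr(new_short))
print(repr(new_long))
```

Note that when the replacement is longer than the placeholder, the padded target only matches if the template has at least that many spaces after the placeholder; otherwise str.replace finds nothing and the line is left unchanged.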

Solutions

Here's a complete script.

replacements = {
    '%REPLACE1%': 'BLACKGREY',
    '%REPLACE2%': 'REDWHITE',
}

with open('colors.nfo', encoding='cp437') as f:
    data = f.read()

for target, replacement in replacements.items():
    delta = len(target) - len(replacement)
    padding = ' ' * abs(delta)
    if delta > 0:
        replacement += padding
    else:
        target += padding

    data = data.replace(target, replacement)

with open('new-colors.nfo', 'w', encoding='cp437') as f:
    f.write(data)

Processing as binary data is the same, except that the files are opened in binary mode, and the strings are declared as bytes rather than str.

replacements = {
    b'%REPLACE1%': b'BLACKGREY',
    b'%REPLACE2%': b'REDWHITE',
}

with open('colors.nfo', 'rb') as f:
    data = f.read()

for target, replacement in replacements.items():
    delta = len(target) - len(replacement)
    padding = b' ' * abs(delta)
    if delta > 0:
        replacement += padding
    else:
        target += padding

    data = data.replace(target, replacement)

with open('new-colors.nfo', 'wb') as f:
    f.write(data)
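
One way to confirm that the structure survived is to compare line widths before and after the replacement. The sketch below is an optional sanity check, not part of the scripts above; the helper name changed_widths and the two-line sample are hypothetical.

```python
def changed_widths(original, updated):
    """Return (line_number, old_width, new_width) for lines whose width changed."""
    pairs = zip(original.splitlines(), updated.splitlines())
    return [
        (number, len(old), len(new))
        for number, (old, new) in enumerate(pairs, start=1)
        if len(old) != len(new)
    ]

# Hypothetical two-line sample standing in for the decoded file contents.
original = '██│ [ ] %REPLACE1%  │██\n██│ [ ] orange      │██\n'

padded = original.replace('%REPLACE1%', 'BLACKGREY ')   # same length as target
unpadded = original.replace('%REPLACE1%', 'BLACKGREY')  # one character short

print(changed_widths(original, padded))    # []
print(changed_widths(original, unpadded))  # [(1, 23, 22)]
```

An empty list means every line kept its width; any entry points at a line where the right-hand border will have shifted.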
