What config file format to use for user-friendly strings of arbitrary bytes?
So I made a short Python script that launches files with ambiguous extensions on Windows by first examining their magic number/file signature.
I'd like to compile it to an .exe to make file association easier (either using bbfreeze or by rewriting it in C), but I need some kind of user-friendly config file to specify the matching byte strings and program paths. Basically, I want to put this information into a plain text file somehow:
magic_numbers = {
# TINA
'OBSS': r'%PROGRAMFILES(X86)%\DesignSoft\Tina 9 - TI\TINA.EXE',
# PSpice
'*version': r'%PROGRAMFILES(X86)%\Orcad\Capture\Capture.exe',
'x100\x88\xce\xcf\xcfOrCAD ': '', #PSpice?
# Protel
'DProtel': r'%PROGRAMFILES(X86)%\Altium Designer S09 Viewer\dxp.exe',
# Eagle
'\x10\x80': r'%PROGRAMFILES(X86)%\EAGLE-5.11.0\bin\eagle.exe',
'\x10\x00': r'%PROGRAMFILES(X86)%\EAGLE-5.11.0\bin\eagle.exe',
'<?xml version="1.0" encoding="utf-8"?>\n<!DOCTYPE eagle ': r'%PROGRAMFILES(X86)%\EAGLE-5.11.0\bin\eagle.exe',
# PADS Logic
'\x00\xFE': r'C:\MentorGraphics\9.3PADS\SDD_HOME\Programs\powerlogic.exe',
}
(The hex bytes are just arbitrary bytes, not Unicode characters.)
I guess a .py file in this format would work, but I'd have to leave it uncompiled and somehow still import it into the compiled executable, and it would still contain a bunch of extraneous syntax, like { and , for users to be confused by or screw up.
I looked at YAML, and it would be great except that it requires base64-encoding binary data first, which isn't really what I want. I'd prefer the config file to contain hex representations of the bytes, but also ASCII representations if that's all the file signature is. And maybe also regexes :D (in case the XML-based format can be written with different amounts of whitespace, for instance).
Any ideas?
You've already got your answer: YAML.
The data you posted above stores text representations of binary data; that will be fine for YAML, you just need to parse it properly. Usually you'd use something from the binascii module; in this case, likely the binascii.a2b_qp function.
import binascii

magic_id_str = 'x100\x88\xce\xcf\xcfOrCAD '
magic_id = binascii.a2b_qp(magic_id_str)
To elucidate, I will use a Unicode character as an easy way to paste binary data into the REPL (Python 2.7):
>>> a = 'Φ'
>>> a
'\xce\xa6'
>>> binascii.b2a_qp(a)
'=CE=A6'
>>> magic_text = yaml.load("""
... magic_string: '=CE=A6'
... """)
>>> magic_text
{'magic_string': '=CE=A6'}
>>> binascii.a2b_qp(magic_text['magic_string'])
'\xce\xa6'
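As a rough sketch of how the quoted-printable idea could be applied to the signature table from the question (Python 3, standard library only): the keys are written as text with =XX escapes for non-printable bytes and decoded once at load time. The names RAW_SIGNATURES, decode_signatures and find_program are hypothetical, and the entries are copied from the question purely for illustration. Note that a literal = in a signature would have to be written as =3D.

```python
import binascii

# Hypothetical config entries: quoted-printable signature -> program path.
# Printable ASCII is written as-is; other bytes use =XX hex escapes.
RAW_SIGNATURES = {
    "OBSS": r"%PROGRAMFILES(X86)%\DesignSoft\Tina 9 - TI\TINA.EXE",
    "=10=80": r"%PROGRAMFILES(X86)%\EAGLE-5.11.0\bin\eagle.exe",
    "=00=FE": r"C:\MentorGraphics\9.3PADS\SDD_HOME\Programs\powerlogic.exe",
}


def decode_signatures(raw):
    """Turn quoted-printable keys into real byte strings."""
    return {binascii.a2b_qp(key): program for key, program in raw.items()}


def find_program(header, signatures):
    """Return the program whose signature starts the header, or None."""
    for magic, program in signatures.items():
        if header.startswith(magic):
            return program
    return None


SIGNATURES = decode_signatures(RAW_SIGNATURES)
```

In use, the launcher would read the first few bytes of the file in binary mode and pass them to find_program.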
I would suggest doing this a little differently. I would decouple these two settings from each other: the mapping from magic numbers to MIME types, and the mapping from MIME types to programs.
For the first part, I would use python-magic, a library that has bindings to libmagic. You can have python-magic use a custom magic file like this:
import magic
m = magic.Magic(magic_file='/path/to/magic.file')
Your users can specify a custom magic file mapping magic numbers to MIME types. The syntax of magic files is documented. Here's an example showing the magic file for the TIFF format:
# Tag Image File Format, from Daniel Quinlan (quinlan@yggdrasil.com)
# The second word of TIFF files is the TIFF version number, 42, which has
# never changed. The TIFF specification recommends testing for it.
0 string MM\x00\x2a TIFF image data, big-endian
!:mime image/tiff
0 string II\x2a\x00 TIFF image data, little-endian
!:mime image/tiff
The second part is then pretty easy, since you only need to specify text data now. You could go with an INI or YAML format, as suggested by others, or you could even have just a simple tab-delimited file like this:
image/tiff C:\Program Files\imageviewer.exe
application/json C:\Program Files\notepad.exe
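A minimal sketch of reading such a tab-delimited file, assuming one "MIME type, tab, program path" pair per line and treating lines that start with # as comments (the comment convention and the names load_associations and SAMPLE are assumptions for illustration, not part of the format above):

```python
def load_associations(text):
    """Parse 'mimetype<TAB>program' lines into a dict."""
    associations = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and (assumed) comment lines.
        if not line or line.startswith("#"):
            continue
        parts = line.split("\t", 1)
        if len(parts) != 2:
            continue  # Ignore malformed lines rather than failing.
        mime_type, program = parts
        associations[mime_type.strip()] = program.strip()
    return associations


# Sample content mirroring the two lines shown above.
SAMPLE = (
    "image/tiff\tC:\\Program Files\\imageviewer.exe\n"
    "application/json\tC:\\Program Files\\notepad.exe\n"
)
ASSOCIATIONS = load_associations(SAMPLE)
```

The MIME type reported for a file would then be looked up with ASSOCIATIONS.get(mime_type) to find the program to launch.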
I've used several packages to build configuration files, including YAML. I recommend that you use ConfigParser or ConfigObj.
In the end, if you want to build a human-readable configuration file with comments, I strongly recommend ConfigObj as the best option.
Enjoy!
Example of ConfigObj
You can use ConfigObj to store them too. Try this code:
import configobj
def createConfig(path):
    config = configobj.ConfigObj()
    config.filename = path
    config["Sony"] = {}
    config["Sony"]["product"] = "Sony PS3"
    config["Sony"]["accessories"] = ['controller', 'eye', 'memory stick']
    config["Sony"]["retail price"] = "$400"
    config["Sony"]["binary one"] = bin(173)
    config.write()
You get this file:
[Sony]
product = Sony PS3
accessories = controller, eye, memory stick
retail price = $400
binary one = 0b10101101
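For the byte signatures in the question, the same INI style could hold hex strings that are decoded on load. The following is only a sketch using the standard-library configparser module (the Python 3 name of ConfigParser) rather than ConfigObj; the section layout and the key names signature_hex and program are made up for illustration. Interpolation is switched off because paths such as %PROGRAMFILES(X86)% contain % characters that configparser would otherwise try to expand.

```python
import configparser

# Hypothetical INI layout: one section per program, hex bytes for the signature.
SAMPLE_INI = r"""
[Eagle]
# Hex digits; bytes.fromhex allows spaces between bytes.
signature_hex = 10 80
program = %PROGRAMFILES(X86)%\EAGLE-5.11.0\bin\eagle.exe

[TINA]
signature_hex = 4f 42 53 53
program = %PROGRAMFILES(X86)%\DesignSoft\Tina 9 - TI\TINA.EXE
"""


def load_signatures(text):
    """Return a dict mapping signature bytes to program paths."""
    # interpolation=None keeps '%' in Windows paths literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    table = {}
    for section in parser.sections():
        magic = bytes.fromhex(parser[section]["signature_hex"])
        table[magic] = parser[section]["program"]
    return table


TABLE = load_signatures(SAMPLE_INI)
```

Here 4f 42 53 53 is simply the ASCII text OBSS written in hex; a variant could accept a separate plain-text key for signatures that are pure ASCII.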