
Pathlib read_text as a string literal

I am trying to generate some JSON data from txt files.

The txt files are generated from books via OCR, which makes them both precious (I can't arbitrarily change characters I don't like, since they could be important) and unreliable (the OCR could have gone wrong, or the author could have inserted symbols that would break my code).

As of now, I have this:

output_folder = Path(output_folder)

value = json.loads('{"nome": "' + file_name[:len(file_name)-4] + '", "testu": "' + (Path(filename).read_text()) + '"}')
path = output_folder / (file_name[:len(file_name)-4] + "_opare.json")
with path.open(mode="w+") as working_file:
    working_file.write("[" + str(value) + "]")
    working_file.close()

This throws the error json.decoder.JSONDecodeError: Invalid control character, which, as I understand it, is caused by my book starting (yes) with a ' (a quote).

I've read about string literals, which seem relevant to my case, but I didn't understand how I could use them.

What can I do?

Thanks

Why would you build a JSON string just to parse it again? You can just create a dictionary:

value = {
    "nome": file_name[:len(file_name)-4],
    "testu": Path(filename).read_text(),
}
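As a rough illustration of why the dictionary route is safer, here is a minimal sketch. The sample `text` and `file_name` are made up and stand in for the real book contents. `json.dumps` escapes the double quotes and newlines in the text, which a JSON string may not contain in raw form, so the data survives a round trip unchanged.

```python
import json

# Hypothetical sample standing in for Path(filename).read_text():
# it starts with a quote and contains double quotes and newlines.
text = "'Twas a dark night.\nHe said \"hello\".\n"
file_name = "book.txt"  # hypothetical file name

value = {
    "nome": file_name[:-4],  # equivalent to file_name[:len(file_name)-4]
    "testu": text,
}

# json.dumps takes care of escaping quotes and control characters.
encoded = json.dumps([value])

# Parsing the encoded string gives back exactly the original data.
decoded = json.loads(encoded)
print(decoded[0]["testu"] == text)
```

No manual escaping or string-literal tricks are needed. The text is passed along as an ordinary Python string.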

Reading between the lines, the JSONDecodeError doesn't actually come from this code, does it? It comes from the code that reads your file later.

You can't write a dict to a JSON file using str(value). Python's dict-to-string conversion uses single quotes, which are not legal in JSON. You need to convert it back to JSON:

with path.open(mode="w+") as working_file:
    json.dump([value], working_file)
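To make the difference concrete, the sketch below compares the two ways of writing the data. The sample `value` and the temporary directory are assumptions for illustration only. The file name just mirrors the `_opare.json` pattern from the question.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical data standing in for the real book contents.
value = {"nome": "book", "testu": "'Quoted' line one\nline two"}

# str() produces a Python repr with single-quoted keys, which is not valid JSON.
as_repr = "[" + str(value) + "]"
try:
    json.loads(as_repr)
    repr_is_valid_json = True
except json.JSONDecodeError:
    repr_is_valid_json = False

# json.dump writes real JSON that can be read back later.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "book_opare.json"
    with path.open(mode="w", encoding="utf-8") as working_file:
        json.dump([value], working_file)
    with path.open(encoding="utf-8") as working_file:
        loaded = json.load(working_file)

print(repr_is_valid_json)
print(loaded == [value])
```

Passing `encoding="utf-8"` explicitly is a choice made for this sketch and is not part of the original answer. It may help with OCR text that contains non-ASCII characters on platforms whose default encoding differs.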
