I am trying to generate some JSON data from txt files.
The txt files are generated from books via OCR, which makes them both precious (I can't arbitrarily change characters I don't like, since they could be important) and unreliable (the OCR could have gone wrong, or the author could have inserted symbols that would break my code).
As of now, I have this:
output_folder = Path(output_folder)
value = json.loads('{"nome": "' + file_name[:len(file_name)-4] + '", "testu": "' + (Path(filename).read_text()) + '"}')
path = output_folder / (file_name[:len(file_name)-4] + "_opare.json")
with path.open(mode="w+") as working_file:
    working_file.write("[" + str(value) + "]")
    working_file.close()
This throws the error json.decoder.JSONDecodeError: Invalid control character, which I understand is caused by my book starting (yes) with a ' (a quote).
I've read about string literals, which seem relevant to my case, but I didn't understand how I could use them.
What can I do?
Thanks
Why would you build a JSON string just to parse it again? You can just create a dictionary:
value = {
    "nome": file_name[:len(file_name)-4],
    "testu": Path(filename).read_text(),
}
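As a minimal sketch of why the dictionary route is safer: json.dumps escapes quotes, newlines, and other control characters on its own, so the OCR text never has to be edited by hand. The sample text and the "sample_book" name below are made up for illustration.

```python
import json

# Hypothetical OCR text: starts with a quote and contains a newline and a tab.
sample_text = "'Twas a \"dark\" night.\nChapter\t1"

value = {"nome": "sample_book", "testu": sample_text}

# json.dumps escapes the double quotes, the newline, and the tab.
encoded = json.dumps(value)

# Parsing the encoded string gives back exactly the original text.
decoded = json.loads(encoded)
print(decoded["testu"] == sample_text)
```

Nothing in the book text is changed; the escaping only exists in the serialized form.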
Reading between the lines, the JSONDecodeError doesn't actually come from this code, does it? It comes from the code that's reading your file later.
You can't write a dict to a JSON file using str(value). Python's dict-to-string conversion uses single quotes, which are not legal in JSON. You need to convert it back to JSON:
with path.open(mode="w+") as working_file:
    json.dump([value], working_file)
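To see the difference between str(value) and json.dump end to end, here is a hedged sketch that writes a file and reads it back. The temporary directory, the sample_book.txt file name, and its contents are assumptions made for illustration, and Path.stem is used in place of the file_name[:len(file_name)-4] slice.

```python
import json
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical input file standing in for one OCR'd book.
    source = Path(tmp) / "sample_book.txt"
    source.write_text("'Twas a \"dark\" night.\nThe end.", encoding="utf-8")

    value = {"nome": source.stem, "testu": source.read_text(encoding="utf-8")}

    # str() gives Python's repr of the list, which is not valid JSON here.
    try:
        json.loads(str([value]))
        repr_is_valid_json = True
    except json.JSONDecodeError:
        repr_is_valid_json = False

    # json.dump writes real JSON, so it can be read back unchanged.
    path = Path(tmp) / (source.stem + "_opare.json")
    with path.open(mode="w", encoding="utf-8") as working_file:
        json.dump([value], working_file)
    with path.open(encoding="utf-8") as working_file:
        loaded = json.load(working_file)

print(repr_is_valid_json, loaded == [value])
```

Path.stem drops the last suffix whatever its length, whereas the slice assumes a three-letter extension.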