I am reading multiple JSON files and storing the contents of each json_file in json_text like this:
json_text = json_file.read()
When I print json_text and its type, I get the following:
{
"speech": {
"text": "<p>Lords</p><p>We are all in the same boat</p><p>It is time for us to help</p>",
"id": null,
"doc_id": null,
"fave": "N",
"system": "2015-09-24 13:00:17"
}
}
<type 'str'>
I assumed I would get this as a dict by using json.loads(), but that doesn't work:
ValueError: No JSON object could be decoded
Apparently loads() doesn't recognize json_text as JSON, even though it is valid JSON according to http://jsonlint.com. So I thought I'd use dumps() and then loads():
json_dumps = json.dumps(json_text)
json_loads = json.loads(json_dumps)
print json_loads, type(json_loads)
Gives:
{
"speech": {
"text": "<p>Lords</p><p>We are all in the same boat</p><p>It is time for us to help</p>",
"id": null,
"doc_id": null,
"fave": "N",
"system": "2015-09-24 13:00:17"
}
}
<type 'unicode'>
I've also tried using ast.literal_eval() on json_text, but then I get:
ValueError: malformed string
So, the scenario is that I have multiple JSON files in a folder. I want to load these files, take specific keys, and store them in a pandas DataFrame. I've tried pd.read_json(), but it just tells me that there is something wrong with my JSON.
This is my code:
import json
import os

path_to_json = 'folder/'
json_files = [pos_json for pos_json in os.listdir(path_to_json) if pos_json.endswith('.json')]
for index, js in enumerate(json_files):
    with open(os.path.join(path_to_json, js)) as json_file:
        json.load(json_file)
This gives ValueError: No JSON object could be decoded, and therefore I've tried using json_file.read() etc.
As I mentioned in the comments, json.loads will also raise a ValueError if the encoding is not ASCII-based. For example, the following json.loads call fails:
>>> json.loads(u'{"id": null}'.encode("utf16"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.7/json/__init__.py", line 339, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.7/json/decoder.py", line 364, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.7/json/decoder.py", line 382, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
One way to inspect the encoding is print(repr(json_text)), which can reveal additional bytes (as with UTF-16):
>>> print(repr(u'{"id": null}'.encode("utf16")))
'\xff\xfe{\x00"\x00i\x00d\x00"\x00:\x00 \x00n\x00u\x00l\x00l\x00}\x00'
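If eyeballing the repr is not convenient, the same check can be sketched in code. This is only a rough sketch: sniff_bom is a hypothetical helper name (not part of json or codecs), and it only recognizes files that actually start with a byte-order mark, which is an assumption about your data.

```python
import codecs

# Hypothetical lookup table: leading byte-order mark -> codec name.
# UTF-32 is listed before UTF-16 because BOM_UTF32_LE begins with
# the same two bytes as BOM_UTF16_LE.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(raw):
    """Return a codec name if raw (bytes) starts with a known BOM, else None."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name
    return None
```

A None result does not prove the data is ASCII or UTF-8; it only means no byte-order mark was found.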
In Python 2, json.load and json.loads both support an encoding parameter. But that only applies to ASCII-based encodings, so for UTF-16 you get the same ValueError:
>>> json.loads(u'{"id": null}'.encode("utf16"), encoding="utf16")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.7/json/__init__.py", line 352, in loads
    return cls(encoding=encoding, **kw).decode(s)
  File "/usr/lib64/python2.7/json/decoder.py", line 364, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.7/json/decoder.py", line 382, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
If that is still the case (and you are sure that the issue is the text being encoded incorrectly), you can either decode the string manually:
json_text = json_text.decode("utf16")
Or you can load the file using codecs.open:
import codecs
import json

with codecs.open(json_file_name, "r", encoding="utf16") as f:
    print(json.load(f))
    # or
    # json_text = f.read()
(Note that I'm using UTF-16 here, but this might not be the case for you.)
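To make the two options concrete, here is a small self-contained sketch that first writes a UTF-16 file and then reads it back both ways. The file name speech.json and its contents are made up for illustration, and I'm using io.open (the same function as the built-in open in Python 3) in place of codecs.open, which is a substitution on my part.

```python
import io
import json
import os
import tempfile

# Write a small UTF-16 file so the example is self-contained.
tmp_dir = tempfile.mkdtemp()
json_file_name = os.path.join(tmp_dir, "speech.json")
with io.open(json_file_name, "w", encoding="utf-16") as f:
    f.write(u'{"id": null, "fave": "N"}')

# Option 1: read the raw bytes, then decode manually before parsing.
with io.open(json_file_name, "rb") as f:
    raw = f.read()
data_manual = json.loads(raw.decode("utf-16"))

# Option 2: let the file object do the decoding while reading.
with io.open(json_file_name, "r", encoding="utf-16") as f:
    data_stream = json.load(f)
```

Both variants should end up with the same dict; which one is nicer mostly depends on whether you need the raw bytes for anything else.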
And judging from your JSON text, the characters themselves are all ASCII, so any ASCII-based encoding (e.g. latin-1) would still work without any decoding, because there is no difference between that JSON content encoded in ASCII, UTF-8, or latin-1.
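That claim is easy to check. A minimal sketch, using a shortened stand-in for your JSON text (the sample string is my own, not your file):

```python
text = u'{"id": null, "fave": "N"}'

# For pure-ASCII text these three encodings produce identical bytes.
ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("latin-1")
same_bytes = ascii_bytes == utf8_bytes == latin1_bytes

# UTF-16 is different: a 2-byte BOM plus two bytes per character.
utf16_bytes = text.encode("utf-16")
```

Here same_bytes is True, while utf16_bytes differs from the others, which is why only the UTF-16 case trips up json.loads.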
As a side note, you dumped the text and loaded it, and got a unicode object back. In theory (if my answer is correct) you should be able to actually load json_loads (i.e. json.loads(json_loads)).
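Here is a small sketch of why the dumps/loads round trip gives you a string rather than a dict. It assumes the text itself is clean JSON (the sample below is a shortened version of yours), which may not hold if the encoding problem is still present.

```python
import json

json_text = '{"speech": {"id": null, "fave": "N"}}'

# dumps() on a string produces a JSON *string literal*: the original text
# wrapped in quotes, with the inner quotes escaped.
json_dumps = json.dumps(json_text)

# The first loads() only undoes that wrapping and returns the text again.
json_loads = json.loads(json_dumps)

# A second loads() parses the text itself into a dict.
data = json.loads(json_loads)
```

So the round trip is a no-op on the text; it never parses the JSON inside it.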
Not sure what your error is. I'm able to run this code as expected:
import json
strdata = """
{
"speech": {
"text": "<p>Lords</p><p>We are all in the same boat</p><p>It is time for us to help</p>",
"id": null,
"doc_id": null,
"fave": "N",
"system": "2015-09-24 13:00:17"
}
}
"""
data = json.loads(strdata)
print(data)
It seems to be a question of encoding, which may already go wrong when reading from the file. You should pass the appropriate encoding as a parameter to json.loads.
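For the folder scenario from the question, a rough sketch could look like the following. It sidesteps the encoding parameter by decoding the bytes before parsing and simply tries a couple of candidate encodings in turn. load_json_file and collect_rows are hypothetical helper names, and the candidate encodings and the selected keys are assumptions you would adjust to your files.

```python
import io
import json
import os


def load_json_file(path, encodings=("utf-8-sig", "utf-16")):
    """Parse one file, trying each candidate encoding until one works."""
    with io.open(path, "rb") as f:
        raw = f.read()
    for encoding in encodings:
        try:
            # UnicodeDecodeError and JSON parse errors are both ValueErrors.
            return json.loads(raw.decode(encoding))
        except ValueError:
            continue
    raise ValueError("could not decode %s with any of %r" % (path, encodings))


def collect_rows(path_to_json, keys=("text", "fave", "system")):
    """Return one dict per .json file holding the selected "speech" keys."""
    rows = []
    for name in sorted(os.listdir(path_to_json)):
        if not name.endswith(".json"):
            continue
        doc = load_json_file(os.path.join(path_to_json, name))
        speech = doc.get("speech", {})
        # Missing keys become None instead of raising KeyError.
        rows.append({key: speech.get(key) for key in keys})
    return rows
```

The resulting list of dicts can then be handed to pandas, e.g. pd.DataFrame(collect_rows('folder/')), to get one row per file.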