Parse special JSON format with Python

I want to get the GPSLatitude and GPSLongitude values, but I can't index into the list by position because the position of each entry is essentially random. I want to look the values up by the value of "tag" instead. How can I do that?

jsonFlickrApi({ "photo": { "id": "8566959299", "secret": "141af38562", "server": "8233", "farm": 9, "camera": "Apple iPhone 4S", 
    "exif": [
      { "tagspace": "JFIF", "tagspaceid": 0, "tag": "JFIFVersion", "label": "JFIFVersion", 
        "raw": { "_content": 1.01 } },
      { "tagspace": "JFIF", "tagspaceid": 0, "tag": "ResolutionUnit", "label": "Resolution Unit", 
        "raw": { "_content": "inches" } },
      { "tagspace": "JFIF", "tagspaceid": 0, "tag": "XResolution", "label": "X-Resolution", 
        "raw": { "_content": 72 }, 
        "clean": { "_content": "72 dpi" } },
      { "tagspace": "JFIF", "tagspaceid": 0, "tag": "YResolution", "label": "Y-Resolution", 
        "raw": { "_content": 72 }, 
        "clean": { "_content": "72 dpi" } },
      { "tagspace": "GPS", "tagspaceid": 0, "tag": "GPSLatitudeRef", "label": "GPS Latitude Ref", 
        "raw": { "_content": "North" } },
      { "tagspace": "GPS", "tagspaceid": 0, "tag": "GPSLatitude", "label": "GPS Latitude", 
        "raw": { "_content": "39 deg 56' 44.40\"" }, 
        "clean": { "_content": "39 deg 56' 44.40\" N" } },
      { "tagspace": "GPS", "tagspaceid": 0, "tag": "GPSLongitudeRef", "label": "GPS Longitude Ref", 
        "raw": { "_content": "East" } },
      { "tagspace": "GPS", "tagspaceid": 0, "tag": "GPSLongitude", "label": "GPS Longitude", 
        "raw": { "_content": "116 deg 16' 10.20\"" }, 
        "clean": { "_content": "116 deg 16' 10.20\" E" } },
    ] }, "stat": "ok" })

You don't say whether you are using one of the Flickr API wrappers; I assume not, because handling JSON responses is trivial if you are using a library such as flickrapi.

import flickrapi

api_key = '88341066e8f0a40516599d28d8170627'   # from flickr's API explorer
secret = 'sssshhhh'
flickr = flickrapi.FlickrAPI(api_key, secret, format='parsed-json')
response = flickr.photos.getExif(photo_id='8566959299')
lat_long = {exif['tag']: exif['clean']['_content']
                    for exif in response['photo']['exif']
                        if exif['tag'] in (u'GPSLongitude', u'GPSLatitude')}

>>> from pprint import pprint
>>> pprint(lat_long)
{u'GPSLatitude': u'39 deg 56\' 44.40" N',
 u'GPSLongitude': u'116 deg 16\' 10.20" E'}
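
The comprehension above keeps only two tags and relies on each of them having a "clean" entry. As a minimal sketch (Python 3; the helper name exif_by_tag is hypothetical, not part of flickrapi), the whole "exif" list can be indexed by tag so that position no longer matters, falling back to "raw" when "clean" is absent:

```python
def exif_by_tag(exif_entries):
    """Map each EXIF tag name to its 'clean' value, or 'raw' if 'clean' is missing."""
    result = {}
    for entry in exif_entries:
        # Prefer the human-readable "clean" value; some entries only have "raw".
        holder = entry.get("clean") or entry.get("raw") or {}
        result[entry["tag"]] = holder.get("_content")
    return result


# Two entries copied from the question's response, in arbitrary order.
sample_exif = [
    {"tagspace": "GPS", "tag": "GPSLatitudeRef",
     "raw": {"_content": "North"}},
    {"tagspace": "GPS", "tag": "GPSLatitude",
     "raw": {"_content": "39 deg 56' 44.40\""},
     "clean": {"_content": "39 deg 56' 44.40\" N"}},
]

tags = exif_by_tag(sample_exif)
print(tags["GPSLatitude"])
print(tags["GPSLatitudeRef"])
```

With a real response, the call would be exif_by_tag(response['photo']['exif']).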

But continuing with the assumption that you are not using an API wrapper, the response format that you are seeing is actually JSONP, which is better suited to JavaScript than to Python. You can, however, request a response in plain JSON that does not have the enclosing jsonFlickrApi() function wrapper. Do this by specifying format=json&nojsoncallback=1 in the query parameters of the request. The requests library makes requesting and parsing the JSON response easy, but urllib2.urlopen() combined with json.loads() works just as well if you can't use requests. For example:

import requests

params = {'api_key': '88341066e8f0a40516599d28d8170627',
          'api_sig': '7b2dcfb2cd3a747179c2ed0fdc492699',
          'format': 'json',
          'method': 'flickr.photos.getExif',
          'nojsoncallback': '1',
          'photo_id': '8566959299',
          'secret': 'sssshhhh'}
response = requests.get('https://api.flickr.com/services/rest/', params=params)
data = response.json()
lat_long = {exif['tag']: exif['clean']['_content']
                for exif in data['photo']['exif']
                    if exif['tag'] in (u'GPSLongitude', u'GPSLatitude')}

>>> from pprint import pprint
>>> pprint(lat_long)
{u'GPSLatitude': u'39 deg 56\' 44.40" N',
 u'GPSLongitude': u'116 deg 16\' 10.20" E'}
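
For the standard-library route mentioned above, here is a rough sketch for Python 3, where urllib2.urlopen() has become urllib.request.urlopen(). The function names build_exif_url and fetch_exif are hypothetical, and the sketch assumes the call is accepted without an api_sig, which may not hold for every photo or key:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

REST_URL = "https://api.flickr.com/services/rest/"


def build_exif_url(api_key, photo_id):
    """Build a flickr.photos.getExif URL that asks for plain JSON (no JSONP wrapper)."""
    params = {
        "method": "flickr.photos.getExif",
        "api_key": api_key,
        "photo_id": photo_id,
        "format": "json",
        "nojsoncallback": "1",  # suppresses the jsonFlickrApi(...) wrapper
    }
    return REST_URL + "?" + urlencode(params)


def fetch_exif(api_key, photo_id):
    """Fetch and decode the response; performs a network request when called."""
    with urlopen(build_exif_url(api_key, photo_id)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling data = fetch_exif(your_api_key, '8566959299') should then give the same structure as response.json() in the requests version, so the same comprehension can be applied to data['photo']['exif'].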

If you view the whole string as jsonFlickrApi(XXX), then XXX is a standard JSON string. With the json library, XXX can be converted to a Python dictionary and then parsed easily.
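
A minimal sketch of that idea for Python 3 follows. The helper name strip_jsonp is hypothetical, and the sample is a shortened version of the question's response held in a raw string so that the backslash reaches the JSON parser intact:

```python
import json


def strip_jsonp(text, callback="jsonFlickrApi"):
    """Return the XXX part of callback(XXX); text without the wrapper is returned unchanged."""
    text = text.strip()
    prefix = callback + "("
    if text.startswith(prefix) and text.endswith(")"):
        return text[len(prefix):-1]
    return text


# Shortened sample; the r prefix keeps \" exactly as the server would send it.
sample = r"""jsonFlickrApi({ "photo": { "id": "8566959299",
    "exif": [
      { "tagspace": "GPS", "tag": "GPSLatitude",
        "clean": { "_content": "39 deg 56' 44.40\" N" } }
    ] }, "stat": "ok" })"""

data = json.loads(strip_jsonp(sample))
for entry in data["photo"]["exif"]:
    if entry["tag"] == "GPSLatitude":
        print(entry["clean"]["_content"])
```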

With the exception of the last comma just before the closing bracket ], the entire object returned by the Flickr API is valid JSON.

Assuming that the comma is merely a copy-paste error (example evidence suggests this is the case), the built-in json module still won't accept the text if you paste it into an ordinary Python string literal. A string like "116 deg 16' 10.20\" E" is valid JSON, but inside a regular (non-raw) Python literal the \" escape is consumed by Python itself, so json.loads receives a bare double quote and raises a ValueError:

>>> import json
>>> json.loads('{"a": "2"}')
{u'a': u'2'}
>>> json.loads('{"a": "2\""}')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 365, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/json/decoder.py", line 381, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Expecting , delimiter: line 1 column 10 (char 9)

The solution is to add another escaping backslash, so that one backslash survives into the string that json.loads sees:

>>> json.loads('{"a": "2\\""}')
{u'a': u'2"'}
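
As an aside, when JSON text is embedded directly in Python source, a raw string literal is another way to keep the backslash without doubling it. A short Python 3 sketch (text read from a file or an HTTP response never passes through literal escaping, so it needs neither trick):

```python
import json

# The r prefix stops Python from processing \" itself,
# so json.loads receives the escape sequence intact.
parsed = json.loads(r'{"a": "2\""}')
print(parsed)  # {'a': '2"'}
```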

For your full jsonFlickrApi response, you could add those extra backslashes with the re module (the slice response[14:-1] strips the leading jsonFlickrApi( and the closing parenthesis):

>>> import re
>>> response = """jsonFlickrApi({ "photo": { "id": "8566959299", "secret": "141af38562", "server": "8233", "farm": 9, "camera": "Apple iPhone 4S", 
...     "exif": [
...       { "tagspace": "JFIF", "tagspaceid": 0, "tag": "JFIFVersion", "label": "JFIFVersion", 
...         "raw": { "_content": 1.01 } },
...       { "tagspace": "JFIF", "tagspaceid": 0, "tag": "ResolutionUnit", "label": "Resolution Unit", 
...         "raw": { "_content": "inches" } },
...       { "tagspace": "JFIF", "tagspaceid": 0, "tag": "XResolution", "label": "X-Resolution", 
...         "raw": { "_content": 72 }, 
...         "clean": { "_content": "72 dpi" } },
...       { "tagspace": "JFIF", "tagspaceid": 0, "tag": "YResolution", "label": "Y-Resolution", 
...         "raw": { "_content": 72 }, 
...         "clean": { "_content": "72 dpi" } },
...       { "tagspace": "GPS", "tagspaceid": 0, "tag": "GPSLatitudeRef", "label": "GPS Latitude Ref", 
...         "raw": { "_content": "North" } },
...       { "tagspace": "GPS", "tagspaceid": 0, "tag": "GPSLatitude", "label": "GPS Latitude", 
...         "raw": { "_content": "39 deg 56' 44.40\"" }, 
...         "clean": { "_content": "39 deg 56' 44.40\" N" } },
...       { "tagspace": "GPS", "tagspaceid": 0, "tag": "GPSLongitudeRef", "label": "GPS Longitude Ref", 
...         "raw": { "_content": "East" } },
...       { "tagspace": "GPS", "tagspaceid": 0, "tag": "GPSLongitude", "label": "GPS Longitude", 
...         "raw": { "_content": "116 deg 16' 10.20\"" }, 
...         "clean": { "_content": "116 deg 16' 10.20\" E" } }
...     ] }, "stat": "ok" })"""
>>> quoted_resp = re.sub('deg ([^"]+)"', r'deg \1\\"', response[14:-1])

That quoted response can then be passed to json.loads, and you can easily access the required data in the resulting dictionary structure:

>>> photodict = json.loads(quoted_resp)
>>> for meta in photodict['photo']['exif']:
...     if meta["tagspace"] == "GPS" and meta["tag"] == "GPSLongitude":
...         print(meta["clean"]["_content"])
... 
116 deg 16' 10.20" E
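
If the trailing comma before ] noted earlier really is present in the text you receive, json.loads will reject it. One possible workaround, sketched here for Python 3 with a hypothetical helper name, is to strip such commas with a regular expression first. This is a heuristic: it would also alter a string value that happens to contain a comma followed by ] or }, so treat it as a last resort.

```python
import json
import re


def drop_trailing_commas(text):
    """Remove a comma that directly precedes a closing ] or } (whitespace allowed)."""
    return re.sub(r",\s*([\]}])", r"\1", text)


broken = '{"exif": [{"tag": "GPSLatitude"}, {"tag": "GPSLongitude"}, ], "stat": "ok"}'
fixed = json.loads(drop_trailing_commas(broken))
print([entry["tag"] for entry in fixed["exif"]])
```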
