
How to convert msgpack to JSON format

I use Kafka to stream messages that are encoded with msgpack, and I decode them with msgpack on the consumer side. However, I can't find a way to align or format the decoded messages to make them more readable.

import msgpack
from kafka import KafkaConsumer

consumer = KafkaConsumer(
   'frontier-done',
   bootstrap_servers=['localhost:9092'],
   auto_offset_reset='smallest',
   value_deserializer=lambda x: msgpack.loads(x, encoding='utf-8'))

Output/Messages

[b'pc', [b'https://en.wikipedia.org/wiki/SMS', 200, {b'scrapy_callback': None, b'scrapy_errback': None, b'scrapy_meta': {b'link_text': b'Short Message Service', b'download_timeout': 180.0, b'download_slot': b'en.wikipedia.org', b'download_latency': 0.04313206672668457, b'depth': 0}, b'origin_is_frontier': True, b'domain': {b'netloc': b'en.wikipedia.org', b'name': b'en.wikipedia.org', b'scheme': b'https', b'sld': b'', b'tld': b'', b'subdomain': b'', b'fingerprint': b'0acd465bbb0ec47c393eee1b4ae069f228dde142'}, b'fingerprint': b'7b2bc785328543b718bf06be33c59bbaa89a2793', b'state': 0, b'score': 1.0, b'jid': 0, b'encoding': b'utf-8'}, {b'Date': [b'Tue, 02 Jul 2019 08:07:17 GMT'], b'Content-Type': [b'text/html; charset=UTF-8'], b'Server': [b'mw1319.eqiad.wmnet'], b'X-Content-Type-Options': [b'nosniff'], b'P3P': [b'CP="This is not a P3P policy! See https://en.wikipedia.org/wiki/Special:CentralAutoLogin/P3P for more info."'], b'X-Powered-By': [b'HHVM/3.18.6-dev'], b'Content-Language': [b'en'], b'Last-Modified': [b'Mon, 01 Jul 2019 16:28:27 GMT'], b'Backend-Timing': [b'D=180346 t=1561998535634972'], b'Vary': [b'Accept-Encoding,Cookie,Authorization,X-Seven'], b'X-Varnish': [b'238585272 211131050, 146864479 137022708, 329891973 239570064, 789362513 563374386'], b'Via': [b'1.1 varnish (Varnish/5.1), 1.1 varnish (Varnish/5.1), 1.1 varnish (Varnish/5.1), 1.1 varnish (Varnish/5.1)'], b'Age': [b'56300'], b'X-Cache': [b'cp1075 hit/7, cp2019 hit/3, cp5007 hit/3, cp5008 hit/8'], b'X-Cache-Status': [b'hit-front'], b'Server-Timing': [b'cache;desc="hit-front"'], b'Strict-Transport-Security': [b'max-age=106384710; includeSubDomains; preload'], b'X-Analytics': [b'ns=0;page_id=28207;WMF-Last-Access=02-Jul-2019;WMF-Last-Access-Global=02-Jul-2019;https=1'], b'X-Client-Ip': [b'61.6.17.213'], b'Cache-Control': [b'private, s-maxage=0, max-age=0, must-revalidate'], b'Accept-Ranges': [b'bytes']}, None]]

So I think the best way is to convert the message to JSON, since JSON can be pretty-printed.

The output contains byte values. You need to decode them first.
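To see why the decoding step is needed, here is a minimal sketch that uses only the standard library. The `sample` dict is a made-up fragment shaped like the message above, not real consumer output. `json.dumps` rejects `bytes` keys and values with a `TypeError`, and accepts the same data once it has been decoded to `str`.

```python
import json

# Hypothetical fragment shaped like the decoded message above.
sample = {b'state': 0, b'encoding': b'utf-8'}

try:
    json.dumps(sample)
    failed = False
except TypeError:
    # bytes keys (and bytes values) are not JSON serializable.
    failed = True

# Decode keys and values by hand for this flat example.
decoded = {
    key.decode(): value.decode() if isinstance(value, bytes) else value
    for key, value in sample.items()
}
pretty = json.dumps(decoded, indent=2)
print(pretty)
```

A real message is nested, so the decoding has to be applied recursively to every list and dict it contains.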

Below is a working example based on https://stackoverflow.com/a/57014807/5312776 .

import json

def decode_list(items):
    """Return a copy of the list with every bytes value decoded to str."""
    result = []
    for item in items:
        if isinstance(item, bytes):
            result.append(item.decode())
        elif isinstance(item, list):
            result.append(decode_list(item))
        elif isinstance(item, dict):
            result.append(decode_dict(item))
        else:
            # None, numbers, booleans and str pass through unchanged.
            result.append(item)
    return result

def decode_dict(d):
    """Return a copy of the dict with bytes keys and values decoded to str."""
    result = {}
    for key, value in d.items():
        if isinstance(key, bytes):
            key = key.decode()
        if isinstance(value, bytes):
            value = value.decode()
        elif isinstance(value, list):
            value = decode_list(value)
        elif isinstance(value, dict):
            value = decode_dict(value)
        result[key] = value
    return result

text = [b'pc', [b'https://en.wikipedia.org/wiki/SMS', 200, {b'scrapy_callback': None, b'scrapy_errback': None, b'scrapy_meta': {b'link_text': b'Short Message Service', b'download_timeout': 180.0, b'download_slot': b'en.wikipedia.org', b'download_latency': 0.04313206672668457, b'depth': 0}, b'origin_is_frontier': True, b'domain': {b'netloc': b'en.wikipedia.org', b'name': b'en.wikipedia.org', b'scheme': b'https', b'sld': b'', b'tld': b'', b'subdomain': b'', b'fingerprint': b'0acd465bbb0ec47c393eee1b4ae069f228dde142'}, b'fingerprint': b'7b2bc785328543b718bf06be33c59bbaa89a2793', b'state': 0, b'score': 1.0, b'jid': 0, b'encoding': b'utf-8'}, {b'Date': [b'Tue, 02 Jul 2019 08:07:17 GMT'], b'Content-Type': [b'text/html; charset=UTF-8'], b'Server': [b'mw1319.eqiad.wmnet'], b'X-Content-Type-Options': [b'nosniff'], b'P3P': [b'CP="This is not a P3P policy! See https://en.wikipedia.org/wiki/Special:CentralAutoLogin/P3P for more info."'], b'X-Powered-By': [b'HHVM/3.18.6-dev'], b'Content-Language': [b'en'], b'Last-Modified': [b'Mon, 01 Jul 2019 16:28:27 GMT'], b'Backend-Timing': [b'D=180346 t=1561998535634972'], b'Vary': [b'Accept-Encoding,Cookie,Authorization,X-Seven'], b'X-Varnish': [b'238585272 211131050, 146864479 137022708, 329891973 239570064, 789362513 563374386'], b'Via': [b'1.1 varnish (Varnish/5.1), 1.1 varnish (Varnish/5.1), 1.1 varnish (Varnish/5.1), 1.1 varnish (Varnish/5.1)'], b'Age': [b'56300'], b'X-Cache': [b'cp1075 hit/7, cp2019 hit/3, cp5007 hit/3, cp5008 hit/8'], b'X-Cache-Status': [b'hit-front'], b'Server-Timing': [b'cache;desc="hit-front"'], b'Strict-Transport-Security': [b'max-age=106384710; includeSubDomains; preload'], b'X-Analytics': [b'ns=0;page_id=28207;WMF-Last-Access=02-Jul-2019;WMF-Last-Access-Global=02-Jul-2019;https=1'], b'X-Client-Ip': [b'61.6.17.213'], b'Cache-Control': [b'private, s-maxage=0, max-age=0, must-revalidate'], b'Accept-Ranges': [b'bytes']}, None]]

# Serialize the decoded structure as indented (pretty-printed) JSON.
print(json.dumps(decode_list(text), indent=4))
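The two functions can also be folded into a single recursive helper. The sketch below is an alternative, not part of the linked answer; the names `to_jsonable` and `sample` are made up for illustration, and `sample` is a shortened version of the message above. It additionally handles tuples and passes `errors='replace'`, on the assumption that a value which is not valid UTF-8 should not abort the whole dump.

```python
import json

def to_jsonable(obj, encoding='utf-8'):
    """Recursively convert bytes to str inside dicts, lists, and tuples."""
    if isinstance(obj, bytes):
        # errors='replace' swaps undecodable bytes for U+FFFD instead of raising.
        return obj.decode(encoding, errors='replace')
    if isinstance(obj, dict):
        return {to_jsonable(k, encoding): to_jsonable(v, encoding)
                for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [to_jsonable(item, encoding) for item in obj]
    # None, numbers, booleans and str are already JSON-friendly.
    return obj

# Shortened, hypothetical version of the message shown above.
sample = [b'pc', [b'https://en.wikipedia.org/wiki/SMS', 200,
                  {b'scrapy_callback': None, b'score': 1.0,
                   b'domain': {b'scheme': b'https'}}]]

print(json.dumps(to_jsonable(sample), indent=2))
```

Assuming the consumer from the question, the helper could be applied inside `value_deserializer`, for example `lambda x: to_jsonable(msgpack.loads(x))`, so that every message arrives ready for `json.dumps`.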

