
Python: Decompress zlib string

I am working on a Python script to read mzXML files. I can parse the data I want, but it is zlib-compressed. I tried the zlib library, but it gives me the following error:

zlib.error: Error -3 while decompressing data: incorrect header check

I understand the error is due to a failed header check, but I don't know how to fix that. Here is the data I am trying to decode:

compressedLen="3830" precision="64" byteOrder="network" contentType="m/z-int"eJwtWXdclNcShVhjRQUVC116c+lFWFl6c+lLiVQReIpYAFFQBCxYELazgCIqFtAYlWjsGpJYYoxRX2yJGsEeXzDYYnvxHPlnfjNz5szMvfvdhnDRxQda//4J/coqyz/KxSeWn/goR9rfFH6UpTPGQxoFHId9SYED8FMO9739UZab/ATdrX8JcOVWC4CbHHQnDbrzAaOP0nhRKvUZoyHdvmqDvVyajDjrJBfwlB95DLtVtwXqWda9HXlML7oiriL0FOyCxUmQFcnZwE9YshF5K7JUsJuefEZ/7VTw62/8hfrDodBtK1Y1f5SVV81gH7t8I+yVPWJIvd2XwVc1KBf8Y/QnAl+VYow6rao7gFsZMQG6Q8Is1Lmq30Lo4wc2wV+d0wzdpEAH+prKM+AzKvsbcl0y40w6JqKOmgXlyOO4/zvga5Z+x/7FcZA1e7YCZ15oSPwzAeoUDHMH33pzGew2na8QX5uegPwBNg+oV8bAP2pxKvC1x3+D39n9Cca39ko/+D27i6k/eQJ+Q58BjH/1FnYjKz3E1U04D//orgzUXWe1EtJqsSd46oQjIa0vL6ffj78z203nwFcX0AkelzFjqEdVQEZHD6aeLkFcsFhEveAI4r2Ef5O/SAMZen4T9ZKTwFmmNKOuunVOnK/5F1hv3ShI8bMU1qMKhR6/tJnxG6wQHzy1iPmau9CvaAx//3WtSuAsFeXEt4cAJzw0APNTt/tr4Hz8ihl/8A74gwML6D9agLjwJXmMP/kbcL6604g/rUZ8cpaM9f04H3qiaxHmq+6SCnbXD87kuzoL/E57bNjfjZvA295TsN5bc4ATv3iPPur+bAHOy0+X/D068HvE1xPfmwgZPTgPfqmWHucjV4J6pf1F4PH3uw8pHSKBnKk1iH6dM5DGZ2ORRzqqCPwhWdeIHzMafMk2FyCl+krgPF/o0284Dflj5m+jbsrfZ1TyefKZf4+4EEEh67Ohf+qQ69QdAiEjfrFlvCAfeB9bAeqQOv8Oe6hyFuv1/BL4jDQ/6kIhZFDvLIy31J+/N0FrIvWw4eATVOmQP2Iz5y+/l/nF26E7/q+E9cYVQo+N3Mp+Ex7C7mwfSH/KO/Qr8f4eUpr6GLhpDQLWkz0CvFl3OJ/S3FLETd8yinyzTsDuf13GegrCYQ/tKCa+iOPi0m7O+sp9YU8oUTC+0hR6RFM58SvfIK94yWD2W9MImfjiv4yX6gE3Yw7XUWm9LfTE1b2Mb+pEHWl/DaK/JQR8mWcr6d+6g/P3jyPr3W4Nu/huKfnbazleulnE75mB+Pi4HPr3+UK6vshh/IF9wIcn1NJ/zBvznDjFjPV3fgP/Fx0biT9dBd4MMddn6QWOf7IV1z/pjU7E+X2tYf5b3E8kj+cQ38W+om64Mt+DTOBi5xVR/3MT4mPuTuH89fB7zAjUZj0vvoE9ooXrrPT1VdgT/PXofz8I+Mhj3M9k2jLIkHlv0Zes32Pk8Y+PQr2yofngSWzn+icbwbgU0WrgZKN/QL3xig7Gj3vK+diZCbvMYCHyprazXpnZWcRLOrk+ySzjIL304rSg2+4Fb+IKSpmDHFIc10Z+wQrwCky5f8lcWyGTb31O3YvrsyB7APl9T4I3yCyKumg/+oqN+xQf6MH44KUcj5A+wCWkjaT+7w/3o3Rw+IfxUUM4f4Y9HJ+4bOD8l3LdliVuAl/epqGMT+M8ex/65M/8FnGC2l30z7wAOcZyE/ubxXOO02Br1pffBRmTeIr4edOBC5qdTn9hMnjDWpzIv9AXMua0NvlKub8lfdBiveW+qD9rugf5lldgfKLqa2GXrbZHvGgOf1eydWcgxSscyFfL/T/8VSjzy3g+8/9wjn61FPlmDHej3vgCuMViW+qb+B190TaT9W5N5vilF4BHtvPTemTxnv5d3Odiu2fCLvvqAHA
Zm3Xo38/9MUF8nPwHPoc9oyuN9R3i/pfdVkb+Y9yfUhMu0X/SHjK/jOcK2Q8DwTM35wb958Tgy594kOPxUwrsOdrcR2UXsyHzIn2Z//JXkBEdZ1nf1W7gc/ee4Pj+zvlcNor7iuxuDPSChAvE3/cB31xdnjtlj3geiwqOpf+pAfyFxycxvmcZ8mV67SL+Oc+txZmL6H99Fv6SvS8Z/47zukTCc7lc63vYPafvgV3el99riVskpHwg53duEs9R8qHGsC85w/OrXIe/h8zsFki5rjns854vR175mBLoC/LX0T/uOvpdWDCefIac96pa/r7lps5pGJ/r1cRbpGNeFhRzPZPbcH4Wz/dkvP0LyMVBKsYLrBGf9Y7zInfl73f5/XrY5Z7ZwBcufEG/z0zoM3WeIo98agfnW8LvSx7I9arQnd+7PDQd/ozuLPYXyfUuv2go64uxA19cp5D8kqPQp93+nfr0m+BJ2rmf+PS7kHbts+mfaYLxdby4mv68ctQhHLia9czrgX1snxvs//JExHnM/4713S+DbrsxjPP3aBn6cjtkz3p7uK9ayfidyf9uAt47XMH4f3hODN/B71OhxXOHwzoN4hV9imF3WV4AvELvFewB6/m9KsbzXOx+egH9xivBZ3iJ50OFKe9NTnfOAaewsoDdu/cB8ii8p6Gv2M4K8vncBs46ZjzxftHgc33YQj14HaSvC+97irBMxLukX2C+T+d9z2beJxTxHyCzjH6FXZHO353n+R7qWTfQz6SmVPLntnH83zSyngUbwDtZWkN/aRukv/Zd8lcvQh/OfXnuVKx5CSn86yXxNcc4v+rhrFd6GLwB3q30N54F3uJznr8VzS+Bs97Nc52ibSnyi3feYz1HlnC81f8j/kQ/4CZ+a8x+Tn0Fv5/Un+P7/SX4Q6b/Rb4fryHOxm0k43/hOS/qZ55/FTczIcO2riS+awtkxNUy5n/QirhAMxOO95OrwPs/HEP9L57Lorp471U8fwZ9knwj8731Aj4hrwu8Si1v4MIKtoBfOYDfQeTmceBTDrmP8fCztKR/xGDEhR/byfjRRuB3zX8JfuU4nheEU7gPK615LxMee0O//VTOXzb3a+XkXuD9mnguVPo8BW7a8BXURdmQIelFxAcyXlheiLqUoavg91yfTD3CD3UlbR/BfFHGsIeuEJEvjvu+6K4N9UyuRzH3etlPTjFk0G6uJ8rZQ4B3j/Enfn4p+AQ5n/ot+5b9Kbj/KyvNEZfQfRl9KVfMhgxy4flXuXYvcF79LlJf/woytus78Cg1PEdJxt+j3rYE+cSv3xO/ez38bkXryH9YAJz7CgPWd5Tfe8jmDI7HSd4vA1dJOZ8/8nuKnNGf+J957hbncj1XXue5KUCdRPwfvD8bJemxnvuxqD+ybDD1R8MwXo5mU1jPnzOR17KD7yrKns8g/d/y/qPsvQ2cccan8Xsdj7zenx1hvW/5jmN3So78qs94r7LYVwK7amgZ+2mqg12lw/t+iHYGeFR6Ssjgc79R19cFT/D8n5FfNVGC/HGWkeQzPAmcjfoK7CqzvYx/eYT8VlwHRAe5TqjsOF7WE+ayPqcNsDtlBZPfnetscDZ/Ryqvs8AFtuaRX6RC38ZxNehXFbwWcbHpfaiHtwNvZ9ZFvujbsIe9MkCcKpHzPv1dK+tPuY580x7dYr+p3F9SGrmeq2Y9Bz4qyot6USJ4RDdrWF+pKWS8vTb9Sw9wfpv70l/Hc5nZwuPUFanw21XfIl7FdTGojfuIqnEQ6+2zg3pzPqT4fSfrb+X92svGlXo774+WlbyXqb55jfoFzkc4Xhdegc9RJuB4/7offPp6R9jvzZuIS07je5vqThV4TXdK6H/SAp5xC69wvHrvwW91Jpv53zSAzyHfE7p6gB1ksBf3A7U+78lii1bwqSfwPOfVZYW61EYO8E8o4b1AbVGNOh3nVcOutvaF7nztH+RR25kj3if6FupSC9bC71E9lvxu5LVM5bus2ssZumc
1f3fqgNuQgRG8t6rDcpEnpL8j46P2IG5KVgXxEhfU5fNQzX5m893R5V4a6yuMB09kLs+v6jKeb6esVVNfOxEySKub/a3/gHomTvqD+TRzweMq4blK3XQKftMPD1lfazTq8LuUTX1HCXCu1ziP6j0diJ9q1s76Dp9GvKEB3zHUJ1ciPkDYzvgfd0CPKOV7ifqP8+ALUw0l332es3zKeO9UP1oMXvvECvb7ZyLwVrt4/le/WcD5rtJGfL32euj610aAv37IVujO+yIQXz/6AnRLhT79BnbIMzm6HXz1JkXQp36xiH7LJMSZTOZ9sN7OAf0FxLYzn6AYunjYafDWC0dBmkdwna73Wwndtn0f9TC+v7lU2TI+rhv8Nv5Z1BN/Ap/Nh+HUv7jC+ozLWE/WMYyrQdAU5svpD2nXxne++vxxiA9c0Jf6vDDEeyxSM77EDv6Q7Z+znuXZ4LNTXCO+djr0IB3eu+o38T3c+zXPD/U7+d7k9Nlp1reX770BY/n91x/UgvQ4e4X1HU1Ff7b7uR/Wd56DtAqJpX7rMsereBX57t4Av9s+Fet7NAQ40ToF+Z+8A07g95Dxb/MhAzzeAK8Z2AK+8C/z0KdG5z+IEy30pq73jvMRy3crjf4a+B1O8/1RY8D1R+8G3780NgaIc8jfhro0nj8jj0m3IfP5813Udh/fhTVBW4B3D+C7iCZCAt3CZAP9abvB7xLC71qTk4N+PJ5uw3hrZqfA7vxhNHCaCr7PhAw2p75CHzxmq7hva9SPgfcp5T6oaSwETqTh+66mpRC4yTX8P5DmHr8zx93jGP88AbjAl9vQb4P2eeQLfsL3m4Z+4+F3G3sY+AbdB+jHtYLvUA0GnEf/hB7YG0zmIr/X0QD6HfIRb6TH98iG8JXwO4Y/h70hm+dd1wFrmH8O3/WcJUOIL+L/M6YOfEy98inqcHjPd9+Gmh3g8/7yAP34mf+rx/G9ouEH/r/BbT/PXw2Xf0Eek9zJ1J88RB2CaI5jQ89hzq+I72SNAzYj3mK9C/I06v4K/6S5fA9rtOG7m+M5vm82CoaB19GV77qN7nxPdfVzox5rhzzGfXQZv5nf0

It's peaks data from a proteomics scan. Any help in understanding this data would be appreciated.

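I solved this issue: the peaks string is Base64-encoded, so it has to be Base64-decoded before it goes to zlib. I was then able to read the Base64-decoded data using code from the massspec-toolbox library.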
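For readers hitting the same error, here is a minimal sketch of the Base64 decoding step mentioned above. It assumes the peaks text is the Base64 encoding of a zlib stream; the `eJ` prefix of the data in the question is what the common zlib header bytes 0x78 0x9C look like in Base64. The function name `decode_peaks` and the sample bytes are made up for illustration and are not from the original post.

```python
import base64
import zlib


def decode_peaks(b64_text, compressed=True):
    """Base64-decode a peaks string, then zlib-decompress it if requested.

    Hypothetical helper: the name and signature are illustrative only.
    """
    raw = base64.b64decode(b64_text)
    return zlib.decompress(raw) if compressed else raw


# Round trip with made-up bytes (not real mzXML data).
payload = b"\x00\x01\x02\x03" * 4
encoded = base64.b64encode(zlib.compress(payload))

# Handing the Base64 text straight to zlib fails, as in the question,
# because the text does not start with a valid zlib header.
try:
    zlib.decompress(encoded)
    direct_ok = True
except zlib.error:
    direct_ok = False

decoded = decode_peaks(encoded)
```

Here `direct_ok` ends up `False`, while `decode_peaks(encoded)` returns the original bytes. By default `base64.b64decode` discards characters outside the Base64 alphabet, so line breaks inside the XML text should not need to be stripped first.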

Here is the code snippet that parses the decoded peaks:

import struct

def parse_peaks(peaks_decoded):
    # Based on code by Taejoon Kwon (https://code.google.com/archive/p/massspec-toolbox/)
    # Each value is 4 bytes; integer division keeps the count an int on Python 3.
    tmp_size = len(peaks_decoded) // 4
    unpack_format1 = ">%dL" % tmp_size

    idx = 0
    mz_list = []
    intensity_list = []
    # Read each value as a big-endian 32-bit integer, then reinterpret its bits as a float.
    for tmp in struct.unpack(unpack_format1, peaks_decoded):
        tmp_i = struct.pack("I", tmp)
        tmp_f = struct.unpack("f", tmp_i)[0]
        # Values alternate: m/z, intensity, m/z, intensity, ...
        if idx % 2 == 0:
            mz_list.append(float(tmp_f))
        else:
            intensity_list.append(float(tmp_f))
        idx += 1
    return mz_list, intensity_list
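Note that parse_peaks reads 32-bit values, while the attributes in the question list precision="64". A possible variant that takes the precision as a parameter is sketched below. It assumes network (big-endian) byte order and alternating m/z and intensity values; the name `parse_peaks_pairs` and the sample numbers are hypothetical and not part of the original answer.

```python
import struct


def parse_peaks_pairs(peaks_bytes, precision=32):
    """Split big-endian (m/z, intensity) pairs into two lists.

    Hypothetical helper; assumes a precision of 32 or 64 bits.
    """
    code = {32: "f", 64: "d"}[precision]  # struct codes for float / double
    width = precision // 8                # bytes per value
    count = len(peaks_bytes) // width     # number of whole values
    values = struct.unpack(">%d%s" % (count, code), peaks_bytes[:count * width])
    # Even positions hold m/z values, odd positions hold intensities.
    return list(values[0::2]), list(values[1::2])


# Made-up sample: two (m/z, intensity) pairs packed as 64-bit doubles.
sample = struct.pack(">4d", 100.5, 2000.0, 101.25, 350.0)
mz_list, intensity_list = parse_peaks_pairs(sample, precision=64)
```

Unpacking directly with the `>f` or `>d` format avoids the integer round trip, and slicing the resulting tuple replaces the manual index counter.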
