
Python: adding each list in a list of lists as its own row of a DataFrame

The code below takes a JSON response and extracts the data we need from it. The response is odd: sometimes it is a dict and sometimes it is a list. My code accounts for that, and I really only need the list responses. The problem is that the list response sometimes contains a list of lists. In the CSV image below, row 4 is an example of this. My code goes into the third column (C), takes the first list of data, and adds it to a dataframe. There are six values, each gets its own column, and then the code moves on to the next row. Row 2 is an example of this: it has a single list in column C. The problem comes with a row like row 4, which has multiple lists in column C. The image shows at least two, but there can be any number. I need every dataset found in column C to get its own row in the new dataframe. So the data from column C in row 2 would get one row of 6 columns in the dataframe, and the data from row 4 would get at least 3 rows, since it shows at least 3 datasets. Rows 5 and 6 would also return 3 rows each to the dataframe, and row 7 would return only 1.

[image: dataset]

import json, time
from websocket import create_connection
import pandas as pd
   
# start with empty dataframe
df = pd.DataFrame()   
super_x = []
ws = create_connection("wss://ws.kraken.com/")

ws.send(json.dumps({
    "event": "subscribe",
    "pair": ["BTC/USD"],
    "subscription": {"name": "trade"}
}))

timeout = time.time() + 60*.20
while time.time() < timeout:
    js = json.loads(ws.recv())
    if isinstance(js, dict):
        df = pd.concat([df, pd.json_normalize(js)])
        #super_x.append( [super_x, pd.json_normalize(js)])
    elif isinstance(js, list):        
        df = pd.concat([df, pd.json_normalize({"event":"trade",
        #super_x.append([super_x, pd.json_normalize({"event":"trade",
                                               "trade":{                                                                                                     
                                                   "s0":js[1][0][0],
                                                   "s1":js[1][0][1], 
                                                   "s2":js[1][0][2],
                                                   "s3":js[1][0][3],                                                
                                                   "s4":js[1][0][4],
                                                   "s5":js[1][0][5],  
                                                   "pair":js[3]}
                                              })
                       ] ) 
                              
        try:
            fd = ([pd.json_normalize({"trade":{ "s":js[1] }}) ]) 
            if fd:
                print(len(fd))
        
        except:
            print("An exception occurred")


    
    else:
         print(f"unknown socket data {js}")
   
    
    
    #print(js)
    #time.sleep(1)
# df = pd.concat(super_x, axis=0)  # super_x is never filled above; pd.concat on an empty list raises ValueError
#data filters
df = df[df['event'] != 'systemStatus'] 
df = df[df['event'] != 'subscriptionStatus']
df = df[df['event'] != 'heartbeat']     
#column drop for csv
cols = [0,2,3,4,5,6,7] 
df.drop(df.columns[cols],axis=1,inplace=True)
df.columns =['event','price','volume', 'time', 'side', 'orderType', 'misc', 'pair']
csv_file = "kracktwo-test.csv"
df.to_csv(csv_file, index=False, encoding='utf-8')  

In my code, the elif branch is where I get the data I need. If you look at the line "trade":{ "s0":js[1][0][0]}, there will always be data in js[1][0]. I need to look for data at js[1][1], js[1][2], ... and append it if there is any. I just don't quite understand how I would do that.

Here is an image of a CSV that somewhat works correctly: the data I want has been put into rows, with each value under its own column. However, this example shows only the first dataset of each row. If a row were really like row 4 from the first CSV image, it would return only the first one. That's the problem.

[image]

New image: rows that contain multiple lists

Here is the raw response coming in. It shows everything that I'm currently filtering out.

{'connectionID': 16068280472185995247, 'event': 'systemStatus', 'status': 'online', 'version': '1.7.2'}
{'channelID': 321, 'channelName': 'trade', 'event': 'subscriptionStatus', 'pair': 'XBT/USD', 'status': 'subscribed', 'subscription': {'name': 'trade'}}
[321, [['46720.00000', '0.00110000', '1612842883.462662', 'b', 'm', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
{'event': 'heartbeat'}
[321, [['46720.00000', '1.00000000', '1612842885.072037', 'b', 'm', '']], 'trade', 'XBT/USD']
[321, [['46720.00000', '0.01500000', '1612842885.083810', 'b', 'm', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
[321, [['46719.90000', '0.03710320', '1612842886.195731', 's', 'm', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
[321, [['46720.00000', '0.00100000', '1612842886.966132', 'b', 'm', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
[321, [['46719.90000', '0.00718180', '1612842887.736970', 's', 'm', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
[321, [['46720.00000', '0.20000000', '1612842889.436244', 'b', 'l', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
[321, [['46720.00000', '0.00849880', '1612842889.692922', 'b', 'l', '']], 'trade', 'XBT/USD']
[321, [['46720.00000', '0.05000000', '1612842890.358690', 'b', 'm', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
{'event': 'heartbeat'}
[321, [['46720.00000', '0.10702055', '1612842891.977570', 'b', 'm', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
[321, [['46720.00000', '0.07603446', '1612842892.601437', 'b', 'm', '']], 'trade', 'XBT/USD']
[321, [['46719.90000', '0.01604475', '1612842893.217442', 's', 'm', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
[321, [['46720.00000', '1.16008431', '1612842894.457002', 'b', 'm', '']], 'trade', 'XBT/USD']
[321, [['46720.00000', '0.01500000', '1612842894.478225', 'b', 'm', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
[321, [['46720.00000', '0.01000000', '1612842895.156688', 'b', 'm', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
{'event': 'heartbeat'}
[321, [['46720.00000', '0.00874369', '1612842897.466145', 'b', 'm', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
{'event': 'heartbeat'}
{'event': 'heartbeat'}
[321, [['46720.00000', '0.32680412', '1612842900.426143', 'b', 'l', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
[321, [['46720.00000', '0.02000000', '1612842900.731235', 'b', 'l', '']], 'trade', 'XBT/USD']
[321, [['46720.00000', '0.02000000', '1612842900.818573', 'b', 'l', '']], 'trade', 'XBT/USD']
[321, [['46720.00000', '0.01510000', '1612842900.904646', 'b', 'l', '']], 'trade', 'XBT/USD']
[321, [['46720.00000', '0.00668944', '1612842901.064427', 'b', 'l', '']], 'trade', 'XBT/USD']
[321, [['46720.00000', '1.57551815', '1612842901.223155', 'b', 'l', '']], 'trade', 'XBT/USD']
[321, [['46720.00000', '0.19335759', '1612842901.465767', 'b', 'l', ''], ['46720.00000', '0.10000000', '1612842901.467930', 'b', 'l', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
[321, [['46725.00000', '0.00200000', '1612842901.772735', 'b', 'l', '']], 'trade', 'XBT/USD']
[321, [['46728.20000', '0.30000000', '1612842901.830095', 'b', 'l', ''], ['46729.60000', '0.00500000', '1612842901.832807', 'b', 'l', '']], 'trade', 'XBT/USD']
[321, [['46730.00000', '0.70000000', '1612842902.123385', 'b', 'l', ''], ['46730.80000', '0.00107000', '1612842902.125857', 'b', 'l', ''], ['46740.00000', '2.00000000', '1612842902.128813', 'b', 'l', ''], ['46740.70000', '0.34406831', '1612842902.131029', 'b', 'l', ''], ['46742.50000', '0.00062959', '1612842902.133150', 'b', 'l', ''], ['46744.60000', '0.20000000', '1612842902.136065', 'b', 'l', ''], ['46750.00000', '0.01851050', '1612842902.138491', 'b', 'l', ''], ['46750.00000', '0.03423252', '1612842902.141181', 'b', 'l', '']], 'trade', 'XBT/USD']
[321, [['46729.90000', '0.00100000', '1612842902.149428', 's', 'm', '']], 'trade', 'XBT/USD']
[321, [['46750.00000', '0.14725698', '1612842902.153561', 'b', 'm', ''], ['46750.00000', '0.10274302', '1612842902.154768', 'b', 'm', '']], 'trade', 'XBT/USD']
[321, [['46750.00000', '0.50000000', '1612842902.158276', 'b', 'm', '']], 'trade', 'XBT/USD']
[321, [['46750.00000', '0.05000000', '1612842902.162690', 'b', 'm', ''], ['46750.00000', '0.10000000', '1612842902.166186', 'b', 'm', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
[321, [['46750.00000', '0.10695187', '1612842903.077553', 'b', 'm', '']], 'trade', 'XBT/USD']
[321, [['46750.00000', '0.49430511', '1612842903.099799', 'b', 'l', ''], ['46756.00000', '0.00110002', '1612842903.102014', 'b', 'l', ''], ['46756.60000', '0.00079851', '1612842903.103715', 'b', 'l', ''], ['46763.70000', '0.00043351', '1612842903.105738', 'b', 'l', ''], ['46766.10000', '0.15000000', '1612842903.107645', 'b', 'l', '']], 'trade', 'XBT/USD']
[321, [['46774.20000', '0.04480930', '1612842903.128691', 'b', 'm', ''], ['46774.20000', '0.08000000', '1612842903.131908', 'b', 'm', ''], ['46774.20000', '0.15015145', '1612842903.134977', 'b', 'm', ''], ['46774.20000', '0.02503925', '1612842903.138306', 'b', 'm', ''], ['46787.80000', '0.10000000', '1612842903.139867', 'b', 'm', ''], ['46787.80000', '0.03445583', '1612842903.141510', 'b', 'm', ''], ['46787.90000', '0.01000000', '1612842903.143097', 'b', 'm', ''], ['46789.30000', '0.03050492', '1612842903.145436', 'b', 'm', ''], ['46789.30000', '0.02503925', '1612842903.149362', 'b', 'm', '']], 'trade', 'XBT/USD']
[321, [['46766.30000', '0.01163594', '1612842903.171495', 's', 'm', ''], ['46766.30000', '0.00336406', '1612842903.177694', 's', 'm', '']], 'trade', 'XBT/USD']
[321, [['46766.30000', '0.00044563', '1612842903.183960', 's', 'm', ''], ['46766.30000', '0.00000116', '1612842903.187847', 's', 'm', '']], 'trade', 'XBT/USD']
[321, [['46789.30000', '0.04445583', '1612842903.192119', 'b', 'm', ''], ['46789.30000', '0.04554417', '1612842903.194391', 'b', 'm', '']], 'trade', 'XBT/USD']
[321, [['46789.30000', '0.01000000', '1612842903.198400', 'b', 'm', ''], ['46789.30000', '0.02500000', '1612842903.201485', 'b', 'm', '']], 'trade', 'XBT/USD']
[321, [['46766.70000', '0.00079858', '1612842903.223055', 's', 'l', '']], 'trade', 'XBT/USD']
[321, [['46766.90000', '0.00079233', '1612842903.258566', 's', 'l', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
{'event': 'heartbeat'}
[321, [['46771.00000', '0.00100000', '1612842904.708929', 's', 'm', '']], 'trade', 'XBT/USD']
[321, [['46771.10000', '0.01000000', '1612842904.753285', 'b', 'm', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
[321, [['46760.90000', '0.00042129', '1612842906.035380', 'b', 'l', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
{'event': 'heartbeat'}
[321, [['46748.80000', '0.00200000', '1612842907.720085', 's', 'm', '']], 'trade', 'XBT/USD']
[321, [['46746.60000', '0.11226863', '1612842908.307851', 's', 'm', ''], ['46745.30000', '0.42381782', '1612842908.310121', 's', 'm', ''], ['46745.30000', '0.00101472', '1612842908.313349', 's', 'm', ''], ['46745.30000', '0.00000243', '1612842908.315922', 's', 'm', ''], ['46745.30000', '0.00000001', '1612842908.318515', 's', 'm', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
{'event': 'heartbeat'}
{'event': 'heartbeat'}
[321, [['46745.40000', '0.05952963', '1612842911.490887', 'b', 'm', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
{'event': 'heartbeat'}
{'event': 'heartbeat'}
{'event': 'heartbeat'}
[321, [['46745.40000', '0.14047037', '1612842915.495179', 'b', 'm', ''], ['46745.40000', '0.02460000', '1612842915.497478', 'b', 'm', ''], ['46745.50000', '0.02460000', '1612842915.499842', 'b', 'm', ''], ['46747.50000', '0.14000000', '1612842915.501424', 'b', 'm', ''], ['46762.60000', '0.08000000', '1612842915.503684', 'b', 'm', ''], ['46764.40000', '0.09032963', '1612842915.505680', 'b', 'm', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
{'event': 'heartbeat'}
[321, [['46745.30000', '0.00350000', '1612842916.786464', 's', 'm', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
[321, [['46745.40000', '0.00060000', '1612842918.255090', 'b', 'm', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
{'event': 'heartbeat'}
{'event': 'heartbeat'}
{'event': 'heartbeat'}
[321, [['46745.40000', '0.12000000', '1612842921.591952', 'b', 'm', ''], ['46745.40000', '0.02460000', '1612842921.594207', 'b', 'm', ''], ['46745.40000', '0.04320000', '1612842921.595638', 'b', 'm', ''], ['46745.40000', '0.08000000', '1612842921.596980', 'b', 'm', ''], ['46745.40000', '0.13436069', '1612842921.598368', 'b', 'm', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
[321, [['46745.40000', '0.00695149', '1612842923.109038', 'b', 'm', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
{'event': 'heartbeat'}
[321, [['46745.40000', '0.04706335', '1612842924.637917', 'b', 'l', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
{'event': 'heartbeat'}
[321, [['46745.30000', '0.04019430', '1612842926.694212', 's', 'l', ''], ['46744.30000', '0.00164674', '1612842926.696474', 's', 'l', '']], 'trade', 'XBT/USD']
[321, [['46723.00000', '0.04457400', '1612842927.166026', 'b', 'l', '']], 'trade', 'XBT/USD']
[321, [['46723.00000', '0.06542600', '1612842927.503626', 'b', 'm', ''], ['46723.80000', '0.17967021', '1612842927.506574', 'b', 'm', ''], ['46723.90000', '0.17123573', '1612842927.508561', 'b', 'm', ''], ['46724.30000', '0.16162447', '1612842927.510756', 'b', 'm', ''], ['46726.60000', '0.32000000', '1612842927.512988', 'b', 'm', ''], ['46727.20000', '0.20000000', '1612842927.515160', 'b', 'm', ''], ['46729.30000', '0.06510000', '1612842927.517510', 'b', 'm', ''], ['46729.40000', '0.06510000', '1612842927.519568', 'b', 'm', ''], ['46730.00000', '0.08000000', '1612842927.521465', 'b', 'm', ''], ['46738.60000', '0.14184359', '1612842927.523459', 'b', 'm', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
{'event': 'heartbeat'}
[321, [['46724.40000', '0.10000000', '1612842928.216970', 'b', 'l', ''], ['46726.50000', '0.06510000', '1612842928.219388', 'b', 'l', ''], ['46729.20000', '0.10000000', '1612842928.222759', 'b', 'l', ''], ['46738.00000', '0.23490000', '1612842928.224467', 'b', 'l', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
{'event': 'heartbeat'}
{'event': 'heartbeat'}
{'event': 'heartbeat'}
[321, [['46721.70000', '0.50363900', '1612842932.573493', 's', 'm', ''], ['46721.70000', '0.42600000', '1612842932.575444', 's', 'm', ''], ['46721.70000', '0.29516000', '1612842932.577342', 's', 'm', ''], ['46721.70000', '0.11232674', '1612842932.578843', 's', 'm', ''], ['46719.90000', '0.03000000', '1612842932.580577', 's', 'm', ''], ['46719.90000', '0.01000000', '1612842932.582029', 's', 'm', ''], ['46719.90000', '0.42500000', '1612842932.584071', 's', 'm', ''], ['46713.10000', '0.10000000', '1612842932.585913', 's', 'm', ''], ['46713.10000', '0.09787426', '1612842932.587333', 's', 'm', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
[321, [['46718.00000', '0.00562852', '1612842933.770955', 'b', 'm', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
{'event': 'heartbeat'}
{'event': 'heartbeat'}
{'event': 'heartbeat'}
{'event': 'heartbeat'}
{'event': 'heartbeat'}
{'event': 'heartbeat'}
[321, [['46717.90000', '0.06849619', '1612842941.402657', 's', 'l', '']], 'trade', 'XBT/USD']
[321, [['46717.90000', '0.06849619', '1612842941.426547', 's', 'l', '']], 'trade', 'XBT/USD']
{'event': 'heartbeat'}
{'event': 'heartbeat'}
{'event': 'heartbeat'}

Step 1. Get the data list.

import json, time
from websocket import create_connection
import pandas as pd
   
super_x = []
ws = create_connection("wss://ws.kraken.com/")
ws.send(json.dumps({
    "event": "subscribe",
    "pair": ["BTC/USD"],
    "subscription": {"name": "trade"}
}))
timeout = time.time() + 60*.20

# only keep list type
while time.time() < timeout:
    js = json.loads(ws.recv())
    if isinstance(js, list):
        print(js)
        super_x.append(js)
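
The filtering in step 1 can be tried without a live socket. The following is a minimal sketch that assumes the messages arrive as JSON text shaped like the raw response in the question; `raw_messages` and `keep_trade_messages` are hypothetical names used only for this illustration.

```python
import json

# A few messages copied from the raw response in the question, as JSON text.
raw_messages = [
    '{"event": "heartbeat"}',
    '[321, [["46720.00000", "0.00110000", "1612842883.462662", "b", "m", ""]], "trade", "XBT/USD"]',
    '{"event": "heartbeat"}',
    '[321, [["46720.00000", "0.19335759", "1612842901.465767", "b", "l", ""], '
    '["46720.00000", "0.10000000", "1612842901.467930", "b", "l", ""]], "trade", "XBT/USD"]',
]


def keep_trade_messages(messages):
    """Decode each message and keep only the list-type ones (trade payloads)."""
    kept = []
    for text in messages:
        js = json.loads(text)
        if isinstance(js, list):  # dicts are status or heartbeat events
            kept.append(js)
    return kept


super_x = keep_trade_messages(raw_messages)
print(len(super_x))        # 2 list-type messages survive the filter
print(len(super_x[1][1]))  # the second message carries 2 trades
```

Each kept message still holds all of its trades in position 1, so nothing is lost before the parsing step.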

Step 2. Handle the data.

# parse the data
df = pd.DataFrame(super_x, columns=['channelID', 'trade', 'event', 'pair']).explode('trade')
df[['price', 'volume', 'time', 'side', 'orderType', 'misc']] = pd.DataFrame(df['trade'].tolist()).values
cols = ['event', 'price', 'volume', 'time', 'side', 'orderType', 'misc', 'pair']
dfn = df[cols].copy()

print(dfn)

       event    price      volume               time side orderType misc     pair
    0  trade  46737.2  0.03499059  1612848385.323798    s         m       XBT/USD
    0  trade  46737.2  0.01500941  1612848385.328784    s         m       XBT/USD
    1  trade  46736.8  0.06296629  1612848388.057267    s         m       XBT/USD
    1  trade  46736.8  0.01000000  1612848388.060013    s         m       XBT/USD
    1  trade  46736.8  0.00003371  1612848388.061986    s         m       XBT/USD
    1  trade  46731.3  0.02404310  1612848388.063164    s         m       XBT/USD
    2  trade  46732.6  0.03170000  1612848390.196840    s         l       XBT/USD
    3  trade  46734.7  0.10000000  1612848392.086250    s         m       XBT/USD
    4  trade  46735.9  0.00425878  1612848394.057669    s         m       XBT/USD
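
To check the explode approach offline, here is a small sketch run on two messages taken from the raw response in the question. It is a variation on the code above rather than a replacement: it resets the index after `explode` so each trade gets a unique row label, and builds the six trade columns with `pd.concat`. The variable names `trade_cols` and `trades` are assumptions made for this example.

```python
import pandas as pd

# Two list-type messages from the question: one trade, then two trades.
super_x = [
    [321, [['46720.00000', '0.00110000', '1612842883.462662', 'b', 'm', '']],
     'trade', 'XBT/USD'],
    [321, [['46720.00000', '0.19335759', '1612842901.465767', 'b', 'l', ''],
           ['46720.00000', '0.10000000', '1612842901.467930', 'b', 'l', '']],
     'trade', 'XBT/USD'],
]

trade_cols = ['price', 'volume', 'time', 'side', 'orderType', 'misc']

df = pd.DataFrame(super_x, columns=['channelID', 'trade', 'event', 'pair'])
# explode turns each inner six-value list into its own row;
# reset_index gives every exploded row a unique label.
df = df.explode('trade').reset_index(drop=True)

# Spread the six values of each trade across named columns.
trades = pd.DataFrame(df['trade'].tolist(), columns=trade_cols)
dfn = pd.concat([df[['event', 'pair']], trades], axis=1)
dfn = dfn[['event'] + trade_cols + ['pair']]

print(dfn)
```

With these two messages the result has three rows, one per trade, and the values are still strings as received from the feed.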

You can iterate over the elements of the JSON list in a dictionary comprehension.

trade = {f"s{i}": val for i, val in enumerate(js[1][0])}
trade["pair"] = js[3]
df = pd.concat([df, pd.json_normalize({"event": "trade", "trade": trade})])
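
The comprehension above reads only `js[1][0]`, so a message with several trades would still yield one row. A possible extension, sketched here under the assumption that every element of `js[1]` is a six-value trade as in the raw response, loops over all of them and builds the frame once at the end. `trade_rows` is a hypothetical helper, and the flat column names `s0` to `s5` differ from the `trade.s0` style names that `json_normalize` would produce.

```python
import pandas as pd


def trade_rows(js):
    """Yield one flat dict per trade found in a single list-type message."""
    for trade in js[1]:  # js[1] holds one or more six-value trades
        row = {"event": js[2]}
        row.update({f"s{i}": val for i, val in enumerate(trade)})
        row["pair"] = js[3]
        yield row


# One message from the question that carries two trades.
js = [321,
      [['46728.20000', '0.30000000', '1612842901.830095', 'b', 'l', ''],
       ['46729.60000', '0.00500000', '1612842901.832807', 'b', 'l', '']],
      'trade', 'XBT/USD']

rows = list(trade_rows(js))
df = pd.DataFrame(rows)  # build the frame once instead of concatenating per message
print(df)
```

Inside the receive loop, the rows from each message could be collected with `rows.extend(trade_rows(js))`, which avoids growing a dataframe with repeated `pd.concat` calls.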
