I have this DataFrame:
import pandas as pd
df = [{"username": "last",
"time_data": "{\"hours\":[{\"hour\":\"00:00\",\"postCount\":\"5\",\"topicCount\":\"3\",\"totalCount\":80},{\"hour\":\"01:00\",\"postCount\":\"11\",\"topicCount\":\"6\",\"totalCount\":31}]}"
},
{"username": "truk",
"time_data": "{\"hours\":[{\"hour\":\"00:00\",\"postCount\":\"11\",\"topicCount\":\"6\",\"totalCount\":362},{\"hour\":\"01:00\",\"postCount\":\"22\",\"topicCount\":\"8\",\"totalCount\":355}]}"
}]
df = pd.DataFrame(df)
df
I used this code to get the "postCount" for both '00:00' and '01:00':
df_h0 = df.copy()
df_h0['hour']='00:00'
df_h0['totalCount']=df.time_data.str.split('"00:00","postCount":"').str[1].str.split('","topic').str[0]
df_h0 = df_h0.fillna(0)
df_h1 = df.copy()
df_h1['hour']='01:00'
df_h1['totalCount']=df.time_data.str.split('"01:00","postCount":"').str[1].str.split('","topic').str[0]
df_h1 = df_h1.fillna(0)
df_tot = pd.concat([df_h0, df_h1])
df_tot.head()
But now I want to get the "totalCount", which is not directly next to the hour. Does anyone know how to do that?
Expected output:
   time_data                                          username  hour   totalCount
0  {"hours":[{"hour":"00:00","postCount":"5","top...  last      00:00  80
1  {"hours":[{"hour":"00:00","postCount":"11","to...  truk      00:00  362
0  {"hours":[{"hour":"00:00","postCount":"5","top...  last      01:00  31
1  {"hours":[{"hour":"00:00","postCount":"11","to...  truk      01:00  355
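One possible way to get there, as a rough sketch rather than a definitive answer: use a regular expression with `Series.str.extract` to capture the digits after "totalCount" inside the object for a given hour. The helper name `count_for_hour` and its `key` parameter are made up for illustration. The sample rows assume that the last key of each "01:00" entry is meant to be "totalCount" and that each JSON string is closed with `]}`.

```python
import re
import pandas as pd

# Sample rows modeled on the question. Assumption: the last key of each
# "01:00" entry is "totalCount" and every JSON string is closed with "]}".
rows = [
    {"username": "last",
     "time_data": '{"hours":[{"hour":"00:00","postCount":"5","topicCount":"3","totalCount":80},'
                  '{"hour":"01:00","postCount":"11","topicCount":"6","totalCount":31}]}'},
    {"username": "truk",
     "time_data": '{"hours":[{"hour":"00:00","postCount":"11","topicCount":"6","totalCount":362},'
                  '{"hour":"01:00","postCount":"22","topicCount":"8","totalCount":355}]}'},
]
df = pd.DataFrame(rows)


def count_for_hour(frame, hour, key="totalCount"):
    """Hypothetical helper: copy `frame`, add `hour` and the value of `key` for it."""
    # Find the requested hour, stay inside its {...} object via [^}]*?,
    # then capture the digits after the key (quotes around the value are optional).
    pattern = r'"hour":"' + re.escape(hour) + r'"[^}]*?"' + re.escape(key) + r'":"?(\d+)'
    out = frame.copy()
    out["hour"] = hour
    extracted = frame["time_data"].str.extract(pattern, expand=False)
    # Rows without a match become NaN; treat them as 0, like the fillna(0) above.
    out[key] = pd.to_numeric(extracted).fillna(0).astype(int)
    return out


# Stack one frame per hour; the original index (0, 1) repeats for each hour.
df_tot = pd.concat([count_for_hour(df, h) for h in ("00:00", "01:00")])
print(df_tot[["username", "hour", "totalCount"]])
```

Passing `key="postCount"` to the same helper would reproduce the earlier split-based result. If every `time_data` string is complete, valid JSON, parsing it with `json.loads` is likely more robust than pattern matching on the raw text.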