
DataFrame repeated dictionaries in a list

I have a JSON file that contains a list of nested dictionaries (JSON sample):

{"posts": [{"url": "http://twitter.com/AkEl_Saruman/status/1084067040481169408", "title": "", "type": "Twitter", "language": "tr", "assignedCategoryId": 19058723389, "assignedEmotionId": 0, "categoryScores": [{"categoryId": 19058723389, "categoryName": "Basic Negative", "score": 0.58}, {"categoryId": 19058723388, "categoryName": "Basic Positive", "score": 0.01}, {"categoryId": 19058723391, "categoryName": "Basic Neutral", "score": 0.41}], "emotionScores": [], "imageInfo": [], "monitorId": 19058723386, "guid": "1084067040481169408", "parentGuid": "1083973777493512192", "engagementType": "RETWEET", "documentsUrls": ["https://www.yenisafak.com/dunya/firatin-dogusunda-turkiyeye-sabotaj-3430594"]}, {"url": "http://twitter.com/eazngl2/status/1084067263895007232", "title": "", "type": "Twitter", "location": "SAU", "geolocation": {"id": "SAU", "name": "Saudi Arabia", "country": "SAU"}, "language": "ar", "assignedCategoryId": 19058723391, "assignedEmotionId": 0, "categoryScores": [{"categoryId": 19058723389, "categoryName": "Basic Negative", "score": 0.01}, {"categoryId": 19058723388, "categoryName": "Basic Positive", "score": 0.39}, {"categoryId": 19058723391, "categoryName": "Basic Neutral", "score": 0.6}], "emotionScores": [], "imageInfo": [], "monitorId": 19058723386, "guid": "1084067263895007232", "parentGuid": "1084044740461502465", "engagementType": "RETWEET", "mediaUrls": ["http://pbs.twimg.com/media/DwtM8PYW0AEE-oR.jpg"]}, {"url": "http://twitter.com/h_7hm/status/1084067289723535360", "title": "", "type": "Twitter", "location": "SAU", "geolocation": {"id": "SAU", "name": "Saudi Arabia", "country": "SAU"}, "language": "ar", "assignedCategoryId": 19058723391, "assignedEmotionId": 0, "categoryScores": [{"categoryId": 19058723389, "categoryName": "Basic Negative", "score": 0.17}, {"categoryId": 19058723388, "categoryName": "Basic Positive", "score": 0.01}, {"categoryId": 19058723391, "categoryName": "Basic Neutral", "score": 0.81}], "emotionScores": [], "imageInfo": 
[], "monitorId": 19058723386, "guid": "1084067289723535360", "parentGuid": "1083854364547207175", "engagementType": "RETWEET"}, {"url": "http://twitter.com/BeooutQ_2BQ/status/1084067316311224325", "title": "", "type": "Twitter", "location": "Bogota, Bogota, COL", "geolocation": {"id": "COL.Bogota.Bogota", "name": "Bogota", "country": "COL", "state": "Bogota", "city": "Bogota"}, "language": "ar", "assignedCategoryId": 19058723389, "assignedEmotionId": 0, "categoryScores": [{"categoryId": 19058723389, "categoryName": "Basic Negative", "score": 0.52}, {"categoryId": 19058723388, "categoryName": "Basic Positive", "score": 0.24}, {"categoryId": 19058723391, "categoryName": "Basic Neutral", "score": 0.24}], "emotionScores": [], "imageInfo": [], "monitorId": 19058723386, "guid": "1084067316311224325", "parentGuid": "1084066998399758336", "engagementType": "REPLY"}, {"url": "http://twitter.com/ekspreshbrajans/status/1084067335680471040", "title": "", "type": "Twitter", "location": "Adana, Mediterranean Region, TUR", "geolocation": {"id": "TUR.Mediterranean Region.Adana", "name": "Adana", "country": "TUR", "state": "Mediterranean Region", "city": "Adana"}, "language": "tr", "authorGender": "M", "assignedCategoryId": 19058723389, "assignedEmotionId": 0, "categoryScores": [{"categoryId": 19058723389, "categoryName": "Basic Negative", "score": 0.57}, {"categoryId": 19058723388, "categoryName": "Basic Positive", "score": 0.04}, {"categoryId": 19058723391, "categoryName": "Basic Neutral", "score": 0.39}], "emotionScores": [], "imageInfo": [], "monitorId": 19058723386, "guid": "1084067335680471040", "documentsUrls": ["http://ekspreshaberajansi.com/2019/01/12/fuat-ugur-feto-neyse-sozcu-de-o/"]}, {"url": "http://twitter.com/mualitass/status/1084067769094754305", "title": "", "type": "Twitter", "location": "Istanbul, Marmara Region, TUR", "geolocation": {"id": "TUR.Marmara Region.Istanbul", "name": "Istanbul", "country": "TUR", "state": "Marmara Region", "city": "Istanbul"}, 
"language": "tr", "authorGender": "M", "assignedCategoryId": 19058723389, "assignedEmotionId": 0, "categoryScores": [{"categoryId": 19058723389, "categoryName": "Basic Negative", "score": 0.96}, {"categoryId": 19058723388, "categoryName": "Basic Positive", "score": 0.03}, {"categoryId": 19058723391, "categoryName": "Basic Neutral", "score": 0.01}], "emotionScores": [], "imageInfo": [], "monitorId": 19058723386, "guid": "1084067769094754305", "parentGuid": "1084020709586845696", "engagementType": "RETWEET"}, {"url": "http://twitter.com/smail_gomra/status/1084067900732907520", "title": "", "type": "Twitter", "language": "ar", "assignedCategoryId": 19058723391, "assignedEmotionId": 0, "categoryScores": [{"categoryId": 19058723389, "categoryName": "Basic Negative", "score": 0.0}, {"categoryId": 19058723388, "categoryName": "Basic Positive", "score": 0.32}, {"categoryId": 19058723391, "categoryName": "Basic Neutral", "score": 0.68}], "emotionScores": [], "imageInfo": [], "monitorId": 19058723386, "guid": "1084067900732907520", "parentGuid": "1084062244781113347", "engagementType": "RETWEET", "mediaUrls": ["https://video.twimg.com/ext_tw_video/1084060799595933698/pu/vid/640x480/bZ1hcR-mCViaG2vQ.mp4?tag=8"]}, {"url": "http://twitter.com/taliphan_197878/status/1084067941556068352", "title": "", "type": "Twitter", "location": "Izmir, Aegean Region, TUR", "geolocation": {"id": "TUR.Aegean Region.Izmir", "name": "Izmir", "country": "TUR", "state": "Aegean Region", "city": "Izmir"}, "language": "tr", "assignedCategoryId": 19058723391, "assignedEmotionId": 0, "categoryScores": [{"categoryId": 19058723389, "categoryName": "Basic Negative", "score": 0.24}, {"categoryId": 19058723388, "categoryName": "Basic Positive", "score": 0.05}, {"categoryId": 19058723391, "categoryName": "Basic Neutral", "score": 0.72}], "emotionScores": [], "imageInfo": [], "monitorId": 19058723386, "guid": "1084067941556068352", "parentGuid": "1084016623294525440", "engagementType": "RETWEET", 
"documentsUrls": ["https://m.turkiyegazetesi.com.tr/yazarlar/fuat-ugur/606039.aspx"]}, {"url": "http://twitter.com/spor26/status/1084067995326963714", "title": "", "type": "Twitter", "location": "Eskisehir, Central Anatolian Region, TUR", "geolocation": {"id": "TUR.Central Anatolian Region.Eskisehir", "name": "Eskisehir", "country": "TUR", "state": "Central Anatolian Region", "city": "Eskisehir"}, "language": "tr", "assignedCategoryId": 19058723391, "assignedEmotionId": 0, "categoryScores": [{"categoryId": 19058723389, "categoryName": "Basic Negative", "score": 0.11}, {"categoryId": 19058723388, "categoryName": "Basic Positive", "score": 0.04}, {"categoryId": 19058723391, "categoryName": "Basic Neutral", "score": 0.85}], "emotionScores": [], "imageInfo": [], "monitorId": 19058723386, "guid": "1084067995326963714", "documentsUrls": ["http://dlvr.it", "http://www.spor26.com/haberdetay/10980-mehmet-ozcana-fransadan-talip.html?utm_source=dlvr.it&utm_medium=twitter"], "mediaUrls": ["http://pbs.twimg.com/media/DwtiGMAUwAIlqAH.jpg"]}, {"url": "http://twitter.com/xcV44s101gjyOn3/status/1084067920781733888", "title": "", "type": "Twitter", "location": "DZA", "geolocation": {"id": "DZA", "name": "Algeria", "country": "DZA"}, "language": "ar", "assignedCategoryId": 19058723391, "assignedEmotionId": 0, "categoryScores": [{"categoryId": 19058723389, "categoryName": "Basic Negative", "score": 0.0}, {"categoryId": 19058723388, "categoryName": "Basic Positive", "score": 0.32}, {"categoryId": 19058723391, "categoryName": "Basic Neutral", "score": 0.68}], "emotionScores": [], "imageInfo": [], "monitorId": 19058723386, "guid": "1084067920781733888", "parentGuid": "1084062244781113347", "engagementType": "RETWEET", "mediaUrls": ["https://video.twimg.com/ext_tw_video/1084060799595933698/pu/vid/640x480/bZ1hcR-mCViaG2vQ.mp4?tag=8"]}, {"url": "http://twitter.com/Sidar66187750/status/1084068273380093955", "title": "", "type": "Twitter", "location": "Diyarbakir, Southeastern Anatolian 
Region, TUR", "geolocation": {"id": "TUR.Southeastern Anatolian Region.Diyarbakir", "name": "Diyarbakir", "country": "TUR", "state": "Southeastern Anatolian Region", "city": "Diyarbakir"}, "language": "tr", "assignedCategoryId": 19058723388, "assignedEmotionId": 0, "categoryScores": [{"categoryId": 19058723389, "categoryName": "Basic Negative", "score": 0.02}, {"categoryId": 19058723388, "categoryName": "Basic Positive", "score": 0.94}, {"categoryId": 19058723391, "categoryName": "Basic Neutral", "score": 0.05}], "emotionScores": [], "imageInfo": [], "monitorId": 19058723386, "guid": "1084068273380093955", "parentGuid": "1083705784817643521", "engagementType": "RETWEET"}, {"url": "http://twitter.com/ahmetAcer14/status/1084068082673483776", "title": "", "type": "Twitter", "location": "Adana, Mediterranean Region, TUR", "geolocation": {"id": "TUR.Mediterranean Region.Adana", "name": "Adana", "country": "TUR", "state": "Mediterranean Region", "city": "Adana"}, "language": "und", "assignedCategoryId": 0, "assignedEmotionId": 0, "categoryScores": [], "emotionScores": [], "imageInfo": [], "monitorId": 19058723386, "guid": "1084068082673483776", "parentGuid": "1084054837183029249", "engagementType": "RETWEET", "documentsUrls": ["https://www.turkiyegazetesi.com.tr/yazarlar/fuat-ugur/606038.aspx", "https://twitter.com/gencosmansarper/status/1083956208309018625"]}, {"url": "http://twitter.com/xcV44s101gjyOn3/status/1084068442758631425", "title": "", "type": "Twitter", "location": "DZA", "geolocation": {"id": "DZA", "name": "Algeria", "country": "DZA"}, "language": "ar", "assignedCategoryId": 19058723391, "assignedEmotionId": 0, "categoryScores": [{"categoryId": 19058723389, "categoryName": "Basic Negative", "score": 0.02}, {"categoryId": 19058723388, "categoryName": "Basic Positive", "score": 0.35}, {"categoryId": 19058723391, "categoryName": "Basic Neutral", "score": 0.63}], "emotionScores": [], "imageInfo": [], "monitorId": 19058723386, "guid": "1084068442758631425", 
"parentGuid": "1084049278119612421", "engagementType": "RETWEET", "mediaUrls": ["http://pbs.twimg.com/media/DwtRDo0WsAER1-T.jpg", "http://pbs.twimg.com/media/DwtRCGuXQAAm35S.jpg", "http://pbs.twimg.com/media/DwtRC3KW0AA5RA3.jpg", "http://pbs.twimg.com/media/DwtRERfW0AA7sIN.jpg"]}, {"url": "http://twitter.com/Antaakya_Akgenc/status/1084066851242557440", "title": "", "type": "Twitter", "location": "Mediterranean Region, TUR", "geolocation": {"id": "TUR.Mediterranean Region", "name": "Mediterranean Region", "country": "TUR", "state": "Mediterranean Region"}, "language": "tr", "assignedCategoryId": 19058723391, "assignedEmotionId": 0, "categoryScores": [{"categoryId": 19058723389, "categoryName": "Basic Negative", "score": 0.37}, {"categoryId": 19058723388, "categoryName": "Basic Positive", "score": 0.17}, {"categoryId": 19058723391, "categoryName": "Basic Neutral", "score": 0.46}], "emotionScores": [], "imageInfo": [{"url": "http://pbs.twimg.com/media/DwthDPkX0AAxAWJ.jpg", "brands": [], "objects": [{"score": 0.89882, "classId": 3324, "className": "Demonstration/protest"}]}, {"url": "http://pbs.twimg.com/media/DwthDPkXQAAf_0t.jpg", "brands": []}, {"url": "http://pbs.twimg.com/media/DwthDPcXcAETe37.jpg", "brands": [], "objects": [{"score": 0.96125, "classId": 3119, "className": "Sitting"}, {"score": 0.83501, "classId": 3488, "className": "People"}]}], "monitorId": 19058723386, "guid": "1084066851242557440", "mediaUrls": ["http://pbs.twimg.com/media/DwthDPkX0AAxAWJ.jpg", "http://pbs.twimg.com/media/DwthDPkXQAAf_0t.jpg", "http://pbs.twimg.com/media/DwthDPcXcAETe37.jpg"]}, {"url": "http://twitter.com/Charles70067222/status/1084068597453017090", "title": "", "type": "Twitter", "language": "tr", "authorGender": "M", "assignedCategoryId": 19058723391, "assignedEmotionId": 0, "categoryScores": [{"categoryId": 19058723389, "categoryName": "Basic Negative", "score": 0.33}, {"categoryId": 19058723388, "categoryName": "Basic Positive", "score": 0.39}, {"categoryId": 19058723391, 
"categoryName": "Basic Neutral", "score": 0.27}], "emotionScores": [], "imageInfo": [], "monitorId": 19058723386, "guid": "1084068597453017090", "parentGuid": "1084068468910161920", "engagementType": "REPLY"}, {"url": "http://twitter.com/fbardia/status/1084068551059783681", "title": "", "type": "Twitter", "language": "es", "authorGender": "M", "assignedCategoryId": 19058723391, "assignedEmotionId": 0, "categoryScores": [{"categoryId": 19058723389, "categoryName": "Basic Negative", "score": 0.04}, {"categoryId": 19058723388, "categoryName": "Basic Positive", "score": 0.09}, {"categoryId": 19058723391, "categoryName": "Basic Neutral", "score": 0.87}], "emotionScores": [], "imageInfo": [], "monitorId": 19058723386, "guid": "1084068551059783681", "parentGuid": "1083916530784657408", "engagementType": "REPLY"}], "totalPostsAvailable": 16, "status": "success"}]

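When I load this JSON file into a DataFrame, it looks like this: [screenshot]

I need to turn the dictionaries in the "posts" column into a DataFrame. I tried this code: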

import json
import pandas as pd

with open('test.json') as data_file:
    d = json.loads(data_file.read())
df = pd.DataFrame(d[0]['posts'])  # uses only the first element of d
df

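df only picks up the values of the first row, that is, index [0], which contains [84 URLs], as shown here: [screenshot]

What I need:

Is there a way to build a DataFrame from the dictionaries at every index?

Thanks in advance!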

I believe you need json_normalize:

import json
import pandas as pd

with open('test.json') as file:
    j = json.load(file)

# pd.json_normalize is the top-level function since pandas 1.0
df = pd.json_normalize(j, 'posts', ['totalPostsAvailable', 'status'])
print(df)
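To see what the record_path and meta arguments do without needing test.json, here is a minimal sketch on made-up data. The `sample` list and its values are hypothetical; it only mimics the shape of the file in the question (a list of objects, each with its own "posts" list).

```python
import pandas as pd

# Hypothetical, trimmed-down stand-in for test.json: a list of response
# objects, each carrying its own "posts" list plus two top-level fields.
sample = [
    {
        "posts": [
            {"guid": "1", "language": "tr"},
            {"guid": "2", "language": "ar"},
        ],
        "totalPostsAvailable": 2,
        "status": "success",
    },
    {
        "posts": [{"guid": "3", "language": "es"}],
        "totalPostsAvailable": 1,
        "status": "success",
    },
]

# record_path names the list to expand into rows;
# meta names the parent-level fields to repeat on each of those rows.
flat = pd.json_normalize(
    sample,
    record_path="posts",
    meta=["totalPostsAvailable", "status"],
)
print(flat)
```

Every post from every element of the list becomes one row, and the two meta fields are copied from the parent object onto each of its posts.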

Not pretty, but you could try something like this (untested):

df = pd.DataFrame(d[0]['posts'])

# d is a plain list, so index it directly (no .iloc) and loop over len(d)
for i in range(1, len(d)):
    df1 = pd.DataFrame(d[i]['posts'])
    df = pd.concat([df, df1])
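A possible variant of the same idea, sketched on hypothetical data: build one frame per element and concatenate once, rather than growing df inside the loop. Here `d` is a made-up stand-in for the parsed JSON list, and `ignore_index=True` is my addition to get a fresh 0..n-1 index.

```python
import pandas as pd

# Hypothetical stand-in for the parsed JSON: a list of dicts with "posts".
d = [
    {"posts": [{"guid": "1"}, {"guid": "2"}]},
    {"posts": [{"guid": "3"}]},
]

# One DataFrame per element of d, then a single concat at the end.
frames = [pd.DataFrame(item["posts"]) for item in d]
df_all = pd.concat(frames, ignore_index=True)
print(df_all)
```

Concatenating once avoids re-copying the accumulated frame on every iteration, which may matter if the list is long.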

