
How do I turn this data into a dataframe? (using pandas)

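I have scraped Donald Trump's tweets from this website, and now I am trying to break the data up and put it into separate columns of a Pandas dataframe. The data looks like this: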

1.Jan 8th 2021 - 10:44:28 AM ESTJan 8 2021 - 10:44am 84k 511k ShowTo all of those who have asked, I will not be going to the Inauguration on January 20th.
2.Jan 8th 2021 - 9:46:38 AM ESTJan 8 2021 - 9:46am 109k 481k ShowThe 75,000,000 great American Patriots who voted for me, AMERICA FIRST, and MAKE AMERICA GREAT AGAIN, will have a GIANT VOICE long into the future. They will not be disrespected or treated unfairly in any way, shape or form!!!
3.Jan 7th 2021 - 7:10:24 PM ESTJan 7 2021 - 7:10pm 155k 629k Showhttps:
4.Jan 6th 2021 - 6:01:04 PM ESTJan 6 2021 - 6:01pm 0 0 DeletedThese are the things and events that happen when a sacred landslide election victory is so unceremoniously & viciously stripped away from great patriots who have been badly & unfairly treated for so long. Go home with love & in peace. Remember this day forever!
5.Jan 6th 2021 - 4:17:24 PM ESTJan 6 2021 - 4:17pm 0 0 Deleted
6.Jan 6th 2021 - 3:13:26 PM ESTJan 6 2021 - 3:13pm 156k 730k ShowI am asking for everyone at the U.S. Capitol to remain peaceful. No violence! Remember, WE are the Party of Law & Order – respect the Law and our great men and women in Blue. Thank you!

I would like to have columns for the date, time, likes, retweets, and text. For reference, the HTML of a single tweet looks like this:

<div class="tweet___2xXtA ttaTweet" data-order="2" data-tweet="{&quot;date&quot;:1610117198000,&quot;favorites&quot;:&quot;480997&quot;,&quot;id&quot;:&quot;1347555316863553542&quot;,&quot;isRetweet&quot;:false,&quot;retweets&quot;:&quot;108844&quot;,&quot;text&quot;:&quot;The 75,000,000 great American Patriots who voted for me, AMERICA FIRST, and MAKE AMERICA GREAT AGAIN, will have a GIANT VOICE long into the future. They will not be disrespected or treated unfairly in any way, shape or form!!!&quot;}"><div class="metadata___k2QYt"><div class="date___1q2oK"><span class="index___2pyvE">2.</span><span class="desktop___2PAhP">Jan 8th 2021 - 9:46:38 AM EST</span><span class="mobile___7uOzX">Jan 8 2021 - 9:46am</span></div><div class="icons___3WEom"><span><div class="icon___3WJKU  "><svg xmlns="http://www.w3.org/2000/svg" width="24px" height="24px" viewBox="0 0 24 24"><path d="M12 2c5.514 0 10 4.486 10 10s-4.486 10-10 10-10-4.486-10-10 4.486-10 10-10zm0-2c-6.627 0-12 5.373-12 12s5.373 12 12 12 12-5.373 12-12-5.373-12-12-12zm-3 11v4h2.953l1.594 2h-6.547v-6h-2l3-4 3 4h-2zm6 2v-4h-2.922l-1.594-2h6.516v6h2l-3 4-3-4h2z"></path></svg></div> 109k</span><span><div class="icon___3WJKU  "><svg xmlns="http://www.w3.org/2000/svg" width="24px" height="24px" viewBox="0 0 24 24"><path d="M17.516 3c2.382 0 4.487 1.564 4.487 4.712 0 4.963-6.528 8.297-10.003 11.935-3.475-3.638-10.002-6.971-10.002-11.934 0-3.055 2.008-4.713 4.487-4.713 3.18 0 4.846 3.644 5.515 5.312.667-1.666 2.333-5.312 5.516-5.312zm0-2c-2.174 0-4.346 1.062-5.516 3.419-1.17-2.357-3.342-3.419-5.515-3.419-3.403 0-6.484 2.39-6.484 6.689 0 7.27 9.903 10.938 11.999 15.311 2.096-4.373 12-8.041 12-15.311 0-4.586-3.414-6.689-6.484-6.689z"></path></svg></div> 481k</span><span><div class="icon___3WJKU  "><svg xmlns="http://www.w3.org/2000/svg" width="24px" height="24px" viewBox="0 0 24 24"><path d="M12 2c5.514 0 10 4.486 10 10s-4.486 10-10 10-10-4.486-10-10 4.486-10 10-10zm0-2c-6.627 0-12 5.373-12 12s5.373 12 12 12 12-5.373 
12-12-5.373-12-12-12zm6.5 8.778c-.441.196-.916.328-1.414.388.509-.305.898-.787 1.083-1.362-.476.282-1.003.487-1.564.597-.448-.479-1.089-.778-1.796-.778-1.59 0-2.758 1.483-2.399 3.023-2.045-.103-3.86-1.083-5.074-2.572-.645 1.106-.334 2.554.762 3.287-.403-.013-.782-.124-1.114-.308-.027 1.14.791 2.207 1.975 2.445-.346.094-.726.116-1.112.042.313.978 1.224 1.689 2.3 1.709-1.037.812-2.34 1.175-3.647 1.021 1.09.699 2.383 1.106 3.773 1.106 4.572 0 7.154-3.861 6.998-7.324.482-.346.899-.78 1.229-1.274z"></path></svg></div> Show</span></div></div><div class="text___k4s4q plain___1AdbJ"><span><span class="">The 75,000,000 great American Patriots who voted for me, AMERICA FIRST, and MAKE AMERICA GREAT AGAIN, will have a GIANT VOICE long into the future. They will not be disrespected or treated unfairly in any way, shape or form!!!</span></span></div></div>

Any ideas on how to do this? I am very new to scraping and to organizing data in Pandas, so any tips would help!

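I would assign the tweets to vectors and then merge those into a dataframe. If you are open to another suggestion, give each tweet an "ID" number so that you can easily refer to it in your code.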
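A minimal sketch of that idea, assuming the scraped HTML is still available rather than only the flattened text. In the HTML shown in the question, each tweet's `div` carries a `data-tweet` attribute that already holds the date (as epoch milliseconds), favorites, retweets, text, and an `id` as JSON, so it may be easier to read those values than to split the display text. The class name `TweetAttrParser`, the variable names, and the column names are illustrative and not from the original post; the sample tweet text is shortened here.

```python
import json
from html.parser import HTMLParser

import pandas as pd


class TweetAttrParser(HTMLParser):
    """Hypothetical helper: collects the JSON held in data-tweet attributes."""

    def __init__(self):
        super().__init__()
        self.records = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if "data-tweet" in attr_map:
            # HTMLParser has already turned &quot; back into plain quotes.
            self.records.append(json.loads(attr_map["data-tweet"]))


# One tweet from the question; the text is shortened for this example.
sample_html = (
    '<div class="tweet___2xXtA ttaTweet" data-order="2" '
    'data-tweet="{&quot;date&quot;:1610117198000,'
    '&quot;favorites&quot;:&quot;480997&quot;,'
    '&quot;id&quot;:&quot;1347555316863553542&quot;,'
    '&quot;isRetweet&quot;:false,'
    '&quot;retweets&quot;:&quot;108844&quot;,'
    '&quot;text&quot;:&quot;The 75,000,000 great American Patriots ...&quot;}">'
    "</div>"
)

parser = TweetAttrParser()
parser.feed(sample_html)

# One list ("vector") per column, then combine them into a dataframe.
ids, timestamps, likes, retweets, texts = [], [], [], [], []
for rec in parser.records:
    ids.append(rec["id"])
    timestamps.append(rec["date"])        # epoch milliseconds
    likes.append(int(rec["favorites"]))   # stored as strings in the JSON
    retweets.append(int(rec["retweets"]))
    texts.append(rec["text"])

df = pd.DataFrame(
    {
        "id": ids,
        "timestamp": timestamps,
        "likes": likes,
        "retweets": retweets,
        "text": texts,
    }
)

# Convert epoch milliseconds to US Eastern time, then split date and time.
stamp = pd.to_datetime(df["timestamp"], unit="ms", utc=True)
stamp = stamp.dt.tz_convert("America/New_York")
df["date"] = stamp.dt.date
df["time"] = stamp.dt.time

# Use the tweet's own id as the index so rows are easy to reference.
df = df.drop(columns="timestamp").set_index("id")
print(df)
```

With the id as the index, a single tweet can be looked up with `df.loc["1347555316863553542"]`. Feeding the whole scraped page to `parser.feed(...)` instead of `sample_html` should yield one row per tweet, provided every tweet element has the same `data-tweet` attribute.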

