繁体   English   中英

将嵌套字典列表转换为 DataFrame Python

[英]Converting Nested Dictionary List to a DataFrame Python

我正在研究 NLP model,它能够解析新闻文章中的问题和后续答案。 但是,为了在最小 RAM 限制内工作,我不得不分解文章中的每个句子并分别解析 QnA 和 append。 根据有多少文章、段落、句子,在处理之后它吐出越来越复杂的混乱。


[[{'answer': 'Meta Platforms ',
   'question': 'What is the name of the company that became two of the most talked-about social media companies in recent months?'},
  {'answer': ' Twitter ',
   'question': 'What was the name of the two most talked-about social media companies in recent months?'},
  {'answer': 'Elon Musk ', 'question': 'Who took a 9.2% stake in Twitter?'}],
 [{'answer': '$54.20 ',
   'question': 'How much did Musk bid to acquire Twitter?'},
  {'answer': ' $44 billion ',
   'question': 'How much did Musk bid to acquire Twitter?'}],
 [{'answer': 'a "poison pill" defense ',
   'question': "What did Twitter initially adopt against Musk's offer?"},
  {'answer': 'failing to provide adequate information about its spam and bot accounts ',
   'question': 'Why did Musk try to back out of the deal?'}],
 [{'answer': 'Twitter ',
   'question': 'Which company has a stock price below Musk\'s "best and final" offer?'},
  {'answer': ' 26% ',
   'question': 'What is Twitter\'s stock price below Musk\'s "best and final" offer?'},
  {'answer': 'Getty Images ',
   'question': "What is the name of the image source that Twitter's stock price remains 26% below Musk's offer?"},
  {'answer': 'February ', 'question': "When did Meta's downfall start?"}],
 [{'answer': 'ByteDance ', 'question': 'Who was TikTok?'},
  {'answer': ' ⁇!--//--> ⁇! ',
   'question': "What was the name of the company's first-quarter report in April?"},
  {'answer': ' ⁇!-- googletag.cmd.push ',
   'question': "What was the name of the company's first-quarter report?"},
  {'answer': 'Sheryl Sandberg ',
   'question': "Who was Meta's chief operating officer?"}],
 [{'answer': 'deteriorating macro environment ',
   'question': 'What caused Snap to reduce its second-quarter guidance in late May?'},
  {'answer': ' deteriorating macro environment ',
   'question': 'What caused Snap to reduce its second-quarter guidance in late May?'},
  {'answer': 'Twitter and Meta ',
   'question': 'Which two companies have been terrible investments over the past 12 months?'}],
 [{'answer': 'Twitter ',
   'question': "What company's stock has tumbled more than 30% during that period?"},
  {'answer': ' Meta ', 'question': "Who's stock has plummeted over 40%?"},
  {'answer': ' 30% ',
   'question': "How much has Twitter's stock tumbled during that period?"}],

这是一个部分示例。 简单地做时:

pd.DataFrame(data)

结果是 13578 行 × 32 列 DataFrame - 所以有很多嵌套 - 嵌套的深度是随机的并且基于所提供的文章。 我尝试修改 flatten-dict 和 deep flatten 来尝试让形状更熟悉,但这两种选择都没有让我更接近。

我需要做的是能够将 output 变成两列问答输出。 我试过在展平时指定列,但它总是会导致错误。 关于如何 go 关于这个普遍未知深度的任何提示?

鉴于您的示例数据:

df = pd.DataFrame(sub for el in your_list_of_lists for sub in el)

这会给你:

|    | answer                                                                  | question                                                                                                          |
|---:|:------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------|
|  0 | Meta Platforms                                                          | What is the name of the company that became two of the most talked-about social media companies in recent months? |
|  1 | Twitter                                                                 | What was the name of the two most talked-about social media companies in recent months?                           |
|  2 | Elon Musk                                                               | Who took a 9.2% stake in Twitter?                                                                                 |
|  3 | $54.20                                                                  | How much did Musk bid to acquire Twitter?                                                                         |
|  4 | $44 billion                                                             | How much did Musk bid to acquire Twitter?                                                                         |
|  5 | a "poison pill" defense                                                 | What did Twitter initially adopt against Musk's offer?                                                            |
|  6 | failing to provide adequate information about its spam and bot accounts | Why did Musk try to back out of the deal?                                                                         |
|  7 | Twitter                                                                 | Which company has a stock price below Musk's "best and final" offer?                                              |
|  8 | 26%                                                                     | What is Twitter's stock price below Musk's "best and final" offer?                                                |
|  9 | Getty Images                                                            | What is the name of the image source that Twitter's stock price remains 26% below Musk's offer?                   |
| 10 | February                                                                | When did Meta's downfall start?                                                                                   |
| 11 | ByteDance                                                               | Who was TikTok?                                                                                                   |
| 12 | ⁇!--//--> ⁇!                                                            | What was the name of the company's first-quarter report in April?                                                 |
| 13 | ⁇!-- googletag.cmd.push                                                 | What was the name of the company's first-quarter report?                                                          |
| 14 | Sheryl Sandberg                                                         | Who was Meta's chief operating officer?                                                                           |
| 15 | deteriorating macro environment                                         | What caused Snap to reduce its second-quarter guidance in late May?                                               |
| 16 | deteriorating macro environment                                         | What caused Snap to reduce its second-quarter guidance in late May?                                               |
| 17 | Twitter and Meta                                                        | Which two companies have been terrible investments over the past 12 months?                                       |
| 18 | Twitter                                                                 | What company's stock has tumbled more than 30% during that period?                                                |
| 19 | Meta                                                                    | Who's stock has plummeted over 40%?                                                                               |
| 20 | 30%                                                                     | How much has Twitter's stock tumbled during that period? 

您可以从主列表的每个子列表构建一个 dataframe 并将它们全部连接起来( input_arr是您的列表列表):

df = pd.concat([pd.DataFrame(x) for x in input_arr]).reset_index(drop=True)
df = df[['question', 'answer']]
print(df)

Output:

                                             question                                             answer
0   What is the name of the company that became tw...                                    Meta Platforms 
1   What was the name of the two most talked-about...                                           Twitter 
2                   Who took a 9.2% stake in Twitter?                                         Elon Musk 
3           How much did Musk bid to acquire Twitter?                                            $54.20 
4           How much did Musk bid to acquire Twitter?                                       $44 billion 
5   What did Twitter initially adopt against Musk'...                           a "poison pill" defense 
6           Why did Musk try to back out of the deal?  failing to provide adequate information about ...
7   Which company has a stock price below Musk's "...                                           Twitter 
8   What is Twitter's stock price below Musk's "be...                                               26% 
9   What is the name of the image source that Twit...                                      Getty Images 
10                    When did Meta's downfall start?                                          February 
11                                    Who was TikTok?                                         ByteDance 
12  What was the name of the company's first-quart...                                      ⁇!--//--> ⁇! 
13  What was the name of the company's first-quart...                           ⁇!-- googletag.cmd.push 
14            Who was Meta's chief operating officer?                                   Sheryl Sandberg 
15  What caused Snap to reduce its second-quarter ...                   deteriorating macro environment 
16  What caused Snap to reduce its second-quarter ...                   deteriorating macro environment 
17  Which two companies have been terrible investm...                                  Twitter and Meta 
18  What company's stock has tumbled more than 30%...                                           Twitter 
19                Who's stock has plummeted over 40%?                                              Meta 
20  How much has Twitter's stock tumbled during th...                                               30% 

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM