
Question generation in question answering + NLP

I have a dataset of around 3K to 4K Excel files, each with roughly 12K records. The records are a mix of FAQs, email conversations, blog comments, chats, etc.

The best part is that it has two columns: one for Questions and another for Answers.

One sample record from an Excel file (note: I can't expose client data, so I created a single record of my own to explain the scenario):

e.g. Sample Question - What are IIT colleges in India?

Sample Answer - The Indian Institutes of Technology (IITs) are autonomous public institutes of higher education, located in India. They are governed by the Institutes of Technology Act, 1961 which has declared them as institutions of national importance and lays down their powers, duties, and framework for governance. The Institutes of Technology Act, 1961 lists twenty-three institutes. Each IIT is autonomous, linked to the others through a common council (IIT Council), which oversees their administration. The Minister of Human Resource Development is the ex officio Chairperson of the IIT Council. As of 2018, the total number of seats for undergraduate programs in all IITs is 11,279.

The client's requirement is:

Generate as many simple questions as possible from the paragraph (the sample answer above), along with their answers, and append them to the same Excel file.

(He will then process each Excel file further by feeding it to a tool of his that generates chat-bot stories.)

e.g.

  • Are IITs autonomous? (Answer: Yes)
  • What governs the IITs? (Answer: The Institutes of Technology Act, 1961)
  • In which country are the IITs located? (Answer: India)
  • How many institutes does the Institutes of Technology Act, 1961 list? (Answer: twenty-three) etc.

I can generate the answers using AllenAI, but I'm not sure how to generate the questions. I tried a repo, but it looks incomplete and needs more work; I'm new to NLP and ML, so I don't know how to make those changes.

Any help on generating questions for question answering?

Can I build a model on top of an existing linguistic model, such as spaCy's models, to extract entities and then generate the questions?

Treat this like a machine translation setup: instead of using source and target languages as the input and output respectively, use your passages as the input and your questions as the output.
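As a rough illustration of that idea, the existing Question/Answer rows could be turned into training pairs where the passage is the "source" and the question is the "target". This is only a minimal sketch: the function names `build_seq2seq_pairs` and `write_parallel_files` are made up for this example, the rows are assumed to have already been read out of the Excel files as `(question, answer)` tuples, and the line-aligned two-file layout is just a common convention among translation toolkits, not something a specific tool is known to require here.

```python
from typing import Iterable, List, Optional, Tuple


def build_seq2seq_pairs(
    rows: Iterable[Tuple[Optional[str], Optional[str]]]
) -> List[Tuple[str, str]]:
    """Turn (question, answer_passage) rows into (source, target) pairs.

    The passage plays the role of the source "language" and the question
    plays the role of the target "language".
    """
    pairs = []
    for question, passage in rows:
        # Collapse newlines and repeated spaces so each example fits on one line.
        source = " ".join((passage or "").split())
        target = " ".join((question or "").split())
        # Skip rows where either side is empty (e.g. blank Excel cells).
        if source and target:
            pairs.append((source, target))
    return pairs


def write_parallel_files(
    pairs: List[Tuple[str, str]], src_path: str, tgt_path: str
) -> None:
    """Write line-aligned source/target files: line i of one matches line i of the other."""
    with open(src_path, "w", encoding="utf-8") as src, \
            open(tgt_path, "w", encoding="utf-8") as tgt:
        for source, target in pairs:
            src.write(source + "\n")
            tgt.write(target + "\n")


# Hypothetical rows, standing in for what would be read from one Excel file.
sample_rows = [
    ("What are IIT colleges in India?",
     "The Indian Institutes of Technology (IITs) are autonomous public\n"
     "institutes of higher education, located in India."),
    (None, "A passage with no question is skipped."),
]
sample_pairs = build_seq2seq_pairs(sample_rows)
```

A sequence-to-sequence model trained on such pairs would then be asked to produce a question for a new passage. Note that the existing rows pair each long answer with a single broad question, so getting many short, specific questions per paragraph would likely need training pairs at that finer granularity.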


 