How to generate a meaningful sentence from words only?

I want to generate a sentence from a list of words. I have tried an n-gram model, but it only continues existing text: given an input sentence, it predicts the next words based on the value of n. Which model would help generate a meaningful sentence from only a list of words, and which dataset should be used to train it?

The dataset: just take a dataset consisting of sentences. Tokenize each sentence and shuffle the tokens. The shuffled tokens are your input, and the original sentence is the output. This way you can generate as many samples as you wish:

import nltk
from random import shuffle

def create_input(sentence):
    # Tokenize the sentence, then shuffle the tokens: the shuffled
    # tokens become the model input, the original sentence the target.
    tokens = nltk.word_tokenize(sentence)
    shuffle(tokens)
    return tokens
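
For illustration, a minimal sketch of turning a small corpus into (shuffled tokens, sentence) pairs with the function above (the sample sentences are made up):

sentences = [
    "The dog barked loudly.",
    "She walked to school in the rain.",
]

# Space-joined shuffled tokens as the source, the original sentence
# as the target for a sequence-to-sequence model.
pairs = [(" ".join(create_input(s)), s) for s in sentences]
print(pairs[0])  # e.g. ('loudly barked The . dog', 'The dog barked loudly.')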

More difficult is the model: you could try to fine-tune a BERT model, and I guess it would probably work.
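
Since generation needs a decoder, an encoder-decoder model is the more natural fit here; below is a minimal sketch of one fine-tuning step on such a pair, with t5-base swapped in as an assumption (the answer itself names BERT):

import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# One (shuffled tokens -> original sentence) pair from the scheme above.
src = "loudly barked The . dog"
tgt = "The dog barked loudly."

batch = tokenizer(src, return_tensors="pt")
labels = tokenizer(tgt, return_tensors="pt").input_ids

loss = model(**batch, labels=labels).loss  # seq2seq cross-entropy
loss.backward()
optimizer.step()
optimizer.zero_grad()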

What you want is called lexically constrained beam search in the natural language generation literature. First install the latest transformers:

pip install -q git+https://github.com/huggingface/transformers.git

Then this code can generate a sentence containing the forced words:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")

encoder_input_str = "Generate a sentence:"

# Words that must appear somewhere in the generated output.
force_words = ["I", "school"]

input_ids = tokenizer(encoder_input_str, return_tensors="pt").input_ids
force_words_ids = tokenizer(force_words, add_special_tokens=False).input_ids

outputs = model.generate(
    input_ids,
    force_words_ids=force_words_ids,  # triggers constrained beam search
    num_beams=5,
    num_return_sequences=1,
    no_repeat_ngram_size=1,
    remove_invalid_values=True,
)

print("Output:\n" + 100 * '-')
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

For further information, refer to the Hugging Face documentation on constrained beam search.

If you don't want to use deep learning: index a large collection of sentences, search for the keywords with a retrieval system such as Lucene, and return the sentence closest to your query.
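
A minimal sketch of that idea in Python, using the rank_bm25 package as a lightweight stand-in for Lucene (the package choice and the tiny corpus are assumptions for illustration):

from rank_bm25 import BM25Okapi  # pip install rank-bm25

corpus = [
    "I walked to school this morning.",
    "The lantern cast a strange shape on her hair.",
    "Alligators like to dig through dirt and earth.",
]
tokenized_corpus = [doc.lower().split() for doc in corpus]
bm25 = BM25Okapi(tokenized_corpus)

# The keyword list acts as the query; the best-scoring sentence wins.
query = ["shape", "lantern", "hair"]
best = bm25.get_top_n(query, corpus, n=1)[0]
print(best)  # "The lantern cast a strange shape on her hair."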

You can use GPT-J. It is a free GPT model, and its performance is comparable to GPT-3. The model takes the input you provide and tries to complete it.

How I use GPT-J to generate a sentence from a set of keywords:

Input:

Make a sentence with the following words: earth, dirt, alligator
Sentence: While the alligator is a species which mainly lives in the water, the earth is not uncommon territory and they like to dig through the dirt.

Make a sentence with the following words: shape, lantern, hair
Sentence: 

Output:

Make a sentence with the following words: earth, dirt, alligator
Sentence: While the alligator is a species which mainly lives in the water, the earth is not uncommon territory and they like to dig through the dirt.

Make a sentence with the following words: shape, lantern, hair
Sentence: The hair is so thick on the lantern that it is almost like a shape.

How to tweak it for a certain use case?

Giving an example of what you want in the input (example keywords + sentence) helps GPT understand the structure of the desired output. In my experience, explicitly stating the desired task in the input ("Make a sentence...") also helps it understand what to do.

You can change the complexity of the output sentence by changing the example sentence to something like: An alligator likes to dig dirt out of the earth.

How to use it?

Git repo: https://github.com/kingoflolz/mesh-transformer-jax

As shown in the repo, you can use the model's web demo for testing, or run it yourself in Colab.

Web demo: https://6b.eleuther.ai/

Colab notebook: http://colab.research.google.com/github/kingoflolz/mesh-transformer-jax/blob/master/colab_demo.ipynb

I do not recommend trying to run it locally.
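
For completeness, the few-shot call looks roughly like this through the Hugging Face checkpoint (a sketch; the checkpoint name EleutherAI/gpt-j-6B and the sampling settings are assumptions, and the 6B weights need a large GPU, hence the demo/Colab recommendation above):

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")

# The same few-shot prompt as above: one worked example, then the
# keywords we actually want a sentence for.
prompt = (
    "Make a sentence with the following words: earth, dirt, alligator\n"
    "Sentence: While the alligator is a species which mainly lives in "
    "the water, the earth is not uncommon territory and they like to "
    "dig through the dirt.\n\n"
    "Make a sentence with the following words: shape, lantern, hair\n"
    "Sentence:"
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))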

Thanks to text generation models like GPT-3, GPT-J, and GPT-NeoX, you can generate content out of simple keywords.

For example, let's say you want to generate a product description out of a couple of keywords. You could use few-shot learning and do something like this:

Generate a product description out of keywords.

Keywords: shoes, women, $59
Sentence: Beautiful shoes for women at the price of $59.
###
Keywords: trousers, men, $69
Sentence: Modern trousers for men, for $69 only.
###
Keywords: gloves, winter, $19
Sentence: Amazingly hot gloves for cold winters, at $19.
###
Keywords: t-shirt, men, $39
Sentence:
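
A small sketch of assembling such a few-shot prompt programmatically (the helper below is hypothetical, just to show the pattern); when calling the model, pass "###" as the stop sequence so generation ends after a single description:

# Hypothetical helper that rebuilds the prompt shown above.
EXAMPLES = [
    ("shoes, women, $59", "Beautiful shoes for women at the price of $59."),
    ("trousers, men, $69", "Modern trousers for men, for $69 only."),
    ("gloves, winter, $19", "Amazingly hot gloves for cold winters, at $19."),
]

def build_prompt(keywords):
    header = "Generate a product description out of keywords.\n\n"
    shots = "".join(f"Keywords: {k}\nSentence: {s}\n###\n" for k, s in EXAMPLES)
    return header + shots + f"Keywords: {keywords}\nSentence:"

print(build_prompt("t-shirt, men, $39"))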

I actually wrote an article about this that you might find useful: effectively using GPT-J with few-shot learning.
