I am new to NLP. I am trying to extract summaries of paragraphs using Gensim in Python.
I have a problem with short paragraphs: summarize logs the warnings shown below and does not return a summary.
Here is my Python code:
import logging
# Show gensim's INFO and WARNING messages on the console.
logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)
from gensim.summarization import summarize
text = "short paragraph"  # placeholder for the actual short input paragraph
print('Summary:')
print(summarize(text))
It produces the following warnings:
2018-02-01 17:31:47,247 : WARNING : Input text is expected to have at least 10 sentences.
2018-02-01 17:31:47,253 : INFO : adding document #0 to Dictionary(0 unique tokens: [])
2018-02-01 17:31:47,258 : INFO : built Dictionary(52 unique tokens: ['clearli', 'adult', 'chang', 'member', 'visit']...) from 4 documents (total 70 corpus positions)
2018-02-01 17:31:47,262 : WARNING : Input corpus is expected to have at least 10 documents.
2018-02-01 17:31:47,285 : WARNING : Couldn't get relevant sentences.
The output is below (only the "Summary:" label is printed, not an actual summary of the short paragraph):
Summary:
Am I missing something? Is there another library that can do this?
Are you really using "short paragraph" as the input? If so, I find it odd that your script isn't throwing a ZeroDivisionError. Gensim's summarize is based on TextRank. As per the docs:
"The input should be a string, and must be longer than INPUT_MIN_LENGTH sentences for the summary to make sense. The text will be split into sentences using the split_sentences method in the summarization.texcleaner module. Note that newlines divide sentences."
With this in mind, have a look at this.
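Given the minimum-length requirement quoted above, one possible workaround is to check the sentence count before calling the summarizer and fall back to the original text when it is too short. The following is only a minimal sketch: `safe_summarize`, `split_sentences` and `MIN_SENTENCES` are hypothetical names, the regex splitter is a rough stand-in for gensim's own sentence splitting, and the threshold of 10 is taken from the warning in the log.

```python
import re

# Assumption: threshold taken from the log message
# "Input text is expected to have at least 10 sentences."
MIN_SENTENCES = 10


def split_sentences(text):
    """Naive splitter (not gensim's): break on newlines and on
    whitespace that follows '.', '!' or '?'."""
    parts = re.split(r'(?<=[.!?])\s+|\n+', text.strip())
    return [p for p in parts if p]


def safe_summarize(text, summarizer, min_sentences=MIN_SENTENCES):
    """Call `summarizer` only when the text looks long enough.

    Returns the text unchanged if it has fewer than `min_sentences`
    sentences, or if the summarizer returns an empty result.
    """
    if len(split_sentences(text)) < min_sentences:
        return text
    return summarizer(text) or text
```

With a gensim version that still ships gensim.summarization (it was removed in gensim 4.0), this could be called as safe_summarize(text, summarize). For a paragraph of only a few sentences, returning the paragraph itself is arguably the most sensible "summary" anyway.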