
Text summarization with Gensim on a short paragraph

I am new to NLP. I am trying to extract summaries of paragraphs using Gensim in Python.

I am running into a problem with a short paragraph: Gensim logs the warnings shown below and does not return a summary.

Here is my code in Python:

import logging
logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)
from gensim.summarization import summarize

text = "short paragraph"
print('Summary:')
print(summarize(text))

It logs the following warnings:

2018-02-01 17:31:47,247 : WARNING : Input text is expected to have at least 10 sentences.
2018-02-01 17:31:47,253 : INFO : adding document #0 to Dictionary(0 unique tokens: [])
2018-02-01 17:31:47,258 : INFO : built Dictionary(52 unique tokens: ['clearli', 'adult', 'chang', 'member', 'visit']...) from 4 documents (total 70 corpus positions)
2018-02-01 17:31:47,262 : WARNING : Input corpus is expected to have at least 10 documents.
2018-02-01 17:31:47,285 : WARNING : Couldn't get relevant sentences.

The output is shown below (only the "Summary:" label is printed, not an actual summary of the short paragraph):

Summary:

Am I missing something? Is there another library that can do this?

Are you really using "short paragraph" as the input? If so, I find it odd that your script isn't throwing a ZeroDivisionError. Gensim's summarize is based on TextRank. As per the docs:

"The input should be a string, and must be longer than INPUT_MIN_LENGTH sentences for the summary to make sense. The text will be split into sentences using the split_sentences method in the summarization.textcleaner module. Note that newlines divide sentences."

With this in mind, have a look at this.
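
The sentence-count requirement quoted above can be checked before calling summarize, so that a short input is returned unchanged instead of producing an empty summary. The following is only a minimal sketch under a few assumptions: count_sentences and safe_summarize are hypothetical helper names that are not part of Gensim, the regex splitter is a rough stand-in for Gensim's own split_sentences, and the threshold of 10 is taken from the warning text in the question. The summarizer is passed in as an argument so the sketch does not depend on a particular Gensim version.

```python
import re

# Threshold taken from the warning "expected to have at least 10 sentences".
MIN_SENTENCES = 10


def count_sentences(text):
    """Rough sentence count (hypothetical helper, not Gensim's splitter).

    Splits after '.', '!' or '?' followed by whitespace, and on newlines,
    since the docs note that newlines divide sentences.
    """
    parts = re.split(r'(?<=[.!?])\s+|\n+', text.strip())
    return len([p for p in parts if p.strip()])


def safe_summarize(text, summarizer, min_sentences=MIN_SENTENCES):
    """Call `summarizer` only when the text is long enough.

    `summarizer` is any callable that takes a string and returns a string,
    for example gensim.summarization.summarize where that module exists.
    """
    if count_sentences(text) < min_sentences:
        # Too short to summarize meaningfully: return the text as-is.
        return text
    summary = summarizer(text)
    # Fall back to the original text if the summarizer returns nothing.
    return summary or text
```

With a Gensim version that still provides gensim.summarization, this could be used as print(safe_summarize(text, summarize)). For the short paragraph in the question, it would print the paragraph itself rather than an empty line.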
