
Summarization - TextRank algorithm

What are the advantages of using the TextRank algorithm for summarization over BERT summarization? Even though both can be used as extractive summarization methods, is there any particular advantage for TextRank?

TextRank implementations tend to be lightweight and can run fast even with limited memory, while transformer models such as BERT tend to be rather large and require lots of memory. The TinyML community has done outstanding work on techniques for running DL models within limited resources, but even so TextRank may hold a resource advantage for some use cases.
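To make that concrete, here is a minimal sketch of the extractive TextRank idea, assuming only scikit-learn and networkx. It is illustrative rather than any particular library's implementation, and the heaviest artifact it needs is a TF-IDF matrix rather than pretrained model weights:

```python
# Minimal TextRank-style extractive summarization (illustrative sketch).
# Real implementations add better tokenization, stopword handling, etc.
import networkx as nx
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def textrank_summary(sentences, num_sentences=3):
    # Represent each sentence as a TF-IDF vector.
    tfidf = TfidfVectorizer().fit_transform(sentences)
    # Build a graph whose edge weights are pairwise cosine similarities.
    graph = nx.from_numpy_array(cosine_similarity(tfidf))
    # PageRank scores each sentence by its centrality in that graph.
    scores = nx.pagerank(graph)
    top = sorted(range(len(sentences)), key=scores.get, reverse=True)[:num_sentences]
    # Return the selected sentences in original document order.
    return [sentences[i] for i in sorted(top)]

sentences = [
    "TextRank builds a graph over the sentences of a document.",
    "Edges are weighted by a similarity measure between sentences.",
    "PageRank then scores each sentence by graph centrality.",
    "The top-ranked sentences are extracted as the summary.",
]
print(textrank_summary(sentences, num_sentences=2))
```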

Some TextRank implementations can be "directed" by adding semantic relations, which one can consider a priori structure used to enrich the graph -- or, in some cases, a means of incorporating human-in-the-loop approaches; see the sketch below. Those can provide advantages over supervised learning models which have been trained purely on data. Even so, there are similar efforts for DL in general (e.g., variations on the theme of transfer learning) from which transformers may benefit.
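One simple way to express that kind of a priori structure, assuming a networkx-based implementation, is to bias PageRank's teleport distribution toward nodes a human reviewer has flagged. The node names and weights below are hypothetical, purely for illustration:

```python
import networkx as nx

# A toy sentence graph; edge weights stand in for semantic similarity.
graph = nx.Graph()
graph.add_weighted_edges_from([
    ("s0", "s1", 0.6),
    ("s1", "s2", 0.4),
    ("s2", "s3", 0.7),
    ("s0", "s3", 0.2),
])

# Uniform PageRank: ranking driven purely by the graph itself.
baseline = nx.pagerank(graph, weight="weight")

# Personalized PageRank: up-weight a sentence tied to relations or
# entities flagged a priori (hypothetical weights), nudging the ranking.
bias = {"s0": 0.1, "s1": 0.1, "s2": 0.7, "s3": 0.1}
directed = nx.pagerank(graph, weight="weight", personalization=bias)

print(baseline)
print(directed)  # "s2" rises relative to the uniform baseline.
```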

Another potential benefit is that TextRank approaches tend to be more transparent, while transformer models can be challenging in terms of explainability. There are tools that help greatly, but this concern becomes important in the context of model bias and fairness, data ethics, regulatory compliance, and so on.

Speaking from personal experience as the lead committer for one of the popular TextRank open source implementations, I only use its extractive summarization features for use cases where a "cheap and fast" solution is needed; a usage sketch follows below. Otherwise I'd recommend considering more sophisticated approaches to summarization. For example, I recommend keeping watch on the ongoing research by the author of TextRank, Rada Mihalcea, and her graduate students at the University of Michigan.
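For reference, here is a usage sketch of extractive summarization with PyTextRank as a spaCy pipeline component -- PyTextRank is my assumption here, since the answer does not name the implementation, and the calls reflect its 3.x API, so check the project docs for your installed version:

```python
# Usage sketch for PyTextRank (assumed; the implementation is not named
# above) as a spaCy pipeline component. API per PyTextRank 3.x.
import spacy
import pytextrank  # registers the "textrank" pipeline factory with spaCy

nlp = spacy.load("en_core_web_sm")
nlp.add_pipe("textrank")

text = (
    "TextRank builds a graph over a document. "
    "PageRank scores the nodes of that graph. "
    "Top-ranked sentences become a cheap, fast extractive summary."
)
doc = nlp(text)

# "Cheap and fast" extractive summary: the top-ranked sentences.
for sent in doc._.textrank.summary(limit_phrases=10, limit_sentences=2):
    print(sent)
```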

In terms of comparing "which text summarization methods work better", I'd point toward work on abstractive summarization, particularly recent work by John Bohannon et al. at Primer. For excellent examples, check the "Daily Briefings" of COVID-19 research which their team generates using natural language understanding, knowledge graphs, abstractive summarization, etc. Amy Heineike discusses their approach in "Machines for unlocking the deluge of COVID-19 papers, articles, and conversations".
