Can AWS Comprehend be used to split documents into sentences?

I have started trying AWS Comprehend. One thing I noticed is that the sentences in a document affect the sentiment analysis and entity extraction results, especially when the document contains mixed-sentiment sentences or sentences that are not capitalized. So splitting the sentences correctly is an important step. However, I can't find an API in Comprehend that splits a document into sentences. Is that because Comprehend doesn't have such a step? If it does, could someone point out how to obtain the splitting results?

BTW, I tried Stanford CoreNLP and Google Cloud Natural Language. They both make mistakes in some cases.

Here is what I did: I added '>>>' as a separator between reviews when I was scraping them, then I used this code:

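import boto3

comprehend = boto3.client("comprehend")  # region and credentials come from the AWS config
# Split on the '>>>' separator added while scraping; skip empty pieces.
reviews = [r.strip() for r in all_reviews_as_text.split('>>>') if r.strip()]
responses = []
for review in reviews:
    response = comprehend.detect_sentiment(Text=review, LanguageCode="en")
    responses.append(response)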
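
If you want results per sentence rather than per review, one option is to do the splitting yourself before calling Comprehend. The following is only a rough sketch, not something Comprehend provides: `split_sentences` and `sentence_sentiments` are hypothetical helper names, and the splitting rule (break after `.`, `!` or `?` followed by whitespace) is a naive assumption that will get abbreviations such as "Dr. Smith" wrong. It does not depend on capitalization, though. The sentences are sent with `batch_detect_sentiment` in groups of at most 25, which is the documented batch limit.

```python
import re

# Naive rule, assumed for illustration only: a sentence ends at '.', '!' or '?'
# followed by whitespace. Abbreviations like "Dr. Smith" will be split wrongly.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text):
    """Split text into sentences using the naive rule above."""
    return [s.strip() for s in _SENTENCE_END.split(text) if s.strip()]


def sentence_sentiments(client, text, language_code="en", batch_size=25):
    """Return (sentence, sentiment) pairs for every sentence in text.

    `client` is expected to behave like boto3.client("comprehend").
    """
    sentences = split_sentences(text)
    results = []
    for start in range(0, len(sentences), batch_size):
        batch = sentences[start:start + batch_size]
        response = client.batch_detect_sentiment(
            TextList=batch, LanguageCode=language_code
        )
        # Each result carries the index of its input within the batch.
        for item in response["ResultList"]:
            results.append((batch[item["Index"]], item["Sentiment"]))
    return results
```

With the code above, this could be called as `sentence_sentiments(comprehend, review)` for each review. Anything the batch call rejects is reported in the response's `ErrorList`, which this sketch ignores.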
