Android: extract sentences from string text

I'm working on a recipe application for Android. My recipe source (a REST API) is basically a service that lets users manually submit recipes to the site; those submissions are archived and returned to the client as JSON.

Basically, the "Directions" portion of the JSON just returns one big blob of text as the instructions. It is not formatted except for line breaks.

To give you an idea, here's an example of some returned instructions:

Place cast iron skillet in oven and heat oven to 500 degrees. Bring steak(s) to room temperature. Season both sides with salt, pepper, garlic powder, and cayenne pepper. When oven reaches temperature, remove pan and place on range over high heat. Immediately place steak in the middle of hot, dry pan. Cook 30 seconds without moving. Turn with tongs and cook another 30 seconds, then put the pan straight into the oven for 2 minutes. Flip steak and cook for another 2 minutes. (This time is for medium rare steaks. If you prefer medium, add a minute to both of the oven turns.) Remove steak from pan, cover loosely with foil, and rest for 2 minutes. Serve whole or slice thin and fan onto plate.

I need to break this long string of text into sentences so that I can display each one in a custom view that I've designed.

Does anyone know of any libraries that I can use to accomplish this? I understand that English language detection may be difficult, and after some research I'm learning that this is not really doable with regular expressions, so I'm looking at my options at this point. If anyone knows of anything, please feel free to share! Thanks as usual, guys.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Prints each sentence of the sample directions on its own line.
private void paragraphToSentences() {
    String paragraph = "Place cast iron skillet in oven and heat oven to 500 degrees. Bring steak(s) to room temperature. Season both sides with salt, pepper, garlic powder, and cayenne pepper. When oven reaches temperature, remove pan and place on range over high heat. Immediately place steak in the middle of hot, dry pan. Cook 30 seconds without moving. Turn with tongs and cook another 30 seconds, then put the pan straight into the oven for 2 minutes. Flip steak and cook for another 2 minutes. (This time is for medium rare steaks. If you prefer medium, add a minute to both of the oven turns.) Remove steak from pan, cover loosely with foil, and rest for 2 minutes. Serve whole or slice thin and fan onto plate.";
    // A sentence ends at '.', '!' or '?' (plus an optional closing quote) followed by whitespace or end of input.
    // Limitation: a terminator followed by ')' is not a boundary, so "...turns.) Remove steak..." stays in one match.
    // Pattern.COMMENTS was dropped: the pattern contains no literal whitespace or '#', so the flag had no effect.
    Pattern sentencePattern = Pattern.compile("[^.!?\\s][^.!?]*(?:[.!?](?!['\"]?\\s|$)[^.!?]*)*[.!?]?['\"]?(?=\\s|$)", Pattern.MULTILINE);
    Matcher matcher = sentencePattern.matcher(paragraph);
    while (matcher.find()) {
        System.out.println(matcher.group());
    }
}
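
Since the question asks about library options, one alternative to the regex above is java.text.BreakIterator, which is part of the standard Java class library and is also available on Android. The following is only a minimal sketch, not something from the original post: the class name SentenceSplitter and the helper splitSentences are hypothetical, Locale.US is an assumption about the recipe language, and the exact boundaries may differ between runtimes on unusual input such as abbreviations.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class SentenceSplitter {

    // Hypothetical helper (not from the original post): returns the
    // sentences found in the text, each with surrounding whitespace trimmed.
    static List<String> splitSentences(String text) {
        List<String> sentences = new ArrayList<>();
        // Locale.US is an assumption; use the locale that matches the recipe text.
        BreakIterator iterator = BreakIterator.getSentenceInstance(Locale.US);
        iterator.setText(text);

        // Walk consecutive boundary pairs [start, end) until the iterator is exhausted.
        int start = iterator.first();
        for (int end = iterator.next(); end != BreakIterator.DONE; start = end, end = iterator.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        String directions = "Cook 30 seconds without moving. "
                + "Flip steak and cook for another 2 minutes.";
        // Each element could be bound to one custom view instead of printed.
        for (String sentence : splitSentences(directions)) {
            System.out.println(sentence);
        }
    }
}
```

Returning a list rather than printing inside the loop keeps the splitting separate from the display code, so the same helper can feed an adapter or any other view logic.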


 