
split() in Perl

How do I divide text into sentences? I think I should use split() and print the pieces, but I don't know how. I have just started learning Perl.

My text:

A block of text is a stack of line boxes. In the case of 'left', 'right' and 'center', this property specifies how the inline-level boxes within each line box align with respect to the line box's left and right sides; alignment is not with respect to the viewport. In the case of 'justify', this property specifies that the inline-level boxes are to be made flush with both sides of the line box if possible, by expanding or contracting the contents of inline boxes, else aligned as for the initial value. See also 'letter-spacing' and 'word-spacing'.

If this is not actually homework, I would just use one of the CPAN modules that handle this problem, such as Lingua::Sentence, which appears to be actively developed.

One way to do it is to use split in combination with a look-behind.

 perl -nlwe 'print for split /(?<=\S[.!?])\s+/' < data.txt

This works for your sample data.

The idea is to split on the whitespace that separates sentences, which removes it. The end of a sentence is defined as one of . ! ? preceded by a non-whitespace character; the look-behind checks for that ending without consuming it, so the punctuation stays with its sentence. Tweak as desired.
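The same split can be written as a small script instead of a one-liner. This is only a sketch: the helper name split_sentences and the shortened sample string are assumptions made for illustration, not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper: split a paragraph into sentences.
# A sentence end is one of . ! ? preceded by a non-whitespace character;
# the whitespace that follows is the separator that split removes.
sub split_sentences {
    my ($text) = @_;
    return split /(?<=\S[.!?])\s+/, $text;
}

# Shortened sample text, assumed for the demonstration.
my $text = "A block of text is a stack of line boxes. "
         . "Alignment is not with respect to the viewport. "
         . "See also 'letter-spacing' and 'word-spacing'.";

# Print one sentence per line.
print "$_\n" for split_sentences($text);
```

Note that this pattern also splits after abbreviations such as "e.g." when they are followed by whitespace, which is one reason a dedicated module may be preferable for real text.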

Try:

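use strict; use warnings;
my $paragraph = "Text. Text";
# Split on a literal period plus any following whitespace; the periods are dropped.
my @sentences = split /\.\s*/, $paragraph;
print "$_\n" for @sentences;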
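Keep in mind that split removes whatever the pattern matches, so splitting on the period itself discards it. The following sketch compares that with a look-behind split; the sample string and the variable names are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Sample string assumed for the comparison.
my $paragraph = "First sentence. Second sentence.";

# The pattern matches the periods, so they are removed from the result,
# and the second piece keeps its leading space.
my @without_dots = split /\./, $paragraph;

# The look-behind only checks for a period; the pattern itself matches
# just the whitespace, so the periods stay with their sentences.
my @with_dots = split /(?<=\.)\s+/, $paragraph;

# Brackets make leading and trailing whitespace visible.
print "[$_]\n" for @without_dots;
print "[$_]\n" for @with_dots;
```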


 