Print Each Sentence On New Line

I have text like this in a text file:

Long sleeve wool coat in black. Breast pocket.

and I want output where every sentence is printed on its own line, something like this:

Long sleeve wool coat in black.

Breast pocket.

I tried the approach from the following question, but used as asked there it gives this output:

Long sleeve wool coat in black.

Breast pocket.

None

I also have to do this for multiple text files: read each original file and overwrite it with the lines broken up this way. But when I try that, only None gets written to the file, not the existing lines.

Any help is appreciated. Thanks in advance.

Try:

s = 'Long sleeve wool coat in black. Breast pocket.'
print(s.replace('. ', '.\n'))
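The thread does not explain where the None comes from. One plausible cause (an assumption, since the asker's code is not shown) is that the return value of print(), which is always None, was printed or written to the file instead of the string itself. A minimal sketch of overwriting a file in place with the replace() approach, where the function name split_sentences_in_file and the file name coat.txt are made up for illustration:

```python
import tempfile
from pathlib import Path


def split_sentences_in_file(path):
    """Rewrite the file at `path` so each sentence sits on its own line."""
    path = Path(path)
    text = path.read_text(encoding="utf-8")
    # Build the new text as a string; print() returns None, so its
    # return value must never be what gets written to the file.
    new_text = text.replace(". ", ".\n")
    path.write_text(new_text, encoding="utf-8")
    return new_text


# Demo on a throwaway file (hypothetical name) so no real data is touched.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "coat.txt"
    demo.write_text("Long sleeve wool coat in black. Breast pocket.", encoding="utf-8")
    result = split_sentences_in_file(demo)
    written = demo.read_text(encoding="utf-8")
    print(written)
```

For several files, the same function could be called in a loop, for example over Path(some_directory).glob("*.txt"), where some_directory stands for wherever the files live.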

Try:

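in_s = 'Long sleeve wool coat in black. Breast pocket.'
in_s += ' '  # make the last sentence end in '. ' like the others
out = in_s.split('. ')[:-1]  # drop the trailing empty string
# join() only puts '.\n' between items, so re-add the final period
print('.\n'.join(out) + '.')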

Explanation:

  • in_s += ' ' adds a space at the end of the string so that it ends in '. ' like any other sentence.
  • ...in_s.split('. ')... splits the text wherever there is a period followed by a space ('. ').
  • ...[:-1] removes the last value, which, if the text ends in a period and a space, will be an empty string ('').
  • '.\n'.join(out) joins the values with a period and a newline before printing. join() only inserts the separator between values, not after the last one.
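The steps above can be traced with their intermediate values. This is a sketch only: the helper name split_on_period_space is made up for illustration, and it assumes the text ends in a period (otherwise the last fragment is dropped by the [:-1] slice).

```python
def split_on_period_space(text):
    """Return the sentences of `text`, splitting on '. ' as described above."""
    padded = text + " "            # 'A. B.' -> 'A. B. '
    parts = padded.split(". ")     # -> ['A', 'B', '']
    sentences = parts[:-1]         # drop the trailing '' (an empty string, not None)
    # Re-attach the period that split() consumed from each sentence.
    return [s + "." for s in sentences]


# The raw split, to show what the last element really is.
parts_demo = "Long sleeve wool coat in black. Breast pocket. ".split(". ")
print(parts_demo)

lines = split_on_period_space("Long sleeve wool coat in black. Breast pocket.")
print("\n".join(lines))
```

Returning a list rather than printing inside the function keeps the result reusable, for example for writing it back to a file.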

Do yourself a favour and use nltk instead of regular expressions or even a simple str.split():

from nltk import sent_tokenize

string = "Long sleeve wool coat in black. Breast pocket. Mr. Donald Trump is the president of the U.S.A."

for sent in sent_tokenize(string):
    print(sent)

Which yields:

Long sleeve wool coat in black.
Breast pocket.
Mr. Donald Trump is the president of the U.S.A.

This approach most likely works even for edge cases such as abbreviations ("Mr.", "U.S.A."), where most of the others won't.
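To see why the simpler period-and-space approaches struggle with such edge cases, here is a small standard-library illustration using part of the same example sentence. It is a sketch of the failure mode only, not of how nltk works internally.

```python
text = "Breast pocket. Mr. Donald Trump is the president of the U.S.A."

# Naive approach: break after every period that is followed by a space.
naive = text.replace(". ", ".\n").split("\n")
print(naive)
```

The abbreviation "Mr." is followed by a space, so the naive split treats it as a sentence of its own and produces three pieces instead of two.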
