
Python: Using the period to put each sentence on a new line when the text contains decimal points

I want to put each sentence on its own line. A sentence ends with a period (.). I tried the following code:

import re
text = 'This text has (15.16 +/- 1.01). And it also has 20.1 km(3) during 4/2002- and 1/2018'
text = re.sub(r'\.', '\n', text)

When I try to put each sentence on a new line by replacing . with \n, the text is also split at the decimal points, so I get five lines instead of two. I do not need to keep the numbers. I just want to keep the alphabetic words and remove everything else, like this:

This text has
And it also has during and

Any solution?

>>> import re

>>> text='This text has 15.16. And it also has 64.6190. twent one guns. hi. 16. 40.5'

>>> print(re.sub(r'[\d]*\.(?:[\d]*[\.]*[\ ]*)*', '\n', text))


#OUTPUT
This text has 
And it also has 
twent one guns
hi

Edit: Do you also want to eliminate +/-?

>>> text = 'This text has 15.16 +/- 1.01. And it also has 64.6190. hi. the tommy is bad. + one-two is negative one.'

>>> print(re.sub(r'[\d]*\.(?:[\d]*[\.]*[\ ]*)*|[\ ]*[+\-\/]+[\ ]*', '\n', text))



#OUTPUT
This text has 


And it also has 
hi
the tommy is bad

one
two is negative one
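The blank lines in that output appear because adjacent matches (a number, then +/-, then another number) are each replaced by their own newline. If they are unwanted, one possible follow-up step is to drop empty lines after the substitution. This is only a sketch; the names pattern, raw and lines are illustrative and not part of the answer above.

```python
import re

text = ('This text has 15.16 +/- 1.01. And it also has 64.6190. hi. '
        'the tommy is bad. + one-two is negative one.')

# Same pattern as above, written as a raw string.
pattern = r'[\d]*\.(?:[\d]*[\.]*[\ ]*)*|[\ ]*[+\-\/]+[\ ]*'
raw = re.sub(pattern, '\n', text)

# Strip surrounding whitespace and keep only the non-empty lines.
lines = [line.strip() for line in raw.splitlines() if line.strip()]
print('\n'.join(lines))
```

This prints six lines, one per remaining fragment, with no blank lines in between.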

Edit: This one is even simpler:

text=re.sub(r'(\d*\.)+', r'\n',text)
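The simpler pattern does not behave exactly like the longer one. As a quick check, assuming the same sample text as in the first example, the sketch below prints the repr of the result so that leftover spaces and digits are visible.

```python
import re

text = 'This text has 15.16. And it also has 64.6190. twent one guns. hi. 16. 40.5'

# (\d*\.)+ consumes runs of "digits followed by a dot", but nothing after them.
result = re.sub(r'(\d*\.)+', '\n', text)
print(repr(result))
# 'This text has \n And it also has \n twent one guns\n hi\n \n \n5'
```

The space after each period is kept, so most lines start with a blank, and the trailing 5 of 40.5 survives because no dot follows it.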

You can adjust the regex pattern as you wish. For example:

import re 
text='This text has 15.16. And it also has 64.6190.'
text=re.sub(r'\d*\.\d*\.', r'\n',text)

print(text)

output:

This text has 
 And it also has 

Explanation: the pattern looks for any number of digits followed by a dot, then again any number of digits followed by a dot, and replaces the whole match with a newline.
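The examples in the answers use simplified strings rather than the text from the question, which also contains parentheses, slashes and a unit such as km(3). A different approach, sketched here under the assumption that a sentence-ending period is a dot not followed by a digit, is to split on those dots and then keep only the tokens made purely of letters. The function name sentences_to_lines is hypothetical.

```python
import re

def sentences_to_lines(text):
    # Split on a dot that is NOT followed by a digit, so "15.16" stays whole.
    sentences = re.split(r'\.(?!\d)', text)
    lines = []
    for sentence in sentences:
        # Keep only whitespace-separated tokens that consist purely of letters.
        words = [word for word in sentence.split() if word.isalpha()]
        if words:
            lines.append(' '.join(words))
    return '\n'.join(lines)

text = ('This text has (15.16 +/- 1.01). And it also has 20.1 km(3) '
        'during 4/2002- and 1/2018')
print(sentences_to_lines(text))
# This text has
# And it also has during and
```

Because whole tokens are filtered, a word glued to punctuation (for example "has," with a comma) would be dropped as well, so this may need adjusting for other inputs.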
