I want to put each sentence on a new line. A sentence ends with a period (.). I tried the following code:
import re
text = 'This text has (15.16 +/- 1.01). And it also has 20.1 km(3) during 4/2002- and 1/2018'
text = re.sub(r'\.', '\n', text)
When I try to put each sentence on a new line by replacing . with \n, I get four lines instead of two because of the decimal points. I do not need to keep the numbers. I just want to keep the alphabetic characters and remove everything else:
This text has
And it also has during and
Any solution?
>>> import re
>>> text='This text has 15.16. And it also has 64.6190. twent one guns. hi. 16. 40.5'
>>> print(re.sub(r'[\d]*\.(?:[\d]*[\.]*[\ ]*)*', '\n', text))
#OUTPUT
This text has
And it also has
twent one guns
hi
Edit: Do you also want to eliminate +/-?
>>> text = 'This text has 15.16 +/- 1.01. And it also has 64.6190. hi. the tommy is bad. + one-two is negative one.'
>>> print(re.sub(r'[\d]*\.(?:[\d]*[\.]*[\ ]*)*|[\ ]*[+\-\/]+[\ ]*', '\n', text))
#OUTPUT
This text has
And it also has
hi
the tommy is bad
one
two is negative one
Edit: This one is even simpler:
text=re.sub(r'(\d*\.)+', r'\n',text)
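As a quick check of the shorter pattern, here is a small sketch. The helper name strip_numbered_periods and the sample strings are made up for illustration. It suggests one caveat: a decimal in the middle of a sentence is cut at its point, so the fractional digits are left behind at the start of the next line.

```python
import re

def strip_numbered_periods(text):
    # Hypothetical helper: replace each run of "<optional digits>." groups
    # with a single newline, as the pattern in the edit above does.
    return re.sub(r'(\d*\.)+', '\n', text)

# A decimal right before the sentence-ending period is removed completely.
print(repr(strip_numbered_periods('This text has 15.16. And it also has 64.6190.')))

# A decimal in mid-sentence is split at its point; the trailing "1" survives.
print(repr(strip_numbered_periods('It has 20.1 km. Done.')))
```

If mid-sentence decimals matter, the pattern probably needs a lookahead or a separate cleaning step.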
You can adjust the regex pattern as you wish. For example:
import re
text='This text has 15.16. And it also has 64.6190.'
text=re.sub(r'\d*\.\d*\.', r'\n',text)
print(text)
output:
This text has
And it also has
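Explanation: look for any number of digits followed by a dot, then again any number of digits followed by a dot, and replace the whole match with a newline. Note that this only matches a sentence that ends in a decimal number; a plain period after a word is left alone.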
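For the output the question actually asks for (one sentence per line, alphabetic words only), a two-step approach may be easier to reason about than a single substitution. The sketch below is one possible reading of the requirement, not a definitive answer. It assumes a sentence ends at a period followed by whitespace or the end of the text, and that any whitespace-separated token containing a non-letter is dropped whole. The function name split_sentences_letters_only is hypothetical.

```python
import re

def split_sentences_letters_only(text):
    # Assumption: a sentence-ending period is followed by whitespace or the
    # end of the string, so the point in "15.16" does not split a sentence.
    parts = re.split(r'\.(?=\s|$)', text)
    lines = []
    for part in parts:
        # Assumption: keep only tokens made purely of letters; tokens such as
        # "(15.16", "+/-", "km(3)" or "4/2002-" are discarded entirely.
        words = [tok for tok in part.split() if tok.isalpha()]
        if words:
            lines.append(' '.join(words))
    return '\n'.join(lines)

text = 'This text has (15.16 +/- 1.01). And it also has 20.1 km(3) during 4/2002- and 1/2018'
print(split_sentences_letters_only(text))
```

One trade-off of this filter is that a word with attached punctuation, such as "has," with a trailing comma, would also be dropped. Stripping punctuation from each token before the isalpha() check would avoid that.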