I wrote some code that replaces all whitespace in a .txt file (which is basically a novel) with newlines. This put each word on its own line, but it also left some empty lines, which I want to remove. So I am trying to remove all whitespace except the newlines. How might I do that using regex?
First I did this to replace whitespace with newlines:
text= re.sub(r'\s',r'\n',text)
Then I tried this to remove the empty lines, but it does not actually do the job:
text= re.sub(r'(\s)(^\n)',"",text)
You may use:
text = re.sub(r'[^\S\r\n]+', '', text)
The regex pattern [^\S\r\n]+ will match any whitespace character (read: not \S, which means \s) except for newlines and carriage returns.
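As a quick check of the pattern above, here is a minimal sketch on a made-up sample string (the variable names and sample text are illustrative, not from the original post). It also shows one possible alternative, assuming the real goal is one word per line with no empty lines: collapse each run of whitespace into a single newline by using \s+ in the first substitution, so empty lines never appear.

```python
import re

# Hypothetical sample text with a double space, a tab, and a blank line.
sample = "It was  the best\tof times,\n\nit was the worst of times."

# Pattern from the answer: remove whitespace that is neither \r nor \n.
# Spaces and tabs disappear; the existing newlines are left untouched.
no_inline_ws = re.sub(r'[^\S\r\n]+', '', sample)

# Alternative (an assumption about the intent): replace each whole run of
# whitespace with one newline, which avoids producing empty lines at all.
words_per_line = re.sub(r'\s+', '\n', sample.strip())

print(repr(no_inline_ws))
print(words_per_line)
```

Note that the answer's pattern only has an effect if it runs before spaces and tabs are converted to newlines; once every whitespace character is already a newline, there is nothing left for it to match.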