
How to ignore white space in between words but not other characters?

I want to rename a long list of files to make the names more searchable. The names were auto-generated, so there are some odd spacing issues. I wrote a little Python script that does most of what I want, but it also removes the white space between words, which I want to keep. For instance, I have two names:

0 130 — HG — 1500 — 12"  (Page 1 of 2)  
01 30 — HD LOW POINT DRAIN  

They should read:

0130-HG-1500-12"  
0130-HD LOW POINT DRAIN  

My code so far:

import os
import re

for filename in os.listdir("."):
    if not filename.endswith(".py"):
        os.replace(filename, re.sub("[(].*?[)]", "",  # Remove anything between ()
                                    "".join(filename.split()  # Remove any whitespaces
                                            ).replace("—", "-")))  # Replace Em dash with hyphen  

Everything is working, except I can't figure out how to strip all white space other than the spaces between words.

If by "words" you mean "strings made up of letters" then

re.sub('((?<=[^a-zA-Z]) | (?=[^a-zA-Z]))', '', filename)

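will do the trick. In plain language: "replace every space that comes directly after or directly before a non-letter character with nothing". Output (the session below uses the shorter class [^A-Z], which behaves the same for these all-uppercase names):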

In [24]: re.sub('((?<=[^A-Z]) | (?=[^A-Z]))', '', '01 30 — HD LOW POINT DRAIN  ')
Out[24]: '0130—HD LOW POINT DRAIN'

In [25]: re.sub('((?<=[^A-Z]) | (?=[^A-Z]))', '', '0 130 — HG — 1500 — 12"')
Out[25]: '0130—HG—1500—12"'
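
To get all the way to the names in the question, the same pattern can be combined with the parenthesis removal and the em dash replacement from the original script. What follows is only a sketch under a few assumptions: the helper names `clean_name` and `rename_all` are made up for illustration, the parentheses are removed first so the spaces around them get cleaned up too, and a final `strip()` is added as a safety net.

```python
import os
import re

# Spaces directly after or directly before a non-letter character
# (the pattern from the answer above).
EDGE_SPACE = re.compile(r"(?<=[^a-zA-Z]) | (?=[^a-zA-Z])")
# Anything between parentheses, e.g. "(Page 1 of 2)".
PARENS = re.compile(r"\(.*?\)")


def clean_name(name: str) -> str:
    """Hypothetical helper: tidy one name, keeping spaces between letters."""
    name = PARENS.sub("", name)      # remove "(...)" first
    name = name.replace("—", "-")    # replace em dash with hyphen
    name = EDGE_SPACE.sub("", name)  # drop spaces next to non-letters
    return name.strip()


def rename_all(directory: str = ".") -> None:
    """Hypothetical helper: rename every non-.py file in `directory`."""
    for filename in os.listdir(directory):
        if filename.endswith(".py"):
            continue
        new_name = clean_name(filename)
        if new_name != filename:
            os.replace(os.path.join(directory, filename),
                       os.path.join(directory, new_name))


if __name__ == "__main__":
    print(clean_name('0 130 — HG — 1500 — 12"  (Page 1 of 2)  '))
    print(clean_name("01 30 — HD LOW POINT DRAIN  "))
```

Note that `os.replace` silently overwrites an existing file, so if two original names could clean up to the same result, it is worth printing the old and new names first and checking for collisions before calling `rename_all`.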
