
Python: Replacing special characters in a string

I read the artist of a song from its MP3 tag, then create a folder based on that name. The problem I have is when the name contains a special character like 'AC\\DC'. So I wrote this code to deal with that.

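def replace_all(text):
  print "replace_all"  # debug trace
  # Characters to strip from the name; chr(148) is my attempt at umlaut o
  dictionary = {'\\':"", '?':"", '/':"", '...':"", ':':"", chr(148):"o"}

  # Apply every replacement in turn
  for i, j in dictionary.iteritems():
    text = text.replace(i,j)

  return text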

What I am running into now is how to deal with non-English characters, like the umlaut o in Motörhead or Blue Öyster Cult.

As you can see, I tried adding the ASCII-string version of umlaut o at the end of the dictionary, but that failed with

UnicodeDecodeError:  'ascii' codec can't decode byte 0xc3 in position 4: ordinal not in range(128)

I found this code, though I don't understand it.

import unicodedata

def strip_accents(s):
  # Decompose accented characters (NFD), then drop the combining marks ('Mn')
  return ''.join((c for c in unicodedata.normalize('NFD', s) if unicodedata.category(c) != 'Mn'))

It enabled me to remove the accent marks from the path of proposed dir/filenames.
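For anyone else puzzled by that one-liner, here is a small sketch of what it relies on. NFD normalization splits an accented character into its base letter plus a separate combining mark, and the combining mark has the Unicode category 'Mn', which the filter drops. The helper name explain_decomposition is made up for illustration, and the snippet assumes Python 3, where every str is already Unicode.

```python
import unicodedata

def explain_decomposition(s):
    """Return (character, Unicode category) pairs for the NFD form of s."""
    decomposed = unicodedata.normalize('NFD', s)
    return [(c, unicodedata.category(c)) for c in decomposed]

# u'\u00f6' is the single precomposed character "o with diaeresis".
pairs = explain_decomposition(u'\u00f6')

# NFD yields two code points: the base letter 'o' (category 'Ll')
# and U+0308 COMBINING DIAERESIS (category 'Mn').
# Dropping every 'Mn' character leaves only the base letter.
base_only = ''.join(c for c, cat in pairs if cat != 'Mn')
```

Characters that have no decomposition, such as ø or ß, pass through this filter unchanged.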

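I suggest using unicode for both the input text and the characters being replaced. In your example, chr(148) is a one-byte str, not a unicode character: mixing it with unicode text makes Python 2 attempt an implicit ASCII decode, which is what raises the UnicodeDecodeError. Note also that 148 is not the Unicode code point of ö; that is U+00F6, written u'\xf6'.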
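A minimal sketch of that suggestion, assuming Python 3 (where every str is Unicode) and combining it with the accent-stripping approach from the question. The name sanitize_folder_name is hypothetical, and the list of removed characters is copied from the question's dictionary. Under Python 2 the tag bytes would first need decoding, for example with text.decode('utf-8'); the encoding is an assumption, although the 0xc3 byte in the error message is consistent with UTF-8.

```python
import unicodedata

# Characters to delete from a folder name (taken from the question's dictionary).
REPLACEMENTS = {u'\\': u'', u'?': u'', u'/': u'', u'...': u'', u':': u''}

def sanitize_folder_name(text):
    """Remove unwanted characters, then strip accents from a Unicode string."""
    for old, new in REPLACEMENTS.items():
        text = text.replace(old, new)
    # Split accented letters into base letter + combining mark, drop the marks.
    decomposed = unicodedata.normalize('NFD', text)
    return u''.join(c for c in decomposed if unicodedata.category(c) != 'Mn')

folder_a = sanitize_folder_name(u'AC\\DC')
folder_b = sanitize_folder_name(u'Mot\u00f6rhead')
folder_c = sanitize_folder_name(u'Blue \u00d6yster Cult')
```

Because the accents are handled by normalization, the dictionary no longer needs a separate entry for each accented letter.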

