The following clean_sheet_title function uses INVALID_TITLE_CHARS and INVALID_TITLE_CHAR_MAP to strip out invalid characters, and limits the title to 31 characters:
# This strips characters that are invalid to Excel
INVALID_TITLE_CHARS = ["]", "[", "*", ":", "?", "/", "\\", "'"]
INVALID_TITLE_CHAR_MAP = {ord(x): "" for x in INVALID_TITLE_CHARS}
# How would I remove strings, as well as the characters from INVALID_TITLE_CHARS?
INVALID_TITLE_NAMES = ["zz_ FeeRelationship", " Family"]
def clean_sheet_title(title):
    title = title or ""
    title = title.strip()
    title = title.translate(INVALID_TITLE_CHAR_MAP)
    return title[:31]
My question is: how would I expand this to also remove the strings in the INVALID_TITLE_NAMES list?
What I've tried:
I have tried making the following update to clean_sheet_title, however this makes no difference to title:
INVALID_TITLE_CHARS = ["]", "[", "*", ":", "?", "/", "\\", "'"]
INVALID_TITLE_CHAR_MAP = {ord(x): "" for x in INVALID_TITLE_CHARS}
INVALID_TITLE_NAMES = ["zz_ FeeRelationship", "Family"]
def clean_sheet_title(title):
    title = title or ""
    title = title.strip()
    title = title.translate(INVALID_TITLE_CHAR_MAP, "")
    for name in INVALID_TITLE_NAMES:
        title = title.replace(name, "")
    return title[:31]
Examples:
Current function ability - if title is Courtenay:Family, clean_sheet_title currently returns CourtenayFamily (the colon is removed).
Desired function ability - sometimes title is prefixed or suffixed with either zz_ FeeRelationship or Family. In both cases these strings should be dropped, e.g. zz_ FeeRelationship Courtenay:Family would become Courtenay.
Try this:
for name in INVALID_TITLE_NAMES:
    title = title.replace(name, "")
Is that the result you are trying to achieve? It should replace each invalid name in title with an empty string.
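For reference, here is a minimal sketch of that loop folded back into the original function. It assumes Python 3, where str.translate accepts only the translation table (the extra "" argument in the question's attempt would raise a TypeError there), and it moves strip() to the end so that whitespace left behind by the removals is trimmed as well. The ordering is a choice made for this sketch, not something taken from the question.

```python
INVALID_TITLE_CHARS = ["]", "[", "*", ":", "?", "/", "\\", "'"]
INVALID_TITLE_CHAR_MAP = {ord(x): "" for x in INVALID_TITLE_CHARS}
INVALID_TITLE_NAMES = ["zz_ FeeRelationship", "Family"]


def clean_sheet_title(title):
    title = title or ""
    # Drop the unwanted names first, while they are still intact.
    for name in INVALID_TITLE_NAMES:
        title = title.replace(name, "")
    # In Python 3, str.translate takes exactly one argument: the table.
    title = title.translate(INVALID_TITLE_CHAR_MAP)
    # Strip last (assumption of this sketch) so leftover spaces are trimmed,
    # then apply the 31-character limit.
    return title.strip()[:31]


print(clean_sheet_title("zz_ FeeRelationship Courtenay:Family"))
```

With the example title from the question, this should print Courtenay.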
You could use regular expressions to match any of your keywords or characters and replace them with an empty string:
import re
INVALID_TITLE_CHARS = ["]", "[", "*", ":", "?", "/", "\\", "'"]
INVALID_TITLE_NAMES = ["zz_ FeeRelationship", " Family"]
inv_char_grp = re.escape("".join(INVALID_TITLE_CHARS))
inv_name_grp = "|".join(re.escape(name) for name in INVALID_TITLE_NAMES)
regex = f"[{inv_char_grp}]|{inv_name_grp}"
title = "zz_ FeeRelationship Courtenay: Family"
result = re.sub(regex, "", title)
print(result)
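which prints Courtenay, preceded by one leading space: the space between zz_ FeeRelationship and Courtenay is not part of any pattern, so call result.strip() if you need it removed.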
An explanation of the regular expressions:
INVALID_TITLE_CHARS: some of these characters have a special meaning in a regex, so they need to be escaped for the regex engine to treat them as literal characters. So we join all the characters in INVALID_TITLE_CHARS, then use re.escape to escape the resulting string. This gives us the regex inv_char_grp = r"\]\[\*:\?/\\'". We then enclose it in [ and ] to denote that we want to match any one of those characters, using f"[{inv_char_grp}]".
INVALID_TITLE_NAMES: since these are whole strings, we won't use a character group for them. Instead, we can use the | operator to indicate that we want to match any of its operands. Also remember to escape the names in case they contain any special characters.
The final regex we get is
[\]\[\*:\?/\\']|zz_\ FeeRelationship|\ Family
[\]\[\*:\?/\\'] : Any one of these chars: ] [ * : ? / \ '
| : Or
zz_\ FeeRelationship : Exactly zz_, then a space, then FeeRelationship
| : Or
\ Family : Exactly one space, then Family
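To tie this back to the question, the pattern can be wrapped in clean_sheet_title. This is only a sketch under a few assumptions: the pattern is compiled once at import time, "Family" is listed without the leading space so that Courtenay:Family is also handled, and the result is stripped before the 31-character limit is applied. The names INVALID_TITLE_RE, _char_class and _names are made up for this example.

```python
import re

INVALID_TITLE_CHARS = ["]", "[", "*", ":", "?", "/", "\\", "'"]
# Assumption: "Family" without a leading space, so "Courtenay:Family" matches too.
INVALID_TITLE_NAMES = ["zz_ FeeRelationship", "Family"]

# Character class for the single characters, alternation for the whole names.
_char_class = "[" + re.escape("".join(INVALID_TITLE_CHARS)) + "]"
_names = "|".join(re.escape(name) for name in INVALID_TITLE_NAMES)
INVALID_TITLE_RE = re.compile(f"{_char_class}|{_names}")


def clean_sheet_title(title):
    # Remove every invalid character and every unwanted name in one pass.
    title = INVALID_TITLE_RE.sub("", title or "")
    # Trim leftover whitespace, then enforce the 31-character limit.
    return title.strip()[:31]


print(clean_sheet_title("zz_ FeeRelationship Courtenay: Family"))
```

Compiling the pattern once keeps the per-call work down to a single sub(), which may matter if many sheet titles are cleaned in a loop.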