How do I convert a string to a valid variable name in Python?
I need to convert an arbitrary string to a string that is a valid variable name in Python.
Here's a very basic example:
s1 = 'name/with/slashes'
s2 = 'name '

def clean(s):
    s = s.replace('/', '')
    s = s.strip()
    return s

# the _ is there so I can see the end of the string
print(clean(s1) + '_')
That is a very naive approach. I need to check whether the string contains characters that are invalid in a variable name and replace them with ''.
What would be a pythonic way to do this?
Well, I'd like to best Triptych's solution with ... a one-liner!
>>> import re
>>> def clean(varStr): return re.sub(r'\W|^(?=\d)', '_', varStr)
...
>>> clean('32v2 g #Gmw845h$W b53wi ')
'_32v2_g__Gmw845h_W_b53wi_'
This substitution replaces any character that is not valid in a variable name with an underscore, and inserts an underscore in front if the string starts with a digit. IMO, 'name/with/slashes' looks better as the variable name name_with_slashes than as namewithslashes.
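A slightly longer sketch of the same substitution, for illustration only. It uses a raw string for the pattern and, as an extra step that is not part of the answer above, appends an underscore when the result is a reserved word. The function name to_identifier and the trailing-underscore convention are assumptions.

```python
import keyword
import re

def to_identifier(text):
    # Replace every non-word character with '_' and insert '_' before a leading digit.
    name = re.sub(r'\W|^(?=\d)', '_', text)
    # Assumption: reserved words such as 'class' get a trailing underscore,
    # since they pass the character check but cannot be used as variable names.
    if keyword.iskeyword(name):
        name += '_'
    return name

print(to_identifier('name/with/slashes'))  # name_with_slashes
print(to_identifier('class'))              # class_
print(to_identifier('2fast'))              # _2fast
```

An empty input still comes back as an empty string, so callers that need a guaranteed name would have to handle that case themselves.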
According to Python, an identifier is a letter or underscore, followed by an unlimited string of letters, numbers, and underscores:
import re

def clean(s):
    # Remove invalid characters
    s = re.sub('[^0-9a-zA-Z_]', '', s)
    # Remove leading characters until we find a letter or underscore
    s = re.sub('^[^a-zA-Z_]+', '', s)
    return s
Use like this:
>>> clean(' 32v2 g #Gmw845h$W b53wi ')
'v2gGmw845hWb53wi'
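Stripping can leave nothing behind, for example when the input contains only digits and punctuation, so a caller may want a fallback. A minimal sketch, assuming a hypothetical wrapper clean_or_default and a placeholder name 'var', neither of which comes from the answer:

```python
import re

def clean_or_default(s, default='var'):
    # Same two substitutions as the answer: drop invalid characters,
    # then drop leading characters until a letter or underscore.
    s = re.sub('[^0-9a-zA-Z_]', '', s)
    s = re.sub('^[^a-zA-Z_]+', '', s)
    # Assumption: fall back to a placeholder when nothing valid remains.
    return s if s else default

print(clean_or_default(' 32v2 g #Gmw845h$W b53wi '))  # v2gGmw845hWb53wi
print(clean_or_default('123 !!'))                     # var
```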
You should build a regex that is a whitelist of permissible characters and replace everything that is not in that character class.
You can use the built-in method str.isidentifier() in combination with filter(). This requires no imports such as re, and works by iterating over each character and keeping it if it is a valid identifier on its own. Then you just do a ''.join to convert the filtered characters back into a string.
s1 = 'name/with/slashes'
s2 = 'name '

def clean(s):
    s = ''.join(filter(str.isidentifier, s))
    return s

print(f'{clean(s1)}_')  # the _ is there so I can see the end of the string
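One caveat worth checking: a single digit is not an identifier on its own, so filtering character by character with str.isidentifier drops digits everywhere, not only at the start. The sketch below is one possible variant, not the answer's code. The name clean_keep_digits and the '_' prefix trick are assumptions for illustration.

```python
def clean_keep_digits(s):
    # Prefixing '_' lets digits pass the per-character check,
    # because '_1' is an identifier while '1' is not.
    kept = ''.join(ch for ch in s if ('_' + ch).isidentifier())
    # A name still may not start with a digit, so guard that case.
    if kept and not kept.isidentifier():
        kept = '_' + kept
    return kept

print(''.join(filter(str.isidentifier, 'v2 name')))  # vname
print(clean_keep_digits('v2 name'))                  # v2name
print(clean_keep_digits('32 items'))                 # _32items
```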
Use the re module and strip out all invalid characters.