How do I convert a string to a valid variable name in Python?

I need to convert an arbitrary string to a string that is a valid variable name in Python.

Here's a very basic example:

s1 = 'name/with/slashes'
s2 = 'name '

def clean(s):
    s = s.replace('/', '')
    s = s.strip()

    return s

# the _ is there so I can see the end of the string
print(clean(s1) + '_')

That is a very naive approach. I need to check whether the string contains characters that are invalid in a variable name and replace them with ''.

What would be a Pythonic way to do this?

Well, I'd like to best Triptych's solution with ... a one-liner!

>>> import re
>>> def clean(varStr): return re.sub(r'\W|^(?=\d)', '_', varStr)
...

>>> clean('32v2 g #Gmw845h$W b53wi ')
'_32v2_g__Gmw845h_W_b53wi_'

This substitution replaces every character that is not valid in a variable name with an underscore, and inserts an underscore in front if the string starts with a digit. IMO, 'name/with/slashes' looks better as the variable name name_with_slashes than as namewithslashes.
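To make the two rules easier to see, here is a minimal sketch that splits the one-liner into separate steps. The names clean_two_step and is_usable_name are hypothetical, chosen only for illustration. The second helper is an addition that is not part of the answer above: it relies on the standard keyword module, because a reserved word such as class passes str.isidentifier() but still cannot be used as a variable name.

```python
import keyword
import re

def clean_two_step(text):
    # Step 1: replace every non-word character with an underscore.
    text = re.sub(r'\W', '_', text)
    # Step 2: insert an underscore in front if the string starts with a digit.
    text = re.sub(r'^(?=\d)', '_', text)
    return text

def is_usable_name(name):
    # Hypothetical helper: a valid identifier that is not a reserved word.
    return name.isidentifier() and not keyword.iskeyword(name)

print(clean_two_step('name/with/slashes'))
print(is_usable_name(clean_two_step('name/with/slashes')))
print(is_usable_name('class'))
```

Running both substitutions in sequence should give the same result as the single combined pattern, since the first step never adds or removes a leading digit.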

According to Python, an identifier is a letter or underscore, followed by an unlimited string of letters, numbers, and underscores:

import re

def clean(s):

   # Remove invalid characters
   s = re.sub('[^0-9a-zA-Z_]', '', s)

   # Remove leading characters until we find a letter or underscore
   s = re.sub('^[^a-zA-Z_]+', '', s)

   return s

Use like this:

>>> clean(' 32v2 g #Gmw845h$W b53wi ')
'v2gGmw845hWb53wi'

You should build a regex that is a whitelist of permissible characters, and replace everything that is not in that character class.
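A minimal sketch of that whitelist idea, assuming only ASCII letters, digits, and underscores are wanted. The names NOT_ALLOWED and clean_whitelist are hypothetical, and using an underscore as the replacement is an assumption, not something this answer specifies.

```python
import re

# Negated character class: matches anything outside the ASCII whitelist.
NOT_ALLOWED = re.compile(r'[^A-Za-z0-9_]')

def clean_whitelist(text):
    # Replace every character that is not in the whitelist.
    cleaned = NOT_ALLOWED.sub('_', text)
    # An identifier cannot start with a digit, so guard the first character.
    if cleaned[:1].isdigit():
        cleaned = '_' + cleaned
    return cleaned

print(clean_whitelist('name/with/slashes'))
print(clean_whitelist('2nd place'))
```

Note that an empty input stays empty here, so a caller may still want to check the result with str.isidentifier().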

You can use the built-in method str.isidentifier() in combination with filter(). This requires no imports such as re, and works by iterating over each character and keeping it if it is, by itself, a valid identifier (so digits, which cannot stand alone as identifiers, are dropped too). Then you just do a ''.join to turn the filtered characters back into a string.

s1 = 'name/with/slashes'
s2 = 'name '

def clean(s):
    s = ''.join(filter(str.isidentifier, s))
    return s

print(f'{clean(s1)}_')  # the _ is there so I can see the end of the string
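Because filter(str.isidentifier, s) tests each character in isolation, a digit never survives it, even in the middle of a name. Here is a minimal sketch of a variant that keeps digits, assuming that is the desired behaviour. The name clean_keep_digits is hypothetical. It tests each character after a leading underscore, which asks whether the character may appear anywhere after the first position of an identifier.

```python
def clean_keep_digits(text):
    # Keep a character if it is valid as a non-leading identifier character.
    kept = ''.join(c for c in text if ('_' + c).isidentifier())
    # Prefix an underscore when the result is empty or starts with a digit.
    return kept if kept.isidentifier() else '_' + kept

print(clean_keep_digits('name2/with/slashes'))
print(clean_keep_digits('32v2 g'))
```

Like the original filter, this removes invalid characters rather than replacing them, so 'name/with/slashes' still becomes namewithslashes.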

Use the re module, and strip all invalid characters.
