I am trying to process a string containing CHAR(int) and NCHAR(int) calls and replace those instances with their ASCII counterparts. An example would be something like this:
CHAR(124) + (SELECT TOP 1 CAST(name AS VARCHAR(8000)) FROM (SELECT TOP 1 colid, name FROM [Projects]..[syscolumns]
WHERE xtype=char(85)
AND id = OBJECT_ID(NCHAR(69)+NCHAR(78)+NCHAR(95)+NCHAR(69)+NCHAR(109)+NCHAR(112)+NCHAR(108))
Note that I don't want to do anything to VARCHAR(int); only the CHAR(int) and NCHAR(int) parts should be converted. The above should translate to:
|(SELECT TOP 1 CAST(name AS VARCHAR(8000)) FROM (SELECT TOP 1 colid, name FROM [Projects]..[syscolumns] WHERE xtype=U AND id = OBJECT_ID(EN_Empl)
Note that any "+" on either side of CHAR(int) or NCHAR(int) should be removed. I tried the following:
def conv(m):
    return chr(int(m.group(2)))
print re.sub(r'([\+ ]?n?char\((.*?)\)[\+ ]?)', conv, str, re.IGNORECASE)
where str is the raw string that must be processed.
Somehow, the VARCHAR(8000) is being picked up. If I tweak the regex, the "=" after xtype is going away, rather than just the space and the "+" on either side of a CHAR(int) or NCHAR(int) instance.
Hope someone can pull me out of this.
ADDITIONAL SAMPLE STRINGS:
String "char(124)+(Select Top 1 cast(name as varchar(8000)) from (Select Top 1 colid,name From [Projects]..[syscolumns] Where id = OBJECT_ID(NCHAR(69)+NCHAR(78)+NCHAR(95)+NCHAR(69)+NCHAR(109)+NCHAR(112)+NCHAR(108)))"
Regex: r'(\bn?char\((\d+)\)(?:\s*\+\s*)?)'
Result: "|(Select Top 1 cast(name as varchar(8000)) from (Select Top 1 colid,name From [Projects]..[syscolumns] Where id = OBJECT_ID(ENCHAR(78)+NCHAR(95)+NCHAR(69)+NCHAR(109)+NCHAR(112)+NCHAR(108)))"
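You have three issues:
1. You need to use flags=re.IGNORECASE and not just re.IGNORECASE in re.sub. That is a keyword argument; the fourth positional argument of re.sub is count, so a bare re.IGNORECASE is read as a replacement limit.
2. You need to use \b to find the word boundary, so that VARCHAR(8000) is not matched.
3. You should not use str as a name, since you will overwrite the built-in by the same name.
This works: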
import re
tgt='''\
CHAR(124) + (SELECT TOP 1 CAST(name AS VARCHAR(8000)) FROM (SELECT TOP 1 colid, name FROM [Projects]..[syscolumns]
WHERE xtype=char(85)
AND id = OBJECT_ID(NCHAR(69)+NCHAR(78)+NCHAR(95)+NCHAR(69)+NCHAR(109)+NCHAR(112)+NCHAR(108))'''
pat=r'(\bn?char\((\d+)\)(?:\s*\+\s*)?)'
def conv(m):
    return chr(int(m.group(2)))
print(re.sub(pat, conv, tgt, flags=re.IGNORECASE))
More completely:
import re
tgt='''\
CHAR(124) + (SELECT TOP 1 CAST(name AS VARCHAR(8000)) FROM (SELECT TOP 1 colid, name FROM [Projects]..[syscolumns]
WHERE xtype=char(85)
AND id = OBJECT_ID(NCHAR(69)+NCHAR(78)+NCHAR(95)+NCHAR(69)+NCHAR(109)+NCHAR(112)+NCHAR(108))'''
def conv(m):
    return chr(int(m.group(2)))
pat = r'''
    (                   # group 1
      \b                # word boundary
      n?char            # nchar or char
      \(                # literal left paren
      (\s*\d+\s*)       # group 2: digits, optionally surrounded by whitespace
      \)                # literal right paren
      (?:\s*\+\s*)?     # optionally followed by a concatenating '+'
    )'''
print(re.sub(pat, conv, tgt, flags=re.VERBOSE | re.IGNORECASE))
Prints:
|(SELECT TOP 1 CAST(name AS VARCHAR(8000)) FROM (SELECT TOP 1 colid, name FROM [Projects]..[syscolumns]
WHERE xtype=U
AND id = OBJECT_ID(EN_Empl)
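The first issue is easy to check in isolation. The signature is `re.sub(pattern, repl, string, count=0, flags=0)`, so a flag passed in fourth position lands in `count`. The sketch below is written for Python 3 and uses a shortened input of my own (`sample` is not the query from the question). It spells out `count=int(re.IGNORECASE)`, which is what the positional call amounts to:

```python
import re

PAT = r'\bn?char\((\d+)\)(?:\s*\+\s*)?'

def conv(m):
    # Turn the captured code point into its character.
    return chr(int(m.group(1)))

# Hypothetical shortened input, not the full query from the question.
sample = "char(69)+char(78)+NCHAR(95)+char(69)"

# What the positional call does: the flag's integer value becomes the
# replacement limit, and matching stays case-sensitive.
as_count = re.sub(PAT, conv, sample, count=int(re.IGNORECASE))

# Keyword form: the flag is applied and every occurrence is replaced.
as_flag = re.sub(PAT, conv, sample, flags=re.IGNORECASE)

print(as_count)  # ENNCHAR(95)+char(69)
print(as_flag)   # EN_E
```

Only the first two calls are replaced in the `count` case, which is consistent with the partially converted result shown in the question.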
You can go a long way just by adding the word boundary (\b) assertion, but I'd like to suggest that you (1) use re.VERBOSE to write a regexp someone can understand later; (2) compile the regexp to reduce clutter at the call site; and (3) tighten some of the matching criteria. Like so:
import re

def conv(m):
    return chr(int(m.group(1)))

pat = re.compile(r"""[+\s]*   # optional whitespace or +
    \b            # word boundary
    n?char        # NCHAR or CHAR
    \(            # left paren
    ([\d\s]+)     # digits or spaces - group 1
    \)            # right paren
    [+\s]*        # optional whitespace or +
    """, re.VERBOSE | re.IGNORECASE)

# data holds the raw string to be processed
print(pat.sub(conv, data))
Note that I changed your str to data: str is the name of a heavily used built-in type, and it's a Really Bad Idea to create a variable with the same name.
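One behaviour of this pattern is worth checking before adopting it: `[+\s]*` matches plain whitespace as well as `+`, so the space or newline after `xtype=char(85)` is consumed together with the call. Below is a minimal Python 3 sketch of a stricter variant, assuming that only a `+` (with its surrounding whitespace) should be removed. The names `PAT_LOOSE`, `PAT_STRICT` and `decode_calls`, and the shortened `data` string, are mine and not part of the answer.

```python
import re

# The pattern from the answer above, written on one line.
PAT_LOOSE = re.compile(r"[+\s]*\bn?char\(([\d\s]+)\)[+\s]*", re.IGNORECASE)

# Stricter variant (an assumption about the intent): whitespace is only
# consumed when it surrounds an actual '+'.
PAT_STRICT = re.compile(r"""
    (?:\s*\+\s*)?          # optional '+' before the call
    \b                     # word boundary, so VARCHAR(8000) is left alone
    n?char                 # NCHAR or CHAR
    \( \s* (\d+) \s* \)    # the code point - group 1
    (?:\s*\+\s*)?          # optional '+' after the call
""", re.VERBOSE | re.IGNORECASE)

def decode_calls(pattern, sql):
    # Replace each matched call with the character for its code point.
    return pattern.sub(lambda m: chr(int(m.group(1))), sql)

# Shortened stand-in for the query in the question.
data = ("CHAR(124) + (SELECT CAST(name AS VARCHAR(8000)) WHERE xtype=char(85)\n"
        "AND id = OBJECT_ID(NCHAR(69)+NCHAR(78))")

loose = decode_calls(PAT_LOOSE, data)
strict = decode_calls(PAT_STRICT, data)
print(loose)   # ... xtype=UAND id = OBJECT_ID(EN)
print(strict)  # ... xtype=U, newline kept, then AND id = OBJECT_ID(EN)
```

Whether the loose or the strict behaviour is wanted depends on the input; for the query in the question the separator before AND probably needs to survive.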
You only need to use a word boundary, \b:
def conv(m):
    return chr(int(m.group(1)))
print(re.sub(r'\bn?char\(([^)]+)\)(?:\s*\+\s*)?', conv, str, flags=re.IGNORECASE))
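Because `[^)]+` accepts anything up to the closing parenthesis, a call whose argument is not a number would make `int()` raise inside `conv`. The Python 3 sketch below shows one way to guard against that by leaving such matches untouched. The helper name `conv_safe` and the `sample` string are illustrative assumptions, not part of the question.

```python
import re

PAT = r'\bn?char\(([^)]+)\)(?:\s*\+\s*)?'

def conv_safe(m):
    # Convert numeric arguments; return the match unchanged otherwise.
    try:
        return chr(int(m.group(1)))
    except (ValueError, OverflowError):
        return m.group(0)

# Hypothetical input mixing numeric and non-numeric arguments.
sample = "xtype=char(85) AND name=CHAR(@code) + NCHAR( 78 )"

result = re.sub(PAT, conv_safe, sample, flags=re.IGNORECASE)
print(result)  # xtype=U AND name=CHAR(@code) + N
```

Restricting the group to `\d+`, as in the other answers, avoids the problem at the pattern level instead.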