Deleting a word in a string from a file, Python regex

I'm scanning the text of a C file and searching for any comments in it, comments being of the form:

/* this is a comment */

My regex to find comments is:

comment = r'\/\*(?:[^*]|\*[^/])*\*\/'

Then I do this to scan the file and find comments:

for line in pstream:
    findComment = re.search(comment, line)
    if findComment:
        Comment = findComment.group(0)
        if isinstance(Comment, str):
            print(Comment)
        if isinstance(line, str):
            print(line)
        line = re.sub(Comment, "", line)
        print(line)

I want to find the comments and delete them from the text of the file.

But my output for the above code is:

/* hello */
#include  /* hello */ "AnotherFile.h"
#include  /* hello */ "AnotherFile.h"

On the second print of line, I want /* hello */ to be gone, which would mean the comment was deleted from the line. But my re.sub doesn't seem to do anything to it.

Any help?

EDIT: I'm not sure why the two #include prints are in a lighter shade, but to clarify, they are also printed just like /* hello */ is

I tested my re.sub in another file with the code

import re

line = '#include /* hello */ "file.h"'
Comment = '/* hello */'

line = re.sub(Comment, " ", line)

print(line)

And it prints:

#include /* hello */ "file.h"

But I don't want the /* hello */ to be there :(

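You are passing Comment to re.sub as the regex pattern. It contains regex metacharacters (here the * in /* and */ is read as a quantifier, not as a literal asterisk), so the pattern never matches the literal comment text and nothing is replaced. Escape the string before using it as a pattern.

Use re.escape(Comment):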

line = re.sub(re.escape(Comment), "", line)

The output of the second print is now as expected:

/* hello */
#include  /* hello */ "AnotherFile.h"
#include   "AnotherFile.h"
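As a self-contained check of the point above, the following sketch compares the unescaped and escaped calls on the line from the question. The variable names (comment_text, unescaped, escaped, replaced) are made up for illustration; the str.replace line is an alternative that is not part of the original answer, shown only because the text being removed is a fixed literal.

```python
import re

line = '#include  /* hello */ "AnotherFile.h"'
comment_text = '/* hello */'

# Unescaped: each '*' acts as a quantifier on the character before it,
# so the pattern does not match the literal comment and nothing changes.
unescaped = re.sub(comment_text, "", line)

# Escaped: re.escape makes every metacharacter literal, so the comment
# text is matched exactly and removed.
escaped = re.sub(re.escape(comment_text), "", line)

# Alternative (an assumption, not from the answer): for a fixed literal,
# str.replace does the same job without involving regex at all.
replaced = line.replace(comment_text, "")

print(repr(unescaped))
print(repr(escaped))
print(repr(replaced))
```

The whitespace that surrounded the comment is still there after the escaped call, which is what the next step deals with.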

To also remove the whitespace in front of the comment, you can prepend r"\s*" to the pattern:

line = re.sub(r"\s*" + re.escape(Comment), "", line)
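Since the comment regex from the question already matches the text to delete, the search-then-substitute round trip can probably be skipped by handing that pattern straight to re.sub. The sketch below is one way to do that, not the answer's method: strip_comments and STRIP_RE are hypothetical names, and it assumes each comment starts and ends on the same line and never appears inside a string literal.

```python
import re

# The comment pattern from the question, with \s* in front so the
# whitespace before each comment is removed along with it.
STRIP_RE = re.compile(r'\s*/\*(?:[^*]|\*[^/])*\*/')

def strip_comments(line):
    """Remove every /* ... */ comment on a single line (hypothetical helper)."""
    return STRIP_RE.sub('', line)

print(strip_comments('#include  /* hello */ "AnotherFile.h"'))
print(strip_comments('int x; /* a */ int y; /* b */'))
```

Because the pattern is a real regex here rather than matched text reused as a pattern, no re.escape call is needed, and several comments on one line are removed in a single call.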
