How can I isolate the final part of the string that I don't want with Python?

I'm using Python 3 and trying to remove a specific part from each line of a file. Each line is a string, and they all share a similar format, like so:

/* 7 */
margin-top:1.5rem!important/* 114 */
}/* 115 *//* 118 *//* 121 */
.mb-2{/* 122 */
margin-bottom:.5rem!important/* 123 */
}/* 124 *//* 127 *//* 130 *//* 133 *//* 137 */

I want to trim every line that has multiple "quoted numbers", like this one for example (with 3 quoted numbers):

}/* 115 *//* 118 *//* 121 */

or this for example (with 5 quoted numbers)

}/* 124 *//* 127 *//* 130 *//* 133 *//* 137 */

I want to remove all quoted numbers except the "first one", so for the first example, output should be like this:

}/* 115 */

and for the second example, output should be like this:

}/* 124 */

Here is my code:

def clean_file(file_name, new_output_file_name):
    with open(file_name, 'r') as src:
        with open(new_output_file_name, 'w') as output_file:
            for line in src:
                if line.strip().startswith('/*'):
                    output_file.write('%s' % '')
                elif line # How can I isolate and remove the final part I don't want?
                    output_file.write('%s\n' % line.rstrip('\n')) # How can I isolate and remove the final part I don't want?
                else:
                    output_file.write('%s\n' % line.rstrip('\n'))


clean_file('new.css', 'clean.css')

How can I isolate and remove the final part of the string that I don't want with Python?

You can use re.sub for this. Use this regex to search:

(/\* \d+ \*/)/\*.+

And replace it with r"\1"


Code:

import re
src = '}/* 124 *//* 127 *//* 130 *//* 133 *//* 137 */'
print(re.sub(r'(/\* \d+ \*/)/\*.+', r'\1', src))
## }/* 124 */

RegEx breakdown:

  • (/\* \d+ \*/) : Match a string of the form /* <number> */ and capture it in group #1
  • /\* : Match the /* that opens the next comment
  • .+ : Match one or more of any character, up to the end of the line
  • \1 : The replacement, which puts the captured value of group #1 back
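
To tie this back to the question's clean_file, here is a minimal sketch that applies the pattern line by line. It assumes, as the question's code suggests, that lines which start with a comment should be dropped entirely. The names EXTRA_COMMENTS and keep_first_comment are made up for illustration and are not part of the original answer.

```python
import re

# Pattern from the answer above: a numbered comment that is directly
# followed by another "/*", plus everything after it on the line.
EXTRA_COMMENTS = re.compile(r'(/\* \d+ \*/)/\*.+')


def keep_first_comment(line):
    """Drop everything after the first numbered comment, but only when
    another comment follows it directly. Other lines are returned as-is."""
    return EXTRA_COMMENTS.sub(r'\1', line)


def clean_file(file_name, new_output_file_name):
    with open(file_name, 'r') as src, open(new_output_file_name, 'w') as out:
        for line in src:
            if line.strip().startswith('/*'):
                # Assumption: comment-only lines are skipped, as in the question.
                continue
            # "." does not match the newline, so the line ending is preserved.
            out.write(keep_first_comment(line))
```

Calling clean_file('new.css', 'clean.css') would then work as in the question. Lines with only one comment, such as .mb-2{/* 122 */, do not match the pattern and are written unchanged.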

import re

def clean_file(file_name, new_output_file_name):
    with open(file_name, 'r') as src:
        with open(new_output_file_name, 'w') as output_file:
            for line in src:
                # Keep only the first comment in a run of adjacent comments.
                # This regex is not limited to numbered comments: it also
                # removes any other comments that come after the first
                # comment on the line.
                output_file.write(re.sub(r'(/\*.*?\*/)/\*.*\*/', r'\1', line))


clean_file('new.css', 'clean.css')
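
The two patterns on this page behave differently on some inputs, and a small side-by-side sketch makes that easier to see. The sample lines below are invented for illustration and do not come from the question's file, and strip_with is a hypothetical helper name.

```python
import re

# Pattern that requires the first comment to contain only a number.
numbered_only = re.compile(r'(/\* \d+ \*/)/\*.+')
# Pattern that accepts any comment as the first one.
any_comment = re.compile(r'(/\*.*?\*/)/\*.*\*/')


def strip_with(pattern, line):
    """Replace the whole match with its first captured comment."""
    return pattern.sub(r'\1', line)


# First comment is not a number: only the second pattern matches.
line_a = '}/* note *//* 127 */'
print(strip_with(numbered_only, line_a))  # }/* note *//* 127 */
print(strip_with(any_comment, line_a))    # }/* note */

# Text follows the comments: ".+" removes it, while ".*\*/" stops at the
# last "*/" and leaves the trailing text in place.
line_b = '}/* 1 *//* 2 */ .a{'
print(strip_with(numbered_only, line_b))  # }/* 1 */
print(strip_with(any_comment, line_b))    # }/* 1 */ .a{
```

Which behavior is preferable depends on the file. For input shaped like the question's sample, where every comment is a number and nothing follows the last comment, both patterns give the same result.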
