Python Grabbing String in between characters

If I have a string like / Hello how are you /, how can I grab this part of the line and delete it using a Python script?

import sys
import re

i_file = sys.argv[1]

def stripwhite(text):
    lst = text.split('"')
    for i, item in enumerate(lst):
        if not i % 2:
            lst[i] = re.sub(r"\s+", "", item)
    return '"'.join(lst)

with open(i_file) as i_file_comment_strip:
    i_files_names = i_file_comment_strip.readlines()

    for name in i_files_names:
        # Each entry is a file name; drop its trailing newline and open it for reading
        with open(name.strip()) as i_file_data:
            i_file_comment = i_file_data.readlines()
            for line in i_file_comment:
                i_file_comment_data = line.strip()

In i_file_comment I have the lines from i_file_data, and some of those lines contain text in the "/ ... /" format. Should I loop over each character in the line and replace every one of those characters with ""?

If you want to remove the /Hello how are you/ part, you can use a regex:

import re
x = 'some text /Hello how are you/ some more text'
print(re.sub(r'/.*/', '', x))

Output:

some text  some more text
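
One caveat worth sketching: `.*` is greedy, so when a line holds more than one `/.../` segment, the pattern above removes everything from the first slash to the last one. A non-greedy `.*?` stops at the nearest closing slash instead. The sample string below is made up for illustration and does not come from the original post.

```python
import re

# Hypothetical input with two delimited segments (an assumption for this sketch)
text = 'keep /drop one/ middle /drop two/ end'

greedy = re.sub(r'/.*/', '', text)   # removes from the first slash to the last slash
lazy = re.sub(r'/.*?/', '', text)    # removes each /.../ segment separately

print(repr(greedy))
print(repr(lazy))
```

Which behaviour is wanted depends on the data; for a single segment per line the two patterns give the same result.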

If you know your lines contain occurrences of a fixed string, you can simply do

for line in i_file_comment:
    line = line.replace('/Hello how are you/', '')
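
Note that assigning to `line` inside the loop only rebinds the loop variable; the list itself stays unchanged. A minimal sketch that collects the results in a new list instead (the sample lines and the name `cleaned` are assumptions for illustration):

```python
# Hypothetical stand-in for the lines read from a file
i_file_comment = [
    'first /Hello how are you/ line\n',
    'second line\n',
]

# str.replace returns a new string, so build a new list from the results
cleaned = [line.replace('/Hello how are you/', '') for line in i_file_comment]

print(cleaned)
```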

However, if what you have is multiple occurrences of strings delimited by / (i.e. /foo/, /bar/), I think a simple regex will suffice:

>>> import re
>>> regex = re.compile(r'\/[\w\s]+\/')
>>> s = """
... Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
... /Hello how are you/ ++ tempor incididunt ut labore et dolore magna aliqua.
... /Hello world/ -- ullamco laboris nisi ut aliquip ex ea commodo
... """
>>> print(re.sub(regex, '', s))  # find substrings matching the regex, replace them with '' on string s

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
 ++ tempor incididunt ut labore et dolore magna aliqua.
 -- ullamco laboris nisi ut aliquip ex ea commodo

>>>

Just adjust the regex to match whatever you need to get rid of :)
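
To tie this back to the question, here is a rough sketch of applying such a pattern to every line of a file and writing the result back. The function name `strip_delimited`, the pattern `/[^/\n]*/`, and the choice to rewrite the file in place are all assumptions for illustration, not something the original posts specify.

```python
import os
import re
import tempfile

# Matches a /.../ segment that contains no slash and no line break
PATTERN = re.compile(r'/[^/\n]*/')

def strip_delimited(path):
    """Remove every /.../ segment from the file at path, in place."""
    with open(path) as f:
        lines = f.readlines()
    with open(path, 'w') as f:
        f.writelines(PATTERN.sub('', line) for line in lines)

# Small demonstration using a temporary file
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('a /Hello how are you/ b\nno comment here\n')
demo_path = tmp.name

strip_delimited(demo_path)
with open(demo_path) as f:
    result = f.read()
os.remove(demo_path)

print(repr(result))
```

This pattern would also remove an empty `//` and the middle of something like `a/b/c`, so it probably needs tightening if the files contain paths or URLs.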
