Search and replace from a selected text using regex in Python
I would like to select text from a file in Python and replace only the part from the selected phrase up to a certain text.
import re

with open('searchfile.txt', 'r') as f:
    content = f.read()
content_new = re.sub('^\S*', '(.*?\/)', content, flags=re.M)
with open('searchfile.txt', 'w') as f:
    f.write(content_new)
searchfile.txt contains the text below:
abc/def/efg 212 234 asjakj
hij/klm/mno 213 121 ashasj
My aim is to select everything from the start of each line up to the first space and then replace it with the text up to the first occurrence of a slash /.
Example:
^\S* selects everything up to the first space in my file, which is "abc/def/efg".
I would like to replace this text with only "abc" and "hij" on their respective lines.
My regexp (.*?\/) does not work for me here.
You can split the content on whitespace, take the first item, split it on /, and take the first item:
content_new = content.split()[0].split('/')[0]
See the Python demo.
If you plan to use a regex, you may use
match = re.search(r'^[^\s/]+', content, flags = re.M)
if match:
    content_new = match.group()
See the Python demo. Details:
^ - start of a line (due to re.M)
[^\s/]+ - one or more chars other than whitespace and /.
Try this:
>>> s = 'abc/def/efg 212 234 asjakj'
>>> p = s.split(' ', maxsplit=1)
>>> p
['abc/def/efg', '212 234 asjakj']
>>> p[0] = p[0].split('/', maxsplit=1)[0]
>>> p
['abc', '212 234 asjakj']
>>> s = ' '.join(p)
>>> s
'abc 212 234 asjakj'
One-liner solution (starting again from the original string):
>>> s = 'abc/def/efg 212 234 asjakj'
>>> s.replace(s[:s.index(' ')], s[:s.index('/')], 1)
'abc 212 234 asjakj'
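Note that s.index raises ValueError when the line has no space or no slash. As a minimal sketch of the same idea that tolerates such lines, str.partition can be used instead; the helper name keep_first_segment is only illustrative and not part of the answers here:

```python
def keep_first_segment(line: str) -> str:
    """Replace the first space-delimited field with its part before the first '/'."""
    # head is the text before the first space; sep is ' ' or '' if there is no space
    head, sep, tail = line.partition(' ')
    # text before the first '/', or all of head when it contains no '/'
    first = head.partition('/')[0]
    return first + sep + tail


print(keep_first_segment('abc/def/efg 212 234 asjakj'))  # abc 212 234 asjakj
print(keep_first_segment('abc 212'))                     # abc 212
print(keep_first_segment('abc/def'))                     # abc
```

Because partition returns empty strings when the separator is missing, no exception handling is needed for lines that lack a space or a slash.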
Maybe this can help:
import re
s = "abc/def/efg 212 234 asjakj"
pattern = r"^(.*?\/)"
replace = "xyz/"
op = re.sub(pattern, replace, s)
print (op)
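One point worth spelling out: the second argument of re.sub is replacement text, not a pattern, so a regex passed there (as in the question) is not matched against anything. To keep part of the match, capture it in a group and refer to it with a backreference. The following is a sketch under the assumption that every line starts with the path field, using the two sample lines from the question:

```python
import re

content = "abc/def/efg 212 234 asjakj\nhij/klm/mno 213 121 ashasj"

# Group 1 captures the text before the first '/' (or whitespace);
# \S* then consumes the rest of the first field so that it is dropped.
# re.M makes ^ match at the start of every line, not only the first.
content_new = re.sub(r'^([^\s/]+)\S*', r'\1', content, flags=re.M)

print(content_new)
# abc 212 234 asjakj
# hij 213 121 ashasj
```

Lines that start with whitespace or a slash are left unchanged, because group 1 needs at least one character.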
The pattern to match is <path><space>, where the path (<path>) has at least one slash / surrounded by words. A path here is words delimited by slashes, for example abc/de, but not one of these:
abc
/de
abc/file.txt
abc/
You could also match the pattern and then extract only the first path element before the slash.
import re
line = "abc/def/efg 212 234 asjakj"
extracted = '' # default
if re.match(r'^(\w+/\w+)+ ', line):
    extracted = line.split('/')[0]  # even simpler than Wiktor's split
print(extracted)
The extraction can be done in two ways:
(1) Just the first path element, as Wiktor answered.
first_path_element = "abc/def/efg 212 234 asjakj".split('/')[0]
print(first_path_element)
(2) Some may find a regex shorter and more expressive:
import re
first_path_element = re.findall(r'^(\w+)/', "abc/def/efg 212 234 asjakj")[0]
print(first_path_element)
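Be aware that re.findall(...)[0] raises IndexError when the line does not start with word characters followed by a slash. A hedged variant that returns a default instead is sketched below; the function name and the default parameter are assumptions made for illustration:

```python
import re


def first_path_element(line: str, default: str = '') -> str:
    """Return the word before the first '/', or default if the line has no such prefix."""
    # re.match anchors at the start of the string, so no explicit ^ is needed
    match = re.match(r'(\w+)/', line)
    return match.group(1) if match else default


print(first_path_element("abc/def/efg 212 234 asjakj"))  # abc
print(first_path_element("no slash here", default="?"))  # ?
```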
Here is a solution that reads from the file, searches for the pattern, replaces it with the new text, and writes the result back to the same file.
file_name = "/home/searchfile.txt"
with open(file_name) as file:
    lines = file.readlines()
result_data = []
for line in lines:
    line = line.strip()
    space_split = line.split(" ")
    # keep only the part of the first field before the first "/"
    prefix = space_split[0].split("/")[0]
    result = prefix + " " + " ".join(space_split[1:])
    result_data.append(result)
with open(file_name, "w") as file:
    file.write("\n".join(result_data))
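If losing the file on a crash midway through the write is a concern, one option is to write to a temporary file first and then swap it into place. This is only a sketch of that idea: rewrite_in_place is a hypothetical helper, and unlike the code above it leaves leading and trailing spaces on each line untouched.

```python
import os
import tempfile


def rewrite_in_place(path: str) -> None:
    """Shorten the first field of every line to its part before the first '/'."""
    # Create the temporary file next to the target so os.replace stays on one filesystem
    directory = os.path.dirname(os.path.abspath(path))
    with open(path) as src, tempfile.NamedTemporaryFile(
        "w", dir=directory, delete=False
    ) as dst:
        for line in src:
            head, sep, tail = line.rstrip("\n").partition(" ")
            dst.write(head.split("/")[0] + sep + tail + "\n")
    # Swap the finished file into place only after both files are closed
    os.replace(dst.name, path)
```

Calling rewrite_in_place("searchfile.txt") on the sample file from the question should leave the lines "abc 212 234 asjakj" and "hij 213 121 ashasj" in it.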