Python search and replace using wildcards
I'm somewhat confused, but I'm trying to do a search/replace using wildcards.
If I have something like:
<blah.... ssf ff>
<bl.... ssf dfggg ff>
<b.... ssf ghhjj fhf>
and I want to replace all of the above strings with, say,
<hh >t
Any thoughts/comments on how this can be accomplished?
Thanks
Update (thanks for the comments!)
I'm missing something...
My initial sample text is:
Soo Choi</span>LONGEDITBOX">Apryl Berney
Soo Choi</span>LONGEDITBOX">Joel Franks
Joel Franks</span>GEDITBOX">Alexander Yamato
and I'm trying to get
Soo Choi foo Apryl Berney
Soo Choi foo Joel Franks
Joel Franks foo Alexander Yamato
I've tried variations of
name=re.sub("</s[^>]*\">"," foo ",name)
but I'm missing something...
Thoughts? Thanks
How about something like this, with a regex:
import re
YOURTEXT=re.sub("<b[^>]*>","<hh >t",YOURTEXT)
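See the Python regular expression documentation, which is quite useful, or, for a more hands-on approach, section 5.2 "Search and Replace" of the Regular Expression HOWTO.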
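For the updated sample text in the question, the attempted pattern probably fails because `[^>]*` cannot cross the `>` that closes `</span>`, so the `">` it expects next is never reached. Below is a minimal sketch of one possible fix, assuming the separator always has the shape `</span>...">` seen in the three sample lines. The names `SEPARATOR` and `join_names` are made up for illustration.

```python
import re

# Assumption: every separator looks like </span>SOMETHING"> as in the samples.
# Matching the whole "</span>" first lets [^>]* start after its closing ">".
SEPARATOR = re.compile(r'</span>[^>]*">')

def join_names(line, filler=" foo "):
    # Replace each separator with the filler text (hypothetical helper).
    return SEPARATOR.sub(filler, line)

samples = [
    'Soo Choi</span>LONGEDITBOX">Apryl Berney',
    'Soo Choi</span>LONGEDITBOX">Joel Franks',
    'Joel Franks</span>GEDITBOX">Alexander Yamato',
]

for sample in samples:
    print(join_names(sample))
```

If the real markup varies more than these samples suggest, the pattern would need adjusting.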
You don't have to use a regex:
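# Read the input file line by line
with open("file") as f:
    for line in f:
        if "<" in line and ">" in line:
            # Split on ">" so each piece holds at most one tag start
            s = line.rstrip().split(">")
            for n, i in enumerate(s):
                if "<" in i:
                    ind = i.find("<")
                    # Keep the text before the tag and open the replacement
                    s[n] = i[:ind] + "<hh "
            # Rejoining with ">t" completes each "<hh >t" replacement
            print('>t'.join(s))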
Output:
$ cat file
blah <blah.... ssf ff> blah
blah <bl.... ssf dfggg ff> blah <bl.... ssf dfggg ff>
blah <b.... ssf ghhjj fhf>
$ ./python.py
blah <hh >t blah
blah <hh >t blah <hh >t
blah <hh >t
Sounds like a job for the "re" module; a simple re.sub should do the trick. Here's a little sample function for you, although you could just use the single re.sub() line:
import re

def subit(msg):
    # Use the line below instead if the string is multiline
    # subbed = re.compile("(<.*?>)", re.DOTALL).sub("<hh >t", msg)
    subbed = re.sub("(<.*?>)", "<hh >t", msg)
    return subbed

# Your messages bundled into a list
msgs = ["blah <blah.... ssf ff> blah",
        "blah <bl.... ssf dfggg ff> blah <bl.... ssf dfggg ff>",
        "blah <b.... ssf ghhjj fhf>"]

# Iterate the messages and print the substitution results
for msg in msgs:
    print(subit(msg))
I would suggest taking a look at the docs for the "re" module; it is well documented and might help you achieve more accurate text manipulation/replacement.
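As a rough illustration of why the exact pattern matters, here is a small sketch comparing a greedy `<.*>`, the non-greedy `<.*?>` used above, and the negated class `<[^>]*>` from the other answer. The variable names and the two-line sample string are assumptions for the demo, not part of the original question.

```python
import re

line = "blah <bl.... ssf dfggg ff> blah <bl.... ssf dfggg ff>"

# Greedy: ".*" runs from the first "<" to the LAST ">" on the line.
greedy = re.sub(r"<.*>", "<hh >t", line)
# Non-greedy: ".*?" stops at the first ">" it can, so each tag is replaced.
lazy = re.sub(r"<.*?>", "<hh >t", line)
# Negated class: "[^>]*" can never run past a ">", same result here.
negated = re.sub(r"<[^>]*>", "<hh >t", line)

# A tag that spans two lines (made-up sample).
multiline = "blah <b....\nssf> blah"
# "." does not match a newline by default, so nothing is replaced.
no_flag = re.sub(r"<.*?>", "<hh >t", multiline)
# With re.DOTALL, "." also matches newlines.
dotall = re.sub(r"<.*?>", "<hh >t", multiline, flags=re.DOTALL)
# "[^>]" matches newlines without needing any flag.
negated_ml = re.sub(r"<[^>]*>", "<hh >t", multiline)

for name, value in [("greedy", greedy), ("lazy", lazy), ("negated", negated),
                    ("no_flag", no_flag), ("dotall", dotall),
                    ("negated_ml", negated_ml)]:
    print(name, repr(value))
```

On input like the samples in this thread, the non-greedy and negated-class forms should behave the same; the negated class has the small advantage of handling tags that span lines without an extra flag.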