How can I replace dollar signs with brackets?
I wrote the following source code to replace dollar signs with square brackets:
text = r"mu-the mean direction $\mu \in [-\pi,\pi)$. kappa-a concentration parameter $\kappa > 0$. "

def replace_dollar(text):
    new_text = ""
    flag = False  # True while we are inside a $...$ span
    for ch in text:
        if ch == '$':
            if flag is False:
                new_text += "["
                flag = True
            else:
                new_text += "]"
                flag = False
        else:
            new_text += ch
    return new_text

if __name__ == '__main__':
    new_text = replace_dollar(text)
    print(new_text)
Output:
mu-the mean direction [\mu \in [-\pi,\pi)]. kappa-a concentration parameter [\kappa > 0].
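Is there a more efficient technique?
Use a regular expression. It may not be more efficient (you would have to measure), but it is certainly neater and less error-prone.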
>>> import re
>>> re.sub(r'\$(.*?)\$', r'[\1]', text)
'mu-the mean direction [\\mu \\in [-\\pi,\\pi)]. kappa-a concentration parameter [\\kappa > 0]. '
Edit: if you are going to reuse it, precompile the regex for efficiency:
>>> dollar_to_brackets = re.compile(r'\$(.*?)\$')
>>> dollar_to_brackets.sub(r'[\1]', text)
'mu-the mean direction [\\mu \\in [-\\pi,\\pi)]. kappa-a concentration parameter [\\kappa > 0]. '
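Two properties of the pairwise pattern are worth checking before relying on it: an unpaired `$` is left untouched, and `.` does not match a newline unless `re.DOTALL` is set. A minimal sketch; the sample strings are made up for illustration and are not from the question:

```python
import re

dollar_to_brackets = re.compile(r'\$(.*?)\$')

# Paired dollar signs are rewritten; the non-greedy .*? stops at the
# nearest closing '$', so each pair is handled separately.
paired = dollar_to_brackets.sub(r'[\1]', "a $x$ and $y$")

# A trailing unpaired '$' has no partner, so the pattern never matches it.
unpaired = dollar_to_brackets.sub(r'[\1]', "a $x$ costs 5$")

# By default '.' does not match '\n', so a span crossing lines is skipped.
single_line = dollar_to_brackets.sub(r'[\1]', "a $x\n+ y$ b")

# With re.DOTALL, '.' also matches newlines and the span is rewritten.
multi_line = re.compile(r'\$(.*?)\$', re.DOTALL).sub(r'[\1]', "a $x\n+ y$ b")

print(paired)
print(unpaired)
print(repr(single_line))
print(repr(multi_line))
```

Whether leaving an unpaired `$` alone is the desired behavior depends on the input, so treat this as something to verify rather than a guarantee of correctness for every text.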
Here is a version that uses your current logic but is more readable.
text = r"mu-the mean direction $\mu \in [-\pi,\pi)$. kappa-a concentration parameter $\kappa > 0$. "
bracket = "[]"  # Index this string with False/True, which are 0/1
right = False   # Indicates whether to use the right bracket
while '$' in text:  # In each iteration, replace the left-most occurrence
    text = text.replace('$', bracket[right], 1)
    right = not right  # Switch between left and right brackets
    print(text)  # This is for tracing only
Output:
mu-the mean direction [\mu \in [-\pi,\pi)$. kappa-a concentration parameter $\kappa > 0$.
mu-the mean direction [\mu \in [-\pi,\pi)]. kappa-a concentration parameter $\kappa > 0$.
mu-the mean direction [\mu \in [-\pi,\pi)]. kappa-a concentration parameter [\kappa > 0$.
mu-the mean direction [\mu \in [-\pi,\pi)]. kappa-a concentration parameter [\kappa > 0].
Another regex approach:
import re
from itertools import cycle

def replace_dollar(text):
    return re.sub(r'\$', lambda _, c=cycle('[]'): next(c), text)
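It may help to see why the `cycle` in the default argument does not leak state between calls: the lambda, and with it the default value `c`, is rebuilt every time `replace_dollar` runs, so each call starts again at `[`. The sketch below repeats the function so that it runs on its own; the input strings are illustrative assumptions:

```python
import re
from itertools import cycle

def replace_dollar(text):
    # The lambda is created anew on every call, so cycle('[]') restarts
    # at '[' each time instead of carrying over from a previous call.
    return re.sub(r'\$', lambda _, c=cycle('[]'): next(c), text)

first = replace_dollar("$a$ $b$")
second = replace_dollar("$c$")

# Unlike a pattern that matches whole $...$ pairs, this version also
# rewrites an unpaired trailing '$', which becomes an opening bracket.
odd = replace_dollar("$a$ 5$")

print(first)
print(second)
print(odd)
```

If the input can contain an odd number of dollar signs, that last case is the main behavioral difference to keep in mind when choosing between the two regex approaches.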
Benchmark results using the question's example text:
8.41 us original
0.82 us Prune
4.66 us Woodford
4.09 us Woodford2
2.12 us Manuel
And with the text multiplied by 10 (so it has 40 dollar signs; the OP said they have "no more than 50 dollar signs", so this may be a realistic test):
91.60 us original
30.75 us Prune
27.33 us Woodford
27.41 us Woodford2
11.79 us Manuel
Benchmark code (Try it online!):
from timeit import repeat
from functools import partial
import re
from itertools import cycle

def original(text):
    new_text = ""
    flag = False
    for ch in text:
        if ch == '$':
            if flag is False:
                new_text += "["
                flag = True
            else:
                new_text += "]"
                flag = False
        else:
            new_text += ch
    return new_text

def Prune(text):
    bracket = "[]"
    right = False
    while '$' in text:
        text = text.replace('$', bracket[right], 1)
        right = not right
    return text

def Woodford(text):
    return re.sub(r'\$(.*?)\$', r'[\1]', text)

def Woodford2(text, dollar_to_brackets=re.compile(r'\$(.*?)\$')):
    return dollar_to_brackets.sub(r'[\1]', text)

def Manuel(text):
    return re.sub(r'\$', lambda _, c=cycle('[]'): next(c), text)

def benchmark():
    text = r"mu-the mean direction $\mu \in [-\pi,\pi)$. kappa-a concentration parameter $\kappa > 0$. "
    funcs = original, Prune, Woodford, Woodford2, Manuel

    # Correctness
    expect = original(text)
    for func in funcs:
        assert func(text) == expect

    # Speed
    number = 10 ** 4
    for _ in range(3):
        for func in funcs:
            t = min(repeat(partial(func, text), number=number)) / number
            print('%.2f us ' % (t * 1e6), func.__name__)
        print()

benchmark()
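The listing above times only the original text, while the second table used the text multiplied by 10. How that run was set up is not shown, so the following is only a guess at one way to do it. The function names `pairwise` and `alternating` and the smaller repetition count are assumptions made for this sketch:

```python
import re
from itertools import cycle
from timeit import repeat

def pairwise(text):
    # Same regex as Woodford in the listing above.
    return re.sub(r'\$(.*?)\$', r'[\1]', text)

def alternating(text):
    # Same approach as Manuel in the listing above.
    return re.sub(r'\$', lambda _, c=cycle('[]'): next(c), text)

base = r"mu-the mean direction $\mu \in [-\pi,\pi)$. kappa-a concentration parameter $\kappa > 0$. "
scaled = base * 10  # 4 dollar signs become 40

# Both functions should agree on the scaled input before timing them.
assert pairwise(scaled) == alternating(scaled)

number = 1000  # assumed repetition count, smaller than the 10 ** 4 used above
for func in (pairwise, alternating):
    t = min(repeat(lambda: func(scaled), number=number)) / number
    print('%.2f us ' % (t * 1e6), func.__name__)
```

Absolute timings will differ by machine and Python version, so only the relative ordering is worth comparing against the tables.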