Slow regex in Python?
I'm trying to match strings like this one:
{@csm.foo.bar}
without matching any of these:
{@csm.foo.bar-@csm.ooga.booga}
{@csm.foo.bar-42}
The regex I'm using is:
r"\{@csm.((?:[a-zA-Z0-9_]+\.?)+)\}"
It gets dog slow when the string contains multiple matches. Why? If I take away the brace matching it runs very fast, like this:
r"@csm.((?:[a-zA-Z0-9_]+\.?)+)"
But that's not what I want.
Any ideas?
Here is some sample input:
<dockLayout id="popup" y="0" x="0" width="{@csm.screenWidth}" height="{@csm.screenHeight}">
<dataNumber id="selopacity_Volt" name="selopacity_Volt" value="0" />
<dataNumber id="selopacity_Amp" name="selopacity_Amp" value="0" />
<animate trigger="{@m_ds_ML.VIMPBM_BatteryVoltage.valstr}" triggerOn="*" targetNode="selopacity_Volt" targetAttr="value" to="1" dur="0ms" ease="in" />
<animate trigger="{@m_ds_ML.VIMPBM_BatteryVoltage.valstr}" triggerOn="65024" targetNode="selopacity_Volt" targetAttr="value" to="0" dur="0ms" ease="in" />
<animate trigger="{@m_ds_ML.VIMPBM_BatteryCurrent.valstr}" triggerOn="*" targetNode="selopacity_Amp" targetAttr="value" to="1" dur="0ms" ease="in" />
<animate trigger="{@m_ds_ML.VIMPBM_BatteryCurrent.valstr}" triggerOn="65024" targetNode="selopacity_Amp" targetAttr="value" to="0" dur="0ms" ease="in" />
<dockLayout id="item" width="{@csm.screenWidth}" height="{@csm.screenHeight}" depth="-1" clip="false" xmlns="http://www.tat.se/kastor/kml" >
<dockLayout id="list_item_title" x="0" width="{@csm.screenWidth}" height="{@csm.Gearselection.text_heght-@csm.pageVisualCP_y}">
<text id="volt_amp_text" x="0" ellipsize="false" font="{@csm.listUnselFont}" color="{@csm.itemUnselColor}" dockLayout.halign="left" dockLayout.valign="bottom" string="{ItemTitle}" />
</dockLayout>
<dockLayout id="gear_layout" y="0" x="0" width="{@csm.screenWidth}" height="{@csm.vmImage_y_gearselection-@csm.pageVisualCP_y}">
<image id="battery_image" x="0" dockLayout.halign="left" dockLayout.valign="bottom" opacity="1" src="{@m_MenuModel.Gauges.VoltAmpereMeter.image}"/>
</dockLayout>
<!--DockLayout for Voltage Value-->
<dockLayout id="volt_value" x="0" width="{@csm.VoltAmpereMeter.volt_value_x-@csm.VoltAmpereMeter.List_x}" height="{@csm.vmImage_y_gearselection-@csm.pageVisualCP_y}">
<text id="volt_value_text" x="0" opacity="{selopacity_Volt*selopacity_Amp}" ellipsize="false" font="{@csm.listUnselFont}" color="{@csm.itemSelColor}" dockLayout.halign="right" dockLayout.valign="bottom" string="{@m_ds_ML.VIMPBM_BatteryVoltage.valstr}" >
</text>
</dockLayout>
<!--DockLayout for Voltage Unit-->
<dockLayout id="volt_unit" x="{@csm.VoltAmpereMeter.volt_unit_x-@csm.VoltAmpereMeter.List_x}" width="{@csm.screenWidth}" height="{@csm.vmImage_y_gearselection-@csm.pageVisualCP_y}">
<text id="volt_unit_text" x="0" opacity="{selopacity_Volt*selopacity_Amp}" ellipsize="false" font="{@csm.listUnselFont}" color="{@csm.itemSelColor}" dockLayout.halign="left" dockLayout.valign="bottom" string="V" >
</text>
</dockLayout>
<!--DockLayout for Ampere Value-->
<dockLayout id="ampere_value" x="0" width="{@csm.VoltAmpereMeter.ampere_value_x-@csm.VoltAmpereMeter.List_x}" height="{@csm.vmImage_y_gearselection-@csm.pageVisualCP_y}">
<text id="ampere_value_text" x="0" opacity="{selopacity_Amp*selopacity_Volt}" ellipsize="false" font="{@csm.listUnselFont}" color="{@csm.itemSelColor}" dockLayout.halign="right" dockLayout.valign="bottom" string="{@m_ds_ML.VIMPBM_BatteryCurrent.valstr}" >
</text>
</dockLayout>
<!--DockLayout for Ampere Unit-->
<dockLayout id="ampere_unit" x="{@csm.VoltAmpereMeter.ampere_unit_x-@csm.VoltAmpereMeter.List_x}" width="{@csm.screenWidth}" height="{@csm.vmImage_y_gearselection-@csm.pageVisualCP_y}">
<text id="ampere_unit_text" x="0" opacity="{selopacity_Amp*selopacity_Volt}" ellipsize="false" font="{@csm.listUnselFont}" color="{@csm.itemSelColor}" dockLayout.halign="left" dockLayout.valign="bottom" string="A" >
</text>
</dockLayout>
<!--DockLayout for containing Data Not Available text-->
<dockLayout id="no_data_textline" x="{@csm.VoltAmpereMeter.List_x1-@csm.VoltAmpereMeter.List_x}" width="{@csm.screenWidth}" height="{@csm.vmImage_y_gearselection-@csm.pageVisualCP_y}">
<text id="no_data_text" x="0" opacity="{1-(selopacity_Amp*selopacity_Volt)}" ellipsize="false" font="{@csm.listSelFont}" color="{@csm.itemSelColor}" dockLayout.halign="left" dockLayout.valign="bottom" string="{text1}" >
</text>
</dockLayout>
<!--<rect id="test_rect1" x="{151-28}" y="0" width="1" height="240" opacity="1" fill="#00ff00" />
<rect id="test_rect1" x="{237-28}" y="0" width="1" height="240" opacity="1" fill="#00ff00" />
<rect id="test_rect1" x="{160-28}" y="0" width="1" height="240" opacity="1" fill="#00ff00" />
<rect id="test_rect1" x="{246-28}" y="0" width="1" height="240" opacity="1" fill="#00ff00" />
<rect id="test_rect8" x="0" y="{161-40}" width="320" height="1" opacity="1" fill="#00ff00" />
<rect id="test_rect1" x="{109-28}" y="0" width="1" height="240" opacity="1" fill="#00ff00" />-->
</dockLayout>
</dockLayout>
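Can you provide a test case of a string for which the first match is "dog slow"? By the way, there is an imprecision in the RE, though I don't know whether it matters for performance: it matches any single character after the leading {@csm, not just a dot. A better expression (possibly faster too, since it doesn't make any dot "optional") might be: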
r'\{@csm((?:\.\w+)+)\}'
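As a quick check, the pattern above can be run against the three strings from the question. This is only a sketch, and the names `PATTERN`, `samples` and `results` are made up for the example. Because every repetition has to start with a literal dot, there is only one way to split a dotted name into pieces, so a failed match should give up quickly instead of retrying many different splits.

```python
import re

# Pattern suggested above: each repetition must start with a literal dot.
PATTERN = re.compile(r'\{@csm((?:\.\w+)+)\}')

samples = [
    "{@csm.foo.bar}",                  # should match
    "{@csm.foo.bar-@csm.ooga.booga}",  # should not match
    "{@csm.foo.bar-42}",               # should not match
]

# Map each sample string to True/False depending on whether it matches.
results = {s: PATTERN.search(s) is not None for s in samples}
for s, matched in results.items():
    print(s, "->", "match" if matched else "no match")
```

Two differences from the original pattern are worth keeping in mind: group 1 now includes the leading dot (`.foo.bar` rather than `foo.bar`), and in Python 3 `\w` also accepts non-ASCII letters and digits unless `re.ASCII` is passed.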
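I'm not a regex expert, but this may be caused by the brace at the end of the match. You could try matching r"\{@csm.((?:[a-zA-Z0-9_]+\.?)+)"
and then simply check by hand whether the closing brace is present at the end.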
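A minimal sketch of that idea, assuming the brace-less pattern from the question is used as-is; `PREFIX` and `find_braced` are hypothetical names chosen for the example. The regex finds each candidate, and plain string code then checks that the very next character is the closing brace.

```python
import re

# Same pattern as in the question, minus the trailing \} (the fast variant).
PREFIX = re.compile(r"\{@csm.((?:[a-zA-Z0-9_]+\.?)+)")

def find_braced(text):
    """Return the captured names whose match is immediately followed by '}'."""
    found = []
    for m in PREFIX.finditer(text):
        # Manual check for the closing brace instead of putting \} in the regex.
        if text.startswith("}", m.end()):
            found.append(m.group(1))
    return found

print(find_braced('width="{@csm.screenWidth}" height="{@csm.a.b-@csm.c}"'))
```

Since the repetition is greedy, the match always extends to the end of the dotted name, so checking the single character after it should accept the same strings as the original pattern with `\}`.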
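You may need to give a better example of the slowness. For a reasonably long string containing both matching and non-matching items:
import re
# Build a long input that alternates non-matching and matching items.
x = "".join('{@csm.foo.bar-%d}\n{@csm.foo.%dx.baz}\n' % (a, a)
            for a in range(10000))
mymatch = r"\{@csm.((?:[a-zA-Z0-9_]+\.?)+)\}"
for y in re.finditer(mymatch, x):
    print(y.group(0))
This works fine, but if the strings are long enough and the search behaves badly on them, you may run into problems.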
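One kind of input that does seem to hit the bad case is a long name followed by something other than `}`, as in the `{@csm.a-@csm.b}` style expressions of the sample input. With `(?:[a-zA-Z0-9_]+\.?)+`, a run of n word characters can be divided between the inner and outer `+` in 2^(n-1) ways, and a backtracking engine such as Python's `re` may try all of them before reporting failure. The sketch below illustrates this under that assumption; it is not a benchmark, `time_search` and the test string are invented for the example, and the timings will vary by machine.

```python
import re
import time

ORIGINAL = re.compile(r"\{@csm.((?:[a-zA-Z0-9_]+\.?)+)\}")
DOT_FIRST = re.compile(r"\{@csm((?:\.\w+)+)\}")  # variant with a mandatory leading dot

def time_search(pattern, text):
    """Run pattern.search(text) once; return (match_or_None, elapsed_seconds)."""
    start = time.perf_counter()
    match = pattern.search(text)
    return match, time.perf_counter() - start

# 18 word characters followed by '-' instead of '}': no match is possible,
# so the engine has to exhaust its alternatives before reporting failure.
text = "{@csm." + "a" * 18 + "-1}"

for name, pattern in (("original", ORIGINAL), ("dot-first", DOT_FIRST)):
    match, seconds = time_search(pattern, text)
    print("%-9s match=%r  %.6f s" % (name, match, seconds))
```

Lengthening the run of word characters by one should roughly double the time taken by the original pattern, while the dot-first variant should stay fast.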