
Python regex vs Regex101

Input string:

I0419 01:52:16.606123 136 TrainerInternal.cpp:181] Pass=15 Batch=74 samples=3670 AvgCost=263.331 Eval: classification_error_evaluator=0.970178 I0419 01:52:16.815407 136 Tester.cpp:115] Test samples=458 cost=203.737 Eval: classification_error_evaluator=0.934446

Pattern:

Pass=([0-9]+).*classification_error_evaluator=(0.[0-9]+).*classification_error_evaluator=(0.[0-9]+)

Desired output:

(15, 0.970178, 0.934446)

On Regex101 (https://regex101.com/r/Hwxsib/1), the pattern seems to capture the right groups.

But in Python, the same pattern matches nothing:

import re

x = "I0419 01:52:16.606123   136 TrainerInternal.cpp:181]  Pass=15 Batch=74 samples=3670 AvgCost=263.331 Eval: classification_error_evaluator=0.970178 I0419 01:52:16.815407   136 Tester.cpp:115]  Test samples=458 cost=203.737 Eval: classification_error_evaluator=0.934446"

pattern = "Pass=([0-9]+).*classification_error_evaluator=(0\.[0-9]+).*classification_error_evaluator=(0\.[0-9]+)"

re.match(pattern, x)

What is the difference between the regex101 settings and Python's re package? Or are they the same? Do they use different flags or settings?

Why isn't the pattern matching in Python?

You probably want re.search. re.match only returns a match if the pattern matches at the beginning of your string.
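A minimal sketch of that difference, using a shortened, hypothetical log line rather than the full input from the question:

```python
import re

# Shortened, hypothetical log line: "Pass=" does not start at position 0.
line = "I0419 01:52:16 Trainer] Pass=15 Batch=74"
pattern = r"Pass=([0-9]+)"

# re.match only tries to match at the start of the string, so it fails here.
anchored = re.match(pattern, line)

# re.search scans forward and returns the first match anywhere in the string.
found = re.search(pattern, line)

print(anchored)        # None
print(found.group(1))  # 15
```

Prepending `.*` to the pattern would also let re.match succeed, but re.search is usually the simpler choice.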

regex101 also shows you the code it uses: https://regex101.com/r/Hwxsib/1/codegen?language=python

From the regex101 code, here's what it is doing (copied and edited for brevity):

import re

regex = r"..."

test_str = "..."

matches = re.finditer(regex, test_str)

...
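Filling in the question's own string and pattern, the re.finditer approach could look roughly like the sketch below. The variable name `all_groups` is only illustrative and is not part of the generated code.

```python
import re

regex = r"Pass=([0-9]+).*classification_error_evaluator=(0\.[0-9]+).*classification_error_evaluator=(0\.[0-9]+)"

test_str = "I0419 01:52:16.606123   136 TrainerInternal.cpp:181]  Pass=15 Batch=74 samples=3670 AvgCost=263.331 Eval: classification_error_evaluator=0.970178 I0419 01:52:16.815407   136 Tester.cpp:115]  Test samples=458 cost=203.737 Eval: classification_error_evaluator=0.934446"

# re.finditer, like re.search, looks for matches anywhere in the string
# and yields one match object per non-overlapping match.
all_groups = [m.groups() for m in re.finditer(regex, test_str)]

print(all_groups)  # [('15', '0.970178', '0.934446')]
```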

You want to use re.search. re.match only returns a match when the pattern matches at the start of the string:

import re

x = "I0419 01:52:16.606123   136 TrainerInternal.cpp:181]  Pass=15 Batch=74 samples=3670 AvgCost=263.331 Eval: classification_error_evaluator=0.970178 I0419 01:52:16.815407   136 Tester.cpp:115]  Test samples=458 cost=203.737 Eval: classification_error_evaluator=0.934446"

pattern = r'Pass=([0-9]+).*classification_error_evaluator=(0\.[0-9]+).*classification_error_evaluator=(0\.[0-9]+)'

print(re.search(pattern, x).groups())  # ('15', '0.970178', '0.934446')
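The groups come back as strings, so reaching the desired output (15, 0.970178, 0.934446) takes one more conversion step. A sketch, assuming the first error value belongs to the trainer line and the second to the tester line (the names `train_err` and `test_err` are hypothetical):

```python
import re

x = "I0419 01:52:16.606123   136 TrainerInternal.cpp:181]  Pass=15 Batch=74 samples=3670 AvgCost=263.331 Eval: classification_error_evaluator=0.970178 I0419 01:52:16.815407   136 Tester.cpp:115]  Test samples=458 cost=203.737 Eval: classification_error_evaluator=0.934446"

pattern = r"Pass=([0-9]+).*classification_error_evaluator=(0\.[0-9]+).*classification_error_evaluator=(0\.[0-9]+)"

m = re.search(pattern, x)
if m is None:
    # re.search returns None when nothing matches; fail loudly instead of
    # hitting an AttributeError on .groups().
    raise ValueError("pattern not found in log line")

# Convert the captured strings to numbers.
pass_no, train_err, test_err = m.groups()
result = (int(pass_no), float(train_err), float(test_err))

print(result)  # (15, 0.970178, 0.934446)
```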

