
How to extract part of a string using a regular expression?

I have a data string from which I need to extract the value "45dB typical, 48 dB max" from the line "LpA: 45dB typical, 48 dB max". I tried the code below but got a different string.

I tried to solve the problem using the regular expression pattern '(.*)LpA:(\w*)\n':


import re

data_str="""With AC power supply (with 24 PoE+ ports loaded for C9300 SKUs)
●  LpA: 45dB typical, 48 dB max
●  LwA: 5.6B typical, 5.9B max
With AC power supply (with half the number of PoE+ ports loaded for C9300L SKUs)
●  LpA: 44dB typical, 47 dB max
●  LwA: 5.5B typical, 5.8B max
Typical: Noise emission for a typical configuration
Maximum: Statistical maximum to account for variation in production"""

pattern_type=re.compile('(.*)LpA:(\w*)\n',re.I)
key = pattern_type.sub(r"\2","%r"%data_str)
print(key)


I expect:
'''45dB typical, 48 dB max'''
but the output I get is:
'''45dB typical, 48 dB max ● LwA: 5.6B typical, 5.9B max With AC power supply (with half the number of PoE+ ports loaded for C9300L SKUs) ● LpA: 44dB typical, 47 dB max ● LwA: 5.5B typical, 5.8B max Typical: Noise emission for a typical configuration Maximum: Statistical maximum to account for variation in production'''

It seems you are trying to match the entire string and then substitute it with one of the matching groups. Instead, just use re.search to get that one matching group. Also, you probably want to use . instead of \w, as the substring contains spaces and other non-word characters.

>>> pattern_type = re.compile('LpA: (.*)')
>>> key = pattern_type.search(data_str)
>>> key.group(1)
'45dB typical, 48 dB max'
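
To see why \w did not capture the value, here is a minimal sketch comparing the two character classes. The shortened `sample` string is an assumption made for illustration; it is not the full data from the question.

```python
import re

# Shortened sample, assumed for illustration only.
sample = "●  LpA: 45dB typical, 48 dB max\n●  LwA: 5.6B typical, 5.9B max"

# \w* stops at the first non-word character, and the space right after
# "LpA:" is a non-word character, so the group captures an empty string.
word_only = re.search(r"LpA:(\w*)", sample).group(1)

# .* matches any character except a newline, so it captures the rest of the line.
rest_of_line = re.search(r"LpA: (.*)", sample).group(1)

print(repr(word_only))
print(repr(rest_of_line))
```

Note that re.search returns only the first match; the second LpA line in the question's data would need findall or finditer.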

Just use a positive lookbehind:

(?<=LpA: ).+$


Explanation:

(?<=LpA: )   Assert that the match is preceded by "LpA: ", without including it in the match
.+           Match one or more characters (any character except a newline)
$            Up to the end of the line (requires the re.M flag)

Code snippet:

import re

regex = re.compile(r"(?<=LpA: ).+$", re.M)
for match in regex.findall(data_str):  # data_str is the string from the question
    print(match)
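
The re.M flag matters here. As a rough sketch (the three-line `sample` is an assumed stand-in for the question's data), the same pattern finds nothing without the flag, because $ then matches only at the end of the whole string:

```python
import re

# Assumed sample; the LpA line is deliberately not the last line.
sample = (
    "●  LpA: 45dB typical, 48 dB max\n"
    "●  LwA: 5.6B typical, 5.9B max\n"
    "Typical: Noise emission for a typical configuration"
)

pattern = r"(?<=LpA: ).+$"

# Without re.M, $ matches only at the very end of the string, and .+ cannot
# cross a newline, so an LpA line in the middle of the text cannot match.
without_flag = re.findall(pattern, sample)

# With re.M, $ also matches just before each newline.
with_flag = re.findall(pattern, sample, re.M)

print(without_flag)
print(with_flag)
```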

This should work:

res = re.search('LpA:(.*)\n', data_str)
if res:  # res is None when nothing matched
    key = res.group(1).strip()
    print(key)
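
One caveat with patterns that end in \n: they need a newline after the value, so they miss an LpA line that is the last line of the text. A small sketch of the difference, using an assumed one-line input:

```python
import re

# Assumed input: the LpA line is the last line and has no trailing newline.
last_line = "●  LpA: 45dB typical, 48 dB max"

# Requires a literal newline after the value, so nothing matches here.
needs_newline = re.search(r"LpA:(.*)\n", last_line)

# $ with re.M matches at the end of every line, including the last one.
line_end = re.search(r"LpA:(.*)$", last_line, re.M)

print(needs_newline)
print(line_end.group(1).strip())
```

For the data in the question this does not matter, since both LpA lines are followed by further lines.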

The code below works. I have also included comments explaining the regex pattern used.

import re

data_str="""With AC power supply (with 24 PoE+ ports loaded for C9300 SKUs)
●  LpA: 45dB typical, 48 dB max
●  LwA: 5.6B typical, 5.9B max
With AC power supply (with half the number of PoE+ ports loaded for C9300L SKUs)
●  LpA: 44dB typical, 47 dB max
●  LwA: 5.5B typical, 5.8B max
Typical: Noise emission for a typical configuration
Maximum: Statistical maximum to account for variation in production"""


# LpA:\s+([^\n]+)\n
# 
# Options: Case insensitive; Exact spacing; Dot doesn’t match line breaks; ^$ don’t match at line breaks; Regex syntax only
# 
# Match the character string “LpA:” literally (case insensitive) «LpA:»
# Match a single character that is a “whitespace character” (any Unicode separator, tab, line feed, carriage return, vertical tab, form feed, next line) «\s+»
#    Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
# Match the regex below and capture its match into backreference number 1 «([^\n]+)»
#    Match any character that is NOT the line feed character «[^\n]+»
#       Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
# Match the line feed character «\n»
regex = re.compile(r"LpA:\s+([^\n]+)\n", re.I)

for match in regex.findall(data_str):
    print(match)

The output I get is the following:

45dB typical, 48 dB max
44dB typical, 47 dB max
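
If the same extraction is needed for other keys such as LwA, the pattern could be wrapped in a small function. This is only a sketch: `extract_values` is a hypothetical helper name, not something from the answers above, and the shortened `sample` is assumed for illustration.

```python
import re


def extract_values(text, key):
    """Return the rest of every line that follows '<key>:' (hypothetical helper)."""
    # re.escape guards against keys that contain regex metacharacters.
    # [ \t]* skips spaces or tabs after the colon without crossing a newline.
    pattern = re.compile(rf"{re.escape(key)}:[ \t]*(.+)$", re.I | re.M)
    return [value.strip() for value in pattern.findall(text)]


# Shortened sample, assumed for illustration only.
sample = (
    "●  LpA: 45dB typical, 48 dB max\n"
    "●  LwA: 5.6B typical, 5.9B max\n"
    "●  LpA: 44dB typical, 47 dB max"
)

lpa = extract_values(sample, "LpA")
lwa = extract_values(sample, "LwA")
print(lpa)
print(lwa)
```

Using $ with re.M instead of a trailing \n means a value on the final line is picked up as well.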
