
Unable to find a pattern in a string using regular expression

I am trying to extract all valid hexadecimal values that represent colors in CSS code.

Specifications of HEX Color Code

  1. It must start with a '#' symbol.
  2. It can have 3 or 6 digits.
  3. Each digit is a hexadecimal digit: 0-9, A-F, or a-f.

Here is the sample input

#BED
{
    color: #FfFdF8; background-color:#aef;
    font-size: 123px;
    background: -webkit-linear-gradient(top, #f9f9f9, #fff);
}
#Cab
{
    background-color: #ABC;
    border: 2px dashed #fff;
}

Sample output

#FfFdF8
#aef
#f9f9f9
#fff
#ABC
#fff

Explanation

#BED and #Cab satisfy the Hex Color Code criteria, but they are used as selectors and not as color codes in the given CSS. So the actual color codes are

#FfFdF8
#aef
#f9f9f9
#fff
#ABC
#fff

What I tried in Python

import re
pattern = r'^#([A-Fa-f0-9]{3}){1,2}$'
n = int(input())
hexNum = []
for _ in range(n):
   s = input()
   if ':' in s and '#' in s:
       result = re.findall(pattern,s)
       if result:
           hexNum.extend(result)
for num in hexNum:
    print(num)

When I run the above code on the sample input, it prints nothing. What am I doing wrong? Is it the matching pattern, or the logic I'm applying?

Could somebody please explain?

Get rid of the anchors ^ and $, since they force the pattern to match the entire input line.

Get rid of the capture groups, so that re.findall() will just return whole matches, not the group matches. Use (?:...) to create a non-capturing group so you can use the {1,2} quantifier.

pattern = r'#(?:[A-Fa-f0-9]{3}){1,2}'
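To see the effect of each change, here is a small sketch that runs three variants of the pattern against one line of the sample input. The variable names (`anchored`, `capturing`, `non_capturing`) are made up for this illustration.

```python
import re

line = "    color: #FfFdF8; background-color:#aef;"

# Original pattern: ^ and $ require the whole line to be one color code.
anchored = r'^#([A-Fa-f0-9]{3}){1,2}$'
# Anchors removed, capturing group kept: findall returns only the group,
# and a repeated group keeps just its last repetition.
capturing = r'#([A-Fa-f0-9]{3}){1,2}'
# Anchors removed, group made non-capturing: findall returns whole matches.
non_capturing = r'#(?:[A-Fa-f0-9]{3}){1,2}'

print(re.findall(anchored, line))       # []
print(re.findall(capturing, line))      # ['dF8', 'aef']
print(re.findall(non_capturing, line))  # ['#FfFdF8', '#aef']
```

The middle result shows why the capture group has to go: with a group in the pattern, re.findall() reports the group contents rather than the full match.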

You have a two- or three-part problem:

  1. Remove CSS comments, which often contain code-like text (optional, but recommended)
    • Regex matching comments is /\*.*?\*/ (with the re.S flag, so that . also matches newlines)
  2. Only look inside curly braces (e.g. not at selectors)
    • Regex matching curly braces is \{.*?\} (again with re.S)
  3. Find color codes
    • Regex for color codes is #(?:[A-Fa-f0-9]{3}){1,2}

Bringing it all together:

import re
def color_codes(css_text):
    codes = []
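    # remove comments; flags must be passed by keyword here, because the
    # fourth positional argument of re.sub() is count, not flags
    css_text = re.sub(r'/\*.*?\*/', '', css_text, flags=re.S)
    # consider only {} blocks
    for block in re.finditer(r'\{.*?\}', css_text, flags=re.S):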
        # find color codes
        codes.extend(re.findall(r'#(?:[A-Fa-f0-9]{3}){1,2}', block.group(0)))
    return codes
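As a quick check, the same three steps can be run against the sample input from the question. This is only a sketch: `extract_colors` and `SAMPLE_CSS` are names invented here, and the flags are passed by keyword so that `re.S` is not mistaken for the `count` argument of `re.sub()`.

```python
import re

# Sample input from the question, as one multi-line string.
SAMPLE_CSS = """\
#BED
{
    color: #FfFdF8; background-color:#aef;
    font-size: 123px;
    background: -webkit-linear-gradient(top, #f9f9f9, #fff);
}
#Cab
{
    background-color: #ABC;
    border: 2px dashed #fff;
}
"""

def extract_colors(css_text):
    """Hypothetical helper: strip comments, scan {} blocks, collect codes."""
    css_text = re.sub(r'/\*.*?\*/', '', css_text, flags=re.S)
    codes = []
    for block in re.finditer(r'\{.*?\}', css_text, flags=re.S):
        codes.extend(re.findall(r'#(?:[A-Fa-f0-9]{3}){1,2}', block.group(0)))
    return codes

for code in extract_colors(SAMPLE_CSS):
    print(code)
```

On the sample this prints the six expected codes and skips the selectors #BED and #Cab, because they sit outside the curly braces.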

Note: This is probably not a fool-proof solution. For that, you'd want to switch from simple regex to a full parser. But it's close enough if you just need something quick and don't mind some edge cases.
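Two such edge cases, sketched with made-up inputs: nested braces (as in an @media block), and a run of hex digits that is longer than three but not exactly six. The `STRICT` variant with a negative lookahead is one possible tightening under the question's "3 or 6 digits" rule, not part of the solution above.

```python
import re

COLOR = r'#(?:[A-Fa-f0-9]{3}){1,2}'
BLOCK = r'\{.*?\}'

# Edge case 1: nested braces. The non-greedy block regex stops at the first
# '}', so the selector #BED inside @media is scanned like a declaration.
nested = "@media print { #BED { color: #fff; } }"
blocks = [m.group(0) for m in re.finditer(BLOCK, nested, flags=re.S)]
print(blocks)                        # ['{ #BED { color: #fff; }']
print(re.findall(COLOR, blocks[0]))  # ['#BED', '#fff']

# Edge case 2: a 4-digit run is reported as a 3-digit code.
print(re.findall(COLOR, "{ color: #abcd; }"))   # ['#abc']

# Assumption: rejecting a match that is followed by another hex digit.
STRICT = COLOR + r'(?![A-Fa-f0-9])'
print(re.findall(STRICT, "{ color: #abcd; }"))  # []
print(re.findall(STRICT, "{ color: #fff; }"))   # ['#fff']
```

The lookahead handles the second case cheaply; the first one is the kind of problem where a real CSS parser becomes the better tool.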

