Python findall() start digit and end word

Question

I have this string

procesor = "2x2.73 GHz Mongoose M5 & 2x2.50 GHz Cortex-A76 & 4x2.0 GHz Cortex-A55"

and I need to extract this list of CPU cores using re.findall():

Out: ['2x2.73 GHz', '2x2.50 GHz', '4x2.0 GHz']

Please help me. I'm stuck here:

re.findall(r'(\d+[A-Za-z])', procesor)
Out[1]: ['2x', '2x', '4x']

Answer 1

Use

re.findall(r'\d+x\d+(?:\.\d+)?\s*GHz', procesor)

See regex proof.

Explanation

--------------------------------------------------------------------------------
  \d+                      digits (0-9) (1 or more times (matching
                           the most amount possible))
--------------------------------------------------------------------------------
  x                        'x'
--------------------------------------------------------------------------------
  \d+                      digits (0-9) (1 or more times (matching
                           the most amount possible))
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (optional
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    \.                       '.'
--------------------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times (matching
                             the most amount possible))
--------------------------------------------------------------------------------
  )?                       end of grouping
--------------------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  GHz                      'GHz'

If you need it case insensitive:

re.findall(r'\d+x\d+(?:\.\d+)?\s*GHz', procesor, re.I)
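Put together as a runnable script, this might look like the sketch below. The names core_pattern, cores and lower_cores, and the lowercase sample string, are made up for illustration; the pattern itself is the one given above.

```python
import re

# Variable name spelled as in the question.
procesor = "2x2.73 GHz Mongoose M5 & 2x2.50 GHz Cortex-A76 & 4x2.0 GHz Cortex-A55"

# The pattern has no capturing groups (only a non-capturing (?:...) group),
# so findall() returns each full match as a string.
core_pattern = re.compile(r'\d+x\d+(?:\.\d+)?\s*GHz', re.I)

cores = core_pattern.findall(procesor)
print(cores)

# Hypothetical lowercase input, to show the effect of re.I.
lower_cores = core_pattern.findall("8x1.8 ghz Cortex-A53")
print(lower_cores)
```

This also explains the result in the question: when a pattern contains one capturing group, findall() returns only what that group matched, which is why the original attempt produced '2x' rather than the whole token.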

Answer 2

In a more human-readable format, where [0-9] represents one digit and \. a literal dot:

import re

processor = "2x2.73 GHz Mongoose M5 & 2x2.50 GHz Cortex-A76 & 4x2.0 GHz Cortex-A55"
re.findall(r'[0-9]+x[0-9]+\.[0-9]* GHz', processor)

Returns:

['2x2.73 GHz', '2x2.50 GHz', '4x2.0 GHz']
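One detail worth checking in a pattern of this shape is the dot. Unescaped, . matches any character, while \. matches only a literal dot. A small sketch of the difference, using a made-up malformed input (a comma instead of a decimal point) and the hypothetical names loose and strict:

```python
import re

loose = re.compile(r'[0-9]+x[0-9]+.[0-9]* GHz')    # '.' matches any character
strict = re.compile(r'[0-9]+x[0-9]+\.[0-9]* GHz')  # '\.' matches only a literal dot

# Hypothetical malformed input: a comma where the decimal point should be.
sample = "2x2,73 GHz Mongoose M5"

loose_hits = loose.findall(sample)
strict_hits = strict.findall(sample)
print(loose_hits)   # the comma is accepted by '.'
print(strict_hits)  # nothing matches, because there is no literal dot
```

Both variants give the same result on the string from the question, so the difference only shows up on input that deviates from the expected format.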

Answer 3

This regex pattern can help you: ([\d.]+)\s?[xX]\s?([\d.]+)\s?GHz, or the case-insensitive version: (?i)([\d.]+)\s?x\s?([\d.]+)\s?GHz

See the sample on regex101.

Add this to your Python source:

import re

processor = """2x2.73 GHz Mongoose M5 & 2x2.50 GHz Cortex-A76 & 4x2.0 GHz Cortex-A55"""
# Two capturing groups, so findall() returns a list of (count, frequency) tuples.
CPU_Cores = re.findall(r"([\d.]+)\s?[xX]\s?([\d.]+)\s?GHz", processor)
print(CPU_Cores)

Output

[('2', '2.73'), ('2', '2.50'), ('4', '2.0')]

Explanation

([\d.]+)\s?[xX]\s?([\d.]+)\s?GHz

The first group ([\d.]+) matches the first number (the core count).
\s?[xX]\s? matches x or X, optionally with a single whitespace character on either side.
The second group ([\d.]+) matches the second number (the clock frequency).
\s? is optional: it matches one whitespace character or nothing.
GHz matches the literal text GHz.
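Because the two numbers come back as separate strings, they can be converted and processed further. The sketch below is one possible follow-up, not part of the original answer; the names pairs, clusters, total_cores and labels are made up, and it assumes the first number is always a whole core count.

```python
import re

processor = "2x2.73 GHz Mongoose M5 & 2x2.50 GHz Cortex-A76 & 4x2.0 GHz Cortex-A55"

# Each match is a (count, frequency) tuple of strings.
pairs = re.findall(r"([\d.]+)\s?[xX]\s?([\d.]+)\s?GHz", processor)

# Assumption: the count is a whole number and the frequency a valid decimal.
# [\d.]+ would also accept text such as "1.2.3", which int()/float() reject.
clusters = [(int(count), float(ghz)) for count, ghz in pairs]
total_cores = sum(count for count, _ in clusters)

# Rebuild the strings the question asked for from the captured parts.
labels = [f"{count}x{ghz} GHz" for count, ghz in pairs]

print(clusters)
print(total_cores)
print(labels)
```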
