How to find words in a string containing at least one underscore and capital letters

Question

I would like to match all the words in a string containing

at least one underscore (but the word can neither start nor end with it)
at least two uppercase letters
all the letters must be uppercase.

For example (this is the best result I have got so far):

import re
test_string = "test_string TEST_STRING TEST_string _TEST_STRING_ TESTSTRING ANOTHER_TEST_STRING"
p = re.compile(r"(\S*[A-Z_]\S*[_]\S*)")
p.search(test_string)

The words I would like to obtain from the search method are:

TEST_STRING (the second word, not the substring of _TEST_STRING_)
ANOTHER_TEST_STRING

But I am obtaining

TEST_STRING
TEST_STRING (which is the substring of _TEST_STRING_).

Thank you

Answer 1

Your regex (\S*[A-Z_]\S*[_]\S*) uses \S*, which matches zero or more non-whitespace characters, so it would also match, for example, __ or A_.
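To see concretely why the original pattern is too permissive, here is a quick check. It is only a sketch, and the sample strings are chosen here for illustration. It runs the pattern against a few inputs that the requirements should reject:

```python
import re

# The pattern from the question, written as a raw string.
loose = re.compile(r"(\S*[A-Z_]\S*[_]\S*)")

# Illustrative inputs that violate the stated requirements.
samples = ["__", "A_", "_TEST_STRING_", "TEST_string"]

# fullmatch succeeds only if the whole string fits the pattern.
accepted = [s for s in samples if loose.fullmatch(s)]
print(accepted)
```

Every sample is accepted, because \S* places no restriction on which characters surround the underscore.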

You might use:

\b[A-Z]+_[A-Z_]*[A-Z]\b

Explanation

\b Word boundary
[A-Z]+ Match 1+ uppercase chars
_ Match underscore
[A-Z_]* Match 0+ times either an uppercase char or an underscore
[A-Z] Match an uppercase char
\b Word boundary
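The same breakdown can be kept next to the pattern itself by compiling it with re.VERBOSE, which ignores whitespace in the pattern and allows # comments. A minimal sketch, reusing the test string from the question:

```python
import re

pattern = re.compile(
    r"""
    \b          # word boundary
    [A-Z]+      # one or more uppercase letters
    _           # a literal underscore
    [A-Z_]*     # zero or more uppercase letters or underscores
    [A-Z]       # a final uppercase letter, so the word cannot end with an underscore
    \b          # word boundary
    """,
    re.VERBOSE,
)

test_string = "test_string TEST_STRING TEST_string _TEST_STRING_ TESTSTRING ANOTHER_TEST_STRING"
found = pattern.findall(test_string)
print(found)
```

Because the underscore counts as a word character, there is no \b between a leading underscore and the letter after it, which is why _TEST_STRING_ yields no match.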

re.search will return the first location where the regex matches. You could use findall instead:

import re
test_string = "test_string TEST_STRING TEST_string _TEST_STRING_ TESTSTRING ANOTHER_TEST_STRING"
p = re.compile(r"\b[A-Z]+_[A-Z_]*[A-Z]\b")
print(p.findall(test_string))

Result

['TEST_STRING', 'ANOTHER_TEST_STRING']
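If the position of each match is also needed, finditer is an alternative to findall. The sketch below contrasts it with search, using the same pattern and test string; the variable names are only illustrative:

```python
import re

test_string = "test_string TEST_STRING TEST_string _TEST_STRING_ TESTSTRING ANOTHER_TEST_STRING"
p = re.compile(r"\b[A-Z]+_[A-Z_]*[A-Z]\b")

# search stops at the first match and returns one Match object (or None).
first = p.search(test_string)
print(first.group())

# finditer yields one Match object per non-overlapping match,
# so both the text and its (start, end) span are available.
spans = [(m.group(), m.span()) for m in p.finditer(test_string)]
for text, span in spans:
    print(text, span)
```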


Answer 2

This should work:

import re

regex = r"\b([A-Z]+(?:_[A-Z]+){1,})\b"
test_str = "test_string TEST_STRING TEST_string _TEST_STRING_ TESTSTRING ANOTHER_TEST_STRING"
matches = re.findall(regex, test_str)  # re.MULTILINE is not needed: the pattern has no ^ or $ anchors

Output:

>>> matches
['TEST_STRING', 'ANOTHER_TEST_STRING']
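The two answers agree on the test string from the question, but they are not equivalent. A small comparison follows, with sample words chosen here for illustration. The pattern from Answer 1 accepts consecutive underscores, while this one requires at least one uppercase letter after every underscore:

```python
import re

answer1 = re.compile(r"\b[A-Z]+_[A-Z_]*[A-Z]\b")
answer2 = re.compile(r"\b([A-Z]+(?:_[A-Z]+){1,})\b")

# Map each sample word to (accepted by Answer 1, accepted by Answer 2).
results = {}
for word in ["TEST_STRING", "A_B", "TEST__STRING"]:
    results[word] = (bool(answer1.fullmatch(word)), bool(answer2.fullmatch(word)))
    print(word, results[word])
```

Which behaviour is preferable depends on whether a word such as TEST__STRING should count; the question does not say.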

solution1
3 ACCEPTED 2019-01-11 16:42:10

solution2
0 2019-01-11 16:29:14