I have a requirement: I need to extract substrings from a string using regex.
For example, here is my sample data:
Hello, "How" are "you" What "are" you "doing?"
From this example data, I need to extract only the second and fourth occurrences of double-quoted data.
My expected output is: you doing?
I tried the regex below, but I am unable to extract what I need.
"(.*?)"
We can use re.findall and then slice the result to get the second and fourth matches:
import re
string = 'Hello, "How" are "you" What "are" you "doing?"'
result = re.findall('".+?"', string)[1::2]
print(result)
Here, the regex matches one or more characters contained within double quote marks, but tries to match as few as possible (a non-greedy match). Otherwise we would end up with one single match: "How" are "you" What "are" you "doing?".
Output:
['"you"', '"doing?"']
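To see the difference between the greedy and non-greedy forms described above, here is a small sketch that runs both patterns on the sample string. The variable names (text, greedy, lazy) are only for illustration.

```python
import re

text = 'Hello, "How" are "you" What "are" you "doing?"'

# Greedy: .+ runs on to the last quote mark, so everything is one match.
greedy = re.findall(r'".+"', text)

# Non-greedy: .+? stops at the nearest closing quote mark.
lazy = re.findall(r'".+?"', text)

print(len(greedy))  # 1
print(len(lazy))    # 4
```

The greedy pattern swallows the whole quoted region in one go, which is why the ? after .+ is needed here.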
If you want to combine them without the quote marks, you can use str.strip along with str.join:
print(' '.join(match.strip('"') for match in result))
Output:
you doing?
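If you need exactly the 2nd and 4th occurrences rather than every second one, one possible sketch is a small helper that picks matches by position. The function name pick_quoted and its positions parameter are hypothetical, not part of the re module, and positions are assumed to be 1-based.

```python
import re

def pick_quoted(text, positions):
    """Return the quoted substrings at the given 1-based occurrence numbers."""
    # The capturing group drops the surrounding quote marks.
    matches = re.findall(r'"([^"]+)"', text)
    # Positions beyond the number of matches are silently skipped.
    return [matches[p - 1] for p in positions if 1 <= p <= len(matches)]

text = 'Hello, "How" are "you" What "are" you "doing?"'
print(' '.join(pick_quoted(text, [2, 4])))  # you doing?
```

Unlike the [1::2] slice, this does not also pick up a 6th or 8th quoted term if the input happens to be longer.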
An alternative method would be to just split on ":
result = string.split('"')[1::2][1::2]
print(result)
Output:
['you', 'doing?']
This works because, if you split the string on double quote marks, the output is as follows:
['Hello, ', 'How', ' are ', 'you', ' What ', 'are', ' you ', 'doing?', '']
This means that we can take every second element, starting from index 1, to get the ones that were in quotes. We can then slice the result the same way again to get the 2nd and 4th results.
Regex-only solution. It may not be exactly what you need, since it matches every second occurrence rather than just the 2nd and 4th, but it works for the example.
"[^"]+"[^"]+("[^"]+")
Demonstration in JS:
var str = 'Hello, "How" are "you" What "are" you "doing?"';
var regex = /"[^"]+"[^"]+("[^"]+")/g;
var match = regex.exec(str);
while (match != null) {
  // matched text: match[0]
  // match start: match.index
  // capturing group n: match[n]
  console.log(match[1]);
  match = regex.exec(str);
}
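Since the question is about Python, here is a rough equivalent of the same pattern using re.findall. This is just a sketch of the idea: with one capturing group, findall returns only the group, so each result is the second quoted term of a pair.

```python
import re

text = 'Hello, "How" are "you" What "are" you "doing?"'

# Skip one quoted term and the text after it, then capture the next quoted term.
pattern = r'"[^"]+"[^"]+("[^"]+")'

pairs = re.findall(pattern, text)
print(pairs)  # ['"you"', '"doing?"']
```

Note that the captured values still include the quote marks, because the group wraps them.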
We can try using re.findall to extract all quoted terms. Then, build a string using only every second entry (the 2nd and 4th) in the resulting list:
import re
text = 'Hello, "How" are "you" What "are" you "doing?"'  # avoid shadowing the built-in input()
matches = re.findall(r'"([^"]+)"', text)
matches = matches[1::2]
output = " ".join(matches)
print(output)
you doing?