I have a string
<img width="24" src="https://someurl.com" height="24" alt="FirstName LastName" id="ember44" class="global-nav__me-photo ember-view"> id="ember44" class="global-nav__me-photo ember-view">
Using a regex, I need to select everything except
alt="FirstName LastName"
I tried an expression like
alt.+(?!alt)
but it does not do what I want. Thank you in advance!
Instead of trying to match everything that isn't your "anti-search" string, how about replacing that string with nothing?
import re

s = """<img width="24" src="https://someurl.com" height="24" alt="FirstName LastName" id="ember44" class="global-nav__me-photo ember-view"> id="ember44" class="global-nav__me-photo ember-view">"""
s_new = re.sub(r'alt=\"[^\"]+\"\s+', '', s)
# '<img width="24" src="https://someurl.com" height="24" id="ember44" class="global-nav__me-photo ember-view"> id="ember44" class="global-nav__me-photo ember-view">'
Explanation:
alt=\"[^\"]+\"\s+
-----------------
alt=\" \" : Literal alt= and the opening quote, then the closing quote after the value
[^\"]+ : One or more non-quote characters (the attribute value)
\s+ : One or more whitespace characters after the closing quote
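One thing worth noting about the trailing \s+: it requires at least one whitespace character after the closing quote, so the pattern will not match when alt is the last attribute before the >. A minimal sketch, assuming a made-up tag (not the one from the question) and a relaxed variant with \s* that is my own assumption rather than part of the answer:

```python
import re

# Hypothetical tag where alt is the last attribute (not from the question).
tag = '<img src="x" alt="A B">'

# Pattern from the answer: needs whitespace after the closing quote.
strict = re.sub(r'alt="[^"]+"\s+', '', tag)

# Relaxed variant (an assumption): trailing whitespace is optional.
relaxed = re.sub(r'alt="[^"]+"\s*', '', tag)

print(strict)   # unchanged, because ">" follows the closing quote
print(relaxed)  # <img src="x" >
```

For the string in the question this makes no difference, since alt is followed by a space there.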
How about this, then:
<(.*)(?=alt=\"[^\"]+\")(?:alt=\"[^\"]+\")([^>]+)>
You can test it here: https://regex101.com/r/P0DLw5/1
It basically says: capture everything up to alt="...", match alt="..." without capturing it, then capture everything after it up to the closing >.
This is not perfect by any means, but I'm going by your current example.
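In Python, the two capture groups can be joined back together to get the tag content without the alt attribute. This is only a sketch based on the string from the question; the variable names are mine, and the backslashes before the quotes are dropped because they are not needed in a raw string:

```python
import re

s = '<img width="24" src="https://someurl.com" height="24" alt="FirstName LastName" id="ember44" class="global-nav__me-photo ember-view"> id="ember44" class="global-nav__me-photo ember-view">'

# Group 1: everything between "<" and alt="...".
# Group 2: everything after alt="..." up to the first ">".
pattern = r'<(.*)(?=alt="[^"]+")(?:alt="[^"]+")([^>]+)>'

m = re.search(pattern, s)
if m:
    # Join the two captured parts; the alt attribute itself is not captured.
    kept = m.group(1) + m.group(2)
    print(kept)
```

Note that the match stops at the first >, so the text after the tag is not part of either group, and the space on each side of the removed attribute is kept, which leaves a double space in the joined result.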
Well, in order to invert a regex match, you could use re.sub(),
which takes three required arguments: a pattern, a replacement, and the original string.
So you could invert it like this:
import re
s = '<img width="24" src="https://someurl.com" height="24" alt="FirstName LastName" id="ember44" class="global-nav__me-photo ember-view"> id="ember44" class="global-nav__me-photo ember-view">'
pattern = r'alt=".*?"'
without_alt = re.sub(pattern, '', s)
print(without_alt)
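Be aware that removing only alt="..." leaves the spaces on both sides of it, so the result contains a double space. If that matters, one possible tweak (my assumption, not part of the answer above) is to also consume the whitespace in front of the attribute:

```python
import re

s = '<img width="24" src="https://someurl.com" height="24" alt="FirstName LastName" id="ember44" class="global-nav__me-photo ember-view"> id="ember44" class="global-nav__me-photo ember-view">'

# Pattern from the answer: removes the attribute but keeps both surrounding spaces.
loose = re.sub(r'alt=".*?"', '', s)

# Variant (an assumption): also remove the whitespace before the attribute.
tight = re.sub(r'\s+alt="[^"]*"', '', s)

print('height="24"  id=' in loose)  # True: double space left behind
print('height="24" id=' in tight)   # True: single space
```

Requiring \s+ in front also avoids matching inside a longer attribute name such as data-alt.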