I have a set of entries for matching. One pattern, that I have already written, matches strings that consist of "-" only, eg:
----
---
--------
The regex code for matching these is
r"^[-]+$"
Now I need to match strings that contain the "-" character, but only if there are also word characters in the string, e.g.:
---vsgvrf-sgwfwrgfs---
-----hwvchwbfk
bfcbewubf------
-efefe-ege-
-gdiwen
Etc. I have tried the following patterns, but they do not work and skip some of the strings:
r"(\w[^-])+-+(\w[^-])*"
r"\w[^-]+-+\w[^-]+|-+\w[^-]+|\w[^-]+-+"
Could somebody help me here?
Use this regex (word char before or after a dash):
r'(-\w|\w-)'
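A quick way to try this pattern, as a rough sketch: since it is unanchored, it needs re.search rather than re.match, and it only confirms that some dash sits directly next to a word character. The helper name has_dash_next_to_word and the sample list are made up for illustration.

```python
import re

# Unanchored: succeeds as soon as one dash touches a word character.
DASH_NEXT_TO_WORD = re.compile(r'(-\w|\w-)')

def has_dash_next_to_word(s: str) -> bool:
    """Hypothetical helper: True if a dash is adjacent to a word character."""
    return DASH_NEXT_TO_WORD.search(s) is not None

samples = ['---vsgvrf-sgwfwrgfs---', '-gdiwen', 'bfcbewubf------', '----', 'abc']
results = {s: has_dash_next_to_word(s) for s in samples}

for s, ok in results.items():
    print(f'{s!r}: {ok}')
```

Because nothing is anchored, the rest of the string is not checked, so a string with other characters around a matching pair would also pass.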
If both a hyphen and \w are required, how about:
^(?:-+\w|\w+-)[\w-]*$
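To see how this behaves on the strings from the question, here is a small sketch. The constant name BOTH_REQUIRED and the extra negative samples are assumptions added for illustration.

```python
import re

# Requires at least one hyphen AND at least one word character;
# the whole string may contain only hyphens and word characters.
BOTH_REQUIRED = re.compile(r'^(?:-+\w|\w+-)[\w-]*$')

samples = [
    '---vsgvrf-sgwfwrgfs---',
    '-----hwvchwbfk',
    'bfcbewubf------',
    '-efefe-ege-',
    '-gdiwen',
    '----',       # hyphens only: should be rejected
    'abc',        # word characters only: should be rejected
    'No Match',   # contains a space: should be rejected
]

matched = [s for s in samples if BOTH_REQUIRED.match(s)]
print(matched)
```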
If only a \w is required, with any number of hyphens and word characters around it:
^-*\w[-\w]*$
Another demo at regex101 (to me your question reads like this would suffice)
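The practical difference of the looser pattern is that a string without any hyphen is accepted too. A minimal sketch, where WORD_REQUIRED and the sample strings are illustrative names and data:

```python
import re

# Requires at least one word character; hyphens are optional.
WORD_REQUIRED = re.compile(r'^-*\w[-\w]*$')

samples = ['-gdiwen', 'bfcbewubf------', 'abc', '----', 'No Match']
accepted = [s for s in samples if WORD_REQUIRED.match(s)]
rejected = [s for s in samples if not WORD_REQUIRED.match(s)]

print('accepted:', accepted)
print('rejected:', rejected)
```

Whether 'abc' should pass depends on how the requirement is read; if a hyphen is mandatory, the stricter pattern is the one to use.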
You can do it with ((?:\w+-+|-+\w+)+\w*-*):
import re
m = re.compile(r'((?:\w+-+|-+\w+)+\w*-*)')
lines = ['-Before', '-Before-Middle', 'Middle-After-', 'After-', '--All--Three--', 'No Match', '-', '--', '---']
for line in lines:
    # match() anchors at the start of the string only
    found = m.match(line)
    if found:
        print(found.groups())
Output:
('-Before',)
('-Before-Middle',)
('Middle-After-',)
('After-',)
('--All--Three--',)
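One thing to keep in mind, shown here as a hedged sketch: match() only anchors at the start, so a string with trailing junk can still produce a partial match. If the whole string must conform, re.fullmatch is one option. The sample string 'a-b c' is made up for illustration.

```python
import re

pattern = re.compile(r'((?:\w+-+|-+\w+)+\w*-*)')

text = 'a-b c'  # hypothetical input with a space after a valid prefix

prefix = pattern.match(text)      # anchors at the start only
whole = pattern.fullmatch(text)   # must consume the entire string

print(prefix.group(1) if prefix else None)
print(whole)

# Strings from the original list still match in full.
full_ok = [s for s in ['-Before', 'Middle-After-', '--All--Three--', '---']
           if pattern.fullmatch(s)]
print(full_ok)
```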
Another option with a negative lookahead asserting not only word characters:
^(?!\w+$)-*\w[\w-]*$
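A short sketch of how the lookahead changes the result: the body of the pattern alone would accept a plain word, and (?!\w+$) rules that case out. The name NOT_ONLY_WORD and the samples are assumptions for illustration.

```python
import re

# (?!\w+$) rejects strings made of word characters only;
# the remainder requires at least one word character.
NOT_ONLY_WORD = re.compile(r'^(?!\w+$)-*\w[\w-]*$')

samples = ['-gdiwen', 'a-b', 'bfcbewubf------', 'abc', '----']
verdicts = {s: bool(NOT_ONLY_WORD.match(s)) for s in samples}

for s, ok in verdicts.items():
    print(f'{s!r}: {ok}')
```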