
RegEx that matches characters after semicolon in the same line

I need some help with regular expressions. I need a regex that matches characters if they come after a semicolon AND are on the same line as the preceding word.

Let me explain that:

[image: enter image description here]

I need something like this. I have to write a function that does not allow characters to be entered after a semicolon on the same line, and I think I could do it with this sort of regex.

Thank you.

I am not sure I understood your question, but would something like this help? This regular expression

If I understand it correctly, (?<=;)[A-Za-z]+ might do the job: it matches a run of letters that immediately follows a semicolon. The Python documentation is helpful: https://docs.python.org/3/library/re.html
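As a quick check of the lookbehind suggestion, here is a minimal Python sketch. The sample strings are made up for illustration and are not taken from the question.

```python
import re

# Lookbehind pattern from the answer: letters immediately preceded by ';'.
AFTER_SEMICOLON = re.compile(r"(?<=;)[A-Za-z]+")

# Hypothetical sample lines.
samples = ["foo;", "foo;bar", "foo; bar"]

for line in samples:
    print(repr(line), AFTER_SEMICOLON.findall(line))
```

Only the second sample yields a match. The third one shows a limitation: because the pattern wants a letter directly after the semicolon, a space (or a digit) in that position hides the text that follows.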

Well, you've got two ways to do it:

  • A : Create a regular expression to validate correct input.

  • B : Create a regular expression to find incorrect input.

I would use option A, but it depends on what you need to do.

A: Regex to validate correct lines

In this case, we'll use the m modifier (m = multiline) so that the regex engine works line by line: ^ then matches the beginning of each line and $ matches the end of each line.
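To see what the m modifier changes, here is a small sketch using Python's re module, where the flag is spelled re.MULTILINE. The two-line input is a made-up example.

```python
import re

text = "first;\nsecond;"  # hypothetical two-line input

# Without re.MULTILINE, ^ matches only at the very start of the string.
print(re.findall(r"^\w+", text))

# With re.MULTILINE, ^ also matches right after each newline.
print(re.findall(r"^\w+", text, re.MULTILINE))
```

The first call finds only the word on the first line; the second finds the leading word of every line.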

Then we want to match some characters which are not the semicolon itself. To do this we use a negated character class, [^ ], meaning "anything which is not in the provided list of characters". So to say any character except the semicolon we use [^;] . Be aware that [^;] also matches line breaks; use [^;\n] if a match must never span more than one line.

There will probably be more than one such character. To allow that we can use either the * or the + quantifier, which mean "0 or more times" and "1 or more times" respectively. If the data before the semicolon is mandatory, we use +. This leads to [^;]+ to say any character which is not a semicolon, 1 or more times.
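The difference between the two quantifiers shows up on a line that contains nothing but the semicolon. A minimal Python sketch (the variable names are only illustrative):

```python
import re

# '*' accepts zero characters before the semicolon, '+' requires at least one.
optional_data = re.compile(r"[^;]*;")
mandatory_data = re.compile(r"[^;]+;")

print(bool(optional_data.fullmatch(";")))      # a bare ';' is accepted
print(bool(mandatory_data.fullmatch(";")))     # a bare ';' is rejected
print(bool(mandatory_data.fullmatch("abc;")))  # data followed by ';' is accepted
```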

Then we capture this with a () group. That gives us direct access to the value, without having to take the whole line and strip the semicolon ourselves.

After the capture group comes the semicolon, then optionally some whitespace, and then the end of the line. Whether to allow trailing whitespace is up to you; if you do, use \s*, which matches any whitespace character (space, tab, and so on) 0 or more times.

In the end we get this regex: ^([^;]+);\s*$ with the m and g flags

m is for multiline and g is for global, which means the engine does not stop at the first match but looks for all of them.
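Here is a minimal sketch of option A in Python, assuming the re module rather than the regex101 setup used in the answer. Python has no g flag; looping over the lines plays that role here. Because [^;] and \s can both match line breaks, this sketch checks each line separately instead of relying on the m flag. The function name valid_line_data and the sample text are hypothetical.

```python
import re

# The pattern from the answer. Each line is tested on its own, so the
# m (MULTILINE) flag is not needed in this sketch.
VALID_LINE = re.compile(r"^([^;]+);\s*$")

def valid_line_data(text):
    """Return the captured data of every line that ends correctly with ';'."""
    data = []
    for line in text.splitlines():
        match = VALID_LINE.match(line)
        if match:
            data.append(match.group(1))  # text before the semicolon
    return data

sample = "foo;\nbar; baz\nqux;  \nno semicolon"  # hypothetical input
print(valid_line_data(sample))  # ['foo', 'qux']
```

The second line is rejected because of the text after the semicolon, and the last one because the semicolon is missing; trailing spaces after the semicolon are tolerated.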

Test it here: https://regex101.com/r/sT59eu/1/

B: Regex to find invalid lines

Well, this could be rather easy too: ;.+$

. means any character except a line break. So, again with the m flag so that $ matches at the end of each line, this finds the lines that have something after the semicolon.

Test it here: https://regex101.com/r/ocDofm/1/

But you will NOT find lines with missing semicolons!
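Option B can be sketched in Python the same way, again assuming the re module; the sample text is invented for illustration.

```python
import re

# Pattern from the answer: a semicolon followed by at least one more
# character before the end of the line.
INVALID_LINE = re.compile(r";.+$", re.MULTILINE)

sample = "foo;\nbar; baz\nno semicolon"  # hypothetical input

for match in INVALID_LINE.finditer(sample):
    print(repr(match.group()))  # '; baz'
```

As noted above, the line without a semicolon goes unreported. Also keep in mind that trailing spaces after the semicolon count as "something" for this pattern, whereas the option A regex tolerates them.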
