简体   繁体   English

Python正则表达式为字符串中的所有单词添加一个字符,除了和

[英]Python regex to add a character to all words in a string except and

I want to be able to generate 'foos, bars and bees' from 'foo, bar and bee' using re.sub. 我希望能够使用re.sub从'foo, bar and bee'生成'foos, bars and bees'

I can't even get just adding 's' to all words to work. 我甚至不能只为所有单词添加's'来工作。 I'll work on excluding 'and' once I get that part. 一旦我得到那部分,我将努力排除'和'。 I have tried subbing \\b with "s" but that matches beginnings and endings of words. 我试过用"s"修饰\\b ,但它匹配单词的开头和结尾。 If I use '\\w*\\b' then the whole word is replaced. 如果我使用'\\w*\\b'则替换整个单词。 I am trying to figure this out using the Python docs, and (?P) or (?<=...) lookbehind assertions seem like they might be what I'm after, but I am having trouble getting those to cooperate, and the examples are limited. 我试图用Python文档来解决这个问题,并且(?P)(?<=...) lookbehind断言似乎可能是我所追求的,但是我很难让它们合作,这些例子有限。

This works, based on the replacement accepting a callable: 这是有效的,基于替换接受可调用:

re.sub('(\w+)', lambda m: m.group(1) + 's' if m.group(1) != 'and' else 'and', 'foo, bar and bee')

It was inspired by an old bug report (second to last entry). 它的灵感来自旧的错误报告 (倒数第二)。

EDIT: A shorter and probably more readable solution: 编辑:更短且可能更易读的解决方案:

re.sub('(and)|(\w+)', lambda m: m.group(1) or m.group(2) + 's', 'foo, bar and bee')

It also has the benefit of making it easier to add other words to the exception list, as isedev suggested in a comment. 它还有一个好处,就是可以更容易地将其他单词添加到例外列表中,就像评论中提出的isedev一样。

Without considering words to exclude, the following will add an 's' to the end of all words in the string: 在不考虑要排除的单词的情况下,以下内容将在字符串中的所有单词的末尾添加“s”:

re.sub('([a-zA-Z]+)','\\1s','foo, bar and bee')
-> 'foos, bars ands bees'

To pluralise words in a more generic and less error prone way, you might want to take a look at the inflect package (for English at least). 要以更通用且更不容易出错的方式复数单词,您可能需要查看inflect包(至少对于英语)。

The below code would add s to all the words except the word and , 下面的代码将增加s向所有的字,除了单词and

>>> import re
>>> s = "foo, bar and bee "
>>> m = re.sub(r'(?!and)(\b\w+\b)', r'\1s', s)
>>> m
'foos, bars and bees '

Negative lookahead asserts that it would match one or more word characters but not a \\band\\b . 否定前瞻断言它会匹配一个或多个单词字符但不匹配\\band\\b \\b here, means word boundary which matches between a word character and a non-word character. \\b这里,表示在单词字符和非单词字符之间匹配的单词边界。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

相关问题 使用正则表达式查找字符串中除单词中的所有单词以外的所有单词 - Find all the words in string except one word in python with regex Python-替换字符串中的所有单词,除了某些单词 - Python - replace all words in a string except some Python:大写字符串中除一个单词以外的单词的第一个字符 - Python : Capitalize first character of words in a string except one word 在Python中使用正则表达式从字符串中提取具有特定字符的单词列表 - Extract list of words with specific character from string using regex in Python Python正则表达式用于匹配除给定字符串以外的所有字符串 - Python regex for matching all strings except a given string Python3正则表达式:删除/和|以外的所有字符 从字符串 - Python3 Regex: Remove all characters except / and | from string Python-用正则表达式中的字符串替换字符串EXCEPT的所有实例 - Python - Replace all instances of string EXCEPT for those in regex 删除所有数字,除了使用 python regex 组合成字符串的数字 - Remove all numbers except for the ones combined to string using python regex Python正则表达式,删除除Unicode字符串的连字符以外的所有标点符号 - Python regex, remove all punctuation except hyphen for unicode string 正则表达式:匹配除字符串开头以外的字符 - Regex: match a character except at the beginning of a string
 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM