
How can I remove all characters after the second occurrence of a ' ' (space)?

My name regex has proven faulty on a couple of entries:

find_name = re.search(r'^[^\d]*', clean_content)

On a few entries, the above outputs something like this:

TERRI BROWSING APT A # current output

So I need a way to trim that out; it's tripping up the rest of my program. The only identifier I can think of is to somehow detect the second space and remove all characters after it.

I only need the first and last name, i.e.

TERRI BROWSING # desired

After removing those characters I could just .strip() the trailing space; I just need a way to remove everything after the second space, or maybe match only the first two words and nothing more.

Maybe you don't need a regex and can use simple splits and joins:

text = 'TERRI BROWSING APT A'
' '.join(text.split(' ')[0:2])
# 'TERRI BROWSING'
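
A minimal sketch of the same idea wrapped in a helper (the name first_two_words is hypothetical, not from the original answer). It calls str.split() with no separator, on the assumption that runs of spaces or tabs should count as a single delimiter; split(' ') would instead produce empty strings between consecutive spaces.

```python
def first_two_words(text):
    """Return the first two whitespace-separated words of text."""
    # split() with sep=None splits on any run of whitespace and ignores
    # leading/trailing whitespace; maxsplit=2 stops after two splits.
    words = text.split(None, 2)
    return ' '.join(words[:2])


print(first_two_words('TERRI BROWSING APT A'))   # TERRI BROWSING
print(first_two_words('TERRI  BROWSING APT A'))  # TERRI BROWSING
print(first_two_words('TERRI'))                  # TERRI
```

Inputs with fewer than two words are returned as they are (minus surrounding whitespace), which may or may not be what the rest of the program expects.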

You can match the first two words with this regex:

^\S+\s+\S+
  • ^ matches the start of the string

  • \S+ matches one or more non-whitespace characters

  • \s+ matches one or more whitespace characters


Also, assuming the whitespace is actually a space character, you can find the index of the second space using str.find and slice the string up to that point:

text[:text.find(' ', text.find(' ') + 1)]

Example: 例:

In [326]: text = 'TERRI BROWSING APT A'

In [327]: re.search(r'^\S+\s+\S+', text).group()
Out[327]: 'TERRI BROWSING'

In [338]: text[:text.find(' ', text.find(' ') + 1)]
Out[338]: 'TERRI BROWSING'
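
One caveat with the slicing approach: str.find returns -1 when nothing is found, so on a string with fewer than two spaces the one-liner slices off the last character instead. A hedged sketch of a guarded version follows; the function name cut_after_second_space is made up for illustration.

```python
def cut_after_second_space(text):
    """Return text up to (not including) the second space, if there is one."""
    first = text.find(' ')
    if first == -1:
        return text          # no space at all: nothing to cut
    second = text.find(' ', first + 1)
    if second == -1:
        return text          # only one space: already at most two words
    return text[:second]


print(cut_after_second_space('TERRI BROWSING APT A'))  # TERRI BROWSING
print(cut_after_second_space('TERRI BROWSING'))        # TERRI BROWSING

# The unguarded one-liner drops the final character in that second case:
name = 'TERRI BROWSING'
print(name[:name.find(' ', name.find(' ') + 1)])       # TERRI BROWSIN
```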

If you want to remove the rest, you could match a run of non-whitespace characters \S* followed by a space, twice, and capture that in a group. Then match any character zero or more times and replace the whole match with the first capturing group using re.sub:

^(\S* \S* ).*


import re

print(re.sub(r"^(\S* \S* ).*", r"\1", "TERRI BROWSING APT A"))

Result

TERRI BROWSING
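
Note that the capturing group in the pattern above includes the second space, so the result still carries a trailing space that needs a .strip(), as the question anticipated. As a sketch, a slightly different pattern (my variation, not the answerer's) leaves the space outside the group:

```python
import re

text = "TERRI BROWSING APT A"

# Pattern from the answer: the group ends with a space, so the result does too.
with_space = re.sub(r"^(\S* \S* ).*", r"\1", text)
print(repr(with_space))          # 'TERRI BROWSING '
print(repr(with_space.strip()))  # 'TERRI BROWSING'

# Variation: capture only the two words and let .* consume the rest.
trimmed = re.sub(r"^(\S+ \S+).*", r"\1", text)
print(repr(trimmed))             # 'TERRI BROWSING'
```

If the pattern does not match (for example, a single word with no space), re.sub returns the input unchanged.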
