Remove and replace multiple commas in a string

Question

I have this dataset:

df = pd.DataFrame({'name':{0: 'John,Smith', 1: 'Peter,Blue', 2:'Larry,One,Stacy,Orange' , 3:'Joe,Good' , 4:'Pete,High,Anne,Green'}})

yielding:

name
0   John,Smith
1   Peter,Blue
2   Larry,One,Stacy,Orange
3   Joe,Good
4   Pete,High,Anne,Green

I would like to:

remove the commas (replace each with one space)
wherever I have 2 persons in one cell, insert the "&" symbol after the first person's family name and before the second person's name.

Desired output:

name
0   John Smith
1   Peter Blue
2   Larry One & Stacy Orange
3   Joe Good
4   Pete High & Anne Green

I tried the code below, but it simply removes the commas. I could not find how to insert the "&" symbol in the same code.

df['name']= df['name'].str.replace(r',', '', regex=True)

Disclaimer: all names in this table are fictitious. No identification with actual persons (living or deceased) is intended or should be inferred.

Answer 1

I would do it the following way:

import pandas as pd
df = pd.DataFrame({'name':{0: 'John,Smith', 1: 'Peter,Blue', 2:'Larry,One,Stacy,Orange' , 3:'Joe,Good' , 4:'Pete,High,Anne,Green'}})
df['name'] = df['name'].str.replace(',',' ').str.replace(r'(\w+ \w+) ', r'\1 & ', regex=True)
print(df)

which gives the output:

                       name
0                John Smith
1                Peter Blue
2  Larry One & Stacy Orange
3                  Joe Good
4    Pete High & Anne Green

Explanation: first replace every comma with a space. Then call replace again with a regex that matches one or more word characters, a space, one or more word characters, and a trailing space. The capturing group holds everything except that trailing space, so the replacement \1 & writes the group back, followed by a space, the & character, and another space.
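The two steps in the explanation can be traced on plain strings. Below is a minimal sketch that uses Python's `re` module in place of the pandas `.str.replace` calls; the helper name `join_pairs` and the six-part sample string are hypothetical, chosen only for illustration.

```python
import re

def join_pairs(cell: str) -> str:
    """Hypothetical helper mirroring the two chained replacements."""
    # Step 1: turn every comma into a single space.
    spaced = cell.replace(",", " ")
    # Step 2: wherever "word word " is followed by more text, append "& ".
    return re.sub(r"(\w+ \w+) ", r"\1 & ", spaced)

print(join_pairs("John,Smith"))              # John Smith
print(join_pairs("Larry,One,Stacy,Orange"))  # Larry One & Stacy Orange
print(join_pairs("A,B,C,D,E,F"))             # A B & C D & E F
```

Because `\w` does not match hyphens or apostrophes, a name part such as `Mary-Jane` or `O'Neil` would shift where the pattern finds its pairs, so this approach assumes every name part consists only of word characters.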

Answer 2

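With a single regex replacement (regex=True is passed explicitly because a callable replacement requires it, and pandas 2.0 and later no longer default to it):

df['name'].str.replace(r',([^,]+)(,)?', lambda m: f" {m.group(1)}{' & ' if m.group(2) else ''}", regex=True)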

0                  John Smith
1                  Peter Blue
2    Larry One & Stacy Orange
3                    Joe Good
4      Pete High & Anne Green
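To make the role of the two groups easier to follow, here is a rough equivalent written with `re.sub` on plain strings. It is only a sketch: the names `PAIR_PATTERN`, `add_separator`, and `single_pass` are hypothetical and do not come from the answer.

```python
import re

# A comma, one name part (group 1), and optionally the comma after it (group 2).
PAIR_PATTERN = re.compile(r",([^,]+)(,)?")

def add_separator(match):
    # Group 2 is present only when another person follows in the same cell.
    tail = " & " if match.group(2) else ""
    return f" {match.group(1)}{tail}"

def single_pass(cell: str) -> str:
    """Replace commas and insert ' & ' between persons in one pass."""
    return PAIR_PATTERN.sub(add_separator, cell)

print(single_pass("Peter,Blue"))            # Peter Blue
print(single_pass("Pete,High,Anne,Green"))  # Pete High & Anne Green
```

Since the pattern keys on commas rather than on word characters, it should also cope with name parts that contain hyphens or apostrophes.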

Answer 3

This should work:

import re

def separate_names(original_str):
    spaces = re.sub(r',([^,]*(?:,|$))', r' \1', original_str)
    return spaces.replace(',', ' & ')

df['spaced'] = df.name.map(separate_names)
df

I created a function called separate_names that uses a regex to replace the odd-numbered commas (the first, third, and so on) with spaces. The remaining even-numbered commas are then replaced by " & " using the replace function. Finally, I used the map function to apply separate_names to each row.
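The intermediate state between the two steps is easy to inspect on a single string. The snippet below is a small sketch using one of the sample values from the question; the variable names `cell`, `step1`, and `step2` are illustrative only.

```python
import re

cell = "Larry,One,Stacy,Orange"

# Step 1: each match starts at an odd-numbered comma and also consumes the
# comma after it (or the end of the string), so only odd commas become spaces.
step1 = re.sub(r",([^,]*(?:,|$))", r" \1", cell)
print(step1)  # Larry One,Stacy Orange

# Step 2: the commas that survived are the even-numbered ones.
step2 = step1.replace(",", " & ")
print(step2)  # Larry One & Stacy Orange
```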

Answer 4

In the replace statement, you should replace the comma with a space. Put a space between the quotes '' so that you have ' '.

df['name']= df['name'].str.replace(r',', ' ', regex=True)
                           inserted space ^ here
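Replacing commas with spaces covers only the first of the two requirements in the question. As a quick check, here is a sketch that applies the same substitution with `re.sub` to one sample value; the variable names are illustrative.

```python
import re

cell = "Larry,One,Stacy,Orange"

# Same substitution as above, applied to a single string.
spaced = re.sub(r",", " ", cell)
print(spaced)  # Larry One Stacy Orange
```

The "&" between the two persons is still missing, so a second step such as the ones in the other answers would be needed.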

Answer 1: score 3, accepted, 2023-01-31 13:21:19
Answer 2: score 3, 2023-01-31 13:45:12
Answer 3: score 2, 2023-01-31 13:46:04
Answer 4: score -2, 2023-01-31 13:15:39
