Removing parentheses and dashes from phone numbers in a DataFrame, while taking care of the international prefix

Question

In a DataFrame, how can I remove the unnecessary characters (parentheses, dashes, and the international prefix) from a contact number?

df

Id Phone
1  (+1)123-456-7890
2  (123)-(456)-(7890)
3  123-456-7890

Final output

Id  Phone
1   1234567890
2   1234567890
3   1234567890

Answer 1

I would use a regex with str.replace here:

df['Phone2'] = df['Phone'].str.replace(r'^(?:\(\+\d+\))|\D', '', regex=True)

Output:

   Id               Phone      Phone2
0   1    (+1)123-456-7890  1234567890
1   2  (123)-(456)-(7890)  1234567890
2   3        123-456-7890  1234567890
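
For anyone who wants to reproduce the result, here is a minimal self-contained sketch. The frame is rebuilt from the values shown in the question; storing Phone as plain strings is an assumption about the original data.

```python
import pandas as pd

# Sample data copied from the question; string dtype for Phone is assumed.
df = pd.DataFrame({
    "Id": [1, 2, 3],
    "Phone": ["(+1)123-456-7890", "(123)-(456)-(7890)", "123-456-7890"],
})

# Drop a leading "(+digits)" group, then every remaining non-digit character.
df["Phone2"] = df["Phone"].str.replace(r"^(?:\(\+\d+\))|\D", "", regex=True)

print(df)
```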

Regex:

^(?:\(\+\d+\)) # match a leading "(+digits)" prefix, e.g. (+1)
|              # OR
\D             # match a non-digit
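
The commented breakdown above can also be kept inside the code itself. The sketch below uses the standard library's re.VERBOSE flag, which ignores whitespace and `#` comments in the pattern. The name PHONE_CLEANUP is only illustrative.

```python
import re

# Same pattern as above, written with re.VERBOSE so the comments live in the regex.
PHONE_CLEANUP = re.compile(
    r"""
    ^(?:\(\+\d+\))   # a leading "(+digits)" international prefix
    |                # OR
    \D               # any single non-digit character
    """,
    re.VERBOSE,
)

samples = ["(+1)123-456-7890", "(123)-(456)-(7890)", "123-456-7890"]
cleaned = [PHONE_CLEANUP.sub("", s) for s in samples]
print(cleaned)  # ['1234567890', '1234567890', '1234567890']
```

The prefix alternative is listed first so that, at the start of the string, the whole "(+1)" group is consumed before the catch-all \D gets a chance to remove only the parenthesis.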


Notes on the international prefix:

The prefix might be important to keep, since it is what distinguishes otherwise identical numbers from different countries.
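
As a quick illustration of why, the sketch below compares two sample numbers that differ only in their prefix. It uses plain re.sub rather than pandas, and the variable names are made up for the example.

```python
import re

# Two sample numbers that differ only in the international prefix.
us = "(+1)123-456-7890"
other = "(+380)123-456-7890"

strip_all = r"^(?:\(\+\d+\))|\D"  # drops the prefix together with the punctuation
keep_plus = r"[^+\d]"             # keeps "+" and digits, drops everything else

# With the prefix stripped, both numbers collapse to the same string.
collide = re.sub(strip_all, "", us) == re.sub(strip_all, "", other)

# With the prefix kept, they remain distinguishable.
distinct = re.sub(keep_plus, "", us) != re.sub(keep_plus, "", other)

print(collide, distinct)  # True True
```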

Keep the prefixes:

df['Phone2'] = df['Phone'].str.replace(r'[^+\d]', '', regex=True)

Output (an extra row, Id 4, is included to show a different prefix):

   Id               Phone          Phone2
0   1    (+1)123-456-7890    +11234567890
1   2  (123)-(456)-(7890)      1234567890
2   3        123-456-7890      1234567890
3   4  (+380)123-456-7890  +3801234567890

Only drop a specific prefix (here +1):

df['Phone2'] = df['Phone'].str.replace(r'^(?:\(\+1\))|[^+\d]', '', regex=True)
# or, more flexible: drop "+1" followed by any non-digit, wherever it appears
df['Phone2'] = df['Phone'].str.replace(r'(?:\+1\D)|[^+\d]', '', regex=True)

Output:

   Id               Phone          Phone2
0   1    (+1)123-456-7890      1234567890
1   2  (123)-(456)-(7890)      1234567890
2   3        123-456-7890      1234567890
3   4  (+380)123-456-7890  +3801234567890
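
If the prefix to drop should be configurable, the first pattern can be wrapped in a small helper. This is only a sketch: drop_prefix is a hypothetical name, not part of pandas, and it assumes the prefix is always written in parentheses at the start of the string, as in the sample data.

```python
import re
import pandas as pd

def drop_prefix(phones: pd.Series, prefix: str = "+1") -> pd.Series:
    """Remove one parenthesised prefix such as "(+1)"; keep any other "+" prefix."""
    # re.escape turns the "+" in the prefix into a literal character.
    pattern = rf"^\({re.escape(prefix)}\)|[^+\d]"
    return phones.str.replace(pattern, "", regex=True)

# Sample frame matching the output table above (four rows).
df = pd.DataFrame({
    "Id": [1, 2, 3, 4],
    "Phone": ["(+1)123-456-7890", "(123)-(456)-(7890)",
              "123-456-7890", "(+380)123-456-7890"],
})

df["Phone2"] = drop_prefix(df["Phone"])
print(df["Phone2"].tolist())
```

Calling drop_prefix(df["Phone"], "+380") would instead strip the +380 prefix and leave +1 in place.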

Answer 1: score 4, accepted, 2022-09-22 13:38:58
