Vectorized format function for pandas Series

Question

Say I start with a Series of unformatted phone numbers (as strings), and I would like to format them as (XXX) YYY-ZZZZ.

I can get the sub-components of my input using regular expressions and str.match or str.extract, and I can perform the formatting using the result of either:

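import pandas as pd

ser = pd.Series(data=['1234567890', '2345678901', '3456789012'])

# Unnamed groups become columns 0, 1, 2. (At the time of the question str.match
# returned the captured groups; current pandas returns booleans, so str.extract is used.)
matched = ser.str.extract(r'(\d{3})(\d{3})(\d{4})')

# Named groups become columns 'first', 'second', 'third'.
extracted = ser.astype(str).str.extract(r'(?P<first>\d{3})(?P<second>\d{3})(?P<third>\d{4})')

formatmatched = matched.apply(lambda x: '({0}) {1}-{2}'.format(*x), axis=1)
print('formatmatched')
print(formatmatched)

formatextracted = extracted.apply(lambda x: '({first}) {second}-{third}'.format(**x.to_dict()), axis=1)
print('formatextracted')
print(formatextracted)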

Results:

formatmatched
0    (123) 456-7890
1    (234) 567-8901
2    (345) 678-9012
dtype: object
formatextracted
0    (123) 456-7890
1    (234) 567-8901
2    (345) 678-9012
dtype: object

Is there a vectorized way to apply that formatting command in either context?

Answer 1

You can do this directly with Series.str.replace():

In [47]: s = pandas.Series(["1234567890", "5552348866", "13434"])

In [49]: s
Out[49]: 
0    1234567890
1    5552348866
2         13434
dtype: object

In [50]: s.str.replace(r"(\d{3})(\d{3})(\d{4})", r"(\1) \2-\3", regex=True)
Out[50]: 
0    (123) 456-7890
1    (555) 234-8866
2             13434
dtype: object

You could also imagine doing another transformation first to remove any non-digit characters.
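A minimal sketch of that pre-cleaning step, assuming the raw strings may contain punctuation or spaces that should simply be discarded. The sample data and the helper name format_phone are illustrative and not part of the original answer. regex=True is passed explicitly because the default for Series.str.replace has changed between pandas versions.

```python
import pandas as pd


def format_phone(s):
    """Strip non-digits, then format ten-digit strings as (XXX) YYY-ZZZZ."""
    # Drop every character that is not a digit (assumption: separators are noise).
    digits = s.astype(str).str.replace(r"\D", "", regex=True)
    # Anchor the pattern so only strings of exactly ten digits are rewritten;
    # anything else is returned as its bare digits.
    return digits.str.replace(r"^(\d{3})(\d{3})(\d{4})$", r"(\1) \2-\3", regex=True)


raw = pd.Series(["123-456-7890", "(555) 234 8866", "13434"])
formatted = format_phone(raw)
print(formatted)
```

Entries that do not reduce to ten digits are left unformatted here; whether to flag or drop them instead depends on the data.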

Answer 2

Why don't you try this:

import pandas as pd

ser = pd.Series(data=['1234567890', '2345678901', '3456789012'])

def f(val):
    # Slice the ten-digit string into area code, prefix and line number.
    return '({0}) {1}-{2}'.format(val[:3], val[3:6], val[6:])

print(ser.apply(f))
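Note that Series.apply calls f once per element. If every entry is known to be a ten-digit string, the same slicing can be written with the .str accessor and element-wise concatenation, which avoids the explicit per-element function. This is only a sketch under that assumption; the variable name formatted is illustrative.

```python
import pandas as pd

ser = pd.Series(["1234567890", "2345678901", "3456789012"])

# Slice each string through the .str accessor, then concatenate element-wise.
# Assumes every entry is a ten-digit string; no validation is done here.
formatted = "(" + ser.str[:3] + ") " + ser.str[3:6] + "-" + ser.str[6:]
print(formatted)
```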

Answer 1: score 2, accepted, 2014-02-27 21:15:29

Answer 2: score 0, 2014-02-27 20:41:41
