Converting strings of numbers and letters to int / float in a pandas DataFrame

Question

I feel like there has to be a quick solution to my problem. I hacked together a poorly implemented solution using multiple list comprehensions, which is not ideal at all. Maybe someone could help out here.

I have a set of values which are strings (e.g. 3.2B, 1.5M, 1.1T), where the last character denotes million, billion, or trillion. Within the set there are also NaN/'none' values, which should remain untouched. I wish to convert these to floats or ints, so for the given example: 3200000000, 1500000, 1100000000000.

TIA

Answer 1

You could create a function and applymap it to every entry in the dataframe:

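powers = {'B': 10 ** 9, 'M': 10 ** 6, 'T': 10 ** 12}
# add some more to powers as necessary

def f(s):
    try:
        power = s[-1]
        # float rather than int: int('3.2') raises ValueError
        return float(s[:-1]) * powers[power]
    except (TypeError, ValueError, KeyError):
        # NaN is not subscriptable (TypeError); strings such as 'none'
        # fail to parse (ValueError) or have an unknown suffix (KeyError)
        return s

df.applymap(f)  # deprecated since pandas 2.1 in favor of df.map(f)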
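To try the idea end to end, here is a minimal self-contained sketch. The sample DataFrame, the column name `col`, and the helper name `parse_suffixed` are illustrative assumptions, not part of the question. It uses `Series.map` on a single column so that it does not depend on which pandas version is installed.

```python
import numpy as np
import pandas as pd

# Multipliers for the suffixes mentioned in the question.
MULTIPLIERS = {"M": 10**6, "B": 10**9, "T": 10**12}


def parse_suffixed(value):
    """Return a float for strings like '3.2B'; return anything else unchanged."""
    if not isinstance(value, str) or not value:
        return value  # NaN, None and other non-strings pass through
    suffix = value[-1].upper()
    if suffix not in MULTIPLIERS:
        return value  # e.g. 'none' is left untouched
    try:
        return float(value[:-1]) * MULTIPLIERS[suffix]
    except ValueError:
        return value  # the numeric part could not be parsed


# Hypothetical sample data, not taken from the question.
df = pd.DataFrame({"col": ["3.2B", "1.5M", "1.1T", np.nan, "none"]})
df["parsed"] = df["col"].map(parse_suffixed)
print(df)
```

Because 'none' is kept as a string, the resulting column has object dtype. If a purely numeric column is needed, the untouched values would have to be converted to NaN in a separate step.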

Answer 2

Setup
Borrowing @MaxU's pd.DataFrame

df = pd.DataFrame({'col': ['123.456', '78M', '0.5B']})

Solution
Replace strings with scientific notation then use astype(float)

d = dict(M='E6', B='E9', T='E12')

df.replace(d, regex=True).astype(float)

            col
0  1.234560e+02
1  7.800000e+07
2  5.000000e+08
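The setup above has no NaN or 'none' entries, and `astype(float)` would raise on a string like 'none'. A possible variation, sketched under assumptions: the patterns are anchored with `$` so only a trailing suffix is rewritten, and `pd.to_numeric(..., errors="coerce")` is swapped in for `astype(float)`. Note that coercion turns 'none' into NaN rather than leaving it untouched, which may or may not be acceptable. The sample data is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data including the missing-value cases from the question.
df = pd.DataFrame({"col": ["123.456", "78M", "0.5B", "1.1T", np.nan, "none"]})

# Anchored regex patterns: only a suffix at the end of the string is replaced.
suffix_to_exponent = {r"M$": "E6", r"B$": "E9", r"T$": "E12"}

# '78M' becomes '78E6', '0.5B' becomes '0.5E9', and so on.
as_scientific = df["col"].replace(suffix_to_exponent, regex=True)

# Unparseable strings such as 'none' are coerced to NaN instead of raising.
df["new"] = pd.to_numeric(as_scientific, errors="coerce")
print(df)
```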

Answer 3

Demo:

In [58]: df
Out[58]:
       col
0  123.456
1      78M
2     0.5B

In [59]: d = {'B': 10**9, 'M': 10**6}

In [60]: df['new'] = \
    ...: df['col'].str.extract(r'(?P<val>[\d.]+)\s*?(?P<mult>\D*)', expand=True) \
    ...:   .replace('','1') \
    ...:   .replace(d, regex=True) \
    ...:   .astype(float) \
    ...:   .eval('val * mult')
    ...:

In [61]: df
Out[61]:
       col           new
0  123.456  1.234560e+02
1      78M  7.800000e+07
2     0.5B  5.000000e+08
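For readers who prefer a script over an IPython session, the same split-then-multiply idea can be sketched as follows. This is a variation, not the answer's exact code: it maps the extracted suffix to a multiplier with `Series.map` instead of `replace` plus `eval`, and the dictionary name `multipliers` and the `T` entry are additions for illustration.

```python
import pandas as pd

df = pd.DataFrame({"col": ["123.456", "78M", "0.5B"]})

# The empty string covers values that carry no suffix at all.
multipliers = {"": 1, "M": 10**6, "B": 10**9, "T": 10**12}

# Split each string into a numeric part and an optional one-letter suffix.
parts = df["col"].str.extract(r"^(?P<val>[\d.]+)\s*(?P<mult>[MBT]?)$")

# Multiply the numeric part by the multiplier looked up from the suffix.
df["new"] = parts["val"].astype(float) * parts["mult"].map(multipliers)
print(df)
```

Rows that do not match the pattern come back from `str.extract` as NaN in both groups, so they end up as NaN in the new column.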

Answer 1: score 6, accepted, 2013-01-08 15:55:00
Answer 2: score 3, 2017-10-04 22:37:08
Answer 3: score 2, 2017-10-04 22:25:58
