
Converting Billions to Millions in a CSV dataframe

My question is about formatting in pandas / Python. The task is stated below.

The trading volume numbers are large. Scale the trading volume to be in millions of shares. Ex: 117,147,500 shares will become 117.1475 million after scaling.

This is what the dataframe looks like. I need it to be fixed for all 125 rows.

[Image: screenshot of the dataframe, not reproduced here]

Probably the simplest way is to divide the whole column by a million:

apple['volume'] = apple['volume'].div(1000000)
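If the volume values in the CSV are written with thousands separators (as in "117,147,500"), the column may load as text, and the division above would fail. The following is a minimal sketch of one way to handle that, assuming a made-up CSV with `date` and `volume` columns; the real file and its column names are not shown in the question.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for the real file (an assumption)
csv_text = 'date,volume\n2020-01-02,"117,147,500"\n2020-01-03,"98,765,400"\n'

# thousands="," makes read_csv parse "117,147,500" as the integer 117147500
apple = pd.read_csv(io.StringIO(csv_text), thousands=",")

# Scale the whole column to millions of shares
apple["volume"] = apple["volume"].div(1_000_000)
print(apple)
```

With a real file, the `io.StringIO(...)` argument would be replaced by the file path.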

You can replace numbers like 117147500 in the following two ways. Either with floating point numbers:

import pandas as pd
dictionary = {'Column':[4,5,6,7], 'Volume':[117147500,12000,14000,18000]}
df = pd.DataFrame(dictionary)
df

df_scaled_column=df['Volume']/1000000

# Replace old column with scaled values
df['Volume'] = df_scaled_column
df

Out: 
   Column    Volume
0       4  117.1475
1       5    0.0120
2       6    0.0140
3       7    0.0180

Or with strings. In particular, I use a function that I found in an answer to the SE post "formatting long numbers as strings in python":

import pandas as pd
dictionary = {'Column':[4,5,6,7], 'Volume':[117147500,12000,14000,18000]}
df = pd.DataFrame(dictionary)
df

# Function defined in an old StackExchange post
def human_format(num):
    num = float('{:.3g}'.format(num))
    magnitude = 0
    while abs(num) >= 1000:
        magnitude += 1
        num /= 1000.0
    return '{}{}'.format('{:f}'.format(num).rstrip('0').rstrip('.'), ['', 'K', 'M', 'B', 'T'][magnitude])

# Example of what the function does
human_format(117147500) #'117M'

# Apply the formatter to every value and replace the old column
df['Volume'] = df['Volume'].apply(human_format)
df

Out: 
   Column Volume
0       4   117M
1       5    12K
2       6    14K
3       7    18K
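Note that once the column holds strings it can no longer be used in arithmetic. As a rough sketch of a middle ground, the numeric column can be kept and a separate label column added for display. The column name `Volume (M)` and the four-decimal format are assumptions chosen for illustration, not something from the question.

```python
import pandas as pd

df = pd.DataFrame({"Column": [4, 5, 6, 7],
                   "Volume": [117147500, 12000, 14000, 18000]})

# Keep the numeric column for calculations; build a display-only label in millions
# ("Volume (M)" is a hypothetical column name)
df["Volume (M)"] = df["Volume"].div(1_000_000).map("{:.4f}M".format)
print(df)
```

Here `Volume` stays numeric, so sums and means still work, while `Volume (M)` is only meant for printing.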

You can use the transform() method ( https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.transform.html ) and divide those volume numbers by 1,000,000.
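A minimal sketch of the transform() approach, reusing the sample data from the answer above (the lambda is just one way to express the division):

```python
import pandas as pd

df = pd.DataFrame({"Column": [4, 5, 6, 7],
                   "Volume": [117147500, 12000, 14000, 18000]})

# transform() returns a result with the same length as the input,
# so it can be assigned straight back to the column
df["Volume"] = df["Volume"].transform(lambda v: v / 1_000_000)
print(df)
```

For a plain division this gives the same result as `df["Volume"] / 1_000_000`; transform() is mostly useful when the same function has to be applied to several columns or within groups.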
