
Pandas: format cells containing lists of floats in DataFrame.to_string

How can I use the float_format argument of pandas.DataFrame.to_string() to format floats that sit inside list-valued cells?

I have a Pandas data frame df with a column ( col3 ) containing lists of floats:

col1                    col2                                                        col3
   0      0.9999999350619099   [0.9999999350619099, 1e-12, 6.493808010308148e-08, 1e-12]
   1  5.8284650223352606e-08  [0.9999999417153463, 1e-12, 5.8284650223352606e-08, 1e-12]
   0      0.9999998660870891   [0.9999998660870891, 1e-12, 1.339129086538945e-07, 1e-12]

When running df.to_string() , I want to be able to format all floats (i.e. in col2 and col3 ), for instance like this:

col1       col2                              col3
   0      1.000  [0.999, 1e-12, 6.493e-08, 1e-12]
   1  5.828e-08  [0.999, 1e-12, 5.828e-08, 1e-12]
   0      1.000  [0.999, 1e-12, 1.339e-07, 1e-12]

I tried passing a custom function float2string (see the MWE below) as the float_format argument of pandas.DataFrame.to_string() , but it only formats col2 , which is a column of floats, and not col3 .

MWE:

from collections.abc import Iterable
import pandas

# data
df = pandas.DataFrame({
    'col1': [0, 1, 0], 
    'col2': [0.9999999350619099, 5.8284650223352606e-08, 0.9999998660870891], 
    'col3': [
        [0.9999999350619099, 1e-12, 6.493808010308148e-08, 1e-12], 
        [0.9999999417153463, 1e-12, 5.8284650223352606e-08, 1e-12], 
        [0.9999998660870891, 1e-12, 1.339129086538945e-07, 1e-12]
    ]},
    index = ['a1', 'a2', 'a3']
)

# formatting function
def float2string(input):
    """convert float to string for printing

    Input (float)
    Output (string)
    """
    if isinstance(input, Iterable):
        return list(map(float2string, input))
    else:
        if input is None:
            return None
        else:
            if float(input).is_integer():
                return "{}".format(input)
            if abs(input) < 1e-2 or abs(input) > 1e2:
                return "{:.2e}".format(input)
            else:
                return "{:.3f}".format(input)

# print
print(df.to_string(float_format = float2string))

and I get:

    col1     col2                                                        col3
a1     0    1.000   [0.9999999350619099, 1e-12, 6.493808010308148e-08, 1e-12]
a2     1 5.83e-08  [0.9999999417153463, 1e-12, 5.8284650223352606e-08, 1e-12]
a3     0    1.000   [0.9999998660870891, 1e-12, 1.339129086538945e-07, 1e-12]
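One way to see why col3 is left untouched is to inspect what each column holds. The sketch below rebuilds a shortened version of the frame (two rows, two-element lists, chosen only for illustration): col2 is a float column, while col3 is an object column whose cells are Python lists, and float_format is called with float values, not with list objects.

```python
import pandas as pd

# Shortened version of the frame from the question (illustrative data only)
df = pd.DataFrame({
    'col1': [0, 1],
    'col2': [0.9999999350619099, 5.8284650223352606e-08],
    'col3': [[0.9999999350619099, 1e-12], [0.9999999417153463, 1e-12]],
})

# col2 is a float column; col3 is an object column holding Python lists
print(df.dtypes)
cell_type = type(df['col3'].iloc[0])
print(cell_type)

# float_format receives float values only, so the list cells are printed as-is
text = df.to_string(float_format='{:.3f}'.format)
print(text)
```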

SOLUTION: thanks to @oskros's answer below:

print(df.to_string(
    float_format = float2string,
    formatters = {'col3': float2string}
))

You are usually making things harder for yourself if you have columns whose values are themselves iterable objects (lists, tuples, dicts, etc.). Is there a specific reason for keeping column 3 as a list rather than splitting it into separate columns?
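A minimal sketch of the split-into-columns idea, assuming every list in col3 has the same length. The new column names ( col3_0 , col3_1 ) and the shortened data are only illustrative. Once the lists are expanded, every value is an ordinary float and float_format reaches all of them.

```python
import pandas as pd

# Shortened version of the frame from the question (illustrative data only)
df = pd.DataFrame({
    'col1': [0, 1],
    'col2': [0.9999999350619099, 5.8284650223352606e-08],
    'col3': [[0.9999999350619099, 1e-12], [0.9999999417153463, 1e-12]],
})

# Expand each list into its own float columns; the col3_ prefix is a hypothetical naming choice
expanded = pd.DataFrame(df['col3'].tolist(), index=df.index).add_prefix('col3_')
wide = df.drop(columns='col3').join(expanded)

# All remaining columns are numeric, so float_format applies to every float
text = wide.to_string(float_format='{:.3e}'.format)
print(text)
```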

But if you have a specific need to format the data this way, then you are almost there with your solution. Simply specify your custom function as a formatter:

print(df.to_string(formatters = [float2string, float2string, float2string]))
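Note that float2string returns a list of strings for list cells, so the rendered cells may show quoted items such as ['1.000', '1.00e-12']. If unquoted output closer to the one requested in the question is preferred, one option is to join the formatted items into a single string. The helpers format_number and format_cell below are hypothetical names, not part of pandas, and the data is a shortened version of the question's frame.

```python
import pandas as pd

def format_number(x):
    """Format one number with the same thresholds as the question's float2string."""
    if float(x).is_integer():
        return str(x)
    if abs(x) < 1e-2 or abs(x) > 1e2:
        return "{:.2e}".format(x)
    return "{:.3f}".format(x)

def format_cell(cell):
    """Hypothetical helper: render a list of floats as one unquoted string."""
    if isinstance(cell, (list, tuple)):
        return "[" + ", ".join(format_number(x) for x in cell) + "]"
    return format_number(cell)

# Shortened version of the frame from the question (illustrative data only)
df = pd.DataFrame({
    'col1': [0, 1],
    'col2': [0.9999999350619099, 5.8284650223352606e-08],
    'col3': [[0.9999999350619099, 1e-12], [0.9999999417153463, 1e-12]],
})

# float_format handles the float column, the per-column formatter handles the lists
text = df.to_string(float_format=format_number, formatters={'col3': format_cell})
print(text)
```

Passing formatters as a dict keyed by column name, as in the SOLUTION above, avoids having to supply one formatter per column in positional order.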

