AttributeError: 'DataFrame' object has no attribute 'tolist'

When I run this code in Jupyter Notebook:

columns = ['nkill', 'nkillus', 'nkillter','nwound', 'nwoundus', 'nwoundte', 'propvalue', 'nperps', 'nperpcap', 'iyear', 'imonth', 'iday']

for col in columns:
    # needed for any missing values set to '-99'
    df[col] = [np.nan if (x < 0) else x for x in df[col].tolist()]
    # calculate the mean of the column
    column_temp = [0 if math.isnan(x) else x for x in df[col].tolist()]
    mean = round(np.mean(column_temp))
    # then apply the mean to all NaNs
    df[col].fillna(mean, inplace=True)

I receive the following error:

AttributeError                            Traceback (most recent call last)
<ipython-input-56-f8a0a0f314e6> in <module>()
  3 for col in columns:
  4     # needed for any missing values set to '-99'
----> 5     df[col] = [np.nan if (x < 0) else x for x in df[col].tolist()]
  6     # calculate the mean of the column
  7     column_temp = [0 if math.isnan(x) else x for x in df[col].tolist()]

/anaconda3/lib/python3.7/site-packages/pandas/core/generic.py in __getattr__(self, name)
   4374             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   4375                 return self[name]
-> 4376             return object.__getattribute__(self, name)
   4377 
   4378     def __setattr__(self, name, value):

AttributeError: 'DataFrame' object has no attribute 'tolist'

The code works fine when I run it in PyCharm, and all of my research has led me to conclude that it should be fine. Am I missing something?

I've created a Minimal, Complete, and Verifiable example below:

import numpy as np
import pandas as pd
import os
import math

# get the path to the current working directory
cwd = os.getcwd()

# then add the name of the Excel file, including its extension to get its relative path
# Note: make sure the Excel file is stored inside the cwd
file_path = cwd + "/data.xlsx"

# Load the Excel file into a DataFrame
df = pd.read_excel(file_path)

columns = ['nkill', 'nkillus', 'nkillter', 'nwound', 'nwoundus', 'nwoundte', 'propvalue', 'nperps', 'nperpcap', 'iyear', 'imonth', 'iday']

for col in columns:
    # needed for any missing values set to '-99'
    df[col] = [np.nan if (x < 0) else x for x in df[col].tolist()]
    # calculate the mean of the column
    column_temp = [0 if math.isnan(x) else x for x in df[col].tolist()]
    mean = round(np.mean(column_temp))
    # then apply the mean to all NaNs
    df[col].fillna(mean, inplace=True)

You have an XY Problem. You've described what you are trying to achieve in your comments, but your approach is not appropriate for Pandas.
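As for the error itself, the traceback shows only that df[col] evaluated to a DataFrame rather than a Series on that line. The thread does not establish why. One situation known to produce this, offered here purely as an assumption about the data, is a column label that occurs more than once. A minimal sketch with a hypothetical frame df_dup:

```python
import pandas as pd

# Hypothetical frame in which the label 'nkill' appears twice.
df_dup = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                      columns=['nkill', 'nkill', 'iyear'])

# Selecting a duplicated label returns every matching column as a DataFrame.
selected = df_dup['nkill']
print(type(selected).__name__)
print(hasattr(selected, 'tolist'))

# List any labels that occur more than once.
duplicated_labels = df_dup.columns[df_dup.columns.duplicated()].tolist()
print(duplicated_labels)
```

If a check like this reports duplicates in the real data, it would explain the AttributeError. If it reports none, the cause lies elsewhere.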

Avoid for loops and list

With Pandas, you should look to avoid explicit for loops or conversion to Python list. Pandas builds on NumPy arrays, which support vectorised column-wise operations.
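To illustrate what a vectorised column-wise operation can look like, here is a minimal sketch that handles several columns at once without a loop. The frame df_demo, its values, and the two column names are hypothetical stand-ins, and rounding the mean mirrors the loop in the question rather than anything required by Pandas.

```python
import numpy as np
import pandas as pd

# Hypothetical data; -99 stands in for the "missing" code mentioned in the question.
df_demo = pd.DataFrame({'nkill': [3.0, -99.0, 5.0, np.nan],
                        'nwound': [-99.0, 10.0, np.nan, 6.0]})
cols = ['nkill', 'nwound']

# One comparison covers every selected column; negative values become NaN.
masked = df_demo[cols].mask(df_demo[cols] < 0)

# Per-column means with NaN counted as 0, as in the question's loop.
means = masked.fillna(0).mean().round()

# fillna accepts a Series keyed by column label, so each column gets its own mean.
df_demo[cols] = masked.fillna(means)
print(df_demo)
```

Each step here operates on whole columns, so no Python-level iteration over individual values is needed.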

So let's look at how you can rewrite:

for col in columns:
    # values less than 0 set to NaN
    # calculate the mean of the column with 0 for NaN
    # then apply the mean to all NaNs

You can now use Pandas methods to achieve the above.

apply + pd.to_numeric + mask + fillna

You can define a function mean_update and use pd.DataFrame.apply to apply it to each series:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, -2, 3, np.nan],
                   'B': ['hello', 4, 5, np.nan],
                   'C': [-1.5, 3, np.nan, np.nan]})

def mean_update(s):
    s_num = pd.to_numeric(s, errors='coerce')  # convert to numeric
    s_num = s_num.mask(s_num < 0)              # replace values less than 0 with NaN
    s_mean = s_num.fillna(0).mean()            # calculate mean
    return s_num.fillna(s_mean)                # replace NaN with mean

df = df.apply(mean_update)                     # apply to each series

print(df)

     A     B     C
0  1.0  2.25  0.75
1  1.0  4.00  3.00
2  3.0  5.00  0.75
3  1.0  2.25  0.75
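The question restricts the update to a list of columns and rounds the mean, which the example above does not do. A minimal sketch of one way to combine the two, assuming a hypothetical variant named mean_update_rounded and a small made-up frame df_q in place of the spreadsheet:

```python
import numpy as np
import pandas as pd

def mean_update_rounded(s):
    # Hypothetical variant of mean_update that also rounds the mean,
    # as the loop in the question does.
    s_num = pd.to_numeric(s, errors='coerce')   # convert to numeric
    s_num = s_num.mask(s_num < 0)               # negative values become NaN
    s_mean = round(s_num.fillna(0).mean())      # mean with NaN counted as 0
    return s_num.fillna(s_mean)                 # replace NaN with the rounded mean

# Made-up data; 'city' is not in the list below and is left untouched.
df_q = pd.DataFrame({'nkill': [3, -99, 5, np.nan],
                     'iyear': [2001, 2003, -99, 2004],
                     'city': ['a', 'b', 'c', 'd']})
columns = ['nkill', 'iyear']

# Apply the function only to the selected columns.
df_q[columns] = df_q[columns].apply(mean_update_rounded)
print(df_q)
```

Because missing values count as 0 when the mean is computed, the fill value is pulled toward 0. In this sketch the missing iyear entry is filled with 1502 rather than a value near 2003, which is worth keeping in mind before copying the approach.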
