TypeError: object of type 'float' has no len() when getting dataframe length
test.csv looks like this:
device_id,upload_time
12345678901,2020-06-01 07:40:20+00:00
123456,2020-06-01 07:40:40+00:00
123456,2020-06-01 07:41:00+00:00
123456,2020-06-01 07:41:02+00:00
123456,2020-06-01 07:41:04+00:00
123456,2020-06-01 07:41:08+00:00
12345678901,2020-06-01 07:41:10+00:00
12345678901,2020-06-01 07:41:18+00:00
12345678901,2020-06-01 07:41:20+00:00
,2020-06-01 07:41:24+00:00
,2020-06-01 07:41:40+00:00
12345678901,2020-06-01 07:42:00+00:00
12345678901,2020-06-01 07:42:20+00:00
12345678901,2020-06-01 07:42:22+00:00
12345678901,2020-06-01 07:42:24+00:00
12345678901,2020-06-01 07:42:26+00:00
12345678901,2020-06-01 07:42:28+00:00
12345678901,2020-06-01 07:42:40+00:00
1234,2020-06-01 07:43:00+00:00
1234,2020-06-01 07:43:12+00:00
Converting device_id to int or str is fine with me. I use this code to get the new DataFrames.
import pandas as pd
df = pd.read_csv(r'test.csv', encoding='utf-8', parse_dates=[1])
df = df[pd.notnull(df['device_id'])] #Delete rows where device_id is null.
a = df[df['device_id'].map(len)!=11] #Get data whose device_id length is not 11.
b = df[df['device_id'].map(len)==11] #Get data whose device_id length is 11.
But the error message is:
TypeError: object of type 'float' has no len()
What is wrong?
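The error can be reproduced in a few lines. This is a minimal sketch, assuming a shortened inline copy of test.csv (the `csv_text` string and the `io.StringIO` wrapper are illustrative and not part of the original post). Because some device_id fields are empty, pandas reads them as NaN and stores the whole column as float64, and len() is not defined for floats.

```python
import io
import pandas as pd

# Shortened, illustrative copy of test.csv (assumption: same layout as the file above).
csv_text = """device_id,upload_time
12345678901,2020-06-01 07:40:20+00:00
123456,2020-06-01 07:40:40+00:00
,2020-06-01 07:41:24+00:00
1234,2020-06-01 07:43:00+00:00
"""

df = pd.read_csv(io.StringIO(csv_text), parse_dates=[1])

# The empty field is read as NaN, which forces the whole column to float64.
print(df['device_id'].dtype)

# Dropping the NaN rows does not change the dtype: the remaining values are still floats.
df = df[pd.notnull(df['device_id'])]
first_value = df['device_id'].iloc[0]

# Calling len() on a float raises the TypeError from the question.
try:
    len(first_value)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)
```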
The code below should help.
Converting the float values to strings makes it possible to count the digits.
import pandas as pd
df = pd.read_csv(r'test.csv', encoding='utf-8', parse_dates=[1])
# Remove the rows with null (NaN) values, using any one of the following:
df = df.dropna()
# or
df = df[df['device_id'].isnull()==False]
# or
df = df[df['device_id'].isna()==False]
a = df[df['device_id'].astype(str).map(len)!=11]
b = df[df['device_id'].astype(str).map(len)==11]
Another way:
a = df[df['device_id'].astype(str).str.len()!=11]
b = df[df['device_id'].astype(str).str.len()==11]
Another way:
a = df[df['device_id'].astype(str).apply(len)!=11]
b = df[df['device_id'].astype(str).apply(len)==11]
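One caveat with the snippets above: while device_id is still a float column, astype(str) keeps the trailing ".0", so the string lengths come out two characters too long. The following is a hedged sketch of one way around this, assuming it is acceptable to read device_id as text from the start through the `dtype` parameter of `read_csv`. The inline `csv_text` is an illustrative stand-in for test.csv.

```python
import io
import pandas as pd

# Shortened, illustrative copy of test.csv (assumption: same layout as the real file).
csv_text = """device_id,upload_time
12345678901,2020-06-01 07:40:20+00:00
123456,2020-06-01 07:40:40+00:00
,2020-06-01 07:41:24+00:00
1234,2020-06-01 07:43:00+00:00
"""

# On a float column, the string form of each value carries a trailing ".0".
as_float = pd.read_csv(io.StringIO(csv_text))
float_strings = as_float['device_id'].dropna().astype(str).tolist()
print(float_strings)

# Reading the column as str keeps the digits exactly as written in the file.
df = pd.read_csv(io.StringIO(csv_text), dtype={'device_id': str}, parse_dates=[1])
df = df.dropna(subset=['device_id'])  # empty fields are still read as NaN

lengths = df['device_id'].str.len()
a = df[lengths != 11]  # device_id length is not 11
b = df[lengths == 11]  # device_id length is 11
print(a['device_id'].tolist())
print(b['device_id'].tolist())
```

Reading the IDs as strings also preserves any leading zeros, which a numeric conversion would drop.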
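For the input file you specified, although all the values are integers, the device_id column is treated as a float column. The likely reason is the rows with an empty device_id: pandas reads them as NaN, which is a float, so the whole column becomes float64. Because of this, you will run into a problem when computing the length:
Example: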
len('12345')
#will give you len = 5, which is the correct length
However,
len('12345.0')
#will give you len = 7, which is wrong since it considers the decimal point too
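Therefore, it is better to convert the column to int and then run the length check on the str version of the int column, as in the code further below.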
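Reference:
The argument to len can be a sequence (string, tuple, or list) or a mapping (dictionary). https://docs.python.org/2/library/functions.html#len
Before calling len, you should verify that the argument is one of these types. You can call isinstance() to check it. See how to use it here: https://docs.python.org/2/library/functions.html#isinstance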
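The isinstance() check described above could look roughly like this. It is only a sketch: `safe_len` is a hypothetical helper name that does not appear in the original answer, and the tuple of accepted types simply mirrors the ones listed in the reference.

```python
def safe_len(value):
    """Return len(value) for strings, tuples, lists, and dicts; otherwise None."""
    if isinstance(value, (str, tuple, list, dict)):
        return len(value)
    # Floats (including NaN) and ints have no length, so signal that with None.
    return None


print(safe_len('12345678901'))
print(safe_len(123456.0))
print(safe_len(float('nan')))
```

Applied through `df['device_id'].map(safe_len)`, a helper like this would avoid the TypeError, although the float values would still need converting before their digits can be counted.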
So try this:
import pandas as pd
df = pd.read_csv(r'sample.csv', parse_dates=[1])
df = df[pd.notnull(df['device_id'])] #Delete rows where device_id is null.
#Convert to int
df['device_id'] = df['device_id'].astype(float).astype(int)
#len function cannot be computed on an int column directly. You should convert to str and then compute len
a = df[df['device_id'].astype(str).map(len)!=11]
b = df[df['device_id'].astype(str).map(len)==11]
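To check the approach without a file on disk, the same steps can be wrapped in a small function and run on inline data. This is a sketch under assumptions: `split_by_id_length` and `csv_text` are hypothetical names introduced here, and the conversion spells out 'int64' rather than int so that 11-digit IDs fit on platforms where the default integer is 32 bits wide.

```python
import io
import pandas as pd


def split_by_id_length(df, length=11):
    """Split df into (rows whose device_id is not `length` digits, rows that are)."""
    # Drop rows where device_id is null, then convert the float column to integers.
    df = df[pd.notnull(df['device_id'])].copy()
    df['device_id'] = df['device_id'].astype(float).astype('int64')
    # len() needs a string, so measure the str version of the int column.
    id_lengths = df['device_id'].astype(str).map(len)
    return df[id_lengths != length], df[id_lengths == length]


# Shortened, illustrative copy of the CSV (assumption: same layout as the real file).
csv_text = """device_id,upload_time
12345678901,2020-06-01 07:40:20+00:00
123456,2020-06-01 07:40:40+00:00
,2020-06-01 07:41:24+00:00
1234,2020-06-01 07:43:00+00:00
"""

df = pd.read_csv(io.StringIO(csv_text), parse_dates=[1])
a, b = split_by_id_length(df)
print(a['device_id'].tolist())
print(b['device_id'].tolist())
```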