How to get the unique lengths of a column in a pandas DataFrame?

Question

This is the CSV data:

staff_id,clock_time,device_id,latitude,longitude
1003,2020/8/27 2:55,d_8,26.39899424,117.7866387
1003,2020/8/26 7:45,d_8,26.39900029,117.7866379
1003,2020/8/26 3:09,d_8,26.40672436,117.8008659
1003,2020/8/26 0:26,d_8,26.89169118,117.1612365
1234567,2020/8/25 9:38,d_8,26.89764297,117.1760012
123456789,2020/5/19 8:29,d_8,24.47420087,118.1085551
1003,2020/5/18 9:06,d_8,24.473124,118.1705641
1003,2020/5/16 7:54,d_8,24.5101858,117.8954614

I use this code to get the unique lengths of the staff_id values in the DataFrame:

import pandas as pd

df = pd.read_csv(r'for_test.csv', encoding='utf-8',parse_dates=[1])
staff_id_list = df.staff_id.values.tolist()
staff_id_length_list = [len(str(item)) for item in staff_id_list]
staff_id_length_list = list(set(staff_id_length_list))
print(staff_id_length_list)

The output is: [9, 4, 7]

Although the output is correct, I want to get the lengths with pandas methods instead of plain Python.

What should I do?

Answer 1

Use Series.astype with Series.str.len and Series.unique:

a = df.staff_id.astype(str).str.len().unique()
print (a)
[4 7 9]

If you need a list:

L = df.staff_id.astype(str).str.len().unique().tolist()
print (L)
[4, 7, 9]
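
For anyone who wants to run this without the file, here is a minimal self-contained sketch. It assumes the rows from the question are pasted into a string and read through io.StringIO in place of for_test.csv; the name CSV_TEXT is only for illustration.

```python
import io

import pandas as pd

# Assumption: the question's rows inlined as text instead of reading for_test.csv.
CSV_TEXT = """staff_id,clock_time,device_id,latitude,longitude
1003,2020/8/27 2:55,d_8,26.39899424,117.7866387
1003,2020/8/26 7:45,d_8,26.39900029,117.7866379
1003,2020/8/26 3:09,d_8,26.40672436,117.8008659
1003,2020/8/26 0:26,d_8,26.89169118,117.1612365
1234567,2020/8/25 9:38,d_8,26.89764297,117.1760012
123456789,2020/5/19 8:29,d_8,24.47420087,118.1085551
1003,2020/5/18 9:06,d_8,24.473124,118.1705641
1003,2020/5/16 7:54,d_8,24.5101858,117.8954614
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

# staff_id is parsed as integers, so cast to str before measuring lengths.
lengths = df["staff_id"].astype(str).str.len().unique().tolist()
print(lengths)  # [4, 7, 9]
```

The astype(str) step matters here because the .str accessor only works on string values, and read_csv parses this column as integers.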

Answer 2

Use pandas.Series.astype with str.len and unique:

df["staff_id"].astype(str).str.len().unique()

Output:

array([4, 7, 9])
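
A side note on ordering, since the question's set-based code printed [9, 4, 7] while unique() gives [4, 7, 9]: unique() returns values in order of first appearance, not sorted. The sketch below uses a made-up Series (not the question's file) in which the longest ID comes first, to show the difference and one way to get a sorted result.

```python
import pandas as pd

# Hypothetical sample: IDs arranged so the longest one appears first.
staff_id = pd.Series([123456789, 1003, 1234567, 1003], name="staff_id")

lengths = staff_id.astype(str).str.len()

# unique() keeps the order in which each length first appears.
in_order = lengths.unique().tolist()
print(in_order)  # [9, 4, 7]

# Sort explicitly if ascending order is needed regardless of row order.
ascending = sorted(in_order)
print(ascending)  # [4, 7, 9]

# nunique() counts the distinct lengths without listing them.
distinct_count = lengths.nunique()
print(distinct_count)  # 3
```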

Answer 3

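You can try the following:

# staff_id is read as integers, so cast to str before using the .str accessor
# rows whose length already appeared earlier get NaN in the new column
df['len'] = df['staff_id'].astype(str).str.len().drop_duplicates()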
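
To see how the drop_duplicates approach behaves, here is a small sketch on a made-up integer Series (an assumption standing in for the staff_id column). It shows that calling .str directly on integers raises AttributeError, and that drop_duplicates returns a Series that keeps the index of each first occurrence, rather than a NumPy array like unique().

```python
import pandas as pd

# Hypothetical integer column standing in for df['staff_id'].
staff_id = pd.Series([1003, 1003, 1234567, 123456789, 1003], name="staff_id")

# The .str accessor is only available for string values.
error_message = None
try:
    staff_id.str.len()
except AttributeError as exc:
    error_message = str(exc)
print(error_message)

# After casting to str, drop_duplicates keeps the first row for each length.
unique_lengths = staff_id.astype(str).str.len().drop_duplicates()
print(unique_lengths.index.tolist())  # [0, 2, 3]
print(unique_lengths.tolist())  # [4, 7, 9]
```

Because the result keeps the original index, assigning it back as a new column aligns on that index and leaves NaN in every row that was dropped as a duplicate.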

3 answers

Answer 1: accepted, 2020-09-25 08:12:39

Answer 2: 2020-09-25 08:13:06

Answer 3: 2020-09-25 08:13:25
