Pandas column of lists: how to set the dtype of the items

Question

I have a dataframe with multiple columns containing lists, and the lists have a different length in each row:

tweetid tweet_date    user_mentions       hashtags
00112   11-02-2014    []                  []
00113   11-02-2014    [00113]             [obama, trump]
00114   30-07-2015    [00114, 00115]      [hillary, trump, sanders]
00115   30-07-2015    []                  []

The dataframe is a concatenation of three different dataframes, and I'm not sure whether the items in the lists all have the same dtype. For example, in the user_mentions column, sometimes the data looks like this:

[00114, 00115]

But sometimes it looks like this:

['00114','00115']

How can I set the dtype for the items in the lists?

Answer 1

Pandas DataFrames are not really designed to house lists as row/column values, which is why you are running into difficulty. You could do the following.

Python 3.x:

df['user_mentions'].apply(lambda x: list(map(int, x)))

Python 2.x:

df['user_mentions'].apply(lambda x: map(int, x))

In Python 3, map returns a lazy map object, so you have to convert it to a list explicitly. In Python 2, map already returns a list, so the explicit list call is not needed.

In the above lambda, x is the list in each row, and its values are mapped to int.
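
As a minimal sketch of the approach above, assuming a small frame shaped like the one in the question (the sample values are made up for illustration): apply returns a new Series, so the result has to be assigned back to the column. Note that converting to int drops the leading zeros of the IDs.

```python
import pandas as pd

# Hypothetical sample data mirroring the question: some lists hold ints,
# others hold strings, and some are empty.
df = pd.DataFrame({
    "tweetid": ["00112", "00113", "00114"],
    "user_mentions": [[], [113], ["00114", "00115"]],
})

# apply returns a new Series, so assign it back to keep the converted lists.
df["user_mentions"] = df["user_mentions"].apply(lambda x: list(map(int, x)))

print(df["user_mentions"].tolist())
```

Empty lists pass through unchanged, since mapping int over an empty list yields an empty list.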

Answer 2

df['user_mentions'].map(lambda x: ['00' + str(y) if isinstance(y,int) else y for y in x])

If your objective is to convert all user_mentions to str, the above might help. I would also look into this post for unnesting. As mentioned, pandas is not really designed to house lists as values.
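
The fixed '00' prefix only restores the original form when every ID has exactly two leading zeros. A sketch that pads to a fixed width instead, assuming the IDs are five characters wide as in the question's sample (ID_WIDTH and to_padded_str are hypothetical names, and the sample data is made up). It also shows Series.explode, available in pandas 0.25 and later, as one possible way to unnest the lists:

```python
import pandas as pd

ID_WIDTH = 5  # assumed width, taken from sample IDs such as 00114

df = pd.DataFrame({"user_mentions": [[], [113], ["00114", 115]]})

def to_padded_str(items):
    # Convert every item to str and left-pad with zeros up to ID_WIDTH.
    return [str(item).zfill(ID_WIDTH) for item in items]

df["user_mentions"] = df["user_mentions"].map(to_padded_str)
print(df["user_mentions"].tolist())

# explode gives one row per list item; empty lists become NaN, dropped here.
flat = df["user_mentions"].explode().dropna()
print(flat.tolist())
```

Once the column is flattened, each item is an ordinary scalar value, which is the layout pandas handles best.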

Answer 3

This should work; I have the strings in the first column.

df[0].apply((lambda x: [str(y) for y in x]))
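
To check whether a conversion like this left a single item type in the column, one option is to collect the types found across all lists. This is only a sketch: the helper name item_types and the sample data are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({0: [[], [113], ["00114", 115]]})

def item_types(column):
    # Set of Python type names found across all lists in the column.
    return {type(item).__name__ for items in column for item in items}

before = item_types(df[0])  # mixed types before the conversion

# Same conversion as above, assigned back so the change is kept.
df[0] = df[0].apply(lambda x: [str(y) for y in x])
after = item_types(df[0])

print(before, after)
```

A result with more than one type name indicates that the concatenated frames stored the items differently.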

3 answers

Answer 1: score 6, accepted, 2019-02-20 18:54:00

Answer 2: score 2, 2019-02-20 20:44:47

Answer 3: score 1, 2019-02-20 18:53:55
