Pandas dataframe column contains strings and lists

Question

I have a data frame that contains a column with both strings and lists.

import pandas as pd    
data = {'lanes': ['1',['2','4'],'2','3',['1','2','3']]}
df = pd.DataFrame(data,columns=['lanes'])
df

[Image: original data frame]

I need to convert the strings to ints and replace each list with the mean of its elements. So, the output should look like this:

data2 = {'lanes': [1,3,2,3,2]}
df2 = pd.DataFrame(data2,columns=['lanes']) 
df2

[Image: desired data frame]

Can anyone who has done something like this before give me some direction on how to do it?

Answer 1

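Use Series.explode, convert the values to integers, and then compute the mean for each duplicated index label with groupby(level=0).mean(). The mean(level=0) shortcut was deprecated in pandas 1.3 and removed in pandas 2.0. The final astype(int) casts the float means back to integers:

df['lanes'] = df['lanes'].explode().astype(int).groupby(level=0).mean().astype(int)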
print (df)
   lanes
0      1
1      3
2      2
3      3
4      2
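
To make the chain above easier to follow, here is a minimal step-by-step sketch of the same idea. The intermediate names exploded and means are introduced only for illustration, and the sketch assumes a pandas version that provides Series.explode (0.25 or later).

```python
import pandas as pd

df = pd.DataFrame({'lanes': ['1', ['2', '4'], '2', '3', ['1', '2', '3']]})

# explode() gives each list element its own row and repeats the original index label
exploded = df['lanes'].explode()
print(exploded.index.tolist())   # [0, 1, 1, 2, 3, 4, 4, 4]

# Convert to integers, then average the rows that share an index label
means = exploded.astype(int).groupby(level=0).mean()
print(means.tolist())            # [1.0, 3.0, 2.0, 3.0, 2.0]

# The means are floats; cast back only if dropping the fractional part is acceptable
df['lanes'] = means.astype(int)
print(df['lanes'].tolist())      # [1, 3, 2, 3, 2]
```

Because the repeated index labels tie each exploded row back to its source row, grouping on level 0 of the index is enough to rebuild one value per original row.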

If the data are not lists but string representations of lists, use:

data = {'lanes': ['1',"['2','4']",'2','3',"['1','2','3']"]}
df = pd.DataFrame(data,columns=['lanes'])

import ast

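# mean(level=0) was removed in pandas 2.0; groupby(level=0).mean() is the equivalent
df['lanes'] = df['lanes'].apply(ast.literal_eval).explode().astype(int).groupby(level=0).mean().astype(int)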
print (df)
   lanes
0      1
1      3
2      2
3      3
4      2
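
For readers who prefer to handle one cell at a time, the parsing step can also be sketched as a small function. lane_mean is a hypothetical helper, not part of pandas or of the answer above, and it assumes every cell is either a numeric string or the repr of a list of numeric strings.

```python
import ast
import pandas as pd

def lane_mean(cell):
    """Hypothetical helper: parse one cell and return the mean of its values."""
    parsed = ast.literal_eval(cell)      # '1' -> 1, "['2','4']" -> ['2', '4']
    values = parsed if isinstance(parsed, list) else [parsed]
    numbers = [int(v) for v in values]
    return sum(numbers) / len(numbers)

df = pd.DataFrame({'lanes': ['1', "['2','4']", '2', '3', "['1','2','3']"]})
df['lanes'] = df['lanes'].apply(lane_mean)
print(df['lanes'].tolist())              # [1.0, 3.0, 2.0, 3.0, 2.0]
```

ast.literal_eval only evaluates Python literals, so it is a safer choice than eval for this kind of input; a cell that is not a valid literal raises an exception instead of being executed.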

Answer 2

You can also try the snippet below. It uses a list comprehension to get the result.

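import pandas as pd
data = {'lanes': ['1',['2','4'],'2','3',['1','2','3']]}


def mean(lst):
    return sum(lst) / len(lst)

# Wrap bare strings in a list so that multi-digit values such as '12' are not split into characters
values = [item if isinstance(item, list) else [item] for item in data['lanes']]
data2 = dict()
data2['lanes'] = [int(mean([int(x) for x in item])) for item in values]
df2 = pd.DataFrame(data2,columns=['lanes'])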
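
One detail worth checking in this approach is that int() truncates a non-integer mean rather than rounding it. The sketch below uses made-up lane values (not from the question) purely to illustrate the difference; whether truncation or rounding is wanted depends on the use case.

```python
def mean(lst):
    return sum(lst) / len(lst)

# Illustrative values chosen so that some means are not whole numbers
lanes = ['1', ['1', '2'], ['2', '3', '3']]

# Treat a bare string as a one-element list
normalized = [item if isinstance(item, list) else [item] for item in lanes]
means = [mean([int(x) for x in item]) for item in normalized]

truncated = [int(m) for m in means]    # drops the fractional part
rounded = [round(m) for m in means]    # rounds to nearest, ties go to the even integer

print(truncated)   # [1, 1, 2]
print(rounded)     # [1, 2, 3]
```

For the data in the question every mean is a whole number, so both variants give the same result there.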

Answer 1: score 4, accepted, 2021-05-22 05:24:13

Answer 2: score 2, 2021-05-22 05:40:42
