List of lists to a pandas DataFrame

Question

Hi, I am trying to build a lookup table so that, given a ListID, I can find the users who have it, and given a UserID, I can find all lists of that user.

The data comes in this format:

[['34', '345'],
['12', '23,534,34'],
['1', '13,42']]

What I would like is a pandas DataFrame that looks like this:

UserID, ListID
34, 345
12, 23
12, 534
12, 34
 1, 13
 1, 42

My thought was to turn the second string into a list by splitting on commas, but from there I am stuck. Any suggestions?

Answer 1

You should clean up your data before feeding it into the DataFrame constructor. Here is a simple script:

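import pandas as pd

data = [['34', '345'],
        ['12', '23,534,34'],
        ['1', '13,42']]

# Build one [UserID, ListID] pair per list ID.
new_data = []
for row in data:
    x, yvals = row
    # Split the comma-separated list IDs and repeat the user for each one.
    for y in yvals.split(','):
        new_data.append([x, y])

df = pd.DataFrame(new_data, columns=['UserID', 'ListID'])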
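Once the data is in this long format, both lookups from the question come down to filtering on one column. The following is only a minimal sketch, assuming the IDs stay as strings as in the original data. The helper names `lists_of_user` and `users_of_list` are hypothetical and not part of pandas.

```python
import pandas as pd

data = [['34', '345'],
        ['12', '23,534,34'],
        ['1', '13,42']]

# Flatten to one (UserID, ListID) pair per row, as in the script above.
pairs = [[user, lst] for user, lists in data for lst in lists.split(',')]
df = pd.DataFrame(pairs, columns=['UserID', 'ListID'])


def lists_of_user(frame, user_id):
    """Return every ListID that belongs to the given UserID (hypothetical helper)."""
    return frame.loc[frame['UserID'] == user_id, 'ListID'].tolist()


def users_of_list(frame, list_id):
    """Return every UserID that holds the given ListID (hypothetical helper)."""
    return frame.loc[frame['ListID'] == list_id, 'UserID'].tolist()


print(lists_of_user(df, '12'))
print(users_of_list(df, '34'))
```

For repeated lookups on a large frame, grouping once (for example `df.groupby('UserID')['ListID'].apply(list)`) may be preferable to filtering on every call.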

Answer 2

Here's one way:

In [386]: L = [['34', '345'], ['12', '23,534,34'], ['1', '13,42']]

In [387]: (pd.DataFrame(L, columns=['UserID', 'ListID'])
             .set_index('UserID')
             .ListID.str.split(',')
             .apply(pd.Series)
             .stack()
             .reset_index(level=0, name='ListID'))
Out[387]:
  UserID ListID
0     34    345
0     12     23
1     12    534
2     12     34
0      1     13
1      1     42
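On newer pandas versions the same reshaping can probably be written more compactly with `DataFrame.explode`. This is a sketch of a swapped-in technique, not the method used in this answer, and it assumes pandas 0.25 or later, where `explode` was introduced.

```python
import pandas as pd

L = [['34', '345'], ['12', '23,534,34'], ['1', '13,42']]

df = (pd.DataFrame(L, columns=['UserID', 'ListID'])
        # Turn each comma-separated string into a list of strings.
        .assign(ListID=lambda d: d['ListID'].str.split(','))
        # Emit one row per list element, repeating the UserID.
        .explode('ListID')
        # Renumber the rows 0..n-1 instead of keeping repeated labels.
        .reset_index(drop=True))

print(df)
```

The final `reset_index(drop=True)` is optional; without it, each exploded row keeps the index label of the row it came from.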

Answer 3

You can do it as follows:

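import pandas as pd

# The first column holds the user, the second the comma-separated list IDs.
df_tmp = pd.DataFrame([['34', '345'],
                       ['12', '23,534,34'],
                       ['1', '13,42']], columns=['UserID', 'ListIDs'])

# One column per list ID, then stack into a Series indexed by (row, position).
s = df_tmp['ListIDs'].str.split(',', expand=True).stack()
i = s.index.get_level_values(0)  # original row label of each stacked value
df = df_tmp.loc[i].copy()        # repeat each row once per list ID
df['ListID'] = s.values
del df['ListIDs']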

3 answers

Answer 1: 5 (accepted), 2017-08-28 12:47:03
Answer 2: 1, 2017-08-28 12:45:15
Answer 3: 0, 2017-08-28 12:52:39
