How to write a function that takes one row and returns a list of 2-dimensional tuples

Question

I am working on this dataset.

I want to take one row and return a list of 2-dimensional tuples. For example, for row 0, it should return: [('Action', 7.9), ('Adventure', 7.9), ('Fantasy', 7.9), ('Sci-Fi', 7.9)], so that every genre of the movie is paired with the same IMDB score.

This is from a school project and I can't think of a way to do it. Can anyone help me?

I'm sorry for the lack of detail in this question; I will try to lay out all the details now.

The dataset is movie_metadata.csv. I can't seem to attach the file here.

Once I have the function, I am supposed to apply it to all the rows until I have one list containing all the 2-dimensional tuples. Then I have to convert the list of tuples into a DataFrame. Ideally, I want to create a new dataset named 'genre_score' that has two columns: genre and imdb_score. Each row will have only one genre and the IMDB rating of a movie from that genre. Then I have to calculate the mean IMDB rating per genre and make the following graph.

I can probably figure out everything else except the function. Writing the function is the struggle for me.

Answer 1

Use a list comprehension that flattens the values split by |:

import pandas as pd

df = pd.DataFrame({'genres':['Action|Adventure|Fantasy|Sci-Fi','Action|Adventure|Fantasy'],
                   'imdb_score':[7.9,7.1]})
print (df)
                            genres  imdb_score
0  Action|Adventure|Fantasy|Sci-Fi         7.9
1         Action|Adventure|Fantasy         7.1

row = 0
L = [(x, i) for g,i in df.loc[[row], ['genres','imdb_score']].values for x in g.split('|')]
print (L)
[('Action', 7.9), ('Adventure', 7.9), ('Fantasy', 7.9), ('Sci-Fi', 7.9)]
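
Since the question asks for a function, the comprehension above can be wrapped in one and then applied to every row. This is only a minimal sketch, assuming the `genres` and `imdb_score` columns of the sample frame; the names `row_to_tuples`, `all_pairs`, and `genre_score` are illustrative, and the row is selected by integer position with `iloc` rather than by label.

```python
import pandas as pd

df = pd.DataFrame({'genres': ['Action|Adventure|Fantasy|Sci-Fi', 'Action|Adventure|Fantasy'],
                   'imdb_score': [7.9, 7.1]})

def row_to_tuples(df, row):
    """Return [(genre, score), ...] for the row at integer position `row`."""
    genres = df['genres'].iloc[row]
    # float() turns the NumPy scalar into a plain Python float
    score = float(df['imdb_score'].iloc[row])
    return [(genre, score) for genre in genres.split('|')]

# Apply the function to every row and flatten into one list of tuples.
all_pairs = [pair for row in range(len(df)) for pair in row_to_tuples(df, row)]

# Convert the list of tuples into the two-column frame the question describes.
genre_score = pd.DataFrame(all_pairs, columns=['genre', 'imdb_score'])

print(row_to_tuples(df, 0))
print(genre_score)
```

With the two sample rows, `genre_score` has one row per (movie, genre) pair, so seven rows in total.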

EDIT: Use Series.str.get_dummies for indicator columns, replace 0 with missing values, multiply by the scores with DataFrame.mul, take the mean of each column, and finally convert the Series to a DataFrame with Series.rename_axis and Series.reset_index:

import numpy as np

df1 = (df['genres'].str.get_dummies()
                   .replace(0, np.nan)
                   .mul(df['imdb_score'], axis=0)
                   .mean()
                   .rename_axis('genres')
                   .reset_index(name='imdb_score'))
print (df1)
      genres  imdb_score
0     Action         7.5
1  Adventure         7.5
2    Fantasy         7.5
3     Sci-Fi         7.9

Another solution is to use Series.str.split to get lists, then DataFrame.explode, and finally aggregate with mean:

df1 = (df.assign(genres=df['genres'].str.split('|'))
         .explode('genres')
         .groupby('genres', as_index=False)['imdb_score']
         .mean())
print (df1)
      genres  imdb_score
0     Action         7.5
1  Adventure         7.5
2    Fantasy         7.5
3     Sci-Fi         7.9
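
For the graph mentioned in the question, the per-genre means can be sorted before plotting. The sketch below reuses the explode approach on the same sample data; the name `means` is illustrative, and the plotting call is left as a comment because it assumes matplotlib is installed.

```python
import pandas as pd

df = pd.DataFrame({'genres': ['Action|Adventure|Fantasy|Sci-Fi', 'Action|Adventure|Fantasy'],
                   'imdb_score': [7.9, 7.1]})

# One row per (movie, genre), then the mean score per genre, highest first.
means = (df.assign(genres=df['genres'].str.split('|'))
           .explode('genres')
           .groupby('genres')['imdb_score']
           .mean()
           .sort_values(ascending=False))
print(means)

# Assumption: with matplotlib available, a bar chart could be drawn with
# means.plot.bar()
```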

Answer 2

Try this:

array = [ (col,val) for col,val in dataframe.iloc[row_num].items() ]
print(array)
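
Note that this comprehension pairs each column name with its value for the chosen row; it does not split the genres. A small sketch on sample data (the frame and `row_num` are assumed here, since the answer does not define them) shows what it produces and how the genre split could be added on top:

```python
import pandas as pd

dataframe = pd.DataFrame({'genres': ['Action|Adventure|Fantasy|Sci-Fi', 'Action|Adventure|Fantasy'],
                          'imdb_score': [7.9, 7.1]})
row_num = 0

# (column name, value) pairs for one row
array = [(col, val) for col, val in dataframe.iloc[row_num].items()]
print(array)

# Turn the pairs into a dict, then split the genres to get (genre, score) tuples.
record = dict(array)
pairs = [(genre, record['imdb_score']) for genre in record['genres'].split('|')]
print(pairs)
```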

Answer 3

You can use a dictionary inside a dictionary:

dataset = {'R1':{'C1':'V1','C2':'V2','C3':'V3'},
'R2':{'C1':'V1','C2':'V2','C3':'V3'},
'R3':{'C1':'V1','C2':'V2','C3':'V3'}
}
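
If this nested-dictionary layout is used, one row's (column, value) tuples come straight from `dict.items()`, and the whole structure can be loaded into pandas. A brief sketch, assuming the outer keys are row labels and the inner keys are column names; `row_tuples` and `frame` are illustrative names:

```python
import pandas as pd

dataset = {'R1': {'C1': 'V1', 'C2': 'V2', 'C3': 'V3'},
           'R2': {'C1': 'V1', 'C2': 'V2', 'C3': 'V3'},
           'R3': {'C1': 'V1', 'C2': 'V2', 'C3': 'V3'}}

# (column, value) tuples for a single row
row_tuples = list(dataset['R1'].items())
print(row_tuples)

# Outer keys become the index, inner keys become the columns.
frame = pd.DataFrame.from_dict(dataset, orient='index')
print(frame)
```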

Answer 4

You can write your function like this:

def myFunction(row):
    # Convert the 1-based row number to a 0-based list index
    row -= 1
    # Your list: one inner list of (column, value) tuples per row
    mylist = [
        # first row
        [
            ('genres', 'Action|Adventure|Fantasy|Sci-Fi'),
            ('num_user_for_reviews', 3054.0),
        ],
        # second row
        [
            ('genres', 'Action|Adventure|Fantasy'),
            ('num_user_for_reviews', 1238.0),
        ],
    ]
    return mylist[row]

Then call the function with the row you want:

# return the first row
myFunction(1)

Answer 1
1 accepted 2020-04-11 11:25:27

Answer 2
0 2020-04-11 11:23:10

Answer 3
0 2020-04-11 11:24:55

Answer 4
0 2020-04-11 11:33:46
