How do I write a function that takes one row and returns a list of 2-dimension tuples

I am working on this dataset:

[image of the dataset]

I want to take one row and return a list of 2-dimension tuples. For example, for row 0, it should return: [('Action', 7.9), ('Adventure', 7.9), ('Fantasy', 7.9), ('Sci-Fi', 7.9)]. That way, every genre of the movie gets the same IMDb score.

This is for a school project and I can't think of a way to do it. Can anyone help me?

I'm sorry for the lack of detail in this question; I will try to lay out all the details now.

The dataset is movie_metadata.csv. I can't seem to attach the file here.

Once I have the function, I am supposed to apply it to all the rows until I have one list containing all the 2-dimension tuples. Then I have to convert the list of tuples into a DataFrame. Ideally, I want to create a new data set named 'genre_score' that has two columns: genre and imdb_score. Each row will have only one genre and the IMDb rating of a movie from that genre. Then I have to calculate the mean IMDb rating per genre and make the following graph.

[image of the target graph]

I can probably figure out everything else; writing the function is the part I am struggling with.

Use a list comprehension that flattens the values split by |:

import pandas as pd

df = pd.DataFrame({'genres': ['Action|Adventure|Fantasy|Sci-Fi', 'Action|Adventure|Fantasy'],
                   'imdb_score': [7.9, 7.1]})
print(df)
#                             genres  imdb_score
# 0  Action|Adventure|Fantasy|Sci-Fi         7.9
# 1         Action|Adventure|Fantasy         7.1

row = 0
# Pair every genre in the selected row with that row's score
L = [(x, i) for g, i in df.loc[[row], ['genres', 'imdb_score']].values for x in g.split('|')]
print(L)
# [('Action', 7.9), ('Adventure', 7.9), ('Fantasy', 7.9), ('Sci-Fi', 7.9)]
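
The question also asks for a function that can be applied to every row and then turned into a 'genre_score' data set. A minimal sketch of that, wrapping the same idea in a function, could look like the following. The names row_to_pairs and pairs are made up for illustration, and the two-row frame stands in for movie_metadata.csv, which is not available here.

```python
import pandas as pd

def row_to_pairs(df, row):
    """Return [(genre, imdb_score), ...] for the row with index label `row`.

    Hypothetical helper; assumes 'genres' is a '|'-separated string.
    """
    genres = df.loc[row, 'genres']
    score = float(df.loc[row, 'imdb_score'])
    return [(genre, score) for genre in genres.split('|')]

# Small stand-in for the real data set
df = pd.DataFrame({'genres': ['Action|Adventure|Fantasy|Sci-Fi', 'Action|Adventure|Fantasy'],
                   'imdb_score': [7.9, 7.1]})

# Apply the function to every row and flatten the results into one list
pairs = [pair for row in df.index for pair in row_to_pairs(df, row)]

# Convert the list of tuples into the two-column data set
genre_score = pd.DataFrame(pairs, columns=['genre', 'imdb_score'])
print(genre_score)
```

With the sample frame, genre_score has one row per (movie, genre) combination, so seven rows in total. The mean per genre can then be taken with genre_score.groupby('genre')['imdb_score'].mean().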

EDIT: Use Series.str.get_dummies to build indicator columns, replace 0 with missing values, multiply by the score with DataFrame.mul, and take the mean of each column. Finally, convert the resulting Series to a DataFrame with Series.rename_axis and Series.reset_index:

import numpy as np

df1 = (df['genres'].str.get_dummies()
                   .replace(0, np.nan)
                   .mul(df['imdb_score'], axis=0)
                   .mean()
                   .rename_axis('genres')
                   .reset_index(name='imdb_score'))
print(df1)
#       genres  imdb_score
# 0     Action         7.5
# 1  Adventure         7.5
# 2    Fantasy         7.5
# 3     Sci-Fi         7.9
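
To see why this chain works, it may help to look at the intermediate results one step at a time. The sketch below uses the same two-row sample frame; the variable names indicators, scores and means are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'genres': ['Action|Adventure|Fantasy|Sci-Fi', 'Action|Adventure|Fantasy'],
                   'imdb_score': [7.9, 7.1]})

# One indicator column per genre: 1 if the movie has that genre, else 0
indicators = df['genres'].str.get_dummies()
print(indicators)

# Turn 0 into NaN so absent genres are ignored, then put each movie's
# score wherever the indicator is 1 (row-wise multiplication)
scores = indicators.replace(0, np.nan).mul(df['imdb_score'], axis=0)
print(scores)

# mean() skips NaN, so each genre is averaged only over movies that have it
means = scores.mean()
print(means)
```

The second movie has no Sci-Fi genre, so its Sci-Fi cell becomes NaN and the Sci-Fi mean is based on the first movie alone.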

Another solution is to use Series.str.split to get lists, then DataFrame.explode, and finally aggregate with mean:

df1 = (df.assign(genres=df['genres'].str.split('|'))
         .explode('genres')
         .groupby('genres', as_index=False)['imdb_score']
         .mean())
print(df1)
#       genres  imdb_score
# 0     Action         7.5
# 1  Adventure         7.5
# 2    Fantasy         7.5
# 3     Sci-Fi         7.9

Try this:

array = [ (col,val) for col,val in dataframe.iloc[row_num].items() ]
print(array)
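
For reference, here is a small sketch of what this produces on a two-row sample frame like the one used in the other answer. It yields (column, value) pairs, so to get one pair per genre the 'genres' value would still have to be split on '|'. The names column_pairs and genre_pairs are illustrative only.

```python
import pandas as pd

dataframe = pd.DataFrame({'genres': ['Action|Adventure|Fantasy|Sci-Fi', 'Action|Adventure|Fantasy'],
                          'imdb_score': [7.9, 7.1]})
row_num = 0

# Series.items() yields (column name, value) pairs for the selected row
column_pairs = list(dataframe.iloc[row_num].items())
print(column_pairs)
# [('genres', 'Action|Adventure|Fantasy|Sci-Fi'), ('imdb_score', 7.9)]

# One possible adaptation: split the genres and pair each with the score
row = dataframe.iloc[row_num]
genre_pairs = [(genre, row['imdb_score']) for genre in row['genres'].split('|')]
print(genre_pairs)
```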

You can use a dictionary inside a dictionary:

dataset = {
    'R1': {'C1': 'V1', 'C2': 'V2', 'C3': 'V3'},
    'R2': {'C1': 'V1', 'C2': 'V2', 'C3': 'V3'},
    'R3': {'C1': 'V1', 'C2': 'V2', 'C3': 'V3'},
}
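
As a rough illustration of how such a nested dictionary could be used, the sketch below looks up a single value and, assuming pandas is available, converts the structure to a DataFrame with the outer keys as row labels. The variable names value and frame are placeholders.

```python
import pandas as pd

dataset = {
    'R1': {'C1': 'V1', 'C2': 'V2', 'C3': 'V3'},
    'R2': {'C1': 'V1', 'C2': 'V2', 'C3': 'V3'},
    'R3': {'C1': 'V1', 'C2': 'V2', 'C3': 'V3'},
}

# Look up one cell: outer key is the row, inner key is the column
value = dataset['R1']['C2']
print(value)  # V2

# orient='index' makes the outer keys the row labels
frame = pd.DataFrame.from_dict(dataset, orient='index')
print(frame)
```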

You can write your function like this:

def myFunction(row):
    # Your list: one inner list of (column, value) tuples per row
    mylist = [
        # first row
        [
            ('genres', 'Action|Adventure|Fantasy|Sci-Fi'),
            ('num_user_for_reviews', 3054.0),
        ],
        # second row
        [
            ('genres', 'Action|Adventure|Fantasy'),
            ('num_user_for_reviews', 1238.0),
        ],
    ]
    # row is 1-based, while list indices start at 0
    return mylist[row - 1]

Then call the function with the row you want:

# returns the first row
myFunction(1)
