How can I create a pandas data frame in a certain way?

I need to create a pandas dataframe that contains all of the required information, where each row of the dataframe is one track. I also need to sort the dataframe by popularity score, so that the most popular track is at the top and the least popular is at the bottom. I tried many ways but they did not work. Your help is much appreciated.

Here is my nested dictionary.

{'Artist name': ['Paramore', 'Weezer', 'Lizzo'],
 'Track name': (['Still into You',
   "Ain't It Fun",
   'Hard Times',
   'Misery Business',
   'The Only Exception',
   'Ignorance',
   'Rose-Colored Boy',
   'Fake Happy',
   "That's What You Get",
   'Brick by Boring Brick'],
  ['Island In The Sun',
   "Say It Ain't So",
   'Buddy Holly',
   'Beverly Hills',
   'Africa',
   'The End of the Game',
   'Hash Pipe',
   'Undone - The Sweater Song',
   'My Name Is Jonas',
   'Take On Me'],
  ['Truth Hurts',
   'Good As Hell',
   'Good As Hell (feat. Ariana Grande) - Remix',
   'Juice',
   'Boys',
   'Tempo (feat. Missy Elliott)',
   'Blame It on Your Love (feat. Lizzo)',
   'Soulmate',
   'Water Me',
   'Like A Girl']),
 'Release date': (['2013-04-05',
   '2013-04-05',
   '2017-05-12',
   '2007-06-11',
   '2009-09-28',
   '2009-09-28',
   '2017-05-12',
   '2017-05-12',
   '2007-06-11',
   '2009-09-28'],
  ['2001-05-15',
   '1994-05-10',
   '1994-05-10',
   '2005-05-10',
   '2019-01-24',
   '2019-09-10',
   '2001-05-15',
   '1994-05-10',
   '1994-05-10',
   '2019-01-24'],
  ['2019-05-03',
   '2016-03-09',
   '2019-10-25',
   '2019-04-19',
   '2019-04-18',
   '2019-04-19',
   '2019-09-13',
   '2019-04-19',
   '2019-04-18',
   '2019-04-19']),
 'Popularity score': ([76, 74, 73, 73, 72, 69, 66, 66, 65, 65],
  [77, 75, 73, 71, 67, 67, 66, 65, 63, 62],
  [94, 90, 86, 84, 72, 78, 68, 72, 58, 71])}

There are definitely more efficient ways, but here's a solution:

import pandas as pd

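def gen_artist_frame(d):
    """Yield one DataFrame per artist, with one row per track."""
    # The first key holds the artist names; the other keys are per-track columns.
    categories = list(d)

    for idx, artist in enumerate(d['Artist name']):
        # Pick this artist's list out of each per-track column.
        artist_mat = [d[j][idx] for j in categories[1:]]

        # Each list becomes a row here, so transpose to get one row per track.
        artist_frame = pd.DataFrame(artist_mat, index=categories[1:]).T

        artist_frame[categories[0]] = artist

        yield artist_frame

def collapse_nested_artist(d):
    """Stack the per-artist frames into a single DataFrame."""
    # ignore_index=True avoids repeating the labels 0-9 once per artist.
    return pd.concat(list(gen_artist_frame(d)), ignore_index=True)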

d = {'Artist name': ['Paramore', 'Weezer', 'Lizzo'],
 'Track name': (['Still into You',
   "Ain't It Fun",
   'Hard Times',
   'Misery Business',
   'The Only Exception',
   'Ignorance',
   'Rose-Colored Boy',
   'Fake Happy',
   "That's What You Get",
   'Brick by Boring Brick'],
  ['Island In The Sun',
   "Say It Ain't So",
   'Buddy Holly',
   'Beverly Hills',
   'Africa',
   'The End of the Game',
   'Hash Pipe',
   'Undone - The Sweater Song',
   'My Name Is Jonas',
   'Take On Me'],
  ['Truth Hurts',
   'Good As Hell',
   'Good As Hell (feat. Ariana Grande) - Remix',
   'Juice',
   'Boys',
   'Tempo (feat. Missy Elliott)',
   'Blame It on Your Love (feat. Lizzo)',
   'Soulmate',
   'Water Me',
   'Like A Girl']),
 'Release date': (['2013-04-05',
   '2013-04-05',
   '2017-05-12',
   '2007-06-11',
   '2009-09-28',
   '2009-09-28',
   '2017-05-12',
   '2017-05-12',
   '2007-06-11',
   '2009-09-28'],
  ['2001-05-15',
   '1994-05-10',
   '1994-05-10',
   '2005-05-10',
   '2019-01-24',
   '2019-09-10',
   '2001-05-15',
   '1994-05-10',
   '1994-05-10',
   '2019-01-24'],
  ['2019-05-03',
   '2016-03-09',
   '2019-10-25',
   '2019-04-19',
   '2019-04-18',
   '2019-04-19',
   '2019-09-13',
   '2019-04-19',
   '2019-04-18',
   '2019-04-19']),
 'Popularity score': ([76, 74, 73, 73, 72, 69, 66, 66, 65, 65],
  [77, 75, 73, 71, 67, 67, 66, 65, 63, 62],
  [94, 90, 86, 84, 72, 78, 68, 72, 58, 71])}

frame = collapse_nested_artist(d)
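
The frame built above is not sorted yet, which the question also asks for. Here is a minimal sketch of that last step, assuming the column names from the question. It uses a shortened copy of the dictionary (the first two tracks per artist) to keep the example small; the names `sample`, `per_artist` and `tracks` are only for illustration.

```python
import pandas as pd

# Shortened copy of the question's dictionary: first two tracks per artist.
sample = {
    'Artist name': ['Paramore', 'Weezer', 'Lizzo'],
    'Track name': (['Still into You', "Ain't It Fun"],
                   ['Island In The Sun', "Say It Ain't So"],
                   ['Truth Hurts', 'Good As Hell']),
    'Release date': (['2013-04-05', '2013-04-05'],
                     ['2001-05-15', '1994-05-10'],
                     ['2019-05-03', '2016-03-09']),
    'Popularity score': ([76, 74], [77, 75], [94, 90]),
}

# Build one small frame per artist; the scalar artist name is broadcast
# to every row of that artist's frame.
per_artist = []
for idx, artist in enumerate(sample['Artist name']):
    per_artist.append(pd.DataFrame({
        'Artist name': artist,
        'Track name': sample['Track name'][idx],
        'Release date': sample['Release date'][idx],
        'Popularity score': sample['Popularity score'][idx],
    }))
tracks = pd.concat(per_artist, ignore_index=True)

# Most popular track first, then renumber the rows from 0.
tracks = (tracks.sort_values('Popularity score', ascending=False)
                .reset_index(drop=True))
print(tracks)
```

Building each frame from a dict of columns keeps 'Popularity score' as an integer column. With the transpose-based version, the columns come out as object dtype, so you may want to convert that column with `astype(int)` before sorting.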

Dictionaries are easier to turn into dataframes when all of the values in the key-value pairs are the same size. If possible, I would reformat your dictionary slightly. For example, nest each column under its artist to avoid assumptions about positions:

import pandas as pd

ex = {'foo': {'title': [1, 2], 'letter': ['a', 'b']},
      'bar': {'title': [3, 4], 'letter': ['c', 'd']},
      'fob': {'title': [5, 6], 'letter': ['e', 'f']},
     }

# Build one small frame per key, tag it with the key, then stack them.
frames = []
for key, value in ex.items():
    minidf = pd.DataFrame(value)
    minidf['label'] = key
    frames.append(minidf)
pd.concat(frames, ignore_index=True)

will return

   title letter label
0      1      a   foo
1      2      b   foo
2      3      c   bar
3      4      d   bar
4      5      e   fob
5      6      f   fob
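
To use this layout with the dictionary from the question, the data would first have to be regrouped by artist. Here is a rough sketch, assuming the 'Artist name' key holds the artists and every other key holds one list per artist in the same order. `to_nested` is a hypothetical helper, not part of pandas, and `sample` is a shortened copy of the question's data.

```python
import pandas as pd

def to_nested(d, artist_key='Artist name'):
    """Regroup a positional dict into {artist: {column: values}}."""
    columns = [k for k in d if k != artist_key]
    return {
        artist: {col: d[col][idx] for col in columns}
        for idx, artist in enumerate(d[artist_key])
    }

# Shortened sample: two artists, two tracks each.
sample = {
    'Artist name': ['Paramore', 'Weezer'],
    'Track name': (['Still into You', "Ain't It Fun"],
                   ['Island In The Sun', "Say It Ain't So"]),
    'Release date': (['2013-04-05', '2013-04-05'],
                     ['2001-05-15', '1994-05-10']),
    'Popularity score': ([76, 74], [77, 75]),
}

nested = to_nested(sample)

# Same loop as above, with the artist name as the label column.
frames = []
for artist, columns in nested.items():
    minidf = pd.DataFrame(columns)
    minidf['Artist name'] = artist
    frames.append(minidf)
result = pd.concat(frames, ignore_index=True)
print(result)
```

Once the data is keyed by artist, each track's values travel together, so the code no longer depends on the lists being in the same order as the artist names.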


