Python: create a 2-column DataFrame from a list and perform a calculation on the list

Question

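I'm taking my first steps with Python and hope you can help me with the following:

I have a list

scores = [1,1,1,2,2,2,3,3,3,3,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,5,5,5,5]

I'd like to create a DataFrame that has the score in column 1 and the frequency of that score in column 2.

Any help or pointers are appreciated. Thanks!

My first attempt wasn't very good: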

scores = [1,1,1,2,2,2,3,3,3,3,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,5,5,5,5]
freq = []
df = {'col1': scores, 'col2': freq}

Answer 1

First, create a Counter object to count the frequency of each score.

In [1]: scores = [1,1,1,2,2,2,3,3,3,3,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,5,5,5,5]

In [2]: from collections import Counter

In [3]: score_counts = Counter(scores)

In [4]: score_counts
Out[4]: Counter({5: 12, 4: 8, 3: 4, 1: 3, 2: 3})

In [5]: import pandas as pd

In [6]: pd.DataFrame.from_dict(score_counts, orient='index')
Out[6]: 

    0
1   3
2   3
3   4
4   8
5  12

[5 rows x 1 columns]

The part that may trip up some users is pd.DataFrame.from_dict(). The documentation is here: http://pandas.pydata.org/pandas-docs/dev/genic/pandas.DataFrame.from_dict.html
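
The frame above keeps the scores in the index and the counts in a single column named 0. If two named columns are wanted, as the question asks, here is a minimal sketch of one way to get them. The column names `score` and `freq` are my own choice for illustration and are not part of the original answer:

```python
from collections import Counter

import pandas as pd

scores = [1,1,1,2,2,2,3,3,3,3,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,5,5,5,5]

# Count how often each score occurs.
score_counts = Counter(scores)

# sorted(...items()) yields (score, count) pairs ordered by score;
# each pair becomes one row of the frame.
# "score" and "freq" are illustrative column names.
df = pd.DataFrame(sorted(score_counts.items()), columns=["score", "freq"])
print(df)
```

Building the frame from a list of pairs avoids the `orient='index'` detour and gives both columns a name in one step.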

Answer 2

I would use value_counts (see, for example, the Series documentation). Note that I've changed the data slightly here:

>>> import pandas as pd
>>> scores = [1]*3 + [2]*3 + [3]*4 + [4]*1 + [5]*4
>>> pd.value_counts(scores)
5    4
3    4
2    3
1    3
4    1
dtype: int64

You can change the output as needed:

>>> pd.value_counts(scores, ascending=True)
4    1
1    3
2    3
3    4
5    4
dtype: int64
>>> pd.value_counts(scores).sort_index()
1    3
2    3
3    4
4    1
5    4
dtype: int64
>>> pd.value_counts(scores).sort_index().to_frame()
   0
1  3
2  3
3  4
4  1
5  4
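
The last frame still carries the scores in the index and has one unnamed column. As a rough sketch, the same counts can be turned into two named columns by calling `value_counts` as a Series method and then resetting the index. The names `score` and `freq` are assumptions picked for illustration. As far as I know, newer pandas releases deprecate the top-level `pd.value_counts` in favour of the method form, which is why the sketch uses `pd.Series(...).value_counts()`:

```python
import pandas as pd

scores = [1]*3 + [2]*3 + [3]*4 + [4]*1 + [5]*4

# Count occurrences of each score and order the result by score.
counts = pd.Series(scores).value_counts().sort_index()

# Name the index "score", then move it into a regular column;
# name="freq" labels the column that holds the counts.
# Both names are illustrative.
df = counts.rename_axis("score").reset_index(name="freq")
print(df)
```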

Answer 3

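To compute the frequencies:

freq = {}
for score in scores:
    freq[score] = freq.get(score, 0) + 1

This gives you a dictionary that maps each score to its frequency. Then, to create the two columns, you can just create a dictionary such as: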

data = {'scores': scores, 'freq': freq}
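
For a frame with one row per distinct score, both columns need the same length, while `scores` above still holds all 30 raw entries. A minimal sketch of one way to line the two columns up, assuming pandas is the target and using `score` and `freq` as illustrative column names:

```python
import pandas as pd

scores = [1,1,1,2,2,2,3,3,3,3,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,5,5,5,5]

# Build the score -> frequency dictionary as in the loop above.
freq = {}
for score in scores:
    freq[score] = freq.get(score, 0) + 1

# Split the dictionary into two equal-length columns:
# the distinct scores and their frequencies.
data = {"score": list(freq.keys()), "freq": list(freq.values())}
df = pd.DataFrame(data)
print(df)
```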

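You can also do this with a list comprehension, where the list index equals the score and the value is its frequency. However, if the scores span a wide range you would need a large, sparse array, so you are probably better off using a dictionary as above.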
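
A small sketch of that list-based variant, assuming the scores are non-negative integers (the name `freq_list` is made up for this example):

```python
scores = [1,1,1,2,2,2,3,3,3,3,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,5,5,5,5]

# One slot per possible score from 0 up to max(scores);
# the index is the score, the value is how often it occurs.
# Assumes non-negative integer scores.
freq_list = [scores.count(s) for s in range(max(scores) + 1)]
print(freq_list)
```

Slot 0 stays at zero because no score of 0 occurs, which is the sparseness mentioned above: a single score of 1000000 would force a list with a million mostly empty slots. Calling `scores.count` once per slot also rescans the whole list each time, so the dictionary loop scales better.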

3 answers

Answer 1: score 2, accepted, 2014-07-18 16:37:47
Answer 2: score 2, 2014-07-18 16:49:00
Answer 3: score 0, 2014-07-18 16:50:21
