Grouping in Pandas
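I want to group the data in a DataFrame that has a "Count" column and another column, "State". The output should be a list of lists, where each sublist holds just the counts for one state.
Example output: [[120, 200], [40, 20, 40], ...]
Here 120 and 200 would be, say, the counts for California.
I tried the following: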
df_new = df[['State']].groupby(['Count']).to_list()
I get a KeyError: 'Count'
Traceback:
Traceback (most recent call last):
File "C:\Users\Michael\workspace\UCIIntrotoPythonDA\src\Michael_Madani_week3.py", line 84, in <module>
getStateCountsDF(filepath)
File "C:\Users\Michael\workspace\UCIIntrotoPythonDA\src\Michael_Madani_week3.py", line 81, in getStateCountsDF
df_new = df[['State']].groupby(['Count']).to_list()
File "C:\Users\Michael\Anaconda\lib\site-packages\pandas\core\generic.py", line 3159, in groupby
sort=sort, group_keys=group_keys, squeeze=squeeze)
File "C:\Users\Michael\Anaconda\lib\site-packages\pandas\core\groupby.py", line 1199, in groupby
return klass(obj, by, **kwds)
File "C:\Users\Michael\Anaconda\lib\site-packages\pandas\core\groupby.py", line 388, in __init__
level=level, sort=sort)
File "C:\Users\Michael\Anaconda\lib\site-packages\pandas\core\groupby.py", line 2148, in _get_grouper
in_axis, name, gpr = True, gpr, obj[gpr]
File "C:\Users\Michael\Anaconda\lib\site-packages\pandas\core\frame.py", line 1797, in __getitem__
return self._getitem_column(key)
File "C:\Users\Michael\Anaconda\lib\site-packages\pandas\core\frame.py", line 1804, in _getitem_column
return self._get_item_cache(key)
File "C:\Users\Michael\Anaconda\lib\site-packages\pandas\core\generic.py", line 1084, in _get_item_cache
values = self._data.get(item)
File "C:\Users\Michael\Anaconda\lib\site-packages\pandas\core\internals.py", line 2851, in get
loc = self.items.get_loc(item)
File "C:\Users\Michael\Anaconda\lib\site-packages\pandas\core\index.py", line 1572, in get_loc
return self._engine.get_loc(_values_from_object(key))
File "pandas\index.pyx", line 134, in pandas.index.IndexEngine.get_loc (pandas\index.c:3824)
File "pandas\index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas\index.c:3704)
File "pandas\hashtable.pyx", line 686, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12280)
File "pandas\hashtable.pyx", line 694, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12231)
KeyError: 'Count'
I feel like this should be a simple line of code. What am I doing wrong here?
A one-liner could be:
import pandas as pd
df = pd.DataFrame.from_dict({"State": ["ny", "or", "ny", "nm"],
"Counts": [100,300,200,400]})
list_new = df.groupby("State")["Counts"].apply(list).tolist()
print(list_new)
[[400], [100, 200], [300]]
You should read the groupby documentation to understand what result to expect from grouping and how to change the grouping ( http://pandas.pydata.org/pandas-docs/stable/groupby.html ).
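As for why the original attempt fails: the KeyError most likely arises because df[['State']] keeps only the "State" column, so the column being grouped on no longer exists in the frame that groupby sees. The sketch below reproduces that on the small example frame from the answer (which names the column "Counts", not "Count" as in the question) and then groups in the intended direction. Passing sort=False, so that states appear in order of first occurrence, and the dictionary view are illustrative additions, not part of the original answer.

```python
import pandas as pd

# Same toy data as in the answer above; the column is named "Counts" here.
df = pd.DataFrame({"State": ["ny", "or", "ny", "nm"],
                   "Counts": [100, 300, 200, 400]})

# Selecting only "State" drops "Counts", so grouping by it cannot find the key.
try:
    df[["State"]].groupby(["Counts"])
    raised = False
except KeyError:
    raised = True

# Group by "State", then collect each group's "Counts" into a list.
# sort=False keeps the groups in order of first appearance instead of
# sorting them alphabetically by state.
by_state = df.groupby("State", sort=False)["Counts"].apply(list)

list_new = by_state.tolist()         # [[100, 200], [300], [400]]
counts_by_state = by_state.to_dict() # {'ny': [100, 200], 'or': [300], 'nm': [400]}

print(raised)
print(list_new)
print(counts_by_state)
```

Keeping the intermediate Series (by_state) around is handy when you need to know which sublist belongs to which state, since the plain list of lists discards the state labels.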