
Concatenating data frames into a pandas DataFrame with preset row names (Python)

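I am trying to refactor code that used to be very manual and involved setting the index on every new data frame I created, essentially to produce this desired output: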

    f1          precision   recall
A   0.600315956 0.72243346  0.513513514
B   0.096692112 0.826086957 0.051351351
C   0.085642317 0.62962963  0.045945946
D   0.108641975 0.628571429 0.059459459

This is my current code:

summaryDF = pd.DataFrame().set_index(['A','B','C','D'])

def evaluation(trueLabels, evalLabels):

    precision = precision_score(trueLabels, evalLabels)
    recall = precision_score(trueLabels, evalLabels)
    f1 = precision_score(trueLabels, evalLabels)
    accuracy = accuracy_score(trueLabels, evalLabels)

    data = {'precision': precision,
               'recall': recall,
               'f1': f1}

    DF = pd.DataFrame(data)

    summaryDF.concat(DF,ignore_index=True)


results = [y_randpred,y_cat_random_to_binary,y_cat_random_to_binary_threshold,y_closed_random_to_binary]

for result in results:
    evaluation(y_true_claim, result)

This is my error traceback:

Traceback (most recent call last):
  File "/Users/dhruv/Documents/bla/bla/src/main/bla.py", line 419, in <module>
    summaryDF = pd.DataFrame().set_index(['A','B','C','D'])
  File "/Users/dhruv/anaconda/lib/python2.7/site-packages/pandas/core/frame.py", line 2607, in set_index
    level = frame[col].values
  File "/Users/dhruv/anaconda/lib/python2.7/site-packages/pandas/core/frame.py", line 1797, in __getitem__
    return self._getitem_column(key)
  File "/Users/dhruv/anaconda/lib/python2.7/site-packages/pandas/core/frame.py", line 1804, in _getitem_column
    return self._get_item_cache(key)
  File "/Users/dhruv/anaconda/lib/python2.7/site-packages/pandas/core/generic.py", line 1084, in _get_item_cache
    values = self._data.get(item)
  File "/Users/dhruv/anaconda/lib/python2.7/site-packages/pandas/core/internals.py", line 2851, in get
    loc = self.items.get_loc(item)
  File "/Users/dhruv/anaconda/lib/python2.7/site-packages/pandas/core/index.py", line 1572, in get_loc
    return self._engine.get_loc(_values_from_object(key))
  File "pandas/index.pyx", line 134, in pandas.index.IndexEngine.get_loc (pandas/index.c:3824)
  File "pandas/index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas/index.c:3704)
  File "pandas/hashtable.pyx", line 686, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12280)
  File "pandas/hashtable.pyx", line 694, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12231)
KeyError: 'A'

Any idea what I am doing wrong?

I solved my problem.

Using this answer, my code became:

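import pandas as pd
from sklearn.metrics import precision_score, recall_score, f1_score

# Start with an empty frame that only declares the metric columns.
summaryDF = pd.DataFrame(columns=('precision','recall','f1'))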

def evaluation(trueLabels, evalLabels):

    global summaryDF

    precision = precision_score(trueLabels, evalLabels)
    recall = recall_score(trueLabels, evalLabels)
    f1 = f1_score(trueLabels, evalLabels)

    data = {'precision': [precision],
               'recall': [recall],
               'f1': [f1]
            }

    DF = pd.DataFrame(data)

    summaryDF = pd.concat([summaryDF,DF])

results = [y_randpred,
           y_cat_random_to_binary,
           y_cat_random_to_binary_threshold,
           y_closed_random_to_binary,
           y_closedCat_random_to_binary_threshold]

for result in results:
    evaluation(y_true_claim, result)

# Assign the row labels once, after all rows have been appended.
summaryDF.index = ['A', 'B', 'C', 'D', 'E']

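The key points were that I needed to wrap the precision, recall, and F1 values in square brackets, and that I had to set the index by assigning to summaryDF.index rather than calling the set_index method.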
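Both points can be checked in isolation. The following is a minimal sketch, assuming a reasonably recent pandas; the scores are placeholders rounded from the desired output above, not real results, and the variable names are illustrative only.

```python
import pandas as pd

# Placeholder scores (rounded from the desired output table), not real results.
scores = {'precision': 0.72, 'recall': 0.51, 'f1': 0.60}

# 1) A dict of bare scalars gives pandas no row count to infer,
#    so the constructor raises ValueError.
try:
    pd.DataFrame(scores)
    scalar_dict_ok = True
except ValueError:
    scalar_dict_ok = False

# Wrapping each value in a list yields a one-row frame instead.
one_row = pd.DataFrame({name: [value] for name, value in scores.items()})

# 2) set_index looks up *column* names; 'A' is not a column, hence KeyError.
try:
    one_row.set_index(['A'])
    set_index_ok = True
except KeyError:
    set_index_ok = False

# Assigning to .index sets the row labels directly.
one_row.index = ['A']
```

Here both `scalar_dict_ok` and `set_index_ok` end up `False`, while `one_row` is a single row labelled `A`.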

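So I just append the data and set the index once at the end, rather than at the start before anything has been appended, because a newly created DataFrame already has an index of some kind. (set_index also interprets its argument as the names of existing columns, which is why the original call raised KeyError: 'A'.)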
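As a possible variation on the same idea, the rows can be collected in a plain list and the frame built once, with the row labels passed to the constructor. This is only a sketch and not the approach from the answer above: the `evaluation` helper below is a hand-rolled stand-in for sklearn's `precision_score`, `recall_score` and `f1_score` so the example has no extra dependencies, and the labels in `y_true` and `predictions` are made up.

```python
import pandas as pd


def evaluation(true_labels, eval_labels):
    """Return F1, precision and recall for binary labels as a plain dict.

    Hypothetical stand-in for the sklearn metric functions, used only to
    keep this sketch dependency-free.
    """
    pairs = list(zip(true_labels, eval_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {'f1': f1, 'precision': precision, 'recall': recall}


# Made-up labels for illustration only.
y_true = [1, 0, 1, 1, 0, 1]
predictions = {
    'A': [1, 0, 1, 0, 0, 1],
    'B': [1, 1, 0, 0, 0, 0],
}

# One dict per prediction set; no global state and no repeated concat.
rows = [evaluation(y_true, pred) for pred in predictions.values()]

# Build the summary in one call, with the row labels set up front.
summary = pd.DataFrame(rows,
                       index=list(predictions.keys()),
                       columns=['f1', 'precision', 'recall'])
```

Returning a dict from the function and constructing the frame once avoids the `global` statement and keeps the row labels next to the data they describe.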
