Processing multi-value strings in a Pandas DataFrame column

Question

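I have a survey dataset in which one column (one question) has multiple possible answers. The data in that column is the string representation of a list holding anywhere from zero to five values, e.g. '[1]' or '[1, 2, 3, 5]'.

I am trying to process the column so that I can access the values individually, as follows: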

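import re
from pandas import notnull

def f(x):
    if notnull(x):
        # Strip brackets, quotes and whitespace, then split on commas.
        p = re.compile(r'[\[\]\'\s]')
        places = p.sub('', x).split(',')
        place_tally = {'1': 0, '2': 0, '3': 0, '4': 0, '5': 0}
        for place in places:
            if place:  # an empty list '[]' yields one empty string; skip it
                place_tally[place] += 1
        return place_tally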

df['places'] = df.where_buy.map(f)

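This creates a new column 'places' in my DataFrame whose cells contain values such as {'1': 1, '3': 0, '2': 0, '5': 0, '4': 0} or {'1': 1, '3': 1, '2': 1, '5': 1, '4': 0}.

Now, what is the most efficient/concise way to extract the data from the new column? I tried iterating over the DataFrame, but without success, i.e.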

    for row_index, row in df.iterrows():
         r = row['places']
         if r is not None:
             df.ix[row_index]['large_super'] = r['1']
             df.ix[row_index]['small_super'] = r['2']

This does not seem to work.

Thanks.

Answer 1

Is this what you intend to do?

for i in range(1,6):
    df['super_'+str(i)] = df['place'].map(lambda x: x.count(str(i)) )
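
The loop above calls x.count on every cell, so it would fail on a missing answer, and it counts characters in the string, not list items. A minimal sketch of one possible variant follows. It assumes the source column is the question's where_buy, that each cell is either missing or a valid list literal, and it uses made-up sample data and a hypothetical helper named parse_places. It parses each string with ast.literal_eval and then counts integer items.

```python
import ast

import pandas as pd

# Made-up sample data shaped like the column described in the question.
df = pd.DataFrame({"where_buy": ["[1]", "[1, 2, 3, 5]", None, "[]"]})


def parse_places(value):
    """Hypothetical helper: turn '[1, 2, 3]' into [1, 2, 3]; missing values become []."""
    if pd.isnull(value):
        return []
    return ast.literal_eval(value)


# Parse once, then derive one count column per possible answer (1 to 5).
places = df["where_buy"].map(parse_places)
for i in range(1, 6):
    # i=i binds the current loop value inside the lambda.
    df["super_" + str(i)] = places.map(lambda items, i=i: items.count(i))
```

Parsing once up front keeps the per-answer step to a plain list count, and rows with no answer simply get zeros in every super_ column.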

Solution 1
0 2012-09-05 10:44:34
