
Nested Dictionary Errors - Python Pandas

I have the following code:

import os
import pandas as pd 
from pandas import ExcelWriter
from pandas import ExcelFile

fileName= input("Enter file name here (Case Sensitive) > ")
df = pd.read_excel(fileName +'.xlsx', sheetname=None, ignore_index=True)
xl = pd.ExcelFile(fileName +'.xlsx')
SystemCount= len(xl.sheet_names)
df1 = pd.DataFrame([])

for y in range(1, int(SystemCount)+ 1): 
    df = pd.read_excel(xl,'System ' + str(y))  #reads each sheet
    df['System {0}'.format(y)] = "1"  #adds a column for each system, sets the column = 1
    df1 = df1.append(df)  #appends all sheets together into a new df


df1 = df1.sort_values(['Email']) #sorts by email
df = df1['Email'].value_counts() #counts the amount each email shows
df1['Count'] = df1.groupby('Email')['Email'].transform('count') #adds the count to the end


df1 = df1.apply(lambda x : pd.to_numeric(x,errors='ignore')) #turns ints to floats
d = dict(zip(df1.columns[1:],['sum']*df1.columns[1:].str.contains('System').sum()+['first'])) #adds up each row
df1 = df1.fillna(0).groupby('Email').agg(d) #turns NAN into 0 and groups everything together
df1 = df1.reset_index() #email column was turned into an index with above line, this turns it back to a df column


SystemsList = []#creates empty list
for count in range(1, int(SystemCount)+1): #counts up to the system amount
    SystemsList.append(['System {0}'.format(count)]) #creates list of systems

SystemDict = {}
for item in SystemsList:
    SystemDict[item]=df1[df1[item]== 1]["Email"]

It outputs something along these lines (a small section of the output):

    Email           System 1  System 2  System 3  System 4  Count
    test1@test.com         0         1         0         1      2
    test2@test.com         1         0         0         1      2
    test3@test.com         1         1         0         1      3
    test4@test.com         1         0         1         0      2

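I am trying to create a nested dictionary with one entry per system, holding the emails of every row where that system's column shows a 1, using the following part of the code: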

SystemDict = {}
for item in SystemsList:
    SystemDict[item]=df1[df1[item]== 1]["Email"]

But I am getting the following error: ValueError: Boolean array expected for the condition, not float64. Any ideas on how to fix this?

Here is one way.

import pandas as pd

lst = [['test1@test.com', 0, 1, 0, 1, 2],
       ['test2@test.com', 1, 0, 0, 1, 2],
       ['test3@test.com', 1, 1, 0, 1, 3],
       ['test4@test.com', 1, 0, 1, 0, 2]]

df = pd.DataFrame(lst, columns=['Email', 'System 1', 'System 2',
                                'System 3', 'System 4', 'Count'])

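# Multiplying a string by 0 gives '' and by 1 leaves it unchanged;
# filter(None, ...) then drops the empty strings.
d = {'System'+str(i): list(filter(None, df['System '+str(i)]*df['Email']))
     for i in range(1, 5)}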

# {'System1': ['test2@test.com', 'test3@test.com', 'test4@test.com'],
#  'System2': ['test1@test.com', 'test3@test.com'],
#  'System3': ['test4@test.com'],
#  'System4': ['test1@test.com', 'test2@test.com', 'test3@test.com']}
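
An alternative that stays closer to the loop in the question is sketched below. It assumes the cause of the error is that SystemsList holds one-element lists (['System 1']) rather than plain strings ('System 1'). With a list, df1[item] is a DataFrame instead of a Series, so df1[item] == 1 is no longer a simple row mask, and a list could not be used as a dictionary key anyway. The names systems_list and system_dict are illustrative, and the sample frame is copied from the output shown in the question.

```python
import pandas as pd

# Sample data copied from the output table in the question.
df1 = pd.DataFrame(
    [['test1@test.com', 0, 1, 0, 1, 2],
     ['test2@test.com', 1, 0, 0, 1, 2],
     ['test3@test.com', 1, 1, 0, 1, 3],
     ['test4@test.com', 1, 0, 1, 0, 2]],
    columns=['Email', 'System 1', 'System 2', 'System 3', 'System 4', 'Count'])

# Plain strings (not one-element lists), so df1[item] is a Series.
systems_list = ['System {0}'.format(n) for n in range(1, 5)]

system_dict = {}
for item in systems_list:
    mask = df1[item] == 1  # boolean Series with one entry per row
    system_dict[item] = df1.loc[mask, 'Email'].tolist()

print(system_dict)
```

Here the keys keep the original column names ('System 1', not 'System1'), and .tolist() stores plain lists; dropping it would store pandas Series instead, as the loop in the question does.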

