
Duplicated data in Excel using Openpyxl

I have created a Python script that appends data to an Excel file. However, the data written to Excel is duplicated multiple times. Can someone help me fix my script?

tree = ET.parse('users.xml')
root = tree.getroot()
#create excel
wb = Workbook()
ws = wb.active
ws.title = ("Active Users")
df=pd.DataFrame(columns=["Login", "User Name", "Role", "Status"])
for user in root.findall('user'):
    login = user.find('login').text
    for m in tls.getUserByLogin(login):
        user_status = int(m.get("isActive"))
        
        if user_status == 1:
            lastname = m.get("lastName")
            firstname = m.get("firstName")
            userLogin = m.get("login")
            activeStatus = ("Active User")
            role = m.get("globalRole")
            tproject = m.get("tprojectRoles")    
            print("Login: " + userLogin + " " + lastname + " " + firstname + " Role: " + str(role['name']) + " " + str(activeStatus))
            df.loc[len(df.index)] =[userLogin, lastname, str(role['name']), str(activeStatus)]
            for row in dataframe_to_rows(df, index = False):
                ws.append(row)          
        else:
            inactive = (str(m.get("firstName")) + " " + str(m.get("lastName")) +": User is not Active")
            print(inactive)
    wb.save(filename = 'userData.xlsx')

The output in Excel looks like this (Login = A1, User Name = B1, Role = C1, Status = D1):

  1. Login User Name Role Status
  2. admin Administrator Admin Active
  3. Login User Name Role Status
  4. admin Administrator Admin Active
  5. user1 Pedro leader Active
  6. Login User Name Role Status
  7. admin Administrator Admin Active
  8. user1 Pedro leader Active
  9. user2 Juan leader Active

Also, for the else branch that handles inactive users, is it possible to append them to another sheet in the same Excel file? Thank you all.

Hi @Redox and @taipei, thank you for your quick responses and answers. I have resolved my duplication issue in a different way :)

def getUserDetail():
    tree = ET.parse('users.xml')
    root = tree.getroot()
    #create excel
    workbook = Workbook()
    ws = workbook.active
    ws.title = ("Active Users")
    ws.append(['Login', 'User Name', 'Role', 'Status'])
    #logins = []
    for user in root.findall('user'):
        login = user.find('login').text
    #    logins.append(login)
    # for index in range(10):
    #     login = logins[index]
        for m in tls.getUserByLogin(login):
            user_status = int(m.get("isActive"))
            if user_status == 1:
                lastname = m.get("lastName")
                firstname = m.get("firstName")
                userLogin = m.get("login")
                activeStatus = ("Active User")
                role = m.get("globalRole")
                tproject = m.get("tprojectRoles")
                print("Login: " + userLogin + " " + lastname + " " + firstname + " Role: " + str(role['name']) + " " + str(activeStatus))
                data = [[userLogin, lastname + firstname, str(role['name']), str(activeStatus)]]
                for row in data:
                    ws.append(row)
            else:
                inactive = (str(m.get("firstName")) + " " + str(m.get("lastName")) +": User is not Active")
                print(inactive)
    ### MOVED code here - note it should be outside ALL for loops ####
    workbook.save(filename = 'userData.xlsx')

getUserDetail()

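The loop that calls ws.append() and the wb.save() call should be outside all of the for loops, including the first one. In the original script the whole DataFrame (header included) was re-appended to the sheet on every iteration, which is why earlier rows kept repeating. Updated code here.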


tree = ET.parse('users.xml')
root = tree.getroot()
#create excel
wb = Workbook()
ws = wb.active
ws.title = ("Active Users")
df=pd.DataFrame(columns=["Login", "User Name", "Role", "Status"])
for user in root.findall('user'):
    login = user.find('login').text
    for m in tls.getUserByLogin(login):
        user_status = int(m.get("isActive"))
        
        if user_status == 1:
            lastname = m.get("lastName")
            firstname = m.get("firstName")
            userLogin = m.get("login")
            activeStatus = ("Active User")
            role = m.get("globalRole")
            tproject = m.get("tprojectRoles")    
            print("Login: " + userLogin + " " + lastname + " " + firstname + " Role: " + str(role['name']) + " " + str(activeStatus))
            df.loc[len(df.index)] =[userLogin, lastname, str(role['name']), str(activeStatus)]
        else:
            inactive = (str(m.get("firstName")) + " " + str(m.get("lastName")) +": User is not Active")
            print(inactive)

### MOVED code here - note it should be outside ALL for loops ####
for row in dataframe_to_rows(df, index = False):
    ws.append(row)          

wb.save(filename = 'userData.xlsx')
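
To see why the original placement produces exactly the nine rows shown in the question, here is a minimal sketch that replaces the DataFrame and the worksheet with plain Python lists. The names `write_inside_loop` and `write_after_loop` are made up for illustration, and the sample rows are copied from the question's output; no Excel file is written.

```python
HEADER = ["Login", "User Name", "Role", "Status"]

# Sample rows taken from the output shown in the question.
USERS = [
    ["admin", "Administrator", "Admin", "Active"],
    ["user1", "Pedro", "leader", "Active"],
    ["user2", "Juan", "leader", "Active"],
]


def write_inside_loop(users):
    """Mimic the original script: dump the whole table on every iteration."""
    table = []  # stands in for the DataFrame
    sheet = []  # stands in for the worksheet
    for user in users:
        table.append(user)
        # dataframe_to_rows(df, index=False) yields the header row first,
        # followed by every row collected so far.
        sheet.append(HEADER)
        sheet.extend(table)
    return sheet


def write_after_loop(users):
    """Mimic the fix: collect all rows first, then write the table once."""
    table = []
    for user in users:
        table.append(user)
    return [HEADER] + table


print(len(write_inside_loop(USERS)))  # 2 + 3 + 4 = 9 rows
print(len(write_after_loop(USERS)))   # 1 header + 3 users = 4 rows
```

The first function grows the sheet by the full table size on each pass, while the second writes each user only once.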

Are you sure that users.xml contains only unique users?

If you're not sure, I think it's better to add a check for users that have already been processed.

To achieve that, you can use a dictionary or list to temporarily store each login inside the loop and check whether the current user already exists.

. . .
user_tmp = []
for user in root.findall('user'):
    login = user.find('login').text
    # Check if login is in the list
    if login not in user_tmp:
        user_tmp.append(login)
    else:
        # if login is in the list, continue the loop
        continue
 . . .
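
As a variation on the check above, a set can be used instead of a list, so that each membership test does not scan all the logins seen so far. This is only a sketch: the XML layout (a root element with `user` children that each contain a `login` element) is inferred from the question's code, and the `SAMPLE_XML` string and the `unique_logins` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input; the real users.xml may contain more fields.
SAMPLE_XML = """<users>
  <user><login>admin</login></user>
  <user><login>user1</login></user>
  <user><login>admin</login></user>
</users>"""


def unique_logins(root):
    """Return the logins in document order, skipping repeated ones."""
    seen = set()
    result = []
    for user in root.findall("user"):
        login = user.find("login").text
        if login in seen:
            # Already processed this login, so skip the duplicate entry.
            continue
        seen.add(login)
        result.append(login)
    return result


root = ET.fromstring(SAMPLE_XML)
print(unique_logins(root))
```

With the sample above, the second `admin` entry is skipped and the order of first appearance is kept.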

Since you are using a pandas DataFrame, you can generate multiple sheets when saving the data with to_excel:

# Example: active users are collected in df_active and inactive users in df_inactive.
# Create an Excel writer object (raw string, so "\f" is not read as an escape sequence)
with pd.ExcelWriter(r"path to file\filename.xlsx") as writer:
    # use to_excel function and specify the sheet_name and index
    # to store the dataframe in specified sheet
    df_active.to_excel(writer, sheet_name="Active", index=False)
    df_inactive.to_excel(writer, sheet_name="Inactive", index=False)
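
If you would rather stay with openpyxl only, a second sheet can be created with `Workbook.create_sheet`. The sketch below is one possible shape, not a tested drop-in: it assumes each user record is a dict with the keys used in the question (`login`, `firstName`, `lastName`, `isActive`, `globalRole`), and the helper names `split_by_status` and `write_user_sheets`, the sheet title "Inactive Users", and the status text "Inactive User" are made up for illustration.

```python
HEADER = ["Login", "User Name", "Role", "Status"]


def split_by_status(users):
    """Split user dicts into row lists for active and inactive users."""
    active, inactive = [], []
    for u in users:
        name = f'{u["firstName"]} {u["lastName"]}'
        role = str(u["globalRole"]["name"])
        if int(u["isActive"]) == 1:
            active.append([u["login"], name, role, "Active User"])
        else:
            inactive.append([u["login"], name, role, "Inactive User"])
    return active, inactive


def write_user_sheets(users, filename):
    """Write active and inactive users to two sheets of one workbook."""
    # Third-party dependency, imported here so the helper above
    # can be used without openpyxl installed.
    from openpyxl import Workbook

    active, inactive = split_by_status(users)
    wb = Workbook()
    ws_active = wb.active
    ws_active.title = "Active Users"
    ws_inactive = wb.create_sheet(title="Inactive Users")
    for ws, rows in ((ws_active, active), (ws_inactive, inactive)):
        ws.append(HEADER)  # header is written once per sheet
        for row in rows:
            ws.append(row)
    wb.save(filename)  # saved once, after all rows are in place
```

In the question's script, the dicts returned by `tls.getUserByLogin(login)` could be collected into a list inside the loops and passed to `write_user_sheets(users, 'userData.xlsx')` once the loops have finished.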

I hope my suggestions give you some hints for solving your issue.
