Iterate on two dictionaries with lists as values

I have data about employees' punch-clock times.

An employee can report their starting time through an automated process (by key card or by fingerprint), or manually through a simple web form.

The problem is that some employees accidentally reported their time through more than one method.

The input data comes as two dictionaries whose values are lists (both are shown after the code below).

I'm trying to iterate over both dictionaries and count (not sum) the days on which the employee logged in.

def count_type(logins, punch_clock_data):
    counter = 0
    ans_dict = {}
    for punch_clock_key, punch_clock_value in punch_clock_data.items():
        for element in punch_clock_value:
            if element not in ans_dict:
                l1 = logins.get(element)
                for i in range(len(l1)):
                    if i == 1:
                        counter+=1
                        break
            ans_dict[punch_clock_key] = counter
        print(f'this employee was at work {counter} days this week by {punch_clock_key} login')
    return ans_dict


# each item in the array is a representation of weekday. for example, 
# on Sunday there was not any log in data.
# on Tuesday, this employee reported login both on web form and by 
# card(flag = 1) etc.


logins = { 'card'        :[0, 1, 1, 0, 0, 0, 0],
           'fingerprint' :[0, 0, 0, 1, 1, 0, 0],
           'web form'    :[0, 0, 1, 1, 0, 1, 1]
}


# dictionary contains data on types of punch clock
punch_clock_data  = { 'automated' :['card', 'fingerprint'],
                      'manual'    :['web form'],
                      'all_types' :['card', 'fingerprint', 'web form']
}

res = count_type(logins, punch_clock_data)
print(res)

My output is not as expected. This is my output:

{'automated': 2, 'manual': 3, 'all_types': 6}

What I'm trying to get is:

{'automated': 4, 'manual': 4, 'all_types': 6}
  • automated should be 4 because there are four days where a flag equals 1 (Monday and Tuesday by card, Wednesday and Thursday by fingerprint)
  • manual should be 4 because there are four days where the flag equals 1 (Tuesday, Wednesday, Friday, Saturday)
  • all_types should be 6 because there are six days where at least one flag equals 1

I think my problem is that I need to iterate over all of the weekday lists by index, and not by value. For each day of the week, get the right index and then count it (vertically and not horizontally).

It looks like you only want to count one login on days where the employee logged in by more than one method within a specific punch clock category. You could zip the login method lists together for each category and test whether there was a login for any of them.

logins = {'card': [0, 1, 1, 0, 0, 0, 0], 'fingerprint' :[0, 0, 0, 1, 1, 0, 0], 'web form': [0, 0, 1, 1, 0, 1, 1]}
punch_clock_data = { 'automated': ['card', 'fingerprint'], 'manual': ['web form'], 'all_types': ['card', 'fingerprint', 'web form']}

results = {}
for group, keys in punch_clock_data.items():
    results[group] = sum(any(t) for t in zip(*[logins[k] for k in keys]))

print(results) 
# {'automated': 4, 'manual': 4, 'all_types': 6}
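
For reference, here is a small sketch of why the original function returns 2, 3 and 6. Its inner loop adds 1 as soon as i reaches 1, so each login method contributes exactly one increment regardless of the flags, and counter is never reset between categories. The names running and effective are made up for this illustration.

```python
logins = {'card': [0, 1, 1, 0, 0, 0, 0],
          'fingerprint': [0, 0, 0, 1, 1, 0, 0],
          'web form': [0, 0, 1, 1, 0, 1, 1]}
punch_clock_data = {'automated': ['card', 'fingerprint'],
                    'manual': ['web form'],
                    'all_types': ['card', 'fingerprint', 'web form']}

# What the original loop effectively computes: one increment per login
# method, accumulated across categories because the counter is shared.
running = 0
effective = {}
for group, methods in punch_clock_data.items():
    running += len(methods)
    effective[group] = running

print(effective)
# {'automated': 2, 'manual': 3, 'all_types': 6}
```

So the flag values in the lists are never actually inspected, which is why the numbers do not match the expected day counts.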

Per your comment requesting a version that makes it easier to see the steps involved, here is a bit of a breakdown:

results = {}
for group, keys in punch_clock_data.items():
    # list of lists of logins for each method in the category
    all_logins = [logins[k] for k in keys]

    # zip all_logins by day of the week
    logins_per_day = zip(*all_logins)

    # add 1 for each day where any of the values in the tuple are not zero
    results[group] = sum(any(t) for t in logins_per_day)
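
To make the intermediate values concrete, here is a sketch of the same steps for the 'automated' category only, with the zip materialised into a list so it can be printed. The variable flags is an extra name introduced just for this illustration.

```python
logins = {'card': [0, 1, 1, 0, 0, 0, 0],
          'fingerprint': [0, 0, 0, 1, 1, 0, 0],
          'web form': [0, 0, 1, 1, 0, 1, 1]}

keys = ['card', 'fingerprint']  # the 'automated' category

# one list per login method
all_logins = [logins[k] for k in keys]

# one tuple per weekday, holding that day's flag for each method
logins_per_day = list(zip(*all_logins))
print(logins_per_day)
# [(0, 0), (1, 0), (1, 0), (0, 1), (0, 1), (0, 0), (0, 0)]

# True for each day on which at least one method has a 1
flags = [any(t) for t in logins_per_day]
print(flags)
# [False, True, True, True, True, False, False]

# True counts as 1 when summed
print(sum(flags))
# 4
```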

Look at this code:

def count_type(logins, punch_clock_data):
    ans_dict = {}
    for punch_clock_key, punch_clock_value in punch_clock_data.items():
        counter = 0
        tmp_tab = [0] * 7
        for login_key in punch_clock_value:
            for i in range(len(logins[login_key])):
                tmp_tab[i] += logins[login_key][i]
        for day in tmp_tab:
            counter += day > 0
        ans_dict[punch_clock_key] = counter
    return ans_dict

For example, with all_types, I create a tmp_tab which combines your 3 lists into

[0, 1, 2, 2, 1, 1, 1]

Each entry is the sum of one column (one day), and counter is incremented by 1 whenever that column's value is greater than 0.
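
As a rough sketch of the same column-sum idea written more compactly, zip can produce the columns directly. The names column_sums and days_present are made up for this example and do not appear in the function above.

```python
logins = {'card': [0, 1, 1, 0, 0, 0, 0],
          'fingerprint': [0, 0, 0, 1, 1, 0, 0],
          'web form': [0, 0, 1, 1, 0, 1, 1]}

# sum each column (weekday) across all three login methods
column_sums = [sum(col) for col in zip(*logins.values())]
print(column_sums)
# [0, 1, 2, 2, 1, 1, 1]

# a day counts once if its column sum is greater than 0
days_present = sum(1 for s in column_sums if s > 0)
print(days_present)
# 6
```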

The key here is that you need to combine the logins day by day, so that a day counts once if any method has a 1. For example, for all types:

       'card':        [0, 1, 1, 0, 0, 0, 0]
       'fingerprint' :[0, 0, 0, 1, 1, 0, 0]
       'web form'    :[0, 0, 1, 1, 0, 1, 1]
       'all type'    :[0, 1, 1, 1, 1, 1, 1]  total = 6

So you could try this:

NUMBER_OF_DAY = 7
def count_type(logins, punch_clock_data):
    ans_dict = {}
    for punch_clock_key, punch_clock_values in punch_clock_data.items():
        # login lists for every method in this category
        element_list = [logins[punch_clock_value] for punch_clock_value in punch_clock_values]
        # combine the lists day by day
        # EX:
        # [0, 1, 1, 0, 0, 0, 0] combined with [0, 0, 0, 1, 1, 0, 0]
        # gives [0, 1, 1, 1, 1, 0, 0]
        total_days = [0] * NUMBER_OF_DAY
        for element in element_list:
            for day_index, is_checked in enumerate(element):
                # is_checked is 1 if the employee logged in that day, else 0
                if is_checked:
                    # mark the day with 1 instead of adding, so a day counts once
                    total_days[day_index] = 1

        # now just sum the per-day flags
        ans_dict[punch_clock_key] = sum(total_days)
    return ans_dict

Using zip and a list comprehension will help to shorten the code:

def count_type(logins, punch_clock_data):
    ans_dict = {}
    for punch_clock_key, punch_clock_values in punch_clock_data.items():
        # list of all login
        element_list = [logins[punch_clock_value] for punch_clock_value in punch_clock_values]
        # zip them so when we iterate over them we get a tuple of login of one day in each iteration
        element_list = zip(*element_list)
        total = 0
        for days in element_list:
            total += any(days) and 1 or 0
        ans_dict[punch_clock_key] = total
    return ans_dict

Now we can simplify the code even more:

  element_list = [logins[punch_clock_value] for punch_clock_value in punch_clock_values]
  element_list = zip(*element_list)

  # to this 
  element_list = zip(*[logins[punch_clock_value] for punch_clock_value in punch_clock_values])

And thanks to the built-in sum:

    total = 0
    for days in element_list:
        total += any(days) and 1 or 0
    ans_dict[punch_clock_key] = total


    # to this 
    ans_dict[punch_clock_key] = sum(any(days) for days in element_list)

So the final function:

def count_type(logins, punch_clock_data):
    ans_dict = {}
    for punch_clock_key, punch_clock_values in punch_clock_data.items():
        # zip the login lists so that each tuple holds one day's flags
        element_list = zip(*[logins[punch_clock_value] for punch_clock_value in punch_clock_values])
        ans_dict[punch_clock_key] = sum(any(days) for days in element_list)
    return ans_dict
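
A quick usage sketch for the final function, run against the data from the question. The second call uses a made-up input (short, 'both') only to illustrate one thing worth keeping in mind: zip stops at the shortest list, so lists of unequal length are silently truncated.

```python
def count_type(logins, punch_clock_data):
    ans_dict = {}
    for punch_clock_key, punch_clock_values in punch_clock_data.items():
        # zip the login lists so that each tuple holds one day's flags
        element_list = zip(*[logins[punch_clock_value] for punch_clock_value in punch_clock_values])
        ans_dict[punch_clock_key] = sum(any(days) for days in element_list)
    return ans_dict


logins = {'card': [0, 1, 1, 0, 0, 0, 0],
          'fingerprint': [0, 0, 0, 1, 1, 0, 0],
          'web form': [0, 0, 1, 1, 0, 1, 1]}
punch_clock_data = {'automated': ['card', 'fingerprint'],
                    'manual': ['web form'],
                    'all_types': ['card', 'fingerprint', 'web form']}

result = count_type(logins, punch_clock_data)
print(result)
# {'automated': 4, 'manual': 4, 'all_types': 6}

# Hypothetical input with unequal lengths: only the first day is compared,
# because zip stops once the shortest list is exhausted.
short = {'card': [1, 1, 1], 'web form': [0]}
truncated = count_type(short, {'both': ['card', 'web form']})
print(truncated)
# {'both': 1}
```

If the lists can differ in length in your real data, it may be worth validating them before counting.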
