
How to count occurrences of a value match from a text file

Here's my problem: each employee is uniquely identified by an ID (e.g. KCUTD_41). I have already built a dictionary from a file that maps each company to its employee IDs, and it looks like this:

{    'Company 1' :['KCUTD_41',
                   'KCTYU_48',
                   'VTSYC_48',
                   ......],
     'Company 2' :['PORUH_21',
                   'PUSHB_10',
                   ....... ],
     'Company 3' :['STEYRU_69']}

In total I have several companies.

In parallel, in another file, I have several lines, where each line corresponds to a collaboration group between companies, made up of several employees and doctoral students (d215485 etc.).

The file looks like this:

PORUH_21 d215487 d215489 d213654 KCTYU_48 d154225 ...
d25548 d89852 VTSYC_48 d254548 d121154 d258774 PUSHB_10 ...
etc ....

What I want is, for each company, the number of its employees on each line and the number of groups (lines) in which it appears, to get something like this:

OUTPUT:

Company 1 : (number of employees from Company 1 per line) : number of groups (lines) in which they appear in total
Company 2 : (number of employees from Company 2 per line) : number of groups (lines) in which they appear in total
Company 3 : ......

I wanted to use a condition to check whether the values for each key in my dictionary match and, if so, count the number of occurrences.

I hope it's better now ^^'

If you can help me ^^

I'm not clear on exactly how you want the output to look, but this code might help you get to where you want to go...

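import re

# Company name -> list of employee IDs (sample data from the question)
companies = {
    'Company 1': ['KCUTD_41', 'KCTYU_48', 'VTSYC_48'],
    'Company 2': ['PORUH_21', 'PUSHB_10'],
    'Company 3': ['STEYRU_69'],
}

# Record how many employees each company has in the dictionary
finalout = {}
for k, v in companies.items():
    finalout[k] = {"number_in_company": len(v)}
print(finalout)

# Stand-in for the lines read from the collaboration file
lines_from_file = [
    "PORUH_21 d215487 d215489 d213654 KCTYU_48 d154225",
    "d25548 d89852 VTSYC_48 d254548 d121154 d258774 PUSHB_10",
]

# Raw strings keep "\d" from being treated as a string escape.
# pattern_groups matches the d-prefixed IDs (the doctoral students in the
# question), so "Groups per line" below is a count of those IDs.
pattern_groups = r"(d\d+)"
# Employee IDs: uppercase letters, an underscore, then digits
pattern_employees = r"([A-Z]+_\d+)"
for line in lines_from_file:
    print("---------------------")
    print(line)
    # re.subn returns (new_string, number_of_substitutions); [1] is the match count
    print("Groups per line:", re.subn(pattern_groups, '', line)[1])
    print("Employees per line:", re.subn(pattern_employees, '', line)[1])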

OUTPUT:

{'Company 1': {'number_in_company': 3}, 'Company 2': {'number_in_company': 2}, 'Company 3': {'number_in_company': 1}}
---------------------
PORUH_21 d215487 d215489 d213654 KCTYU_48 d154225
Groups per line: 4
Employees per line: 2
---------------------
d25548 d89852 VTSYC_48 d254548 d121154 d258774 PUSHB_10
Groups per line: 5
Employees per line: 2
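
The counts above are per line rather than per company. If what you are after is the per-company breakdown described in the question, something along these lines might get you closer. This is only a sketch under a few assumptions: IDs on a line are separated by whitespace, each employee ID belongs to exactly one company, and the helper name count_company_appearances is made up for illustration. It builds a reverse lookup from employee ID to company, then counts matches line by line.

```python
companies = {
    'Company 1': ['KCUTD_41', 'KCTYU_48', 'VTSYC_48'],
    'Company 2': ['PORUH_21', 'PUSHB_10'],
    'Company 3': ['STEYRU_69'],
}

lines_from_file = [
    "PORUH_21 d215487 d215489 d213654 KCTYU_48 d154225",
    "d25548 d89852 VTSYC_48 d254548 d121154 d258774 PUSHB_10",
]


def count_company_appearances(companies, lines):
    """Hypothetical helper: per company, return (employees per line, lines with a match)."""
    # Reverse lookup: employee ID -> company name
    # (assumes an ID never appears under two companies)
    company_of = {emp: name for name, ids in companies.items() for emp in ids}

    per_line = {name: [] for name in companies}
    for line in lines:
        # Start every company at zero for this line
        counts = dict.fromkeys(companies, 0)
        for token in line.split():
            name = company_of.get(token)  # None for doctoral-student IDs
            if name is not None:
                counts[name] += 1
        for name, n in counts.items():
            per_line[name].append(n)

    # A company "appears" on a line when at least one of its employees is there
    return {
        name: (counts, sum(1 for n in counts if n > 0))
        for name, counts in per_line.items()
    }


result = count_company_appearances(companies, lines_from_file)
for name, (counts, groups) in result.items():
    print(f"{name} : {counts} : {groups}")
```

Each company ends up with a list holding one count per line, plus the number of lines where that count is non-zero. To run it on the real file you would pass the open file object (or its lines) instead of lines_from_file.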

