Here's my problem: each employee is uniquely identified by an ID (e.g. KCUTD_41). I have already built a dictionary from a file that maps each company to its employee IDs, and it looks like this:
{'Company 1': ['KCUTD_41',
               'KCTYU_48',
               'VTSYC_48',
               ......],
 'Company 2': ['PORUH_21',
               'PUSHB_10',
               .......],
 'Company 3': ['STEYRU_69']}
In total I have several companies.
Separately, in another file, each line corresponds to a collaboration group between companies and contains several employee IDs and doctoral-student IDs (d215485, etc.).
The file looks like this:
PORUH_21 d215487 d215489 d213654 KCTYU_48 d154225 ...
d25548 d89852 VTSYC_48 d254548 d121154 d258774 PUSHB_10 ...
etc ....
What I want, for each company, is the number of its employees on each line and the number of groups (lines) in which it appears, to get something like this:
OUTPUT:
Company 1 : (number of employees from Company 1 per line) : number of groups (lines) where it appears in total
Company 2 : (number of employees from Company 2 per line) : number of groups (lines) where employees from Company 2 appear in total
Company 3 : ......
I wanted to use a condition to check whether the values for each key of my dictionary match an ID on the line and, if so, count the number of occurrences.
I hope it's better now ^^'
If you can help me ^^
I'm not clear exactly how you want the output to look, but this code might help you get to where you want to go...
import re

companies = {
    'Company 1': ['KCUTD_41', 'KCTYU_48', 'VTSYC_48'],
    'Company 2': ['PORUH_21', 'PUSHB_10'],
    'Company 3': ['STEYRU_69']
}

# Number of employees listed for each company
finalout = {}
for k, v in companies.items():
    finalout[k] = {"number_in_company": len(v)}
print(finalout)

lines_from_file = [
    "PORUH_21 d215487 d215489 d213654 KCTYU_48 d154225",
    "d25548 d89852 VTSYC_48 d254548 d121154 d258774 PUSHB_10"
]

# Raw strings so that \d is not treated as a string escape
pattern_groups = r"(d\d+)"            # ids such as d215487
pattern_employees = r"([A-Z]+_\d+)"   # ids such as PORUH_21

for line in lines_from_file:
    print("---------------------")
    print(line)
    # re.subn returns (new_string, number_of_substitutions); [1] is the match count
    print("Groups per line:", re.subn(pattern_groups, '', line)[1])
    print("Employees per line:", re.subn(pattern_employees, '', line)[1])
OUTPUT:
{'Company 1': {'number_in_company': 3}, 'Company 2': {'number_in_company': 2}, 'Company 3': {'number_in_company': 1}}
---------------------
PORUH_21 d215487 d215489 d213654 KCTYU_48 d154225
Groups per line: 4
Employees per line: 2
---------------------
d25548 d89852 VTSYC_48 d254548 d121154 d258774 PUSHB_10
Groups per line: 5
Employees per line: 2
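The counts above are per line overall, not per company. If the goal is the per-company breakdown described in the question, one possible approach is to invert the dictionary (employee ID to company) and tally matches line by line. This is only a sketch under assumptions: the function name count_by_company and the variables owner, per_line and groups are made up for illustration, the IDs on each line are assumed to be separated by whitespace, and the sample data is copied from the question rather than read from the real files.

```python
from collections import Counter

# Sample data copied from the question; real data would come from the two files.
companies = {
    'Company 1': ['KCUTD_41', 'KCTYU_48', 'VTSYC_48'],
    'Company 2': ['PORUH_21', 'PUSHB_10'],
    'Company 3': ['STEYRU_69'],
}
lines_from_file = [
    "PORUH_21 d215487 d215489 d213654 KCTYU_48 d154225",
    "d25548 d89852 VTSYC_48 d254548 d121154 d258774 PUSHB_10",
]


def count_by_company(companies, lines):
    """Return (employees per line, number of lines with at least one employee)."""
    # Reverse lookup: employee id -> company name.
    owner = {emp: name for name, emps in companies.items() for emp in emps}
    per_line = {name: [] for name in companies}
    for line in lines:
        # Tally, for this line only, how many ids belong to each company.
        hits = Counter(owner[tok] for tok in line.split() if tok in owner)
        for name in companies:
            per_line[name].append(hits[name])  # a Counter gives 0 for missing keys
    # A company "appears" in a group if at least one of its employees is on the line.
    groups = {name: sum(1 for n in counts if n > 0)
              for name, counts in per_line.items()}
    return per_line, groups


per_line, groups = count_by_company(companies, lines_from_file)
for name in companies:
    print(f"{name} : {per_line[name]} : {groups[name]}")
```

With the two sample lines, Company 1 and Company 2 each have one employee on each line (two groups in total), and Company 3 does not appear at all. Doctoral-student IDs are simply ignored because they are not in the dictionary.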