Counting and storing values in a dictionary using Python
import os

def prodInfo():
    from collections import Counter
    prodHolder = {}
    tempdict = {}
    try:
        os.chdir(copyProd)
        for root, dirs, files in os.walk('.'):
            for data in files:
                fullpath = os.path.join(root, data)
                with open(fullpath, 'rt') as fp:
                    for info in fp:
                        info = info.strip()
                        if info.startswith('prodType'):
                            info0 = info.split('=')[1]
                            info0 = info0.replace(';','')
                            info0 = info0.replace('"','')
                        if info.startswith('acq'):
                            info1 = info.split('=')[1]
                            info1 = info1.replace(';','')
                            info1 = info1.replace('"','')
                        if info.startswith('ID_num'):
                            info2 = info.split('=')[1]
                            info2 = info2.replace(';','')
                            info2 = info2.replace('"','')
                            print info0 + info1 + info2
produces this result:
SD Acq645467 356788
SD Acq645467 356788
SD Acq645467 356788
SD Acq645467 356788
SD Acq645467 356788
SD Acq645467 356788
SD Acq645467 356788
SD Acq645467 356788
SD Acq645467 356788
Image Acq645467 356788
Image Acq645467 356788
Image Acq645467 356788
Image Acq645467 356788
SD Acq644869 356849
SD Acq644869 356849
Image Acq644869 356849
SD Acq644247 356851
SD Acq644247 356851
Image Acq644247 356851
I would like to store the results and be able to count how many times 'SD' occurs for each specific ID number (356788/356849/356851), and how many 'Image' entries there are for each ID number.
The results would be as follows:
9 - SD / 4 - Image for 356788
2 - SD / 1 - Image for 356849
2 - SD / 1 - Image for 356851
I thought it would be best to store the items in a dictionary, but I have not been able to count the values successfully. This is the code I have used to store the info in a dictionary:
prodHolder[info2] = {'SD/Image': info0, 'Acq' : info1}
total_Acq = prodHolder
print prodHolder
Results are:
{'356788': {'SD/Image': 'SD', 'Acq': 'Acq645467'}} ...
Every time the function is run, a different set of values will be entered, thus producing a different result.
So there are two questions here: how to store the results, and how to count them.
For storing, I'd use CSV (comma-separated values). Python has a great module for that (csv). You can modify your code so that, while it reads from a file (as it already does), it also writes info0, info1 and info2 to a .csv file:
import csv
import os

def prodInfo():
    from collections import Counter
    prodHolder = {}
    tempdict = {}
    try:
        os.chdir(copyProd)
        for root, dirs, files in os.walk('.'):
            for data in files:
                fullpath = os.path.join(root, data)
                # Note: mode 'w' truncates the CSV every time it is opened,
                # i.e. once per input file.
                with open(fullpath, 'r') as fp,\
                     open('./stack59.write.csv', 'w') as fw:
                    writer = csv.writer(fw)
                    for info in fp:
                        # [ . . . ]
                        # Yadda yadda yadda
                        print info0 + info1 + info2
                        writer.writerow([info0, info1, info2])
This will create a file stack59.write.csv that looks like:
SD,Acq645467,356788
SD,Acq645467,356788
SD,Acq645467,356788
[ . . . ]
SD,Acq644247,356851
SD,Acq644247,356851
Image,Acq644247,356851
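To use the stored data later, the rows can be read back with csv.reader. The following is only a minimal sketch: `load_rows` is a hypothetical helper name, and the sample lines are copied from the listing above instead of being read from disk. With a real file you would pass the open file object in place of the list. The parenthesized print form is used so the snippet runs on both Python 2 and Python 3.

```python
import csv

# Sample lines in the same layout as stack59.write.csv
# (type, acquisition, ID number), copied from the listing above.
csv_lines = [
    "SD,Acq645467,356788",
    "Image,Acq645467,356788",
    "SD,Acq644247,356851",
]

def load_rows(lines):
    """Parse CSV lines into [prod_type, acq, id_num] lists, skipping blank rows.

    `lines` can be any iterable of strings, e.g. an open file object.
    (Hypothetical helper, not part of the original code.)
    """
    return [row for row in csv.reader(lines) if row]

rows = load_rows(csv_lines)
for prod_type, acq, id_num in rows:
    print("%s %s %s" % (prod_type, acq, id_num))
```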
For the counting, itertools.groupby would probably suit your needs. You might also want to look at how iterators work.
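One thing worth knowing before relying on itertools.groupby: it only groups consecutive items that share a key, so records with the same key must already be adjacent (as they are in the output shown in the question) or be sorted first. A small sketch of the difference, using a made-up, deliberately unordered list and a hypothetical helper `run_lengths`:

```python
import itertools

# Hypothetical, deliberately unordered sample of record types.
types = ["SD", "Image", "SD", "SD", "Image"]

def run_lengths(values):
    """Return (key, count) for each run of consecutive equal values."""
    return [(key, len(list(group)))
            for key, group in itertools.groupby(values)]

unsorted_runs = run_lengths(types)        # one entry per consecutive run
sorted_runs = run_lengths(sorted(types))  # one entry per distinct value

print(unsorted_runs)
print(sorted_runs)
```

On unsorted input the same key shows up several times, once per run; after sorting, each key appears exactly once.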
First, I'd store the data in a matrix (a list of rows):
def prodInfo():
    from collections import Counter
    prodHolder = {}
    tempdict = {}
    data_matrix = []  # NEW !
    try:
        os.chdir(copyProd)
        for root, dirs, files in os.walk('.'):
            for data in files:
                # [ . . . ]
                # Yadda, yadda, yadda...
                print info0 + info1 + info2
                data_matrix.append([info0, info1, info2])  # NEW!
And then you can group your data_matrix as you please. For instance:
import itertools

# First, group by picture id (356788, 356849...), which is
# the third column of the data
for group_by_id in itertools.groupby(data_matrix,
                                     lambda x: x[2]):
    # Now, within those groups, group by type, the first column
    # of the data (SD, Image...)
    for group_by_type in itertools.groupby([a for a in group_by_id[1]],
                                           lambda y: y[0]):
        print "%s: %s %s" % (group_by_id[0],
                             len([a for a in group_by_type[1]]),
                             group_by_type[0])
    print ''
Which outputs:
356788: 9 SD
356788: 4 Image
356849: 2 SD
356849: 1 Image
356851: 2 SD
356851: 1 Image
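Since the question asks about a dictionary and already imports Counter, here is an alternative sketch that does not depend on the order of the rows. Note that `prodHolder[info2] = {...}` in the question overwrites the previous entry for the same ID on every assignment, which is why only one record per ID survives there. The helper name `count_by_id` and the sample rows are assumptions made for illustration; the rows use the same [type, acq, id] layout as data_matrix.

```python
from collections import Counter, defaultdict

# Sample rows in the [type, acq, id] layout of data_matrix
# (a small, reordered subset of the data in the question).
data_matrix = [
    ["SD", "Acq644869", "356849"],
    ["Image", "Acq644869", "356849"],
    ["SD", "Acq644247", "356851"],
    ["SD", "Acq644869", "356849"],
]

def count_by_id(rows):
    """Return {id_num: Counter({prod_type: count})} for the given rows.

    (Hypothetical helper, not part of the original code.)
    """
    counts = defaultdict(Counter)
    for prod_type, _acq, id_num in rows:
        counts[id_num][prod_type] += 1
    return counts

counts = count_by_id(data_matrix)
for id_num in sorted(counts):
    per_type = counts[id_num]
    # A Counter returns 0 for a type that never occurred for this ID.
    print("%d - SD / %d - Image for %s"
          % (per_type["SD"], per_type["Image"], id_num))
```

Keeping one Counter per ID means the rows do not need to be sorted or grouped beforehand, and a type that is missing for an ID simply counts as 0.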