Unordered collection for mutable objects in Python

The task at hand is to parse a large text file (several hundred thousand rows) and accumulate some statistics from it, which will then be visualized in plots. Each row contains the results of some prior analysis.

I wrote a custom class to define the objects that are to be accumulated. The class contains 2 string fields, 3 sets and 2 integer counters. It has an __init__(self, name), which initializes a new object with the given name and empty fields, and a method called addRow(), which adds information to the object. The sets accumulate data associated with the object, and the counters keep track of a couple of conditions.

My original idea was to iterate over the rows of the file and call a method like parseRow() in main:

import csv

reader = csv.reader(f)  # f is an already-open file object
acc = {}  # or set()
for row in reader:
    parseRow(row, acc)

which would look something like:

def parseRow(row, acc):
    # id stands for the index of the column holding the object names/ids
    if row[id] not in acc:
        a = MyObj(row[id])
        # the new object still has to be stored in acc here
    else:
        a = acc.get(row[id])  # or equivalent
    a.addRow(...)

The issue here is that the accumulating collection acc cannot be a set, since sets are apparently not indexable in Python. Edit: for clarification, by indexable I didn't mean getting the nth element, but rather being able to retrieve a specific element.
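To make the limitation concrete, here is a minimal sketch. The Item class is hypothetical (it is not the class from the question) and it assumes objects hash and compare by name only. A set can answer whether an equal element is present, but it has no method for handing back the stored element, so retrieval falls back to a linear scan:

```python
class Item:
    """Hypothetical object that hashes and compares by its name only."""

    def __init__(self, name):
        self.name = name
        self.count = 0  # mutable state that does not affect the hash

    def __hash__(self):
        return hash(self.name)

    def __eq__(self, other):
        return isinstance(other, Item) and self.name == other.name


stored = Item("a")
stored.count = 5
items = {stored}

# A fresh object with the same name is "in" the set...
probe = Item("a")
found_in_set = probe in items

# ...but a set has no get(); recovering the stored object needs a scan.
retrieved = next(x for x in items if x == probe)
same_object = retrieved is stored
```

The scan is O(n) per lookup, which would be slow if repeated for every row of a large file.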

One workaround would be to have a dict that has {obj_name : obj} mapping but it feels like an ugly solution. Considering the elegance of the language otherwise, I guess there is a better solution to this. It's surely not a particularly rare situation...
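For reference, the dict-based workaround could look roughly like the sketch below. This is only an illustration under assumptions: the MyObj here is a simplified stand-in with one set and one counter, parse_row, accumulate and id_col are hypothetical names, and the sample data is made up.

```python
import csv
import io


class MyObj:
    """Simplified, hypothetical stand-in for the accumulator class."""

    def __init__(self, name):
        self.name = name
        self.values = set()  # stands in for the accumulating sets
        self.row_count = 0   # stands in for the counters

    def addRow(self, value):
        self.values.add(value)
        self.row_count += 1


def parse_row(row, acc, id_col=0):
    """Fetch the object for this row's name, creating it on first sight."""
    name = row[id_col]
    obj = acc.get(name)
    if obj is None:
        obj = acc[name] = MyObj(name)
    obj.addRow(row[1])


def accumulate(f):
    """Build a {name: MyObj} mapping from an open CSV file object."""
    acc = {}
    for row in csv.reader(f):
        parse_row(row, acc)
    return acc


# Made-up sample data standing in for the real file.
sample = io.StringIO("a,1\nb,2\na,3\n")
result = accumulate(sample)
```

Each lookup is a single hash access, and result.values() gives the accumulated objects when the names are no longer needed.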

Any suggestions?

You could also try an ordered-set, which is a set that also keeps its elements in order.

