Given cluster and node objects:
class Cluster():
    def __init__(self):
        pass

class Node():
    def __init__(self):
        pass
I am wondering what is the best data structure or design that meets the following requirements:

- Fast lookup of the clusters that a given node belongs to.
- Fast lookup of the nodes that belong to a given cluster.
- Fast lookup of the degree to which a node belongs to a cluster, and each cluster to a node.
- Fast updating when a node or cluster is deleted or added.

The number of nodes and clusters will each be in the range of 100,000.
More details of varying relevance:

- Each node will always belong to one or more clusters.
- Each cluster will always contain one or more nodes.
- If a cluster has its only node removed, the cluster should be deleted.
- A node will never have all of its clusters removed.
- Membership is weighted: node1 might belong 90% to cluster14 and 10% to cluster88.
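A minimal sketch of what these requirements suggest (all names here are assumptions, not from the question): two mirrored dicts give O(1) lookup in both directions, with the membership weight stored on each link, and node removal cascades to empty clusters.

```python
from collections import defaultdict

node_clusters = defaultdict(dict)   # node -> {cluster: weight}
cluster_nodes = defaultdict(dict)   # cluster -> {node: weight}

def add_membership(node, cluster, weight):
    """Record a weighted link in both directions."""
    node_clusters[node][cluster] = weight
    cluster_nodes[cluster][node] = weight

def remove_node(node):
    """Delete a node; any cluster it was the last member of is deleted too."""
    for cluster in node_clusters.pop(node, {}):
        members = cluster_nodes[cluster]
        del members[node]
        if not members:
            del cluster_nodes[cluster]

add_membership("node1", "cluster14", 0.9)
add_membership("node1", "cluster88", 0.1)
```

Both lookup directions and the deletion cascade are constant time per link; the cost is that every link is stored twice and the two dicts must be updated together.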
I was thinking about using SQLite, but the problem is that storing serialized objects in the database is too slow. I could store object_ids in the database and then look those up in a dict that maps object_ids to object instances, but then there are consistency issues between the dict and the database. Additionally, fetching a list of instances from the dict is a bit cumbersome.

I could possibly store the memory locations of the instances in SQLite, but that seems dangerous, and we still have consistency issues.
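To make the object_id registry approach concrete, here is a rough sketch of it (table and function names are assumptions): SQLite holds only integer ids, and a dict maps ids back to live objects, so every query result has to be re-mapped by hand.

```python
import sqlite3

registry = {}  # object_id -> live object instance

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE membership (node_id INTEGER, cluster_id INTEGER)")

def nodes_in_cluster(cluster_id):
    rows = conn.execute(
        "SELECT node_id FROM membership WHERE cluster_id = ?", (cluster_id,))
    # Each row must be mapped back through the registry by hand; this is
    # the cumbersome fetch, and the dict and the table can drift apart.
    return [registry[node_id] for (node_id,) in rows]

registry[42] = "Node42 instance"
conn.execute("INSERT INTO membership VALUES (42, 7)")
```

Nothing enforces that a row's id is still present in `registry`, which is exactly the consistency problem described above.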
I implemented a similar data structure in a home project; my own requirements called for a look-alike architecture, except I called clusters "tags" (but the core concept is the same).
Here is how you may implement it: use a dictionary that maps each cluster id to the list of nodes it contains. If Node42 belongs to clusters 1 and 3, the dictionary will have entries looking like 1: [Node42, ...] and 3: [Node42, ...].
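A minimal sketch of that scheme (variable and function names are mine, not from the original project): one dict maps each cluster id to its node list, and a mirror dict maps each node to its cluster ids, so both lookup directions stay fast.

```python
cluster_members = {}   # cluster_id -> [node, node, ...]
node_tags = {}         # node -> set of cluster ids

def tag(node, cluster_id):
    """Attach a node to a cluster, updating both directions."""
    cluster_members.setdefault(cluster_id, []).append(node)
    node_tags.setdefault(node, set()).add(cluster_id)

tag("Node42", 1)
tag("Node42", 3)
# cluster_members now holds entries such as 1: ["Node42", ...]
```

Weights could be added by storing (node, weight) pairs, or by using inner dicts instead of lists.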
About the requirements: if you are interested in the code I can release it for you to have a look, but I think you first need to make a choice or two regarding architecture. You can't have a pure-Python, constant-time, memory-efficient, large-scale data structure all at once, IMHO.