generating key for AVL tree

I have a large system using AVL trees for fast searching of IP addresses:

struct avl_node
{
   struct avl_node *left;
   struct avl_node *right;
   ...
   void *info; /* points to nhlfe_entry describing nexthop */
};

struct nhlfe_entry
{
  u_int32_t nhlfe_ix;
  u_char opcode;
  ...
  struct nhlfe_key key;
};

/* defines a search key. */
struct nhlfe_key
{
  struct in_addr nh_addr;
  u_int32_t oif_ix;
  u_int32_t out_label;
};

So the search is based on 'struct nhlfe_key', i.e. the comparator function in the AVL tree looks like this:

static int
mpls_cmp_nhlfe_ipv4_key (void *data1, void *data2)
{
   struct nhlfe_entry *nh1, *nh2;
   struct nhlfe_key *key1, *key2;
   int ret;

   nh1 = (struct nhlfe_entry *) data1;
   nh2 = (struct nhlfe_entry *) data2;

   /* the key is embedded in nhlfe_entry as member 'key', so take its address */
   key1 = &nh1->key;
   key2 = &nh2->key;

   ret = memcmp (&key1->nh_addr, &key2->nh_addr, sizeof (struct in_addr));
   if (ret != 0)
     return ret;

   if (key1->oif_ix > key2->oif_ix)
     return 1;
   else if (key1->oif_ix < key2->oif_ix)
     return -1;

   if (key1->out_label > key2->out_label)
     return 1;
   else if (key1->out_label < key2->out_label)
     return -1;

   return 0;
}

Now I'm trying to add support for multiple next hops; that is, I add a linked list to nhlfe_entry:

struct nhlfe_entry
{
  u_int32_t nhlfe_ix;
  u_char opcode;
  ...
  struct list *nhkey_list;
};

Each 'struct list' is a list of 'struct listnode' elements, each of which embeds a 'void *data' pointer to the caller's private data; here that data is a 'struct nhlfe_key'.

So my question is: how do I generate a key based on multiple elements from the list to store/search nodes in the tree? (After introducing the list, it is no longer possible to have a key based on only one next hop address.) The same question applies to searching.

Also, after adding a new node to the list, do I need to rebuild the tree? I think this operation will change the key, and as such the tree may become unbalanced. (Or does a correctly implemented AVL tree never need to be rebuilt?)

I'm thinking about generating a CRC over every listnode and then summing them up. Can this guarantee uniqueness of the key? (The disadvantage is that whenever I add/delete a listnode I have to regenerate the key, delete the node from the tree and re-add it with the new key.)

Thanks!

I have a large system using AVL trees for fast searching of IP addresses:

For large numbers of IP addresses you usually want a radix tree. A binary tree will work, but you don't get any capability to store ranges of addresses by their prefix, e.g. 10.*. If you're not using this for anything resembling routing, or you don't need to save space by mapping an entire subnet to something, then a binary tree is fine.

So my question is: how do I generate a key based on multiple elements from the list to store/search nodes in the tree? (After introducing the list, it is no longer possible to have a key based on only one next hop address.) The same question applies to searching.

Your mpls_cmp_nhlfe_ipv4_key function will simply have to compare keys that may be lists of addresses, element by element (a lexicographic comparison). Obviously (1 2 3) compares equal to (1 2 3). Moreover, (1 2 3) compares greater than (1 2), but less than (1 3) or (1 2 4).
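
To make the list comparison concrete, here is a minimal sketch. The layout of 'struct list' / 'struct listnode' isn't shown in the question, so 'struct key_node' below is a hypothetical stand-in (a 'next' pointer plus a 'data' pointer), and the names nhlfe_key_cmp and nhlfe_key_list_cmp are made up for illustration. The per-key comparison follows the same field order as the original comparator.

```c
#include <netinet/in.h>   /* struct in_addr */
#include <stdint.h>
#include <string.h>       /* memcmp, NULL */

/* Same fields as struct nhlfe_key in the question. */
struct nhlfe_key
{
  struct in_addr nh_addr;
  uint32_t oif_ix;
  uint32_t out_label;
};

/* Hypothetical list node: the real 'struct listnode' is not shown. */
struct key_node
{
  struct key_node *next;
  struct nhlfe_key *data;
};

/* Compare two single keys: address, then oif_ix, then out_label. */
static int
nhlfe_key_cmp (const struct nhlfe_key *a, const struct nhlfe_key *b)
{
  int ret = memcmp (&a->nh_addr, &b->nh_addr, sizeof (struct in_addr));
  if (ret != 0)
    return ret;
  if (a->oif_ix != b->oif_ix)
    return a->oif_ix > b->oif_ix ? 1 : -1;
  if (a->out_label != b->out_label)
    return a->out_label > b->out_label ? 1 : -1;
  return 0;
}

/* Lexicographic comparison of two key lists: the first differing
   element decides; if one list is a prefix of the other, the
   shorter list sorts first. */
static int
nhlfe_key_list_cmp (const struct key_node *a, const struct key_node *b)
{
  for (; a != NULL && b != NULL; a = a->next, b = b->next)
    {
      int ret = nhlfe_key_cmp (a->data, b->data);
      if (ret != 0)
        return ret;
    }
  if (a != NULL)
    return 1;    /* b ran out first: b is a proper prefix of a */
  if (b != NULL)
    return -1;   /* a ran out first */
  return 0;      /* same length, all elements equal */
}
```

Note that this treats the order of the list as significant. If two entries with the same next hops in a different order should count as equal, one option is to keep each list sorted by nhlfe_key_cmp so that equal sets always produce equal lists.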

Also, after adding a new node in the list, do I need to re-build the tree...

If a node in a balanced search tree is to be updated such that the key changes, the best thing to do may be to remove it and re-insert it.

There may be ways to optimize it. For instance suppose a key changes, but in such a way that it still has exactly the same successor and predecessor in the tree. In that case, it can just be done in place. Or a key can change in such a way that a node just has to be exchanged with the predecessor or successor. I'd get it right before trying tricks like this.

Can [a CRC] guarantee uniqueness of the key?

No, a CRC is a hashing function. It has fewer bits than the object being hashed and so multiple objects can hash to the same CRC. (The exception is when a "perfect hashing function" is found for a set of elements, but such a thing rarely occurs with dynamic data: perfect hashing functions are contrived for some set of static data.) With the hashing approach, you might as well use a hash table. The ordering relation over the CRCs is likely meaningless. Binary search trees are used when the collection has to be ordered by an ordering relation over the keys.
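
As a small illustration of the uniqueness problem with the "CRC per listnode, then sum" idea: addition is commutative, so two lists holding the same elements in a different order always get the same key, before even considering ordinary CRC collisions. The sketch below uses a plain bitwise CRC-32 and, as a simplifying assumption, plain uint32_t values in place of 'struct nhlfe_key' elements; crc32_bytes and summed_crc are illustrative names only.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), no lookup table. */
static uint32_t
crc32_bytes (const void *buf, size_t len)
{
  const unsigned char *p = buf;
  uint32_t crc = 0xFFFFFFFFu;

  while (len--)
    {
      crc ^= *p++;
      for (int bit = 0; bit < 8; bit++)
        crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
  return ~crc;
}

/* The proposed key: CRC each element, then add the CRCs together.
   Elements are plain uint32_t here as a stand-in for nhlfe_key. */
static uint32_t
summed_crc (const uint32_t *elems, size_t n)
{
  uint32_t sum = 0;

  for (size_t i = 0; i < n; i++)
    sum += crc32_bytes (&elems[i], sizeof elems[i]);
  return sum;
}
```

With this, summed_crc over {1, 2} and over {2, 1} returns the same value, so the two lists could not be told apart by such a key alone; the tree would still need a full comparison of the lists to break ties.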
