简体   繁体   中英

How to store 50,000 English words so that it takes as little memory as possible

I have to store ~50,000 English words in memory and I'd like to know what would be the best data structure in term of memory footprint (and loading speed). Would it be a Trie? How would I serialize it into a file? Is there anything better than that?

Essentially, once the ~50,000 words are loaded into memory, I simply need to check if the word exists or not.

Well, according to your provided guidelines, a simple List would be better.

Fetching time would be obviously slower than a Trie or Dictionary , but

"in term of memory footprint (and loading speed)"

It will require very little memory overhead, and will load faster (as no indexes / prefix data structures are built).

See this blog post for some memory comparison details (In JavaScript, but still applies).

According to this answer, the Dictionary class is what you need. As per the MSDN documentation, you should use the TryGetValue method to access your data:

Use the TryGetValue method if your code frequently attempts to access keys that are not in the dictionary. Using this method is more efficient than catching the KeyNotFoundException thrown by the Item property.

Yes, a trie sounds ok for this. For serialising you'd have two options:

  1. Use the original word list and rebuild the trie. It should be fast enough, I guess, but you may want to profile it.
  2. Just use normal .NET serialisation for the type and dump it to a file. This prevents programs in other languages from reading it, though.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM