
How to handle parsing a big XML file and saving it in a database

I have a fairly large XML file (greater than 2 MB) that I'm parsing and storing in an SQLite database. I can parse it and store it the first time fine. My question concerns updating the database when I want to parse the XML file again (for changes, additions, or deletions). My initial thought is to just wipe the information in the database and do the inserts again, rather than parse the data, check whether each item is already in the database, and do an update. Is one approach better than the other? Would there be a performance hit either way? I'd appreciate any thoughts on the matter.
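For reference, the wipe-and-reload approach mentioned in the question can be kept fast and memory-friendly by streaming the parse with `iterparse` and doing all inserts in a single transaction. This is only a minimal sketch: the `items` table and the `<items><item id="…"><name>…</name><price>…</price></item>…</items>` layout are assumptions for illustration, not the asker's actual schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

def reload_items(xml_path, db_path):
    """Wipe-and-reload: clear the table, then stream-parse the XML
    and bulk-insert every record inside one transaction."""
    conn = sqlite3.connect(db_path)
    conn.execute("""CREATE TABLE IF NOT EXISTS items (
                        id INTEGER PRIMARY KEY,
                        name TEXT,
                        price REAL)""")
    with conn:  # single transaction: commits on success, rolls back on error
        conn.execute("DELETE FROM items")
        rows = []
        # iterparse streams the file instead of loading it all into memory,
        # which matters once the XML grows past a few megabytes
        for event, elem in ET.iterparse(xml_path, events=("end",)):
            if elem.tag == "item":
                rows.append((int(elem.get("id")),
                             elem.findtext("name"),
                             float(elem.findtext("price"))))
                elem.clear()  # release memory for already-processed elements
        conn.executemany("INSERT INTO items VALUES (?, ?, ?)", rows)
    conn.close()
```

Batching the inserts with `executemany` inside one transaction avoids SQLite committing after every row, which is usually the dominant cost in a full reload.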

Inserting only what has actually changed is clearly going to be quicker than dumping the entire DB and re-inserting everything. At least that's my thinking.

I suppose it depends on how complex the information you are checking against is, and how efficient your code for that comparison is. If you aren't comfortable doing that kind of verification, then dumping and re-inserting would be the safer option.

Yes, re-inserting everything is probably a bad idea. How complicated is the XML structure, and how many tables are involved when you query for the existence of one item represented in that structure?

If it's complex, you might create a checksum or hash over the attributes and values that identify a record uniquely, and store that hash/checksum in an extra column or table in the DB. When you look for modified entries, you just compute the hash/checksum and look it up in one table. That may even make the querying faster, depending on how expensive the hash calculation is.

