
Need advice on fast storage of large serialized objects

I have an application where user data is maintained in nested class objects. When the application starts, the data is read from disk, decompressed (gzip), and deserialized (JSON) into my main class object (a StoredData object that contains all the nested classes).

My application is designed so that every time the user makes a change, it updates the underlying property value in the relevant class object in StoredData and immediately saves it to disk. The saving process is the reverse of loading: serialize to a JSON string, gzip-compress it, and write it to disk.

This model worked well until the user data grew large. The raw serialized JSON string (before compression) is 98 MB and increasing. Every time the user makes a change, the save hangs the application for about 3-5 seconds because of the data size.

One of my solutions was to push the saving process onto a background thread. This worked for a while, except when the user makes quick changes that call the Save process twice or more within a second. My application then throws an exception because the StoredData object is modified by the main thread (due to user activity) while it is being converted to a JSON string.

I need guidance from the community on how best to handle this situation. I'm not considering moving to a SQL database model, since that would require significant recoding in my application.

Thanks

Your problem isn't that you are writing to disk instead of using a SQL database. Your problem is that you store and write EVERYTHING as a single object.

So the obvious answer would be: split your StoredData object into several ones and persist them individually. Smaller objects, faster writes, no hangs.
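As a rough illustration of persisting the pieces individually, here is a minimal Python sketch. The question doesn't say which language the application uses, so Python is just a stand-in, and `SectionStore`, the section names, and the one-file-per-section layout are all hypothetical, not something from the original application. It assumes the data can be split into independent top-level sections that are each JSON-serializable.

```python
import gzip
import json
import os
import tempfile


class SectionStore:
    """Hypothetical helper: keeps each top-level section of the data
    in its own gzip-compressed JSON file."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, name):
        return os.path.join(self.directory, name + ".json.gz")

    def save_section(self, name, section):
        # Only this one section is serialized and compressed,
        # not the whole data set.
        payload = gzip.compress(json.dumps(section).encode("utf-8"))
        # Write to a temporary file first, then swap it in, so a crash
        # mid-write cannot leave a truncated section file behind.
        fd, tmp_path = tempfile.mkstemp(dir=self.directory)
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
        os.replace(tmp_path, self._path(name))

    def load_section(self, name):
        with open(self._path(name), "rb") as f:
            return json.loads(gzip.decompress(f.read()).decode("utf-8"))
```

With something like this, a change to one part of the data calls `save_section` for that part only, so the cost of a save scales with the size of the section rather than the full 98 MB.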

Alternatively, don't do a full write. Serialize and write to disk only what actually changed. This can be rather difficult, though, and I'm not sure it's possible when the content is gzipped. If you have many writes, it may still be problematic. I personally wouldn't choose this approach.

Thirdly, you tried concurrency as a solution but ran into a race condition. Did you push everything onto the background thread, meaning JSON serialization, gzip, and writing to disk? If so, you could try serializing on the main thread and doing the gzip and disk write on the background thread. The background thread then only touches the finished JSON string, so changing your objects in the meantime shouldn't cause exceptions.
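The split between the two threads could look roughly like the Python sketch below. Again, Python is only a stand-in for whatever the application is written in, and `BackgroundSaver`, `save`, and `flush` are made-up names. It assumes a single worker thread, so writes happen in the order they were requested and the latest save ends up on disk. Error handling is left out.

```python
import gzip
import json
import queue
import threading


class BackgroundSaver:
    """Hypothetical helper: serializes on the calling thread and
    compresses/writes on a single background thread."""

    def __init__(self, path):
        self.path = path
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def save(self, data):
        # Runs on the main thread. json.dumps returns an immutable
        # string, so later changes to `data` cannot affect this snapshot.
        self._queue.put(json.dumps(data))

    def _run(self):
        # Runs on the background thread and never touches the live objects.
        while True:
            snapshot = self._queue.get()
            try:
                with gzip.open(self.path, "wt", encoding="utf-8") as f:
                    f.write(snapshot)
            finally:
                self._queue.task_done()

    def flush(self):
        # Block until every queued snapshot has been written,
        # e.g. before the application exits.
        self._queue.join()
```

Note that the serialization step itself still runs on the main thread, so this only removes the gzip and disk portion of the hang, not all of it.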

As a final note: you have a god object that captures all your data and that can't be modified while it is being serialized. This is not a sustainable design; sooner or later you will have to refactor your application.
