Importing large datasets into Core Data and establishing the relationships in Swift

I have a Core Data database that holds around 500,000 stamps and 86,000 series. I have to download them from a web API that returns JSON. Adding the stamps and series to Core Data works without problems, but I have trouble creating the relationships between the two.

[Image: my data model]

As the picture of my data model above shows, each stamp has one series and each series can have multiple stamps.

I need to create the relationships between the two efficiently and quickly. While doing some research I came across this article: https://www.objc.io/issues/4-core-data/importing-large-data-sets-into-core-data/ The part I'm most interested in:

A similar problem often arises when establishing relationships between the newly imported objects. Using a fetch request to get each related object independently is vastly inefficient. There are two possible ways out of this: either we resolve relationships in batches similar to how we imported the objects in the first place, or we cache the objectIDs of the already-imported objects. Resolving relationships in batches allows us to greatly reduce the number of fetch requests required by fetching many related objects at once. Don't worry about potentially long predicates like:

 [NSPredicate predicateWithFormat:@"identifier IN %@", identifiersOfRelatedObjects]; 

Resolving a predicate with many identifiers in the IN (...) clause is always way more efficient than going to disk for each object independently. However, there is also a way to avoid fetch requests altogether (at least if you only need to establish relationships between newly imported objects). If you cache the objectIDs of all imported objects (which is not a lot of data in most cases really), you can use them later to retrieve faults for related objects using objectWithID:.

// after a batch of objects has been imported and saved
for (MyManagedObject *object in importedObjects) {
    objectIDCache[object.identifier] = object.objectID;
}

// ... later during resolving relationships
NSManagedObjectID *objectID = objectIDCache[object.foreignKey];
MyManagedObject *relatedObject = [context objectWithID:objectID];
object.toOneRelation = relatedObject;

Note that this example assumes that the identifier property is unique across all entity types, otherwise we would have to account for duplicate identifiers for different types in the way we cache the object IDs.

But I have no idea what they mean by that. Can someone explain this in more detail? Preferably in Swift, since that is the language I understand best and the one I'm writing my app in. Other suggestions are of course welcome too. Note that moving away from Core Data is no longer an option.

Creating a relationship between two objects requires having both objects at hand. Assuming they have already been created in Core Data, you can execute a fetch request with a predicate like

@"countryID == %@", countryObjectData[@"id"]

and you'll get them. But if you need to establish five hundred thousand relationships this way, you'll have to execute one million fetch requests (two per relationship). That's slow.
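One way to cut the number of fetches, following the batch approach from the article quoted in the question, is to fetch many related objects at once with an IN predicate. The following is only a sketch in Swift: the function name, the `keyAttribute` parameter and the Int64 key type are my assumptions, not something taken from your model.

```swift
import CoreData

/// Fetches every object of `entityName` whose `keyAttribute` value is in `ids`
/// using a single fetch request, and indexes the results by that key.
/// ASSUMPTION: the server key is stored as an Integer 64 attribute.
func fetchObjects(entityName: String,
                  keyAttribute: String,
                  matching ids: [Int64],
                  in context: NSManagedObjectContext) throws -> [Int64: NSManagedObject] {
    let request = NSFetchRequest<NSManagedObject>(entityName: entityName)
    // %K is substituted with the attribute name, %@ with the array of keys.
    request.predicate = NSPredicate(format: "%K IN %@",
                                    argumentArray: [keyAttribute, ids])

    var byKey: [Int64: NSManagedObject] = [:]
    for object in try context.fetch(request) {
        if let key = object.value(forKey: keyAttribute) as? Int64 {
            byKey[key] = object
        }
    }
    return byKey
}
```

You would call this once per batch of parsed JSON records (for example with the series keys referenced by a few thousand stamps) instead of once per stamp.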

Retrieving an NSManagedObject by its NSManagedObjectID is significantly faster than searching by property value. Before you start parsing, you can build a cache of all your Core Data objects, per entity, in the form of server key -> objectID pairs.

self.cache = [NSMutableDictionary dictionaryWithCapacity:self.managedObjectModel.entities.count];

NSExpressionDescription *objectIdDescription = [[NSExpressionDescription alloc] init];
objectIdDescription.name = @"objectID";
objectIdDescription.expression = [NSExpression expressionForEvaluatedObject];
objectIdDescription.expressionResultType = NSObjectIDAttributeType;

NSString *key = @"serverID";

for (NSEntityDescription *entity in self.managedObjectModel.entities) {
    NSMutableDictionary *entityCache = [NSMutableDictionary dictionary];
    self.cache[entity.name] = entityCache;

    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:entity.name];
    request.resultType = NSDictionaryResultType;
    request.propertiesToFetch = @[key, objectIdDescription];
    NSArray *result = [self.context executeFetchRequest:request error:nil];

    for (NSDictionary *item in result) {
        id value = item[key];
        NSManagedObjectID *objectID = item[@"objectID"];
        entityCache[value] = objectID;
    }
}
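Since the question asks for Swift, here is a rough Swift equivalent of the cache-building code above for a single entity. Treat it as a sketch: the function name is made up, and I'm assuming the server key is an Integer 64 attribute whose name you pass in as `keyAttribute`.

```swift
import CoreData

/// Builds a server key -> objectID dictionary for one entity.
/// ASSUMPTION: `keyAttribute` names an Integer 64 attribute holding the server key.
func buildObjectIDCache(entityName: String,
                        keyAttribute: String,
                        in context: NSManagedObjectContext) throws -> [Int64: NSManagedObjectID] {
    // Ask Core Data to return each row's objectID as an extra column.
    let objectIDDescription = NSExpressionDescription()
    objectIDDescription.name = "objectID"
    objectIDDescription.expression = NSExpression.expressionForEvaluatedObject()
    objectIDDescription.expressionResultType = .objectIDAttributeType

    // Dictionary results avoid creating a managed object per row.
    // They are read from the store, so unsaved objects do not show up here.
    let request = NSFetchRequest<NSDictionary>(entityName: entityName)
    request.resultType = .dictionaryResultType
    request.propertiesToFetch = [keyAttribute, objectIDDescription]

    var cache: [Int64: NSManagedObjectID] = [:]
    for row in try context.fetch(request) {
        guard let key = row[keyAttribute] as? Int64,
              let objectID = row["objectID"] as? NSManagedObjectID else { continue }
        cache[key] = objectID
    }
    return cache
}
```

Keeping one dictionary per entity also sidesteps the caveat from the quoted article about identifiers not being unique across entity types.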

Having that cache you can get your objects like this:

id serverKey = countryObjectData[@"id"];
NSManagedObjectID *objectID = self.cache[@"Country"][serverKey];
Country *country = [self.context objectWithID:objectID];

It's much faster.
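Applied to the stamps and series from the question, resolving the relationships from two such caches could look roughly like this in Swift. The function name, the `relationshipKey` parameter, the batch size and the Int64 keys are all assumptions on my side; pass whatever your to-one relationship from stamp to series is actually called.

```swift
import CoreData

/// Connects stamps to their series using server key -> objectID caches.
/// ASSUMPTIONS: `relationshipKey` is the to-one relationship on the stamp
/// entity, and `links` holds (stamp key, series key) pairs parsed from JSON.
func linkStamps(_ links: [(stampID: Int64, seriesID: Int64)],
                stampCache: [Int64: NSManagedObjectID],
                seriesCache: [Int64: NSManagedObjectID],
                relationshipKey: String,
                in context: NSManagedObjectContext,
                batchSize: Int = 1000) throws {
    var pending = 0
    for link in links {
        guard let stampOID = stampCache[link.stampID],
              let seriesOID = seriesCache[link.seriesID] else { continue }

        // object(with:) returns a fault without running a fetch request.
        let stamp = context.object(with: stampOID)
        let series = context.object(with: seriesOID)
        stamp.setValue(series, forKey: relationshipKey)

        pending += 1
        if pending == batchSize {
            // Save and drop the registered objects to keep memory flat.
            // Only the object IDs are kept, so resetting is safe here.
            try context.save()
            context.reset()
            pending = 0
        }
    }
    if context.hasChanges {
        try context.save()
    }
}
```

On a background context this should run inside `context.perform` or `context.performAndWait`. Because the context is reset between batches, don't hold on to managed objects from it elsewhere while the import is running.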

When you create new objects while parsing JSON, you need to add their server key and objectID pair to the cache, after obtaining permanent IDs. Remove that pair from the cache when you delete an object.
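A small Swift sketch of that bookkeeping, again with a made-up function name and the assumption that the server key is an Integer 64 attribute named by `keyAttribute`:

```swift
import CoreData

/// Adds freshly inserted objects to a server key -> objectID cache.
/// ASSUMPTION: `keyAttribute` names an Integer 64 attribute holding the server key.
func register(_ newObjects: [NSManagedObject],
              keyAttribute: String,
              in cache: inout [Int64: NSManagedObjectID],
              context: NSManagedObjectContext) throws {
    // Newly inserted objects have temporary IDs until they are saved.
    // Converting them up front means the cached IDs stay valid after a save.
    try context.obtainPermanentIDs(for: newObjects)

    for object in newObjects {
        if let key = object.value(forKey: keyAttribute) as? Int64 {
            cache[key] = object.objectID
        }
    }
}
```

When you delete an object, remove its entry with `cache.removeValue(forKey:)` so that a stale objectID is never handed to `object(with:)`.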
