
How to disable chunking in an already partitioned step?

Say I have 100 cities, and each city has objects of interest.

The number of objects varies and can range from 0 to N.

Objects can be removed from the read list. That means they're no longer active and I should disable them. So I first disable ALL the objects in the city from the writer, and then reactivate those that are still on the list, since I have to update them anyway.

I use Spring Batch to read the objects for each city and write them to a database. I have already partitioned this step, so the data for each city is read and written in its own slave step.
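For context, the per-city partitioning can be sketched roughly like this — a minimal sketch, assuming a hypothetical CityPartitioner that puts each cityId into its own step execution context (the class and key names are illustrative, not from the question):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.springframework.batch.core.partition.support.Partitioner;
import org.springframework.batch.item.ExecutionContext;

public class CityPartitioner implements Partitioner {

    private final List<Long> cityIds;

    public CityPartitioner(List<Long> cityIds) {
        this.cityIds = cityIds;
    }

    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        // One execution context per city; each slave step
        // then reads its own "cityId" from the context.
        Map<String, ExecutionContext> partitions = new HashMap<>();
        for (Long cityId : cityIds) {
            ExecutionContext context = new ExecutionContext();
            context.putLong("cityId", cityId);
            partitions.put("city-" + cityId, context);
        }
        return partitions;
    }
}
```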

The only issue I have is chunking. Even though the step is partitioned, I don't see any option to disable chunking. This is a problem, because the last chunk for a city will disable all of the objects I've activated in the previous chunks.

Spring Batch is an amazing tool and fits almost all of my needs, but now I wonder if it's the right tool for this particular job.

Say we have 2 cities:

id1 city1
id2 city2

And each city has objects. Objects can be either active or inactive, and each one is bound to a city (sorry, Stack Overflow doesn't seem to support any sort of table).

object1 city1 active
object2 city1 active
object3 city1 inactive
object1 city2 active
object2 city2 inactive

And I can only get the currently active objects from the reader:

object1 city1 active
object3 city1 active
object1 city2 active

Meaning: all other objects in my DB must be disabled before I start updating the active ones.
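The repository used below could look something like this — a Spring Data JPA sketch; the method names come from the question, but the entity fields and the bulk-update query are my assumptions:

```java
import java.util.List;

import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Modifying;
import org.springframework.data.jpa.repository.Query;
import org.springframework.data.repository.query.Param;

public interface CityObjectRepository extends JpaRepository<CityObject, Long> {

    // Bulk-disable every object of the city in a single statement.
    @Modifying
    @Query("update CityObject o set o.active = false where o.city.id = :cityId")
    void disableObjectsByCityId(@Param("cityId") Long cityId);

    List<CityObject> findAllByCity(Long cityId);
}
```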

So, I do this in the writer:

CityObjectWriter(CityObjectRepository cityObjectRepository, Long cityId) {
    this.cityObjectRepository = cityObjectRepository;
    this.cityId = cityId;
}

@Override
public void write(List<? extends CityObjectData> cityObjectData) throws Exception {
    // In this case I should do nothing
    if (cityObjectData.isEmpty()) return;
    cityObjectRepository.disableObjectsByCityId(cityId);
    List<CityObject> cityObjects = cityObjectRepository.findAllByCity(cityId);
    // Finds a matching existing cityObject and updates it with new data, if found
    updateCityObjects(cityObjects, cityObjectData);
}

Do you see the issue here?

Because Spring Batch splits the data into chunks, every N records are written in their own write() call. And each call of that method deactivates everything else, even records that were previously made active. I don't want that, and I see no solution except setting the chunk size to 9999999 so every object is processed in one call. But it feels dirty somehow.
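For reference, this is roughly what the chunk-oriented slave step looks like in Java config — a sketch with hypothetical bean and step names; the chunk(10) call is what fixes the commit interval and makes write() run once per chunk rather than once per city:

```java
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemReader;
import org.springframework.context.annotation.Bean;

@Bean
public Step citySlaveStep(StepBuilderFactory steps,
                          ItemReader<CityObjectData> cityObjectReader,
                          CityObjectWriter cityObjectWriter) {
    // Chunk size 10: write() is invoked for every 10 items read,
    // so disableObjectsByCityId() would run once per chunk.
    return steps.get("citySlaveStep")
            .<CityObjectData, CityObjectData>chunk(10)
            .reader(cityObjectReader)
            .writer(cityObjectWriter)
            .build();
}
```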

I figured it out. Instead of reader/processor/writer, I can use a Tasklet. A step that uses a tasklet can be partitioned into slave steps just like a usual reader/processor/writer step, and it reads, processes, and writes every object of the city in a single transaction:

CityObjectImportTasklet(CityObjectRepository cityObjectRepository, Long cityId) {
    this.cityObjectRepository = cityObjectRepository;
    this.cityId = cityId;
}

@Override
public RepeatStatus execute(StepContribution contribution, ChunkContext context) throws Exception {
    List<CityObjectData> cityObjectData = getCityObjectDataSomehow(cityId);

    // In this case I do nothing
    if (cityObjectData.isEmpty()) return RepeatStatus.FINISHED;

    // Disable all existing objects, to reactivate them later in case they're present
    cityObjectRepository.disableObjectsByCityId(cityId);

    List<CityObject> cityObjects = cityObjectRepository.findAllByCity(cityId);
    // Finds a matching existing cityObject and updates it with new data, if found
    updateCityObjects(cityObjects, cityObjectData);

    return RepeatStatus.FINISHED;
}
