
How to fetch complex MongoDB Data from Kedro?

I'm trying to get hands-on with Kedro, but I don't understand how to rebuild the data fetcher I used before.

My data is stored in a MongoDB instance across multiple “tables” (collections). One of them holds my usernames, which I want to fetch first. Then, based on the usernames I get, I would like to fetch data from three “tables” and merge the results.

What is the best way to do this in Kedro?

Should I put everything in a custom dataset? Or fetch only the usernames there and do the rest in a part of the pipeline?

So this is an interesting one. Kedro has been designed so that the tasks (nodes) have no knowledge of the IO required to load or save their data. Your use case (for good reasons) requires you to cross this boundary, because the second fetch depends on the result of the first.

My recommendation is to go down the custom dataset route, and potentially go a little further and make it return the three tables you need directly, i.e. do the username filter logic in this stage as well.

It is also perfectly fine to raise a NotImplementedError in save() if you're not going to save anything.
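
A minimal sketch of what such a read-only dataset could look like. It assumes pymongo is the driver and that every collection stores the username under the same field. The class name MongoUserTablesDataset, all constructor parameters, and the shape of the returned dictionary are hypothetical choices made for illustration, not something Kedro prescribes. The base class is spelled AbstractDataset in recent Kedro releases and AbstractDataSet in older ones, so check the custom dataset docs for your version.

```python
from typing import Any, Dict, List

try:
    # Spelling used by recent Kedro releases; older ones call it AbstractDataSet.
    from kedro.io import AbstractDataset
except ImportError:
    class AbstractDataset:
        """Stand-in so the sketch runs without Kedro. NOT the real base class."""

        def load(self):
            return self._load()

        def save(self, data):
            self._save(data)


class MongoUserTablesDataset(AbstractDataset):
    """Hypothetical read-only dataset: usernames first, then their records."""

    def __init__(self, uri: str, database: str, users_collection: str,
                 detail_collections: List[str],
                 username_field: str = "username"):
        self._uri = uri
        self._database = database
        self._users_collection = users_collection
        self._detail_collections = list(detail_collections)
        self._username_field = username_field

    def _get_db(self):
        # Imported lazily so the module can be imported without pymongo.
        from pymongo import MongoClient
        return MongoClient(self._uri)[self._database]

    def _load(self) -> Dict[str, Dict[str, List[dict]]]:
        db = self._get_db()
        field = self._username_field
        # Step 1: fetch the usernames.
        users = db[self._users_collection].find({}, {field: 1, "_id": 0})
        usernames = [doc[field] for doc in users]
        # Step 2: fetch matching documents from each collection, grouped by user.
        merged = {name: {coll: [] for coll in self._detail_collections}
                  for name in usernames}
        for coll in self._detail_collections:
            for doc in db[coll].find({field: {"$in": usernames}}):
                merged[doc[field]][coll].append(doc)
        return merged

    def _save(self, data: Any) -> None:
        raise NotImplementedError("MongoUserTablesDataset is read-only")

    def _describe(self) -> Dict[str, Any]:
        # The URI is left out on purpose, since it may contain credentials.
        return {"database": self._database,
                "users_collection": self._users_collection,
                "detail_collections": self._detail_collections}
```

The dataset can then be registered in the catalog by pointing `type:` at the class, with the remaining keys passed to `__init__`. Nodes downstream receive the merged dictionary and stay free of any MongoDB code.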


 