Data collection frequency strategy

Original post 2023-01-25 14:53:06 6 1 web-scraping/ google-bigquery/ data-science/ bigdata

I have a question and am wondering whether anyone has solved this problem effectively. I am developing a collector (call it A) that collects data from a source (call it B), which in turn collects data from somewhere else. B collects every 5 minutes. What frequency or strategy should A use? If A polls twice as often as B, it will end up with duplicate data for an interval. If it polls at the same rate as B, it may get stale data when the two collection times coincide exactly. Has anyone solved this problem?

1 answer

If the data you collect from source B carries some kind of timestamp, you can use it to exclude duplicate results: only accept data whose timestamp is more recent than what you already have.

I have done this before by converting the date/time to a Unix epoch timestamp and checking that the latest data has a larger value than the last one stored; otherwise I ignore it. This lets you run your data collection at twice B's rate, if you want to.
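
The timestamp check described above could look roughly like the following sketch. It assumes each record from B is a dict with an ISO 8601 `timestamp` field; the field name, the `Collector` class, and the `to_epoch` helper are hypothetical names chosen for illustration, not part of any real API for B.

```python
from datetime import datetime, timezone


def to_epoch(ts: str) -> float:
    """Convert an ISO 8601 string to Unix epoch seconds (naive values are assumed to be UTC)."""
    dt = datetime.fromisoformat(ts)
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.timestamp()


class Collector:
    """Hypothetical collector A: keeps only records newer than the last one it stored."""

    def __init__(self):
        self.last_seen = float("-inf")  # epoch of the newest record stored so far
        self.rows = []

    def ingest(self, records):
        """Store records with a timestamp newer than last_seen; return how many were stored."""
        fresh = [r for r in records if to_epoch(r["timestamp"]) > self.last_seen]
        if fresh:
            self.rows.extend(fresh)
            self.last_seen = max(to_epoch(r["timestamp"]) for r in fresh)
        return len(fresh)


collector = Collector()
first_poll = [{"timestamp": "2023-01-25T14:50:00", "value": 1}]
second_poll = first_poll + [{"timestamp": "2023-01-25T14:55:00", "value": 2}]

stored_first = collector.ingest(first_poll)    # new record, stored
stored_repeat = collector.ingest(first_poll)   # same data polled again, ignored
stored_second = collector.ingest(second_poll)  # only the 14:55 record is new
```

With this filter in place, A can poll more often than B's 5-minute cycle (for example every 2.5 minutes) and the repeated polls simply store nothing. Note that the check assumes B's timestamps increase over time; records arriving late with an older timestamp would be dropped.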


The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

solution1 1 2023-01-25 15:00:25