
Snowplow Data Processing from PubSub to Java API

I am using Snowplow for behavioral data tracking. I can load the data from Pub/Sub into BigQuery using the open-source Snowplow loader and mutator ( https://docs.snowplowanalytics.com/docs/getting-started-on-snowplow-open-source/setup-snowplow-on-gcp/setup-bigquery-destination/ ), but I would like to consume the data from Pub/Sub directly in a Java API instead.

However, the data arrives from Pub/Sub as a plain string with no attached schema. Fields are separated by "\t", and some fields contain "{}" JSON objects that carry schemas, so formatting the data seems to require custom string processing.

Is there a better way to decode the data from Pub/Sub in a Java API than writing complex string processing? Thank you!

Snowplow maintains a number of so-called 'analytics SDKs' that transform the enriched hybrid TSV + JSON format into plain JSON, which can then be used in downstream applications.

For Java, your best bet would probably be the Scala Analytics SDK, since Scala libraries run on the JVM and can be added as a dependency of a Java project: https://github.com/snowplow/snowplow-scala-analytics-sdk

There are also SDKs for .NET, Go, JavaScript and Python: https://github.com/snowplow/snowplow/tree/master/5-data-modeling/analytics-sdk
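
To make the hybrid TSV + JSON layout concrete, here is a minimal plain-Java sketch of the splitting step. It is not the Analytics SDK's API and does not replace it. The class name `EnrichedTsvSketch`, the helper `toMap`, and the four-column list in `main` are hypothetical; a real enriched event has many more columns, whose names and order should be taken from the Snowplow documentation or the SDK. The sketch also assumes that embedded JSON never contains a raw tab character.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: not part of any Snowplow library.
class EnrichedTsvSketch {

    // Pairs each tab-separated value with a caller-supplied column name.
    // Empty values are skipped; JSON-bearing fields are kept as raw strings
    // so that a JSON library of your choice can parse them afterwards.
    static Map<String, String> toMap(String line, List<String> columns) {
        // The -1 limit keeps trailing empty fields, so positions stay aligned.
        String[] values = line.split("\t", -1);
        if (values.length != columns.size()) {
            throw new IllegalArgumentException(
                "Expected " + columns.size() + " fields but got " + values.length);
        }
        Map<String, String> event = new LinkedHashMap<>();
        for (int i = 0; i < values.length; i++) {
            if (!values[i].isEmpty()) {
                event.put(columns.get(i), values[i]);
            }
        }
        return event;
    }

    public static void main(String[] args) {
        // Assumed, shortened column list used only for illustration.
        List<String> columns = List.of("app_id", "platform", "event", "contexts");
        String line = "shop\tweb\t\t{\"schema\":\"iglu:com.example/ctx/jsonschema/1-0-0\",\"data\":{}}";
        toMap(line, columns).forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

In a Pub/Sub subscriber, the message payload decoded as a UTF-8 string would be passed as `line`. The SDK remains the safer choice because it already knows the full column list and the types of each field.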
