How to connect Superset to external APIs like Google Analytics?

I would like to show Google Analytics and Google Search Console data directly in Superset through their APIs. Specifically, I want to:

  1. Query the Google Analytics API directly (JSON responses), instead of storing the results in my database first, and show the result in Superset
  2. Query the Google Search Console API directly (JSON responses) and show the result in Superset
  3. Query other amazing JSON APIs directly and show the result in Superset

How can I do so?

I couldn't find a Google Analytics datasource. I couldn't find a Google Search Console datasource either.

I can't find a way to display data retrieved from an API in Superset, only data stored in a database. I must be missing something, but I can't find anything in the docs about authenticating against and querying external APIs.

Superset can't query external data APIs directly. Superset has to work with a supported database or data engine ( https://superset.incubator.apache.org/installation.html#database-dependencies ). This means that you need to find a way to fetch data out of the API and store it in a supported database / data engine. Some options:

  • Build a little Python pipeline that will query the data API, flatten the data to something tabular / relational, and upload that data to a supported data source - https://superset.incubator.apache.org/installation.html#database-dependencies - and set up Superset so it can talk to that database / data engine.

  • For a more robust solution, you may want to work with a DevOps / infrastructure team to stand up a workflow scheduler like Apache Airflow ( https://airflow.apache.org/ ) that regularly queries this API and stores the results in a database of some kind that Superset can talk to.

  • If you want to regularly query data from a popular third-party API, I also recommend checking out Meltano and learning more about Singer taps. These will handle some of the heavy lifting of fetching data from an API regularly and storing it in a database like Postgres. The good news is that there's a Singer tap for Google Analytics - https://github.com/singer-io/tap-google-analytics

Either way, Superset is just a thin layer above your database / data engine. So there's no way around the reality that you need to find a way to extract data out of an API and store it in a compatible data source.
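The first option above, a small Python pipeline, could be sketched roughly as follows. This is only an illustration under stated assumptions: the response shape in SAMPLE_RESPONSE, the table name ga_pageviews, and the helper names flatten and load are all made up for the example, the real Google Analytics payload looks different, and SQLite stands in for whichever supported database you actually point Superset at.

```python
import json
import sqlite3

# Hypothetical API response. A real pipeline would fetch this over HTTP
# with proper authentication; the shape here is an assumption.
SAMPLE_RESPONSE = json.loads("""
{
  "rows": [
    {"dimensions": {"date": "2021-01-01", "page": "/home"},
     "metrics": {"sessions": 120, "users": 95}},
    {"dimensions": {"date": "2021-01-01", "page": "/pricing"},
     "metrics": {"sessions": 40, "users": 33}}
  ]
}
""")


def flatten(response):
    """Turn each nested row into one flat dict (one future table row)."""
    flat = []
    for row in response["rows"]:
        record = {}
        record.update(row["dimensions"])
        record.update(row["metrics"])
        flat.append(record)
    return flat


def load(conn, rows):
    """Create the target table if needed and append the flattened rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ga_pageviews "
        "(date TEXT, page TEXT, sessions INTEGER, users INTEGER)"
    )
    conn.executemany(
        "INSERT INTO ga_pageviews (date, page, sessions, users) "
        "VALUES (:date, :page, :sessions, :users)",
        rows,
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # swap for your real database
    load(conn, flatten(SAMPLE_RESPONSE))
    print(conn.execute("SELECT page, sessions FROM ga_pageviews").fetchall())
```

Once the table exists in a database Superset supports, you register that database in Superset and build charts on the table as usual.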

There is no such connector available by default.

A recommended solution would be to store your Google Analytics and Search Console data in a database. You could write a script that pulls the data every 4 hours, or at whichever interval works for you.

Also, you shouldn't store all of the data, only the dimensions and metrics you wish to see in your reports.
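As a rough sketch of that idea, the snippet below keeps only a chosen set of fields and uses the dimensions as a primary key, so that re-running the script on a schedule overwrites existing rows instead of duplicating them. The field names in KEEP, the table name search_console, and the sample records are assumptions for illustration, not the actual Search Console schema.

```python
import sqlite3

# Hypothetical selection: the dimensions and metrics the reports need.
KEEP = ("date", "query", "clicks", "impressions")


def project(record):
    """Drop every field that is not in KEEP."""
    return {key: record[key] for key in KEEP}


def store(conn, records):
    """Insert or overwrite rows, keyed by the (date, query) dimensions."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS search_console ("
        "date TEXT, query TEXT, clicks INTEGER, impressions INTEGER, "
        "PRIMARY KEY (date, query))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO search_console "
        "(date, query, clicks, impressions) "
        "VALUES (:date, :query, :clicks, :impressions)",
        [project(r) for r in records],
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # swap for your real database
    pulled = [
        {"date": "2021-01-01", "query": "superset", "clicks": 3,
         "impressions": 50, "ctr": 0.06, "position": 4.2},
    ]
    store(conn, pulled)
    print(conn.execute("SELECT * FROM search_console").fetchall())
```

The script itself can then be triggered at the chosen interval by cron or any other scheduler.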

Redash is an alternative to Superset for that task, but it doesn't have the same features. Here is a comparison of the integrations offered by both tools: https://discuss.redash.io/t/a-comparison-of-redash-and-superset/1503

A quick alternative is to pay for a third-party service such as https://www.stitchdata.com/integrations/google-analytics/superset/

You wrote: "I couldn't find a Google Analytics datasource. I couldn't find a Google Search Console datasource either."

I think you've answered your own question. I don't know of a SQL interface to Google Analytics.

There is a project named shillelagh, written by one of Superset's contributors, that provides a SQL interface to REST APIs. Apache Superset uses the same package to connect to Google Sheets (gsheets).

New adapters are relatively easy to implement. There's a step-by-step tutorial that explains how to create a new adapter to an API or filetype in shillelagh.

Under the hood, shillelagh uses SQLite virtual tables, accessed through the SQLite wrapper APSW.
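To give an idea of what this looks like in practice, here is a minimal sketch of querying a Google Sheet through shillelagh's DB API backend, with the resource URL used as the table name. It assumes shillelagh is installed with its gsheets support; SHEET_URL is a made-up placeholder, and build_query and run_query are helper names invented for this example.

```python
# Hypothetical placeholder; replace with the URL of a sheet you can access.
SHEET_URL = "https://docs.google.com/spreadsheets/d/EXAMPLE_SHEET_ID/edit#gid=0"


def build_query(table_url, limit=10):
    """Build a SELECT that uses the resource URL as a quoted table name."""
    quoted = '"' + table_url.replace('"', '""') + '"'
    return f"SELECT * FROM {quoted} LIMIT {int(limit)}"


def run_query(sql):
    """Run the SQL through shillelagh's APSW-based DB API backend."""
    # Imported lazily so the rest of the module loads without shillelagh.
    from shillelagh.backends.apsw.db import connect

    connection = connect(":memory:")
    cursor = connection.cursor()
    cursor.execute(sql)
    return cursor.fetchall()


if __name__ == "__main__":
    sql = build_query(SHEET_URL, limit=5)
    print(sql)
    # Needs shillelagh installed plus network access to a real sheet:
    # print(run_query(sql))
```

An adapter for another API, such as Google Analytics, would plug into the same mechanism once written, so the query side would stay plain SQL.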
