Heroku Data Lakehouse
GRAX Data Lake automatically organizes your CRM data as Parquet files in S3 for arbitrary on-demand analytics queries.
AWS Athena lets you query your data lake with SQL. Queries run on Athena in a serverless and scalable fashion.
Heroku is a serverless PaaS (Platform as a Service) that allows you to deploy applications without worrying about the underlying infrastructure.
GRAX provides a Heroku Add-on to make querying your Athena Data Lake easy.
Connection Details
When you add the GRAX Data Lake add-on to your Heroku application, it exposes environment variables that allow you to make an AWS Athena connection.
They are:
Jupyter Notebook
You can deploy a configurable JupyterHub installation that reads the add-on's environment variables and automatically configures a client that returns query results as a pandas DataFrame.
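As a rough illustration of what a DataFrame-returning client can look like, here is a minimal sketch built on pyathena's `PandasCursor`. It is not the JupyterHub deployment's actual client. The helper name `query_dataframe` is hypothetical, and the sketch assumes pyathena is installed with its pandas extra (`pip install "pyathena[pandas]"`).

```python
from typing import Any


def query_dataframe(sql: str, **connect_kwargs: Any):
    """Run `sql` on Athena and return the result as a pandas DataFrame.

    `query_dataframe` is a hypothetical helper for illustration only.
    `connect_kwargs` are passed straight to pyathena.connect, for example
    region_name=..., s3_staging_dir=..., schema_name=...
    """
    # Imported lazily so this module still loads where pyathena is absent.
    from pyathena import connect
    from pyathena.pandas.cursor import PandasCursor

    conn = connect(cursor_class=PandasCursor, **connect_kwargs)
    try:
        # PandasCursor.execute returns the cursor itself; as_pandas()
        # materializes the query result as a DataFrame.
        return conn.cursor().execute(sql).as_pandas()
    finally:
        conn.close()
```

In a notebook cell, a call might look like `df = query_dataframe("SELECT 1 AS ok", region_name="us-east-1", s3_staging_dir="s3://your-bucket/athena-results/")`, where the region and bucket are placeholders for your own values.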
Connecting from Python
You can use any Athena client to connect to and query the data lake. Here is an example using the pyathena client:
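The following is a minimal sketch, not the add-on's official snippet. It reads connection settings from environment variables and passes them to `pyathena.connect`. The variable names prefixed `EXAMPLE_` are placeholders invented for this illustration; replace them with the names the add-on actually sets on your Heroku app. The helper names `athena_connect_kwargs` and `run_query` are likewise hypothetical.

```python
import os
from typing import Dict, List, Mapping, Tuple

# PLACEHOLDER environment variable names (assumptions, not the add-on's real
# names), mapped to the keyword arguments that pyathena.connect accepts.
ENV_TO_CONNECT_ARG = {
    "EXAMPLE_ATHENA_ACCESS_KEY_ID": "aws_access_key_id",
    "EXAMPLE_ATHENA_SECRET_ACCESS_KEY": "aws_secret_access_key",
    "EXAMPLE_ATHENA_REGION": "region_name",
    "EXAMPLE_ATHENA_S3_STAGING_DIR": "s3_staging_dir",
    "EXAMPLE_ATHENA_DATABASE": "schema_name",
}


def athena_connect_kwargs(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Translate environment variables into pyathena.connect keyword arguments."""
    missing = [name for name in ENV_TO_CONNECT_ARG if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {arg: env[name] for name, arg in ENV_TO_CONNECT_ARG.items()}


def run_query(sql: str, env: Mapping[str, str] = os.environ) -> List[Tuple]:
    """Execute `sql` on Athena and return all rows as a list of tuples."""
    # Imported lazily so the configuration helper works without pyathena.
    from pyathena import connect

    conn = connect(**athena_connect_kwargs(env))
    try:
        cursor = conn.cursor()
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # A trivial query that does not depend on any table in the data lake.
    for row in run_query("SELECT 1"):
        print(row)
```

Keeping the environment lookup in its own function lets you check the configuration locally, for example against the output of `heroku config`, before any query is sent to Athena.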