Heroku Data Lakehouse

GRAX Data Lake automatically organizes your CRM data as Parquet files in S3 for arbitrary on-demand analytics queries.

AWS Athena lets you query your data lake with SQL. Queries run on Athena itself, which is serverless and scales automatically.

Heroku is a serverless PaaS (Platform as a Service) that allows you to deploy applications without worrying about the underlying infrastructure.

GRAX provides a Heroku Add-on to make querying your Athena Data Lake easy.

Connection Details

When you add the GRAX Data Lake add-on to your Heroku application, it exposes environment variables that you can use to connect to AWS Athena.

They are:

GRAX_AWS_ACCESS_KEY_ID=
GRAX_AWS_SECRET_ACCESS_KEY=
GRAX_AWS_REGION=
GRAX_S3_STAGING_DIR=
GRAX_ATHENA_WORKGROUP=
GRAX_ATHENA_DATABASE=

Jupyter Notebook

You can deploy a configurable JupyterHub installation that reads the add-on's environment variables and automatically configures a client that returns query results as a pandas data frame.
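To illustrate what such a client might look like, here is a minimal sketch that builds a pandas-returning helper from the add-on variables. It is not the code shipped with the JupyterHub deployment: the function name `query_dataframe` is hypothetical, and it assumes the third-party `pyathena` package is installed with its pandas extra (`pip install "pyathena[pandas]"`).

```python
import os


def query_dataframe(sql):
    """Run `sql` on Athena and return the result as a pandas DataFrame.

    Hypothetical helper for illustration; it reads the add-on's
    GRAX_* environment variables at call time.
    """
    # Third-party imports are kept local so the module loads without them.
    from pyathena import connect
    from pyathena.pandas.cursor import PandasCursor

    conn = connect(
        aws_access_key_id=os.environ["GRAX_AWS_ACCESS_KEY_ID"],
        aws_secret_access_key=os.environ["GRAX_AWS_SECRET_ACCESS_KEY"],
        region_name=os.environ["GRAX_AWS_REGION"],
        s3_staging_dir=os.environ["GRAX_S3_STAGING_DIR"],
        work_group=os.environ["GRAX_ATHENA_WORKGROUP"],
        schema_name=os.environ["GRAX_ATHENA_DATABASE"],
        cursor_class=PandasCursor,  # cursor that can return DataFrames
    )
    try:
        # PandasCursor.execute() returns the cursor, so the calls chain.
        return conn.cursor().execute(sql).as_pandas()
    finally:
        conn.close()
```

In a notebook cell, something like `query_dataframe("SELECT 1")` should be enough to confirm that the credentials and workgroup are wired up before querying real tables.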

Deploy to Heroku

Connecting from Python

You can use any Athena client to connect to and query the data lake. Here is an example using the pyathena client:
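The sketch below shows one way to wire the add-on variables into pyathena. It assumes `pyathena` is installed (`pip install pyathena`); the helper names `athena_connect_kwargs` and `run_query` are illustrative and not part of the add-on, and the mapping from each variable to a `connect()` keyword is an assumption based on the variable names.

```python
import os

# Assumed mapping from each add-on environment variable to the
# pyathena connect() keyword argument it supplies.
ENV_TO_KWARG = {
    "GRAX_AWS_ACCESS_KEY_ID": "aws_access_key_id",
    "GRAX_AWS_SECRET_ACCESS_KEY": "aws_secret_access_key",
    "GRAX_AWS_REGION": "region_name",
    "GRAX_S3_STAGING_DIR": "s3_staging_dir",
    "GRAX_ATHENA_WORKGROUP": "work_group",
    "GRAX_ATHENA_DATABASE": "schema_name",
}
REQUIRED_VARS = tuple(ENV_TO_KWARG)


def athena_connect_kwargs(env=None):
    """Build connect() keyword arguments from the add-on variables."""
    env = os.environ if env is None else env
    missing = [name for name in ENV_TO_KWARG if not env.get(name)]
    if missing:
        # Fail early with a clear message instead of a late AWS error.
        raise RuntimeError(
            "Missing add-on environment variables: " + ", ".join(missing)
        )
    return {kwarg: env[name] for name, kwarg in ENV_TO_KWARG.items()}


def run_query(sql):
    """Execute `sql` on Athena and return all rows as a list of tuples."""
    from pyathena import connect  # third-party dependency

    conn = connect(**athena_connect_kwargs())
    try:
        cursor = conn.cursor()
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        conn.close()
```

Calling `run_query("SELECT 1")` from a Heroku dyno with the add-on attached is a quick connectivity check. Validating the variables up front means a missing add-on surfaces as a readable error rather than a credentials failure from AWS.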
