Heroku Data Lakehouse

GRAX Data Lake automatically organizes your CRM data as Parquet files in S3 for arbitrary on-demand analytics queries.

AWS Athena lets you query your data lake with SQL. Queries run on Athena's serverless infrastructure, which scales without any servers for you to manage.

Heroku is a serverless PaaS (Platform as a Service) that allows you to deploy applications without worrying about the underlying infrastructure.

GRAX provides a Heroku Add-on to make querying your Athena Data Lake easy.

Connection Details

When you add the GRAX Data Lake add-on to your Heroku application, it exposes environment variables that allow you to make an AWS Athena connection.

They are:

GRAX_AWS_ACCESS_KEY_ID=
GRAX_AWS_SECRET_ACCESS_KEY=
GRAX_AWS_REGION=
GRAX_S3_STAGING_DIR=
GRAX_ATHENA_WORKGROUP=
GRAX_ATHENA_DATABASE=
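
Before opening a connection, it can help to confirm that all six variables are actually present in the application's environment. The following is a minimal sketch, not part of the add-on itself: the helper name `load_grax_env` and the tuple `GRAX_ATHENA_VARS` are hypothetical, and only the variable names come from the list above.

```python
import os

# Variable names as listed above; the add-on is expected to set all six.
GRAX_ATHENA_VARS = (
    "GRAX_AWS_ACCESS_KEY_ID",
    "GRAX_AWS_SECRET_ACCESS_KEY",
    "GRAX_AWS_REGION",
    "GRAX_S3_STAGING_DIR",
    "GRAX_ATHENA_WORKGROUP",
    "GRAX_ATHENA_DATABASE",
)


def load_grax_env(env=None):
    """Return the six add-on settings as a dict (hypothetical helper).

    Raises RuntimeError naming every variable that is unset or empty.
    """
    env = os.environ if env is None else env
    missing = [name for name in GRAX_ATHENA_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing add-on variables: " + ", ".join(missing))
    return {name: env[name] for name in GRAX_ATHENA_VARS}
```

Failing early with the full list of missing names tends to be easier to debug than an authentication error raised later by the Athena client.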

Jupyter Notebook

You can deploy a configurable JupyterHub installation that reads the add-on environment variables and automatically configures a client that returns query results as a pandas DataFrame.

Deploy to Heroku

Connecting from Python

You can use any Athena client to connect to and query the data lake. Here is an example using the pyathena client:
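
A minimal sketch of such a connection, assuming the `pyathena` and `pandas` packages are installed and the add-on environment variables from Connection Details are set. The helper names `athena_connect_kwargs` and `query_dataframe` are illustrative, not part of GRAX or pyathena. The mapping of `GRAX_ATHENA_DATABASE` to pyathena's `schema_name` and of `GRAX_ATHENA_WORKGROUP` to `work_group` is an assumption based on the variable names.

```python
import os


def athena_connect_kwargs(env=os.environ):
    """Map the add-on environment variables to pyathena.connect() keyword arguments."""
    return {
        "aws_access_key_id": env["GRAX_AWS_ACCESS_KEY_ID"],
        "aws_secret_access_key": env["GRAX_AWS_SECRET_ACCESS_KEY"],
        "region_name": env["GRAX_AWS_REGION"],
        "s3_staging_dir": env["GRAX_S3_STAGING_DIR"],
        "work_group": env["GRAX_ATHENA_WORKGROUP"],
        # Assumption: the Athena database is passed as pyathena's schema_name.
        "schema_name": env["GRAX_ATHENA_DATABASE"],
    }


def query_dataframe(sql, env=os.environ):
    """Run a SQL statement on Athena and return the result as a pandas DataFrame."""
    # Imported lazily so athena_connect_kwargs() works without pyathena installed.
    from pyathena import connect
    from pyathena.pandas.cursor import PandasCursor

    conn = connect(cursor_class=PandasCursor, **athena_connect_kwargs(env))
    try:
        return conn.cursor().execute(sql).as_pandas()
    finally:
        conn.close()
```

With the variables in place, a call such as `query_dataframe("SELECT * FROM account LIMIT 10")` would return the first ten rows as a DataFrame. The table name `account` is a placeholder here; substitute a table that exists in your data lake.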
