# Reusing Your Data

## Turn Your Salesforce Backup into a Strategic Data Asset

GRAX doesn't just protect your Salesforce data—it transforms it into a queryable, analyzable data product that powers modern analytics, AI, and business intelligence across your organization.

## Why Reuse Your Salesforce Data?

### Complete Historical Context

GRAX captures comprehensive Salesforce data history from day one of your backup, including field changes, deletions, and record evolution that Salesforce's native tools don't preserve. Unlike Salesforce's native field history tracking, which retains changes for a limited window (18 months in the org, up to 24 months through the API), or limited Data Cloud retention, GRAX gives you:

* **Years of historical depth**: Track trends across quarters and years, not just days
* **Deleted record access**: Query records removed from Salesforce production
* **Point-in-time analysis**: See what your data looked like on any historical date
* **Complete audit trails**: Field-level change history for compliance and investigation
* **Training datasets**: Rich historical data for AI/ML model development
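
To make point-in-time analysis concrete, here is a minimal sketch of the idea: given every captured version of a record, keep the latest version at or before a chosen date and drop records that were deleted by then. The `RecordVersion` type and its field names are hypothetical illustrations, not the GRAX schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class RecordVersion:
    record_id: str         # hypothetical: the Salesforce record Id
    modified_at: datetime  # hypothetical: when this version was captured
    is_deleted: bool       # hypothetical: True if this version marks a deletion
    fields: dict           # field values as of this version


def as_of(versions: list[RecordVersion], when: datetime) -> dict[str, RecordVersion]:
    """Return the latest non-deleted version of each record at or before `when`."""
    latest: dict[str, RecordVersion] = {}
    for version in versions:
        if version.modified_at > when:
            continue  # this version did not exist yet at the chosen date
        current = latest.get(version.record_id)
        if current is None or version.modified_at > current.modified_at:
            latest[version.record_id] = version
    # A record whose latest version is a deletion was not live at `when`.
    return {rid: v for rid, v in latest.items() if not v.is_deleted}


history = [
    RecordVersion("006A", datetime(2023, 1, 10), False, {"StageName": "Prospecting"}),
    RecordVersion("006A", datetime(2023, 6, 1), False, {"StageName": "Closed Won"}),
    RecordVersion("006B", datetime(2023, 2, 1), False, {"StageName": "Qualification"}),
    RecordVersion("006B", datetime(2023, 4, 1), True, {"StageName": "Qualification"}),
]

snapshot = as_of(history, datetime(2023, 3, 1))
for record_id, version in sorted(snapshot.items()):
    print(record_id, version.fields["StageName"])
```

On 1 March 2023 both opportunities are live and `006A` is still in its earlier stage; asking for a date after April drops `006B`, because its latest version by then is a deletion.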

### Data You Actually Own

Your data lives in **your cloud storage** (AWS S3, Azure Blob, or Google Cloud Storage), not locked in a vendor platform. This means:

* **No Salesforce API limits**: Query as much as you need without throttling
* **No per-query costs**: Nothing to pay per query beyond your standard cloud storage fees
* **Your tools, your choice**: Use any analytics platform, warehouse, or BI tool
* **Data sovereignty**: Full control over data residency and governance
* **Cloud-agnostic**: Works with AWS, Azure, or GCP

### Enterprise-Proven Scale

Fortune 100 companies trust GRAX to handle their mission-critical Salesforce data at massive scale, processing hundreds of millions of record versions per week in production environments. Whether you're analyzing millions of records or building operational dashboards, GRAX handles enterprise-scale workloads with sub-2-hour data latency.

## How GRAX Fits Your Data Architecture

**GRAX provides the Bronze layer:** your complete Salesforce history as Parquet files in your cloud storage (S3, Azure Blob, GCS). From there, customers take different approaches:

### Direct Query (Serverless)

Query Bronze directly with minimal transformation. Cost-effective for analytics workloads.

* **AWS:** Athena, Glue external tables
* **GCP:** BigQuery external tables
* **Azure:** Synapse serverless pools
* **Local/Open Source:** DuckDB for cost-free analytics on your laptop or server
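
As a rough illustration of direct querying, the sketch below composes a DuckDB query over a folder of Parquet files using DuckDB's `read_parquet` table function. The file path and the column names (`Id`, `Name`, `SystemModstamp`, `IsDeleted`, borrowed from Salesforce's standard fields) are assumptions; check the actual layout and schema of your exported files before relying on them.

```python
def deleted_records_sql(parquet_glob: str, since: str) -> str:
    """Build a DuckDB query listing record versions flagged as deleted since a date.

    `parquet_glob` is a hypothetical path such as 'data/Account/*.parquet';
    `since` is a timestamp literal such as '2024-01-01 00:00:00'.
    """
    # Double any single quotes so the values are safe inside SQL string literals.
    path = parquet_glob.replace("'", "''")
    cutoff = since.replace("'", "''")
    return (
        "SELECT Id, Name, SystemModstamp\n"
        f"FROM read_parquet('{path}')\n"
        "WHERE IsDeleted = TRUE\n"
        f"  AND SystemModstamp >= TIMESTAMP '{cutoff}'\n"
        "ORDER BY SystemModstamp DESC"
    )


query = deleted_records_sql("data/Account/*.parquet", "2024-01-01 00:00:00")
print(query)
```

With the `duckdb` Python package installed, `duckdb.sql(query).show()` would run the query against local files. For Athena, BigQuery, or Synapse, the same `SELECT` should carry over once the `read_parquet(...)` source is replaced with the name of your external table.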

### Data Lakehouse Platform

Unified analytics and data engineering on Bronze.

* Databricks (medallion architecture)
* Azure Synapse Analytics
* AWS EMR + Spark

### Traditional Warehouse

Transform and load into a data warehouse for BI.

* **Transform:** dbt, Airflow, Cloud Dataflow, custom SQL
* **Warehouse:** Snowflake, Redshift, BigQuery
* **BI:** Tableau, Looker, Power BI, QuickSight
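
A typical first transformation on a Bronze layer is collapsing the full version history into a "current state" table for BI. The sketch below shows that step with Python's built-in `sqlite3` standing in for the warehouse; the table and column names are hypothetical, and in practice the same SQL would more likely live in a dbt model or a warehouse job.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for Snowflake, Redshift, or BigQuery

# Bronze: every captured version of every record (hypothetical columns).
conn.execute(
    "CREATE TABLE bronze_account (id TEXT, name TEXT, modified_at TEXT, is_deleted INTEGER)"
)
conn.executemany(
    "INSERT INTO bronze_account VALUES (?, ?, ?, ?)",
    [
        ("001A", "Acme", "2023-01-01", 0),
        ("001A", "Acme Corp", "2024-03-01", 0),
        ("001B", "Globex", "2023-06-01", 0),
        ("001B", "Globex", "2024-01-15", 1),  # latest version is a deletion
    ],
)

# Silver: only the latest version of each record, excluding deleted records.
conn.execute(
    """
    CREATE TABLE silver_account AS
    SELECT b.id, b.name, b.modified_at
    FROM bronze_account AS b
    JOIN (
        SELECT id, MAX(modified_at) AS modified_at
        FROM bronze_account
        GROUP BY id
    ) AS latest
      ON b.id = latest.id AND b.modified_at = latest.modified_at
    WHERE b.is_deleted = 0
    """
)

silver_rows = conn.execute(
    "SELECT id, name, modified_at FROM silver_account ORDER BY id"
).fetchall()
print(silver_rows)
```

Keeping the full history in Bronze and deriving the current-state table downstream means the transformation can be changed or rerun later without losing any versions.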

Many customers use combinations of these approaches—for example, running Athena for ad-hoc queries while maintaining Snowflake for production dashboards.

**The key:** GRAX doesn't lock you into any approach. The open Parquet format means you can start simple and evolve as needs change.

## Choose Your Path

The right integration approach depends on your team's capabilities and goals:

| If you want to...                                                                                                     | Start here:                                                                            | Best for                                                         |
| --------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------- | ---------------------------------------------------------------- |
| <p><strong>Query historical data with SQL</strong><br>Build BI dashboards, run analytics, or feed data warehouses</p> | <p><strong>Data Lake</strong><br>Automatic Parquet export to your cloud storage</p>    | <p>Data analysts<br>BI teams<br>Data engineers</p>               |
| <p><strong>Find and investigate records</strong><br>Search historical data, explore relationships, audit changes</p>  | <p><strong>Global Search</strong><br>Full-text search across all GRAX data</p>         | <p>Salesforce admins<br>Support teams<br>Compliance officers</p> |
| <p><strong>Recover deleted data</strong><br>Restore records with full relationships back to Salesforce</p>            | <p><strong>Global Search</strong><br>Find, Review, Restore workflow</p>                | <p>Admins<br>Data recovery teams<br>Support staff</p>            |
| <p><strong>Seed developer sandboxes</strong><br>Copy production data (anonymized) into dev/test environments</p>      | <p><strong>Sandbox Seeding</strong><br>On-demand data copying with anonymization</p>   | <p>Developers<br>QA teams<br>Training admins</p>                 |
| <p><strong>Build custom integrations</strong><br>Automate workflows or integrate with internal tools</p>              | <p><strong>Public API</strong><br>OpenAPI REST interface for programmatic access</p>   | <p>Developers<br>Integration engineers<br>Automation teams</p>   |
| <p><strong>Access GRAX from Salesforce UI</strong><br>View history, search, or restore without leaving Salesforce</p> | <p><strong>Managed Package</strong><br>Lightning components embedded in Salesforce</p> | <p>End users<br>Salesforce admins<br>Support agents</p>          |

## Core Capabilities

### Data Lake: SQL Analytics Foundation

**Automatically exports backup data to Parquet format** for high-performance analytics.

**What you get:**

* Cloud-native Parquet files in your S3/Azure/GCP storage
* Historical depth with all record versions over time
* Works with AWS Athena, Azure Synapse, Databricks, Snowflake, BigQuery
* Sub-2-hour latency for operational analytics
* Continuous incremental updates (no batch dumps)

**Perfect for:**

* BI dashboards without Salesforce API limits
* Data warehouse loading (Snowflake, Databricks, Redshift)
* Historical trend analysis and forecasting
* Machine learning training datasets
* Cross-system analytics (join with ERP, marketing, etc.)

**Architecture fit:** Your Bronze layer for downstream transformations

[Get started with Data Lake →](https://documentation.grax.com/reuse-data/data-lake)

***

### Global Search: Find Anything, Anytime

**Full-text search and investigation** across all GRAX historical data.

**What you get:**

* Search by any field value, date range, or text content
* View complete record history and change timeline
* Relationship graph visualization
* Export results or restore to Salesforce
* Template-based searches for common patterns

**Perfect for:**

* Finding deleted records for recovery
* Investigating data quality issues
* Compliance audits and field-level change tracking
* Training users on historical scenarios
* Root cause analysis of data problems

**Architecture fit:** Interactive investigation and recovery tool

[Explore Global Search →](https://documentation.grax.com/reuse-data/global-search)

***

### Sandbox Seeding: Production Data for Development

**Copy production data into sandboxes** with relationship preservation and optional anonymization.

**What you get:**

* Select records via Salesforce reports, SOQL, CSV, or Global Search
* Automatic relationship graph building (parent/child records)
* Deterministic or random data anonymization
* Full control over object inclusion and field overrides
* Faster than Salesforce's sandbox refresh cycle
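
The difference between deterministic and random anonymization can be sketched in a few lines. This is an illustration of the general technique using a keyed hash, not GRAX's actual anonymization algorithm; the function names, token format, and key are hypothetical.

```python
import hashlib
import hmac
import secrets


def anonymize_deterministic(value: str, key: bytes) -> str:
    """Same input and key always yield the same token, so matching values still match."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"anon_{digest[:12]}"


def anonymize_random(value: str) -> str:
    """A fresh token on every call, so outputs cannot be correlated with each other."""
    return f"anon_{secrets.token_hex(6)}"


key = b"example-secret-key"  # hypothetical key; keep real keys out of source code

first = anonymize_deterministic("jane.doe@example.com", key)
second = anonymize_deterministic("jane.doe@example.com", key)
print(first == second)  # deterministic: repeated values map to the same token

random_one = anonymize_random("jane.doe@example.com")
random_two = anonymize_random("jane.doe@example.com")
print(random_one == random_two)  # random: each call produces a new token
```

Deterministic output is useful when tests depend on duplicates or lookups by value still lining up after anonymization; random output is the stronger choice when no such correlation is needed.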

**Perfect for:**

* Giving developers realistic test data
* QA testing against production scenarios
* Training environments with anonymized data
* On-demand sandbox refreshes (not quarterly waits)
* Testing complex integrations with real data shapes

**Architecture fit:** Development enablement and testing

[Start Sandbox Seeding →](https://documentation.grax.com/reuse-data/sandbox-seeding)

***

### Public API: Programmatic Access

**OpenAPI-based REST interface** for custom integrations and automation.

**What you get:**

* RESTful endpoints for search, backup, restore, and metadata operations
* Full OpenAPI specification at `/api/spec/grax.json`
* Token-based authentication with scoped permissions
* Webhook support for event-driven workflows (where available)
* Rate limits appropriate for enterprise workloads
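
As a starting point for programmatic access, the sketch below builds an authenticated request for the OpenAPI specification using only the Python standard library. The base URL and token are placeholders, and sending the token as a `Bearer` header is an assumption; confirm the expected authentication scheme in the API documentation.

```python
from urllib.request import Request

BASE_URL = "https://grax.example.com"   # hypothetical: your GRAX application URL
API_TOKEN = "replace-with-your-token"   # hypothetical placeholder token


def build_spec_request(base_url: str, token: str) -> Request:
    """Build (but do not send) a GET request for the OpenAPI specification."""
    return Request(
        url=f"{base_url.rstrip('/')}/api/spec/grax.json",
        headers={
            "Authorization": f"Bearer {token}",  # assumed header scheme
            "Accept": "application/json",
        },
        method="GET",
    )


request = build_spec_request(BASE_URL, API_TOKEN)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it. The returned specification can then be fed to an OpenAPI client generator to produce typed bindings for the remaining endpoints.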

**Perfect for:**

* Building custom applications on GRAX data
* Automating compliance and governance workflows
* Real-time data sync to operational systems
* Integration with internal tools and platforms
* Scheduled reporting and alerting

**Architecture fit:** Programmatic integration layer

[Explore the API →](https://documentation.grax.com/reuse-data/public-api)

***

### Managed Package: In-Salesforce Access

**Lightning components** that embed GRAX functionality directly in Salesforce UI.

**What you get:**

* Search component for finding historical records
* Record detail components showing change history
* Restore wizards integrated into page layouts
* Template-based search for common patterns
* Auto-updates via managed package releases

**Perfect for:**

* End users who never leave Salesforce
* Support teams needing quick record recovery
* Admins managing user access to GRAX features
* Zero-training adoption (familiar Salesforce UI)
* Governance via Salesforce permission sets

**Architecture fit:** User-facing access layer

[Install Managed Package →](https://documentation.grax.com/reuse-data/managed-package)
