# Reusing Your Data

## Turn Your Salesforce Backup into a Strategic Data Asset

GRAX doesn't just protect your Salesforce data—it transforms it into a queryable, analyzable data product that powers modern analytics, AI, and business intelligence across your organization.

## Why Reuse Your Salesforce Data?

### Complete Historical Context

GRAX captures comprehensive Salesforce data history from day one of your backup, including field changes, deletions, and record evolution that Salesforce's native tools don't preserve. Unlike Salesforce's time-limited field history tracking or limited Data Cloud retention, GRAX gives you:

* **Years of historical depth**: Track trends across quarters and years, not just days
* **Deleted record access**: Query records removed from Salesforce production
* **Point-in-time analysis**: See what your data looked like on any historical date
* **Complete audit trails**: Field-level change history for compliance and investigation
* **Training datasets**: Rich historical data for AI/ML model development
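
As a rough illustration of point-in-time analysis, the sketch below uses Python's built-in `sqlite3` module and an in-memory table of made-up record versions. The table name, column names, and sample rows are hypothetical and do not reflect GRAX's actual schema; the same query pattern should carry over to any SQL engine reading your version history.

```python
import sqlite3

# Hypothetical sample data: one row per captured version of a record.
VERSIONS = [
    # (record_id, stage, amount, modified_at)
    ("006A", "Prospecting", 1000, "2023-01-10"),
    ("006A", "Negotiation", 1500, "2023-03-05"),
    ("006A", "Closed Won", 1500, "2023-04-20"),
    ("006B", "Prospecting", 700, "2023-02-01"),
    ("006B", "Closed Lost", 700, "2023-03-15"),
]


def snapshot_as_of(conn, as_of):
    """Return the newest version of each record captured on or before `as_of`."""
    sql = """
        SELECT v.record_id, v.stage, v.amount, v.modified_at
        FROM opportunity_versions AS v
        JOIN (
            SELECT record_id, MAX(modified_at) AS latest
            FROM opportunity_versions
            WHERE modified_at <= ?
            GROUP BY record_id
        ) AS m
          ON m.record_id = v.record_id AND m.latest = v.modified_at
        ORDER BY v.record_id
    """
    return conn.execute(sql, (as_of,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE opportunity_versions "
    "(record_id TEXT, stage TEXT, amount INTEGER, modified_at TEXT)"
)
conn.executemany("INSERT INTO opportunity_versions VALUES (?, ?, ?, ?)", VERSIONS)

# State of every opportunity as it stood on 10 March 2023.
print(snapshot_as_of(conn, "2023-03-10"))
```

ISO-8601 date strings sort chronologically, which is why a plain string comparison is enough here.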

### Data You Actually Own

Your data lives in **your cloud storage** (AWS S3, Azure Blob, or GCP Cloud Storage), not locked in a vendor platform. This means:

* **No API limits**: Query as much as you need without throttling
* **No per-query GRAX fees**: You pay only your cloud provider's standard storage and query charges
* **Your tools, your choice**: Use any analytics platform, warehouse, or BI tool
* **Data sovereignty**: Full control over data residency and governance
* **Cloud-agnostic**: Works with AWS, Azure, or GCP

### Enterprise-Proven Scale

Fortune 100 companies trust GRAX to handle their mission-critical Salesforce data at massive scale, processing hundreds of millions of record versions per week in production environments. Whether you're analyzing millions of records or building operational dashboards, GRAX handles enterprise-scale workloads with sub-2-hour latency.

## How GRAX Fits Your Data Architecture

**GRAX provides the Bronze layer:** your complete Salesforce history as Parquet files in your cloud storage (S3, Azure Blob, GCS). From there, customers take different approaches:

### Direct Query (Serverless)

Query Bronze directly with minimal transformation. Cost-effective for analytics workloads.

* **AWS:** Athena, Glue external tables
* **GCP:** BigQuery external tables
* **Azure:** Synapse serverless pools
* **Local/Open Source:** DuckDB for cost-free analytics on your laptop or server
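
To make the direct-query idea concrete, here is a minimal sketch for the DuckDB option. It assumes the third-party `duckdb` Python package is installed and that each Parquet row is one record version carrying Salesforce's standard `Id` and `SystemModstamp` fields. The file path in the usage note and both helper names are hypothetical, so check the column names in your own export first.

```python
def latest_versions_sql(parquet_glob: str) -> str:
    """Build a DuckDB query that keeps only the newest version of each record.

    Assumption: rows are record versions with `Id` and `SystemModstamp` columns.
    """
    # Escape single quotes so the path is a valid SQL string literal.
    path = parquet_glob.replace("'", "''")
    return (
        f"SELECT * FROM read_parquet('{path}') "
        "QUALIFY ROW_NUMBER() OVER "
        "(PARTITION BY Id ORDER BY SystemModstamp DESC) = 1"
    )


def run_query(sql: str):
    """Execute SQL with an in-memory DuckDB connection and return all rows."""
    import duckdb  # third-party; imported lazily so the builder works without it

    con = duckdb.connect()  # no file argument means an in-memory database
    return con.execute(sql).fetchall()
```

With Parquet files downloaded locally, a call such as `run_query(latest_versions_sql("exports/Account/*.parquet"))` would return one row per Account. Reading directly from `s3://` paths additionally requires DuckDB's `httpfs` extension and credentials.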

### Data Lakehouse Platform

Unified analytics and data engineering on Bronze.

* Databricks (medallion architecture)
* Azure Synapse Analytics
* AWS EMR + Spark

### Traditional Warehouse

Transform and load into a data warehouse for BI.

* **Transform:** dbt, Airflow, Cloud Dataflow, custom SQL
* **Warehouse:** Snowflake, Redshift, BigQuery
* **BI:** Tableau, Looker, Power BI, QuickSight

Many customers use combinations of these approaches—for example, running Athena for ad-hoc queries while maintaining Snowflake for production dashboards.

**The key:** GRAX doesn't lock you into any approach. The open Parquet format means you can start simple and evolve as needs change.

## Choose Your Path

The right integration approach depends on your team's capabilities and goals:

| If you want to...                                                                                                     | Start here:                                                                            | Best for                                                         |
| --------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------- | ---------------------------------------------------------------- |
| <p><strong>Query historical data with SQL</strong><br>Build BI dashboards, run analytics, or feed data warehouses</p> | <p><strong>Data Lake</strong><br>Automatic Parquet export to your cloud storage</p>    | <p>Data analysts<br>BI teams<br>Data engineers</p>               |
| <p><strong>Find and investigate records</strong><br>Search historical data, explore relationships, audit changes</p>  | <p><strong>Global Search</strong><br>Full-text search across all GRAX data</p>         | <p>Salesforce admins<br>Support teams<br>Compliance officers</p> |
| <p><strong>Recover deleted data</strong><br>Restore records with full relationships back to Salesforce</p>            | <p><strong>Global Search</strong><br>Find, Review, Restore workflow</p>                | <p>Admins<br>Data recovery teams<br>Support staff</p>            |
| <p><strong>Seed developer sandboxes</strong><br>Copy production data (anonymized) into dev/test environments</p>      | <p><strong>Sandbox Seeding</strong><br>On-demand data copying with anonymization</p>   | <p>Developers<br>QA teams<br>Training admins</p>                 |
| <p><strong>Build custom integrations</strong><br>Automate workflows or integrate with internal tools</p>              | <p><strong>Public API</strong><br>OpenAPI REST interface for programmatic access</p>   | <p>Developers<br>Integration engineers<br>Automation teams</p>   |
| <p><strong>Access GRAX from Salesforce UI</strong><br>View history, search, or restore without leaving SFDC</p>       | <p><strong>Managed Package</strong><br>Lightning components embedded in Salesforce</p> | <p>End users<br>Salesforce admins<br>Support agents</p>          |

## Core Capabilities

### Data Lake: SQL Analytics Foundation

**Automatically exports backup data to Parquet format** for high-performance analytics.

**What you get:**

* Cloud-native Parquet files in your S3/Azure/GCP storage
* Historical depth with all record versions over time
* Works with AWS Athena, Azure Synapse, Databricks, Snowflake, BigQuery
* Sub-2-hour latency for operational analytics
* Continuous incremental updates (no batch dumps)
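
Because every record version is kept, field-level change history can be derived by comparing consecutive versions. The sketch below shows one way to do that in plain Python. The sample rows are invented, and the use of `SystemModstamp` as the ordering column is an assumption about your export rather than a documented GRAX schema.

```python
def field_changes(versions, timestamp_key="SystemModstamp"):
    """Return (timestamp, field, old, new) tuples for each changed field.

    `versions` is a list of dicts, one per captured version of a single record.
    """
    ordered = sorted(versions, key=lambda v: v[timestamp_key])
    changes = []
    # Walk adjacent pairs of versions in chronological order.
    for prev, curr in zip(ordered, ordered[1:]):
        for field in sorted(prev.keys() | curr.keys()):
            if field == timestamp_key:
                continue  # the timestamp always changes; skip it
            old, new = prev.get(field), curr.get(field)
            if old != new:
                changes.append((curr[timestamp_key], field, old, new))
    return changes


# Hypothetical versions of one Opportunity, deliberately out of order.
history = [
    {"Id": "006A", "StageName": "Negotiation", "Amount": 1500,
     "SystemModstamp": "2023-03-05T09:00:00Z"},
    {"Id": "006A", "StageName": "Prospecting", "Amount": 1000,
     "SystemModstamp": "2023-01-10T12:00:00Z"},
    {"Id": "006A", "StageName": "Closed Won", "Amount": 1500,
     "SystemModstamp": "2023-04-20T16:30:00Z"},
]

for change in field_changes(history):
    print(change)
```

In a warehouse, the same comparison is usually expressed with a `LAG` window function over the version timestamp.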

**Perfect for:**

* BI dashboards without Salesforce API limits
* Data warehouse loading (Snowflake, Databricks, Redshift)
* Historical trend analysis and forecasting
* Machine learning training datasets
* Cross-system analytics (join with ERP, marketing, etc.)

**Architecture fit:** Your Bronze layer for downstream transformations

[Get started with Data Lake →](/reuse-data/data-lake.md)

***

### Global Search: Find Anything, Anytime

**Full-text search and investigation** across all GRAX historical data.

**What you get:**

* Search by any field value, date range, or text content
* View complete record history and change timeline
* Relationship graph visualization
* Export results or restore to Salesforce
* Template-based searches for common patterns

**Perfect for:**

* Finding deleted records for recovery
* Investigating data quality issues
* Compliance audits and field-level change tracking
* Training users on historical scenarios
* Root cause analysis of data problems

**Architecture fit:** Interactive investigation and recovery tool

[Explore Global Search →](/reuse-data/global-search.md)

***

### Sandbox Seeding: Production Data for Development

**Copy production data into sandboxes** with relationship preservation and optional anonymization.

**What you get:**

* Select records via Salesforce reports, SOQL, CSV, or Global Search
* Automatic relationship graph building (parent/child records)
* Deterministic or random data anonymization
* Full control over object inclusion and field overrides
* Faster than Salesforce's sandbox refresh cycle
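
For readers unfamiliar with the term, deterministic anonymization generally means the same input always maps to the same masked value, so relationships and joins survive masking. The sketch below illustrates the general idea with a keyed hash from Python's standard library. It is not GRAX's implementation, and the function names, secret, and output format are all hypothetical.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret: bytes, length: int = 12) -> str:
    """Map a value to a stable token using HMAC-SHA256 with a secret key."""
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


def anonymize_email(email: str, secret: bytes) -> str:
    """Replace an email with a stable fake address on a non-routable domain."""
    # Lowercase first so "A@x.com" and "a@x.com" receive the same token.
    token = pseudonymize(email.lower(), secret)
    return f"user_{token}@example.invalid"


SECRET = b"replace-with-a-real-secret"  # hypothetical key; keep it out of source control

masked = anonymize_email("Jane.Doe@acme.com", SECRET)
print(masked)
```

Random anonymization, by contrast, would draw a fresh value on each run, which hides more but breaks consistency across related records and repeated seeds.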

**Perfect for:**

* Giving developers realistic test data
* QA testing against production scenarios
* Training environments with anonymized data
* On-demand sandbox refreshes (not quarterly waits)
* Testing complex integrations with real data shapes

**Architecture fit:** Development enablement and testing

[Start Sandbox Seeding →](/reuse-data/sandbox-seeding.md)

***

### Public API: Programmatic Access

**OpenAPI-based REST interface** for custom integrations and automation.

**What you get:**

* RESTful endpoints for search, backup, restore, and metadata operations
* Full OpenAPI specification at `/api/spec/grax.json`
* Token-based authentication with scoped permissions
* Webhook support for event-driven workflows (where available)
* Rate limits appropriate for enterprise workloads
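
As a starting point for exploring the API, the sketch below builds a request for the OpenAPI specification using only Python's standard library. The `/api/spec/grax.json` path comes from the list above; the base URL, the bearer-token header scheme, and the function names are assumptions for illustration, so confirm the real authentication details against the specification itself.

```python
import json
import urllib.request

SPEC_PATH = "/api/spec/grax.json"  # path given in this page


def build_spec_request(base_url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the OpenAPI specification."""
    url = base_url.rstrip("/") + SPEC_PATH
    return urllib.request.Request(
        url,
        headers={
            # Assumption: the API accepts a bearer token in this header.
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
        method="GET",
    )


def fetch_spec(base_url: str, token: str) -> dict:
    """Send the request and parse the JSON response body."""
    request = build_spec_request(base_url, token)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `fetch_spec("https://grax.example.com", "<your-token>")` with your own GRAX hostname (the one shown is a placeholder) should return the specification as a dictionary, from which the available endpoints can be listed via its `paths` key.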

**Perfect for:**

* Building custom applications on GRAX data
* Automating compliance and governance workflows
* Real-time data sync to operational systems
* Integration with internal tools and platforms
* Scheduled reporting and alerting

**Architecture fit:** Programmatic integration layer

[Explore the API →](/reuse-data/public-api.md)

***

### Managed Package: In-Salesforce Access

**Lightning components** that embed GRAX functionality directly in Salesforce UI.

**What you get:**

* Search component for finding historical records
* Record detail components showing change history
* Restore wizards integrated into page layouts
* Template-based search for common patterns
* Auto-updates via managed package releases

**Perfect for:**

* End users who never leave Salesforce
* Support teams needing quick record recovery
* Admins managing user access to GRAX features
* Zero-training adoption (familiar Salesforce UI)
* Governance via Salesforce permission sets

**Architecture fit:** User-facing access layer

[Install Managed Package →](/reuse-data/managed-package.md)

