Archive

Before you can archive data you must enable Backup. Archiving data is a destructive action. Salesforce's Cascade Delete mechanisms and trigger automation may cause more data than you expect to be destroyed. Backup is the best way to guarantee all deleted data, whether deleted by GRAX archives or from any other SFDC action, is protected in GRAX.

GRAX Archive gives you the tooling to delete data from Salesforce with confidence.

Archived records are still available in GRAX for search, visualization and restore—just like any other record deleted directly via Salesforce. But by using GRAX to archive records you can count on us to take care of the usual concerns that come with data deletion, like auditing, capacity planning and data integrity.

The foundation for this is Backup, which ensures GRAX has the latest version of all your records. This means things like triggers and cascade deletes aren't going to unexpectedly cause data loss.

GRAX strongly recommends you first archive and restore a single EmailMessage to understand the tools and to expose any permission problems deleting and inserting data in Salesforce.

To archive your first record in the GRAX Application:

Enable Backup.
In the Record Viewer, look for an EmailMessage record and click the "Archive" button
Keep the default option "Verify records individually with Salesforce"
Review the confirmation, which shows a plan of the data and files that are going to be deleted
Click "Execute" and GRAX:
- Verifies the record and related data are backed up
- Deletes the provided records and orphaned files
- Updates the record and related data in GRAX to reflect its latest status

After this is done you can open the record again to restore it.

How It Works

Archives are performed by the GRAX integration user, which requires "Modify All Data" permissions to be able to delete records. Once you start an archive, GRAX:

Collects the "parent" object ID or IDs
Introspects the parent object schema to collect "child" object IDs based on "master detail" relationships
Applies your archive verification choice
Generates a preview of all parent and child records counts that are to be deleted for confirmation (you can switch from the graph view to a list view at this point to ensure that relationships intended for archival are checked off)
Issues batches of "Cascade Delete" API calls to Salesforce, which delete the parents and their related records
Provides a summary of how many records were deleted and what errors if any were returned
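
The collection and preview steps above can be sketched in a few lines. This is only an illustration, not GRAX's implementation: the object names, the `MASTER_DETAIL` map, and the `CHILD_IDS` lookup are hypothetical in-memory stand-ins for what would really come from the Salesforce schema and from queries.

```python
from collections import deque

# Hypothetical schema: parent object -> child objects linked by master-detail.
MASTER_DETAIL = {
    "Invoice__c": ["InvoiceLine__c"],
    "InvoiceLine__c": ["LineNote__c"],
}

# Hypothetical data: (child object, parent record ID) -> child record IDs.
CHILD_IDS = {
    ("InvoiceLine__c", "inv-1"): ["line-1", "line-2"],
    ("LineNote__c", "line-1"): ["note-1"],
}


def build_plan(parent_object, parent_ids):
    """Collect the parent IDs plus every master-detail descendant, per object."""
    plan = {}
    queue = deque([(parent_object, list(parent_ids))])
    while queue:
        obj, ids = queue.popleft()
        plan.setdefault(obj, []).extend(ids)
        for child_obj in MASTER_DETAIL.get(obj, []):
            child_ids = [
                cid
                for pid in ids
                for cid in CHILD_IDS.get((child_obj, pid), [])
            ]
            if child_ids:
                queue.append((child_obj, child_ids))
    return plan


def preview(plan):
    """Record counts per object, as shown for confirmation before deleting."""
    return {obj: len(ids) for obj, ids in plan.items()}
```

Calling `preview(build_plan("Invoice__c", ["inv-1"]))` gives the per-object counts a user would confirm before any delete is issued.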

What's Deleted

By default GRAX relies on Salesforce to delete related records, via cascade deletes. So when you request a Case to be archived, for example, you'll notice that related Events and Tasks are also going to be deleted.

That aside, the one case where we may delete additional records in an archive is to clean up ContentDocument records that would be left orphaned. This is important because while the cascade delete takes care of deleting files represented by an Attachment, the ContentDocument hierarchy is a bit different and would allow for orphaned records and files to be left in Salesforce.

Archive Dashboard

The Archive Dashboard provides a graphical view of Salesforce storage usage over time as well as an archive record count.

The "Show Storage Total" toggle displays the sum of object and file storage usage when enabled. When disabled, the two storage types are displayed independently. As in the Backup dashboard, you can show or hide each data type by clicking it in the dashboard legend.

Archive Sources

Archive offers multiple source selection options:

Search
Salesforce Report (only in tabular format)
CSV
List of Record IDs
Query Template
Query WHERE

Using `Search` as a Source

You can use Global Search to generate an archive job based on search criteria by specifying the object you are searching for, the record status, a date range, and field filters to narrow down results.

Using `Salesforce Report` as a Source

You can upload a tabular Salesforce report to generate a list of records for an archive job. For the report to be visible in the GRAX Console, it must either be saved to a folder with GRAX in its name or have GRAX in the Report Title. The report must also be saved to a location that the GRAX Integration User can access.

We recommend creating a dedicated folder, such as "GRAX Reports", to hold GRAX-specific Salesforce reports.

Using `CSV` as a Source

You can upload a CSV file to use as the source of an archive job. The CSV must have a header named Id and must not exceed 500 MB. Note that this option starts with a batch of up to 20,000 records. If the CSV contains more than 20,000 records, an Auto Archive job can be configured to continue archiving the remaining records.
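
Before uploading, it can help to check a CSV against these limits locally. The sketch below is not part of GRAX; the function name is made up, and it assumes that "500 MB" means 500 × 1024 × 1024 bytes and that the file is UTF-8 encoded.

```python
import csv
import io

MAX_CSV_BYTES = 500 * 1024 * 1024  # assumption: 500 MB counted in binary megabytes
FIRST_BATCH = 20_000               # size of the initial batch described above


def check_archive_csv(data: bytes):
    """Validate a CSV and split its IDs into the first batch and a remainder count."""
    if len(data) > MAX_CSV_BYTES:
        raise ValueError("CSV exceeds 500 MB")
    # utf-8-sig strips a byte-order mark if the file was exported with one.
    reader = csv.DictReader(io.StringIO(data.decode("utf-8-sig"), newline=""))
    if reader.fieldnames is None or "Id" not in reader.fieldnames:
        raise ValueError("CSV must have a header named Id")
    ids = [(row.get("Id") or "").strip() for row in reader]
    ids = [record_id for record_id in ids if record_id]  # drop blank rows
    first = ids[:FIRST_BATCH]
    return first, len(ids) - len(first)
```

A non-zero second value means the file holds more than the first batch, which is the case where an Auto Archive job would be needed for the rest.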

Using `Record IDs` as a Source

You can manually input a list of Record IDs to generate an archive job; the system will automatically determine the object and generate a list of the matching records. There is a limit of 200 records for this source type.
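
As a rough illustration of how an object can be inferred from an ID: Salesforce record IDs are 15 or 18 characters long, and the first three characters are a key prefix that identifies the object. The sketch below groups pasted IDs by prefix and enforces the 200-record limit. The prefix table is a small illustrative subset of well-known standard prefixes, and nothing here is meant to describe how GRAX itself resolves objects.

```python
import re

MAX_RECORD_IDS = 200  # limit for this source type
ID_PATTERN = re.compile(r"^[A-Za-z0-9]{15}(?:[A-Za-z0-9]{3})?$")

# Illustrative subset of standard key prefixes; a real org has many more.
KEY_PREFIXES = {
    "001": "Account",
    "003": "Contact",
    "006": "Opportunity",
    "500": "Case",
    "00T": "Task",
}


def group_by_object(raw: str):
    """Split pasted IDs on whitespace or commas and group them by object."""
    ids = [token for token in re.split(r"[\s,]+", raw.strip()) if token]
    if len(ids) > MAX_RECORD_IDS:
        raise ValueError(f"at most {MAX_RECORD_IDS} record IDs are allowed")
    grouped = {}
    for record_id in ids:
        if not ID_PATTERN.match(record_id):
            raise ValueError(f"not a 15- or 18-character ID: {record_id!r}")
        obj = KEY_PREFIXES.get(record_id[:3], "Unknown")
        grouped.setdefault(obj, []).append(record_id)
    return grouped
```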

Using `Query Template` as a Source

The Query Template archive source allows you to choose from a list of four objects (Case, ContentDocumentLink, EmailMessage and Task) and then select a Template based on the chosen object.

This option does not require any SOQL.

Using `Query WHERE` as a Source

The Query WHERE archive source is similar to writing a SOQL query, but allows the user to provide only the "WHERE" portion of the query. Other portions of the query are then generated by GRAX automatically when the archive runs using the selected object. There are several performance advantages to using query WHERE source selection for Archive over the full query. These include:

The ability to auto tune parts of the query to deal with issues like timeouts. For example, GRAX progressively adjusts the LIMIT aiming to fetch an ideal number of records to archive.
It allows GRAX to make additional optimizations to improve performance.
Testing the query to see if it works and if it has records is faster, since we initially run it with LIMIT 1.
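
The source does not describe how GRAX tunes the LIMIT, so the following is only one generic way such tuning could work: halve the LIMIT after a timeout and suggest a larger one after a success. `QueryTimeout`, `run_query`, and all the numeric defaults are hypothetical.

```python
class QueryTimeout(Exception):
    """Hypothetical error raised when a query takes too long."""


def fetch_with_adaptive_limit(run_query, start_limit=2000, min_limit=1, max_limit=10000):
    """Run run_query(limit), halving the limit on timeout.

    Returns the fetched rows and a suggested limit for the next call.
    """
    limit = start_limit
    while True:
        try:
            rows = run_query(limit)
        except QueryTimeout:
            if limit <= min_limit:
                raise  # cannot shrink further; surface the timeout
            limit = max(min_limit, limit // 2)
            continue
        # Success: cautiously try a larger batch next time.
        return rows, min(max_limit, limit * 2)
```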

By default these queries are issued like:

SELECT Id FROM Object WHERE ...

But you also have the option to query for a reference field, which is useful to write queries that join on different objects. A common case is to query for a History object, for example:

SELECT CaseId FROM CaseHistory WHERE ...
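
To make the split between the user-supplied WHERE clause and the generated parts concrete, here is a minimal sketch of how such a query string could be assembled. The function and parameter names are invented for illustration. GRAX generates these parts itself, and its actual logic may differ.

```python
def build_archive_query(object_name, where, field="Id", limit=None):
    """Combine a user-supplied WHERE clause with the generated SELECT and LIMIT."""
    query = f"SELECT {field} FROM {object_name} WHERE {where}"
    if limit is not None:
        query += f" LIMIT {limit}"
    return query


# Default form, selecting the record's own Id.
default_query = build_archive_query("EmailMessage", "CreatedDate < 2019-01-01T00:00:00Z")

# Reference-field form, selecting the parent Case from its history rows.
history_query = build_archive_query("CaseHistory", "Field = 'Status'", field="CaseId")

# Quick test run of the same WHERE clause with LIMIT 1.
test_query = build_archive_query("EmailMessage", "CreatedDate < 2019-01-01T00:00:00Z", limit=1)
```

The `Field = 'Status'` filter is only a placeholder condition; any valid WHERE clause for the selected object would go there.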

Verification

The recommended way to prevent data loss when deleting records with GRAX is to pick "Verify records individually with Salesforce" when running archives. In this mode GRAX issues SOQL queries to get information about records that may be deleted with the archive hierarchy, and then ensures that Backup has them covered.

This is extremely important when you aren't sure about the shape of the data being deleted, or when you are archiving records that may still change.

There is a faster option available, though, for when you know the data being deleted is stable and cannot change. For example, when archiving email messages via a SOQL query that only returns records created years ago, in most environments these records aren't expected to receive updates, so you can archive them faster by picking "Verify Backup is current".

Archive Options

Aside from the verification method, you have a few more options to pick when archiving data with GRAX:

Archive blocking children: this tells GRAX to delete child records that restrict deletes on their parent, like cases under an account.
- Without this option you'll see DELETE_FAILED errors for these records.
Ignore objects: This holds a list of objects that GRAX skips, both while archiving and restoring data.
- For archives, this ensures GRAX neither deletes nor tries to track deletion data for records of these objects.

Additionally, in the Settings page there are global settings that apply to every archive.

Record Archival Statuses

Successful: Record was successfully archived.
Pending: Record is in the process of being archived OR is blocked due to an error with a parent record that the pending record is dependent on. Please note that if a record is in pending status due to it being blocked, it will stay in pending status until a subsequent archive job is run for that record individually or a subsequent archive job is run where the error for the blocking record has been resolved.
Error: Record was not archived due to an error; this could be SFDC or GRAX related and the error message for the record will need to be reviewed to find out more information.
Skipped: Record was not archived due to it either being previously deleted within SFDC or being automatically archived with a linked parent record as is the case with "ContentVersion" records.

Archive Job Statuses

Completed: Job successfully finished and archived records.
Pending: Job is preparing to run and has not been executed yet.
Sustaining: Job is waiting for new records that match the job criteria to be archived.
Error: Job did not complete and did not archive any records.
Warning: Job has at least one record that was not archived OR did not find any results.
Disabled: Job has been manually deactivated and is not currently running.

Archive Use Cases

Deleting data from Salesforce without impacting your dataset and end user experience is a challenging topic. Due to the complex nature of SFDC data, schemas, relationships, and validation rules, it becomes increasingly complex to archive and restore objects the higher in the data model hierarchy you go.

As an example, deleting / archiving a single Account record might be impossible. Deleting the Account record cascade deletes many other records in the master-detail relationship. But many other objects in SFDC reference the account and its children so the SFDC API won't let you delete the record by any means until you update or delete these other objects.

For this reason we strongly recommend breaking your archive plan down into small pieces and working from the bottom of your data model up. For standard SFDC objects we recommend configuring archive jobs in this order:

EmailMessage + Attachments
Task
Case
Opportunity

Conversely, we strongly discourage trying to archive data top down, such as expecting an Opportunity archive to clear out all cases, tasks, and email messages below it. Chances are you'll hit validation errors and only partially delete the data you want.
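
If you script or document your archive plan, the bottom-up order can be encoded once and reused. This is a small sketch with a made-up helper name; the order itself is the one recommended above.

```python
# Bottom-up order recommended for standard SFDC objects.
RECOMMENDED_ORDER = ["EmailMessage", "Task", "Case", "Opportunity"]


def order_jobs(objects):
    """Sort the objects you plan to archive from the bottom of the hierarchy up."""
    rank = {name: position for position, name in enumerate(RECOMMENDED_ORDER)}
    unknown = [obj for obj in objects if obj not in rank]
    if unknown:
        # Objects outside the recommended list need their own analysis.
        raise ValueError(f"no recommended position for: {unknown}")
    return sorted(objects, key=rank.__getitem__)
```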

In summary:

Use GRAX Backup to back up everything so GRAX has a version of your SFDC data deleted by any means
Archive data from the bottom up
Use the GRAX Embedded Experience to show archived data to your end users in SFDC

EmailMessage and Attachments to reduce SFDC storage

The simplest and most effective archive is EmailMessage and Attachments.

Most SFDC users see the vast majority of SFDC object storage coming from the EmailMessage object and file storage coming from related attachments. EmailMessage is also at the bottom of the data hierarchy, so there are few or no relationships that cause validation errors on deletion.

Therefore we recommend:

configuring an archive for EmailMessage only
leaving Case, etc. objects in SFDC and using the GRAX Embedded Experience to display archived email messages on Case records in SFDC

To do so, first validate basic archive and restore capabilities:

Use "Archive by SOQL" to select the oldest EmailMessages
- SELECT Id FROM EmailMessage ORDER BY CreatedDate ASC LIMIT 1
Use the "Delete orphaned files" default option
Validate archiving and restoring 1 EmailMessage to understand the tools and to expose any permission problems deleting and inserting a single record in Salesforce.

Next, archive many email messages to reduce storage:

Use "Archive by SOQL" to target a bigger set of EmailMessages
- SELECT Id FROM EmailMessage WHERE CreatedDate < 2019-01-01T00:00:00Z
- SELECT Id FROM EmailMessage WHERE CreatedDate != LAST_N_DAYS:731
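
If you prefer an explicit cutoff over a relative date literal, the datetime value can be computed ahead of time. The helper below is a sketch with invented names. It truncates to midnight UTC, which is an assumption, and it is not guaranteed to match how Salesforce evaluates `LAST_N_DAYS` for your org's time zone.

```python
from datetime import datetime, timedelta, timezone


def soql_cutoff(days, now=None):
    """Return midnight UTC `days` days before `now` as a SOQL datetime literal.

    `now` should be timezone-aware; it defaults to the current UTC time.
    """
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    cutoff = (now - timedelta(days=days)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return cutoff.strftime("%Y-%m-%dT%H:%M:%SZ")


def older_than_where(days, now=None):
    """WHERE clause matching records created before the cutoff."""
    return f"CreatedDate < {soql_cutoff(days, now)}"
```

For example, with `days=731` and a run date of 2021-01-01, `soql_cutoff` returns `2019-01-01T00:00:00Z`, the same value used in the first query above.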

Archiving Records with Related Content Documents

In Salesforce, Content Documents are managed by three key objects. The first is the ContentDocument, which represents an individual file. The second is the ContentVersion, a child of the ContentDocument, which tracks all versions of that file. The third is the ContentDocumentLink, which associates the ContentDocument with other records, users, and libraries.

Content Documents can be linked to multiple records simultaneously. This is supported in Salesforce through multiple ContentDocumentLinks, which allow a single ContentDocument to be associated with multiple records.

Because a ContentDocument can be related to multiple records, GRAX typically follows the guideline of archiving all three ContentDocument objects only when the ContentDocument is related to a single record. If it is associated with multiple records, GRAX will archive only the ContentDocumentLink related to the specific record.

Below are some scenarios in which GRAX will archive only the ContentDocumentLink and not all three ContentDocument objects:

If the ContentDocument was uploaded to a Content Library, creating a relationship between the ContentDocument and the ContentWorkspace, neither the ContentDocument nor the ContentVersion will be included in the archive. In this case, only the parent record and the ContentDocumentLink will be archived.
If the ContentDocument is related to multiple records, neither the ContentDocument nor the ContentVersion will be included in the archive.
If a ContentDocument and ContentVersion were once linked to multiple records but are now related to only one record (due to the deletion or archiving of other records), the ContentDocument and ContentVersion will still not be included in the archive.

Archiving a ContentDocument directly will remove all related ContentVersion and ContentDocumentLink records.
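
The rules above for archiving a parent record can be summarized as a small decision function. This is a reading of the documented scenarios, not GRAX code: the `FileInfo` fields are hypothetical inputs, and how GRAX actually tracks whether a file was ever linked to multiple records is not described here.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    record_link_count: int   # records the ContentDocument is currently linked to
    in_library: bool         # uploaded to a Content Library (ContentWorkspace)
    ever_multi_linked: bool  # was linked to multiple records at some point


def objects_archived_with_parent(info: FileInfo):
    """Which content objects go along when the linked parent record is archived."""
    link_only = (
        info.in_library
        or info.record_link_count > 1
        or info.ever_multi_linked
    )
    if link_only:
        return ["ContentDocumentLink"]
    return ["ContentDocumentLink", "ContentDocument", "ContentVersion"]
```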

Archive Task Hierarchy

Another simple and effective archive is Task, both for storage savings and for deleting old records for data compliance reasons. Many SFDC users have Task records from years past that no longer need to be retained as live records in SFDC.

Therefore we recommend:

Configuring an archive for Task only
Leaving Case, etc. objects in SFDC and using the GRAX Embedded Experience to display archived Tasks on a Case record in SFDC

To do so, first validate basic archive and restore capabilities:

Use "Archive by SOQL" to select the oldest Task
- SELECT Id FROM Task ORDER BY CreatedDate ASC LIMIT 1
Validate archiving and restoring 1 Task to understand the tools and to expose any permission problems deleting and inserting hierarchical data in Salesforce.

Next, archive many Tasks to remove old data:

Use "Archive by SOQL" to target a bigger set of Tasks
- SELECT Id FROM Task WHERE CreatedDate < 2019-01-01T00:00:00Z
- SELECT Id FROM Task WHERE CreatedDate != LAST_N_DAYS:731

Frequently Asked Questions

Can records that are archived by GRAX be recovered from the SFDC recycle bin?

No. Archived records are "hard deleted" by GRAX. Archived records can be recovered by using Restore.

Why is my Auto Archive job not picking up new matching records?

When an archive job is initiated from the Global Search module, any subsequent Auto Archive job will run only as many times as necessary to process the initial batch of search results; it will not continue to search for or archive newly matching records.

To ensure that an Auto Archive job consistently captures new records based on defined search criteria, it must be configured using the Search source within the Archive module.

