Troubleshooting FAQ
This document lists the most important questions for troubleshooting and evaluating issues that block general operation of GRAX, including networking failures, boot failures, and restart behavior. It includes commands specific to the Amazon Linux 2 AMI maintained by AWS, but otherwise attempts to remain infrastructure-agnostic where possible. Some steps may not work as intended if you have heavily customized your networking, image, or environment.
Where are GRAX files located?
First, let's note the locations of the GRAX binary and environment file. Keeping track of their paths helps validate service configurations in later steps. In a typical installation, GRAX is stored under /home/ec2-user/graxinc/grax and the .env file is stored under /home/ec2-user. Thus, we'll use the following paths:
- GRAX binary: /home/ec2-user/graxinc/grax/grax
- GRAX command-line tools: /home/ec2-user/graxinc/grax/graxctl
- Environment file: /home/ec2-user/.env
Is GRAX executable?
To ensure that Linux knows GRAX is an executable, we can check permissions on the file as follows:
[root@grax-test-runtime grax]# ls -la
total 278108
drwxr-xr-x 2 root root 137 Aug 20 10:09 .
drwxr-xr-x 3 root root 18 Aug 18 12:38 ..
-rwxr-xr-x 1 root root 71471248 Aug 20 10:09 grax
-rwxr-xr-x 1 root root 56687264 Aug 19 15:48 graxctl
-rw-r--r-- 1 root root 52411443 Aug 18 12:39 master.zip
An "x" in the permission string at the beginning of a line marks that file as executable (for directories such as "." and "..", it grants traversal instead). If the grax and graxctl files aren't executable, we can mark them as such:
[root@grax-test-runtime grax]# chmod +x grax graxctl
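The same check can be scripted. The following is a minimal sketch rather than part of the GRAX tooling; check_executables is a hypothetical helper name, and the paths in the example comment are the typical-installation paths listed earlier.

```shell
# Hypothetical helper (not shipped with GRAX).
# Report whether each given path exists and is executable.
# Returns 0 only if every path passes both checks.
check_executables() {
  local path status=0
  for path in "$@"; do
    if [ ! -e "$path" ]; then
      echo "MISSING: $path"
      status=1
    elif [ ! -x "$path" ]; then
      echo "NOT EXECUTABLE: $path"
      status=1
    else
      echo "OK: $path"
    fi
  done
  return "$status"
}

# Example (typical installation paths):
#   check_executables /home/ec2-user/graxinc/grax/grax /home/ec2-user/graxinc/grax/graxctl
```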
How should the Environment file be formatted?
There are several important rules to remember for .env files:
- Only one key-value pair per line
- Only = is supported as a key-value separator
- Comments aren't supported
Comments are the most common issue, as teams often attempt to label values for later reference. Unfortunately, a comment causes most .env parsers to return immediately, sometimes without a fatal error. This can lead to partial configurations and thus cause indeterminate symptoms.
For a complete example of a valid .env file, see our Linux Install Guide.
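To catch these formatting problems before starting the service, a small lint pass over the file can help. This is a minimal sketch under stated assumptions, not the parser GRAX uses: lint_env_file is a hypothetical helper, it assumes keys look like shell identifiers (letters, digits, and underscores, not starting with a digit), and it skips blank lines.

```shell
# Hypothetical helper (not shipped with GRAX).
# Flag lines in an env file that break the rules above.
# Prints only line numbers, never values, since values may be secrets.
# Returns 0 if no problems were found, 1 otherwise.
lint_env_file() {
  local file="$1" line key n=0 bad=0
  while IFS= read -r line || [ -n "$line" ]; do
    n=$((n + 1))
    case $line in
      '') ;;  # blank line: skipped (an assumption; parsers may differ)
      '#'*) echo "line $n: comment"; bad=1 ;;
      *=*)
        key=${line%%=*}  # everything before the first "="
        case $key in
          '' | [0-9]* | *[!A-Za-z0-9_]*) echo "line $n: invalid key"; bad=1 ;;
        esac ;;
      *) echo "line $n: no = separator"; bad=1 ;;
    esac
  done < "$file"
  return "$bad"
}

# Example: lint_env_file /home/ec2-user/.env
```

An empty output with a zero exit status means none of the three rules was violated; it does not prove that the values themselves are correct.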
Is the service (systemd) working properly?
This guide assumes you're operating GRAX as a permanent service on the instance with systemd. The most common issues with systemd are configuration issues in the service file. In a typical installation, the GRAX service file is at the path /lib/systemd/system/grax.service.
Validate Configuration
We can see the contents of the service configuration by using cat:
[root@grax-test-runtime grax]# cat /lib/systemd/system/grax.service
[Install]
WantedBy=multi-user.target
[Service]
EnvironmentFile=/home/ec2-user/.env
ExecStart=/home/ec2-user/graxinc/grax/grax
Restart=always
Type=simple
[Unit]
Description=grax daemon
Check the following:
- EnvironmentFile is a valid absolute path that points to your GRAX .env file.
- ExecStart is a valid absolute path that points to your GRAX executable.
- Restart is "always" to ensure GRAX is always running regardless of how the process exits.
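The three checks above can be sketched as a script. This is an illustration only: unit_value and check_grax_unit are hypothetical helper names, and the sketch assumes plain KEY=value directives without systemd's optional prefixes (such as a leading "-") or line continuations.

```shell
# Hypothetical helpers (not shipped with GRAX).
# Print the value of the first KEY=value line for KEY ($2) in unit file ($1).
unit_value() {
  sed -n "s/^$2=//p" "$1" | head -n 1
}

# Check EnvironmentFile, ExecStart, and Restart. Returns 0 if all pass.
check_grax_unit() {
  local unit="$1" env_file exec_start restart status=0
  env_file=$(unit_value "$unit" EnvironmentFile)
  exec_start=$(unit_value "$unit" ExecStart)
  restart=$(unit_value "$unit" Restart)
  exec_start=${exec_start%% *}  # drop any arguments after the executable

  case $env_file in
    /*) [ -f "$env_file" ] || { echo "EnvironmentFile not found: $env_file"; status=1; } ;;
    *)  echo "EnvironmentFile is not an absolute path: $env_file"; status=1 ;;
  esac
  case $exec_start in
    /*) [ -x "$exec_start" ] || { echo "ExecStart not executable: $exec_start"; status=1; } ;;
    *)  echo "ExecStart is not an absolute path: $exec_start"; status=1 ;;
  esac
  [ "$restart" = always ] || { echo "Restart is '$restart', expected 'always'"; status=1; }
  return "$status"
}

# Example: check_grax_unit /lib/systemd/system/grax.service
```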
Service Status
Services run by systemd are managed with the systemctl command. To see the current status of the GRAX service, we can use the status subcommand:
[root@grax-test-runtime grax]# systemctl status grax.service
● grax.service - grax daemon
Loaded: loaded (/usr/lib/systemd/system/grax.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2022-08-19 17:35:29 UTC; 5 days ago
Main PID: 13125 (grax)
CGroup: /system.slice/grax.service
└─13125 /home/ec2-user/graxinc/grax/grax
(Log Lines omitted for brevity)
Hint: Some lines were ellipsized, use -l to show in full.
We always expect the GRAX service to be "active" unless it is under maintenance or was intentionally taken offline. If the service isn't "active" (for example, it is "failed" or "inactive"), you can restart GRAX at any time by running the restart subcommand:
[root@grax-test-runtime grax]# systemctl restart grax.service
If the GRAX service is entirely disabled, you can enable it with the enable subcommand, and then start it immediately with start:
[root@grax-test-runtime grax]# systemctl enable grax.service; systemctl start grax.service
A successful start of the service (and thus the app) writes to the app log file. If you have a regular health check configured, those calls also appear in the log while the app is active.
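For scripted use, systemctl also offers the is-active subcommand, which exits with status 0 only when the unit is active. The following minimal sketch restarts the service only when it is not active; ensure_active is a hypothetical helper name, and the sketch must run with sufficient privileges to restart the unit.

```shell
# Hypothetical helper (not shipped with GRAX).
# Restart the given unit only if systemd does not report it as active.
ensure_active() {
  local unit="$1"
  if systemctl is-active --quiet "$unit"; then
    echo "$unit is active"
  else
    # Without --quiet, is-active prints the state (for example "failed").
    echo "$unit is $(systemctl is-active "$unit"); restarting"
    systemctl restart "$unit"
  fi
}

# Example: ensure_active grax.service
```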
Is the Web Server Serving Requests?
GRAX is a web server and API. It offers an endpoint for an external health check to see if the app is available. The health check endpoint for GRAX is an HTTP/1.1 HTTPS-only GET handler on /health. In a typical installation, GRAX runs on port 8000.
We can manually check the status of the app from the instance by curling the local route:
[root@grax-test-runtime grax]# curl -k https://localhost:8000/health
ok
The expected response from the endpoint is HTTP status 200; this signifies a healthy service. A failed call, whether through a timeout, a rejection, or a different status, is a sign of a failed service/app. This endpoint is designed for load balancer registration and de-registration, not for instance replacement.
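Because the check is defined in terms of the HTTP status rather than the response body, it can help to have curl print only the status code. A minimal sketch follows; health_status and is_healthy are hypothetical helper names, and the 5-second timeout is an arbitrary choice for illustration.

```shell
# Hypothetical helpers (not shipped with GRAX).
# Print the HTTP status code returned by the given URL.
# curl prints 000 when no HTTP response was received (timeout, refusal).
# -k skips certificate verification, as in the manual check above.
health_status() {
  curl -k -s -o /dev/null --max-time 5 -w '%{http_code}' "$1"
}

# Succeed only when the endpoint answers with HTTP status 200.
is_healthy() {
  [ "$(health_status "$1")" = 200 ]
}

# Example:
#   is_healthy https://localhost:8000/health && echo healthy || echo unhealthy
```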
Is Connectivity intact?
The GRAX app is a data-processor at its core. To process data, it must be able to retrieve that data, write it to storage, and read it back. When you add app maintenance, licensing, and telemetry to the equation, connectivity is critical to ensure proper operation.
Only some of the overall connectivity requirements can be tested from the instance: the egress connections used to push or pull data to and from remote resources. Ingress connections, which start from other sources, are harder to test.
Timeouts, rejections, or broken connections during the following tests are considered failures. All failures should be investigated.
GRAX HQ
Communication to GRAX HQ is egress-only, and can be tested relatively simply. To start, we can verify connectivity to the GRAX packaging API, which allows downloading of the app in the first place:
[root@grax-test-runtime grax]# curl -L -o testgrax https://hq.grax.com/api/v2/download/graxinc/grax/master/linux/amd64
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 68 100 68 0 0 4270 0 --:--:-- --:--:-- --:--:-- 4533
100 48.6M 0 48.6M 0 0 6344k 0 --:--:-- 0:00:07 --:--:-- 9270k
The -L flag follows the ALB redirect that points to HQ. The -o flag writes the downloaded binary to a file (testgrax here) instead of printing it to the terminal. A successful download of several dozen MB can be considered a passing test.
Next, let's confirm POST requests to HQ succeed:
[root@grax-test-runtime grax]# curl -L -X POST https://hq.grax.com/api/v2/dd/logs/api/v2/logs
{"cause":"","status":401,"message":"Unauthorized"}
This may seem an unusual result, but the 401 response is enough to show that your POST request made it to HQ, without requiring you to construct a valid set of credentials for a simple test. If you don't get a JSON response in line with the above, consider the test a failure.
Salesforce
To read and write Salesforce data via the Salesforce API, GRAX must first be able to connect. We can test that connectivity much the same as above:
[root@grax-test-runtime grax]# curl https://test.salesforce.com
The response to the above should be an HTML document, too large to post here. Repeat that test for the following:
- https://login.salesforce.com
- Any custom/my-domain paths configured in your organization
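When several hosts need checking, the repetition can be scripted. The sketch below is illustrative only: check_egress is a hypothetical helper name and the 10-second timeout is an arbitrary choice. It treats any HTTP status as proof that the connection reached the server, and reports a failure only when no HTTP response arrived at all.

```shell
# Hypothetical helper (not shipped with GRAX).
# Print one line per URL: the HTTP status code, or FAIL if no HTTP
# response arrived (curl reports code 000 in that case).
# Returns 0 only if every URL produced an HTTP response.
check_egress() {
  local url code status=0
  for url in "$@"; do
    code=$(curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$url") || code=000
    if [ -z "$code" ] || [ "$code" = 000 ]; then
      echo "FAIL $url"
      status=1
    else
      echo "$code $url"
    fi
  done
  return "$status"
}

# Example:
#   check_egress https://test.salesforce.com https://login.salesforce.com
```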
Postgres
To test connectivity to your DB instance, we use psql, a command-line client (provided by the postgresql package) that allows direct interaction with Postgres clusters. Installing the package may be unnecessary depending on your image, but can easily be done as follows:
[root@grax-test-runtime grax]# yum install postgresql
Loaded plugins: extras_suggestions, langpacks, priorities, update-motd
amzn2-core | 3.7 kB 00:00:00
amzn2extra-docker | 3.0 kB 00:00:00
Package postgresql-9.2.24-6.amzn2.x86_64 already installed and latest version
Nothing to do
As you can see, our typical installation already includes the right tooling. We can connect in two ways:
- Copy the DATABASE_URL value from your .env file, and run psql [database_url]
- Use the graxctl psql subcommand
A valid connection results in an interactive psql shell, which can be exited with \q:
[root@grax-test-runtime grax]# ./graxctl psql
2022/08/25 16:25:54 trace C9tATg VBlnGV start main mainWithCode:152 e=0s
2022/08/25 16:25:54 pprof addr: [::]:46569
2022/08/25 16:25:54 trace C9tATg VBlnGV info config setTemplateDefaults:427 msg="loading general template v1.0.0 defaults" template=virtual-appliance e=0s
2022/08/25 16:25:54 trace C9tATg VBlnGV info config/secrets New:175 secretStore=database e=11ms
psql (9.2.24, server 14.5)
WARNING: psql version 9.2, server version 14.0.
Some psql features might not work.
SSL connection (cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256)
Type "help" for help.
grax=> \q
If a connection cannot be made, graxctl tries again every 5 seconds for a few minutes. A failure to connect is usually a sign of one of the following:
- DATABASE_URL isn't set
- DATABASE_URL isn't properly formatted
- DATABASE_URL contains an invalid cluster name
- DATABASE_URL contains a password with special characters that need to be escaped
- DATABASE_URL isn't exported to the current environment
- Route tables are forcing DB traffic outside of the VPC
- Security groups aren't allowing traffic from the instance into the DB
If a connection can be made but you receive a Postgres error about DB existence, credentials, etc., then you likely have an incorrect DATABASE_URL value (that is, the username, password, or DB name).
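One of the causes listed above is a password containing special characters. Assuming DATABASE_URL is a standard Postgres connection URI, such characters need to be percent-encoded before the password is placed in the URL. The following is a minimal sketch of that encoding for ASCII passwords; urlencode is a hypothetical helper name and is not part of graxctl.

```shell
# Hypothetical helper (not shipped with GRAX).
# Percent-encode every character outside the URL "unreserved" set
# (letters, digits, ".", "_", "~", "-"). Assumes ASCII input.
urlencode() {
  local s="$1" out="" rest c
  while [ -n "$s" ]; do
    rest=${s#?}      # everything after the first character
    c=${s%"$rest"}   # the first character itself
    s=$rest
    case $c in
      [A-Za-z0-9._~-]) out="$out$c" ;;
      *) out="$out$(printf '%%%02X' "'$c")" ;;  # "'c" yields the character code
    esac
  done
  printf '%s\n' "$out"
}

# Example: urlencode 'p@ss:w/rd#1'   prints p%40ss%3Aw%2Frd%231
```

The encoded value replaces the raw password in DATABASE_URL; the username, host, and DB name are left as they are unless they also contain special characters.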
Is additional assistance available?
If you have exhausted the steps here and require further assistance (or have recommendations for quality/completeness of this guide), contact GRAX Support.