
Integrating API Proxy

Capabilities

DataGrail's API Proxy integration provides the following capabilities:

Product: Request Manager
Request Types: Access, Deletion, Do Not Sell/Share, Identifiers
Identifier Categories: Any

Before You Start

The API Proxy integration enables you to connect custom or proprietary APIs to DataGrail by routing privacy requests through your infrastructure. To ensure successful integration, the APIs you connect must follow a flexible API contract that defines basic requirements for authentication, request/response patterns, and data formatting.

Requirements Overview

Review the requirements below to verify your API meets the necessary contract before proceeding with the integration.

Authentication

The API must support static, token-based authentication methods where credentials remain constant and do not require dynamic refresh or renewal. Supported authentication methods include:

  • API Keys - Static keys passed in headers or query parameters
  • Bearer Tokens - Long-lived tokens included in Authorization headers
  • Basic Authentication - Username and password combinations encoded in headers
  • Custom Headers - Static tokens or secrets passed in custom header fields

Authentication credentials must be stored in your credentials manager and referenced through string interpolation in your API configuration.

Request/Response Flow

The API must support a synchronous request/response pattern where the integration sends a request and receives a complete response over a single HTTP connection. The response must contain all necessary data or confirmation of the operation's success or failure.

Key requirements:

  • Synchronous responses - The API must return results immediately within the request lifecycle (no polling, callbacks, or webhooks)
  • Single connection - All operations must complete over one open HTTP connection without requiring follow-up requests
  • Stateless requests - Each request must be independent and self-contained, performing deletion, identifier retrieval, or data access without relying on context from previous requests
  • Complete responses - The response body must include all relevant data for the operation (for access requests) or appropriate status codes indicating success or failure (for deletion/test requests)
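As a minimal sketch of an endpoint that satisfies these requirements, the function below handles each request on its own and returns the complete result in one response. The `RECORDS` store, the `handle_request` name, and the specific status codes (200, 204, 404, 405) are illustrative assumptions, not part of the DataGrail contract; whichever codes your API returns on success must be listed in the query's valid_response_codes.

```python
import json

# Hypothetical in-memory store standing in for your real system of record.
RECORDS = {
    "peter@example.com": {"first_name": "Peter", "last_name": "Gibbons"},
}


def handle_request(verb, email, records=RECORDS):
    """Return (status_code, body) for one self-contained request."""
    record = records.get(email)
    if verb == "GET":
        # Access: the full result is returned in this single response.
        if record is None:
            return 404, json.dumps([])
        return 200, json.dumps([record])
    if verb == "DELETE":
        # Deletion: the status code alone reports the outcome.
        if record is None:
            return 404, ""
        del records[email]
        return 204, ""
    # Any other method is rejected without touching the store.
    return 405, ""
```

Nothing in the handler depends on an earlier call, and no job ID or callback URL is returned, so the caller never needs a follow-up request.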

Request Timeout

All API requests for a given query type must be completed in under 5 minutes by default. For example, if three endpoints are configured for access requests, then all three must be completed in under 5 minutes (cumulative). The timeout can be configured by setting the RM_JOB_TIMEOUT_SECONDS environment variable in the container.
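When planning endpoints, the cumulative rule can be checked with a few lines of Python. This is only a planning aid under stated assumptions: the helper names are hypothetical, the durations are measurements you would collect yourself, and how the Agent itself parses RM_JOB_TIMEOUT_SECONDS is not documented here.

```python
import os

DEFAULT_TIMEOUT_SECONDS = 300  # the 5-minute default described above


def job_timeout_seconds(env=os.environ):
    """Read the configured budget, falling back to the default."""
    return int(env.get("RM_JOB_TIMEOUT_SECONDS", DEFAULT_TIMEOUT_SECONDS))


def fits_budget(durations_seconds, budget_seconds):
    """True if the endpoints' combined duration stays under the budget."""
    return sum(durations_seconds) < budget_seconds


# Three access endpoints measured at 90 s, 120 s, and 60 s (270 s in total).
print(fits_budget([90, 120, 60], job_timeout_seconds({})))
```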

Response Body Format

The API responses for each request type must follow specific formatting requirements to ensure the RM Agent can properly process the data.

Access Requests

The response body of an access request must be an array of objects. The value of each key can be of any datatype, including arrays and nested objects. Each object in the array will be converted into a separate file for your data subject.

The environment variable LOGLEVEL can be set to DEBUG to get more detailed feedback if responses are malformed. Be aware that this level of logging has the potential to expose sensitive data.

Example Access Response Body
[
  {
    "first_name": "Peter",
    "last_name": "Gibbons",
    "company": "Initech",
    "friends": ["Samir", "Michael"]
  },
  {
    "address_type": "Home",
    "address": {
      "street": "191 N. Lamar",
      "city": "Flander",
      "state": "Illinois",
      "zip_code": "77070"
    }
  }
]

The above example would generate two files for the data subject:

report_1.csv
first_name, Peter
last_name, Gibbons
company, Initech
friends_0, Samir
friends_1, Michael

report_2.csv
address_type, Home
address_street, 191 N. Lamar
address_city, Flander
address_state, Illinois
address_zip_code, 77070
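The flattening shown above can be approximated with a short recursive function. This is a sketch inferred from the example output, not the RM Agent's actual implementation: it assumes nested keys and array indexes are joined with an underscore, and the `flatten` name is hypothetical.

```python
def flatten(value, prefix=""):
    """Yield (key, value) pairs, joining nested keys and list indexes with '_'."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{prefix}_{key}" if prefix else str(key))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from flatten(child, f"{prefix}_{index}" if prefix else str(index))
    else:
        yield prefix, value


response = [
    {"first_name": "Peter", "friends": ["Samir", "Michael"]},
    {"address_type": "Home", "address": {"street": "191 N. Lamar"}},
]

# One report per object in the array, one row per flattened key.
for number, obj in enumerate(response, start=1):
    print(f"report_{number}.csv")
    for key, val in flatten(obj):
        print(f"{key}, {val}")
```

Running this against your own sample responses is a quick way to preview the column names a data subject would see.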

Deletion Requests

The status code is the only signal used to determine if a deletion request was successful. A response body can be included for logging purposes but will not propagate to the DataGrail platform.

Test Requests

The status code is the only signal used to determine if the API is available and healthy. A response body can be included for logging purposes but will not propagate to the DataGrail platform.

Identifier Requests

The response body of an identifier retrieval request must be an array of objects. The key must be the snake_case Identifier Category name, and the value must be the identifier value.

Response Body for Identifier Under User ID Category
[
  {
    "user_id": "3ef6159b-a523-4ae4-a2b8-6b3ddedf1ab4"
  }
]

Identifier Name

The name of the field in the configuration is the name of the identifier in DataGrail in "snake_case". For example, User ID in DataGrail would be user_id in the configuration. Learn more about identifier setup.
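For simple names, the conversion can be sketched as below. The `to_snake_case` helper is hypothetical, and it assumes the rule is "lowercase the words and join them with underscores"; how DataGrail normalizes names containing punctuation or mixed case is not stated here, so confirm unusual names against your identifier setup.

```python
import re


def to_snake_case(name):
    """Lowercase the name and join its alphanumeric words with underscores."""
    words = re.findall(r"[A-Za-z0-9]+", name)
    return "_".join(word.lower() for word in words)


print(to_snake_case("User ID"))
```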

Connecting with RM Agent

To configure the API Proxy integration, you'll create an Agent Query Configuration object in the DataGrail application that defines how the RM Agent should interact with your API. This configuration includes the endpoint URLs, authentication headers, request bodies, and response handling for each privacy request type.

Each API endpoint you configure uses the APIProxyQuery schema below to specify the connection details and expected behavior.

Add the Agent Integration

  1. In DataGrail, navigate to Agents and select your Agent.
  2. In the top right, select Add New Integration and search for API Proxy.
  3. Under Enabled Capabilities and Enabled Identifiers, select only those that will be used for this integration.
  4. Enter the Credentials Location (e.g. AWS Secrets Manager ARN).
  5. Select the Data Retrieval behavior for deletion requests.
    Warning: When using Retrieve Data, the data reviewed may not exactly match the data that is deleted, because the access and deletion logic execute separately.

  6. Under Agent Query Configuration, add request logic to be executed within API Proxy for all enabled request types.
  7. Finally, select Configure Integration. Wait a few moments to ensure that the connection is successful. For failed connections, review the Agent container logs for additional details.

Understanding String Interpolation

Before configuring your API queries, it's important to understand how string interpolation works. The schema supports using curly braces {} to dynamically insert values at runtime in the url, headers, and body fields.

This allows you to reference:

  • Identifiers from DataGrail (e.g., {email}, {user_id})
  • Credentials from your secrets manager (e.g., {api_key}, {credentials})

The examples below show how to use string interpolation in each field type.

APIProxyQuery Schema

When configuring your API endpoints, you'll define a queries array where each query object follows the APIProxyQuery schema below. Each query represents a single API endpoint that the RM Agent will call for a specific request type (access, deletion, identifier retrieval, or test).

Field (type, requirement)

url (string, required)
  The full URL of the API endpoint.

headers (object(Headers), optional)
  The headers to include in the API request.

body (string, optional)
  The body to include in the API request.

verb (string, required)
  The HTTP method for the API request. All HTTP methods are supported.

verify_ssl (string, required)
  Determines whether to verify the SSL certificate of the URL. Accepted values are "true" or "false".

valid_response_codes ([]integer, required)
  Status codes that the Agent should consider successful. The default value is [200].
  Note: An instance of a data subject not existing should be handled the same as a successful request. If the API returns 404, for example, 404 should be added to the array.

fail_retry_response_codes ([]integer, required)
  Status codes that the Agent should consider unsuccessful and should be retried.

Headers

Field (type)

<header_name> (string)
  The value of the header.
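Before pasting a query into the Agent Query Configuration, it can help to check it against the field list above. The validator below is a local sketch, not a DataGrail tool: the `validate_query` name and its messages are hypothetical, and it checks only the field names, types, and the "true"/"false" constraint described in this section. The sample query reuses the URL and header patterns from this page with an assumed Bearer token.

```python
REQUIRED_FIELDS = {
    "url": str,
    "verb": str,
    "verify_ssl": str,
    "valid_response_codes": list,
    "fail_retry_response_codes": list,
}
OPTIONAL_FIELDS = {"headers": dict, "body": str}


def validate_query(query):
    """Return a list of problems found in one query object (empty if none)."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in query:
            problems.append(f"missing required field: {field}")
        elif not isinstance(query[field], expected):
            problems.append(f"{field} must be of type {expected.__name__}")
    for field, expected in OPTIONAL_FIELDS.items():
        if field in query and not isinstance(query[field], expected):
            problems.append(f"{field} must be of type {expected.__name__}")
    # verify_ssl is a string flag, not a boolean.
    if query.get("verify_ssl") not in (None, "true", "false"):
        problems.append('verify_ssl must be "true" or "false"')
    for field in ("valid_response_codes", "fail_retry_response_codes"):
        codes = query.get(field)
        if isinstance(codes, list) and not all(isinstance(c, int) for c in codes):
            problems.append(f"{field} must contain only integers")
    return problems


access_query = {
    "url": "https://api.acme.com/v2/data-subject-request?email={email}",
    "headers": {"Authorization": "Bearer {api_key}"},
    "verb": "GET",
    "verify_ssl": "true",
    # 404 is listed so that a missing data subject counts as success.
    "valid_response_codes": [200, 404],
    "fail_retry_response_codes": [500, 502, 503],
}
print(validate_query(access_query))  # prints []
```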

String Interpolation Examples

The following examples demonstrate how to use string interpolation in different parts of your query configuration.

URL

URLs can support query parameters, which can be used to pass identifiers or other values. For example, if you want to pass an email address as a query parameter, you can use {email} in the URL.

URL
"url": "https://api.acme.com/v2/data-subject-request?email={email}"

Headers

Headers use substitutions to pass credentials from your credentials manager. Store each credential as a key-value pair in the credentials manager; at runtime, the value is substituted wherever its key appears in curly braces in the header. For example, if a header requires Basic Authentication, you can store the credentials in your credentials manager and reference them in the header like this:

Headers
"headers": {
  "Authorization": "Basic {credentials}"
}

Credentials to authenticate API requests will be stored in JSON format in your credentials manager. The key-value pairs in the secret will be dictated by substitutions you need to make in your headers. If the API uses Basic Authentication, like the above example, the credentials should be stored as the following key-value pair:

Credentials Manager Secret
{ "credentials": "<base64 encoded username:password>" }
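The encoded value can be produced with Python's standard library. This is a small sketch: the `basic_auth_secret` helper name and the sample username and password are placeholders, and the resulting JSON is what you would store as the secret.

```python
import base64
import json


def basic_auth_secret(username, password):
    """Build the secret JSON for a Basic Authentication header."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return json.dumps({"credentials": token})


print(basic_auth_secret("user", "pass"))  # prints {"credentials": "dXNlcjpwYXNz"}
```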

Body

The request body can support any content type. If using JSON, the body must be formatted as a JSON string. Use single curly braces { } for substitutions and double curly braces {{ }} to escape literal braces, such as the braces of a JSON object.

Body
"body": "{{\"email\": \"{email}\"}}"
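Python's `str.format` follows the same single-brace and double-brace convention, so it can be used to preview how a template renders. The Agent's actual templating engine is not named in this document, so treat this as an approximation; the sample values are placeholders, and no URL encoding is applied here.

```python
import json

# Templates as they appear in the examples on this page.
url_template = "https://api.acme.com/v2/data-subject-request?email={email}"
header_template = "Basic {credentials}"
body_template = "{{\"email\": \"{email}\"}}"

# Placeholder values standing in for a DataGrail identifier and a stored secret.
values = {"email": "peter@example.com", "credentials": "dXNlcjpwYXNz"}

url = url_template.format(**values)
header = header_template.format(**values)
body = body_template.format(**values)

print(url)
print(header)
print(body)  # prints {"email": "peter@example.com"}

# The rendered body is valid JSON because the literal braces were escaped.
assert json.loads(body) == {"email": "peter@example.com"}
```

If the outer braces were written as single braces, they would be read as a substitution instead of the start of a JSON object, which is why the escape is needed.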

Test Query Configuration

Test Query Required

The API Proxy connector requires at least one test query for health checks to determine that the API is available and healthy. Test queries follow the same APIProxyQuery schema as other request types.

Test queries are used by the RM Agent to verify connectivity and availability before processing privacy requests. Configure test queries to call a lightweight endpoint that confirms the API is reachable and properly authenticated.

Example test query:

{
  "url": "https://api.acme.com/health",
  "headers": {
    "Authorization": "Bearer {api_key}"
  },
  "verb": "GET",
  "verify_ssl": "true",
  "valid_response_codes": [200],
  "fail_retry_response_codes": [500, 502, 503]
}
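The two status code lists in the example can be read as a three-way decision. The sketch below is an interpretation, not the Agent's code: the `classify_response` name is hypothetical, and it assumes that a code appearing in neither list is treated as a failure that is not retried, which this document does not specify.

```python
test_query = {
    "url": "https://api.acme.com/health",
    "verb": "GET",
    "verify_ssl": "true",
    "valid_response_codes": [200],
    "fail_retry_response_codes": [500, 502, 503],
}


def classify_response(status_code, query):
    """Map a status code to 'success', 'retry', or 'failure' for one query."""
    if status_code in query["valid_response_codes"]:
        return "success"
    if status_code in query["fail_retry_response_codes"]:
        return "retry"
    # Assumption: codes in neither list fail without a retry.
    return "failure"


for code in (200, 503, 401):
    print(code, classify_response(code, test_query))
```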

Troubleshooting

If you are unable to successfully connect the integration, review these common troubleshooting steps:

Agent Unable to Connect to API Proxy

  1. Verify that the network is configured to allow the Agent to connect with the API Proxy instance.
  2. Verify that the Agent has permissions to access the API Proxy credentials stored in your vault.

Agent is Not Connected in DataGrail

  1. Confirm that the Agent is running and that its logs do not indicate any errors.
  2. Confirm that the DataGrail API Key used by the Agent is valid and not expired.
  3. Confirm that the Agent has permissions to access the DataGrail API Key stored in your vault.
  4. Confirm that network egress is permitted from the Agent to your DataGrail domain.

Technical Details

Access Type: Synchronous
Deletion Type: Synchronous (Whole Record)
Opt Out Type: Synchronous

 

Need help?
If you have any questions, please reach out to your dedicated Account Manager or contact us at support@datagrail.io.

Disclaimer: The information contained in this message does not constitute legal advice. We advise seeking professional counsel before acting on or interpreting any material.