Integrating API Proxy
Capabilities
DataGrail's API Proxy integration provides the following capabilities:
| Product | Capability |
|---|---|
| Request Manager | |
Before You Start
The API Proxy integration enables you to connect custom or proprietary APIs to DataGrail by routing privacy requests through your infrastructure. To ensure successful integration, the APIs you connect must follow a flexible API contract that defines basic requirements for authentication, request/response patterns, and data formatting.
Requirements Overview
Review the requirements below to verify your API meets the necessary contract before proceeding with the integration.
Authentication
The API must support static, token-based authentication methods where credentials remain constant and do not require dynamic refresh or renewal. Supported authentication methods include:
- API Keys - Static keys passed in headers or query parameters
- Bearer Tokens - Long-lived tokens included in Authorization headers
- Basic Authentication - Username and password combinations encoded in headers
- Custom Headers - Static tokens or secrets passed in custom header fields
Authentication credentials must be stored in your credentials manager and referenced through string interpolation in your API configuration.
Request/Response Flow
The API must support a synchronous request/response pattern where the integration sends a request and receives a complete response over a single HTTP connection. The response must contain all necessary data or confirmation of the operation's success or failure.
Key requirements:
- Synchronous responses - The API must return results immediately within the request lifecycle (no polling, callbacks, or webhooks)
- Single connection - All operations must complete over one open HTTP connection without requiring follow-up requests
- Stateless requests - Each request must be independent and self-contained, performing deletion, identifier retrieval, or data access without relying on context from previous requests
- Complete responses - The response body must include all relevant data for the operation (for access requests) or appropriate status codes indicating success or failure (for deletion/test requests)
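As a rough illustration of the requirements above, here is a minimal sketch of an endpoint that answers an access request synchronously and statelessly over a single connection. It uses only Python's standard library. The `RECORDS` store, the `lookup` helper, the `email` query parameter, and the port are all hypothetical, and authentication is omitted for brevity; a real API would check its static credentials before responding.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Hypothetical in-memory store standing in for a real backend.
RECORDS = {
    "peter@initech.com": [{"first_name": "Peter", "last_name": "Gibbons"}],
}


def lookup(email):
    """Return (status_code, body) for one self-contained access request."""
    records = RECORDS.get(email)
    if records is None:
        # Unknown data subject: report 404 with an empty array.
        return 404, []
    return 200, records


class AccessHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Everything needed to answer is carried by this one request.
        params = parse_qs(urlparse(self.path).query)
        email = params.get("email", [""])[0]
        status, body = lookup(email)
        payload = json.dumps(body).encode("utf-8")
        # The complete result is returned in the same response (no polling).
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Start the sketch server; call this manually to try it out."""
    HTTPServer(("127.0.0.1", port), AccessHandler).serve_forever()
```

In this sketch an unknown data subject produces a 404, so an integration calling it would need 404 listed among its successful status codes.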
Request Timeout
All API requests for a given query type must be completed in under 5 minutes by default. For example, if three endpoints are configured for access requests, then all three must be completed in under 5 minutes (cumulative). The timeout can be configured by setting the RM_JOB_TIMEOUT_SECONDS environment variable in the container.
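Because the limit is cumulative, it can help to sanity-check expected endpoint durations against the budget before configuring several endpoints for one request type. The sketch below is a planning aid only, not how the Agent enforces the timeout. It assumes `RM_JOB_TIMEOUT_SECONDS` holds a whole number of seconds; the helper names and the duration estimates are hypothetical.

```python
import os

DEFAULT_TIMEOUT_SECONDS = 300  # the documented 5-minute default


def job_timeout_seconds(env=os.environ):
    """Read RM_JOB_TIMEOUT_SECONDS, falling back to the documented default."""
    return int(env.get("RM_JOB_TIMEOUT_SECONDS", DEFAULT_TIMEOUT_SECONDS))


def fits_budget(endpoint_seconds, env=os.environ):
    """True when the summed worst-case durations stay under the timeout."""
    return sum(endpoint_seconds) < job_timeout_seconds(env)


# Three access endpoints estimated at 90 seconds each: 270 < 300.
print(fits_budget([90, 90, 90], env={}))
# Three endpoints at 120 seconds each exceed the default budget.
print(fits_budget([120, 120, 120], env={}))
```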
Response Body Format
The API responses for each request type must follow specific formatting requirements to ensure the RM Agent can properly process the data.
Access Requests
The response body of an access request must be an array of objects. The value of each key can be of any datatype, including arrays and nested objects. Each object in the array will be converted into a separate file for your data subject.
The environment variable LOGLEVEL can be set to DEBUG to get more detailed feedback if responses are malformed. Be aware that this level of logging has the potential to expose sensitive data.
[
{
"first_name": "Peter",
"last_name": "Gibbons",
"company": "Initech",
"friends": ["Samir", "Michael"]
},
{
"address_type": "Home",
"address": {
"street": "191 N. Lamar",
"city": "Flander",
"state": "Illinois",
"zip_code": "77070"
}
}
]
The above example would generate two files for the data subject:
first_name, Peter
last_name, Gibbons
company, Initech
friends_0, Samir
friends_1, Michael

address_type, Home
address_street, 191 N. Lamar
address_city, Flander
address_state, Illinois
address_zip_code, 77070
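The flattening shown above can be approximated with a short recursive function, which is handy for previewing what a response will look like before connecting the API. This is a sketch inferred from the example output (nested keys joined with `_`, array items suffixed with their index); the `flatten` helper is hypothetical and is not the RM Agent's actual implementation.

```python
def flatten(value, prefix=""):
    """Flatten nested dicts and lists into (key, value) rows."""
    rows = []
    if isinstance(value, dict):
        for key, child in value.items():
            rows.extend(flatten(child, f"{prefix}_{key}" if prefix else key))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            rows.extend(flatten(child, f"{prefix}_{index}" if prefix else str(index)))
    else:
        rows.append((prefix, value))
    return rows


response = [
    {
        "first_name": "Peter",
        "last_name": "Gibbons",
        "company": "Initech",
        "friends": ["Samir", "Michael"],
    },
    {
        "address_type": "Home",
        "address": {
            "street": "191 N. Lamar",
            "city": "Flander",
            "state": "Illinois",
            "zip_code": "77070",
        },
    },
]

# One list of rows per object, mirroring one file per object.
files = [flatten(obj) for obj in response]
for rows in files:
    for key, value in rows:
        print(f"{key}, {value}")
    print()
```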
Deletion Requests
The status code is the only signal used to determine if a deletion request was successful. A response body can be included for logging purposes but will not propagate to the DataGrail platform.
Test Requests
The status code is the only signal used to determine if the API is available and healthy. A response body can be included for logging purposes but will not propagate to the DataGrail platform.
Identifier Requests
The response body of an identifier retrieval request must be an array of objects. The key must be the snake_case Identifier Category name, and the value must be the identifier value.
[
{
"user_id": "3ef6159b-a523-4ae4-a2b8-6b3ddedf1ab4"
}
]
The name of the field in the configuration is the name of the identifier in DataGrail in "snake_case". For example, User ID in DataGrail would be user_id in the configuration. Learn more about identifier setup.
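To check an identifier response before wiring it up, a small validator along these lines may help. It is a sketch under assumptions: the only conversion confirmed above is User ID to user_id, so the general rule used here (lowercase, with runs of non-alphanumeric characters replaced by underscores) is a guess, and both helper names are hypothetical.

```python
import re


def to_snake_case(name):
    """Assumed conversion: lowercase, non-alphanumeric runs become underscores."""
    return re.sub(r"[^a-z0-9]+", "_", name.strip().lower()).strip("_")


def validate_identifier_response(body, category_names):
    """Return a list of problems; an empty list means the shape looks right."""
    allowed = {to_snake_case(name) for name in category_names}
    if not isinstance(body, list):
        return ["response body must be an array"]
    problems = []
    for position, item in enumerate(body):
        if not isinstance(item, dict):
            problems.append(f"item {position} must be an object")
            continue
        for key in item:
            if key not in allowed:
                problems.append(f"item {position} has unexpected key: {key}")
    return problems


body = [{"user_id": "3ef6159b-a523-4ae4-a2b8-6b3ddedf1ab4"}]
print(validate_identifier_response(body, ["User ID"]))
```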
Connecting with RM Agent
To configure the API Proxy integration, you'll create an Agent Query Configuration object in the DataGrail application that defines how the RM Agent should interact with your API. This configuration includes the endpoint URLs, authentication headers, request bodies, and response handling for each privacy request type.
Each API endpoint you configure uses the APIProxyQuery schema below to specify the connection details and expected behavior.
Add the Agent Integration
- In DataGrail, navigate to Agents and select your Agent.
- In the top right, select Add New Integration and search for API Proxy.
- Under Enabled Capabilities and Enabled Identifiers, select only those that will be used for this integration.
- Enter the Credentials Location (e.g. AWS Secrets Manager ARN).
- Select the Data Retrieval behavior for deletion requests.
Warning: When using Retrieve Data, the data reviewed may not exactly match what is deleted, because the access and deletion logic execute separately.
- Under Agent Query Configuration, add request logic to be executed within API Proxy for all enabled request types.
- Finally, select Configure Integration. Wait a few moments to ensure that the connection is successful. For failed connections, review the Agent container logs for additional details.
Understanding String Interpolation
Before configuring your API queries, it's important to understand how string interpolation works. The schema supports using curly braces {} to dynamically insert values at runtime in the url, headers, and body fields.
This allows you to reference:
- Identifiers from DataGrail (e.g., {email}, {user_id})
- Credentials from your secrets manager (e.g., {api_key}, {credentials})
The examples below show how to use string interpolation in each field type.
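As a mental model, the brace rules resemble Python's `str.format`: a single pair of braces marks a substitution, and doubled braces produce a literal brace (which matters for JSON bodies). The sketch below assumes those semantics purely for illustration; the Agent's actual templating is not specified here, and the `render` helper and the sample values are hypothetical.

```python
import json

query = {
    "url": "https://api.acme.com/v2/data-subject-request?email={email}",
    "headers": {"Authorization": "Basic {credentials}"},
    # Doubled braces are literal JSON braces; {email} is a substitution.
    "body": "{{\"email\": \"{email}\"}}",
}

# Hypothetical runtime values: one identifier and one stored credential.
values = {"email": "peter@initech.com", "credentials": "dXNlcjpwYXNz"}


def render(query, values):
    """Substitute values into the url, headers, and body of one query."""
    rendered = {
        "url": query["url"].format(**values),
        "headers": {
            name: value.format(**values)
            for name, value in query.get("headers", {}).items()
        },
    }
    if "body" in query:
        rendered["body"] = query["body"].format(**values)
    return rendered


rendered = render(query, values)
print(rendered["url"])
print(rendered["headers"]["Authorization"])
print(json.loads(rendered["body"]))
```

Parsing the rendered body with `json.loads` is a quick way to confirm that the escaping produced valid JSON.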
APIProxyQuery Schema
When configuring your API endpoints, you'll define a queries array where each query object follows the APIProxyQuery schema below. Each query represents a single API endpoint that the RM Agent will call for a specific request type (access, deletion, identifier retrieval, or test).
| Field | Description |
|---|---|
| url | string (required) The full URL of the API endpoint. |
| headers | object(Headers) (optional) The headers to include in the API request. |
| body | string (optional) The body to include in the API request. |
| verb | string (required) The HTTP method for the API request. All HTTP methods are supported. |
| verify_ssl | string (required) Determines whether to verify the SSL certificate of the URL. Accepted values are "true" or "false". |
| valid_response_codes[] | integer (required) Status codes that the Agent should consider successful. The default value is [200]. Note: An instance of a data subject not existing should be handled the same as a successful request. If the API returns 404, for example, 404 should be added to the array. |
| fail_retry_response_codes[] | integer (required) Status codes that the Agent should consider unsuccessful and should be retried. |
Headers
| Field | Description |
|---|---|
| <header_name> | string The value of the header. |
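One way to reason about the two status code arrays is as a three-way classification of each response. The sketch below is an interpretation, not the Agent's documented logic: it assumes that a code listed in neither array counts as a failure that is not retried, and that `valid_response_codes` takes precedence if a code appears in both. The `classify` helper and the sample query are hypothetical.

```python
def classify(status_code, query):
    """Classify a response as 'success', 'retry', or 'failure'."""
    if status_code in query.get("valid_response_codes", [200]):
        return "success"
    if status_code in query.get("fail_retry_response_codes", []):
        return "retry"
    # Assumption: codes listed in neither array fail without a retry.
    return "failure"


deletion_query = {
    # 404 is listed so that a missing data subject counts as success.
    "valid_response_codes": [200, 204, 404],
    "fail_retry_response_codes": [500, 502, 503],
}

for code in (204, 404, 503, 401):
    print(code, classify(code, deletion_query))
```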
String Interpolation Examples
The following examples demonstrate how to use string interpolation in different parts of your query configuration.
URL
URLs can support query parameters, which can be used to pass identifiers or other values. For example, if you want to pass an email address as a query parameter, you can use {email} in the URL.
"url": "https://api.acme.com/v2/data-subject-request?email={email}"
Headers
Headers use substitutions to pass credentials from your credentials manager. The credentials manager will be configured to store the credentials in a key-value pair, and the value will be substituted into the header. For example, if you have a header that requires Basic Authentication, you can store the credentials in your credentials manager and reference them in the header like this:
"headers": {
"Authorization": "Basic {credentials}"
}
Credentials to authenticate API requests will be stored in JSON format in your credentials manager. The key-value pairs in the secret will be dictated by substitutions you need to make in your headers. If the API uses Basic Authentication, like the above example, the credentials should be stored as the following key-value pair:
{ "credentials": "<base64 encoded username:password>" }
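The encoded value for that secret can be produced with a few lines of Python. This sketch uses the standard Basic Authentication encoding (base64 of `username:password`); the `basic_credentials_secret` helper and the sample username and password are hypothetical.

```python
import base64
import json


def basic_credentials_secret(username, password):
    """Build the JSON secret holding a base64-encoded username:password."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return json.dumps({"credentials": token})


# Store the printed JSON as the secret value in your credentials manager.
print(basic_credentials_secret("user", "pass"))
```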
Body
The request body can support any content type. If using JSON, the body must be formatted as a JSON string. Use single curly braces { } for substitutions and double curly braces {{ }} wherever a literal brace is needed, such as the braces of a JSON object.
"body": "{{\"email\": \"{email}\"}}"
Test Query Configuration
The API Proxy connector requires at least one test query for health checks to determine that the API is available and healthy. Test queries follow the same APIProxyQuery schema as other request types.
Test queries are used by the RM Agent to verify connectivity and availability before processing privacy requests. Configure test queries to call a lightweight endpoint that confirms the API is reachable and properly authenticated.
Example test query:
{
"url": "https://api.acme.com/health",
"headers": {
"Authorization": "Bearer {api_key}"
},
"verb": "GET",
"verify_ssl": "true",
"valid_response_codes": [200],
"fail_retry_response_codes": [500, 502, 503]
}
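Before pasting a query into the Agent Query Configuration, a quick structural check against the schema can catch missing fields. The sketch below encodes the required fields and accepted values from the APIProxyQuery table; the `validate_query` helper is hypothetical and is not part of DataGrail's tooling.

```python
REQUIRED_FIELDS = {
    "url": str,
    "verb": str,
    "verify_ssl": str,
    "valid_response_codes": list,
    "fail_retry_response_codes": list,
}


def validate_query(query):
    """Return a list of problems found in one APIProxyQuery-style dict."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in query:
            problems.append(f"missing required field: {field}")
        elif not isinstance(query[field], expected_type):
            problems.append(f"{field} should be of type {expected_type.__name__}")
    # verify_ssl is a string in the schema, not a boolean.
    if query.get("verify_ssl") not in (None, "true", "false"):
        problems.append('verify_ssl must be "true" or "false"')
    for field in ("valid_response_codes", "fail_retry_response_codes"):
        codes = query.get(field, [])
        if isinstance(codes, list) and not all(isinstance(c, int) for c in codes):
            problems.append(f"{field} must contain only integers")
    return problems


test_query = {
    "url": "https://api.acme.com/health",
    "headers": {"Authorization": "Bearer {api_key}"},
    "verb": "GET",
    "verify_ssl": "true",
    "valid_response_codes": [200],
    "fail_retry_response_codes": [500, 502, 503],
}

print(validate_query(test_query))
```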
Troubleshooting
If you are unable to successfully connect the integration, review these common troubleshooting steps:
Agent Unable to Connect to API Proxy
- Verify that the network is configured to allow the Agent to connect with the API Proxy instance.
- Verify the Agent has permissions to access the API Proxy credentials stored in your vault.
Agent is Not Connected in DataGrail
- Confirm that the Agent is running and that its logs do not indicate any errors.
- Confirm that the DataGrail API Key used by the Agent is valid and has not expired.
- Confirm that the Agent has permissions to access the DataGrail API Key stored in your vault.
- Confirm that network egress is permitted from the Agent to your DataGrail domain.
Technical Details
| Access Type | Synchronous |
|---|---|
| Deletion Type | Synchronous (Whole Record) |
| Opt Out Type | Synchronous |
Disclaimer: The information contained in this message does not constitute legal advice. We advise seeking professional counsel before acting on or interpreting any material.