Internal Systems API Specification

The Internal Systems API specification defines how to build consistent, scalable, and secure REST APIs for an organization's internal systems, such as databases, data warehouses, unstructured data stores, or homegrown applications. These APIs provide the interface that the DataGrail Platform uses to process privacy requests.

Introduction

This specification establishes the guidelines your REST API should follow so that interfaces are implemented consistently, adhering to the common endpoints and methods designed by DataGrail. It relies on current security practices and communication patterns to minimize the risk of exposing internal data.

You may use any programming language and/or frameworks to build the APIs that DataGrail will use to access authorized resources defined in this specification.

This specification is designed to be asynchronous for data access and deletion calls when processing privacy requests in order to ensure the efficient management of multiple simultaneous and potentially long-running requests. Identifier lookups and opt-out requests are synchronous due to workflow considerations.

By following this specification, we aim to achieve the following:

  • Define consistent and secure practices and patterns for all API endpoints interacting with DataGrail.
  • Put organizations in control of the interfaces to critical data infrastructure.
  • Make connecting and using DataGrail via REST API for internal systems as easy as any third-party SaaS without compromising data security and integrity.
  • Allow organizations to leverage prior internal work by utilizing consistently defined REST endpoints.
  • Provide optionality and extensibility by default.

Connecting with DataGrail

Security

Given the sensitive nature of the data exchanged to process privacy requests, all interactions with DataGrail must be secured using HTTPS via TLS v1.2 or above.

Authorization

DataGrail supports two authentication methods for granting limited, authorized access to resources exposed via your API: OAuth 2.0 Client Credentials Grant and Token Based.

OAuth 2.0 Client Credentials Grant

DataGrail recommends using this authentication method because it is the most secure option. A token retrieval endpoint is required when connecting DataGrail to the implemented API. When connecting your API to DataGrail, you will need to provide the following:

  • Client ID: Unique ID used to identify requests from DataGrail.
  • Client Secret: Secret for authorizing requests from DataGrail.
  • API Base URL: Base URL where requests will be sent.
  • API Token Endpoint URL: Endpoint that DataGrail uses to initiate the OAuth flow (e.g. /oauth/token).

Example OAuth Authorization Request

When initially connecting using the OAuth authorization method, DataGrail will initiate a client_credentials grant request to your API Base URL plus your API Token Endpoint URL, e.g. https://my_company_name.com/oauth/token.

Example Headers

In the request from DataGrail, the authorization header contains the Basic scheme followed by the base64-encoded string client_id:client_secret. You can decode this value and verify it against the Client ID and Client Secret you configured during the initial connection.

{
"content_type": "application/x-www-form-urlencoded",
"authorization": "Basic <base64 encoded client_id:client_secret>",
"X-Dg-Authtype": "OAuth",
"user_agent": "dgclient/1.0"
}
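
The decode-and-verify step described above can be sketched as follows. This is a minimal illustration, not DataGrail's implementation: the function names parse_basic_credentials and credentials_match are hypothetical, and the constant-time comparison is an added precaution rather than something this specification requires.

```python
import base64
import hmac
from typing import Optional, Tuple


def parse_basic_credentials(authorization: str) -> Optional[Tuple[str, str]]:
    """Decode 'Basic <base64 client_id:client_secret>' into (client_id, client_secret)."""
    scheme, _, encoded = authorization.partition(" ")
    if scheme.lower() != "basic" or not encoded.strip():
        return None
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except ValueError:
        # Covers malformed base64 (binascii.Error) and invalid UTF-8.
        return None
    client_id, separator, client_secret = decoded.partition(":")
    if not separator:
        return None
    return client_id, client_secret


def credentials_match(authorization: str, expected_id: str, expected_secret: str) -> bool:
    """Compare the decoded credentials with the configured ones in constant time."""
    parsed = parse_basic_credentials(authorization)
    if parsed is None:
        return False
    client_id, client_secret = parsed
    id_ok = hmac.compare_digest(client_id.encode(), expected_id.encode())
    secret_ok = hmac.compare_digest(client_secret.encode(), expected_secret.encode())
    return id_ok and secret_ok
```

A token endpoint would typically call credentials_match before issuing a token and respond with 401 Unauthorized when it returns False.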

Example Body

The body will contain the grant_type with a value of client_credentials.

{
"grant_type": "client_credentials"
}

Expected Response

The access_token you return will be used as the authorization Bearer token for all other internal systems requests.

{
"access_token": "<your_access_token>",
"token_type": "Bearer",
"expires_in": "<expiry_time>"
}
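
One way to produce a response of this shape is sketched below. Everything here is an assumption made for illustration: the in-memory store, the one-hour lifetime, and the function names are hypothetical, and expressing expires_in as an integer number of seconds follows the usual OAuth 2.0 convention rather than anything stated in this specification.

```python
import secrets
import time
from typing import Dict, Union

TOKEN_TTL_SECONDS = 3600  # hypothetical lifetime, chosen for the example
_issued_tokens: Dict[str, float] = {}  # token -> expiry time (epoch seconds)


def issue_token(grant_type: str) -> Dict[str, Union[str, int]]:
    """Issue an access token for a client_credentials grant request."""
    if grant_type != "client_credentials":
        # Other OAuth 2.0 grant types are not supported.
        raise ValueError("unsupported grant_type")
    token = secrets.token_urlsafe(32)
    _issued_tokens[token] = time.time() + TOKEN_TTL_SECONDS
    return {
        "access_token": token,
        "token_type": "Bearer",
        "expires_in": TOKEN_TTL_SECONDS,
    }


def token_is_valid(token: str) -> bool:
    """Return True if the token was issued by this process and has not expired."""
    expires_at = _issued_tokens.get(token)
    return expires_at is not None and time.time() < expires_at
```

A production service would normally persist tokens or use signed tokens instead of a process-local dictionary.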

OAuth Token Management Considerations

If you will be using your own OAuth token management with this API, it's important to note the following:

  1. Other OAuth 2.0 grant types are not currently supported.
  2. DataGrail does not require implementation of authorization scopes at the moment. If you implement them, ensure that the appropriate scopes are attached to the token grant.

Token Based

As an alternative to OAuth 2.0 Client Credentials Grant, and because we acknowledge that building such capabilities may be an operational burden for some, DataGrail also accepts the less secure option of a static token-based authentication.

The static token you set here will be used as the authorization Bearer token for all other internal systems requests.

Authorized Requests From DataGrail

Depending on the authorization method, an access_token or static token will be used as the Bearer token for all other internal systems requests. You can verify the Bearer token to authorize the requests. The header uses the following format:

{
"content_type": "json",
"authorization": "Bearer <some_token>",
"X-Dg-Authtype": "Bearer",
"user_agent": "dgclient/1.0"
}
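
Verifying the Bearer token on an incoming request might look like the sketch below. The helper names are hypothetical, header lookup is done case-insensitively as an assumption about how your framework exposes headers, and the example checks against a single expected token (the static-token case).

```python
import hmac
from typing import Mapping, Optional


def extract_bearer_token(headers: Mapping[str, str]) -> Optional[str]:
    """Return the token from an 'Authorization: Bearer <token>' header, if present."""
    for name, value in headers.items():
        if name.lower() == "authorization":
            scheme, _, token = value.partition(" ")
            if scheme.lower() == "bearer" and token.strip():
                return token.strip()
            return None
    return None


def is_authorized(headers: Mapping[str, str], expected_token: str) -> bool:
    """Compare the presented token with the expected one in constant time."""
    token = extract_bearer_token(headers)
    if token is None:
        return False
    return hmac.compare_digest(token.encode(), expected_token.encode())
```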

Environments

All endpoints implemented in this specification are required to use the same base URL.

At least one Production environment is required to receive real privacy requests, and it is recommended to have at least one additional environment for Development and/or Testing to enable testing that does not interact with production data.

DataGrail recommends using the following base URL pattern:

  • Production: https://datagrail_prod.my_company_name.com/
  • Development: https://datagrail_dev.my_company_name.com/

API Overview

Workflow

Overview of a standard privacy data request through an Internal Systems API connection:

[Figure: Internal Systems Workflow]

Versioning

This specification is versioned in major increments, starting with v1. Because the API is designed with optionality and extensibility by default, DataGrail can add functionality and make other non-breaking changes within the existing version, which limits the need to update the version specified.

All endpoints must include the version number embedded in the path of the request URL, at the end of the service root, following this pattern: <base-url>/api/v1.
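
As a small illustration of the path pattern above, the helper below joins a base URL, the version segment, and an endpoint path. The function name endpoint_url is hypothetical.

```python
API_VERSION = "v1"


def endpoint_url(base_url: str, path: str) -> str:
    """Build <base-url>/api/v1/<path>, tolerating stray slashes on either side."""
    return f"{base_url.rstrip('/')}/api/{API_VERSION}/{path.lstrip('/')}"
```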

HTTP Response Codes

The HTTP status codes that are supported include:

Code | Description
200 | OK -- Request was successful.
201 | Created -- Request was successful and resulted in the creation of a resource.
400 | Bad Request -- Request was submitted with invalid formatting or parameters, and the server will not process the request.
401 | Unauthorized -- An invalid authentication signature was provided, or the authentication signature was missing.
403 | Forbidden -- An authorization signature was provided without permission to access a resource.
405 | Method Not Allowed -- Request is not valid for the provided connection.
409 | Conflict -- Indicates a request conflict with the current state of the target resource.
429 | Too Many Requests -- The client (DataGrail) should slow down and try again later.
500 | Internal Server Error -- An internal and unexpected error condition occurred.
501 | Not Implemented -- Request is not supported by this API implementation.
502 | Bad Gateway -- Server received an invalid response from the upstream server.
503 | Service Unavailable -- Server cannot handle the request.
504 | Gateway Timeout -- Server was acting as a gateway or proxy and did not receive a timely response from the upstream server.

Status

All requests require an associated status for tracking progress. The following request statuses are supported:

Status | Description
requested | Indicates that a well-formed request has been received.
processing | Indicates that a request is currently being acted on.
completed | Indicates that a request has been fulfilled.
failed | Indicates that a request has failed to complete. Responses should provide detailed error information regarding why this happened.

Error Responses

Error responses should include the appropriate HTTP status code and, encoded as JSON in the response body, detailed information about what went wrong to facilitate troubleshooting. Place any errors in an errors field at the root of the response object (an array in the example below). An associated message may be included in the message field. An example error response could look like:

{
"status": "failed",
"errors": [{ "error": "info" }],
"message": "Request is invalid."
}
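
A helper that produces error responses of this shape could be sketched as follows. The function name and the (status code, JSON body) return convention are assumptions made for the example, not part of the specification.

```python
import json
from typing import Any, Dict, List, Optional, Tuple


def error_response(
    http_status: int,
    errors: List[Dict[str, Any]],
    message: Optional[str] = None,
) -> Tuple[int, str]:
    """Return an HTTP status code and a JSON body with status, errors, and message."""
    body: Dict[str, Any] = {"status": "failed", "errors": errors}
    if message is not None:
        body["message"] = message
    return http_status, json.dumps(body)
```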

General Endpoints

The endpoints in this section are necessary for DataGrail to understand the health and supported capabilities of the API:

Test Connection

DataGrail will call this endpoint to test that the credentials provided for your API are valid and the service is healthy.

Endpoint

GET /api/v1/hc

Headers

Parameter | Description
Authorization | string. Your bearer token.

Expected Responses

Status Code: 200 OK

{
"status": "completed",
"version": "v1"
}

Status Code: 401 Unauthorized / 403 Forbidden

{
"status": "error",
"message": "Message describing the reason for the error"
}
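
A framework-independent sketch of this endpoint's logic is shown below. The function name, the plain (status code, body) return value, and the error messages are illustrative assumptions. The sketch returns 401 for every failure, and distinguishing 403 is left to the implementation.

```python
import hmac
from typing import Dict, Optional, Tuple


def health_check(authorization: Optional[str], expected_token: str) -> Tuple[int, Dict[str, str]]:
    """Handle GET /api/v1/hc given the raw Authorization header value."""
    if not authorization:
        return 401, {"status": "error", "message": "Missing Authorization header"}
    scheme, _, token = authorization.partition(" ")
    token_ok = hmac.compare_digest(token.strip().encode(), expected_token.encode())
    if scheme.lower() != "bearer" or not token_ok:
        return 401, {"status": "error", "message": "Invalid bearer token"}
    return 200, {"status": "completed", "version": "v1"}
```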

List Available Connections

This endpoint should return a list of internal connections that are available to process requests. Each connection will be represented as an integration in DataGrail.

Endpoint

GET /api/v1/connections/list

Headers

Parameter | Description
Authorization | string. Your bearer token.

Connection Capabilities

Each connection defines the types of requests it is capable of processing at corresponding privacy request endpoints. The following capabilities are currently supported in the capabilities response field:

Capability | Description
privacy/access | Supports processing of access requests.
privacy/delete | Supports processing of deletion requests.
privacy/optout | Supports processing of opt-out requests.
privacy/identifiers | Supports retrieval of identifiers (e.g. lookup user_id by passing email).
capability/multiple-identifiers | All connections require this capability. Identifiers will be sent in Identifier Object Format.

Connection Mode

Connection modes ensure that systems can be adequately tested and validated without interacting with real data subject data, or triggering privacy requests in the DataGrail platform. The following connection modes are supported:

Mode | Description
live | Connection can be used in production with real privacy requests from data subjects.
test | (default) Connection can only be used for development. In this mode, DataGrail will not send privacy requests to the connection.

Expected Success Response

Status Code: 200 OK

{
"results": [
{
"uuid": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
"type": "PostgreSQL",
"name": "Accounts DB",
"mode": "live",
"capabilities": [
"privacy/access",
"privacy/delete",
"privacy/optout",
"capability/multiple-identifiers"
]
}
]
}
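
The response above can be assembled from validated connection records, as in the sketch below. The Connection dataclass and list_connections_response function are hypothetical names. The validation simply enforces the capability and mode values listed in this section, with test as the default mode.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, List

SUPPORTED_CAPABILITIES = {
    "privacy/access",
    "privacy/delete",
    "privacy/optout",
    "privacy/identifiers",
    "capability/multiple-identifiers",
}
SUPPORTED_MODES = {"live", "test"}


@dataclass
class Connection:
    uuid: str
    type: str
    name: str
    capabilities: List[str]
    mode: str = "test"  # the documented default mode

    def __post_init__(self) -> None:
        unknown = set(self.capabilities) - SUPPORTED_CAPABILITIES
        if unknown:
            raise ValueError(f"unsupported capabilities: {sorted(unknown)}")
        if "capability/multiple-identifiers" not in self.capabilities:
            raise ValueError("capability/multiple-identifiers is required")
        if self.mode not in SUPPORTED_MODES:
            raise ValueError(f"unsupported mode: {self.mode}")


def list_connections_response(connections: List[Connection]) -> Dict[str, Any]:
    """Build the body for GET /api/v1/connections/list."""
    return {"results": [asdict(connection) for connection in connections]}
```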

Privacy Request Endpoints

The endpoints in this section are used to receive and support privacy requests from DataGrail:

Identifier Retrieval

DataGrail will call this endpoint to retrieve identifiers related to a privacy request for the passed connection. This endpoint is synchronous and must return results in the response.

Identifiers are verifiable keys that allow the API to uniquely identify a data subject within a specific context, such as a data system or an organization. Data subjects can be linked to multiple identifiers; therefore, the APIs must be able to support an indefinite number of identifiers of different input types.

Additionally, identifiers are linked to categories that allow us to determine which integrations can accept the identifier.

Endpoint

POST /api/v1/privacy/identifiers/:connection-uuid

Headers

Parameter | Description
Authorization | string. Your bearer token.

Path Parameters

Parameter | Description
connection-uuid | UUID. The connection UUID.

Body Parameters

Parameter | Description
identifiers | object (Identifiers). Identifiers of the data subject to action.
request_uuid | UUID. Request in DataGrail.

Expected Success Response

Status Code: 200 OK

The request was accepted and the results are available in the response body.

{
"<name of identifier>": [
{ "<identifier category key>": "<identifier value>" },
...
]
}

Example

{
"phone_number": [{ "phone": "+11234567890" }],
"email": [{ "email": "batman@cave.com" }, { "email": "robin@cave.com" }]
}
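
Building a response in this shape from lookup results could be sketched as follows. The input format (a list of name, category key, and value tuples) and the function name are assumptions chosen for the example. How the lookup itself is performed depends on your internal system.

```python
from typing import Dict, List, Tuple


def build_identifier_response(
    found: List[Tuple[str, str, str]],
) -> Dict[str, List[Dict[str, str]]]:
    """Group (identifier name, category key, value) tuples into the response object."""
    response: Dict[str, List[Dict[str, str]]] = {}
    for name, category_key, value in found:
        entry = {category_key: value}
        bucket = response.setdefault(name, [])
        if entry not in bucket:  # skip exact duplicates
            bucket.append(entry)
    return response
```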

Submit an Access Request

DataGrail will call this endpoint to initiate a privacy access request for the passed connection. This endpoint is asynchronous and should not return results.

When the request has been processed, results should be returned by triggering the DataGrail Webhook Callback. DataGrail will send a request to the Retrieve Results endpoint periodically until results are received.

Endpoint

POST /api/v1/privacy/access/:connection-uuid

Headers

Parameter | Description
Authorization | string. Your bearer token.

Path Parameters

Parameter | Description
connection-uuid | UUID. The connection UUID.

Body Parameters

Parameter | Description
identifiers | object (Identifiers). Identifiers of the data subject to action.
results_token | string. Hexadecimal string (length: 32) used to retrieve the results of this request.
request_uuid | UUID. Request in DataGrail.
callback_path | string. URL path to use for the Webhook Callback. To construct the full URL, append the callback_path to your customer domain. Begins with a forward slash.

Expected Success Response

Status Code: 200 OK

The request was accepted and the process to respond to the request will be initiated.

{
"status": "processing"
}
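
The asynchronous accept step can be sketched as below: validate the body, record the job, and return immediately with a processing status. The in-memory jobs dictionary, the function name, and the specific validation messages are illustrative assumptions. A real implementation would hand the job to a queue or worker.

```python
import re
from typing import Any, Dict, List, Tuple

RESULTS_TOKEN_RE = re.compile(r"[0-9a-fA-F]{32}")  # hexadecimal string of length 32
jobs: Dict[str, Dict[str, Any]] = {}  # hypothetical job store keyed by results_token


def accept_access_request(connection_uuid: str, body: Dict[str, Any]) -> Tuple[int, Dict[str, Any]]:
    """Handle POST /api/v1/privacy/access/:connection-uuid without doing the work inline."""
    token = body.get("results_token")
    callback_path = body.get("callback_path")
    identifiers = body.get("identifiers")

    problems: List[Dict[str, str]] = []
    if not isinstance(token, str) or not RESULTS_TOKEN_RE.fullmatch(token):
        problems.append({"results_token": "must be a 32-character hexadecimal string"})
    if not isinstance(callback_path, str) or not callback_path.startswith("/"):
        problems.append({"callback_path": "must begin with a forward slash"})
    if not isinstance(identifiers, dict):
        problems.append({"identifiers": "must be an object"})
    if problems:
        return 400, {"status": "failed", "errors": problems, "message": "Request is invalid."}

    jobs[token] = {
        "connection_uuid": connection_uuid,
        "request_uuid": body.get("request_uuid"),
        "identifiers": identifiers,
        "callback_path": callback_path,
        "status": "processing",
    }
    return 200, {"status": "processing"}
```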

Submit a Deletion Request

DataGrail will call this endpoint to initiate a privacy deletion request for the passed connection. This endpoint is asynchronous and should not return results.

When the request has been processed, results should be returned by triggering the DataGrail Webhook Callback. DataGrail will send a request to the Retrieve Results endpoint periodically until results are received.

Endpoint

POST /api/v1/privacy/delete/:connection-uuid

Headers

Parameter | Description
Authorization | string. Your bearer token.

Path Parameters

Parameter | Description
connection-uuid | UUID. The connection UUID.

Body Parameters

Parameter | Description
identifiers | object (Identifiers). Identifiers of the data subject to action.
results_token | string. Hexadecimal string (length: 32) used to retrieve the results of this request.
request_uuid | UUID. Request in DataGrail.
callback_path | string. URL path to use for the Webhook Callback. To construct the full URL, append the callback_path to your customer domain. Begins with a forward slash.

Expected Success Response

Status Code: 200 OK

The request was accepted and the process to respond to the request will be initiated.

{
"status": "processing"
}

Submit an Opt-Out Request

DataGrail will call this endpoint to initiate an opt-out/Do Not Sell/Share request for the passed connection. This endpoint is synchronous and will return the success/failure of the operation.

Endpoint

POST /api/v1/privacy/optout/:connection-uuid

Headers

Parameter | Description
Authorization | string. Your bearer token.

Path Parameters

Parameter | Description
connection-uuid | UUID. The connection UUID.

Body Parameters

Parameter | Description
identifiers | object (Identifiers). Identifiers of the data subject to action.

Ignore Additional Parameters

The body also contains additional parameters not listed above. They can be ignored because they are not applicable to this request type.

Expected Success Response

Status Code: 200 OK

The request was completed.

{
"status": "completed"
}

Retrieve Results

DataGrail uses this endpoint to periodically check the status of a request. DataGrail only guarantees polling this endpoint once every 15 minutes for a default period of 3 days.

Endpoint

POST /api/v1/results/retrieve

Headers

Parameter | Description
Authorization | string. Your bearer token.

Body Parameters

Parameter | Description
results_token | string. Hexadecimal string (length: 32) used to retrieve the results of this request.
callback_path | string. URL path to use for the Webhook Callback. To construct the full URL, append the callback_path to your customer domain. Begins with a forward slash.

Expected Success Response

Status Code: 200 OK

{
"status": "completed"
}
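
A status lookup for this endpoint might be sketched as follows. The jobs mapping keyed by results_token is a hypothetical store. The choice of 400 for an unknown token is an assumption, since the specification does not say how that case should be reported.

```python
from typing import Any, Dict, Mapping, Tuple

ALLOWED_STATUSES = {"requested", "processing", "completed", "failed"}


def retrieve_results(
    body: Mapping[str, Any],
    jobs: Mapping[str, Mapping[str, Any]],
) -> Tuple[int, Dict[str, Any]]:
    """Handle POST /api/v1/results/retrieve by reporting the stored job status."""
    token = body.get("results_token")
    job = jobs.get(token) if isinstance(token, str) else None
    if job is None:
        return 400, {
            "status": "failed",
            "errors": [{"results_token": "unknown"}],
            "message": "Unknown results_token.",
        }
    status = job.get("status")
    if status not in ALLOWED_STATUSES:
        return 500, {
            "status": "failed",
            "errors": [{"status": "job has an unexpected status"}],
            "message": "Internal error.",
        }
    return 200, {"status": status}
```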

DataGrail Webhook Callback

When privacy requests are completed, make a request to DataGrail with the status and results.

This webhook should be triggered immediately when the processing of the request has been completed to avoid unnecessary polling of the Retrieve Results endpoint.

Endpoint

To construct the full URL, append the callback_path to your customer domain:

POST https://<customer>.datagrail.io/api/v1/data-request-callback

Headers

Use the following headers in the request to DataGrail:

{
"Content-Type": "application/json",
"Accept": "application/json",
"Authorization": "Bearer <token supplied by DataGrail>"
}

Ask your DataGrail representative for a pre-registered authentication token.
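
Constructing the callback request with the headers above could look like the sketch below, which uses only the Python standard library. The function name and its parameters are hypothetical, and customer_domain stands for your <customer>.datagrail.io host. The request is only built here; sending it (for example with urllib.request.urlopen) and handling retries are left out.

```python
import json
import urllib.request
from typing import Any, Dict


def build_callback_request(
    customer_domain: str,
    callback_path: str,
    token: str,
    payload: Dict[str, Any],
) -> urllib.request.Request:
    """Build the POST request for the DataGrail Webhook Callback."""
    if not callback_path.startswith("/"):
        raise ValueError("callback_path must begin with a forward slash")
    url = f"https://{customer_domain}{callback_path}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )
```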

Results: Inline Data Response

Use this response for any Access Request result set under 10MB in size. Inline results can be provided using the structure defined below.

{
"status": "completed",
"results_token": "<hexadecimal string>",
"results": {
"<connection-uuid>": [
{ "first_name": "Guinevere", "last_name": "Pendragon" },
{ "first_name": "Arthur", "last_name": "Pendragon" }
]
}
}

Results: Remote Data Response

Use this response for any Access Request result set larger than 10MB. This response type references the location of the files stored in your DataGrail account's cloud storage.

{
"status": "completed",
"results_token": "<hexadecimal string>",
"results_locations": [
"internal-results/<request-uuid>/<results-token>/<connection-uuid>.log",
"internal-results/<request-uuid>/<results-token>/<connection-uuid>.log"
]
}
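
Choosing between the inline and remote response types could be sketched as follows. Several details are assumptions: 10MB is taken as 10 * 1024 * 1024 bytes, size is measured on the serialized results object, a result set of exactly the limit is treated as remote, and remote_locations is a hypothetical list of file paths that have already been uploaded.

```python
import json
from typing import Any, Dict, List

INLINE_LIMIT_BYTES = 10 * 1024 * 1024  # assumed interpretation of "10MB"


def build_access_callback(
    results_token: str,
    connection_uuid: str,
    records: List[Dict[str, Any]],
    remote_locations: List[str],
    limit: int = INLINE_LIMIT_BYTES,
) -> Dict[str, Any]:
    """Return an inline payload for small result sets, otherwise a remote one."""
    results = {connection_uuid: records}
    size = len(json.dumps(results).encode("utf-8"))
    payload: Dict[str, Any] = {"status": "completed", "results_token": results_token}
    if size < limit:
        payload["results"] = results
    else:
        payload["results_locations"] = remote_locations
    return payload
```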

Result File Format

Results files should be written in json-log format using a modified form of the inline results structure. Each record should be written on a new line as a JSON serialized string. These should be stored using UTF-8 encoding.

{ "<connection-uuid>": { "first_name": "Jane", "last_name": "Doe" } },
{ "<connection-uuid>": { "first_name": "John", "last_name": "Doe" } }

Deletion Response

Use this response for all Deletion Requests. No results are required because its only purpose is to let DataGrail know the status of the Deletion Request.

{
"status": "completed",
"results_token": "<hexadecimal string>"
}

About Identifiers

Identifier Categories

DataGrail integrations support the following identifier categories and subcategories:

Category | Category Key | Description
Advertising ID | advertising_id | Device ID used for Ad-based purposes.
App ID | application_id | System-assigned ID to a customer application.
Browser ID | browser_id | ID provided via a browser cookie.
Email Address | email | Personal, work, or other types of email address. Identifier value returned will be validated.
Phone Number | phone | Personal, work, cell, or other types of phone numbers. Identifier value returned will be validated.
Service ID | service_id | System-assigned ID for a data subject record.
Social Media ID | social_media_id | ID provided via a social media profile.
User ID | user_id | Custom customer assigned ID for a data subject record.

Advertising Subcategories

Subcategory | Subcategory Key
iOS | ios_advertising_id
Google | android_advertising_id
Amazon | fire_advertising_id
Microsoft | microsoft_advertising_id
Roku | roku_advertising_id

Social Media Subcategories

Subcategory | Subcategory Key
Twitter | twitter_id
Facebook | facebook_id
Intercom | intercom_id
Smooch | smooch_id

Identifier Object Format

The identifiers parameter is a JSON object that contains the values used to identify the data subject’s personal data.

Connections should always include the capability/multiple-identifiers capability, which will structure the identifiers parameter as:

{
"<name of identifier>": [
{ "<identifier category key>": "<identifier value>" }
],
...
}

Example:

{
"email": [
{ "email": "batman@cave.com" }
],
"my_custom_identifier": [
{ "user_id": "12345" }
]
}
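
Reading this structure on the receiving side can be sketched as below. The function name flatten_identifiers and the tuple output are assumptions chosen to make the nesting explicit: identifier name, then a list of single-entry objects mapping category key to value.

```python
from typing import Dict, List, Tuple


def flatten_identifiers(
    identifiers: Dict[str, List[Dict[str, str]]],
) -> List[Tuple[str, str, str]]:
    """Turn the identifiers object into (identifier name, category key, value) tuples."""
    flat: List[Tuple[str, str, str]] = []
    for name, entries in identifiers.items():
        for entry in entries:
            for category_key, value in entry.items():
                flat.append((name, category_key, value))
    return flat
```

For the example above, this yields one tuple for the email identifier and one for my_custom_identifier with the user_id category key.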

Identifier Name Formatting

The name used when creating an identifier in DataGrail will be snake_case formatted in the API.

  • Identifier Name in DataGrail: My Custom Identifier
  • Identifier Name in the API: my_custom_identifier
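
The mapping can be approximated as in the sketch below. The exact rules DataGrail applies (for example to punctuation or digits) are not stated here, so this function is an assumption that reproduces the example above and may not match every name.

```python
import re


def to_api_identifier_name(display_name: str) -> str:
    """Approximate the snake_case form of an identifier name as used in the API."""
    # Collapse any run of non-alphanumeric characters into one underscore.
    collapsed = re.sub(r"[^0-9a-zA-Z]+", "_", display_name)
    return collapsed.strip("_").lower()
```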

Deprecated: Connections without capability/multiple-identifiers Capability

If a connection does not have the "capability/multiple-identifiers" capability, the identifiers parameter will be structured as:

{
"email": ["guinevere@camelotknights.com", "queen@camelotknights.com"]
}

 

Need help?
If you have any questions, please reach out to your dedicated CSM or contact us at support@datagrail.io.

Disclaimer: The information contained in this message does not constitute legal advice. We advise seeking professional counsel before acting on or interpreting any material.