Amazon Elastic Container Service
Overview
If you are hosting in AWS, DataGrail recommends using Elastic Container Service (ECS) to deploy the Request Manager Agent. ECS simplifies the management of load balancing, SSL termination, and service uptime, offering a reliable and streamlined deployment process. For more information about ECS, refer to Amazon's documentation.
Source the Agent Image
The Request Manager Agent Docker image is hosted in DataGrail's private image registry. Once you have obtained credentials from your DataGrail representative, you can pull the image using the following command:
# Authenticate with the DataGrail registry
docker login contairium.datagrail.io -u $DATAGRAIL_SUBDOMAIN -p $DATAGRAIL_API_KEY
# Pull the latest Request Manager Agent image (or specify a version)
docker pull contairium.datagrail.io/datagrail-rm-agent:latest
You can either copy the image into your own Docker repository or pull it directly from DataGrail's registry when the service starts.
Docker Pull Error: No Matching Manifest
If you are attempting to docker pull the image locally and receive a "no matching manifest..." error, update the command to use the --platform option.
For example, if the original command is docker pull ${IMAGE_URI}:
- For M1 Macs, use:
docker pull ${IMAGE_URI} --platform=linux/amd64
- For M2 Macs, use:
docker pull ${IMAGE_URI} --platform=linux/x86_64
Agent Configuration
The Request Manager Agent requires a configuration environment variable named DATAGRAIL_AGENT_CONFIG to define its actions. The configuration contains metadata about the connections, credentials, and other settings required for the Agent to function properly.
For details on the configuration schema, see the DataGrailAgentConfig Schema documentation.
Example DATAGRAIL_AGENT_CONFIG
{
  "connections": [
    {
      "name": "Metrics DB",
      "uuid": "272c0934-0a06-4b11-8ec9-7755499001a3",
      "capabilities": ["privacy/access", "privacy/delete"],
      "mode": "live",
      "connector_type": "Redshift",
      "queries": {
        "access": ["call dsr('access', %(email)s)"],
        "delete": ["call dsr('delete', %(email)s)"]
      },
      "credentials_location": "arn:aws:secretsmanager:us-west-2:000123456789:secret:redshift-cvh7s2"
    }
  ],
  "customer_domain": "acme.datagrail.io",
  "datagrail_agent_credentials_location": "arn:aws:secretsmanager:us-west-2:000123456789:secret:datagrail-agent-sjc7s3",
  "datagrail_credentials_location": "arn:aws:secretsmanager:us-west-2:000123456789:secret:datagrail-bs83nh",
  "platform": {
    "credentials_manager": {
      "provider": "AWSSecretsManager"
    },
    "storage_manager": {
      "provider": "AWSS3",
      "options": {
        "bucket": "acme-datagrail-reports"
      }
    }
  }
}
Define the configuration in a JSON file. Once complete, you can use the jq command-line JSON processor to convert the file to a single-line string suitable for use in the DATAGRAIL_AGENT_CONFIG environment variable. Save the output of this command for use in the next steps.
jq -sR . datagrail-agent-config.json
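If jq is not available, a similar single-line value can be produced with a few lines of Python. The sketch below is an illustration, not a DataGrail-supplied tool: the helper name to_env_value is hypothetical. Unlike jq -sR, it parses the text first (so a malformed file fails early) and compacts the JSON before escaping it.

```python
import json


def to_env_value(config_text: str) -> str:
    """Return the config as a compact, single-line, JSON-escaped string.

    Hypothetical helper: parsing first means a syntax error in the file
    raises json.JSONDecodeError here rather than surfacing at Agent startup.
    """
    config = json.loads(config_text)
    compact = json.dumps(config, separators=(",", ":"))
    # Encode a second time so inner quotes are escaped and the result
    # is a single JSON string, quotes included.
    return json.dumps(compact)


example = '{\n  "customer_domain": "acme.datagrail.io",\n  "connections": []\n}'
print(to_env_value(example))
```

To process the real file, read it with pathlib.Path("datagrail-agent-config.json").read_text() and pass the text to the helper.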
Quick Setup Guide
The following sections contain the core steps for creating an ECS Agent service. Depending on your AWS environment's existing configuration, you may need to take additional steps to configure your VPC, subnets, and so on. Those steps are not covered in this document, but DataGrail is happy to assist with them.
Task Policy and Role
To give the ECS service permission to make API requests to AWS services, you will need to define a task IAM policy and role.
Policy
- In the AWS console, navigate to Identity and Access Management.
- Under Access management, select Policies and then Create policy.
- In the Policy editor, select JSON on the right-hand side.
- At a minimum, the Agent should be allowed to perform the s3:PutObject and secretsmanager:GetSecretValue actions on a set of defined resources. Use the below example policy and update it with the ARNs of your resources.
Example Policy
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "S3PutObject",
      "Action": "s3:PutObject",
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::datagrail-reports-bucket/*"
      ]
    },
    {
      "Sid": "SecretsManagerGetSecretValue",
      "Action": "secretsmanager:GetSecretValue",
      "Effect": "Allow",
      "Resource": [
        "arn:aws:secretsmanager:us-west-2:012345678901:secret:datagrail.agent-credentials",
        "arn:aws:secretsmanager:us-west-2:012345678901:secret:datagrail.credentials"
      ]
    }
  ]
}
Replace the S3 bucket ARN and the secret ARNs with your own, and include any other secrets the Agent will need access to, such as connector credentials.
- Once done creating the policy, click Next.
- Create a name for the policy and optionally a description.
- Confirm that the permissions listed under Permissions defined in this policy are accurate, and add any optional tags.
- Click Create policy.
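When the list of secrets grows, it can be easier to generate the policy document than to edit it by hand. The following is a minimal sketch, assuming the two statements from the example policy are all you need; build_agent_policy is a hypothetical helper name, and the bucket name and ARNs are the placeholders from the example.

```python
import json


def build_agent_policy(bucket_name: str, secret_arns: list[str]) -> dict:
    """Build a task policy with s3:PutObject and secretsmanager:GetSecretValue."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "S3PutObject",
                "Effect": "Allow",
                "Action": "s3:PutObject",
                # Object-level actions apply to keys, hence the /* suffix.
                "Resource": [f"arn:aws:s3:::{bucket_name}/*"],
            },
            {
                "Sid": "SecretsManagerGetSecretValue",
                "Effect": "Allow",
                "Action": "secretsmanager:GetSecretValue",
                "Resource": list(secret_arns),
            },
        ],
    }


policy = build_agent_policy(
    "datagrail-reports-bucket",
    [
        "arn:aws:secretsmanager:us-west-2:012345678901:secret:datagrail.agent-credentials",
        "arn:aws:secretsmanager:us-west-2:012345678901:secret:datagrail.credentials",
    ],
)
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the Policy editor in place of the example.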
Role
- Navigate back to the IAM home screen.
- Under Access management, select Roles and then Create role.
- Under Trusted entity type, select AWS service.
- Under Use case, choose Elastic Container Service as the service and Elastic Container Service Task as the use case, then click Next.
- In the Permissions policies section, search for the policy that you have just created and select it.
- Under Role details, give the role a name and optionally update the description.
- Click Create role.
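Choosing the Elastic Container Service Task use case gives the role a trust policy that lets ECS tasks assume it. If tasks later fail to obtain credentials, the role's Trust relationships tab is a good place to look. The sketch below is an illustration under that assumption: trusts_ecs_tasks is a hypothetical helper that checks a trust policy document for the ecs-tasks.amazonaws.com service principal.

```python
ECS_TASKS_PRINCIPAL = "ecs-tasks.amazonaws.com"


def _as_list(value):
    """IAM allows a single string or a list; normalise to a list."""
    return [value] if isinstance(value, str) else list(value)


def trusts_ecs_tasks(trust_policy: dict) -> bool:
    """Return True if a statement allows ECS tasks to call sts:AssumeRole."""
    for statement in trust_policy.get("Statement", []):
        principal = statement.get("Principal", {})
        if not isinstance(principal, dict):
            continue  # e.g. a "*" principal, which names no service
        services = _as_list(principal.get("Service", []))
        actions = _as_list(statement.get("Action", []))
        if (
            statement.get("Effect") == "Allow"
            and ECS_TASKS_PRINCIPAL in services
            and "sts:AssumeRole" in actions
        ):
            return True
    return False


expected_trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": ECS_TASKS_PRINCIPAL},
            "Action": "sts:AssumeRole",
        }
    ],
}
print(trusts_ecs_tasks(expected_trust_policy))
```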
ECS Cluster
Optional if the Agent service will be deployed in an existing cluster.
- Navigate to Elastic Container Service, making sure you are in your desired AWS region.
- In the left-hand menu, select Clusters and then Create cluster.
- Under Cluster configuration, give the cluster a Name.
- Under Infrastructure, select AWS Fargate (serverless).
- Click Create.
Task Definition
To create an ECS service, you will need to define a Task Definition which contains details of the Agent container's environment, configuration, and system resources.
- In the AWS console, navigate to Elastic Container Service, making sure to select your desired AWS region.
- In the left-hand menu, select Task definitions, then Create new task definition, and Create new task definition with JSON.
Creating a task definition with JSON allows you to define the container's environment variables, configuration, and resources in a single JSON object. In the example below, update the taskRoleArn, executionRoleArn, awslogs-region, image URI, and the value of the DATAGRAIL_AGENT_CONFIG variable with the contents created in the Agent Configuration section.
The launch command is specified in the Docker image configuration. In general, there is no need to override the default launch command. For reference, the command used inside the image is:
CMD ["supervisord", "-n", "-c", "/etc/rm.conf"]
Example Task Definition
{
  "taskRoleArn": "arn:aws:iam::012345678912:role/DataGrailAgentTaskRole",
  "executionRoleArn": "arn:aws:iam::012345678912:role/ecsTaskExecutionRole",
  "containerDefinitions": [
    {
      "essential": true,
      "name": "datagrail-rm-agent",
      "image": "datagrail-rm-agent:latest",
      "portMappings": [
        {
          "hostPort": 8080,
          "protocol": "tcp",
          "containerPort": 8080
        }
      ],
      "command": ["supervisord", "-n", "-c", "/etc/rm.conf"],
      "cpu": 0,
      "environment": [
        {
          "name": "DATAGRAIL_AGENT_CONFIG",
          "value": ""
        }
      ],
      "mountPoints": [],
      "workingDirectory": "/app",
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-region": "us-west-2",
          "awslogs-group": "/ecs/datagrail-rm-agent",
          "awslogs-create-group": "true",
          "awslogs-stream-prefix": "/ecs"
        }
      },
      "healthCheck": {
        "retries": 3,
        "command": [
          "CMD-SHELL",
          "curl -f http://localhost:8080/docs || exit 1"
        ],
        "timeout": 5,
        "interval": 30,
        "startPeriod": 1
      }
    }
  ],
  "family": "datagrail-rm-agent",
  "requiresCompatibilities": ["FARGATE"],
  "runtimePlatform": {
    "operatingSystemFamily": "LINUX"
  },
  "networkMode": "awsvpc",
  "cpu": "1024",
  "memory": "2048"
}
Replace the following values before saving:
- taskRoleArn: the ARN of the role you created in the Task Policy and Role section.
- executionRoleArn: the ARN of the AWS-managed ECS Task Execution Role.
- image: the image and version you are using.
- The value of DATAGRAIL_AGENT_CONFIG: the single-line JSON string created in the Agent Configuration section.
- awslogs-region: your AWS region.
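Pasting an escaped JSON string into the value field by hand is error-prone. One alternative is to let a JSON library do the escaping. The sketch below assumes the task definition and the Agent configuration are already loaded as Python dictionaries; set_agent_config is a hypothetical helper, not part of any DataGrail or AWS tooling.

```python
import copy
import json


def set_agent_config(task_definition: dict, agent_config: dict,
                     container_name: str = "datagrail-rm-agent") -> dict:
    """Return a copy of the task definition with DATAGRAIL_AGENT_CONFIG set."""
    updated = copy.deepcopy(task_definition)
    value = json.dumps(agent_config, separators=(",", ":"))
    for container in updated["containerDefinitions"]:
        if container["name"] != container_name:
            continue
        environment = container.setdefault("environment", [])
        for entry in environment:
            if entry["name"] == "DATAGRAIL_AGENT_CONFIG":
                entry["value"] = value
                break
        else:
            # No existing entry for the variable, so add one.
            environment.append({"name": "DATAGRAIL_AGENT_CONFIG", "value": value})
        return updated
    raise ValueError(f"no container named {container_name!r}")


task_definition = {
    "family": "datagrail-rm-agent",
    "containerDefinitions": [
        {
            "name": "datagrail-rm-agent",
            "environment": [{"name": "DATAGRAIL_AGENT_CONFIG", "value": ""}],
        }
    ],
}
agent_config = {"customer_domain": "acme.datagrail.io", "connections": []}
filled = set_agent_config(task_definition, agent_config)
# json.dumps escapes the embedded configuration string automatically.
print(json.dumps(filled, indent=2))
```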
Security Groups
As of Request Manager Agent v0.9.0, the container port has changed to 8080. (If using a version older than v0.9.0, the Load Balancer and Service Security Groups will require outbound and inbound TCP rules on port 80, respectively).
Both the load balancer and the service created in the following sections should have strict security groups that act as virtual firewalls controlling inbound and outbound traffic.
- Navigate to EC2 and select Security Groups under Network & Security in the left-hand menu.
Load Balancer Security Group
Load balancer ingress should be allowed on port 443 from DataGrail's VPC IP of 52.36.177.91, and requests from any other source should be rejected.
- Click Create security group.
- In the Basic details section,
- Give the security group a Name indicating that it is for the load balancer (e.g., "datagrail-rm-agent-load-balancer-sg"). This is important to later identify the security group to use when creating the load balancer.
- Add a Description of the security group (e.g., "Allow HTTPS ingress from DataGrail and TCP egress to DataGrail Agent service").
- Select the VPC that the load balancer will be created in.
- In the Inbound Rules section,
- Click Add rule.
- In the Type dropdown, select HTTPS.
- In the Source dropdown, select Custom and add 52.36.177.91/32 (DataGrail's IP) as the only allowed CIDR block.
The next step is temporary. An outbound rule will be created later to allow egress from the load balancer to the service security group after it has been created.
- Under Outbound rules, delete the default rule.
- Click Create security group.
Service Security Group
The service should restrict ingress to the load balancer, and egress to:
- Systems you have configured
- A preconfigured S3 bucket for storing results
- Secrets Manager for retrieving credentials
- DataGrail at https://<customer-name>.datagrail.io
- Navigate back to Security Groups and click Create security group.
- In Basic details,
- Give the security group a Name, indicating that it is for the service (e.g., "datagrail-rm-agent-service-sg"). This is important to later identify the security group to use when creating the service.
- Add a Description (e.g., "Allow ingress from application load balancer").
- Select the VPC that the service will be created in.
- In Inbound Rules,
- Click Add rule.
- In the Type dropdown, select Custom TCP.
- Under Port range, enter 8080.
- In the Source dropdown, select Custom, and select the security group created in the Load Balancer Security Group section under Security Groups.
- Under Outbound rules, leave the default All traffic rule.
- Click Create security group.
Now that the service security group has been created, an outbound rule to the application load balancer security group can be added.
- Navigate back to Security Groups and select the security group created in the Load Balancer Security Group section.
- Under Actions in the top-right, select Edit outbound rules.
- Click Add rule.
- In the Type dropdown, select Custom TCP.
- Under Port Range, enter 8080.
- In the Destination dropdown, select Custom, and select the security group created in the Service Security Group section under Security Groups.
- Click Save rules.
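The load balancer's inbound rule is effectively an allow-list with one entry. The sketch below models that rule with Python's ipaddress module to show what the /32 suffix means: the CIDR block contains exactly one address. It only illustrates the rule's logic; lb_ingress_allowed is a hypothetical name and nothing here talks to AWS.

```python
import ipaddress

# Inbound rule on the load balancer security group: HTTPS from DataGrail only.
ALLOWED_SOURCE = ipaddress.ip_network("52.36.177.91/32")
ALLOWED_PORT = 443


def lb_ingress_allowed(source_ip: str, port: int) -> bool:
    """Return True if the rule would admit traffic from source_ip on port."""
    return port == ALLOWED_PORT and ipaddress.ip_address(source_ip) in ALLOWED_SOURCE


print(ALLOWED_SOURCE.num_addresses)              # 1
print(lb_ingress_allowed("52.36.177.91", 443))   # True
print(lb_ingress_allowed("52.36.177.92", 443))   # False
print(lb_ingress_allowed("52.36.177.91", 8080))  # False
```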
Target Group
As of Request Manager Agent v0.9.0, the container port has changed to 8080. (If using a version older than v0.9.0, the Target Group port will need to be set to 80).
The target group defines where the load balancer routes requests and performs health checks on the targets.
- Navigate to EC2 and select Target Groups under Load Balancing in the left-hand menu.
- Click Create target group.
- In the Basic configuration section,
- Under Choose a target type, select IP addresses.
- Under Target group name, give the target group a name indicating that it is for the Agent service load balancer (e.g., "datagrail-rm-agent-target-group").
- Under Protocol : Port, select HTTP in the Protocol dropdown, and enter 8080 in the Port field.
- Under IP address type, select IPv4.
- Under VPC, select the VPC where the Agent service will be deployed.
- Under Protocol version, select HTTP/1.
- In the Health checks section,
- Under the Health check protocol dropdown, select HTTP.
- Under the Health check path field, enter /docs.
- Click Next.
- On the Register targets page, do not modify any settings and click Create target group. The Agent service will be registered during its creation.
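The health check configured above requests /docs over HTTP on port 8080. When targets are reported as unhealthy, it can help to reproduce that request from a host that can reach the task. The following is a rough sketch using only the standard library; docs_endpoint_healthy is a hypothetical helper, and treating only HTTP 200 as healthy is an assumption made for this illustration.

```python
import urllib.error
import urllib.request


def docs_endpoint_healthy(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if GET <base_url>/docs answers with HTTP 200."""
    url = base_url.rstrip("/") + "/docs"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False
```

For example, docs_endpoint_healthy("http://10.0.1.23:8080") would probe a task at the placeholder private IP 10.0.1.23.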
Application Load Balancer
The Agent service does not support internal SSL termination, so configuring a load balancer or another form of SSL termination is required. TLS 1.2+ is required to provide secure communication between services.
- Navigate to EC2, and select Load Balancers under Load Balancing in the left-hand menu.
- Click Create load balancer.
- Under Load balancer types, select Create under Application Load Balancer.
- Under Basic configuration,
- Enter a Load balancer name.
- Under Scheme, select Internet-facing.
- Under IP address type, select IPv4.
- Under Network mapping,
- Under VPC, select the VPC where the Agent service will be deployed.
- Under Mappings, select at least two Availability Zones and one public subnet per zone.
- Under Security groups, remove the default group and select the security group you created in the Load Balancer Security Group section.
- Under Listeners and routing,
- In the Protocol dropdown, select HTTPS.
- In the Port field, enter 443.
- Set the Default action to Forward to the target group that you created in the Target Group section.
- Under Security policy,
- In the Security category dropdown, select All security policies.
- In the Policy name dropdown, select ELBSecurityPolicy-TLS13-1-2-2021-06.
- Review the Summary to make sure everything looks correct.
- Click Create load balancer.
ECS Service
- Navigate to Elastic Container Service and select Clusters in the left-hand menu.
- Select the cluster in which you will be deploying the Agent.
- In the Services tab, select Create.
- Under the Environment section in the Create wizard, modify the Compute configuration.
- Under Compute options, select Launch type.
- In the Launch type dropdown, select FARGATE.
- In the Platform version dropdown, select LATEST.
- Under the Deployment configuration section,
- Under Application type, select Service.
- Under Task definition, select the task definition you created in the Task Definition section and select the latest Revision.
- Under Service name, give the service a name (e.g., "datagrail-rm-agent-service").
- Under Service type, select Replica.
- Under Desired tasks, enter 1.
- Leave all default values under Deployment options and Deployment failure detection.
- Under the Networking section,
- Select the VPC that you will be deploying the Agent in.
- Under Subnets, select a private subnet to place the Agent in.
- Under Security group, select Use an existing security group.
- Remove the default security group and select the security group created in the Service Security Group section.
- Under the Load balancing section,
- In the Load balancer type dropdown, select Application Load Balancer.
- The Container dropdown will prepopulate with the single container defined in the task definition.
- Under Application Load Balancer, select Use an existing load balancer.
- In the Load balancer dropdown, select the load balancer that you created in the Application Load Balancer section.
- In the Health check grace period field, enter 15.
- Under Listener, select Use an existing listener and select 443:HTTPS in the Listener dropdown.
- Under Target group, select Use an existing target group, and select the target group created in the Target Group section in the Target group name dropdown.
- Select Create.
You should then see the service appear on the management page of the cluster under the Services section. Click on the service and then on the Tasks tab in the service. If tasks are not in a Running state within a few minutes, you can inspect the stopped tasks to look for any errors or failures.
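For teams that script their deployments, the console choices above map onto a single service request. The sketch below collects them into a dictionary whose field names follow the ECS CreateService request. It is an illustration only: build_service_request is a hypothetical helper, every ID and ARN is a placeholder, and assignPublicIp is set to DISABLED on the assumption that the task runs in a private subnet as described above.

```python
def build_service_request(cluster: str, task_definition: str,
                          subnet_ids: list[str], security_group_id: str,
                          target_group_arn: str) -> dict:
    """Collect the console settings for the Agent service into one request."""
    return {
        "cluster": cluster,
        "serviceName": "datagrail-rm-agent-service",
        "taskDefinition": task_definition,
        "desiredCount": 1,
        "launchType": "FARGATE",
        "platformVersion": "LATEST",
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": list(subnet_ids),
                "securityGroups": [security_group_id],
                "assignPublicIp": "DISABLED",  # assumption: private subnet
            }
        },
        "loadBalancers": [
            {
                "targetGroupArn": target_group_arn,
                "containerName": "datagrail-rm-agent",
                "containerPort": 8080,
            }
        ],
        "healthCheckGracePeriodSeconds": 15,
    }


request = build_service_request(
    cluster="datagrail-cluster",                # placeholder
    task_definition="datagrail-rm-agent",       # task definition family
    subnet_ids=["subnet-0123456789abcdef0"],    # placeholder
    security_group_id="sg-0123456789abcdef0",   # placeholder
    target_group_arn="arn:aws:elasticloadbalancing:us-west-2:012345678912:"
                     "targetgroup/datagrail-rm-agent-target-group/0123456789abcdef",
)
print(request["loadBalancers"][0]["containerPort"])
```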
Alias Record for Application Load Balancer in Route 53
- Navigate to Route 53 and click on Hosted zones in the left-hand menu.
- Select the hosted zone where you will be deploying the Agent and click Create record.
- In the Record name field, enter the subdomain that you would like to use (e.g., "datagrail-rm-agent").
- In the Record type dropdown, select A.
- Toggle the Alias button ON.
- Under Route traffic to,
- Select Alias to Application and Classic Load Balancer in the Choose endpoint dropdown.
- Select the region where the load balancer was deployed in the Choose Region dropdown.
- Select the load balancer in the Choose load balancer dropdown.
- Keep Routing policy set to Simple routing and Evaluate target health set to Yes.
- Click Create records.
Your Agent service will now be reachable at your subdomain!
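The alias record can also be described as a Route 53 change batch, which is the shape the ChangeResourceRecordSets API expects. The sketch below only builds that structure. build_alias_change is a hypothetical helper, and the domain, load balancer DNS name, and load balancer hosted zone ID are placeholders to replace with the values shown for your load balancer.

```python
import json


def build_alias_change(record_name: str, lb_dns_name: str,
                       lb_hosted_zone_id: str) -> dict:
    """Build a change batch that creates an A alias record for the load balancer."""
    return {
        "Comment": "Alias record for the DataGrail Request Manager Agent",
        "Changes": [
            {
                "Action": "CREATE",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "A",
                    "AliasTarget": {
                        # Hosted zone ID of the load balancer, not of your domain.
                        "HostedZoneId": lb_hosted_zone_id,
                        "DNSName": lb_dns_name,
                        "EvaluateTargetHealth": True,
                    },
                },
            }
        ],
    }


change_batch = build_alias_change(
    record_name="datagrail-rm-agent.example.com",                 # placeholder
    lb_dns_name="my-alb-1234567890.us-west-2.elb.amazonaws.com",  # placeholder
    lb_hosted_zone_id="ZEXAMPLE12345",                            # placeholder
)
print(json.dumps(change_batch, indent=2))
```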