Google Cloud Run
Overview
If you are hosting in Google Cloud Platform, DataGrail recommends deploying the Request Manager Agent to a Cloud Run worker pool. This guide covers deploying the egress-only Request Manager Agent, which initiates outbound connections to DataGrail and does not require incoming traffic or load balancing. Cloud Run worker pools simplify the management of service uptime and networking, offering a reliable and streamlined deployment process. For more information about Cloud Run worker pools, refer to Google's documentation.
Quick Setup Guide
The following sections cover the core steps for creating an egress-only Cloud Run Agent service. Since the Agent only makes outbound connections and does not receive incoming traffic, no load balancer or TLS certificate configuration is required. Depending on your GCP environment's existing configuration, you may need to take additional steps to configure your VPC, subnets, and so on. Those steps are not covered in this document, but DataGrail is happy to assist.
Sourcing the Agent Image
The Request Manager Agent Docker image is hosted in DataGrail's private image registry. Once you have obtained the credentials from your DataGrail representative, you can pull the image using the following commands:
# Authenticate with the DataGrail registry
docker login contairium.datagrail.io -u $DATAGRAIL_SUBDOMAIN
# Pull the Request Manager Agent image
docker pull contairium.datagrail.io/rm-agent:$VERSION
If you prefer to have Cloud Run pull images directly from DataGrail's image registry upon startup, you can configure an Artifact Registry remote repository.
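If you instead copy the pulled image into your own Artifact Registry repository, the image needs to be retagged with an Artifact Registry path before it can be pushed. The following is a minimal sketch rather than an official DataGrail procedure. The function names and the repository name are hypothetical, and it assumes that a Docker-format Artifact Registry repository already exists and that Docker has been authorized for it (for example with gcloud auth configure-docker).

```shell
# Build the Artifact Registry path for the Agent image.
# Arguments: region, project ID, repository name, image version.
ar_image_url() {
  printf '%s-docker.pkg.dev/%s/%s/rm-agent:%s\n' "$1" "$2" "$3" "$4"
}

# Retag the image pulled from DataGrail's registry and push it to
# Artifact Registry. Defined only; nothing runs until you call it.
# Takes the same four arguments as ar_image_url.
push_agent_image() {
  target="$(ar_image_url "$1" "$2" "$3" "$4")"
  docker tag "contairium.datagrail.io/rm-agent:$4" "${target}"
  docker push "${target}"
}
```

For example, push_agent_image "${REGION}" "${PROJECT_ID}" my-repo "${VERSION}" would push the image, and the path printed by ar_image_url with the same arguments is the value to use as the container image URL later in this guide.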
Store the API Key
Store the DataGrail API key in Google Secret Manager. During startup, the Agent will register itself with DataGrail using this API key. Not storing the API key before deploying the service will result in startup failure, as the Agent will be unable to connect to the platform without it.
Console
- In the Google Cloud console, navigate to Secret Manager and select Create Secret at the top of the page.
- Under Secret name, enter a name for the secret (e.g. datagrail-api-key).
- Under Secret value, either upload a JSON file containing your DataGrail API key or enter it directly.
- Select Create Secret to save.
CLI
- To minimize the risk of exposing credentials through the command shell history, create a temporary JSON file to store them securely following the schema for the secret type.
cat > datagrail-platform-credentials.json <<EOF
{
"token": "privacy_api_key..."
}
EOF
- Create the secret using the GCP CLI.
gcloud secrets create datagrail-api-key \
--data-file=datagrail-platform-credentials.json
- After creating the secret, securely delete the temporary JSON file.
shred -u datagrail-platform-credentials.json
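The temporary file holds the API key in plain text until it is deleted. As a variant of the file creation step above, and assuming a POSIX shell, the file can be created with owner-only permissions by setting a restrictive umask inside a subshell. This is a sketch, not a DataGrail requirement, and the token is the same placeholder used above.

```shell
# Remove any earlier copy so the new file picks up the restrictive mode.
rm -f datagrail-platform-credentials.json

# The subshell keeps the umask change from affecting the rest of the session.
(
  umask 077
  cat > datagrail-platform-credentials.json <<'EOF'
{
  "token": "privacy_api_key..."
}
EOF
)

# Show the resulting permissions; only the owner should have access.
ls -l datagrail-platform-credentials.json
```

The remaining steps, creating the secret and then deleting the file, are unchanged.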
Prepare Environment Variables
The Agent's configuration is defined by the environment variables you set on the Cloud Run service. Some of these variables reference the secrets you stored in Secret Manager, while others are static values. Refer to the Environment Variables documentation for a complete list of required and optional environment variables, along with their descriptions and example values.
Console
If you are using the Google Cloud console, prepare your environment variables in advance in a text editor, and have them ready to copy-paste when you reach the appropriate step in the Cloud Run worker pool creation wizard.
CLI
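export DATAGRAIL_DOMAIN="<YOUR_DATAGRAIL_DOMAIN>"
export PROJECT_ID="<YOUR_PROJECT_ID>"
export BUCKET_NAME="<YOUR_BUCKET_NAME>"
# Name of the Secret Manager secret that holds the DataGrail API key
export API_KEY_NAME="datagrail-api-key"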
cat > rm-agent-env.yaml <<EOF
RM_CUSTOMER_DOMAIN: ${DATAGRAIL_DOMAIN}
RM_PLATFORM_CREDENTIALS_LOCATION: ${API_KEY_NAME}
RM_CREDENTIALS_MANAGER: '{"provider":"GCP","options":{"project_id":"${PROJECT_ID}"}}'
RM_STORAGE_MANAGER: '{"provider":"GCPCloudStore","options":{"bucket":"${BUCKET_NAME}","project_id":"${PROJECT_ID}"}}'
EOF
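Because the file is generated by shell expansion, any variable that is unset when the heredoc runs silently becomes an empty value. The following sketch checks the generated file for the most common problems before it is passed to the deploy command. The function name check_env_file is hypothetical, and the checks are heuristics, not a full validation of the Agent's configuration.

```shell
# Report likely problems in a generated env file.
# Prints offending lines (with line numbers) and returns non-zero on any hit.
check_env_file() {
  file="$1"
  rc=0
  # A key with nothing after the colon: the variable was unset.
  if grep -nE '^[A-Z_]+:[[:space:]]*$' "${file}"; then
    echo "error: empty value(s) in ${file}" >&2
    rc=1
  fi
  # An empty JSON string inside a value: a nested variable was unset.
  if grep -nF '""' "${file}"; then
    echo "error: empty JSON string(s) in ${file}" >&2
    rc=1
  fi
  # A template placeholder that was never replaced.
  if grep -nF '<YOUR_' "${file}"; then
    echo "error: unreplaced placeholder(s) in ${file}" >&2
    rc=1
  fi
  return "${rc}"
}
```

Running check_env_file rm-agent-env.yaml prints nothing and returns zero when none of these patterns are present.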
Create a Service Identity
Cloud Run will use the Compute Engine default service account if you do not specify a user-managed service account to run the service under. This principal is typically granted the Editor role, which grants read and write access on all resources in your Google Cloud project. To follow the principle of least privilege, it is recommended that you create a new service account with a minimal set of permissions.
Console
- In the Google Cloud console, navigate to IAM, select Service Accounts in the left-hand menu, and then select Create Service Account in the top bar.
- Under Service account details, enter the Service account name (e.g. rm-agent), and optionally a Service account description. The Service account ID will automatically be generated based on the Service account name.
- In the Grant this service account access to project section, add at least the following two roles to the service account:
  - Secret Manager Secret Accessor - Accessing the various secrets used by the Agent.
  - Storage Object Creator - Writing the results of an access request to Cloud Storage.
CLI
export SERVICE_ACCOUNT_NAME="rm-agent-runner"
export SERVICE_ACCOUNT_ID="${SERVICE_ACCOUNT_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"
# Create the service account
gcloud iam service-accounts create "${SERVICE_ACCOUNT_NAME}" \
--description="Service account for DataGrail Request Manager Agent" \
--display-name="DataGrail RM Agent"
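# Allow the service account to read secrets in Secret Manager
gcloud projects add-iam-policy-binding "${PROJECT_ID}" \
--member="serviceAccount:${SERVICE_ACCOUNT_ID}" \
--role="roles/secretmanager.secretAccessor"
# Allow the service account to write request results to the bucket
gsutil iam ch \
"serviceAccount:${SERVICE_ACCOUNT_ID}:objectCreator" \
"gs://${BUCKET_NAME}"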
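If you choose a different service account name, note that Google Cloud restricts service account IDs to 6-30 characters made up of lowercase letters, digits, and hyphens. The following sketch checks a candidate name before the create command is run. The function name valid_sa_name is hypothetical, and the pattern reflects the documented constraints as understood here rather than an exhaustive validation.

```shell
# Return success if the name is 6-30 characters, starts with a lowercase
# letter, ends with a lowercase letter or digit, and otherwise contains
# only lowercase letters, digits, and hyphens.
valid_sa_name() {
  printf '%s' "$1" | grep -Eq '^[a-z][-a-z0-9]{4,28}[a-z0-9]$'
}
```

For example, valid_sa_name "${SERVICE_ACCOUNT_NAME}" || echo "invalid service account name" flags a name that would be rejected at creation time.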
Create a Cloud Run Worker Pool
Console
- In the Google Cloud console, navigate to Cloud Run, select Worker pools in the left-hand panel, and then select Create worker pool.
- Under Container Image URL, select the container image from Artifact Registry.
- Under Configure, enter a Worker pool name and Region.
- Under Instances, set the Number of instances to 1.
- Under Billing, select Instance-based, so that CPU is allocated while the service is running, even if there are no incoming requests. This is necessary to ensure that the Agent can process background tasks and maintain a persistent connection to DataGrail.
- Expand the Containers, Volumes, Networking, Security section, and select the Container tab.
- Under Settings > Resources, set Memory to at least 4 GiB and CPU to 2.
- Under Settings > Variables & Secrets, select Add Variable and set the appropriate environment variables. The complete list of required and optional environment variables can be found in the Environment Variables documentation.
- Under Settings > Security, set the Service account to the service account you created in the previous step.
- Select Create at the bottom of the wizard and your worker pool will launch.
Monitor the deployment process in the Cloud Run console. Once the worker pool is deployed, it will automatically start and connect to DataGrail using the provided API key.
CLI
export IMAGE="<YOUR_AGENT_IMAGE_URL>"
export REGION="<YOUR_REGION>"
gcloud run worker-pools deploy rm-agent-worker-pool \
--image="${IMAGE}" \
--region="${REGION}" \
--service-account="${SERVICE_ACCOUNT_ID}" \
--instances=1 \
--memory=4Gi \
--cpu=2 \
--env-vars-file=rm-agent-env.yaml
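The deploy command above reads several shell variables that were set in earlier steps, and an unset variable results in an empty flag value. As a minimal pre-flight sketch for a POSIX shell, the following helper reports any variable that is missing. The function name require_vars is hypothetical and is not part of gcloud.

```shell
# Return non-zero and name each variable that is unset or empty.
# Arguments: the names (not values) of the variables to check.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is in $name.
    eval "value=\${${name}:-}"
    if [ -z "${value}" ]; then
      echo "error: ${name} is not set" >&2
      missing=1
    fi
  done
  return "${missing}"
}
```

For example, running require_vars IMAGE REGION SERVICE_ACCOUNT_ID before the deploy command surfaces a missing value early, which is especially useful when the steps are run across separate shell sessions.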
After running the command, monitor the deployment process in the Cloud Run console. Once the worker pool is deployed, it will automatically start and connect to DataGrail using the provided API key.
Your egress-only Agent service is now deployed and will initiate outbound connections to DataGrail.
Disclaimer: The information contained in this message does not constitute legal advice. We advise seeking professional counsel before acting on or interpreting any material.