Google Cloud Run

Overview

Sourcing the Agent Image

The Request Manager Agent Docker image is hosted in the DataGrail Artifact Registry repository. Once access is granted, use this path for retrieving the image:

us-west1-docker.pkg.dev/datagrail-202106/datagrail-rm-agent/datagrail-rm-agent:v0.8.6

To install the service on GCP, this image needs to be cloned to your Artifact Registry or Container Registry. Detailed instructions are provided below in the quick setup guide.

Network Communication Requirements

The Agent service does not terminate SSL/TLS itself, so you must configure a load balancer or another form of TLS termination in front of it. TLS 1.2 or later is required for secure communication between services.

Inbound requests reach the Agent over port 443 and originate from DataGrail's VPC IP address, 52.36.177.91. Reject inbound requests from any other source.

The Agent will make network requests to:

  • Systems you have configured
  • Storage Bucket for storing results
  • Secret Manager for retrieving credentials
  • DataGrail at https://<your-subdomain>.datagrail.io

Environment Variables

DATAGRAIL_AGENT_CONFIG

The Request Manager Agent's primary configuration is sourced from this environment variable, which defines the connections available for the Agent and the credentials used for authenticating with DataGrail. This variable is required to get the Agent service running and healthy. See the documentation for additional information and examples.

Example DATAGRAIL_AGENT_CONFIG
{
  "connections": [
    {
      "name": "Metrics DB",
      "uuid": "272c0934-0a06-4b11-8ec9-7755499001a3",
      "capabilities": ["privacy/access", "privacy/delete"],
      "mode": "live",
      "connector_type": "BigQuery",
      "queries": {
        "access": ["CALL metrics.dsr('access', %(email)s)"],
        "delete": ["CALL metrics.dsr('delete', %(email)s)"]
      },
      "credentials_location": "datagrail-rm-agent-big-query"
    }
  ],
  "customer_domain": "acme.datagrail.io",
  "datagrail_agent_credentials_location": "datagrail-rm-agent-credentials",
  "datagrail_credentials_location": "datagrail-credentials",
  "platform": {
    "credentials_manager": {
      "provider": "GCP",
      "options": {
        "project_id": "my-project"
      }
    },
    "storage_manager": {
      "provider": "GCPCloudStore",
      "options": {
        "bucket": "acme-datagrail-reports",
        "project_id": "my-project"
      }
    }
  }
}
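
Because the Agent reads its whole configuration from this one variable, a stray comma or unbalanced brace is enough to keep the service from starting. The following is a minimal sketch of a syntax check to run before setting the variable. It assumes python3 is available on your PATH, and the file name config.json in the usage comment is hypothetical. It checks JSON syntax only, not whether the keys are ones the Agent expects.

```shell
# Return 0 if the file at $1 parses as JSON, non-zero otherwise.
# Assumption: python3 is on PATH (json.tool ships with the standard library).
validate_agent_config() {
  python3 -m json.tool "$1" > /dev/null 2>&1
}

# Hypothetical usage, with config.json as an assumed local file name:
#   validate_agent_config config.json \
#     && export DATAGRAIL_AGENT_CONFIG="$(cat config.json)"
```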

GOOGLE_APPLICATION_CREDENTIALS_JSON

Best Practice

This variable is recommended only for local development. In production, configure a user-managed service identity to authenticate access to Google Cloud APIs instead. Setting this variable also overrides any user-managed service identity attached to the Cloud Run service.

Set the environment variable to the contents of the JSON key file downloaded from the Keys page of the service account's details view in IAM.

On startup, the Agent writes the contents of this variable to a file at app/google_application_credentials.json and sets an environment variable named GOOGLE_APPLICATION_CREDENTIALS to that file path.

The credentials require the following permissions:

  • Artifact Registry Service Agent - Pulling the Docker image from Artifact Registry.
  • Secret Manager Secret Accessor - Accessing the various secrets used by the Agent.
  • Storage Object Creator - Writing the results of an access request to Cloud Storage.

Example GOOGLE_APPLICATION_CREDENTIALS_JSON
{
  "type": "service_account",
  "project_id": "project-id",
  "private_key_id": "private-key-id",
  "private_key": "private-key",
  "client_email": "client-email",
  "client_id": "client-id",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "cert-url",
  "universe_domain": "googleapis.com"
}

For more information on Google Application Default Credentials, check out the documentation.
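
For local development, the key file's contents have to end up in the variable verbatim. The sketch below shows one way to do that and then start the container. The function names, the default local port 8080, and the key file path are hypothetical; container port 80 matches the port used in the Cloud Run setup in this guide.

```shell
# Load a service account key file into GOOGLE_APPLICATION_CREDENTIALS_JSON.
# $1: path to the JSON key file downloaded from the service account's Keys page.
load_credentials_json() {
  [ -r "$1" ] || { echo "cannot read key file: $1" >&2; return 1; }
  GOOGLE_APPLICATION_CREDENTIALS_JSON="$(cat "$1")"
  export GOOGLE_APPLICATION_CREDENTIALS_JSON
}

# Run the Agent image locally. "-e NAME" with no value forwards the variable
# from the current shell into the container.
# $1: image reference, $2: local port (assumed default 8080) mapped to port 80.
run_agent_locally() {
  docker run --rm \
    -e DATAGRAIL_AGENT_CONFIG \
    -e GOOGLE_APPLICATION_CREDENTIALS_JSON \
    -p "${2:-8080}:80" \
    "$1"
}

# Hypothetical usage:
#   load_credentials_json ./key.json && run_agent_locally <IMAGE>
```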

Quick Setup Guide

We recommend deploying the Agent in GCP with Cloud Run. This toolchain provides simple and robust management of your deployment. To get an overview of Cloud Run, check out the documentation.

In this setup guide, you will:

  1. Pull and upload the Agent image to Artifact Registry
  2. Create a Service Identity with the necessary permissions
  3. Create a Cloud Run Service
  4. Create an Application Load Balancer
  5. Create a Cloud Armor policy

Upload the Agent Image

Google Cloud Run needs service images to be hosted in either Container Registry or Artifact Registry.

  1. Before you can download the image, authenticate with the datagrail-rm-agent repository. For more information, see this Artifact Registry guide.

  2. Download the image to your local machine using the following command:

    docker pull us-west1-docker.pkg.dev/datagrail-202106/datagrail-rm-agent/datagrail-rm-agent:v0.8.6

  3. Tag and upload the image to your registry. You may need to adjust the destination if you prefer the image be stored in a different region:

    docker tag us-west1-docker.pkg.dev/datagrail-202106/datagrail-rm-agent/datagrail-rm-agent:v0.8.6 \
    us-west1-docker.pkg.dev/<PROJECT_ID>/<REPOSITORY_NAME>/datagrail-rm-agent:v0.8.6
    docker push us-west1-docker.pkg.dev/<PROJECT_ID>/<REPOSITORY_NAME>/datagrail-rm-agent:v0.8.6

    The image should then appear in your registry:

    Registry Image
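
The three steps above can be wrapped in a small helper when the destination region differs from us-west1. This is a sketch, not the official procedure: it assumes you authenticate through gcloud's Docker credential helper (one of the options in the Artifact Registry guide), and the function names are hypothetical.

```shell
SOURCE_IMAGE="us-west1-docker.pkg.dev/datagrail-202106/datagrail-rm-agent/datagrail-rm-agent:v0.8.6"

# Build the destination image reference.
# $1: region (e.g. us-west1), $2: project ID, $3: repository name.
target_image() {
  echo "$1-docker.pkg.dev/$2/$3/datagrail-rm-agent:v0.8.6"
}

# Pull the Agent image, retag it for your registry, and push it.
# Takes the same three arguments as target_image.
copy_agent_image() {
  dest="$(target_image "$1" "$2" "$3")"
  # Register gcloud as a Docker credential helper for the source host and,
  # if it is a different region, the destination host.
  registries="us-west1-docker.pkg.dev"
  [ "$1" = "us-west1" ] || registries="$registries,$1-docker.pkg.dev"
  gcloud auth configure-docker "$registries" --quiet &&
    docker pull "$SOURCE_IMAGE" &&
    docker tag "$SOURCE_IMAGE" "$dest" &&
    docker push "$dest"
}

# Hypothetical usage:
#   copy_agent_image europe-west1 <PROJECT_ID> <REPOSITORY_NAME>
```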

Create a Service Identity

If you do not specify a user-managed service account, Cloud Run runs the service as the Compute Engine default service account. This principal is granted the Editor role, which gives read and write access to all resources in your Google Cloud project. To follow the principle of least privilege, create a new service account with the minimum set of permissions the Agent needs.

  1. In the Google Cloud console, navigate to IAM, click Service Accounts in the left-hand menu, and then click Create Service Account in the top bar.

    Create Service Account

  2. Under Service account details, enter the Service account name (e.g. datagrail-rm-agent), and optionally a Service account description. The Service account ID will be generated automatically from the Service account name.

    Service Account Details

  3. In the Grant this service account access to project section, add at least the following three roles to the service account:

    1. Artifact Registry Service Agent - Pulling the Docker image from Artifact Registry.

    2. Secret Manager Secret Accessor - Accessing the various secrets used by the Agent.

    3. Storage Object Creator - Writing the results of an access request to Cloud Storage.

      Service Account Roles
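
If you prefer the command line to the console, the same service account can be sketched with gcloud as follows. The function name is hypothetical, and the role IDs are the identifiers that correspond to the three role names listed above. The bindings here are granted at the project level, as in the console steps.

```shell
# Create a service account and grant it the three roles listed above.
# $1: project ID, $2: service account name (e.g. datagrail-rm-agent).
create_agent_service_account() {
  member="serviceAccount:$2@$1.iam.gserviceaccount.com"
  gcloud iam service-accounts create "$2" \
    --project "$1" \
    --display-name "$2" || return 1
  # Role IDs for: Artifact Registry Service Agent, Secret Manager Secret
  # Accessor, and Storage Object Creator.
  for role in \
    roles/artifactregistry.serviceAgent \
    roles/secretmanager.secretAccessor \
    roles/storage.objectCreator
  do
    gcloud projects add-iam-policy-binding "$1" \
      --member "$member" \
      --role "$role" || return 1
  done
}

# Hypothetical usage:
#   create_agent_service_account <PROJECT_ID> datagrail-rm-agent
```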

Create Cloud Run Service

  1. In the Google Cloud console, navigate to Cloud Run and click Create Service at the top of the page to start the Service wizard.

    Create Service

  2. Under Container Image URL, enter the container image URL of the uploaded Agent image.

    Container Image URL

  3. Under Configure, enter a Service name and Region.

    Add Name and Region

  4. Under Authentication, select Allow unauthenticated invocations to enable DataGrail to reach the Agent successfully.

    Authentication

  5. Under CPU Allocation and Pricing, select CPU is always allocated to ensure processing of background tasks.

    Important Setting

    If this step is missed, the Agent will not be able to process background tasks!

    CPU Allocation

  6. Under Service Autoscaling, set minimum number of instances to 1.

    Service Autoscaling

  7. Under Ingress Control, if you will be using an Application Load Balancer, select Internal and then Allow traffic from external Application Load Balancers. Otherwise, select All, which allows any external traffic to reach the API.

    Best Practice

    We strongly recommend configuring an Application Load Balancer to limit the incoming traffic to only the DataGrail IP address.

    Ingress Control

  8. Expand the Container(s), Volumes, Networking, Security section, and select the Container tab. Set the Container port to 80.

    Container Port

  9. Under Resources, set Memory to at least 4 GiB, and CPU to 1.

    Resources

  10. In the Variables & Secrets tab, select Add Variable and set the DATAGRAIL_AGENT_CONFIG environment variable. The value should be set to the configuration JSON object.

    Environment Variables

  11. Under Requests, set the timeout to 300, and maximum concurrent requests to 80.

    Requests

  12. Under Revision Autoscaling, set both the minimum and maximum number of instances to 1 so that the service never scales up or down.

    Revision Autoscaling

  13. With everything configured, click Create at the bottom of the wizard and your service will launch!
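
For a repeatable deployment, the wizard settings above map roughly onto a single gcloud run deploy call. The sketch below is an approximation under stated assumptions rather than a supported deployment script: the function name is hypothetical, the service account is assumed to be the one created earlier in this guide, and DATAGRAIL_AGENT_CONFIG is assumed to be supplied through a YAML env vars file (a key named DATAGRAIL_AGENT_CONFIG whose value is the JSON as a quoted string), because the JSON contains commas that the inline --set-env-vars syntax would split on.

```shell
# Deploy the Agent to Cloud Run with the settings from the steps above:
# unauthenticated invocations, CPU always allocated, ingress limited to
# internal traffic and load balancers, container port 80, 4 GiB memory,
# 1 CPU, 300 s timeout, concurrency 80, and exactly one instance.
# $1: service name, $2: region, $3: image URL,
# $4: service account email, $5: path to an env vars YAML file.
deploy_agent() {
  gcloud run deploy "$1" \
    --region "$2" \
    --image "$3" \
    --service-account "$4" \
    --env-vars-file "$5" \
    --allow-unauthenticated \
    --no-cpu-throttling \
    --ingress internal-and-cloud-load-balancing \
    --port 80 \
    --memory 4Gi \
    --cpu 1 \
    --timeout 300 \
    --concurrency 80 \
    --min-instances 1 \
    --max-instances 1
}

# Hypothetical usage:
#   deploy_agent datagrail-rm-agent us-west1 <IMAGE_URL> <SERVICE_ACCOUNT_EMAIL> env.yaml
```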

Create an Application Load Balancer

  1. In the Google Cloud console, navigate to the Load balancing page.
  2. Click Create load balancer.

Start your Configuration

  1. Under Type of load balancer, select Application Load Balancer (HTTP/HTTPS), and click Next.
  2. Under Public facing or internal, select Public facing (external), and click Next.
  3. Under Global or single region deployment, select Best for regional workloads, and click Next.
  4. Under Create load balancer, ensure that the load balancer has both Public facing (external) and Regional features.
  5. Click Configure.

    Important Setting

    In the next step, ensure that the Network you choose has a proxy-only subnet. For more information, see the documentation.

  6. On the following page, enter the Load Balancer Name, Region, and Network.

Frontend Configuration

  1. Before proceeding, make sure that you have a valid SSL certificate. For more information, see the documentation.
  2. Under New Frontend IP and port:
    1. Optionally enter a Name.
    2. In the Protocol dropdown list, select HTTPS (includes HTTP/2 and HTTP/3).
    3. Under Certificate, either select your certificate, or Create a new certificate.

Backend Configuration

  1. Under Create or select backend services, select Create a backend service.
  2. In the Create backend service window, enter a Name.
  3. Under Backend type, select Serverless network endpoint group.
  4. In the Backends > New backend dropdown list, select Create new serverless network endpoint group.
    1. In the Create Serverless network endpoint group window, enter a Name.
    2. Under Region, the region of your Application Load Balancer should be displayed.
    3. Under Serverless network endpoint group type, Cloud Run should be displayed.
    4. In the Select service > Service dropdown list, select the Cloud Run service you created in the Create Cloud Run Service section.
    5. Click Create.
  5. Back in the Create backend service window, also click Create.
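
The backend steps above create two resources: a serverless network endpoint group that points at the Cloud Run service, and a backend service that uses that group. A rough gcloud equivalent is sketched below for a regional external Application Load Balancer. The function and resource names are hypothetical, and the frontend, certificate, and URL map still need to be configured separately.

```shell
# Create a serverless NEG for the Cloud Run service and attach it to a new
# regional backend service.
# $1: NEG name, $2: backend service name, $3: region, $4: Cloud Run service name.
create_agent_backend() {
  gcloud compute network-endpoint-groups create "$1" \
    --region "$3" \
    --network-endpoint-type serverless \
    --cloud-run-service "$4" || return 1
  gcloud compute backend-services create "$2" \
    --region "$3" \
    --load-balancing-scheme EXTERNAL_MANAGED || return 1
  gcloud compute backend-services add-backend "$2" \
    --region "$3" \
    --network-endpoint-group "$1" \
    --network-endpoint-group-region "$3"
}

# Hypothetical usage:
#   create_agent_backend agent-neg agent-backend us-west1 datagrail-rm-agent
```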

Configure Routing Rules

  1. Click Simple host and path rule.
  2. Select the backend service created in the Backend Configuration section from the Backend dropdown list.

Test the Load Balancer

  1. Navigate back to the Load balancing homepage.
  2. You will see your load balancer under the Load balancers section.
  3. Click on the load balancer you just created.
  4. Note the IP Address of the load balancer.
  5. You can test your load balancer using a web browser by going to https://<load-balancer-ip-address>.
  6. Update your domain's DNS records by adding an A record (e.g. datagrail-rm-agent.your-domain.com) with this IP address.
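
Browsing directly to the IP address will typically produce a certificate warning, since certificates are normally issued for a hostname rather than an IP address. The sketch below is one way to test the load balancer by hostname before DNS has propagated. The function name and the example hostname and IP address are hypothetical.

```shell
# Print the HTTP status code returned by https://$1/ when connecting to IP $2.
# --resolve pins the hostname to the load balancer IP, so the request works
# before the A record exists and the certificate is still checked against the
# hostname. --tlsv1.2 refuses anything older than TLS 1.2.
# $1: hostname on the certificate, $2: load balancer IP address.
lb_status() {
  curl --silent --output /dev/null --write-out '%{http_code}' \
    --tlsv1.2 --max-time 10 \
    --resolve "$1:443:$2" \
    "https://$1/"
}

# Hypothetical usage:
#   lb_status datagrail-rm-agent.your-domain.com 203.0.113.10
```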

Create Cloud Armor Policy

Best Practice

We strongly recommend creating a Cloud Armor policy to limit the incoming traffic to only the DataGrail IP address.

  1. In the Google Cloud console, navigate to Cloud Armor and click Create policy.
  2. Under Configure policy, give the policy a Name and a Description.
  3. Under Policy type, select Backend security policy.
  4. Under Scope, select Global and click Next.
  5. Under Add more rules, click New rule.
    1. Enter a description of the rule (e.g. "Limit the incoming traffic to only the DataGrail IP address.").
    2. Under Condition > Mode, select Basic mode (IP addresses/ranges only).
    3. Under Match, enter 52.36.177.91.
    4. In the Action dropdown list, select Allow.
    5. Under Priority, enter 0.
    6. Click Next step.
  6. In Apply policy to targets > Targets, click Add target.
  7. Under Backend service target 1, select the load balancer backend created in the Backend Configuration section.
  8. Skip Advanced configurations and click Create policy.

The policy can take some time to attach, so wait a few minutes before confirming that you are no longer able to access the Cloud Run service from your browser.

Need help?
If you have any questions, please reach out to your dedicated CSM or contact us at support@datagrail.io.
