GCP - SaaS Runtime
Service Accounts
- SaaS Runtime Service Agent:
service-PROJECT_NUMBER@gcp-sa-saasservicemgmt.iam.gserviceaccount.com
- Owner: Google managed.
- Creation: Automatically provisioned when you create your first SaaS Runtime resource.
- Usage: lets SaaS Runtime perform actions in your project.
- Required access: SaaS Runtime needs access to your Artifact Registry, Cloud Storage, Infrastructure Manager, Cloud Build, etc.
- Actuation service account:
[email protected]
- Owner: user.
- Creation: created by the user, in the user's producer project or in tenant projects.
- Usage: SaaS Runtime (via Infra Manager) uses this service account to execute your Terraform configurations. In other words, it is how SaaS Runtime performs actions in tenant projects.
- Required permissions:
- roles/iam.serviceAccountTokenCreator: Allows the service account to generate tokens for authentication.
- roles/config.admin: Grants full control over Infra Manager resources.
- roles/storage.admin: Grants full control of Cloud Storage.
- Plus permissions to create and manage the specific Google Cloud resources used in the Terraform config.
- Note: Google recommends creating a dedicated actuation service account for each tenant (unit) to isolate and limit the severity of potential issues.
- Artifact creation service account:
- Optional: If you manually build and upload your Terraform artifacts, you don't need a separate artifact creation service account. If you upload the artifacts through SaaS Runtime, it needs permission to push the uploaded artifacts to Artifact Registry.
- Owner: user.
- Usage: to package and upload your Terraform artifacts to Artifact Registry.
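As a rough illustration of the accounts above, the sketch below builds the service agent address from a project number and expresses the three listed actuation roles as IAM-policy-style bindings. The helper names and the example account address are hypothetical and not part of any Google SDK; the resource-specific permissions your Terraform config needs are not covered here.

```python
from __future__ import annotations

# Illustrative only: these helpers are hypothetical, not a Google API.

# The three roles the notes list for the actuation service account.
ACTUATION_ROLES = [
    "roles/iam.serviceAccountTokenCreator",
    "roles/config.admin",
    "roles/storage.admin",
]


def service_agent_email(project_number: str) -> str:
    """Build the SaaS Runtime service agent address for a project number."""
    return f"service-{project_number}@gcp-sa-saasservicemgmt.iam.gserviceaccount.com"


def actuation_bindings(actuation_sa_email: str) -> list[dict]:
    """Return one IAM-policy-style binding per listed role for the given account."""
    member = f"serviceAccount:{actuation_sa_email}"
    return [{"role": role, "members": [member]} for role in ACTUATION_ROLES]


if __name__ == "__main__":
    print(service_agent_email("123456789012"))
    # The account address below is a made-up placeholder.
    for binding in actuation_bindings("actuation@example-project.iam.gserviceaccount.com"):
        print(binding)
```

Keeping the role list in one place makes it easier to review what each per-tenant actuation account is granted.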
Concept Relationships
- A SaaS offering is the top-level umbrella term for the service you are offering.
- A unit kind is a component of your service; it can be a Kubernetes cluster, an application, etc. One important use of a unit kind is to manage dependencies. Variable mapping is the mechanism for passing data (variable values) between dependent units and their dependencies.
- A unit is one instantiation of a unit kind. In the multi-single tenant model, you should have one unit for each tenant.
- A blueprint is a Terraform configuration packaged as an Open Container Initiative (OCI) image, stored in Artifact Registry.
- A release points to a blueprint for a unit kind. Basically you pick the desired version of the blueprint.
- For a unit, you can pick the release when you click "Provision".
- A rollout is the process of updating units with a new release. A rollout process follows the rules defined in the rollout kind. Rollouts are used in Day 2 operations like upgrades, or applying a security patch.
- Rollouts target units based on their UnitKind and can optionally apply filters (unit_filter) to target a specific subset of units using the Google Cloud CLI.
- A rollout kind describes how to deploy new releases to units.
- A tenant represents a dedicated instance of the SaaS offering. It acts as a container for all the units (containing applications, databases, and infrastructure components) that you provision and manage.
- While a tenant is just a concept in SaaS Runtime, a tenant project is a real GCP project that hosts the actual GCP resources, like GCE instances, GCS buckets, etc. The project you use for your SaaS offering is called the producer project; your SaaS customers (i.e. tenants) have their own consumer projects.
- Under VPC-SC, the consumer project and the corresponding tenant project are both within the security perimeter.
- A provisional unit kind is a unit kind used to automate the tenant setup process with SaaS Runtime; without it, you need to create and configure the tenant projects before using SaaS Runtime.
- Feature flags toggle the state or other binary behaviors of a feature. Feature flags allow you to change feature availability or feature behavior without redeploying or restarting the application.
- A flag rollout is a rollout for feature flags.
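The relationships above can be sketched as a small in-memory model. This is only an illustration of how units, releases, and rollouts relate; the class and field names are hypothetical, and the filter here is a plain Python callable standing in for the real unit_filter, whose actual syntax is not shown in these notes.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Release:
    unit_kind: str   # the unit kind this release belongs to
    blueprint: str   # the blueprint version (an OCI image reference) it points to


@dataclass
class Unit:
    name: str
    unit_kind: str   # a unit is one instantiation of a unit kind
    tenant: str      # the tenant that contains this unit
    labels: dict[str, str] = field(default_factory=dict)
    release: Optional[Release] = None


def rollout(
    units: list[Unit],
    release: Release,
    unit_filter: Optional[Callable[[Unit], bool]] = None,
) -> list[str]:
    """Move matching units to the release and return the names of updated units."""
    updated = []
    for unit in units:
        if unit.unit_kind != release.unit_kind:
            continue  # rollouts target units by unit kind
        if unit_filter is not None and not unit_filter(unit):
            continue  # the optional filter narrows the target set
        unit.release = release
        updated.append(unit.name)
    return updated


if __name__ == "__main__":
    units = [
        Unit("app-a", "app", "tenant-a", {"tier": "canary"}),
        Unit("app-b", "app", "tenant-b"),
        Unit("db-a", "database", "tenant-a"),
    ]
    v2 = Release("app", "app-blueprint:v2")
    print(rollout(units, v2, lambda u: u.labels.get("tier") == "canary"))
    print(rollout(units, v2))
```

In this toy model the first call updates only the canary unit, and the second call updates every unit of the "app" kind while leaving the database unit untouched.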
For different SaaS architectures
SaaS Runtime, as the name suggests, only covers the "Runtime" part. It can be used for single-tenant, multi-tenant, or multi-single tenant architectures (check out the differences among these three); which one you get depends on how you create and configure the tenant projects.
Why SaaS Runtime?
SaaS Runtime adds another layer of abstraction:
- Say you build an awesome piece of software, but you need the world to know about it.
- You package the code as a container image, but you still need to run it manually, let alone scale it up.
- You run the container in a Kubernetes cluster, but you still need to create the cluster manually.
- You write a Terraform configuration to create and configure the cluster and use Infrastructure Manager to deploy it in GCP, but what if you are so successful that multiple customers each want their own instance of the cluster and the application?
- Now you can use something like SaaS Runtime to automate the management of multiple instances for multiple tenants.
Under the hood
SaaS Runtime relies on other tools and GCP services:
- Terraform and the hashicorp/google provider.
- Infrastructure Manager: used to provision resources in tenant projects using Terraform.
- Artifact Registry: used to store the Terraform configurations (packaged as OCI images).
Infrastructure Manager
Terraform commands run in an ephemeral Cloud Build environment
The runtime environment of Infra Manager is an ephemeral Cloud Build environment. Infra Manager executes Terraform commands in this Cloud Build environment, and then the environment is discarded.
Cloud Build provides an ephemeral and isolated execution environment. When a Terraform job runs, Cloud Build spins up a fresh, dedicated container. As soon as the job is complete, the environment is discarded. This "one-and-done" approach means there's no persistent state, no lingering credentials, and no shared resources between different Terraform runs. This is a critical security advantage, as it minimizes the attack surface and reduces the risk of a compromised environment affecting subsequent deployments.
In contrast, if you were to run this on a long-lived GCE instance or Cloud Run, you would have to manage the security of that environment yourself. This would include ensuring the instance is patched, the network is secure, and that the credentials are properly rotated and protected. Cloud Build handles all of this automatically.
When you trigger a deployment with Infrastructure Manager, it automatically:
- Uses a Google-maintained container image with Terraform pre-installed.
- Handles the entire workflow from fetching the code to applying the changes.
- Stores the logs, state file, and other metadata in a managed Cloud Storage bucket.
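The "one-and-done" lifecycle described above can be mimicked locally as an analogy. The sketch below does not call Cloud Build, Infrastructure Manager, or Terraform; it uses a temporary directory as the throwaway workspace and a dictionary as a stand-in for the managed Cloud Storage bucket. All names are hypothetical.

```python
from __future__ import annotations

import tempfile
from pathlib import Path

# Stand-in for the managed bucket: the only thing that outlives a run.
durable_bucket: dict[str, str] = {}


def run_ephemeral_job(deployment: str, config_source: str) -> Path:
    """Run one job in a throwaway workspace and return the workspace path."""
    with tempfile.TemporaryDirectory() as workspace:
        workdir = Path(workspace)
        # 1. Fetch the configuration into the fresh workspace.
        (workdir / "main.tf").write_text(config_source)
        # 2. "Apply" it. A real run would invoke Terraform; this only builds a log line.
        log = f"applied {len(config_source)} characters of configuration for {deployment}"
        # 3. Persist logs and state outside the workspace before it disappears.
        durable_bucket[f"{deployment}/logs"] = log
        durable_bucket[f"{deployment}/state"] = "placeholder-state"
    # Leaving the with-block deletes the workspace and everything in it.
    return workdir


if __name__ == "__main__":
    old_workspace = run_ephemeral_job("demo", 'resource "example" "x" {}')
    print(old_workspace.exists())   # the workspace is gone after the run
    print(sorted(durable_bucket))   # only the persisted keys remain
```

The point of the analogy is that nothing written inside the workspace survives unless it is explicitly copied to durable storage, which is why state and logs live in a bucket rather than in the build environment.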