The MediaWiki project publishes a base container image that we modify through a Dockerfile and shell scripts, repackaging it into a new image. This image is uploaded to Artifact Registry, which holds our container images privately and lets GCP services pull them. In this case we are using Cloud Run, a service that lets us run a single container without provisioning servers. The container itself is stateless and relies on a MySQL-compatible relational database to persist user data and page content. Configuration files and persistent binary files live in Google Cloud Storage buckets, which are connected to the container using the gcsFUSE connector. Secrets are injected as environment variables to prevent key exposure if files are compromised.
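The build-push-deploy flow above can be sketched with a few commands. This is a minimal sketch, not my exact pipeline; the project ID, repository, service, and secret names (`my-project`, `wiki-repo`, `wiki`, `wiki-db-password`, `WG_DB_PASSWORD`) are all placeholders.

```shell
# Build the customized image from our Dockerfile (names are placeholders).
docker build -t us-central1-docker.pkg.dev/my-project/wiki-repo/mediawiki:latest .

# Push it to a private Artifact Registry repository.
docker push us-central1-docker.pkg.dev/my-project/wiki-repo/mediawiki:latest

# Deploy to Cloud Run, injecting a Secret Manager secret as an
# environment variable instead of baking it into a config file.
gcloud run deploy wiki \
  --image=us-central1-docker.pkg.dev/my-project/wiki-repo/mediawiki:latest \
  --region=us-central1 \
  --set-secrets=WG_DB_PASSWORD=wiki-db-password:latest
```

Because the secret is resolved at deploy time from Secret Manager, a leaked copy of the image or its config files does not leak the credential itself.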
Google Cloud was the first, and then the fourth, place I implemented the wiki project. I initially set it up with no persistent storage and hard-coded configuration. After I had gone through Azure and AWS I came back to implement some of the things I had learned along the way. More importantly, I came back to GCP because it offers something the others do not: the ability to scale to zero and cold start my service on request. This means I can set up a service and it will idle down to nothing, accumulating no costs apart from the database unless people look at it. That is incredibly attractive for personal projects and anything with inconsistent traffic, as long as you are willing to accept the 10-15 second loading time from idle.
Bring lessons back
Just as I had used what I learned setting the project up in Azure to improve my architecture when setting the wikis up in AWS, here I came back with a better understanding of what is possible and made some small improvements: using service accounts to grant permissions, and setting up SQL connections so traffic does not route over the public internet.
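The SQL connection improvement looks roughly like this. It is a sketch with placeholder names (`wiki`, `my-project`, `wiki-db`, `wiki-runner`): attaching a Cloud SQL instance to the Cloud Run service makes the database reachable through the Cloud SQL connector over Google's network rather than a public IP.

```shell
# Attach the Cloud SQL instance to the Cloud Run service so connections
# go through the Cloud SQL connector (a Unix socket under /cloudsql/...)
# instead of routing over the public internet. Names are placeholders.
gcloud run services update wiki \
  --region=us-central1 \
  --add-cloudsql-instances=my-project:us-central1:wiki-db \
  --service-account=wiki-runner@my-project.iam.gserviceaccount.com
```

The attached service account still needs the Cloud SQL Client role for the connection to be authorized.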
Mount Storage Buckets with gcsFUSE
Google developed gcsFUSE, a Linux package for mounting Cloud Storage buckets as filesystems in user space. This allows us to mount storage buckets as volumes for holding configuration files and reading/writing persistent binary files.
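On Cloud Run this can be done declaratively with volume flags, which use Cloud Storage FUSE under the hood. A sketch, assuming a service named `wiki` and a bucket named `wiki-config-bucket` (both placeholders):

```shell
# Define a Cloud Storage-backed volume and mount it into the container.
# Cloud Run handles the gcsFUSE mount itself; the bucket appears as a
# directory at the given mount path. All names here are placeholders.
gcloud run services update wiki \
  --region=us-central1 \
  --add-volume=name=wiki-config,type=cloud-storage,bucket=wiki-config-bucket \
  --add-volume-mount=volume=wiki-config,mount-path=/var/www/html/config
```

Inside the container the same effect can be achieved manually with `gcsfuse wiki-config-bucket /var/www/html/config`, but letting Cloud Run manage the mount keeps the image simpler.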
IAM Service Accounts
Service accounts allow Cloud Run to access the storage buckets and the database. GCP has a very strict zero-trust policy, so service accounts are attached to instances in the same way that task roles are attached to containers in AWS, granting them permissions for other resources.
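Setting this up amounts to creating a dedicated service account and binding only the roles the service needs. A sketch with placeholder names (`wiki-runner`, `my-project`):

```shell
# Create a dedicated service account for the Cloud Run service.
gcloud iam service-accounts create wiki-runner \
  --display-name="Wiki Cloud Run runtime"

# Grant only the roles the service needs: object access on storage
# buckets and client access to Cloud SQL.
gcloud projects add-iam-policy-binding my-project \
  --member="serviceAccount:wiki-runner@my-project.iam.gserviceaccount.com" \
  --role="roles/storage.objectAdmin"

gcloud projects add-iam-policy-binding my-project \
  --member="serviceAccount:wiki-runner@my-project.iam.gserviceaccount.com" \
  --role="roles/cloudsql.client"
```

Scoping the bucket role to the specific bucket (with `gcloud storage buckets add-iam-policy-binding`) rather than the whole project would tighten this further.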
Autoscaling and cold starts
One of the most useful features of Cloud Run is that it lets you set some very basic autoscaling based on the number of requests, and it lets you scale all the way down to zero. This means it will stop your container until a request comes in, then automatically start the container and connect your user. This can take 10 to 15 seconds, or more if you have a complex startup, but it does mean you will not be charged for running a container when no one is looking at it.
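The scaling behavior described above is controlled by two flags. A sketch, again assuming a service named `wiki`:

```shell
# Allow the service to idle down to zero instances and cap how far it
# can scale out. With min-instances=0 you pay nothing for the container
# while no requests arrive, at the cost of a cold start on the first hit.
gcloud run services update wiki \
  --region=us-central1 \
  --min-instances=0 \
  --max-instances=2
```

Setting `--min-instances=1` instead would eliminate the cold start, but then the container bills continuously.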
Retrospective
Google Cloud was my first stop in my cloud journey, but it is also the service I know the least. At this point I don't have any GCP certifications; I would like to get them eventually, but for now my focus is on learning more about security and on building projects. I still feel like GCP has a lot to offer in terms of features if you know how to use them. It is supposed to have a mature set of machine learning tools, and services like Cloud Run are backed by Google's own Kubernetes engine, which I believe is how they can offer things like scaling to zero. It also has some of the lowest prices for its cloud services, making it a very attractive option if you know how to use it.
I would like to revisit this and see if there could be benefits to moving to something like the full Google Kubernetes Engine for something more robust. I would also like to better understand the GCP authentication options and permission structures. While Azure strives to be business-user friendly and AWS has more documentation and guides than you could ever hope to read, GCP documentation can be a little difficult to decipher if you don't already know what you are doing.