The MediaWiki project publishes an official base container image that we modify through a
Dockerfile and shell scripts and repackage as a new image. This image is uploaded to an
Azure Container Registry, which holds our container images privately and allows Azure
container services to pull them. In this case we are using the Azure Container Instances
service, which lets us run a single container or a container group without having to
provision a service plan. The container itself is stateless and relies on a MySQL database
to persist user data and page content. For configuration options and binary file storage,
an Azure storage account File share is created and mounted to the container at boot.
Secrets are supplied as environment variables so that keys are not exposed if configuration
files are compromised.
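To sketch the image customization step, a Dockerfile along the following lines layers our
changes on top of the official image. This is a minimal illustration rather than the exact
file used here; the base tag, file names, and paths are assumptions.

    # Start from the official MediaWiki image (tag is an assumption)
    FROM mediawiki:1.39

    # Bake in our customized configuration and setup script
    COPY LocalSettings.php /var/www/html/LocalSettings.php
    COPY setup.sh /usr/local/bin/setup.sh

    # Apply our modifications as part of the build
    RUN chmod +x /usr/local/bin/setup.sh && /usr/local/bin/setup.sh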
This was the second time I had set up MediaWiki in the cloud. The first time was on Google
Cloud Run, but I had not yet implemented any permanent file storage. I used GCP to learn the
basics and switched over to Azure when I saw the potential business use cases.
Nothing comes ready to deploy
An important lesson learned was that while you can get a ready-made base image from the
MediaWiki project, it may still require some troubleshooting depending on your
implementation. The issues I ran into on Azure were not the same ones I had hit on GCP,
which may have been due to features I had not fully implemented as well as differences in
the underlying infrastructure each provider uses.
Volumes only mountable from ARM template
The Azure portal does not allow you to mount volumes to containers in the GUI. To attach
volumes to your container you must build an Azure Resource Manager (ARM) template, a JSON
document that describes your service and that ARM interprets to set up your resources.
Knowing how to write an ARM template becomes a valuable skill when diving deeper into
Azure, but when you are just starting it is a steep climb.
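The relevant excerpt of the container group resource looks roughly like this; the share,
image, and mount-path names are placeholders, and the storage key would normally come in
through a template parameter rather than being hard coded:

    "containers": [
      {
        "name": "mediawiki",
        "properties": {
          "image": "myregistry.azurecr.io/mediawiki:latest",
          "volumeMounts": [
            { "name": "wikishare", "mountPath": "/mnt/wikishare" }
          ]
        }
      }
    ],
    "volumes": [
      {
        "name": "wikishare",
        "azureFile": {
          "shareName": "wikifiles",
          "storageAccountName": "[parameters('storageAccountName')]",
          "storageAccountKey": "[parameters('storageAccountKey')]"
        }
      }
    ]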
Apache web server and network drives
When you mount the file share directly to the location where Apache reads and writes files,
the web server does not have the necessary permissions or ownership. To get around this I
set up symbolic links, which mirror files at different locations in the filesystem while
each location can carry its own permissions and ownership.
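In practice this was a couple of lines in the container's startup script, roughly as
follows; the mount point and the upload directory path are assumptions based on the
official image's layout:

    #!/bin/sh
    # Azure File share mount point, as declared in the ARM template
    SHARE=/mnt/wikishare

    # Point MediaWiki's upload directory at the share via a symlink
    # instead of mounting the share onto the web root directly
    rm -rf /var/www/html/images
    ln -s "$SHARE/images" /var/www/html/images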
Apache web server and network files
There was a strange issue when MediaWiki attempted to serve images stored in the file
share. It generated errors that ultimately traced back to Apache's sendfile and
memory-mapping optimizations, which are known to misbehave on network filesystems and had
to be disabled. The directives that disable them became part of the initial image
modification and were carried forward to my other MediaWiki projects.
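Assuming the Debian-based official image, where Apache's main configuration lives at
/etc/apache2/apache2.conf, the fix can be baked into the Dockerfile with something like:

    # Disable sendfile and memory mapping so Apache serves files
    # from the CIFS-mounted share correctly
    RUN printf 'EnableSendfile Off\nEnableMMAP Off\n' \
        >> /etc/apache2/apache2.conf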
Secure Transport to MySQL
Since the MySQL database is managed by Azure, several of the options come preset. One of
these is the require_secure_transport server parameter, which defaults to ON, meaning
connections must be made over SSL. During this implementation I manually set the parameter
to OFF for ease of integration.
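With the Azure CLI this is a one-liner; the resource names below are placeholders, and this
assumes the flexible server flavor of Azure Database for MySQL:

    # Turn off the SSL requirement on the managed MySQL server
    az mysql flexible-server parameter set \
        --resource-group wiki-rg \
        --server-name wiki-mysql \
        --name require_secure_transport \
        --value OFF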
Secure string environment variables
To keep things like database credentials out of the image and configuration files, you can
have the system read them from environment variables. While this is safer, there is still a
concern: anyone who gained access to your Azure portal could see the plain-text credentials
in the variable definitions. You can use secure strings to tell Azure not to display them
in the web portal or include them in a template export.
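In the container's ARM template, marking a variable as secure is just a matter of using
secureValue instead of value; the variable names here are illustrative:

    "environmentVariables": [
      { "name": "MW_DB_USER", "value": "wikiadmin" },
      {
        "name": "MW_DB_PASSWORD",
        "secureValue": "[parameters('dbPassword')]"
      }
    ]

LocalSettings.php can then pick these up at runtime with PHP's getenv(), for example
$wgDBpassword = getenv('MW_DB_PASSWORD');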
Retrospective
This was my second cloud project after setting the same system up on Google Cloud. At this
point I did not have any cloud certifications and was using this as a learning project,
forcing myself to dig into documentation and resources to learn how to set up containers
and databases on Azure. Looking back at it now, I see that it is not well architected: it
lacks defense in depth, it does not scale, and it is not resilient or highly available.
If I were to revisit this, I would make several adjustments, such as switching from the
Container Instances service to a containerized Web App service. The container web app
abstracts away some of the management and puts it in Microsoft's hands. It would allow us
to put the wiki behind Active Directory authentication, add a load balancer in front of the
container, and let the underlying service plan scale the number of containers with demand.
There are also the added benefits of deployment slots and better database integration that
would not route traffic out of our virtual network. I would also investigate setting up
proper private and public subnets and endpoints, ensure that the database and the file
store create geographically dispersed snapshots, and look for ways to move secrets and
credentials into a Key Vault instead of environment variables.
Overall, for a first project on Azure I think this went well and served as a great
jumping-off point. The push to learn ARM syntax and start with a container-and-database
architecture made pursuing the Azure certifications easier, since I already had experience
navigating some of these areas.