Self-Hosted macOS CI on Apple Silicon with Cilicon

Marco Cancellieri
Trade Republic Engineering
Jan 11, 2023

Update: We’ve released a new major version of Cilicon so parts of this article are outdated. Here’s what’s new in Cilicon 2.0:

  • While Cilicon 1.0 relied on a user-defined Login Item script in the VM, its new version now includes an SSH client and directly executes commands on the VM.
  • Cilicon has partially adopted the tart image format and can automatically convert 1.0 images to it.
  • The integrated OCI client can download pre-built CI images that have been created with/for tart. We recommend their macos-ventura-xcode images.

TL;DR: We released a new macOS app called Cilicon, which provisions and runs ephemeral virtual machines for CI. Using it, we were able to switch to self-hosted Actions Runners and speed up our CI by 3x while giving some of our damaged M1 MacBook Pro devices a second life.

Our Challenges with CI on macOS

When we started Trade Republic, our mobile CI consisted of a single 2013 Mac Pro running a Buildkite Agent. Its performance was satisfactory, yet it quickly became clear that manually maintaining more than one machine would not be a viable option in the long run. In 2020, as the team grew and the COVID-19 lockdowns started, we began keeping an eye out for alternatives.

Although it meant taking a hit performance-wise, we decided to switch to GitHub-hosted runners. This allowed us to run more jobs in parallel and not worry about maintenance at all.

At the time, our iOS workflow only took around 10 minutes to run (when using cached dependencies), but this quickly increased to around 30 minutes as the team and the codebase grew.

Given the 10x pricing multiplier for macOS runner minutes as well as the adoption of GitHub Actions in other repos, we rapidly started exceeding the 50,000 monthly free minutes included in our enterprise plan. Additionally, our teams grew increasingly frustrated with the performance of GitHub-hosted runners, so we had been on the lookout for alternatives for a while.
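
As a rough back-of-the-envelope check using the figures above, and assuming purely for illustration that every minute of a 30-minute workflow run is billed at the macOS rate, the free allowance can be estimated like this:

```shell
#!/usr/bin/env bash
# Back-of-the-envelope estimate; all inputs are the figures quoted in the text.
FREE_MINUTES=50000      # monthly free minutes included in the plan
MACOS_MULTIPLIER=10     # macOS runner minutes are counted 10x
WORKFLOW_MINUTES=30     # approximate duration of one iOS workflow run

# Billed minutes consumed by a single run (assumption: the whole run is macOS time)
BILLED_PER_RUN=$(( WORKFLOW_MINUTES * MACOS_MULTIPLIER ))
# Whole runs that fit into the free allowance (integer division)
RUNS_PER_MONTH=$(( FREE_MINUTES / BILLED_PER_RUN ))

echo "Billed minutes per run: $BILLED_PER_RUN"
echo "Runs covered per month: $RUNS_PER_MONTH"
```

Under these assumptions the allowance covers roughly 166 runs per month, before any usage by other repositories is counted.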

Given our initial experience with manually maintaining a Buildkite Agent and the effort involved, self-hosting did not seem like a viable option.

One day, however, I happened to stumble upon Apple’s Virtualization Framework. Out of curiosity, I downloaded the sample code and gave it a run. Much to my surprise, the code was very simple and easy to understand. The potential of being able to create and run virtual machines with code quickly became clear. Around the same time, I was made aware that we had a handful of M1 MacBook Pros in stock that were either broken or in too poor a condition to hand out to employees. Self-hosting our CI suddenly became more realistic and the idea for Cilicon (CI + Silicon) was born.

Introducing Cilicon

The concept behind Cilicon boils down to a simple cycle:

Duplicate Image

Cilicon creates a clone of your Virtual Machine bundle (folder) for each run. Thanks to a nifty cloning feature in APFS this is extremely fast, even with large bundles.
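
To get a feel for APFS cloning outside of Cilicon, the same effect can be tried from a terminal. The following is a minimal sketch and not Cilicon's own code: it relies on the -c flag of the macOS cp command, which requests clonefile(2), and falls back to a regular copy on other systems. The function name clone_bundle and the file disk.img are placeholders chosen for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# clone_bundle SRC DST: copy a VM bundle directory, using copy-on-write clones where possible.
clone_bundle() {
  local src="$1" dst="$2"
  if [[ "$(uname -s)" == "Darwin" ]]; then
    # macOS cp: -c requests clonefile(2), so file data is shared on APFS until modified.
    cp -c -R "$src" "$dst"
  else
    # Elsewhere (e.g. when trying this out on Linux) fall back to a regular recursive copy.
    cp -R "$src" "$dst"
  fi
}

# Demo with a throwaway directory standing in for a real bundle.
WORK_DIR="$(mktemp -d)"
mkdir -p "$WORK_DIR/VM.bundle/Resources"
echo "disk image placeholder" > "$WORK_DIR/VM.bundle/disk.img"

clone_bundle "$WORK_DIR/VM.bundle" "$WORK_DIR/VM-run.bundle"
ls "$WORK_DIR/VM-run.bundle"
```

Because a clone shares its data blocks with the original until one of them is modified, the time it takes depends far less on the size of the bundle than a regular copy would.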

Provision Shared Folder

Depending on the provisioner you choose, Cilicon places files required by your Guest OS in your bundle’s Resources folder.

The GitHub Actions Provisioner provisions the image with the runner download URL, registration token, name, and labels.

The Process Provisioner runs an executable of your choice when provisioning and de-provisioning a bundle.

You may also opt out of using a provisioner by setting the provisioner type to none. This may work fine with services like Buildkite which use non-expiring registration tokens.

Start Virtual Machine

Cilicon starts the virtual machine and automatically mounts the bundle’s Resources folder on the guest OS.

Listen for Shutdown

Cilicon listens for a shutdown of the Guest OS and removes the used image before starting over.
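
The four steps above can be summarized as a loop. The sketch below only illustrates the control flow: every function in it is a hypothetical stand-in that prints what the real step would do, and none of the names are part of Cilicon.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-ins for the four steps; they only log, they do not touch any VM.
duplicate_image()   { echo "duplicate image for run $1"; }
provision()         { echo "provision shared folder for run $1"; }
start_vm()          { echo "start VM for run $1"; }
wait_for_shutdown() { echo "guest shut down, remove clone of run $1"; }

# run_cycles N: go through the cycle N times (the real app keeps looping).
run_cycles() {
  local total="$1" run
  for (( run = 1; run <= total; run++ )); do
    duplicate_image "$run"
    provision "$run"
    start_vm "$run"
    wait_for_shutdown "$run"
  done
}

CYCLE_LOG="$(run_cycles 2)"
echo "$CYCLE_LOG"
```

Since every run starts from a fresh clone and the used clone is removed afterwards, no state leaks from one job into the next.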

Cilicon Cycle: Running a sample job via GitHub Actions (2x playback)

To create VM Bundles, Cilicon comes with its own standalone app called “Cilicon Installer”. Once created, all that’s left to do is to start the VM in editor mode, install any needed dependencies and add start.command from the shared Resources folder as a Login Item.

Our Experience so far

After a two-week trial period in November, during which we ran our jobs on self-hosted and GitHub-hosted runners simultaneously, we felt confident enough to make the switch. Ever since, we’ve been using our fleet of eight M1 MacBook Pros exclusively and without a single hiccup, enjoying 3x faster runs. Most of the MacBooks we used were considered undeployable to employees as they had display, keyboard or cosmetic defects. While we initially planned to switch to M1 Mac Minis after the test period, the MacBooks have been doing such a great job that we will keep using them for now.

Test setup using 8 M1 MacBook Pro devices. Many of them were marked as broken as they had display and keyboard issues.
Left: Our initial setup consisting of 8 M1 MacBook Pros | Right: GitHub Actions Runner overview

Falling Back to GitHub-Hosted Runners

Self-hosting always comes with extra risks. While we have power and network redundancy in our office server rooms, we wanted to add the ability to easily fall back to GitHub-hosted runners in case we couldn’t reach our own.

To do so, we added a new job to our workflow, which checks whether the PR contains a label named “Run on GitHub Hosted Runner” and chooses the runs-on label for the subsequent job accordingly. Because we did not want to add a labeled trigger to our workflow, we needed to fetch the labels via the GitHub API rather than extract them from the provided context variables: re-runs receive an exact snapshot of the data from the initial run, so labels added afterwards would not be visible there. As GitHub-hosted runners run on x86, you may also want to include the runner architecture in your cache keys by using ${{ runner.arch }}.

jobs:
  runner_type_job:
    name: Runner Type Selection
    runs-on: ubuntu-latest
    env:
      GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
    outputs:
      runner_type: ${{ steps.set_runner_type.outputs.runner_type }}
    steps:
      - id: set_runner_type
        run: |
          # Fetch fresh PR labels using the GH CLI; needed when a workflow is re-run.
          GH_PR_URL="https://github.com/${{ github.repository }}/pull/${{ github.event.number }}"
          # @sh emits the label names as single-quoted words; tr strips the quotes again.
          GH_PR_LABELS=$(gh pr view "$GH_PR_URL" --json=labels --jq='.labels | map(.name) | @sh' | tr -d \')
          echo "Labels applied: $GH_PR_LABELS"
          RUN_ON_GH_HOSTED_RUNNER="Run on Github Hosted Runner"
          # Pad both sides with spaces so only a complete label matches.
          if [[ ! " ${GH_PR_LABELS[*]} " =~ " ${RUN_ON_GH_HOSTED_RUNNER} " ]]; then
            echo 'runner_type=["self-hosted", "macos-13", "ARM64", "xcode-14.1"]' >> "$GITHUB_OUTPUT"
          else
            echo 'runner_type="macos-12"' >> "$GITHUB_OUTPUT"
          fi
  test:
    name: Unit Tests
    needs: runner_type_job
    runs-on: ${{ fromJSON(needs.runner_type_job.outputs.runner_type) }}
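
The label check from the workflow step can also be tried locally without GitHub. The sketch below pulls the decision into a function, select_runner_type, which is a name introduced here for illustration; the label strings passed in are made-up examples of what the gh call would return after the quotes are stripped.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Label that opts a PR out of the self-hosted fleet (spelled as in the workflow script).
RUN_ON_GH_HOSTED_RUNNER="Run on Github Hosted Runner"

# select_runner_type LABELS: print the JSON value that the workflow passes to runs-on.
# LABELS is the space-separated label string; it is padded with spaces before matching,
# just like in the workflow step.
select_runner_type() {
  local labels="$1"
  if [[ " ${labels} " =~ " ${RUN_ON_GH_HOSTED_RUNNER} " ]]; then
    echo '"macos-12"'
  else
    echo '["self-hosted", "macos-13", "ARM64", "xcode-14.1"]'
  fi
}

select_runner_type "bug Run on Github Hosted Runner"
select_runner_type "bug enhancement"
```

The match is case-sensitive, so the label on the PR has to be spelled exactly like the string in the script. Both results are valid JSON, a string in one case and an array in the other, which is why the test job can pass either through fromJSON.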

Maintenance

To keep maintenance effort at a minimum, Cilicon comes with a few tricks up its sleeve.

As Cilicon does not support provisioning images via network, transferring the image via SSD is currently the recommended way to go. To eliminate any interaction with the OS (especially since some of the devices we use have broken displays), Cilicon can be configured to scan for an attached volume with a specific name and copy the VM.bundle over automatically. The start and end of the copy process are accompanied by system sounds, removing the need to open the lid or interact with the keyboard.

It also doesn’t hurt to restart the devices every once in a while, so Cilicon can be configured to restart the host machine after a set number of runs.
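
The idea behind the volume scan can be illustrated with a few lines of shell. This is a simplified sketch and not how Cilicon implements it: the function name, the volume name CI-Images and the file disk.img are hypothetical, and the echo lines stand in for the system sounds. On a real Mac the volumes root would be /Volumes; the demo uses temporary directories instead.

```shell
#!/usr/bin/env bash
set -euo pipefail

# copy_bundle_from_volume VOLUMES_ROOT VOLUME_NAME DEST_DIR
# Copies VM.bundle from the named volume, if it is attached, into DEST_DIR.
# Returns 1 when the volume or the bundle is not present.
copy_bundle_from_volume() {
  local volumes_root="$1" volume_name="$2" dest_dir="$3"
  local source_bundle="$volumes_root/$volume_name/VM.bundle"
  [[ -d "$source_bundle" ]] || return 1
  echo "copy started"      # stand-in for the start sound
  rm -rf "$dest_dir/VM.bundle"
  cp -R "$source_bundle" "$dest_dir/"
  echo "copy finished"     # stand-in for the end sound
}

# Demo with temporary directories in place of real volumes.
FAKE_VOLUMES="$(mktemp -d)"
DEST_DIR="$(mktemp -d)"
mkdir -p "$FAKE_VOLUMES/CI-Images/VM.bundle"
echo "placeholder" > "$FAKE_VOLUMES/CI-Images/VM.bundle/disk.img"

copy_bundle_from_volume "$FAKE_VOLUMES" "CI-Images" "$DEST_DIR"
copy_bundle_from_volume "$FAKE_VOLUMES" "Other-Disk" "$DEST_DIR" || echo "volume not attached"
```

Matching on a fixed volume name means an unrelated drive that happens to be plugged in is simply ignored.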

Conclusion

We’re hoping to see more companies and individuals use Cilicon to reduce cost and speed up their CI. While self-hosting hardware may not be an option for many, there are a few hosting providers available. OakHost is offering affordable M2 Mac Mini hosting starting at around 85€/month.

Contributions to the project are also highly appreciated!
