This article is the sum of my findings in the scope of an issue that’s still unresolved at this moment (April 2020). In short, when node_modules becomes large, Gitlab hits a huge performance bottleneck because of the great number of files it needs to archive and upload during caching. I have come up with a few ways to improve caching performance.
At RingCentral we use runners and a local S3 cache server. We have 120K files after installing the dependencies of one of our monorepos, with an overall size of about 400–600 megabytes. A monorepo is a git repository that holds multiple packages at once; packages can have their own dependencies and are usually linked together. This is the worst-case scenario, since the number of Node modules can become insanely big. But this article can help you deal with regular repositories too, if they have lots of dependencies.
Cache download: about 1 minute
Cache create: 4 minutes (zipping hundreds of thousands of files)
Cache upload: 1 minute
Bare Yarn install: 3 minutes
Yarn install on top of cache: 1 minute
This is our baseline. Now let’s analyze how we can speed things up.
Solution 1: Create cache in first job, run others with read-only cache
Install, cache create & upload: 3+4+1 = 7 min
Download, install, skip upload: 1+1 = 2 min
In the best-case scenario, without cache upload, this actually speeds things up a lot. It’s OK to waste time in ONE job to create and upload the cache if you have many other jobs that only download it.
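Given that cache create + upload takes 5 min, and Yarn install with cache saves us (3 min pure install - 1 min cached install) = 2 min, at least three jobs (5 / 2 = 2.5, rounded up) must successfully download the cache and use it in Yarn install to justify the time spent on uploading. If you also count the roughly 1 min cache download in each job, the saving shrinks to about 1 min per job and the break-even rises to about five jobs. The more jobs use the cache, the better.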
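As a rough sketch of this setup, the .gitlab-ci.yml fragment below lets a single install job write the cache while every other job only reads it. The job names, stage names, cache key, the packages/*/node_modules/ path and the yarn lint script are assumptions made for illustration, not our actual pipeline.

```yaml
# Hypothetical .gitlab-ci.yml fragment: names, key and paths are illustrative.
stages:
  - install
  - test

# Shared cache definition, read-only by default.
.node_modules_cache:
  cache:
    key: node-modules-$CI_COMMIT_REF_SLUG
    paths:
      - node_modules/
      - packages/*/node_modules/
    policy: pull            # download only, skip archiving and upload

install:
  stage: install
  extends: .node_modules_cache
  cache:
    policy: pull-push       # the ONE job that also creates and uploads the cache
  script:
    - yarn install --frozen-lockfile

lint:
  stage: test
  extends: .node_modules_cache
  script:
    - yarn install --frozen-lockfile   # quick install on top of the downloaded cache
    - yarn lint                        # assumed package.json script
```

Because extends merges nested hashes, the install job inherits the key and paths and overrides only the policy.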
Solution 1, but with per-job cache for selected jobs
The downside is that the cache can’t be configured “per-job”, since you can have only one job that uploads the cache. If you need to cache something else besides dependencies, you’re stuck. And unfortunately, jobs like ESLint spend more time on the actual analysis than on installation, so caching their artifacts is even more important than caching dependencies.
A solution could be to disable Yarn workspaces for such jobs, so that only top-level packages are installed. This allowed us to cut Yarn install time from 3.5 min to ~1 min, so there is no need to cache dependencies at all; just cache the ESLint artifacts.
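A job of this kind might look roughly like the sketch below. The job name, stage and cache key are hypothetical. I am also assuming that setting workspaces-experimental to false is enough for Yarn 1 to install only the top-level packages, so verify that against your Yarn version.

```yaml
# Hypothetical lint job: job name, stage and cache key are illustrative.
lint:
  stage: test
  cache:
    key: eslint-$CI_COMMIT_REF_SLUG
    paths:
      - .eslintcache        # default cache file written by eslint --cache
    policy: pull-push       # this job owns its own small cache
  script:
    # Assumption: with this option set to false, Yarn 1 ignores the workspaces
    # field and installs only the top-level dependencies.
    - yarn config set workspaces-experimental false
    - yarn install --pure-lockfile
    - yarn eslint --cache .
```

The cache here holds a single small file, so creating and uploading it costs almost nothing compared to archiving the dependencies.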
Solution 2: Build a Docker image and use it in other jobs
Since each job runs in a container, we can build an image for that container as the first step. We will be using Kaniko because it is faster than regular Docker-in-Docker (DIND).
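A build job along these lines could follow the pattern from Gitlab’s Kaniko documentation. The job name, stage name, Dockerfile location and the choice to tag the image with the commit SHA are assumptions for this sketch; the CI_* variables are predefined by Gitlab.

```yaml
# Hypothetical build job: job name, stage and Dockerfile path are illustrative.
build-image:
  stage: build
  image:
    name: gcr.io/kaniko-project/executor:debug   # the debug tag ships a shell, which Gitlab needs
    entrypoint: [""]
  script:
    # Let Kaniko authenticate against the project's container registry.
    - mkdir -p /kaniko/.docker
    - echo "{\"auths\":{\"$CI_REGISTRY\":{\"username\":\"$CI_REGISTRY_USER\",\"password\":\"$CI_REGISTRY_PASSWORD\"}}}" > /kaniko/.docker/config.json
    # Build from the checked-out sources and push one image per commit.
    - /kaniko/executor --context $CI_PROJECT_DIR --dockerfile $CI_PROJECT_DIR/Dockerfile --destination $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
```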
We can skip the git checkout since the code is already part of the image.
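A job that consumes such an image might be sketched as follows. The image tag, the /app directory (wherever the Dockerfile copied the sources to) and the yarn test script are assumptions; GIT_STRATEGY: none is the standard Gitlab variable that turns off fetching the repository.

```yaml
# Hypothetical consumer job: image tag, /app path and script are illustrative.
test:
  stage: test
  image: $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA   # image pushed by the build job
  variables:
    GIT_STRATEGY: none                       # no clone or fetch, sources come from the image
  script:
    - cd /app                                # assumed WORKDIR of the image
    - yarn test
```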
To get maximum optimization, we will build a special base image manually (as described in the article) and push it from a local computer to the registry once in a while. Dependencies of this base image may become outdated, so on CI we will use the base image just as a foundation for the real image that will be used in jobs. The pre-warmed cache of the base image will help speed up the build of the second one.
Let’s create a Dockerfile for the base image.
You can build and push this image like so:
The CI build will require a second Dockerfile, in which we will do pretty much the same, but based on the previous image:
Make sure to have an ignore file for the Docker build context and ban all installed dependencies and build artifacts from it, otherwise your image will become huge.
In our case timings were:
Install step of docker build on CI: 3 min
Image upload: 1 minute
Image download: 1.5 minutes
This gives a similar time to solution 1: the Kaniko build job takes 5 min overall, while a regular cache job takes about 6–7 min.
Overall, this looks like a viable approach for heavy jobs that do require a full install. Image download + quick install (1.5 min + 1 min) takes less time than a full Yarn install (3.5 min) or Gitlab cache download, so having an image is beneficial.
For quick jobs like ESLint, which do not require a full install of the monorepo, just the top level, it is overkill though, since these jobs won’t need all dependencies anyway, but that’s micromanagement…
Solution 3: Mount global Yarn cache to runner & don’t use Gitlab cache for dependencies
Gitlab jobs run in Docker containers, so we can mount the directory that Yarn uses as its global package cache inside the container to a directory on the runner. This way every container always has a pre-warmed global Yarn cache. This saves time on network downloads, but unfortunately Yarn still copies tons of files into place during install.
To do that, we need to add the volume mount to the runner’s configuration.
After that, disable caching of dependencies in all jobs.
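On the job side this might look like the sketch below. It assumes the runner has a volumes entry under [runners.docker] in its config.toml that mounts a host directory at the hypothetical path /yarn-cache inside every job container. The job name and script are illustrative too. YARN_CACHE_FOLDER is the environment variable Yarn 1 reads to locate its global cache.

```yaml
# Hypothetical fragment: /yarn-cache, job name and script are illustrative.
variables:
  # Assumption: the runner mounts a persistent host directory at this path.
  YARN_CACHE_FOLDER: /yarn-cache

test:
  stage: test
  cache: {}                 # no Gitlab cache for this job, dependencies come from the mounted Yarn cache
  script:
    - yarn install --frozen-lockfile   # no network downloads, but files are still copied into place
    - yarn test
```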
This produced a steady 2-minute installation time, and now you can cache individual artifacts per job. You can speed it up further by using top-level installs for certain jobs.
Solution 4: Yarn 2 with PNP
You can even use the regular Gitlab cache here, because it’s still fast enough: Yarn’s cache contains zip files instead of thousands of small source files. The install itself is also much faster, because Plug’n’Play eliminates the time spent on copying from the cache to the workspace.
This approach requires quite a lot of effort to resolve all the bugs of early software, but as a reward you can use the cache as intended.
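A cache setup for this could be sketched as below, assuming Yarn 2 keeps its zip archives in the default .yarn/cache folder inside the repository. The job names, stage names and the yarn test script are hypothetical.

```yaml
# Hypothetical fragment: job names, stages and scripts are illustrative.
.yarn_cache:
  cache:
    key:
      files:
        - yarn.lock         # new cache whenever the lockfile changes
    paths:
      - .yarn/cache/        # a modest number of zip archives instead of 120K small files
    policy: pull

install:
  stage: install
  extends: .yarn_cache
  cache:
    policy: pull-push       # only this job uploads the cache
  script:
    - yarn install --immutable

test:
  stage: test
  extends: .yarn_cache
  script:
    - yarn install --immutable   # fast: nothing to copy, PnP only links packages
    - yarn test
```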
The speed is amazing. The first time, without cache, the install takes the same 1–2 minutes, but cache creation and upload take ~20 seconds each. The magic happens on subsequent installs, when the cache is warm: install takes 10 seconds (!!!) and cache download the same ~20 seconds. If we disable cache upload, the overall overhead on subsequent jobs is just 30 seconds!!!
As you can see, the worst performer is the regular Gitlab cache. Even quite fast execution with blocked cache uploads does not justify the performance hit on creation and upload.
Second comes Kaniko. It is quite fast overall: faster than Gitlab cache on the first run, when no image is cached, but slower on subsequent jobs. It requires juggling Docker images, and it allows a cache per job.
The simple idea of mounting the global Yarn cache to the runner gave surprisingly good and stable results. It is a bit worse than the regular Gitlab cache, but that depends on how many subsequent jobs you have. If there are not many, it’s your best bet.
But the absolute winner is undoubtedly Yarn 2 with PNP. It’s much faster on the first run, and 30 seconds on subsequent jobs is the best thing ever. And you can make it even faster if you commit its local cache ;)