🚀 Speed up NPM/Yarn install in Gitlab

This article sums up my findings around a Gitlab issue that is still unresolved at the moment of writing (April 2020). In short, when node_modules becomes large, Gitlab hits a huge performance bottleneck due to the sheer number of files it needs to archive and upload during caching. I have come up with a few ways to improve caching performance.
At RingCentral we use runners and a local S3 cache server; after yarn install one of our monorepos has about 120K files, with an overall size of 400–600 megabytes. A monorepo is a git repository that hosts multiple packages at once; packages can have their own dependencies and are usually linked together. This is the worst-case scenario, since the amount of Node modules can become insanely big. But this article can help you deal with regular repositories too, if they have lots of dependencies.
Advice zero: use Yarn Workspaces to deal with the monorepo instead of Lerna, as the latter is 2–3 times slower. The install and caching phases roughly take:
- Cache download: about 1 minute
- Cache creation: 4 minutes (zipping hundreds of thousands of files)
- Cache upload: 1 minute
- Bare yarn install: 3 minutes
- yarn install on top of cache: 1 minute
This is our baseline, now let’s analyze how we can speed things up.
Solution 1: Create cache in first job, run others with read-only cache
```yaml
before_script:
  - yarn install

cache:
  key: $CI_PROJECT_ID
  policy: pull
  untracked: true

install:
  stage: install
  script: echo 'Warming the cache'
  cache:
    key: $CI_PROJECT_ID
    policy: pull-push
    paths:
      - .yarn
      - node_modules
      - 'packages/*/node_modules'

lint:
  stage: test
  script: yarn run lint:all
```
Key measurements:
- Install, cache create & upload: 3 + 4 + 1 = 7 min
- Download, install, skip upload: 1 + 1 = 2 min
In the best-case scenario, with cache upload skipped, this actually speeds things up a lot. It's OK to spend time in ONE job to create and upload the cache if you have many other jobs that only download it.
Given that cache create + upload takes 5 min, and yarn install with cache saves us (3 min pure install - 1 min cached install) = 2 min per job, at least three jobs must successfully download the cache and use it in yarn install to justify the time spent on uploading (3 × 2 min > 5 min). The more jobs use the cache, the better.
Solution 1, but with per-job cache for selected jobs
The downside is that the cache can't be configured per-job, since you can have only one job that uploads the cache. If you need to cache something else besides node_modules, you're stuck. And unfortunately, jobs like ESLint spend more time on the actual analysis than on installation, so caching their artifacts is even more important than caching node_modules.
A solution could be to disable Yarn Workspaces for such jobs so that only top-level packages get installed. This cut yarn install time from 3.5 min to ~1 min, so there is no need to cache node_modules at all; just cache the ESLint artifacts.
```yaml
lint:
  stage: test
  before_script:
    - sed -i 's/workspaces/workspaces-disabled/g' package.json
    - yarn install --no-lockfile
    - sed -i 's/workspaces-disabled/workspaces/g' package.json
  script: yarn lint
  cache: # here we use per-job cache
    key: $CI_COMMIT_REF_NAME-lint
    paths:
      - .eslint
```
Solution 2: Build a docker image and use it in other jobs
Since each job runs in a container, we can build an image for that container as the first step. We will use Kaniko because it is faster than regular Docker-in-Docker (DIND):
```yaml
stages:
  - docker
  - build
  - test

variables:
  IMAGE_TAG: $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_NAME

docker:
  stage: docker
  image:
    name: gcr.io/kaniko-project/executor:debug
    entrypoint: [""]
  script:
    - echo "{\"auths\":{\"${CI_REGISTRY}\":{\"username\":\"${CI_REGISTRY_USER}\",\"password\":\"${CI_REGISTRY_PASSWORD}\"}}}" > /kaniko/.docker/config.json
    - time /kaniko/executor --context="${CI_PROJECT_DIR}" --dockerfile="${CI_PROJECT_DIR}/Dockerfile" --destination="${IMAGE_TAG}" --cache=true

build:
  image: "${IMAGE_TAG}"
  stage: build
  script: yarn build
  artifacts:
    paths:
      - 'packages/*/lib' # all artifacts that are used by tests
  variables:
    GIT_STRATEGY: none

lint:
  image: "${IMAGE_TAG}"
  before_script:
    - cd /opt/workspace
  script: yarn lint
  variables:
    GIT_STRATEGY: none
  cache: # here we use per-job cache
    key: $CI_COMMIT_REF_NAME-lint
    paths:
      - .eslint
```
We're using Gitlab's own Container Registry via the predefined environment variables. But you can use Docker Hub or anything else too; just define your own ENV variables.
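For example, pointing the pipeline at Docker Hub only changes the image variable; a sketch, where `my-dockerhub-user/my-app` is a hypothetical repository name and the registry credentials are assumed to be configured as CI variables:

```yaml
variables:
  # hypothetical Docker Hub repository instead of the Gitlab registry
  IMAGE_TAG: my-dockerhub-user/my-app:$CI_COMMIT_REF_NAME
```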
We can skip the git checkout (GIT_STRATEGY: none) since the code is already part of the image.
To get maximum optimization we will build a special base image manually (as described in the article https://medium.com/@dSebastien/speeding-up-your-ci-cd-build-times-with-a-custom-docker-image-3bfaac4e0479) and push it from a local computer to the registry once in a while. The dependencies of this base image may become outdated, so on CI we will use it only as a foundation for the real image used in jobs. The pre-warmed cache of the base image will help speed up the build of the second one.
Let’s create a shared.Dockerfile for the base image.
```dockerfile
FROM ringcentral/web-tools:alpine
WORKDIR /opt/workspace

ADD yarn.lock package.json lerna.json ./

# Note!
# Add one by one, you can't use glob because of this bug in Docker
ADD packages/a/package.json packages/a/package.json
ADD packages/b/package.json packages/b/package.json
ADD packages/c/package.json packages/c/package.json

RUN yarn install --link-duplicates
```
You can build and push this image like so:
```bash
$ docker login YOUR-GITLAB-OR-OTHER-REGISTRY
$ docker build --tag SOMENAME --file shared.Dockerfile .
$ docker tag SOMENAME SOMENAME:latest
$ docker push SOMENAME:latest
```
This will require a Dockerfile in which we do pretty much the same, but based on the previous image:
```dockerfile
FROM SOMENAME:latest
WORKDIR /opt/workspace

ADD yarn.lock package.json lerna.json ./

# Note!
# Add one by one, you can't use glob because of this bug in Docker
ADD packages/a/package.json packages/a/package.json
ADD packages/b/package.json packages/b/package.json
ADD packages/c/package.json packages/c/package.json

RUN yarn install --link-duplicates # this cuts size of node_modules

ADD . ./ # now we add all the files
RUN rm -rf $(yarn cache dir)
```
Make sure to have a .dockerignore that excludes all node_modules and build artifacts, otherwise your image will become huge.
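A minimal .dockerignore for this setup could look like the sketch below; the exact artifact paths (`packages/*/lib` here) depend on what your build actually outputs:

```
**/node_modules
packages/*/lib
.git
```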
In our case the timings were:
- Install step of docker build on CI: 3 min
- Image upload: 1 min
- Image download: 1.5 min
This gives a time similar to Solution 1: the Kaniko build job overall takes about 5 min, a regular job about 6–7 min.
Overall this looks like a viable approach for heavy jobs that do require a full install. Image download + quick install (1.5 min + 1 min) takes less time than a full yarn install (3.5 min) or a Gitlab cache download, so having an image is beneficial.
For quick jobs like ESLint, which need only a top-level install rather than the full monorepo, it is overkill though, since such jobs don't need all the dependencies anyway; but that's micromanagement…
Solution 3: Mount global Yarn cache to runner & don’t use Gitlab cache for node_modules
Gitlab jobs run in Docker containers, so we can mount the directory that Yarn uses inside the container for its global package cache onto a directory on the runner. This way every container always starts with a pre-warmed global Yarn cache. This saves time on network downloads, but unfortunately Yarn still copies tons of files into place during install.
To do that we need to add the following to config.toml:
```toml
# /etc/gitlab-runner/config.toml
[[runners]]
  # ...
  [runners.docker]
    volumes = ["/cache:/usr/local/share/.cache:rw"]
```
After that, disable caching of node_modules in all jobs.
This produced a steady 2-minute installation time, and now you can cache individual artifacts per job. You can speed things up further by using top-level installs for certain jobs too.
```yaml
stages:
  - docker
  - build
  - test

image: node:lts

before_script:
  - yarn install

build:
  stage: build
  script: yarn build
  artifacts:
    paths:
      - 'packages/*/lib' # all artifacts that are used by tests

lint:
  stage: test
  script: yarn lint
  cache: # here we use per-job cache
    key: $CI_COMMIT_REF_NAME-lint
    paths:
      - .eslint
```
Solution 4: Yarn 2 with PnP
Yarn 2 with Plug'n'Play remediates the caching issue dramatically: https://dev.to/arcanis/introducing-yarn-2-4eh1.
You can even use the regular cache, because it is fast enough now: Yarn's cache contains zip files instead of thousands of small source files. And the install itself is much faster, because Plug'n'Play eliminates the time spent copying from cache to workspace.
This approach requires quite a lot of effort to work around the bugs of early software, but as a reward you can use the cache as intended:
```yaml
stages:
  - docker
  - build
  - test

image: node:lts

before_script:
  - yarn install

# here we can use one cache for all jobs
cache:
  paths:
    - node_modules
    - "packages/*/node_modules"
    - .eslint

build:
  stage: build
  script: yarn build
  artifacts:
    paths:
      - 'packages/*/lib' # all artifacts that are used by tests

test:
  stage: test
  script: yarn test
```
The speed is amazing. The first time, without cache, it takes the same 1–2 minutes, but cache creation and upload take ~20 seconds each. The magic happens on subsequent installs, when the cache is warm: the install takes 10 seconds (!!!), and the cache download the same ~20 seconds. If we disable cache upload, the overall overhead on subsequent jobs is just 30 seconds!
Summary
As you can see, the worst performer is the regular Gitlab cache. Even the reasonably fast execution with cache uploads disabled does not justify the performance hit of cache creation and upload.
Second comes Kaniko. It is quite fast overall: faster without a cached image, but slower on subsequent jobs. It requires juggling Docker images, but allows a per-job cache.
The simple idea of mounting the global Yarn cache to the runner gave surprisingly good and stable results: a bit worse than the regular Gitlab cache, but that depends on how many subsequent jobs you have. If not many, it's your best bet.
But the absolute winner is undoubtedly Yarn 2 with PnP. It's much faster on the first run, and 30 seconds on subsequent jobs is the best thing ever. And you can make it even faster if you commit its local cache ;)
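Committing the local cache is what Yarn 2 calls "zero-installs". A .gitignore sketch along the lines of the Yarn 2 documentation keeps the zipped packages in the repo while ignoring per-machine artifacts:

```
.yarn/*
!.yarn/cache
!.yarn/patches
!.yarn/plugins
!.yarn/releases
!.yarn/versions
```

With the cache committed, CI jobs can install without reaching the npm registry at all.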