⚙️ Monorepo scripts strategies & naming conventions
2019-01-29
A monorepo (mono repository) is a popular approach where several libraries or other fairly independent projects are colocated in one repository. One of the benefits is simpler control over changes that must be synchronized across all involved packages. The downside is that each package becomes less independent.
The vast majority of everyday monorepo management tasks boil down to running package scripts in a certain order and under certain conditions, both locally and on CI. This article explains one of many possible approaches, which has worked fairly well in some of RingCentral’s monorepo projects. The size of those projects varies from fewer than a hundred files and a handful of devs up to thousands of files and 50+ developers.
In this article, I will use Lerna as the tool that helps to manage such mono repositories. It allows you to run scripts and execute code in packages, and it also takes care of some publishing activities. You may read more about Lerna concepts here: https://github.com/lerna/lerna#concepts.
💡 A short remark about package versions: they could be completely independent, or they could be synchronized to a certain degree. For example, if a package has changes, its version will be updated, whereas unchanged packages will not get a version bump. But for simplicity, we will assume fully synchronized version bumps of all published packages (even unchanged ones), so we will be using exact versions in lerna.json and the --force-publish=* flag in publish commands.
Main ideas
- Scripts should eliminate the possibility of error in critical tasks like test coverage collection and publishing
- Default scripts (like install, test and publish) should be no-brainers: do all the things automatically with the least astonishment
- Full control mode when needed: ability to run scripts granularly or in a quicker fashion
Phases
Overall, any interaction with a repository, both locally and on the CI, can be presented as a scenario that is split into granular phases (or stages in GitLab terminology). Scenarios may skip phases or have some phases substituted with different ones. Scenarios usually have certain goals and expectations. Here are the most common ones:
- A developer checks out the repo and runs npm install: the goal is to have a fully functional and ready-to-use dev setup at this point, and the expectation is that right after this command the developer may run other commands with no additional actions related to getting things ready to work
- A developer runs the npm start script: the goal is to run start in all packages, and the expectation is that it just starts working, with no need to pre-build anything
- CI determines that a tag has been pushed, checks out the repo, runs the linter, runs build scripts, runs tests, collects coverage, and if all is good, publishes to NPM. The goal is the delivery of high-quality code. The expectation is that if any phase breaks, it stops the process and there is zero possibility of any unwanted publishes of incomplete/broken code
Let’s dive deeper into those phases.
Install
In this phase, CI or the developer sets up the environment to run other scripts.
- CI should run npm install --ignore-scripts so that we can control when and how to do bootstrapping in the CI environment. For example, the lint task does not require packages to be bootstrapped, but the test task does (more about this in the Packages bootstrapping section)
- Developers should use regular npm install, which must run a full bootstrap
Lint
This is the cheapest check that makes sure the codebase is in good condition.
- CI should run lint right after install as the first cheap check
- CI must stop all further expensive checks if the cheap lint check failed
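The fail-fast ordering described here can be sketched as a tiny runner. This is only an illustration of the idea, not part of the original setup: runPhases and the phase objects are hypothetical names, and each phase is assumed to be a function that returns true on success.

```javascript
// Hypothetical sketch: run phases in order, cheapest first, and stop at the first failure.
// Each phase is { name, run }, where run() returns true on success (an assumption of this sketch).
function runPhases(phases) {
  const executed = [];
  for (const phase of phases) {
    executed.push(phase.name);
    if (!phase.run()) {
      // A failed cheap check (e.g. lint) prevents the expensive ones from starting.
      return { ok: false, failedAt: phase.name, executed };
    }
  }
  return { ok: true, failedAt: null, executed };
}

// Example: lint fails, so bootstrap, test and publish are never started.
const result = runPhases([
  { name: 'install', run: () => true },
  { name: 'lint', run: () => false },
  { name: 'bootstrap', run: () => true },
  { name: 'test', run: () => true },
  { name: 'publish', run: () => true },
]);
console.log(result.failedAt, result.executed);
```

In a real pipeline the CI system usually provides this behavior itself, since a failed job normally prevents later stages from running, so the sketch only makes the intended ordering explicit.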
Packages bootstrapping
In this phase, dependencies of underlying packages are installed — this is a post install on the dev machine or a post lint phase in the CI environment, but it depends on the setup.
- In general, the postinstall script should run lerna bootstrap --hoist --no-ci so that devs get a fully working repo right after running npm install. The --no-ci flag is needed due to a bug: https://github.com/lerna/lerna/issues/1324
- CI may not require demos to be built/tested, and thus bootstrapped, unless you use demos for e2e tests. In this case, we should bootstrap only the truly used packages by scoping the bootstrap script: npm run bootstrap -- --scope=@ringcentral/* (we usually have libraries in this scope). This is a nice optimization that saves CI time.
Building
A phase needed to obtain an artifact, something that later will be published. Also, it may be a prerequisite for tests.
- The build script on package level should run the clean script beforehand to make sure no previous artifacts exist on subsequent builds
- The build:quick script can be used to build all libs that will be used in demos
Tests
A phase that does the majority of code quality verification and makes sure things work as expected, not just written as expected (linter phase).
Sometimes code has to be built before running tests. If there are multiple libs in monorepo which depend on each other it is safer to run tests once all dependencies are built instead of using magic to run tests using only sources.
- CI runs tests by calling the test script, which is the maximum set of all tests, including e2e tests and coverage collection
- Locally, devs may run test:quick, which does not collect coverage or run e2e tests
- CI should also run the test:coverage script after test to upload coverage reports
- Optionally, there could also be a test:watch script, which runs quick tests in watch mode
Committing
This is a phase that occurs on dev machines when developers commit and push the code. Some minor quick checks should happen in this phase.
- Obviously, we don’t want to make CI run potentially invalid code, so we can use the test:quick and lint:staged (or lint:quick) scripts in pre-commit hooks
Publishing
This phase delivers artifacts to a package management system that distributes them to end users. Code must be built and tested before publishing, which is the responsibility of CI.
- Publish scripts actually do the publishing (usually on CI)
- Canary means a pre-release (something you can run nightly)
- Release means a regular versioned release
- CI may add the --yes flag to publish:* scripts: it’s better to add it in the CI config rather than in the scripts themselves, in order to prevent unwanted local publishes
Alternatively, on publish, CI can skip the regular flow (build, test, coverage) and only run publish in packages, which should do build, test, coverage to make sure that if publish is run locally from a dev machine it still goes through all phases. However, we recommend always publishing from CI only, as it is the only way to ensure all proper checks.
Starting/watching
A developer-specific phase for day-to-day activities like website or library/demo/storybook development. In this phase the codebase is constantly watched for changes and rebuilt once changes occur.
Assume the following setup:
packages/
- demo (depends on lib1 and lib2)
- lib1
- lib2 (depends on lib1)
- Demos that use libraries often need those libs to be pre-built in order to run start correctly without errors about missing files, so the root’s start script must run the build script scoped for libs and then run the start:quick script (see examples below). Unfortunately, this brings overhead until https://github.com/webpack/webpack/issues/4991 is fixed (previously TS was prone too: bug 12996, see update below)
- start:quick simply runs start in all packages, which starts Webpack and Babel watchers
- UPDATE: In TypeScript 3.4 a new incremental option has been introduced: it produces a build and a cache, so no matter how often you restart watchers it will be ready much faster. Unfortunately, it’s not yet supported by TS Loader for Webpack. It still does not eliminate the necessity to pre-build libraries.
Cleaning
Developer-specific phase before builds or when switching branches.
- Root’s clean should first run clean scripts in packages (e.g. clean artifacts), then remove node_modules in packages, then remove node_modules in the root. This brings the repository to ground zero
- Cleaning is useful when devs switch branches: the command npm run clean && npm install makes sure they have a working setup after a branch switch
Examples
Main package scripts
{
  "postinstall": "npm run bootstrap",
  "bootstrap": "lerna bootstrap --no-ci --hoist",
  "bootstrap:quick": "npm run bootstrap -- --scope=@ringcentral/*",
  "clean": "npm run clean:artifacts && npm run clean:packages && npm run clean:root",
  "clean:artifacts": "lerna run clean --parallel",
  "clean:packages": "lerna clean --yes",
  "clean:root": "rimraf node_modules",
  "start": "npm run build:quick && npm run start:quick",
  "start:quick": "dotenv lerna run start -- --parallel",
  "build": "lerna run build --concurrency=1 --stream",
  "build:quick": "npm run build -- --scope=@ringcentral/*",
  "test": "lerna run test --concurrency=1 --stream",
  "test:quick": "lerna run test:quick --concurrency=1 --stream",
  "test:coverage": "lerna run test:coverage --parallel",
  "test:watch": "lerna run test:watch --parallel",
  "publish:release": "lerna publish --force-publish=* --no-push --no-git-tag-version",
  "lint": "eslint --cache --cache-location node_modules/.cache/eslint --fix",
  "lint:all": "npm run lint 'packages/*/src/**/*.ts*'",
  "lint:staged": "lint-staged"
}
We prefer to run tests one by one in topological order (maintained by Lerna) to see the nice structured output. Technically, --concurrency=1 --stream can be replaced with --parallel, which disregards topology and runs everything in parallel.
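To illustrate what topological order means for the demo/lib1/lib2 setup from the Starting/watching section, here is a minimal sketch. It is not Lerna's implementation: topologicalOrder and the deps map are hypothetical names used only for illustration, and the dependency graph is written by hand rather than read from package.json files.

```javascript
// Hypothetical sketch of dependency-ordered (topological) execution.
// `deps` maps each package to the local packages it depends on.
const deps = {
  demo: ['lib1', 'lib2'],
  lib1: [],
  lib2: ['lib1'],
};

function topologicalOrder(graph) {
  const order = [];
  const state = {}; // undefined = unvisited, 'visiting' = in progress, 'done' = finished

  function visit(name) {
    if (state[name] === 'done') return;
    if (state[name] === 'visiting') {
      throw new Error(`Dependency cycle detected at ${name}`);
    }
    state[name] = 'visiting';
    for (const dep of graph[name] || []) visit(dep);
    state[name] = 'done';
    order.push(name); // a package is added only after all of its dependencies
  }

  // Sort the keys so the result does not depend on object key order.
  Object.keys(graph).sort().forEach(visit);
  return order;
}

console.log(topologicalOrder(deps));
```

With this ordering, lib1 is handled before lib2, and both are handled before demo, which is why a script run one package at a time can rely on its dependencies having been processed already.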
Notes:
- You can publish all packages together using the --force-publish=* flag for simplicity.
- Scripts test:quick and lint:staged should be part of the pre-commit hook.
Package scripts
{
  "mocha": "mocha --opts mocha.opts",
  "karma": "karma start --singleRun",
  "nyc": "nyc mocha --opts mocha.opts",
  "build": "npm run clean && npm run build:tsc && npm run build:wp",
  "build:tsc": "tsc",
  "build:wp": "webpack --progress",
  "start": "npm-run-all -p start:tsc start:wp",
  "start:tsc": "npm run build:tsc -- --watch --preserveWatchOutput",
  "start:wp": "npm run build:wp -- --watch"
}
GitLab CI config
image: node:lts
stages:
  - validation
  - bootstrap
  - test
  - publish
variables:
  BRANCH: $(echo ${CI_COMMIT_REF_NAME} | sed -e 's/\(.*\)/\L\1/' | sed -r 's/[^a-z0-9-]/-/g')
before_script:
  - npm config set //registry.npmjs.org/:_authToken=${NPM_TOKEN}
  # or, for a private registry
  - npm config set registry $PRIVATE_NPM
  - npm config set _auth $PRIVATE_NPM_AUTH
  - npm config set email $PRIVATE_NPM_EMAIL
  - npm install --progress=false --ignore-scripts
cache:
  key: "$CI_COMMIT_REF_NAME"
  paths:
    - node_modules/
    - packages/*/node_modules
job_lint:
  stage: validation
  script: DEBUG=eslint:cli-engine npm run lint:all
job_bootstrap:
  stage: bootstrap
  script:
    - npm run bootstrap:quick
    - npm run build:quick
  artifacts:
    paths:
      - "*/es"
      - "*/lib"
    expire_in: 1 day
job_test:
  stage: test
  script:
    - npm test
    - npm run test:coverage
job_canary:
  stage: publish
  script: npm run publish:release -- --canary --preid=$BRANCH.$CI_PIPELINE_IID --dist-tag=$BRANCH --yes
  only:
    - master
    - feature/*
    - release/*
job_publish:
  stage: publish
  script: npm run publish:release -- $CI_COMMIT_TAG --yes
  only:
    - tags
This setup assumes you have NPM_TOKEN (and others) in GitLab ENV variables. Also note the --yes flags for publish scripts.
Due to https://github.com/lerna/lerna/issues/2171 we have to explicitly set the pipeline ID in the canary preid.
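For readers less familiar with sed, the BRANCH expression lowercases the ref name and replaces every character outside a-z, 0-9 and "-" with a dash, so the value is safe to use in a preid and a dist-tag. The sketch below shows the same transformation in JavaScript. The names sanitizeBranch and canaryPreid are hypothetical and are not part of the original config.

```javascript
// Hypothetical JavaScript equivalent of the sed pipeline used for BRANCH:
// lowercase the ref name, then replace every character outside [a-z0-9-] with "-".
function sanitizeBranch(refName) {
  return refName.toLowerCase().replace(/[^a-z0-9-]/g, '-');
}

// Builds the canary preid as "<branch>.<pipeline id>", mirroring the job_canary script.
function canaryPreid(refName, pipelineIid) {
  return `${sanitizeBranch(refName)}.${pipelineIid}`;
}

console.log(canaryPreid('feature/My_Cool.Feature', 42));
```

GitLab also exposes a predefined CI_COMMIT_REF_SLUG variable with a similar lowercased, dash-separated form of the ref name, which may be worth considering as a simpler alternative.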
Travis CI config
language: node_js
node_js:
  - stable
cache:
  directories:
    - $HOME/.npm
    - node_modules
    - packages/*/node_modules
before_install:
  - BRANCH=$(echo ${TRAVIS_BRANCH} | sed -e 's/\(.*\)/\L\1/' | sed -r 's/[^a-z0-9-]/-/g')
  - npm config set //registry.npmjs.org/:_authToken=${NPM_TOKEN}
before_script:
  - DEBUG=eslint:cli-engine npm run lint:all
  - npm run build:quick
deploy:
  - provider: script
    script: npm run publish:release -- $TRAVIS_TAG --yes
    skip_cleanup: true
    on:
      tags: true
      repo: xxx
  - provider: script
    script: npm run publish:release -- --canary --preid=$BRANCH.$TRAVIS_JOB_NUMBER --dist-tag=$BRANCH --yes
    skip_cleanup: true
    on:
      branch: master
      tags: false
      repo: xxx
  - provider: releases
    api_key: $GITHUB_TOKEN
    skip_cleanup: true
    file:
      - packages/xxx/dist/xxx.js
      - packages/yyy/dist/yyy.js
    on:
      tags: true
      repo: xxx
after_success:
  - npm run test:coverage
This setup also assumes you have NPM_TOKEN and GITHUB_TOKEN in Travis ENV variables. Also, note the --yes flags for publish scripts. Travis by default will do npm install, which will run the bootstrap script through postinstall, so there’s some room for optimization like we did in the GitLab example.
Due to https://github.com/lerna/lerna/issues/2171 we have to explicitly set the Travis build number in the canary preid.
Packages used
Since we mention a lot of packages, it makes sense to explain what they do.
- Lerna: manages JavaScript projects with multiple packages
- Dotenv-cli: loads .env files that store environment variables
- ESLint: a fully pluggable tool for identifying and reporting on patterns in JavaScript
- Mocha: simple & flexible JavaScript test framework for Node.js & the browser
- NYC: command line interface for Istanbul, a JS code coverage tool that computes statement, line, function and branch coverage
- Karma: test runner for JavaScript, which is able to run Mocha and others
- Coveralls: code coverage tracking system
- TSC: TypeScript compiler
- Webpack: module bundler, bundles JavaScript files for usage in a browser
- Husky: runs NPM scripts from git hooks
- Lint-staged: runs linters on git staged files
- Jest: JavaScript testing framework with a focus on simplicity, can be used as a replacement for Karma+Mocha+Istanbul
Conclusion
This approach, with minor variations (mostly alternate phases), was successfully implemented and tested in different projects and has proven to be consistent and complete. I hope it will help you to build, test and interact with your projects more efficiently.
This whole thing definitely requires some library to make things easier :)