Monorepo or Multirepo?
I can see some of you saying “It depends”. So let me first define the requirements:
- #1: Scalable Organization (Consider Conway's law!!) where multiple teams can progress independently
- #2: Distributed Microservices Architecture
- #3: Applications and artifacts that are independently buildable and deployable
A Monorepo can be defined as a single repository that contains more than one application or piece of infrastructure-related code.
As a professional who is fully committed to design patterns and an advocate of the SOLID principles, I am an enforcer of the Single-Responsibility Principle. Seen through this principle, Monorepo is clearly an anti-pattern. However, after publishing a previous version of this blog post, I received feedback from several readers who favour Monorepo, so I decided to run a poll on Twitter to get a wider view on the topic:
The results show that Multirepo wins, but a noticeable percentage of the community still prefers Monorepo. I tried to understand the underlying reasons by holding several sessions with some senior developers whose judgment I rely on.
Based on those sessions, here are some of the reasons why that 41% still tends to choose Monorepo:
- Some IDE functions make troubleshooting easier when the Microservices live in the same repository, even if they are independently deployable.
- Developers do not always follow a component-based architecture. Decomposing the appropriate parts of the software into reusable libraries also requires a deployment pipeline and versioning for each library, which looks like additional work.
- Some IDEs, like Visual Studio, favor so-called "Solutions", which make it easy to create multiple artifacts from a single repository.
My experience with Monorepo on scaled projects has been horrible: `git rebase` and `git reset` become your best friends, and you spend a noticeable amount of time syncing your branch with the trunk.
Besides, Trunk-based development is practically not possible, so you cannot create an organizational structure of so-called two-pizza teams. You have to pull, and sometimes merge, changes that have nothing to do with what you are working on!
The other drawback I experienced is first-time cloning. If you base your code repository architecture on a Monorepo, you have to accept the risk of a long-running `git clone`.
Keeping the `Git History` clean is challenging with Monorepo. Each pull from the origin may bring additional changes into your short-lived branch and make it difficult to shrink, because its history may now contain unrelated commits! Although GitLab, GitHub or some additional tools can solve this, pure git users like me will continue to suffer!
If you have a Monorepo, you cannot benefit out of the box from Git webhooks that trigger an automated build after each commit, because the hook fires for the whole repository and not for a single application (there is a non-native workaround, but it increases complexity). This means that Continuous Delivery with Monorepo is practically under stress.
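To give an idea of what such a workaround involves, here is a minimal sketch of the extra logic a build trigger would need. It is only an illustration under stated assumptions: the `services/<name>/` directory layout is hypothetical, the function names are made up for this example, and the payload is assumed to resemble a push event that lists `added`, `modified` and `removed` paths per commit.

```python
from pathlib import PurePosixPath

# Hypothetical layout: every service lives under services/<name>/ in the Monorepo.
SERVICES_ROOT = "services"


def changed_paths_from_push(payload):
    """Collect added, modified and removed paths from a push-style webhook payload."""
    paths = []
    for commit in payload.get("commits", []):
        for key in ("added", "modified", "removed"):
            paths.extend(commit.get(key, []))
    return paths


def affected_services(changed_paths):
    """Return the sorted names of the services touched by the changed paths."""
    services = set()
    for path in changed_paths:
        parts = PurePosixPath(path).parts
        # Only paths shaped like services/<name>/<file...> belong to a service.
        if len(parts) >= 3 and parts[0] == SERVICES_ROOT:
            services.add(parts[1])
    return sorted(services)


if __name__ == "__main__":
    # Assumed example payload, not taken from a real repository.
    payload = {
        "commits": [
            {"added": ["services/billing/app.py"], "modified": ["README.md"], "removed": []},
            {"added": [], "modified": ["services/orders/Dockerfile"], "removed": []},
        ]
    }
    # Only these services would need a build for this push.
    print(affected_services(changed_paths_from_push(payload)))
```

With a Multirepo, none of this mapping is needed: the repository that received the commit already identifies the artifact to build.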
On the other side, consider a Platform Developer who builds a platform for developer teams, including various DSLs (Ansible, Packer, Kubernetes, ...), scripting and additional platform features. Keeping all the Terraform modules and infrastructure-related code in one repository has the same effects. Recall a fundamental software development principle: common or repetitive code that is consumed by more than one consumer should be positioned as a dependency! You can apply this by placing your Terraform modules in separate repositories and referring to them by git tag versions. Here is the structure I am referring to:
So even for infrastructure developers, Multirepo is possible and, from my perspective, a must.
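Referring to modules by git tag only helps if every consumer actually pins a tag. As a rough sketch, a small check like the one below could run in a pipeline and flag module sources that are not pinned. The repository URLs are placeholders, the function names are invented for this example, and the `vMAJOR.MINOR.PATCH` tag format is an assumption; the `git::<url>?ref=<tag>` form is the way Terraform addresses a git source at a given ref.

```python
import re

# Captures the quoted value of each `source = "..."` line in Terraform text.
SOURCE_LINE = re.compile(r'^\s*source\s*=\s*"([^"]+)"', re.MULTILINE)

# A git module source pinned to a tag such as v1.4.0 (assumed tag format).
PINNED_SOURCE = re.compile(r"^git::\S+\?ref=v\d+\.\d+\.\d+$")


def module_sources(terraform_text):
    """Return every module source string found in the given Terraform text."""
    return SOURCE_LINE.findall(terraform_text)


def unpinned_sources(sources):
    """Return the sources that are not git sources pinned to a version tag."""
    return [source for source in sources if not PINNED_SOURCE.match(source)]


# Placeholder configuration: one module pinned to a tag, one tracking a branch.
EXAMPLE = '''
module "network" {
  source = "git::https://example.com/platform/terraform-network.git?ref=v1.4.0"
}

module "cluster" {
  source = "git::https://example.com/platform/terraform-cluster.git?ref=main"
}
'''

if __name__ == "__main__":
    # Prints the sources that should be changed to a tagged version.
    for source in unpinned_sources(module_sources(EXAMPLE)):
        print("not pinned to a tag:", source)
```

Note that this sketch treats anything other than a tagged git source as unpinned, so local paths or registry modules would be reported as well.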
In conclusion, the developer team needs efficiency: a Monorepo can be a good start from a "developer perspective", but from an "engineering perspective", Multirepo is required. This decision must be made at the beginning. If you defer that architectural decision to an undefined time in the future, you take on technical debt!
Here are some of the comments:
I hope this blog post is useful for teams who are at an early stage of deciding their overall architecture. Once the code repository grows, it may not be easy to convert Monorepo --> Multirepo, depending on your architecture.
Do not hesitate to comment if you have relevant experience or a view on the topic.
Derya (Dorian) Sezen
Derya, a.k.a. Dorian, ex-CTO of an amazon.com subsidiary, is currently working as a Cloud and DevOps Consultant at kloia.