NHacker Next
Monorepo – Our Experience (ente.io)
CharlieDigital 5 hours ago [-]

    > Moving to a monorepo didn't change much, and what minor changes it made have been positive.
I'm not sure that this statement in the summary jibes with this statement from the next section:

    > In the previous, separate repository world, this would've been four separate pull requests in four separate repositories, and with comments linking them together for posterity.
    > 
    > Now, it is a single one. Easy to review, easy to merge, easy to revert.
IMO, this is a huge quality-of-life improvement and prevents a lot of mistakes caused by not having the right revision synced down across different repos. This alone is a HUGE improvement: a dev no longer accidentally ends up with one repo on one branch while forgetting to pull another repo at the matching branch, then hits weird issues from that basic hassle.

When I've encountered this, we've had to use another repo to keep scripts that managed this. But this was also sometimes problematic because each developer's setup had to be identical on their local file system (for the script to work) or we had to each create a config file pointing to where each repo lived.

This also impacts tracking down bugs and regression analysis; this is much easier to manage in a mono-repo setup because you can get everything at the same revision instead of managing synchronization of multiple repos to figure out where something broke.

ericyd 11 minutes ago [-]
I felt the same; the author seemed to downplay the success even though every effect listed in the article felt like a huge improvement.
danudey 3 hours ago [-]
I prefer microservices/microrepos _conceptually_, but we had the same experience as your quoted text - making changes to four repos, and backporting those changes to the previous two release branches, means twelve separate PRs to make a change.

Having a centralized configuration library (a shared Makefile that we can pull down into our repo and include into the local Makefile) helps, until you have to make a backwards-incompatible change to that Makefile and then post PRs to every branch of every repo that uses that Makefile.

Now we have almost the entirety of our projects back in one repository, and everything is simpler: one PR per release branch, three PRs (typically) for any change that needs backporting. A vastly simpler process and much less room for error.

taeric 4 hours ago [-]
My only counter argument here, is when those 4 things deploy independently. Sometimes, people will get tricked into thinking a code change is atomic because it is in one commit, when it will lead to a mixed fleet because of deployment realities. In that world, having them separate is easier to work with, as you may have to revert one of the deployments separately from the others.
derefr 22 minutes ago [-]
That's just an argument for not doing "implicit GitOps", treating the tip of your monorepo's main branch as the source-of-truth on the correct deployment state of your entire system. ("Implicit GitOps" sorta-kinda works when you have a 1:1 correspondence between repos and deployable components — though not always! — but it isn't tenable for a monorepo.)

What instead, then? Explicit GitOps: explicit, reified release specifications (think k8s resource manifests, or Erlang .relup files), one per separately-deploy-cadenced component. If you have a monorepo, then these also live as a dir in the monorepo. CD happens only when these files change.

With this approach, a single PR can atomically merge code and update one or more release specifications (triggering CD for those components), if and when that is a sensible thing to do. But there can also be separate PRs for updating the code vs. "integrating and deploying changes" to a component, if-and-when that is sensible.
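A bare-bones sketch of that "explicit GitOps" flow, assuming a hypothetical layout where release specs live under releases/ in the monorepo; CD fires only for components whose spec changed, not on every merge:

```shell
# Sketch: CD triggers off changes to releases/, not off code changes.
# All paths, file contents, and component names here are hypothetical.
set -eu
repo=$(mktemp -d); cd "$repo"
git init -q -b main .
git config user.email ci@example.com
git config user.name ci

mkdir -p releases src/backend
echo "image: backend:v1" > releases/backend.yaml
echo "package main" > src/backend/main.go
git add -A; git commit -qm "initial"

# Code-only PR: no release spec touched, so CD stays quiet.
echo "// refactor" >> src/backend/main.go
git add -A; git commit -qm "refactor backend"
quiet=$(git diff --name-only HEAD~1 HEAD -- releases/)

# Release PR: bump the spec; this is what actually triggers a deploy.
echo "image: backend:v2" > releases/backend.yaml
git add -A; git commit -qm "release backend v2"
deploys=$(git diff --name-only HEAD~1 HEAD -- releases/)
for spec in $deploys; do
  echo "deploy: $(basename "$spec" .yaml)"
done
```

The same diff check works whether code and spec changes land in one atomic PR or in separate ones.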

lmz 12 minutes ago [-]
Isn't a mixed fleet always the case once you have more than one server and do rolling updates?
wongarsu 2 hours ago [-]
It's not as much of a pain if your tooling supports git repos as dependencies. For example, a typical multi-repo PR for us with Rust is: 1) PR against the library; 2) PR against the application that points the dependency to the PR's branch and makes the changes; 3) PR review; 4) PR 1 is approved and merged; 5) PR 2 is changed to point to the new master branch commit; 6) PR 2 is approved and merged.

Same idea if you use some kind of versioning and release system. It's still a bit of a pain with all the PRs and coordination involved, but at every step every branch is consistent and buildable, you just check it out and hit build.

This is obviously more difficult if you have a more loosely coupled architecture like microservices. But that's self-inflicted pain
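The six-step dance can be replayed in miniature with two local repos, using a plain-text pin file as a stand-in for Cargo.toml's git dependency. Repo names, branch names, and the pin format below are all made up:

```shell
set -eu
work=$(mktemp -d); cd "$work"

git init -q -b main lib
(cd lib; git config user.email d@e.com; git config user.name dev
 echo "pub fn v1() {}" > lib.rs; git add -A; git commit -qm "lib v1"
 # Step 1: PR against the library.
 git checkout -qb feature
 echo "pub fn v2() {}" > lib.rs; git add -A; git commit -qm "lib: new API")

git init -q -b main app
(cd app; git config user.email d@e.com; git config user.name dev
 # Step 2: PR against the application, pointing the dep at the PR branch.
 echo "lib @ branch:feature" > dep.pin
 git add -A; git commit -qm "app: track lib feature branch")

# Steps 3-4: review happens; PR 1 merges to the library's main branch.
(cd lib; git checkout -q main; git merge -q feature)

# Steps 5-6: repoint PR 2 at the merged commit, then merge it.
rev=$(cd lib; git rev-parse HEAD)
(cd app; echo "lib @ rev:$rev" > dep.pin
 git add -A; git commit -qm "app: pin merged lib")
```

At every step both repos are consistent and buildable, which is the property the comment is describing.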

notwhereyouare 5 hours ago [-]
Ironically, I was gonna come and comment on that same second block of text.

We went from monorepo to multi-repo at work, and it's been a huge setback and disappointment for the devs, because it's what our contractors recommended.

I've asked for a code deploy and everything, and it's failed in prod due to a missing check-in.

CharlieDigital 5 hours ago [-]

    > ...because it's what our contractors recommended
It's sad when this happens instead of taking input from the team on how to actually improve productivity/quality.

A startup I joined started with a multi-repo because the senior team came from a FAANG where it was common practice to have multiple services and a repo for each service.

Problem was that it was a startup with one team of 6 devs and each of the pieces was connected by REST APIs. So now any change to one service required deploying that service and pulling down the OpenAPI spec to regenerate client bindings. It was so clumsy and easy to make simple mistakes.

I refactored the whole thing in one weekend into a monorepo, collapsed the handful of services into one service, and we never looked back.

That refactoring and a later paper out of Google actually inspired me to write this article as a practical guide to building a "modular monolith": https://chrlschn.dev/blog/2024/01/a-practical-guide-to-modul...

eddd-ddde 5 hours ago [-]
At least Google and Meta are heavily into monorepos; I'm really curious what company is using a _repo per service_. That's insane.
jgtrosh 4 hours ago [-]
My team implemented (and reimplemented!) a project using one repo per module. I think the main benefit was ensuring enough separation of concern due to the burden of changing multiple parts together. I managed to reduce something like 10 repos down to 3... Work in progress.
tpm 3 hours ago [-]
> burden of changing multiple parts together

Then you are adapting your project to the properties of the code repository. I don't see that as a benefit.

pc86 4 hours ago [-]
It can make sense when you have a huge team of devs and different teams responsible for everything where you may be on multiple teams, and nobody is exactly responsible for all the same set of services you are. Depending on the security/access provisioning culture of the org, "taking half a day to manually grant access to the repos so-and-so needs access to" may actually be an easier sell than "give everyone access to all our code."

If you just have 20-30 devs and everyone is pretty silo'd (e.g. frontend or backend, data or API, etc) having 75 repos for your stuff is just silly.

psoundy 2 hours ago [-]
Have you heard of OpenShift 4? Self-hosted Kubernetes by Red Hat. Every little piece of the control plane is its own 'operator' (basically a microservice) and every operator is developed in its own repo.

A github search for 'operator' in the openshift org has 178 results:

https://github.com/orgs/openshift/repositories?language=&q=o...

Not all are repos hosting one or more microservices, but most appear to be. Best of luck ensuring consistency and quality across so many repos.

adra 25 minutes ago [-]
It's just as easy? When you have a monorepo with 5 million lines of code, you're only going to focus on the part of the code you care about and forget the rest. Same with 50 repos of 100,000 loc.

Enforcing standards means actually having org level mandates around acceptable development standards, and it's enforced using tools. Those tools should be just as easily run on one monorepo than 50+ distributed repositories, nay?

bobnamob 4 hours ago [-]
Amazon uses "repo per service" and it is semi insane, but Brazil (the big ol' internal build system) and Coral (the internal service framework) make it "workable".

As someone who worked in the dev tooling org, getting teams to keep their deps up to date was a nightmare.

bluGill 4 hours ago [-]
Monorepo and multi repo both have their own need for teams to work on dev tooling when the project gets large.
wrs 2 hours ago [-]
I worked at a Fortune 1 company that used one repo per release for a certain major software component.
seadan83 12 minutes ago [-]
Did that work out well at all? Any silver lining? My first thought is: "branches" & "tags" - wow... Would branches/tags have just been easier to work with?

Were they working with multiple services in a multi-repo? Seems like a cross-product explosion of repos. Did that configuration inhibit releases, or was the process cumbersome but smooth because it was so rote?

biorach 1 hour ago [-]
was that as insane as it sounds?
dewey 4 hours ago [-]
It's almost never a good idea to get inspired by what Google / Meta / Huge Company is doing, as most of the time you don't have their problems, and they have custom tooling and teams making everything work at that scale.
CharlieDigital 4 hours ago [-]
In this case, I'd say it's the opposite: monorepo as an approach works amazingly well for small teams all the way up to huge orgs (with the right tooling to support it).

The difference is that past a certain level of complexity, the org will most certainly need specialized tooling to support massive codebases to make CI/CD (build, test, deploy, etc.) times sane.

On the other hand, multi-repos may work for massive orgs, but they are always going to add friction for small orgs.

dewey 4 hours ago [-]
In this case I wasn't even referring to mono repo or not, but more about the idea of taking inspiration from very large companies for your own not-large-company problems.
influx 3 hours ago [-]
I’ve used one of the Meta monorepos (yeah there’s not just one!) and it’s super painful at that scale.
stackskipton 2 hours ago [-]
>So now any change to one service required deploying that service and pulling down the OpenAPI spec to regenerate client bindings. It was so clumsy and easy to make simple mistakes.

Why? Is your framework heavily tied to client bindings? APIs I consume occasionally get new fields added for data I don't need. My code just ignores them. We also have a policy that you cannot add a new mandatory field to an API without a version bump. So maybe the REST API would have a new field, but I didn't send it and the API happily didn't care.

jayd16 4 hours ago [-]
If prod went down because of a missing check in, there are other problems.
notwhereyouare 40 minutes ago [-]
did I say prod went down? I just said it failed in prod. it was a logging change and only half the logging went out. To me, that's a failure
audunw 4 hours ago [-]
There's nothing preventing you from having a single pull request for merging branches across multiple repos. There's nothing preventing you from having a parent repo with a lock file that gives you a single linear set of commits tracking the state of multiple repos.

That is, if you’re not tied to using just Github of course.

Big monorepos and multiple repo solutions require some tooling to deal with scaling issues.

What surprises me is the attitude that monorepos are the right solution to these challenges. For some projects it makes sense yes, but it’s clear to me that we should have a solution that allows repositories to be composed/combined in elegant ways. Multi-repository pull requests should be a first class feature of any serious source code management system. If you start two projects separately and then later find out you need to combine their history and work with them as if they were one repository, you shouldn’t be forced to restructure the repositories.
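The "parent repo with a lock file" idea above has a concrete form in git itself: submodules, where the parent repo records an exact commit of each child. Repo names below are hypothetical:

```shell
set -eu
work=$(mktemp -d); cd "$work"

# Two independent child repos.
for r in frontend backend; do
  git init -q -b main "$r"
  (cd "$r"; git config user.email d@e.com; git config user.name dev
   echo "$r" > README.md; git add -A; git commit -qm "init $r")
done

# The parent repo pins each child at an exact commit.
git init -q -b main parent
cd parent
git config user.email d@e.com; git config user.name dev
# protocol.file.allow is only needed because these are local test repos.
git -c protocol.file.allow=always submodule add "$work/frontend"
git -c protocol.file.allow=always submodule add "$work/backend"
git commit -qm "pin frontend + backend at exact commits"

# .gitmodules plus the recorded "commit" entries act as the lock file:
git ls-tree HEAD
```

Each commit in the parent is then a single linear snapshot of the whole multi-repo system, though the PR workflow around submodules is famously clunkier than a true monorepo.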

CharlieDigital 4 hours ago [-]

    > Multi-repository pull requests should be a first class feature of any serious source code management system. 
But it's currently not?

    > If you start two projects separately and then later find out you need to combine their history and work with them as if they were one repository, you shouldn’t be forced to restructure the repositories.
It's called a directory copy. Cut + paste. I'd add a tag with a comment pointing to the old repo (if needed). But probably after a few weeks, no one is going to look at the old repo.
dmazzoni 3 minutes ago [-]
> It's called a directory copy. Cut + paste. I'd add a tag with a comment pointing to the old repo (if needed). But probably after a few weeks, no one is going to look at the old repo.

Not in my experience. I use "git blame" all the time, and routinely read through commits from many years ago in order to understand why a particular method works the way it does.

Luckily, there are many tools for merging git repos into each other while preserving history. It's not as simple as copy and paste, but it's worth the extra effort.
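A minimal version of that history-preserving merge, pulling a hypothetical "libfoo" repo into "mainrepo" so that git log and git blame still reach libfoo's old commits (git subtree and git-filter-repo are more polished takes on the same move):

```shell
set -eu
work=$(mktemp -d); cd "$work"

git init -q -b main libfoo
(cd libfoo; git config user.email d@e.com; git config user.name dev
 echo "int foo(void);" > foo.h; git add -A; git commit -qm "libfoo: initial")

git init -q -b main mainrepo
cd mainrepo
git config user.email d@e.com; git config user.name dev
echo "int main(void) { return 0; }" > main.c
git add -A; git commit -qm "mainrepo: initial"

# Bring in libfoo's full history, then tuck its files under libfoo/.
git fetch -q ../libfoo main
git merge -q --allow-unrelated-histories -m "merge libfoo history" FETCH_HEAD
mkdir libfoo; git mv foo.h libfoo/
git commit -qm "move libfoo files under libfoo/"

# The old commits survive the move:
git log --oneline --follow -- libfoo/foo.h
```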

pelletier 4 hours ago [-]
> Multi-repository pull requests should be a first class feature of any serious source code management system.

Do you have examples of source code management systems that provide this feature, and do you have experience with them? The repo-centric approach of GitHub often feels limiting.

jvolkman 3 hours ago [-]
Apparently Gerrit supports this with topics: https://gerrit-review.googlesource.com/Documentation/cross-r...
msoad 18 minutes ago [-]
I love monorepos, but I'm not sure Git is the right tool beyond a certain scale. Where I work, doing a simple `git status` takes seconds due to the size of the repo. There have been various attempts to solve Git performance, but so far nothing comes close to what I experienced at Google.

The Git team should really invest in tooling for very large repos. Our repo is around 10M files and 100M lines of code, and no amount of hacks on top of Git (cache, sparse checkout, etc.) is really solving the core problem.

Meta and Google have really solved this problem internally but there is no real open source solution that works for everyone out there.
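For reference, the sparse-checkout hack mentioned above limits the working tree to the directories a team actually touches. A toy demo with hypothetical service directories:

```shell
set -eu
work=$(mktemp -d); cd "$work"

# A "big" repo with two independent service roots.
git init -q -b main big
(cd big; git config user.email d@e.com; git config user.name dev
 mkdir -p serviceA serviceB
 echo a > serviceA/a.txt; echo b > serviceB/b.txt
 git add -A; git commit -qm "init")

# Clone without checking out, then materialize only serviceA.
git clone -q --no-checkout "$work/big" slim
cd slim
git sparse-checkout init --cone
git sparse-checkout set serviceA
git checkout -q main
ls   # only serviceA (plus top-level files) is on disk
```

This trims the working tree, but as the comment notes, it doesn't shrink the object database or fix command latency at the 10M-file scale.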

dijit 21 seconds ago [-]
I'm secretly hoping that Google releases Piper (and Mondrian); the gaming industry would go wild.

Perforce is pretty brutal, and the code review tools are awful.

KaiserPro 23 minutes ago [-]
Monorepos have their advantages, as pointed out, one place to review, one place to merge.

But it can also breed instability, as you can upgrade other people's stuff without them being aware.

There are ways around this, which involve having a local module store and building with named versions. Very similar to a bunch of disparate repos, but without getting lost in GitHub (GitHub's discoverability was always far inferior to GitLab's).

However, it has its drawbacks, namely that people can hold out on older versions than you want to support.

dkarl 3 minutes ago [-]
> But it can also breed instability, as you can upgrade other people's stuff without them being aware

This is why Google embraced the principle that if somebody breaks your code without breaking your tests, it's your fault for not writing better tests. (This is sometimes known as the Beyonce rule: if you liked it, you should have put a test on it.)

You need the ability to upgrade dependencies in a hands-off way even if you don't have a monorepo, though, because you need to be able to apply security updates without scheduling dev work every time. You shouldn't need a careful informed eye to tell if upgrades broke your code. You should be able to trust your tests.

xyzzy_plugh 5 hours ago [-]
Without indicating my personal feelings on monorepo vs polyrepo, or expressing any thoughts about the experience shared here, I would like to point out that open-source projects have different and sometimes conflicting needs compared to proprietary closed-source projects. The best solution for one is sometimes the extreme opposite for the other.

In particular, many build pipelines involving private sources or artifacts become drastically more complicated than those of their publicly available counterparts.

bunderbunder 46 minutes ago [-]
I've also seen this with branching strategies. IMO the best branching strategy for open source projects is generally the worst one for commercial projects, and vice versa.
mgaunard 3 hours ago [-]
Doing modular right is harder than doing monolithic right.

But if you do it right, the advantage you get is that you get to pick which versions of your dependencies you use; while quite often you just want to use the latest, being able to pin is also very useful.

lukewink 3 hours ago [-]
You can still publish packages and pull them down as (pinned) dependencies all within a monorepo.
mgaunard 11 minutes ago [-]
that's a terrible and arguably broken-by-design workflow which entirely defeats the point of the monorepo, which is to have a unified build of everything together, rather than building things piecemeal in ways that could be incompatible.

For C++ in particular, you need to express your dependencies in terms of source versions, and ensure all of the build artifacts you link together were built against the same source version of every transitive dependency and with the same flags. Failure to do that results in undefined behaviour, and indeed I have seen large organizations with unreliable builds as a manner of routine because of that.

The best way to achieve that is to just build the whole thing from source, with a content-addressable-store shared with the whole organization to transparently avoid building redundant things. Whether your source is in a single repo or spread over several doesn't matter so long as your tooling manages that for you and knows where to get things, but ultimately the right way to do modular is simply to synthesize the equivalent monorepo and build that. Sometimes there is the requirement that specific sources should have restricted access, which is often a reason why people avoid building from source, but that's easy to work around by building on remote agents.

Now for some reason there is no good open-source build system for C++, while Rust mostly got it right on the first try. Maybe it's because there are some C++ users still attached to the notion of manually managing ABI.

siva7 5 hours ago [-]
Ok, but the more interesting part - how did you solve the CI/CD part and how does it compare to a multirepo?
devjab 5 hours ago [-]
I don't think CI/CD should really be a big worry as far as mono-repositories go, since you can set up different pipelines and different flows with different configurations. Something you're probably already doing if you have multiple repos.

In my experience the article is right when it tells you there isn’t that big of a difference. We have all sorts of repositories, some of which are basically mono-repositories for their business domain. We tend to separate where it “makes sense” which for us means that it’s when what we put into repositories is completely separate from everything else. We used to have a lot of micro-repositories and it wasn’t that different to be honest. We grouped more of them together to make it easier for us to be DORA compliant in terms of the bureaucracy it adds to your documentation burden. Technically I hardly notice.

JamesSwift 5 hours ago [-]
In my limited-but-not-nothing experience working with mono vs multi repo of the same projects, CI/CD definitely was one of the harder pieces to solve. Its highly dependent on your frameworks and CI provider on just how straightforward it is going to be, and most of them are "not very straightforward".

The basic way most work is to run full CI on every change. This quickly becomes a huge speedbump to deployment velocity until a solution for "only run what is affected" is found.

devjab 4 hours ago [-]
Which CI/CD pipelines have you had issues with? Because that isn't my experience at all. With both GitHub (also Azure DevOps) and GitLab you can separate your pipelines with configurations like .gitlab-ci.yml. I guess it can be non-trivial to set up proper parallelisation when you have a lot of build stages, if this isn't something you're familiar with. With a lot of other more self-hosted tools, like Gradle, RushJS and many others, you can set up configurations which do X if Y and make sure only to run what's necessary.

I don’t want to be rude, but a lot of these tools have rather accessible documentation on how to get up and running as well as extensive documentation for more complex challenges available in their official docs. Which is probably the, only, place you’ll find good ways of working with it because a lot of the search engine and LLM “solutions” will range from horrible to outdated.

It can be both slower and faster than micro-repositories in my experience, however, you’re right that it can indeed be a Cthulhu level speed bump if you do it wrong.

JamesSwift 3 hours ago [-]
I implied but didn't explicitly mention that I'm talking from the context of moving _from_ an existing polyrepo _to_ a monorepo. The tooling is out there to walk a more happy-path experience if you jump in on day 1 (or early in the product lifecycle). But it's much harder to migrate to it and not have to redo a bunch of CI-related tooling.
bluGill 4 hours ago [-]
The problem with "only run what is affected" is that it is really easy to have something that is affected but doesn't seem like it should be (that is, whatever tools you have to detect whether it is affected say it isn't). So if you have such a system, you must have regular rebuild-everything jobs as well to verify you didn't break something unexpected.

I'm not against only run what is affected, it is a good answer. It just has failings that you need to be aware of.

JamesSwift 3 hours ago [-]
Yeah, that's a good point. Especially for an overly dynamic runtime like Ruby/Rails, there's just not usually a clean way to cordon off sections of code. On the other hand, using Nx in an Angular project was pretty amazing.
bluGill 2 hours ago [-]
Even in something like C++ you often have configuration, startup scripts (I'm in embedded, maybe this isn't a thing elsewhere), database schemas, and other such things that the code depends on, but it isn't obvious to the build system that the dependency exists.
CharlieDigital 5 hours ago [-]
Most CI/CD platforms will allow specification of targeted triggers.

For example, in GitHub[0]:

    name: ".NET - PR Unit Test"
    
    on:
      ## Only execute these unit tests when a file in this directory changes.
      pull_request:
        branches: [main]
        paths: [src/services/publishing/**.cs, src/tests/unit/**.cs]
So we set up different workflows that kick off based on the sets of files that change.

[0] https://docs.github.com/en/actions/writing-workflows/workflo...

victorNicollet 5 hours ago [-]
I'm not familiar with GitHub Actions, but we reverted our migration to Bitbucket Pipelines because of a nasty side-effect of conditional execution: if a commit triggers test suite T1 but not T2, and T1 is successful, Bitbucket displays that commit with a green "everything is fine" check mark, regardless of the status of T2 on any ancestors of that commit.

That is, the green check mark means "the changes in this commit did not break anything that was not already broken", as opposed to the more useful "the repository, as of this commit, passes all tests".

plorkyeran 4 hours ago [-]
I would find it extremely confusing and unhelpful if a PR were marked red by failing tests from the parent commit that weren't rerun because nothing relevant was touched. Why would you even want that? That's not something which is relevant to evaluating the PR, and it would make you get in the habit of ignoring failures.

If you split something into multiple repositories then surely you wouldn't mark PRs on one of them as red just because tests are failing in a different one?

victorNicollet 29 minutes ago [-]
I suppose our development process is a bit unusual.

The meaning we give to "the commit is green" is not "this PR can be merged" but "this can be deployed to production", and it is used for the purpose of selecting a release candidate several times a week. It is a statement about the entire state of the project as of that commit, rather than just the changes introduced in that commit.

I can understand the frustration of creating a PR from a red commit on the main branch, and having that PR be red as well as a result. I can't say this has happened very often, though: red commits on the main branch are very rare, and new branches tend to be started right after a deployment, so it's overwhelmingly likely that the PR will be rooted at a green commit. When it does happen, the time it takes to push a fix (or a revert) to the main branch is usually much shorter than the time for a review of the PR, which means it is possible to rebase the PR on top of a green commit as part of the normal PR acceptance timeline.

ants_everywhere 4 hours ago [-]
Isn't that generally what you want? The check mark tells you the commit didn't break anything. If something was already broken, it should have either blocked the commit that broke it, or there's a flake somewhere that you can only locate by periodically running tests independent of any PR activity.
daelon 4 hours ago [-]
Is it a side effect if it's also the primary effect?
hk1337 4 hours ago [-]
Even AWS CodeBuild (or CodePipeline) allows you to do this now. It didn't before but it's a fairly recent update.
victorNicollet 5 hours ago [-]
Wouldn't CI be easier with a monorepo? Testing integration across multiple repositories (triggered by changes in any of them) seems more complex than just adding another test suite to a single repo.
bluGill 4 hours ago [-]
Pros and cons. Both can be used successfully, but there are different problems with each. If you have a large project, you will have a tooling team to deal with the problems of your solution.
gregmac 5 hours ago [-]
To me, monorepo vs multi-repo is not about the code organization, but about the deployment strategy. My rule is that there should be a 1:1 relation between a repository and a release/deployment.

If you do one big monolithic deploy, one big monorepo is ideal. (Also, to be clear, this is separate from microservice vs monolithic app: your monolithic deploy can be made up of as many different applications/services/lambdas/databases as makes sense). You don't have to worry about cross-compatibility between parts of your code, because there's never a state where you can deploy something incompatible, because it all deploys at once. A single PR makes all the changes in one shot.

The other rule I have is that if you want to have individual repos with individual deployments, they must be both forward- and backwards-compatible for long enough that you never need to do a coordinated deploy (deploying two at once, where everything is broken in between). If you have to do coordinated deploys, you really have a monolith that's just masquerading as something more sophisticated, and you've given up the biggest benefits of both models (simplicity of mono, independence of multi).

Consider what happens with a monorepo with parts of it being deployed individually. You can't checkout any specific commit and mirror what's in production. You could make multiple copies of the repo, checkout a different commit on each one, then try to keep in mind which part of which commit is where -- but this is utterly confusing. If you have 5 deployments, you now have 4 copies of any given line of code on your system that are potentially wrong. It becomes very hard to not accidentally break compatibility.

TL;DR: Figure out your deployment strategy, then make your repository structure mirror that.

CharlieDigital 4 hours ago [-]
It doesn't have to be that way.

You can have a mono-repo and deploy different parts of the repo as different services.

You can have a mono-repo with a React SPA and a backend service in Go. If you fix some UI bug with a button in the React SPA, why would you also deploy the backend?

Falimonda 4 hours ago [-]
This is spot on. A monorepo can still include a granular and standardized CI configuration across code paths. Nothing about monorepo forces you to perform a singular deployment.

The gains provided by moving from polyrepo to monorepo are immense.

Developer access control is the only thing I can think of that justifies polyrepo.

I'm curious if and how others who see the advantages of monorepo have justified polyrepo in spite of that.

oneplane 4 hours ago [-]
You wouldn't, but making a repo collection into a mono-repo means your mono-deploy needs to be split into a multi-maybe-deploy.

As always, complexity merely moves around when squeezed, and making commits/PRs easier means something else, somewhere else gets less easy.

It is something that can be made better of course, having your CI and CD be a bit smarter and more modular means you can now do selective builds based on what was actually changed, and selective releases based on what you actually want to release (not merely what was in the repo at a commit, or whatever was built).

But all of that needs to be constructed too, just merging some repos into one doesn't do that.

CharlieDigital 4 hours ago [-]
This is not very complex at all.

I linked an example below. Most CI/CD, like GitHub Actions[0], can easily be configured to trigger on changes for files in a specific path.

As a very basic starting point, you only need to set up simple rules to detect which monorepo roots changed.

[0] https://docs.github.com/en/actions/writing-workflows/workflo...

bryanlarsen 4 hours ago [-]
If you don't deploy in tandem, you need to test forwards and backwards compatibility. That's tough with either a monorepo or separate repos, but arguably it'd be simpler with separate repos.
CharlieDigital 4 hours ago [-]
It doesn't have to be that complicated.

All you need to know is "does changing this code affect that code".

In the example I've given -- a React SPA and Go backend -- let's assume that there's a gRPC binding originating from the backend. How do we know that we also need to deploy the SPA? Updating the schema would cause generation of a new client + model in the SPA. Now you know that you need to deploy both and this can be done simply by detecting roots for modified files.

You can scale this. If that gRPC change affected some other web extension project, apply the same basic principle: detect that a file changed under this root -> trigger the workflow that rebuilds, tests, and deploys from this root.
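The "detect which roots changed" step above can be sketched by mapping the files touched in the last commit to their top-level directories and fanning out one pipeline per root (spa/backend/extension are hypothetical roots):

```shell
set -eu
work=$(mktemp -d); cd "$work"
git init -q -b main .
git config user.email ci@example.com
git config user.name ci

mkdir -p spa backend extension
echo app > spa/app.tsx; echo srv > backend/main.go; echo ext > extension/ext.ts
git add -A; git commit -qm "initial"

# The gRPC schema changes and the SPA client is regenerated:
echo schema > backend/service.proto
echo client > spa/client.ts
git add -A; git commit -qm "update schema + regenerate SPA client"

# Map changed files to their top-level roots; extension is untouched.
roots=$(git diff --name-only HEAD~1 HEAD | cut -d/ -f1 | sort -u)
for r in $roots; do
  echo "trigger pipeline: $r"
done
```

In a real pipeline the diff would be against the merge base of the PR rather than HEAD~1, and nested roots would need path prefixes instead of a bare `cut`.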

aswerty 4 hours ago [-]
This mirrors my own experience in the SaaS world. Anytime things move towards multiple artifacts/pipelines in one repo, trying to understand what change existed where and when seems to become very difficult.

Of course the multirepo approach means you do this dance a lot more:

- Create a change with backwards compatibility and tombstones (e.g. logs for when backward compatibility is used)
- Update upstream systems to the new change
- Remove backwards compatibility and pray you don't have a low-frequency upstream service interaction you didn't know about

While the dance can be a pain - it does follow a more iterative approach with reduced blast radiuses (albeit many more of them). But, all in all, an acceptable tradeoff.

Maybe if I had more familiarity with mature tooling around monorepos I might be more interested in them. But alas, not a bridge I have crossed, or am pushed to cross just at the moment.

h1fra 5 hours ago [-]
I think the big issue around monorepo is when a company puts completely different projects together inside a single repo.

In this article almost everything makes sense to me (because that's what I have been doing most of my career), but they put their OTP app inside, which suddenly makes no sense. And you can see the problem in the CI: they have dedicated files just for this app, and probably very little common code with the rest.

IMO you should have one monorepo per project (api, frontend, backend, mobile, etc. as long as it's the same project) and if needed a dedicated repo for a shared library.

fragmede 5 hours ago [-]
> you should have one monorepo per project (api, frontend, backend, mobile, etc. as long as it's the same project)

that's not a monorepo!

Unless the singular "project" is "stuff our company ships", the problem you have is an impedance mismatch between the projects, which is the problem that an actual monorepo solves. For SWEs on individual projects who will never have the problem of shipping a commit across all the repos at the "same" time, yeah, that seems fine, and for them it is. The problem comes as a distributed systems engineer where, for whatever reason, many or all of the repos need to ship at the ~same time. Or worse: A needs to ship before B, which needs to ship before C, but that needs to ship before A, and you have to unwind that before actually being able to ship the change.

hk1337 4 hours ago [-]
> that's not a monorepo!

Sure it is! It's just not the ideal use case for a monorepo which is why people say they don't like monorepos.

vander_elst 3 hours ago [-]
"one monorepo per project (api, frontend, backend, mobile, etc. as long as it's the same project) and if needed a dedicated repo for a shared library."

They are literally saying that multiple repos should be used, including for sharing code. This is not a monorepo; these are different repos.

stackskipton 2 hours ago [-]
As a DevOps/SRE type person that occasionally gets stuck with builds: monorepos work well if the company will invest in the build process. However, many companies don't do well in this area, and the monorepo blast radius becomes much bigger, so individual repos it is. Also, depending on the language, a private package repository is easy enough to set up to keep all the common libraries in.
magicalhippo 5 hours ago [-]
We're transitioning from a SVN monorepo to Git. We've considered doing a kind of best-of-both-worlds approach.

Some core stuff into separate libraries, consumed as nuget packages by other projects. Those libraries and other standalone projects in separate repos.

Then a "monorepo" for our main product, where individual projects for integrations etc will reference non-nuget libraries directly.

That is, tightly coupled code goes into the monorepo, the rest in separate repos.

Haven't taken the plunge just yet tho, so not sure how well it'll actually work out.

dezgeg 2 hours ago [-]
In my experience this turns into a nightmare when (not if, when) there is a need to make changes to the libraries and the app at the same time. Especially with libraries, it's often necessary to create a client for an API at the same time to really know that the interface is any good.
stillbourne 2 hours ago [-]
I like to use the monorepo tools without the monorepo repo. If that makes any god damn sense. I use NX at my job and the monorepo was getting out of hand, 6 hour pipeline builds, 2 hours testing, etc. So I broke the repo into smaller pieces. This wouldn't have been possible if I wasn't already using the monorepo tools universally through the project but it ended up working well.
memsom 5 hours ago [-]
monorepos are appropriate for a single project with many sub parts but one or two artifacts on any given release build. But they fall apart when you have multiple products in the monorepo, each with different release schedules.

As soon as you add a second separate product that uses a different subset of any code in the repo, you should consider breaking up the monorepo. If the code is "a bunch of libraries" and "one or more end user products", it becomes even more imperative to consider breaking stuff up.

Having worked on monorepos where there are 30+ artifacts, multiple ongoing projects that each pull the monorepo in to different incompatible versions, and all of which have their own lifetime and their own release cycle - monorepo is the antithesis of a good idea.

vander_elst 3 hours ago [-]
Working on a monorepo where we have hundreds (possibly thousands) of projects each with a different version and release schedule. It actually works quite well, the dependencies are always in a good state, it's easy to see the ramifications of a change and to reuse common components.
memsom 2 hours ago [-]
Good for you. For us it just doesn't work: we have multiple projects going on, pulling the code in different ways; code that runs on embedded, code that runs in the cloud, desktop apps (real ones written in C++ and .NET, not glorified web apps), code that is customer facing, and code used by third parties for integrating our products. The embedded code shares a core with the other layers, and we support multiple embedded platforms (bare metal) and OSes (Windows, Linux, Android, iOS), and also have stuff that runs on the Amazon/Azure cloud platforms. You might be fine, but when you hit critical mass and you have very complicated commercial concerns, it doesn't work well.
tomtheelder 35 minutes ago [-]
I mean it works for Google. Not saying that's a reason to go monorepo, but it at least suggests that it can work for a very large org with very diverse software.

I really don't see why anything you describe would be an issue at all for a monorepo.

munksbeer 4 hours ago [-]
No offense but I think you're doing monorepos wrong. We have more than 100 applications living in our monorepo. They share common core code, some common signals, common utility libs, and all of them share the same build.

We release everything weekly, and some things much more frequently.

If your testing is good enough, I don't see what the issue is?

bluGill 4 hours ago [-]
> If your testing is good enough, I don't see what the issue is?

Your testing isn't good enough. I don't know who you are, what you are working on, or how much testing you do, but I will state with confidence it isn't good enough.

It might be acceptable for your current needs, but you will have bugs that escape testing, often intentionally, as you can't stop development forever to fix all known bugs. In turn that means if anything changes in your current needs, you will run into issues.

> We release everything weekly, and some things much more frequently.

This is a negative for users. When you think you will release again soon anyway, "who cares about bugs" creeps in, and your users see more bugs. Sure, it is nice that you don't have to break open years-old code anymore, but if the new stuff doesn't have anything the user wants, is this really a good thing?

memsom 2 hours ago [-]
No offence, but you might be a little confused by how complex your actual delivery is. That sounds simple. That sounds like it has a clear roadmap. When you don't, and you have very agile development that pivots quickly and demands a lot of concurrent change for releases that have very different goals, it is not possible to get all your ducks in a row. Monorepos suck in that situation. The dependency graph is so complex it will make your head hurt. And all the streams need to converge into the main dev branch at some point, which causes huge bottlenecks.
tomtheelder 33 minutes ago [-]
The dependency graph is no different for a monorepo vs a polyrepo. It's just a question of how those dependencies get resolved.
syndicatedjelly 5 hours ago [-]
Some thoughts:

1) Comparing a photo storage app to the Linux kernel doesn't make much sense. Just because a much bigger project in an entirely different (and more complex) domain uses monorepos, doesn't mean you should too.

2) What the hell is a monorepo? I feel dumb for asking the question, and I feel like I missed the boat on understanding it, because no one defines it anymore. Yet I feel like every mention of monorepo is highly dependent on the context the word is used in. Does it just mean a single version-controlled repository of code?

3) Can these issues with sync'ing repos be solved with better use of `git submodule`? It seems to be designed exactly for this purpose. The author says "submodules are irritating" a couple times, but doesn't explain what exactly is wrong with them. They seem like a great solution to me, but I also only recently started using them in a side project
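For reference, the flow being asked about looks roughly like this (reproduced with local stand-in repos for my_org/repo1 and my_org/repo2 so the example is self-contained); the usual complaint is the extra sync steps after the `add`, which are easy to forget:

```shell
#!/bin/sh
set -e
# Two local repos standing in for my_org/repo2 (the dependency)
# and my_org/repo1 (the consumer).
tmp=$(mktemp -d); cd "$tmp"
git init -q repo2
git -C repo2 -c user.email=demo@example.com -c user.name=demo \
  commit -q --allow-empty -m "repo2 initial"

git init -q repo1 && cd repo1
# repo1 records a specific commit of repo2; that pointer is versioned
# like any other file. (protocol.file.allow is only needed because the
# "remote" here is a local path.)
git -c protocol.file.allow=always submodule add "$tmp/repo2" libs/repo2
git -c user.email=demo@example.com -c user.name=demo \
  commit -qm "Add repo2 as submodule"

# A fresh clone needs `git clone --recurse-submodules`, and after a plain
# pull the submodule must be synced manually or libs/repo2 goes stale:
git submodule update --init --recursive
cat .gitmodules
```

Bumping repo2 later means running `git submodule update --remote libs/repo2` and committing the moved pointer in repo1, and every teammate then has to remember the `update --init` step on their next pull.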

datadrivenangel 4 hours ago [-]
Monorepo is just a single repo. Yup.

Git submodules have some places where you can surprisingly lose branches/stashed changes.

syndicatedjelly 4 hours ago [-]
One of my repos has a dependency on another repo (that I also own). I initialized it as a git submodule (e.g. my_org/repo1 has a submodule of my_org/repo2).

    Git submodules have some places where you can surprisingly lose branches/stashed changes.
This concerns me, as git generally behaves as a leak-proof abstraction in my experience. Can you elaborate or share where I can learn more about this issue?
klooney 4 hours ago [-]
> Does it just mean a single version-controlled repository of code?

Yeah, the idea is that all of your projects share a common repo. This has advantages and drawbacks. Google is most famous for this approach, although I think they technically have three now: one for Google, one for Android, and one for Chrome.

> They seem like a great solution to me

They don't work in a team context because they're extra steps that people don't do, basically. And for some reason a lot of people find them confusing.

nonameiguess 3 hours ago [-]
https://github.com/google/ contains 2700+ repositories. I don't know necessarily how many of these are read-only clones from an internal monorepo versus how many are separate projects that have actually been open-sourced, but the latter is more than zero.