Some changes of mine were reworked by someone and upstreamed:
If you follow the link to the mailing list posting, you will see I implemented a feature: search through multiple caches.
Say you have a build server which populates a cache. Your developers can point their ccache installations at that. However, ccache also wants to write there. The fix is a two-level hierarchy: ccache looks in the local cache first, and then in the one from the server. On a cache miss, the newly compiled object goes into the local cache, so the server cache isn't polluted with development objects.
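For reference, modern ccache offers a similar two-level setup out of the box. A minimal config sketch (assuming ccache ≥ 4.7, where this is called `remote_storage`; earlier 4.x releases spell it `secondary_storage`; the mount path is a placeholder):

```
# ~/.config/ccache/ccache.conf
# Local cache is consulted and written first; the server cache is read-only.
remote_storage = file:/mnt/buildserver-ccache|read-only=true
```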
One thing missing from this is a trust framework where cache writers would sign their compiled code. There could also be a verification layer.
Building big sharable package systems is indeed the second half of the problem. Figuring out how to make a system that has the hashy goodness AND is usefully collaborative, with async swarms of people working independently and yet sharing work: tricky. We're trying to do it, though!
We have our first few packages now published, here: https://catalog.warpsys.org/
You can also see _how_ we build things, because we publish the full rebuild instructions with each release: for example, here's how we packaged our bash: https://catalog.warpsys.org/warpsys.org/bash/_replays/zM5K3V...
I'm in #warpforge on matrix with some collaborators if anyone's interested in joining us for a chat and some hacking :)
gittup implements part of this idea, but without the distributed bit. bazel implements the distributed bit (minus trust), but not the distro bit. What's really lacking is momentum around the idea to get a sufficient number of people behind it.
Besides the global cache for the whole distro, you can also set up caches for other software. For example, if you build your projects with Nix, you can have a cache for your projects (so that new contributors won't need to recompile everything from scratch). That's the premise behind https://www.cachix.org/
The only difference is that Nix caches aren't fine-grained like ccache and sccache are.
The trick is to set -fdebug-prefix-map and -fprofile-dir to proper relative paths; then, with some extra scripting, caches become reusable across build nodes even when the workspace directory differs for each build.
This and distcc (or IncrediBuild) are a game changer for every serious C++ workshop.
I kind of wonder if in the future there will be a publicly trustable, free-to-use, "just plug and play, you don't need to set up your own" ccache
Think of the energy savings implications
Imagine every unoptimized build job for every microservice (Rust in Docker pulls the entire Cargo registry and rebuilds all dependencies every time you make any source code change, unless you go out of your way to prevent it)
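One common way to go out of your way, sketched here (the base-image tag and paths are assumptions): build the dependencies in their own Docker layer so the registry pull and dependency compilation are cached until Cargo.toml/Cargo.lock actually change.

```dockerfile
FROM rust:1.77 AS builder          # version tag is an assumption
WORKDIR /app
# Layer 1: dependencies only. A dummy main.rs lets cargo compile all deps.
COPY Cargo.toml Cargo.lock ./
RUN mkdir src && echo 'fn main() {}' > src/main.rs && cargo build --release
# Layer 2: real sources. Only this layer is invalidated by code edits.
COPY src ./src
RUN touch src/main.rs && cargo build --release
```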
Obviously trusting a public source to just compile something for you and give you a binary-like object is... probably a malware distributor's dream.
And I don't have data to prove the bandwidth cost would offset the energy savings of CPU cycles recalculating the same stuff over and over.
I never had the time to set it up properly, but by the looks of it, it should be even better.
Finally if you run configure scripts, all those “checking for printf…” messages are the configure script generating and compiling tiny C programs invoking those functions to make sure the compiler can find them. ccache can therefore shave a significant percentage of time off running configure scripts, which is welcome.
It was strictly a cache, it didn't run parallel checks or make any other attempts to improve the run time.
¹ The only source I've found right now is https://github.com/fxttr/confcache
It has a number of features, combining capabilities of ccache/distcc/icecream for C, C++ and Rust...along with some unique things that I've not seen in other tools. My comment at https://news.ycombinator.com/item?id=25604249 has a summary.
distcc, for example, didn't handle the load balancing well in my experiments, whereas icecream did much better on that front, resulting in noticeably shorter build times. icecream also comes with a nice GUI (icemon) which comes in really handy, e.g. you can use it to observe the build jobs and debug when things go south.
But I didn't know that sccache also supports distributed builds. From your comment it seems as if this was a recentish addition. I wonder how polished it is, but I will definitely give it a try at some point.
Bazel does this, so instead of needing to reimplement this for each compiler it's automatically done across all your languages, and even any random one-off build rules based on a shell script.
You can share the cache with your team/build infra too: https://bazel.build/remote/caching
(Disclaimer: I've never used this with open-source Bazel, I work at Google and use the internal variant)
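A minimal sketch of wiring that up with open-source Bazel (the cache URL is a placeholder):

```
# .bazelrc
build --remote_cache=https://cache.example.com
# Developers consume the cache read-only; only trusted CI uploads results.
build --remote_upload_local_results=false
```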
If you use it distributed between different developers, how do you make sure the cached result is secure? A shared cache that everyone can contribute to is really hard (impossible?) to make secure. Someone could add malicious code to the cache that everyone will then use.
Also, release builds were excluded from caching, to prevent any form of poisoning there.
Let’s say you have 15 engineers and they each have their own laptop computer. Each of these engineers generates a pair of cryptographic keys, one public and one private.
Each engineer then gives their public key to the trusted authority that operates the ccache server. Only code that is submitted and signed by a respective private key is built and then distributed to the rest of the engineers.
For a public project you would only want the builds to be propagated out to other developers once the changes had been approved and then merged into a branch that triggers the CI.
Using it locally too, works great on Mac, but on Windows ccache has some problems caching debug builds. IIRC the embedded debug symbols use absolute paths, so the presence of this particular flag (/Z something...) disables cache eligibility.
We are trying to use that with Conan, which changes the prefix dirs all the time. Without hash_dir, the full path is not stored in the command-line args.
Just looking at a Jenkins machine with ccache, I see a >90% hit rate, with 440k compilations returned from cache in the last 3 months (since stats were last reset).
ccache only requires rsh/ssh access from the controller to any remote build nodes, plus the ccache program. I've never heard of anyone using this program as a shared cache source. That would be, well, kinda dumb. :/
*when there are 6 labs of machines to use
distcc is also nice if you have access to a k8s cluster with spare capacity. https://lastviking.eu/distcc_with_k8.html
I used distcc with k8s on a medium sized C++ project, until I got a workstation suitable for the compilations (32 core AMD thread-ripper). With the new workstation in place, I changed the build-script for the project to use ccache by default for all builds, and mapped a docker volume from the build-container to the local disk to keep the cache around.
However, that project is based on an old version of clang and the changes were never upstreamed (initially it was a commercial product), so sadly the project is practically dead.
For other generator targets, adding ccache was a single line in the CMake configuration, but for Xcode you had to bend over backwards. This was maybe 4 years ago.
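For reference, the single line in question (a sketch, slightly expanded to guard against ccache being absent) for the Makefile/Ninja generators looks like this; Xcode instead needs compiler-wrapper scripts, which is the bend-over-backwards part:

```cmake
# Route C/C++ compiles through ccache when it is installed.
find_program(CCACHE_PROGRAM ccache)
if(CCACHE_PROGRAM)
  set(CMAKE_C_COMPILER_LAUNCHER   "${CCACHE_PROGRAM}")
  set(CMAKE_CXX_COMPILER_LAUNCHER "${CCACHE_PROGRAM}")
endif()
```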
I haven't used either of those since 486s and P1s were popular.
2. Remove the object file associated with the C++ file.
3. Build again with CCACHE_RECACHE=1.
GNU Autotools: A hideous thing
cc -c main.c -o main.o
cc -c init.c -o init.o
cc init.o main.o -o final_binary
Using ccache is also nice when you have generated files. If you edit the generator code but the output is identical, Make will needlessly rebuild everything and ccache will make it quick.
Which makes me agree with the parent above: I don't see how exactly ccache is supposed to be used. Maybe for a distributed source directory with many developers working on it?
>If you ever run `make clean; make`, you can probably benefit from ccache. It is common for developers to do a clean build of a project for a whole host of reasons, and this throws away all the information from your previous compilations. By using ccache, recompilation goes much faster.
What is the purpose of "make clean" other than to invalidate the whole cache so that it is cleanly recompiled? In such a situation I would want to invalidate the cache from ccache also completely.
I'm sure there are legitimate reasons for using ccache, but it is not very obvious to me what they are:
"Only knows how to cache the compilation of a single file. Other types of compilations (multi-file compilation, linking, etc) will silently fall back to running the real compiler. "
Well yes, the traditional use of makefiles has been exactly to cache the compilation of single compilation units and trigger recompilation of changed units; ccache does not seem to help with granularity here.
Distributed development might be a good argument for this, but then what does it offer to facilitate that? It seems to suggest using NFS, which I could do with a Makefile as well. So is the advantage that it uses hashes instead of timestamps? Timestamps work quite well for me, but maybe that is a valid point.
Another argument could be that it stores the precompiled units somewhere else and therefore doesn't clutter the file system. But is that really a good argument? Build directories exist, so even if you'd like to keep compiling several variants in parallel, you could do so with a few different build directories.
And yes, there are quite a lot of newer alternatives to Makefiles as well, so it would have to compete with those alternative build systems as well.
That was pre-Android Studio times. IDK what the situation is now.
But how many times have I seen a perfect description of deps?
gcc -o hello hello.c