Delete tests (andre.arko.net)
recursivedoubts 6 hours ago [-]
One of the most important things you can do is move your tests up the abstraction layers and away from unit tests. For lack of a better term, to move to integration tests. End-to-end tests are often too far from the system to easily understand what's wrong when they break, and can overwhelm a development org. Integration tests (or whatever you want to call them) are often the sweet spot: not tied to a particular implementation, able to survive fairly significant system changes, but also easy enough to debug when they break.

https://grugbrain.dev/#grug-on-testing

BinaryIgor 3 hours ago [-]
Was exactly about to point that out - if you mostly have integration tests (aka in-between tests), you rarely need to refactor your tests. It's about testing mostly at the right abstraction level: https://binaryigor.com/unit-integration-e2e-contract-x-tests...
RHSeeger 5 hours ago [-]
Integration tests and Unit tests are different tools; and each has their place and purpose. Using one "instead" of the other is a mistake.
MrJohz 3 hours ago [-]
I've never really found this to be the case in practice. When I look at well-written unit tests and well-written integration tests, they're usually doing exactly the same sort of thing and have very similar concerns in terms of code organisation and test structure.

For example, in both cases, the tests work best if I test the subject under test as a black box (i.e. interact only with its public interface) but use my knowledge of its internals to identify the weaknesses that will most require testing. In both cases, I want to structure the code so that the subject under test is as isolated as possible - i.e. no complex interactions with global state, no mocking of unrelated modules, and no complex mechanism to reset anything after the test is done. In both cases, I want the test to run fast, ideally instantaneously, so I get immediate results.

The biggest difference is that it's usually harder to write good integration tests because they're interacting with external systems that are generally slower and stateful, so I've got to put extra work into getting the tests themselves to be fast and stateless. But when that works, there's really not much difference at all between a test that tests a single function, and a test that tests a service class with a database dependency.
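
To make that concrete, here is a rough sketch of what I mean by "same shape" (pytest-style; the names are invented for illustration). Both tests poke only the public interface; the second one just happens to have a dependency injected:

  class InMemoryUserRepo:
      """Stands in for the real database table."""
      def __init__(self):
          self._emails = {}

      def add(self, user_id, email):
          self._emails[user_id] = email

      def get(self, user_id):
          return self._emails.get(user_id)

  def normalize_email(email):
      return email.strip().lower()

  class UserService:
      def __init__(self, repo):
          self.repo = repo

      def register(self, user_id, email):
          self.repo.add(user_id, normalize_email(email))
          return self.repo.get(user_id)

  def test_normalize_email_is_case_insensitive():           # "unit"
      assert normalize_email("  Bob@Example.COM ") == "bob@example.com"

  def test_register_stores_normalized_email():              # "integration"
      service = UserService(InMemoryUserRepo())
      assert service.register(1, "  Bob@Example.COM ") == "bob@example.com"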

rkomorn 3 hours ago [-]
I've found that well-written unit tests help me narrow down problems faster during development (eg one unit test failing for a function would show that a change or refactor missed an edge case).

I've found that well-written integration tests help me catch workflow-level issues (eg something changed in a dependency that might be mocked in unit tests).

So while I think good integration tests are the best way to make sure things should ship, I see a lot of value in good unit tests for day-to-day velocity, particularly in code that's being maintained or updated instead of new code.

CuriouslyC 2 hours ago [-]
Unit tests are good for testing isolated units of code, integration tests test integration. If you wait until you have enough code to test integration, when you actually write the tests you're going to find you've checked in a bunch of almost-working code.
RHSeeger 3 hours ago [-]
I'll go with a bank account, because that was one of the initial examples for automated testing.

I would write integration/system tests (different, but similar, imo) to test that the black box integrations with the system work as expected. Generally closer to the "user story" end of things.

I would write unit tests for smaller, targeted things. Like making sure the sort method works in various cases, etc. Individual methods, especially ones that don't interact with data outside what is passed into them (functional methods), are good for unit testing.

9rx 2 hours ago [-]
> to test that the black box integrations with the system work as expected. Generally closer to the "user story" end of things.

This is what unit testing was originally described as. Which confirms my belief that unit testing and integration testing have always been the very same thing.

> Individual methods, especially ones that don't interact with data outside what is passed into them (functional methods), are good for unit testing.

Perhaps unit testing has come to mean this, but these kinds of tests are rarely ever worth writing, so it is questionable if it even needs a name. Sometimes it can be helpful to isolate a function like that for the sake of pinning down complex logic or edge cases, but it is likely you'll want to delete this kind of test once you're done. This is where testing brittleness is born.

RHSeeger 2 hours ago [-]
I've described this before on occasion; I consider there to be a wide variety of tests.

- Unit test = my code works

- Functional test = my design works

- Integration test = my code is using your 3rd party stuff correctly (databases, etc)

- Factory Acceptance Test = my system works

- Site Acceptance Test = your code sucks, this totally isn't what I asked for!?!

Then there's more "concern oriented" groupings, like "regression tests", which could fall into any number of the above.

That being said, there's a pretty wide set of opinions on the topic, and that doesn't really seem to change over time.

> these kinds of tests are rarely ever worth writing

I strongly disagree. I find it very helpful to write unit tests for specific implementations of things (like a specific sort, to make sure it works correctly with the various edge cases). Do they get discarded if you completely change the implementation? Sure. But that doesn't detract from the fact that they help make sure the current implementation works the way I say it does.
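
Something like this is what I have in mind (pytest; the sort here is an invented stand-in for "a specific implementation"):

  def my_sort(items):
      # a hand-rolled insertion sort standing in for the current implementation
      result = []
      for item in items:
          i = len(result)
          while i > 0 and result[i - 1] > item:
              i -= 1
          result.insert(i, item)
      return result

  def test_empty_list():
      assert my_sort([]) == []

  def test_single_element():
      assert my_sort([42]) == [42]

  def test_already_sorted_input():
      assert my_sort([1, 2, 3]) == [1, 2, 3]

  def test_reverse_sorted_with_duplicates():
      assert my_sort([3, 2, 2, 1]) == [1, 2, 2, 3]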

9rx 2 hours ago [-]
> I find it very helpful to write unit tests for specific implementations of things (like a specific sort, to make sure it works correctly with the various edge cases).

Sorting mightn't be the greatest example as sorting could quite reasonably be the entire program (i.e. a library).

But if you needed some kind of custom sort function to serve features within a greater application, you are already going to know that your sort function works correctly by virtue of the greater application working correctly. Testing the sort function in isolation is ultimately pointless.

As before, there may be some benefit in writing code to run that sort function in isolation during development to help pinpoint what edge cases need to be considered, but there isn't any real value in keeping that around after development is done. The edge cases you discovered need to be moved up in the abstraction to the greater program anyway.

mrugge 2 hours ago [-]
In test-driven development, fast unit tests are a must-have. Integration tests are too slow. If you are not doing test-driven development, you can go heavier into integration tests. I find the developer experience is not as fun without good unit tests, and even if velocity metrics are the same, that factor alone is a good reason to focus on writing more fast unit tests.
MrJohz 21 minutes ago [-]
In general, fast tests are a must-have, but I find that means figuring out how to write fast integration tests as well so that they can also be run as part of a TDD-like cycle. In my experience, integration tests can generally be written to be very quick, but maybe my definition of an integration test is different from yours?

For me, heavy tests implies end-to-end tests, because at that point you're interacting with the whole system including potentially a browser, and that's just going to be slow whichever way you look at it. But just accessing a database, or parsing and sending http requests doesn't have to be particularly slow, at least not compared to the speed at which I develop. I'd expect to be able to run hundreds of those sorts of tests in less than a second, which is fast enough for me.
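
For example, a "database" test can be this cheap (a toy sketch using Python's stdlib sqlite3 in memory; the schema and functions are invented):

  import sqlite3

  def save_order(conn, order_id, total_cents):
      conn.execute("INSERT INTO orders (id, total_cents) VALUES (?, ?)", (order_id, total_cents))

  def order_total(conn, order_id):
      row = conn.execute("SELECT total_cents FROM orders WHERE id = ?", (order_id,)).fetchone()
      return row[0] if row else None

  def test_saved_order_can_be_read_back():
      conn = sqlite3.connect(":memory:")    # fresh, stateless, and fast
      conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total_cents INTEGER)")
      save_order(conn, 1, 999)
      assert order_total(conn, 1) == 999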

soanvig 2 hours ago [-]
I think this discussion has to open with what a "unit" is in unit tests. "Integration" consists of many units working together. But my unit can be a function or an entire module. That's what people ignore in most discussions about test types.
globular-toast 37 minutes ago [-]
It depends what you are doing. Let's say your module implements a way to declare rules and then run some validation function to check objects against those rules. You can't just test every possible set of rules and object that you want to check, even though this is, of course, all that matters. You have to unit test the implementation of the module to be at all confident that it's doing the right thing.

So ultimately we write tests at a lower level to deal with the combinatorial explosion of possible inputs at the edge.

You should push your tests as far to the edge as possible but no further. If a test at the edge duplicates a test in the middle, delete the test in the middle. But if a test at the edge can't possibly account for everything, you're going to need a test in the middle.
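
As a toy illustration of that (pytest; the rules module is invented) - each rule gets pinned down on its own, and the edge-level test only needs a representative combination rather than every possible rule set:

  def min_length(n):
      return lambda value: len(value) >= n

  def forbidden(chars):
      return lambda value: not any(c in value for c in chars)

  def validate(value, rules):
      return all(rule(value) for rule in rules)

  # lower-level tests: each rule's edge cases in isolation
  def test_min_length_boundary():
      assert min_length(3)("abc") and not min_length(3)("ab")

  def test_forbidden_characters():
      assert forbidden("!?")("abc") and not forbidden("!?")("a!c")

  # edge-level test: one representative combination, not the full explosion
  def test_validate_requires_every_rule_to_pass():
      rules = [min_length(3), forbidden("!?")]
      assert validate("abcd", rules)
      assert not validate("a!", rules)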

JimDabell 17 minutes ago [-]
In my experience, a bug that causes a unit test to fail also causes an integration or E2E test to fail. Also, it’s relatively easy to determine the cause of the problem given a change and a failing integration/E2E test. Unit tests are usually much quicker to run, but you also need a lot more of them. I think when you combine these things, it’s easy to reach the conclusion that unit tests are redundant.
s_ting765 46 minutes ago [-]
Integration tests make unit tests absolutely redundant.
simianwords 3 hours ago [-]
Wow I hate this dogmatism. It is indeed better to use one instead of the other. Let’s stop pretending all are equally good and we need every type of test.

Sometimes you just don’t need unit tests and it’s okay to admit it and work accordingly.

RHSeeger 3 hours ago [-]
And sometimes you only need screws, instead of nails; or vice versa. But that doesn't invalidate the tool; it just means your use case doesn't need it.
imiric 2 hours ago [-]
You can't build a house without nails and screws, though.

Sure, if you're only writing a small script, you might not need tests at all. But as soon as that program evolves into a system that interacts with other systems, you need to test each component in isolation, as well as how it interacts with other systems.

So this idea that unit tests are not useful is coming from a place of laziness. Some developers see it as a chore that slows them down, instead of seeing it as insurance that makes their life easier in the long run, while also ensuring the system works as intended at all layers.

CuriouslyC 2 hours ago [-]
If you don't write unit tests, how do you know something works? Just manual QA? How long does that take you relative to unit tests? How do you know if something broke due to an indirect change? Just more manual QA? Do you really think this is saving you time?
tsimionescu 1 hours ago [-]
You can write many other kinds of automated tests. Unit tests are rarely worth it, since they only look at the code in isolation, and often miss the forest for the trees if they're the only kind of test you have. But then, if you have other higher level tests that test your components are working well together, they're already implicitly covering that each component individually works well too - so your unit tests for that component are just duplicating the work the integration tests are already doing.
s_ting765 41 minutes ago [-]
I don't understand developers who don't care to know if their code runs or not.
imiric 2 hours ago [-]
You claim it's dogmatism, yet do the same thing in reverse. (:

Unit and integration tests test different layers of the system, and one isn't inherently better or more useful than the other. They complement each other to cover behavior that is impossible to test otherwise. You can't test low-level functionality in integration tests, just as you can't test high-level functionality in unit tests.

There's nothing dogmatic about that statement. If you disagree with it, that's your prerogative, but it's also my opinion that it is a mistake. It is a harmful mentality that makes code bases risky to change, and regressions more likely. So feel free to adopt it in your personal projects if you wish, but don't be surprised if you get push back on it when working in a team. Unless your teammates think the same, in which case, good luck to you all.

tsimionescu 1 hours ago [-]
The problem with this line of argument is that, in general, high level behavior (covered by integration tests) is dependent on low level behavior. So if your code is ascertained to work at the high level, you also know that it must be working at the lower level too. So, integration tests also tell you if your component works at a low level, not just a high level.

The converse is not true, however. It's perfectly possible for individual components to "work" well, but to not do the right thing from a high level perspective. Say, one component provides a good fast quicksort function, but the other component requires a stable sort to work properly - each is OK in isolation, but you need an integration test to figure out the mistake.
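
A toy sketch of that mismatch (pytest; names invented) - a unit test of either piece alone would pass, but only a test of the combination encodes the stability requirement that spans both:

  def fast_sort(rows, key):
      # component A: some sort implementation; its own unit tests only check that
      # the output is ordered by the key, which says nothing about stability
      return sorted(rows, key=key)

  def statement(rows):
      # component B: silently relies on the second sort being stable, so that
      # within each account the rows keep the date order from the first sort
      rows = fast_sort(rows, key=lambda r: r["date"])
      return fast_sort(rows, key=lambda r: r["account"])

  def test_statement_keeps_date_order_within_each_account():
      rows = [{"account": "A", "date": 3}, {"account": "B", "date": 1},
              {"account": "A", "date": 1}, {"account": "A", "date": 2}]
      a_dates = [r["date"] for r in statement(rows) if r["account"] == "A"]
      # swap an unstable sort into fast_sort and this is the test that catches it
      assert a_dates == [1, 2, 3]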

Unit tests are typically good scaffolding. They allow you to test bits of your infrastructure as you're building it but before it's ready for integration into the larger project. But they give you relatively little assurance at the project level, and are not worth it unless you're pretty sure you're building the right thing in the first place.

ashishb 4 hours ago [-]
What you are asking for is to write tests along the axis of least change

https://ashishb.net/programming/bad-and-good-ways-to-write-a...

yeswecatan 5 hours ago [-]
I find testing terminology very confusing and inconsistent. Personally, I prefer tests that cover multiple components. Is that an integration test because you test multiple components? What if your system is designed in such a way that these tests are _fast_ because the data access is abstracted away and you can use in-memory repositories instead of hitting the database?
creesch 19 minutes ago [-]
> I find testing terminology very confusing and inconsistent.

That's because it is both confusing and inconsistent. In my experience, every company uses slightly different names for different types of tests. Unit tests are generally fairly well understood as testing the single unit (a method/function) but after that things get murky fast.

For example, integration tests as reflected by the confused conversation in this thread already has wildly different definitions depending on who you ask.

For example, someone might interpret them as "unit integration tests" where it reflects a test that tests a class, builder, etc. Basically something where a few units are combined. But, in some companies I have seen these being called "component tests".

Then there is the term "functional tests", which in some companies means the same as "manual tests done by QA" but for others simply means automated front-end tests. And in yet other companies those automated tests are called end-to-end tests.

What's interesting to me when viewing these online discussions is the complete lack of awareness people display about this.

You will see people very confidently say that "test X should by done in such and such way" in response to someone where it is very clear they are actually talking about different types of tests.

jessekv 4 hours ago [-]
I think it's relative, right? That's how abstractions and interfaces work.

I can write a module with integration tests at the module level and unit tests on its functions.

I can now write an application that uses my module. From the perspective of my application, my module's integration tests look like unit tests.

My module might, for example, implicitly depend on the test suite of CPython, the C compiler, the QA at the chip fab. But I don't need to run those tests any more.

In your case you hope the in-memory database matches the production one enough that you can write fast isolated unit tests on your application logic. You can trust this works because something else unit-tested the in-memory database, and integration tested the db client against the various db backends.

MathMonkeyMan 6 hours ago [-]
Integration tests at $DAY_JOB are often slow (sleeps, retries, inadequate synchronization, starting up and shutting down 8 processes that are slow to start and stop), flaky (the metrics for this rate limiter should be within 5%, this should be true within 3 seconds, the output of this shell command is the same on all platforms), undocumented, and sometimes cannot be run locally or with locally available configurations. When I run a set of integration tests associated with some code I'm modifying, I have no idea what they are, why they were written, what they do, how long they will take to run, or whether I should take failures seriously.

Integration tests are closer to what you want to know, but they're also more. If I want to make sure that my state machine returns an error when it receives a message for which no state transition is defined, I could spin up a process and set up log collection and orchestrate with python and... or I could write a unit test that instantiates a state machine, gives it a message, and checks the result.
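
That unit test really is just a handful of lines (a sketch with an invented state machine):

  class StateMachine:
      TRANSITIONS = {("idle", "start"): "running", ("running", "stop"): "idle"}

      def __init__(self):
          self.state = "idle"

      def handle(self, message):
          key = (self.state, message)
          if key not in self.TRANSITIONS:
              return "error: no transition for {!r} in state {!r}".format(message, self.state)
          self.state = self.TRANSITIONS[key]
          return "ok"

  def test_message_with_no_transition_returns_error():
      machine = StateMachine()    # starts in "idle", which has no "stop" transition
      assert machine.handle("stop").startswith("error")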

My point is that we need both. Write a unit test to ensure that your component behaves to its spec, especially with respect to edge cases. Write an integration test to make sure that the feature of which your component is a part behaves as expected.

majormajor 4 hours ago [-]
You need to test contracts with external code without having to include full external systems. Unit tests on internal implementation details are fragile as behavior changes. Unit tests on your module's contracts give you confidence in refactoring.

Passing params in instead of making external calls inside your business logic functions can help. DI can help if that's too impractical or unwieldy for whatever reason in the domain.
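
Roughly what I mean (an invented example; the "before" is only sketched in a comment):

  # before: the business rule fetches its own data, so testing its contract
  # drags in an external system (or a pile of mocks)
  #
  #   def shipping_cost(order_id):
  #       order = billing_api.fetch(order_id)    # hypothetical external call
  #       ...
  #
  # after: callers do the fetching, the rule just computes on its parameters
  def shipping_cost(weight_kg, destination, base_rates):
      base = base_rates[destination]
      return base if weight_kg <= 1 else base + (weight_kg - 1) * 2.5

  def test_heavy_parcel_pays_per_extra_kilo():
      assert shipping_cost(3, "EU", {"EU": 5.0}) == 10.0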

It's hard to do right the first time - sometimes it's fuzzy what's an internal detail vs what's an external contract - but you need to get there ASAP.

skydhash 5 hours ago [-]
My current mental model is a car. If it's a function or some other thing where you're fully confident you've captured the domain, add unit tests to capture that. Just like an engine. But the most important is integration tests that couple something like the engine and the ignition system and test that when the user presses the start button, the engine starts and the dashboard lights up.
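
In code, the analogy might look something like this (a toy sketch, pytest-style):

  class Engine:
      def __init__(self):
          self.running = False

      def ignite(self):
          self.running = True

  class Dashboard:
      def __init__(self):
          self.lit = False

  class Car:
      def __init__(self):
          self.engine = Engine()
          self.dashboard = Dashboard()

      def press_start(self):
          self.engine.ignite()
          self.dashboard.lit = self.engine.running

  def test_engine_ignites():    # unit: the engine alone
      engine = Engine()
      engine.ignite()
      assert engine.running

  def test_start_button_starts_engine_and_lights_dashboard():    # integration
      car = Car()
      car.press_start()
      assert car.engine.running and car.dashboard.lit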

Unit tests are great for DX but only integration and above tests matter business wise.

lenkite 2 hours ago [-]
These are "system tests" at your $DAY_JOB, not "integration tests".
3036e4 3 hours ago [-]
I remember reading blogs (and Testing on the Toilet) around 2010 about how Google divided tests into Small/Medium/Large, with specific definitions, rather than trying to use more vague and overloaded terminology that no one ever agreed on. Seems like they are no longer doing that? Too bad, since I think it was a clever trick to avoid having to get into pointless discussions about things like "what is a unit?". Having experienced more than one project where a unit test was uselessly defined to "have to only run a single method, everything else must be mocked" I like the idea of not having any level of tests below "small" (that is still above a level most would call "unit").

Found this long 2011 post now that goes into some detail on the background and the reasons for introducing that ("The Testing Grouplet"?): https://mike-bland.com/2011/11/01/small-medium-large.html

But I am not sure even after reading all that if the SML terminology was still used in 2011 or if they had moved on already? Can't really find any newer sources that mention it.

strogonoff 2 hours ago [-]
If there is one single test-related thing you must have, that would be e2e testing.

Integration tests are, in a way, worst of both worlds: they are more complicated than unit tests, they require involved setup, and yet they can’t really guarantee that things work in production.

End-to-end tests, meanwhile, do show whether things work or not. If something fails with an error, error reporting should be good enough in the first place to show you what exactly is wrong. If something failed without an error but you know it failed, make it fail with an error first by writing another test case. If there was an error but error reporting somehow doesn’t capture it, you have a bigger problem than tests.

At the end of the day, you want certainty that you deliver working software. If it’s too difficult to identify the failure, improve your error reporting system. Giving up that certainty because your error reporting is not good enough seems like a bad tradeoff.

Incidentally, grug-friendly e2e tests absolutely exist: just take your software, exactly as it’s normally built, and run a script that uses it like it would be used in production. This gives you a good enough guarantee that it works. If there is no script, just do it yourself, go through a checklist, write a script later. It doesn’t get more grug than that.

dsego 2 hours ago [-]
E2e tests are the hardest to maintain and take a lot of time for little benefit in my experience. I'm talking about simulating a browser to open pages and click on buttons. They are flaky and brittle, the UI is easily the component which gets updated the most often, it's also easy to manually test while developing during QA and UAT. It's hard to mock out things, so you either have to bootstrap or maintain a whole 2nd working system with all the bells and whistles, including authentication, users, real data in the database, 3rd party integrations etc. It's just too overwhelming for little benefit. It's also hard to cover all error cases to see if a thing works correctly or breaks subtly. Most commonly in e2e we test for the happy path just to see that the thing doesn't fall over.
strogonoff 2 hours ago [-]
The benefit is certainty that the system you are building and delivering to people works. If that benefit is little, then I don’t quite understand the point of testing.

> it's also easy to manually test while developing during QA and UAT.

As I said in the original comment, e2e tests can definitely be manual. Invoke your CLI, curl your API, click around in GUI. That said, comprehensively testing it that way quickly becomes infeasible as your software grows.

jraph 2 hours ago [-]
Integration tests are a compromise. e2e tests may be quite expensive to run (for a web application, you might need to run your backend and a web browser, possibly in a docker container - and the whole thing will also run slower). Efficiency matters a lot.

You can have robust testing by combining the two. You can check that the whole thing runs end to end once, and then test all the little features / variations using integration tests.

That's what we do for XWiki.

https://dev.xwiki.org/xwiki/bin/view/Community/Testing/#HTes...

ivanb 1 hours ago [-]
In practice e2e tests don't cover all code paths and raise a question: what is the point? There is a code path explosion when going from a unit to an endpoint. A more low-level test can cover all code paths of every small unit, whereas tests at service boundary do not in practice do that. Even if they did, there would be a lot of duplication because different service endpoints would reuse the same units. Thus, I find e2e tests very limited in usability. They can demonstrate that the whole stack works together on a happy path, but that's about it.
vitonsky 2 hours ago [-]
No, thanks. I already spent time writing tests while implementing features; now I have a lot of tests that prove the features work fine, and I no longer fear making changes, because the tests keep me safe from regression bugs.

The typical problems of any code base with no tests are regression bugs, a rigid team (because they must keep in mind all the cases where code may destroy everything), and fear-driven development (because even a team with zero turnover doesn't actually remember all the problems they've fixed).

willio58 2 hours ago [-]
Did you read the article?

What is your answer to the points the author makes around flaky tests/changing business requirements/too many tests confirming the same functionality and taking too long to run?

snovv_crash 1 hours ago [-]
Flaky tests: tests should be deterministic. If your tests are flakey in a 100% controlled environment, probably your real system is unreliable too.

Changing business requirements: business logic should be tested separately. It is expected to change, so if all of your tests include it, then yes of course it will be hard to maintain.

Too many tests for the same thing: yeah then maybe delete some of the duplicates?

Taking too long: mock stuff out. Also, maybe reconsider some architectural decisions you made, if your tests take too long it's probably going to bother your customers with slow behaviour too.

ikari_pl 1 hours ago [-]
I think the point of the article is to delete the BAD tests.

Just like you need to delete the bad code, not all the code. ;)

CPLX 1 hours ago [-]
Hacker News is a prominent message board where users create wide ranging conversations based on article titles.
jhhh 4 hours ago [-]
If you are having to refactor 150 things each time you change your codebase then maybe you need to refactor your test suite first. Direct calls in tests to constructed/mocked objects is usually something you can just stuff into a private method so you only need to change it in one place.

Not quite sure I agree with the conclusion of the tiers of testing section either. If a test suite takes a long time but still covers something useful, then just deleting it because it takes too long makes no sense. Yes, if you have a 'fastTests' profile that doesn't run it that could temporarily convince you your changes are fine when they aren't. But the alternative is just never knowing your change is bad until it breaks production instead of just breaking your CI prior to that point.

strogonoff 3 hours ago [-]
Tests are code. Code has bugs. More complex code has more bugs. The more complex your tests, the more bugs in your tests. Who tests the tests? It’s one thing if you rely on functionality provided by a stable testing framework, but I bet grug no like call stacks in own test code.
jessekv 2 hours ago [-]
> Who tests the tests?

To me it's a bit like double entry bookkeeping. Two layers is valuable, but there's rapidly diminishing returns beyond two.

lenkite 2 hours ago [-]
In my last org, we just separated "flaky" system tests into their own independent suite. They were still valuable to run - just not all the time.
eru 3 hours ago [-]
A simple thing you can set up is to run short tests on every push to a feature branch, but run the long tests only when merging into master.

Basically, you provisionally make the merge commit, run the expensive tests against it, and iff they pass, declare the newly created commit to be the new master.

dcminter 4 hours ago [-]
How about you fix the flakey tests?

The tests I'd delete are the ones that just test that the code is written in a particular way instead of testing the expected behaviour of the code.

Shank 3 hours ago [-]
> How about you fix the flakey tests?

Often times a flakey test is not flakey because it was well-written and something else strange is failing. Often times the test reveals something about the system that is somewhat non-deterministic, but not non-deterministic in a detrimental way. When you have multiple levels of abstraction and parallelization and interdependent behavior, fixing a single test becomes a time consuming process that is difficult to work with (because it's flakey, you can't always replicate the failure).

If a test fails in CI and the traceback is unclear, many people will re-run once and let it continue to flake. Obvious flakes around time and other dependencies are much easier to spot and fix, so they are. It's only the weird ones that lead to pain and regret.

silversmith 4 hours ago [-]
Came here to comment this. Most of the flakey tests are badly written, some warn you about bugs you don't yet understand.

Couple years ago I helped to bring a project back on track. They had a notoriously flakey part of test suite, turned out to be caused by a race condition. And a very puzzling case of occasional data corruption - also, turns out, caused by the same race condition.

XorNot 4 hours ago [-]
This: anything which starts doing stuff like "called API N times" is utterly worthless (looking at you whole bunch of AWS API mock tests...)
avg_dev 7 hours ago [-]
idk, i never thought

> “it is blasphemy to delete a test”,

was ever a thing. i still don't.

if a test is flaky, but it covers something useful, it should be made not flaky somehow. if tests are slow, but they are useful, then they should be optimized so they run faster. if tests cover some bit of functionality that the software no longer needs to provide, the functionality should be deleted from the code and the tests. if updating a small bit of code causes many tests to need to be adjusted, and that's a pain, and it happens frequently, then the tests should be refactored or adjusted.

> Confidence is the point of writing tests.

yes, agreed. but tests are code, too. just maintain the tests with the code in a sensible way. if there is something worth deleting, delete it; there is no gospel that says you can't. but tests provide value just like the author describes in the "fix after revert after fix after revert" counterexample. just remember they're code like anything else is all and treat them accordingly.

rcktmrtn 6 hours ago [-]
> > “it is blasphemy to delete a test”,

> was ever a thing. i still don't.

I experienced this when working at a giant company where all the teams were required to report their "code coverage" metrics to middle management.

We had the flaky test problem too, but I think another angle of it is being shackled to test tech-debt. The "coverage goals" in practice encouraged writing a lot of low quality tests with questionable and complex fixtures (using regular expressions to yoink C++ functions/variables out of their modules and place them into test fixtures).

Fiddling with tests slowed down a lot of things, but there was a general agreement that the whole project needed to be re-architected (it was split up over a zillion different little "libraries" that pretended to be independent, but were actually highly interdependent) and while I was there I always felt like we needed to cut the Gordian knot and accept that it might decrease the sacred code coverage.

Not sure if I was right or what ever happened with that project but it sure was a learning experience.

skybrian 6 hours ago [-]
I think the argument is that sometimes updating a flaky test is not worth the effort, so consider deleting it.
arkis22 5 hours ago [-]
Adding a test is easy. Deleting a test should involve like 3-4 people who all know the codebase.
sitkack 2 hours ago [-]
What a poorly written article. I should delete my tests because they fail randomly. My tests don't fail randomly.
jampa 4 hours ago [-]
I work in an app where bugs are unacceptable due to the nature of the company's reputation. We've been having a lot of success with E2E, but getting there was NOT easy. Some tips:

- False negative results will make your devs hate the tests. People want to get things done and will start ignoring them if you unnecessarily break their workflow. In the CI, you should always retry on failure to avoid flaky false-negative tests.

- E2E Tests can fail suddenly. To avoid breaking people's workflow, we do a megabenchmark every day at 1 AM, and the test runs multiple times - even if it passes - so that we can measure flakiness. If a test fails in the benchmark, we remove it from the CI so we don't break other developers' workflows. The next day, we either fix the test or the bug.

- Claude Code SDK has been a blessing for E2E. Before, you couldn't run all the E2E in the PR's CI due to the time they all take. Now, we can send the branch to the Claude Code SDK to determine what E2E tests should run.

- Also, MCPs and Claude Code now write most of my E2E. I wrote a detailed Claude.md to let it run autonomously -- writing, validating, and repeating -- while I do something else. It does it in 3 to 4 shots. For the price of a cup of coffee, it saves me 30-60 minutes per test.

bubblebeard 4 hours ago [-]
The author has a point. Obsolete tests serve no one, but deleting a test because it will randomly fail is an indication of an unstable process. Maybe there is a race condition, maybe your code has some dependency that is sporadically unavailable. Deleting such tests is just turning a blind eye to the problem. Unstable tests mean you either didn’t write that test very well to begin with, or the process you are testing is itself unstable.
nine_k 1 hours ago [-]
Un-clickbaiting the title: "Delete useless tests".

I once faced a suite of half-broken tests; so many were broken that engineers stopped caring if their changes broke another test. I suggested separating out a subset of still-working, useful tests, keeping them always green, and making them a required check in CI/CD. Ignore the rest of the tests for CI/CD purposes. Gradually fix some of the flaky or out-of-sync tests if they are still useful, and promote them to the evergreen subset. Delete tests that are found to be beyond repair (like the article suggests).

This worked.

egeozcan 1 hours ago [-]
At one of the companies I worked with when I was doing consulting, they could make the slow tests, which used to take around 3 hours, run much faster and in parallel by throwing engineering and hardware resources at the problem. First it was 30 minutes, then it was 10, then around 2-3 minutes.

I think it was one of the best investments that company made.

So my point is, don't delete slow tests, just make them fast.

4ndrewl 2 hours ago [-]
> If your test is creating confidence in broken code with failing tests, it would be better for it to not exist.

The author never considers the other option of fixing the flaky tests. I find this odd.

teiferer 2 hours ago [-]
It all seems like "fix tests" is the better advice.

Flaky test? Fix it! Make it rock solid! Slow test? Fix it! Make it fast! That can be hard (if it was easy, people would have already done it), but it's vastly more useful than deleting.

Even the mentioned overtesting requires a fix by focusing the tests on separate things. You could call that "deleting" but that's oversimplifying what's going on. Same with changed requirements.

simianwords 3 hours ago [-]
A good heuristic for a test is: how many times you are having to fix it when you change real code.

If every small change in the code base causes you to go back and fix the tests then your tests are bad. They should not get in the way so often. There should be a concept of “test maintenance overhead” that is weighted against the number of bugs it catches. You could also think of it as false positives vs true positives.

rotbart 2 hours ago [-]
So... clickbait title for an article that could have been called "Delete flakey tests"... but then most of us would have just gone "yep" and not clicked.
rustystump 2 hours ago [-]
At the end of the day, you need to have some kind of way to know if shit works or doesn't. This article feels a bit contrived to make an edgy point of "delete the test", which seems to miss the real why behind testing.
throwmeaway222 2 hours ago [-]
delete all mocked tests imo

  mock exists
  call exists
  assert exists was called one time
so so so useless so that you can increase your coverage. just move to integration tests
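
Concretely, that pattern looks something like this in Python (a deliberately pointless sketch with unittest.mock):

  from unittest.mock import Mock

  def delete_user(repo, user_id):
      repo.delete(user_id)

  def test_delete_user_calls_repo():
      repo = Mock()
      delete_user(repo, 42)
      # restates the implementation line for line; coverage goes up,
      # confidence doesn't
      repo.delete.assert_called_once_with(42)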
efitz 5 hours ago [-]
I have had a weird thought lately about testing at runtime. My thought is just to log violations of expectations - i.e. log when the test would have failed.

This doesn’t prevent the bug from being introduced but can remove a huge amount of complexity for test cases that are hard to set up.
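
Something like this, maybe (a sketch; expect() is an invented helper, not a library call):

  import logging

  logger = logging.getLogger("expectations")

  def expect(condition, message):
      # the runtime "test": a violated expectation is logged, never raised
      if not condition:
          logger.warning("expectation violated: %s", message)

  def apply_discount(price, discount):
      expect(0 <= discount <= 1, "discount should be a fraction between 0 and 1")
      return price * (1 - discount)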

seer 4 hours ago [-]
I’ve kinda of the opinion that if introducing tests, especially the useful integration tests is hard and complex, then it is a code smell.

Most of the time, especially while I was learning, making your code "more testable" has always involved things that should have been done in the first place, but we were lazy/didn't know better.

Things like reducing dependencies, moving state away from the core and into the shell. Using more formal state machines etc. Once the “painful changes” were done I’ve found that it was actually beneficial in a lot of other contexts.

That given, I’ve kinda almost stopped writing unit tests - with the advent of expressive types everywhere, the job of unit tests has now been shifted to the compiler.

In one TypeScript project where I managed to set this up, the part that kept the state (a database) was statically typed, making sure any data that went in and out was _exactly_ like the compiler expected.

After typing and validating all the other user / non-user inputs into the code, it ended up in a situation where “if the code compiles, it will work” and that was glorious. We had very minimal unit tests - only around actual business logic with state machines, the rest was kinda handled by the compiler and we didn’t feel the need to do it manually.

Apart from that, the integration tests had the philosophy of “don’t specify anything that the user is not seeing” so no button test ids, urls or weird expectation of the underlying code, just an explanation of “the user is on the page with this title, they see a button named this and they press it, expecting they are now in a page titled this”

The concept was taken from the capybara ruby testing library way back in the day, and the tests this produced have been incredibly resilient. Any update that changes the user experience would fail the tests (as they should) and any refactor, up to the level of changing urls or even changing the underlying libraries and frameworks, would be ignored.

runstop 2 hours ago [-]
Sounds a bit like "design by contract", leaving the assertions enabled in production code. It would be great to have solid DbC support in mainstream languages.
mirekrusin 2 hours ago [-]
We add .skip and QAs are taking over in the background to address those issues.
gijoeyguerra 5 hours ago [-]
I've always deleted tests. I've never heard anyone say not to delete tests.
fritzo 5 hours ago [-]
I repeatedly, emphatically tell AI coding assistants not to delete tests.
readthenotes1 4 hours ago [-]
I groaned when a co-worker deleted a test that was pointing out his code was broken.

I didn't tell him not to delete tests. It wouldn't have done any good.

cjfd 2 hours ago [-]
I think the word you were looking for was 'cow orker'.
mattlondon 1 hours ago [-]
Don't delete flaky tests, fix them.
imiric 3 hours ago [-]
I'm a big believer in the utility of tests, and I do think the author has a point. There is a time and place when a test is not useful, and should be deleted.

However...

> If the future bug occurs, fix it and write a new test that doesn’t flake. Today, delete the tests.

How is this different from simply fixing the flaky test today?

Tests are code, and can also incur technical debt. The reason the test is flaky is likely because nobody is willing to take the time to address it properly. Sometimes it requires a refactoring of the SUT to allow making the test more reliable. Sometimes the test itself is too convoluted and difficult to change. All of this is chore work, and is often underappreciated. Nobody got promoted or celebrated for fixing something that is an issue a random percentage of times. After all, how do we know for sure that it's permanently fixed? Only time will tell.

But the flaky test might still deliver confidence and be valuable when it does run successfully. So deleting it would bring more uncertainty. That doesn't seem like a fair tradeoff for removing an annoyance. The better approach would be to deal with the annoyance.

> What if your tests are written so that a one line code change means updating 150 tests?

That might be a sign that the tests are too brittle, and too "whiteboxy". So fix the tests.

That said, there are situations when a change does require updating many tests. These are usually large refactors or major business logic changes. This doesn't mean that the tests are and won't be useful. It's just a side effect of the change. Tests are code, so fix the tests.

I've often heard negativity around unit tests, from programmers who strongly believe that more utility comes from integration tests (the inverted test pyramid, etc.). One of the primary reasons is this belief that unit tests slow you down because they need to be constantly updated. This is a harmful mentality, coming from a place of laziness.

Tests are code, and require maintenance just as well. Unit tests in particular are tightly coupled to the SUT, which makes them require maintenance more frequently. There should also be more unit tests than other types, adding more maintenance burden. But none of these are reasons to not write unit tests, and codebases without them are more difficult to change, and more susceptible to regressions.

> What if your tests take so long to run that you can’t run them all between merges, and you start skipping some?

That is an organizational problem. Label your tests by category (unit, integration, E2E), and provide quick ways to run a subset of them. During development, you can run the quick tests for a sanity check, while the more expensive tests run in CI.
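
With pytest, for example, that can be as simple as markers (one possible setup; the marker name is arbitrary and gets registered in pytest.ini):

  import pytest

  def test_total_is_sum_of_line_items():
      # unmarked tests are the quick ones you run constantly during development
      assert sum([2, 3]) == 5

  @pytest.mark.integration
  def test_order_round_trips_through_the_database():
      # selected or excluded with `pytest -m integration` or `pytest -m "not integration"`
      ...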

There's also the problem of long test suites because the tests are inefficient.

Again: *fix the tests*.

> Even worse, what if your business requirements have changed, and now you have thousands of lines of tests failing because they test the wrong thing?

That is a general maintenance task. Would you say the same because you had to update a library that depended on the previous business logic? Would you simply delete the library because it would take a lot of effort to update it?

No?

Then *fix the tests*. :)

juped 1 hours ago [-]
I think all the listed reasons are good reasons to delete tests. I like to keep the test suite running in a single-digit number of seconds. (Sometimes a test you really need takes a while, and you can skip it by default and enable it on the CI test runner or whatever.)

Another one I really agree with is "What if your tests are written so that a one line code change means updating 150 tests?". If you update a test, basically, ever, it's probably a bad test and is better off not existing than being like that. It's meant to distinguish main code with errors from main code without errors; if it must be updated in tandem with main code, it's just a detector that anything changed, which is counterproductive. Of course you're changing things, that's why you fenced them with tests.

jessekv 2 hours ago [-]
I hate to admit it, but flaky tests almost always highlight weaknesses in my software architecture.

And fixing a flaky test usually involves making the actual code more robust.
