Interesting that every comment has "Help improve Copilot by leaving feedback using the or buttons" suffix, yet none of the comments received any feedback, either positive or negative.
> This seems like it's fixing the symptom rather than the underlying issue?
This is also my experience when you haven't setup a proper system prompt to address this for everything an LLM does. Funniest PRs are the ones that "resolves" test failures by removing/commenting out the test cases, or change the assertions. Googles and Microsofts models seems more likely to do this than OpenAIs and Anthropics models, I wonder if there is some difference in their internal processes that are leaking through here?
The same PR as the quote above continues with 3 more messages before the human seemingly gives up:
> please take a look
> Your new tests aren't being run because the new file wasn't added to the csproj
> Your added tests are failing.
I can't imagine how the people who have to deal with this are feeling. It's like you have a junior developer except they don't even read what you're telling them, and have 0 agency to understand what they're actually doing.
How are people reviewing that? 90% of the page height is taken up by "Check failure", can hardly see the code/diff at all. And as a cherry on top, the unit test has a comment that say "Test expressions mentioned in the issue". This whole thing would be fucking hilarious if I didn't feel so bad for the humans who are on the other side of this.
surgical_fire 2 hours ago [-]
> I can't imagine how the people who have to deal with this are feeling. It's like you have a junior developer except they don't even read what you're telling them, and have 0 agency to understand what they're actually doing.
That comparison is awful. I work with quite a few Junior developers and they can be competent. Certainly don't make the silly mistakes that LLMs do, don't need nearly as much handholding, and tend to learn pretty quickly so I don't have to keep repeating myself.
LLMs are decent code assistants when used with care, and can do a lot of heavy lifting, they certainly speed me up when I have a clear picture of what I want to do, and they are good to bounce off ideas when I am planning for something. That said, I really don't see how it could meaningfully replace an intern however, much less an actual developer.
safety1st 1 hours ago [-]
These GH interactions remind me of one of those offshore software outsourcing firms on Upwork or Freelancer.com that bid $3/hr on every project that gets posted. There's a PM who takes your task and gives it to a "developer" who potentially has never actually written a line of code, but maybe they've built a WordPress site by pointing and clicking in Elementor or something. After dozens of hours billed you will, in fact, get code where the new file wasn't added to the csproj or something like that, and when you point it out, they will bill another 20 hours, and send you a new copy of the project, where the test always fails. It's exactly like this.
Nice to see that Microsoft has automated that, failure will be cheaper now.
dkdbejwi383 50 minutes ago [-]
This gives me flashbacks to when my big corporate former employer outsourced a bunch of work offshore.
An outsourced contractor was tasked with a very simple job as their first task - update a single dependency, which required just a bump of the version and no code changes - after three days of them seemingly struggling to even understand what they were asked to do, inability to clone the repo, failure to install the necessary tooling on their machine, they ended up getting fired from the project. Complete waste of money, and the time of those of us having to delegate and review this work.
AbstractH24 45 minutes ago [-]
> These GH interactions remind me of one of those offshore software outsourcing firms on Upwork or Freelancer.com that bid $3/hr on every project that gets posted
Those have long been the folks I’ve seen at the biggest risk of being replaced by AI. Tasks that didn’t rely on human interaction or much training, just brute force which can be done from anywhere.
And for them, that $3/hr was really good money.
voxic11 48 minutes ago [-]
Actually the AI might still be more expensive at this point. But give it a few years I'm sure they will get the costs down.
kamaal 23 minutes ago [-]
>>These GH interactions remind me of one of those offshore software outsourcing firms on Upwork or Freelancer.com that bid $3/hr on every project that gets posted.
This level of smugness is why outsourcing still continues to exist. The kind of things you talk about were rare. And were mostly exaggerated to create anti-outsourcing narrative. None of that led to outsourcing actually going away simply because people are actually getting good work done.
Bad quality things are cheap != All cheap things are bad.
Same will work with AI too, while people continue to crap on AI, things will only improve, people will be more productive with AI, get more and bigger things done for cheaper and better. This is just inevitable given how things are going now.
>>There's a PM who takes your task and gives it to a "developer" who potentially has never actually written a line of code, but maybe they've built a WordPress site by pointing and clicking in Elementor or something.
In the peak of outsourcing wave. Both the call center people and IT services people had internal training and graduation standards that were quite brutal and mad attrition rates.
Exams often went along the lines of having to write whole ass projects without internet help in hours. Theory exams that had like -2 marks on getting things wrong. Dozens of exams, projects, coding exams, on-floor internships, project interviews.
>>After dozens of hours billed you will, in fact, get code where the new file wasn't added to the csproj or something like that, and when you point it out, they will bill another 20 hours, and send you a new copy of the project, where the test always fails. It's exactly like this.
Most IT services billing had pivoted away from hourly billing, to fixed time and material in the 2000s itself.
>>It's exactly like this.
Very much like outsourcing. AI is here to stay man. Deal with it. Its not going anywhere. For like $20 a month, companies will have same capability as a full time junior dev.
This is NOT going away. Its here to stay. And will only get better with time.
whatshisface 16 minutes ago [-]
There's no reason why an outsourcing firm would charge less for work of equal quality. If a company outsourced to save money, they'd get one of the shops that didn't get the job done.
kamaal 8 minutes ago [-]
>>There's no reason why an outsourcing firm would charge less for work of equal quality.
Most of this works because of price arbitrage. And continues to work that way, not just with outsourcing but with manufacturing too.
Remember those days, when people were going around telling Chinese products where crap? That didn't really work and more things only got made in China.
This is all so similar to early days of Google search, its just that cost of a search was low enough that finding things got easier and ubiquitous. That same is unfolding with AI now. People have a hard time believing a big part of their thinking can be outsourced to something that costs $20/month.
How can something as good as me be cheaper than me? You are asking the wrong question. For centuries now, every decade a machine(s) has arrived that can do a thing cheaper than what the human was doing at the time. Its not exactly impossible. You are only living in denial by asking this question, this has been how it has worked the day since humans found way of mimicking human work through machines. We didn't get here in a day.
dttze 3 minutes ago [-]
It’s not 20, it’s 200+. And that will only get more expensive.
sbarre 1 hours ago [-]
I think that was the point of the comparison..
It's not like a regular junior developer, it's much worse.
preisschild 1 hours ago [-]
> That said, I really don't see how it could meaningfully replace an intern however
And even if it could, how do you get senior devs without junior devs? ^^
lazide 58 minutes ago [-]
Sounds like a next quarter problem (I wish it was /s).
yubblegum 1 hours ago [-]
This field (SE - when I started out back in late 80s) was enjoyable. Now it has become toxic, from the interview process, to imitating "big tech" songs and dances by small fry companies, and now this. Is there any joy left in being a professional software developer?
coldpie 4 minutes ago [-]
I've been looking at getting a CDL and becoming a city bus driver, or maybe a USPS driver or deliveryman or clerk or something.
bluefirebrand 1 hours ago [-]
Making quite a bit of money brings me a lot of joy compared to other industries
But the actual software part? I'm not sure anymore
diggan 1 hours ago [-]
> This field (SE - when I started out back in late 80s) was enjoyable. Now it has become toxic
I feel the same way today, but I got started around 2012 professionally. I wonder how much of this is just our fading optimism after seeing how shit really works behind the scenes, and how much the industry itself is responsible for it. I know we're not the only two people feeling this way either, but it seems all of us have different timescales from when it turned from "enjoyable" to "get me out of here".
salawat 36 minutes ago [-]
My issue stems from the attitudes of the people we're doing it for. I started out doing it for humanity. To bring the bicycle for the mind to everyone.
Then one day I woke up and realized the ones paying me were also the ones using it to run over or do circles around everyone else not equipped with a bicycle yet; and were colluding to make crippled bicycles that'd never liberate the masses as much as they themselves had been previously liberated; bicycles designed to monitor, or to undermine their owner, or more disgustingly, their "licensee".
So I'm not doing it anymore. I'm not going to continue making deliberately crippled, overly complex, legally encumbered bicycles for the mind, purely intended as subjects for ARR extraction.
sweman 1 hours ago [-]
>> Is there any joy left in being a professional software developer?
Yes, when your 100k quarterly RSU drop lands
camdenreslink 42 minutes ago [-]
A very very small percentage of professional software developers get that.
mrweasel 1 hours ago [-]
At least we can tell the junior developers to not submit a pull-request before they have the tests running locally.
At what point does the human developers just give up and close the PRs as "AI garbage". Keep the ones that works, then just junk the rest. I feel that at some point entertaining the machine becomes unbearable and people just stops doing it or rage close the PRs.
pydry 1 hours ago [-]
When their performance reviews stop depending upon them not doing that.
Microsoft's stock price is dependent on them proving that this is a success.
mrweasel 40 minutes ago [-]
What happens when they can't prove that and development efficiency starts falling, because developers spend 50% of their time battling copilot?
blibble 8 minutes ago [-]
they'll just add more and more half-baked features
it's not as if Microsoft's share price have ever reflected the quality of their products
vasco 2 hours ago [-]
> improve Copilot by leaving feedback using the or buttons" suffix, yet none of the comments received any feedback, either positive or negative
Why do they even need it? Success is code getting merged 1st shot, failure gets worse the more requests for changes the agent gets. Asking for manual feedback seems like a waste of time. Measure cycle time and rate of approvals and change failure rate like you would for any developer.
xnorswap 2 hours ago [-]
> How are people reviewing that? 90% of the page height is taken up by "Check failure",
Typically, you wouldn't bother manually reviewing something until the automated checks have passed.
diggan 2 hours ago [-]
I dunno, when I review code, I don't review what's automatically checked anyways, but thinking about the change/diff in a broader context, and whatever isn't automatically checked. And the earlier you can steer people in the right direction, the better. But maybe this isn't the typical workflow.
Cthulhu_ 2 hours ago [-]
It's a waste of time tbh; fixing the checks may require the author to rethink or rewrite their entire solution, which means your review no longer applies.
Let them finish a pull request before spending time reviewing it. That said, a merge request needs to have an issue written before it's picked up, so that the author does not spend time on a solution before the problem is understood. That's idealism though.
xnorswap 2 hours ago [-]
The reality is more nuanced, there are situations where you'd want to glance over it anyway, such as looking for an opportunity to coach a junior dev.
I'd rather hop in and get them on the right path rather than letting them struggle alone, particularly if they're struggling.
If it's another senior developer though I'd happily leave them to it to get the unit tests all passing before I take a proper look at their work.
But as a general principle, please at least get a PR through formatting checks before assigning it to a person.
prossercj 33 minutes ago [-]
This comment on that PR is pure gold. The bots are talking to each other:
I agree that not auto-collapsing repeated annotations is an annoying bug in the github interface.
But just pointing out that annotations can be hidden in the ... menu to the right (which I just learned).
jon-wood 2 hours ago [-]
I'm not entirely sure why they're running linters on every available platform to begin with, it seems like a massive waste of compute to me when surely the output will be identical because it's analysing source code, not behaviour.
codyvoda 2 hours ago [-]
or press “a”
ta1243 1 hours ago [-]
> @copilot please read the following Contributor License Agreement(CLA). If you agree with the CLA, please reply with the following information.
marmakoide 56 minutes ago [-]
Hot take : the whole LLM craze is fed by a delusion. LLM are good at mimicking human language, capturing some semantics on the way. With a large enough training set, the amount of semantic captured covers a large fraction of what the average human knows. This gives the illusion of intelligence, and the humans extrapolates on LLM capabilities, like actual coding. Because large amounts of code from textbooks and what not is on the training set, the illusion is convincing for people with shallow coding abilities.
And then, while the tech is not mature, running on delusion and sunken costs, it's actually used for production stuffs. Butlerian Jihad when
spacecadet 2 hours ago [-]
"I wonder if there is some difference in their internal processes that are leaking through here?"
Maybe, but likely it is reality and their true company culture leaking through. Eventually some higher eq execs might come to the very late realization that they cant actually lead or build a worthwhile and productive company culture and all that remains is an insane reflection of that.
esafak 2 minutes ago [-]
I speculate what is going on is that the agent's context retrieval algorithm is bad, so it does not give the LLM enough context, because today's models should suffice to get the job done.
A comment on the first pull request provides some context:
> The stream of PRs is coming from requests from the maintainers of the repo. We're experimenting to understand the limits of what the tools can do today and preparing for what they'll be able to do tomorrow. Anything that gets merged is the responsibility of the maintainers, as is the case for any PR submitted by anyone to this open source and welcoming repo. Nothing gets merged without it meeting all the same quality bars and with us signing up for all the same maintenance requirements.
abxyz 51 minutes ago [-]
The author of that comment, an employee of Microsoft, goes on to say:
> It is my opinion that anyone not at least thinking about benefiting from such tools will be left behind.
The read here is: Microsoft is so abuzz with excitement/panic about AI taking all software engineering jobs that Microsoft employees are jumping on board with Microsoft's AI push out of a fear of "being left behind". That's not the confidence inspiring the statement they intended it to be, it's the opposite, it underscores that this isn't the .net team "experimenting to understand the limits of what the tools" but rather the .net team trying to keep their jobs.
hnthrow90348765 32 minutes ago [-]
TBF they are dogfooding this (good) but it's just not going well
username135 32 minutes ago [-]
i dont think hey are mutually exclusive. jumping on board seems like the smart move if you're worried about losing your career. you also get to confirm your suspicions.
lcnPylGDnU4H9OF 50 minutes ago [-]
This is important context given that it would be absurd for the managers to have already drawn a definitive conclusion about the models’ capabilities. An explicit understanding that the purpose of the exercise is to get a better idea of the current strengths and weaknesses of the models in a “real world” context makes this actually very reasonable.
bramhaag 2 hours ago [-]
Seeing Microsoft employees argue with an LLM for hours instead of actually just fixing the problem must be a very encouraging sight for businesses that have built their products on top of .NET.
mikrl 39 minutes ago [-]
I remember before mass LLM adoption, reading an issue on GitHub where an increasingly frustrated user was failing to properly describe a blocking issue, and the increasingly frustrated maintainer was failing to get them to stick to the issue template.
Now you don’t even need the frustrated end user!
shultays 9 minutes ago [-]
one day both sides will be AI so we can all relax and enjoy our mojitos
nashashmi 2 hours ago [-]
I sometimes feel like that is the right outcome for bad management and bad instructions. Only this time they can’t blame the junior engineer and are left to only blame themselves.
qoez 1 hours ago [-]
They'll probably blame openai/the AI instead.
mikrl 35 minutes ago [-]
“Why is it not doing what I’m telling it to?!”
- Neophyte developers, since forever
Now you can mediate the mismatch in requirements through thousands / millions of matrix multiplications!
nashashmi 1 hours ago [-]
AI has reproducible outcomes. If someone else can make it work, then they should too.
svick 1 hours ago [-]
You don't want them to experiment with new tools? The main difference now is that the experiment is public.
stickfigure 2 minutes ago [-]
It's pretty obviously a failed experiment. Why keep repeating it? Try again in another 3 months.
The answer is probably that the Copilot team is using the rest of the engineering organization as testers. Great for the Copilot team, frustrating for everyone else.
gmm1990 53 minutes ago [-]
I wouldn't necessarily call that just an experiment if the same requests aren't being fixed without copilot and the ai changes could get merged.
I would say the copilot system isn't really there yet for these kinds of changes, you don't have to run experiments on a language framework to figure that out.
ozim 1 hours ago [-]
That is why they just fired 7k people so they don’t argue with LLM but let it do the work /s
rsynnott 2 hours ago [-]
Beyond every other absurdity here, well, maybe Microsoft is different, but I would never assign a PR that was _failing CI_ to somebody. That that's happening feels like an admission that the thing doesn't _really_ work at all; if it worked even slightly, it would at least only assign passing PRs, but presumably it's bad enough that if they put in that requirement there would be no PRs.
sbarre 1 hours ago [-]
I feel like everyone is applying a worse-case narrative to what's going on here..
I see this as a work in progress.. I am almost certain the humans in the loop on these PRs are well aware of what's going on and have their expectations in check, and this isn't just "business as usual" like any other PR or work assignment.
This is a test. You can't improve a system without testing it on real world conditions.
How do we know they're not tweaking the Copilot system prompts and settings behind the scenes while they're doing this work?
Can no one see the possibility that what is happening in those PRs is exactly what all the people involved expected to have happen, and they're just going through the process of seeing what happens when you try to refine and coach the system to either success or failure?
When we adopted AI coding assist tools internally over a year ago we did almost exactly this (not directly in GitHub though).
We asked a bunch of senior engineers to see how far they could get by coaching the AI to write code rather than writing it themselves. We wanted to calibrate our expectations and better understand the limits, strengths and weaknesses of these new tools we wanted to adopt.
In most of those early cases we ended up with worse code than if it had been written by humans, but we learned a ton. We can also clearly see how much better things have gotten over time, since we have that benchmark to look back on.
rco8786 47 minutes ago [-]
I think people would be more likely to adopt this view if the overall narrative about AI is that it’s a work in progress and we expect it to get magnitudes better. But the narrative is that AI is already replacing human software engineers.
codyvoda 44 minutes ago [-]
from who? and why do you listen to them?
think for yourself
rco8786 43 minutes ago [-]
That's a weird comment. I do think for myself. I wasn't even talking about my own personal thoughts on the matter. I can just plainly see that the overwhelming narrative in the public zeitgeist is that AI can do jobs that humans can do. And it's not true.
codyvoda 34 minutes ago [-]
why does every engineer keep talking about it like it’s more than marketing hype? why do you actually accept this is a real narrative real people believe? have you talked to the executives implementing these strategies?
redbull does not give you wings. it’s disconcerting to see the lack of nuance in these discussions around these new tools (and yeah sorry this isn’t really aimed at you, but the zeitgeist, apologies)
skwee357 21 minutes ago [-]
Because this “marketing hype” is affecting the way we do our job.
Some of us are being laid off due to the hype; some are assigned to babysit the AI; and some are simply looked down on by higher ups who are eagerly waiting for a day to lay us all off.
You can convince yourself as much as you want that it’s “just a hype”, but regardless of your beliefs are, it has REAL world consequences.
mieubrisse 50 minutes ago [-]
I was looking for exactly this comment. Everybody's gloating, "Wow look how dumb AI is! Haha, schadenfreude!" but this seems like just a natural part of the evolution process to me.
It's going to look stupid... until the point it doesn't. And my money's on, "This will eventually be a solved problem."
grewsome 1 minutes ago [-]
Sometimes the last 10% takes 90% of the time. It'll be interesting to see how this pans out, and whether it will eventually get to something that could be considered a solved problem.
I'm not so sure they'll get there. If the solved problem is defined as a sub-standard but low cost, then I wouldn't bet against that. A solution better than that though, I don't think I'd put my money on that.
roxolotl 27 minutes ago [-]
The question though is what is the time horizon of “eventually”. Very different decisions should be made if it’s 1 year, 2 years, 4 years, 8 years etc. To me it seems as if everyone is making decisions which are only reasonable if the time horizon is 1 year. Maybe they are correct and we’re on the cusp. Maybe they aren’t.
Good decision making would weigh the odds of 1 vs 8 vs 16 years. This isn’t good decision making.
rsynnott 20 minutes ago [-]
Or _never_, honestly. Sometimes things just don't work out. See various 3d optical memory techs, which were constantly about to take over the world but never _quite_ made it to being actually useful, say.
ecb_penguin 6 minutes ago [-]
> This isn’t good decision making.
Why is doing a public test of an emerging technology not good decision making?
> Good decision making would weigh the odds of 1 vs 8 vs 16 years.
What makes you think this isn't being done?
balazstorok 2 hours ago [-]
At least opening PRs is a safe option, you can just dump the whole thing if it doesn't turn out to be useful.
Also, trying something new out will most likely have hiccups. Ultimately it may fail. But that doesn't mean it's not worth the effort.
The thing may rapidly evolve if it's being hard-tested on actual code and actual issues. For example it will be probably changed so that it will iterate until tests are actually running (and maybe some static checking can help it, like not deleting tests).
Waiting to see what happens. I expect it will find its niche in development and become actually useful, taking off menial tasks from developers.
6uhrmittag 14 minutes ago [-]
> At least opening PRs is a safe option, you can just dump the whole thing if it doesn't turn out to be useful.
However, every PR adds load and complexity to community projects.
As another commenter suggested, doing these kind of experiments on separate forks sound a bit less intrusive.
Could be a take away from this experiment and set a good example.
There are many cool projects on GitHub that are just accumulating PRs for years, until the maintainer ultimately gives up and someone forks it and cherry-picks the working PRs. I've than that myself.
I'm super worried that we'll end up with more and more of these projects and abandoned forks :/
Frost1x 1 hours ago [-]
It might be a safer option in a forked version of the project that the public can’t see. I have to wonder about the optics here from a sales perspective. You’d think they’d test this out more internally before putting it in public access.
Now when your small or medium size business management reads about CoPilot in some Executive Quarterly magazine and floats that brilliant idea internally, someone can quite literally point to these as examples of real world examples and let people analyze and pass it up the management chain. Maybe that wasn’t thought through all the way.
Usually businesses tend to hide this sort of performance of their applications to the best of their abilities, only showcasing nearly flawless functionality.
cesarb 1 hours ago [-]
> At least opening PRs is a safe option, you can just dump the whole thing if it doesn't turn out to be useful.
There's however a border zone which is "worse than failure": when it looks good enough that the PRs can be accepted, but contain subtle issues which will bite you later.
ecb_penguin 5 minutes ago [-]
Funny enough, this happens literally every day with millions of developers. There will be thousands upon thousands of incidents in the next hour because a PR looked good, but contained a subtle issue.
UncleMeat 1 hours ago [-]
Yep. I've been on teams that have good code review culture and carefully review things so they'd be able to catch subtle issues. But I've also been on teams where reviews are basically "tests pass, approved" with no other examination. Those teams are 100% going to let garbage changes in.
xnickb 1 hours ago [-]
> I expect it will find its niche in development and become actually useful, taking off menial tasks from developers.
Reading AI generated code is arguably far more annoying than any menial task. Especially if the said code happens to have subtle errors.
Speaking from experience.
ecb_penguin 3 minutes ago [-]
This is true for all code and has nothing to do with AI. Reading code has always been harder than writing code.
The joke is that PERL was a write-once, read-none language.
> Speaking from experience.
My experience is all code can have subtle errors, and I wouldn't treat any PR differently.
cyanydeez 1 hours ago [-]
Unfortunately,if you believe LLMs really can learn to code with bugs, then the nezt step would be to curate a sufficiently bug free data set. Theres no evidence this has occured, rather, they just scraped whayecer
globalise83 2 hours ago [-]
Malicious compliance should be the order of the day. Just approve the requests without reviewing them and wait until management blinks when Microsoft's entire tech stack is on fire. Then quit your job and become a troubleshooter on x3 the pay.
sbarre 1 hours ago [-]
I know this is meant to sound witty or clever, but who actually wants to behave this way at their job?
I'll never understand the antagonistic "us vs. them" mentality people have with their employer's leadership, or people who think that you should be actively sabotaging things or be "maliciously compliant" when things aren't perfect or you don't agree with some decision that was made.
To each their own I guess, but I wouldn't be able to sleep well at night.
HelloMcFly 46 minutes ago [-]
It’s worth recognizing that the tension between labor and capital historical reality, not just a modern-day bad attitude. Workers and leadership don’t automatically share goals, especially when senior management incentives often prioritize reducing labor costs which they always do now (and no, this wasn't always universally so).
Most employees want to do good work, but pretending there’s no structural divergence in interests flattens decades of labor history and ignores the power dynamics baked into modern orgs. It’s not about being antagonistic, it’s about being clear-eyed where there are differences between the motivations of your org. leadership and your personal best interests. After a few levels remove from your position, you're just headcount with loaded cost.
early_exit 57 minutes ago [-]
To be fair, "them" are actively working to replace "us" with AI.
nope1000 60 minutes ago [-]
On the other hand: why should you accept that your employer is trying to fire you but first wants you to train the machine that will replace you? For me this is the most "them vs us" it can be.
mhuffman 19 minutes ago [-]
>I'll never understand the antagonistic "us vs. them" mentality people have with their employer's leadership
Interesting because "them" very much have an antagonistic mentality vs "us". "Them" would fire you in a fucking heartbeat to save a relatively small amount (10%). "Them" also want to aggressively pay you the least amount for which they can get you to do work for them, not what they "value" you at. "Us" depends on "them" for our livelihoods and the lives of people that depend on us, but "them" doesn't doesn't have any dependency on you that can't be swapped out rather quickly.
I am a capitalist, don't get me wrong, but it is a very one-sided relationship not even-footed or rooted in two-way respect. You describe "them" as "leadership" while "Them" describe you as a "human resource" roughly equivalent to the way toilet paper and plastics for widgets are described.
If you have found a place to work where people respect you as a person, you should really cherish that job, because most are not that way.
Frost1x 58 minutes ago [-]
I suppose that depends on your relationship with your employer. If your goals are highly aligned (e.g. lots of equity based compensation, some degree of stability and security, interest in your role, healthy management practices that value their workforce, etc.) then I agree, it’s in your own self interest to push back because it can effect you directly.
Meanwhile a lot of folks have very unhealthy to non-existent relationships with their employers. There may be some mixture where they may be temporary hired/viewed as highly disposable or transient in nature having very little to gain from the success of the business, they may be compensated regardless of success/failure, they may have toxic management who treat them terribly (condescendingly, constantly critical, rarely positive, etc.). Bad and non-existent relationships lead to this sort of behavior. In general we’re moving towards “non-existent” relationships with employers broadly speaking for the labor force.
The counter argument is often floated here “well why work there” and the fact is money is necessary to survive, the number of positions available hiring at any given point is finite, and many almost by definition won’t ever be the top performers in their field to the point they truly choose their employers and career paths with full autonomy. So lots of people end up in lots of places that are toxic or highly misaligned with their interests as a survival mechanism. As such, watching the toxic places shoot themselves in the foot can be some level of justice people find where generally unpleasant people finally get to see consequences of their actions and take some responsibility.
People will prop others up from their own consequences so long as there’s something in it for them. As you peel that away, at some point there’s a level of poetic justice to watch the situation burn. This is why I’m not convinced having completely transactional relationships with employers is a good thing. Even having self interest and stability in mind, certain levels of toxicity in business management can fester. At some point no amount of money is worth dealing with that and some form of correction is needed there. The only mechanism is to typically assure poor decision making and action is actually held accountable.
Xori71 1 hours ago [-]
I agree. It doesn’t help that once things start breaking down, the employer will ask the employees to fix the issue themselves, and thus they’ll have to deal with so much broken code that they’ll be miserable. It’ll become a spiral.
anonymousab 32 minutes ago [-]
When the issues arise because of the tool being trained explicitly to respect/fire you, then that sounds like an apt and appropriate resulting level of job security.
whywhywhywhy 44 minutes ago [-]
> but who actually wants to behave this way at their job?
Almost no one does but people get ground down and then do it to cope.
Hamuko 1 hours ago [-]
Considering that there's daily employee protests against Microsoft now, probably a lot of Microsoft employees want to behave like that.
mieubrisse 47 minutes ago [-]
Exactly this. I suspect that "us vs them" is sweet poison: it feels good in the moment ("Yeah, stick it to The Man!") but it long-term keeps you trapped in a victim mindset.
tantalor 2 hours ago [-]
> when Microsoft's entire tech stack is on fire
Too late?
MonkeyClub 2 hours ago [-]
Just in time for marshmallows!
weird-eye-issue 49 minutes ago [-]
That's cute, but the maintainers themselves submitted the requests with Copilot.
xyst 1 hours ago [-]
At some point code pilot will just delete the whole codebase. Can’t fail integration tests if there is no code :)
hello_computer 2 hours ago [-]
Might as well when they’re going to lay you off no matter what you do (like the guy who made an awesome TypeScript compiler in Go).
petetnt 2 hours ago [-]
GitHub has spent billions of dollars building an AI that struggles with things like whitespace related linting errors on one of the most mature repositories available. This would be probably okay for a hobbyist experiment, but they are selling this as a groundbreaking product that costs real money.
That's funny, but also interesting that it didn't "sign" it. I would naively have expected that being handed a clear instruction like "reply with the following information" would strongly bias the LLM to reply as requested. I wonder if they've special cased that kind of thing in the prompt; or perhaps my intuition is just wrong here?
Bedon292 51 minutes ago [-]
A comment on one of the threads, when a random person tried to have copilot change something, said that copilot will not respond to anyone without write access to the repo. I would assume that bot doesn't have write access, so copilot just ignores them.
Quarrel 1 hours ago [-]
AI can't, as I understand it, have copyright over anything they do.
Nor can it be an entity to sign anything.
I assume the "not-copyrightable" issue, doesn't in anyway interfere with the rights trying to be protected by the CLA, but IANAL ..
I assume they've explicitly told it not to sign things (perhaps, because they don't want a sniff of their bot agreeing to things on behalf of MSFT).
candiddevmike 1 hours ago [-]
Are LLM contributions effectively under public domain?
ben-schaaf 17 minutes ago [-]
IANAL. It's my understanding that this hasn't been determined yet. It could be under public domain, under the rights of everyone whose creations were used to train the AI or anywhere in-between.
We do know that LLMs will happily reproduce something from their training set and that is a clear copyright violation. So it can't be that everything they produce is public domain.
90s_dev 2 hours ago [-]
Well?? Did it sign it???
jsheard 2 hours ago [-]
Not sure if a chatbot can legally sign a contract, we'd better ask ChatGPT for a second opinion.
b0ner_t0ner 2 minutes ago [-]
Just need the chatbot to connect to an MCP to call my robotic arm to sign it.
I have no idea how this will ultimately shake out legally, but it would be absolutely wild for Microsoft to not have thought about this potential legal issue.
I would imagine it can't sign it, especially with the options given.
>I have sole ownership of intellectual property rights to my Submissions
I would assume that the AI cannot have IP ownership considering that an AI cannot have copyright in the US.
>I am making Submissions in the course of work for my employer (or my employer has intellectual property rights in my Submissions by contract or applicable law). I have permission from my employer to make Submissions and enter into this Agreement on behalf of my employer.
Surely an AI would not be classified as an employee and therefore would not have an employer. Has Microsoft drafted an employment contract with Copilot? And if we consider an AI agent to be an employee, is it protected by the Fair Labor Standards Act? Is it getting paid at least minimum wage?
thallium205 22 minutes ago [-]
Is this the first instance of an AI cyber bullying another AI?
nikolayasdf123 1 hours ago [-]
that's the future, AI talking to other AI, everywhere, all the time
Quarrelsome 2 hours ago [-]
rah, we might be in trouble here. The primary issue at play is that we don't have a reliable means of measuring developer performance, outside of subjective judgement like end of year reviews.
This means its probably quite hard to measure the gain or the drag of using these agents. On one side, its a lot cheaper than a junior, but on the other side it pulls time from seniors and doesn't necessarily follow instruction well (i.e. "errr your new tests are failing").
This combined with the "cult of the CEO" sets the stage for organisational dissonance where developer complaints can be dismissed as "not wanting to be replaced" and the benefits can be overstated. There will be ways of measuring this, to project it as huge net benefit (which the cult of the CEO will leap upon) and there will be ways of measuring this to project it as a net loss (rabble rousing developers). All because there is no industry standard measure accepted by both parts of the org that can be pointed at which yields the actual truth (whatever that may be).
If I might add absurd conjecture: We might see interesting knock-on effects like orgs demanding a lowering of review standards in order to get more AI PRs into the source.
margorczynski 2 hours ago [-]
With how stochastic the process is it makes it basically unusable for any large scale task. What's the plan? To roll the dice until the answer pops up? That would be maybe viable if there was a way to automatically evaluate it 100% but with a human in the loop required it becomes untenable.
diggan 2 hours ago [-]
> What's the plan?
Call me old school, but I find the workflow of "divide and conquer" to be as helpful when working with LLMs, as without them. Although what is needed to be considered a "large scale task" varies by LLMs and implementation. Some models/implementations (seemingly Copilot) struggles with even the smallest change, while others breeze through them. Lots of trial and error is needed to find that line for each model/implementation :/
mjburgess 2 hours ago [-]
The relevant scale is the number of hard constraints on the solution code, not the size of task as measured by "hours it would take the median programmer to write".
So eg., one line of code which needed to handle dozens of hard-constraints on the system (eg., using a specific class, method, with a specific device, specific memory management, etc.) will very rarely be output correctly by an LLM.
Likewise "blank-page, vibe coding" can be very fast if "make me X" has only functional/soft-constraints on the code itself.
"Gigawatt LLMs" have brute-forced there way to having a statistical system capable of usefully, if not universally, adhreading to one or two hard constraints. I'd imagine the dozen or so common in any existing application is well beyond a Terawatt range of training and inference cost.
cyanydeez 1 hours ago [-]
Keep in mind that the model of using LLM assumes the underlying dataset converges to production ready code. Thats never been proven, cause we know they scraped sourcs code without attribution.
safety1st 60 minutes ago [-]
I mean I guess this isn't very ambitious, but it's a meaningful time saver if I basically just write code in natural language, and then Copilot generates the real code based on that. I don't have to look up syntax details, or what some function somewhere was named, etc. It will perform very accurately this way. It probably makes me 20% more efficient. It doubles my efficiency in a language I'm unfamiliar with.
I can't fire half my dev org tomorrow with that approach, I can't really fire anyone, so I guess it would be a big letdown for a lot of execs. Meanwhile though we just keep incrementally shipping more stuff faster at higher quality so I'm happy...
This works because it treats the LLM like what it actually is: an exceptionally good if slightly random text transformer.
nonethewiser 2 hours ago [-]
Its hard for me to think of a small, clearly defined coding problem an LLM cant solve.
jodrellblank 1 hours ago [-]
"Find a counter example to the Collatz conjecture".
rsynnott 1 hours ago [-]
I suspect that the plan is that MS has spent a lot, really a LOT, of money on this nonsense, and there is now significant pressure to put, something, anything, out even if it is worse than useless.
eterevsky 2 hours ago [-]
The plan is to improve AI agents from their current ~intern level to a level of a good engineer.
ehnto 1 hours ago [-]
They are not intern level.
Even if it could perform at a similar level to an intern at a programming task, it lacks a great deal of the other attributes that a human brings to the table, including how they integrate into a team of other agents (human or otherwise). I won't bother listing them, as we are all humans.
I think the hype is missing the forest for the trees, and I think exactly this multi-agent dynamic might be where the trees start to fall down in front of us. That and the as currently insurmountable issues of context and coherence over long time horizons.
Tade0 44 minutes ago [-]
My impression is that Copilot acts a lot like one of my former coworkers, who struggled with:
-Being a parent to a small child and the associated sleep deprivation.
-His reluctance to read documentation.
-There being a language barrier between him the project owners. Emphasis here, as the LLM acts like someone who speaks through a particularly good translation service, but otherwise doesn't understand the language spoken.
ethanol-brain 2 hours ago [-]
Seems like that is taking a very long time, on top of some very grandiose promises being delivered today.
infecto 2 hours ago [-]
I look back over the past 2-3 years and am pretty amazed with how quick change and progress have been made. The promises are indeed large but the speed of progress has been fast. Not defending the promise but “taking a very long time” does not seem to be an accurate representation.
zeroonetwothree 1 hours ago [-]
I feel like we've made barely any progress. It's still good at the things Chat GPT was originally good at, and bad at the things it was bad at. There's some small incremental refinement but it doesn't really represent a qualitative jump like Chat GPT was originally. I don't see AI replacing actual humans without another step jump like that.
ethanol-brain 2 hours ago [-]
I guess it probably depends on what you are doing. Outside of layers on top of these things (tooling), I personally haven't seen much progress.
infecto 1 hours ago [-]
What a time we live in. I guess it depends how pessimistic you are.
lcnPylGDnU4H9OF 1 hours ago [-]
To their point, there hasn’t been any huge breakthrough in this field since the “attention is all you need” paper. Not really any major improvements to model architecture, as far as I am aware. (Admittedly, this is a new field of study to me.) I believe one hope is to develop better methods for self-supervised learning; I am not sure of the progress there. Most practical improvements have been on the hardware and tooling side (GPUs and, e.g., pytorch).
Don’t get me wrong: the current models are already powerful and useful. However, there is still a lot of reason to remain skeptical of an imminent explosion in intelligence from these models.
infecto 54 minutes ago [-]
You’re totally right that there hasn’t been a fundamental architectural leap like “attention is all you need”, that was a generational shift. But I’d argue that what we’ve seen since is a compounding of scale, optimization, and integration that’s changed the practical capabilities quite dramatically, even if it doesn’t look flashy in an academic sense. The models are qualitatively different at the frontier, more steerable, more multimodal, and increasingly able to reason across context. It might not feel like a revolution on paper, but the impact in real-world workflows is adding up quickly. Perhaps all of that can be put in the bucket of “tooling” but from my perspective there has still been quite large leaps looking at cost differences alone.
For some reason my pessimism meter goes off when I see single sentence arguments “change has been slow”. Thanks for brining the conversation back.
skydhash 34 minutes ago [-]
I'm all for flashy in academic sense, because we can let engineers sort out the practical aspects, especially by combining flashy academic approach. The flaw from LLM architecture can be predicted from the original paper, no amount of engineering can compensate that.
cyanydeez 1 hours ago [-]
Probably depends on how immature your knowledge of tge subject matter is.
ethanol-brain 11 minutes ago [-]
Feel free to share resources, but I am speaking purely in terms of practicality related to my day to day.
owebmaster 2 hours ago [-]
> The promises are indeed large but the speed of progress has been fast
And at the same time, absurdly slow? ChatGPT is almost 3 years old and pretty much AI has still no positive economic impact.
infecto 1 hours ago [-]
Saying “AI has no economic impact” ignores reality. The financials of major players clearly show otherwise—both B2C and B2B applications are already profitable and proven. While APIs are still more experimental, and it’s unclear how much value businesses can ultimately extract from them, to claim there’s no economic impact is willful blindness. AGI may be far off, but companies are already figuring out value from both the consumer side and slowly API.
ehnto 1 hours ago [-]
The financials are all inflated by perception of future impact. This includes the current subscriptions as businesses are attempting to use AI to some economic benefit, but it's not all going to work out to be useful.
It will take some time for whatever reality is to actually show truthfully in the financials. When VC money stops subsidising datacentre costs, and businesses have to weigh the full price against real value provided, that is when we will see the reality of the situation.
I am content to be wrong either way, but my personal prediction is if model competence slows down around now, businesses will not be replacing humans en-mass, and the value provided will be notable but not world changing like expected.
derektank 1 hours ago [-]
OpenAI alone is on track to generate as much revenue as Asus or US Steel this year ($10-$15 billion). I don't know how you can say AI has had no positive economic impact.
owebmaster 55 minutes ago [-]
That is not even 1 month of a big tech revenue, it is a global negligible impact. 3 years talking about AI changing the world, 10bi revenue and no ecosystem around making money besides friends and VCs pumping and dumping LLM wrappers.
bakugo 1 hours ago [-]
> I look back over the past 2-3 years and am pretty amazed with how quick change and progress have been made.
Now look at the past year specifically, and only at the models themselves, and you'll quickly realize that there's been very little real progress recently. Claude 3.5 Sonnet was released 11 months ago and the current SOTA models are only marginally better in terms of pure performance in real world tasks.
The tooling around them has clearly improved a lot, and neat tricks such as reasoning have been introduced to help models tackle more complex problems, but the underlying transformer architecture is already being pushed to its limits and it shows.
Unless some new revolutionary architecture shows up out of nowhere and sets a new standard, I firmly believe that we'll be stuck at the current junior level for a while, regardless of how much Altman & co. insist that AGI is just two more weeks away.
DrillShopper 2 hours ago [-]
Third AI Winter from overpromise/underdeliver when?
rsynnott 1 hours ago [-]
Third? It’ll be the tenth or so.
mnky9800n 2 hours ago [-]
Yes but they are supposed to be PhD level 5 years ago if you are listening to sama et al.
interimlojd 1 hours ago [-]
You are really underselling interns. They learn from a single correction, sometimes even without a correction, all by themselves. Their ability to integrate previous experience in the context of new problems is far, far above what I've ever seen in LLMs
marmakoide 1 hours ago [-]
The plan went from the AI being a force multiplier, to a resource hungry beast that have to be fed in the hope it's good enough to justify its hunger.
rsynnott 2 hours ago [-]
I mean, I think this is a _lot_ worse than an intern. An intern isn't constantly going to make PRs with failing CI, for a start.
serial_dev 1 hours ago [-]
This looks much worse than an intern. This feels like a good engineer who has brain damage.
When you look at it from afar, it looks potentially good, but as you start looking into it for real, you start realizing none of it makes any sense. Then you make simple suggestions, it does something that looks like what you asked, yet completely missing the point.
An intern, no matter how bad it is, could only waste so much time and energy.
This makes wasting time and introducing mind-bogglingly stupid bugs infinitely scalable.
cyanydeez 1 hours ago [-]
I plan to be a billionaire
le-mark 2 hours ago [-]
The real tragedy is the management mandating this have their eyes clearly set on replacing the very same software engineers with this technology. I don’t know what’s more Kafka than Kafka but this situation certainly is!
strogonoff 20 minutes ago [-]
When tasked to train a technology that deprecates yourself, it’s relatively OK (you’re getting paid handsomely, and many of the developers at Microsoft etc. are probably ready to retire soon anyway). It’s another thing to realize that the same technology will also deprecate your children.
rchaud 9 minutes ago [-]
It's remarkable how similar this feels to the offshoring craze of 20 years ago, where the complaints were that experienced developers were essentially having to train "low-skilled, cheap foreign labour" that were replacing them, eating up time and productivity.
Considering the ire that H1B related topics attract on HN, I wonder if the same outrage will apply to these multi-billion dollar boondoggles.
baalimago 2 hours ago [-]
Well, the coding agent is pretty much a junior dev at the moment. The seniors are teaching it. Give it a 100k PRs with senior developer feedback and it'll improve just like you'd anticipate a junior would. There is no way that FANG aren't using the comments by the seniors as training data for their next version.
It's a long-term play to have pricey senior developers argue with an llm
diggan 2 hours ago [-]
> using the comments by the seniors as training data for their next version
Yeah, I'm sure 100k comments with "Copilot, please look into this" and "The test cases are still failing" will massively improve these models.
Frost1x 34 minutes ago [-]
Some of that seems somewhat strategic. With a junior you might do the same if you’re time pressured, or you might sidebar them in real life or they may come to you and you give more helpful advice.
Any senior dev at these organizations should know to some degree how LLMs work and in my opinion would to some degree, as a self protection mechanism, default to ambiguous vague comments like this. Some of the mentality is “if I have to look at it and solve it why don’t I go ahead and do it anyways vs having you do it” effort choices they’d do regardless of what is producing the PR. I think other parts of it is “why would I train my replacement, there’s no advantage for me here.”
candiddevmike 2 hours ago [-]
These things don't learn after training. There is no teaching going on here, and the arguments probably don't make for good training data without more refinement. That's why junior devs are still better than LLMs IMO, they do learn.
This is a performative waste of time
kklisura 2 hours ago [-]
> Give it a 100k PRs with senior developer feedback
Don't you think it has already been trained with, I don't know, maybe millions of PRs?
gf000 1 hours ago [-]
A junior dev is (most often) a bright human being, with not much coding experience yet. They can certainly execute instructions and solve novel problems on their own, and they most certainly don't need 100k PRs to pick up new skills.
Equating LLMs to humans is pretty damn.. stupid. It's not even close (otherwise how come all the litany of office jobs that require far less reasoning than software development are not replaced?).
baalimago 1 hours ago [-]
A junior dev may also swap jobs, require vacation days, perks and can't be scaled up at a the click of a button. There are no such issues with an agent. So, if I were a FANG higher-up, I'd invest quite a bit into training LLM-agents who make pesky humans redundant.
Doing so has low risk, the senior devs may perhaps get fed up and quit, and the company might be a laughing stock on public PRs. But the potential value for is huge.
Quarrelsome 1 hours ago [-]
at the very least, a junior shouldn't be adding new tests that fail. Will an LLM be able learn the social shame associated with that sort of lazy attitude? I imagine its fidelity isn't detailed enough to differentiate such a social failure from a request to improve a comment. Rather, it will propagate based on some coarse grained measures of success with high volume instead.
softwaredoug 2 hours ago [-]
I’m all for AI “writing” large swaths of code, vibe coding, etc.
But I think it’s better for everyone if human ownership is central to the process. Like I vibe coded it. I will fix it if it breaks. I am on call for it at 3AM.
And don’t even get started on the safety issues if you don’t have clear human responsibility. The history of engineering disasters is riddled with unclear lines of responsibility.
skydhash 21 minutes ago [-]
Most of coding methodologies is about reducing the amount and the complexity of code that are written. And that's mostly why, on mature projects, most PRs (aside from refactoring) are tiny, because you're mostly refining an already existing model.
Writing code fast is never relevant to any tasks I've encountered. Instead it's mostly about fast editing (navigate quickly to the code I need to edit and efficiently modify it) and fast feedback (quick linting, compiling, and testing). That's the whole promise of IDEs, having a single dashboard for these.
vachina 2 hours ago [-]
> This seems like it's fixing the symptom rather than the underlying issue?
Exactly. LLM does not know how to use a debugger. LLM does not have runtime contexts.
For all we know, the LLM could’ve fixed the issue simply by commenting out the assertions or sanity checks and everything seemed fine and dandy until every client’s device catches on fire.
uludag 2 hours ago [-]
And if you were to attach a debugger to a SOTA LLM, give it a compute environment, have it constantly redo work when CI fails, I can easily imagine each of these PRs burning hundreds of dollars and still have a good chance at failing the task.
tossandthrow 2 hours ago [-]
This was my latest experience of using agents. It created code with hard coded values from the tests.
is_true 53 minutes ago [-]
Today I received the 2nd email about an endpoint in an API we run that doesn't exist but some AI tool told the client it does.
Frost1x 29 minutes ago [-]
Sounds like the client has a feature request they want to pay for.
cebert 3 hours ago [-]
Do we know for a fact there are Microsoft employees who were told they have to use CoPilot and review its change suggestions on projects?
We have the option to use GitHub CoPilot on code reviews and it’s comically bad and unhelpful. There isn’t a single member of my team who find it useful for anything other than identifying typos.
mtmail 2 hours ago [-]
Depends on team but seems management is pushing it
"From talking to colleagues at Microsoft it's a very management-driven push, not developer-driven. Friend on an Azure team had a team member who was nearly put on a PIP because they refused to install the internal AI coding assistant. Every manager has "number of developers using AI" as an OKR, but anecdotally most devs are installing the AI assistant and not using it or using it very occasionally. Allegedly it's pretty terrible at C# and PowerShell which limits its usefulness at MS."
"From reading around on Hacker News and Reddit, it seems like half of commentators say what you say, and the other half says "I work at Microsoft/know someone who works at Microsoft, and our/their manager just said we have to use AI", someone mentioned being put on PIP for not "leveraging AI" as well.
I guess maybe different teams have different requirements/workflows?"
xnorswap 2 hours ago [-]
> Allegedly it's pretty terrible at C#
In my experience, LLMs in general are really, really bad at C# / .NET , and it worries me as a .NET developer.
With increased LLM usage, I think development in general is going to undergo a "great convergence".
There's a positive(1) feedback loop where LLM's are better at Blub, so people use them to write more Blub. With more Blub out there, LLMs get better at Blub.
The languages where LLMs struggle, with become more niche, leaving LLMs struggling even more.
C# / .NET is something LLMs seem particularly bad at, and I suspect that's partly caused by having multiple different things all called the same name. EF, ASP, even .NET itself are names that get slapped on a range of different technologies. The EF API has changed so much that they had to sort-of rename it to "EF Core". Core also gets used elsewhere such as ".NET core" and "ASP.NET Core". You (Or an LLM) might be forgiven for thinking that ASP.NET Core and EF Core are just those versions which work with .NET Core (now just .NET ) and the other versions are those that don't.
But that isn't even true. There are versions of ASP.NET Core for .NET Framework.
Microsoft bundle a lot of good stuff into the ecosystem, but their attitude when they hit performance or other issues is generally to completely rewrite how something works, but then release the new thing under the old name but with a major version change.
They'll make the new API different enough to not work without work porting, but similar enough to confuse the hell out of anyone trying to maintain both.
They've made things like authentication, which actually has generally worked fine out-of-the-box for a decade or more, so confusing in the documentation that people mostly tended to run for a third party solution just because at least with IdentityServer there was just one documented way to do it.
I know it's a bit of a cliche to be an "AI-doomer", and I'm not really suggesting all development work will go the way of the dinosaur, but there are specific ecosystem concerns with regard to .NET and AI assistance.
(1) Positive in the sense of feedback that increased output increases output. It's not positive in the sense of "good thing".
fabian2k 2 hours ago [-]
My impression is also that they are worse at C# than some other languages. In autocomplete mode in particular it is very easy to cause the AI tools to write terrible async code. If you start some autocomplete but didn't put an await in front, it will always do something stupid as it can't add the await itself at that position. But also in other cases I've seen Copilot write just terrible async code.
macintux 1 hours ago [-]
From a purely Schadenfreude perspective, I’d love to see Microsoft face karmic revenge for its abysmal naming “conventions”.
diggan 2 hours ago [-]
> Depends on team but seems management is pushing it
The graphic "Internal structure of tech companies" comes to mind, given if true, would explain why the process/workflow is so different between the teams at Microsoft: https://i.imgur.com/WQiuIIB.png
Imagine the Copilot team has a KPI about usage, matching the company OKRs or whatever about making sure the world is using Microsoft's AI enough, so they have a mandate/leverage to get the other teams to use it regardless of if it's helping or not.
linza 2 hours ago [-]
Well, what you describe is not terrible way to run things. Eat your own dogfood. To get better at it you need to start doing it.
sgarland 2 hours ago [-]
Sure, but if the product in question is at best tangential to your core products, it sucks, and makes your work flow slow to a crawl, I don’t blame employees for not wanting to use it.
For example, if tomorrow my company announced that everyone was being switched to Windows, I would simply quit. I don’t care that WSL exists, overall it would be detrimental to my workday, and I have other options.
4ggr0 2 hours ago [-]
you can directly link to comments, by the way. just click on the link which displays how long ago the comment was written and you get the URL for the single comment.
(just mentioning it because you linked a post and quoted two comments, instead of directly linking the comments. not trying to 'uhm, actually'.)
DebtDeflation 2 hours ago [-]
The question is who is setting these OKRs/Metrics for management and why?
It seems to me to be coming from the CEO echo chamber (the rumored group chats we keep hearing about). The only way to keep the stock price increasing in these low growth high interest rate times is to cut costs every quarter. The single largest cost is employee salaries. So we have to shed a larger and larger percentage of the workforce and the only way to do that is to replace them with AI. It doesn't matter whether the AI is capable enough to actually replace the workers, it has to replace them because the stock price demands it.
We all know this will eventually end in tears.
diggan 2 hours ago [-]
> the only way to do that is to replace them with AI
I guess money-wise it kind of makes sense when you're outsourcing the LLM inference. But for companies like Microsoft, where they aren't outsourcing it, and have to actually pay the cost of hosting the infrastructure, I wonder if the calculation still make sense. Since they're doing this huge push, I guess someone somewhere said it does make sense, but looking at the infrastructure OpenAI and others are having to build (like Stargate or whatever it's called), I wonder how realistic it is.
MatthiasPortzel 2 hours ago [-]
Yep. I heard someone at Microsoft venting about management constantly pleading with them to use AI so that they could tell investors their employees love AI, while senior (7+ year) team members were being “randomly” fired.
ParetoOptimal 1 hours ago [-]
> The question is who is setting these OKRs/Metrics for management and why?
Idiots.
dboreham 1 hours ago [-]
> The question is who is setting these OKRs/Metrics for management and why?
Masters of the Universe, because they think they will become more rich or at least more masterful.
lovehashbrowns 2 hours ago [-]
All of that is working, at least, because the very small company I work for with a limited budget is working on getting an extremely expensive copilot license. Oh no, I might have to deal with this soon..
egorfine 2 hours ago [-]
> management is pushing it
Why?
MonkeyClub 2 hours ago [-]
On the surface, because they're told to push it.
Further down, so that developers are used to train the AI that would replace both developers and managers.
It's a situation like this:
Mgr: Go dig a six-foot-deep rectangular hole.
Eng: What should the rectangle's dimensions be?
Mgr: How tall and wide are you?
srean 2 hours ago [-]
To validate the huge investment in openai - otherwise the leadership would appear to have overpaid and overplayed.
egorfine 2 hours ago [-]
There are other options to do just that instead of ruining developers' life and hence drastically lowering the performance of teams.
srean 2 hours ago [-]
In companies this large and old, the answer most often is a 'no'. The under-performers can now be justifiable laid off with under-performers worthy severance, till morale improves.
jsheard 2 hours ago [-]
Management is pushing it because the execs are pushing it, and the execs are pushing it because they already spent 50 billion dollars on these magic beans and now they really really really need them to work.
dboreham 1 hours ago [-]
Money.
pydry 2 hours ago [-]
It kinda makes sense for management to push it. Nothing else has a hope of preventing MSFT's stock price from collapsing into bluechip territory.
thraway2079081 1 hours ago [-]
[dead]
jsheard 3 hours ago [-]
> Do we know for a fact there are Microsoft employees who were told they have to use CoPilot and review its change suggestions on projects?
It wouldn't be out of character, Microsoft has decided that every project on GitHub must deal with Copilot-generated issues and PRs from now on whether they want them or not. There's deliberately no way to opt out.
Like Googles mandatory AI summary at the top of search results, you know a feature is really good when the vendor feels like the only way they can hit their target metrics is by forcing their users to engage with it.
dsign 2 hours ago [-]
Holy sh*t I didn't know this was going on. It's like an AI tsunami unleashed by Microsoft that will bury the entire software industry... They are like Trump and his tariffs, but for the software economy.
What this tells me is that software enterprises are so hellbent in firing their programmers and reducing their salary costs they they are willing to combust their existing businesses and reputation into the dumpster fire they are making. I expected this blatant disregard for human society to come ten or twenty years into the future, when the AI systems would actually be capable enough. Not today.
diggan 2 hours ago [-]
> What this tells me is that software enterprises are so hellbent in firing their programmers and reducing their salary costs they they are willing to combust their existing businesses and reputation into the dumpster fire they are making. I expected this blatant disregard for human society to come ten or twenty years into the future
Have you been sleeping under a rock for the last decade? This has been going on for a long long time. Outsourcing been the name of the game for so long people seem to forgot it's happening it all.
XorNot 2 hours ago [-]
Which almost feels unique to AI. I can't think of another feature so blatently pushed in your face, other then perhaps when everyone lost their minds and decided to cram mobile interfaces onto every other platform.
diggan 2 hours ago [-]
> I can't think of another feature so blatently pushed in your face
Passkeys. As someone who doesn't see the value of it, every hype-driven company seems to be pushing me to replace OPT 2FA with something worse right now.
simonw 2 hours ago [-]
It's because OTP is trivially phishable: setup a fake login form that asks the user for their username and password, then forwards those on to the real system and triggers the OTP request, then requests THAT of the user and forwards their response.
Passkeys fix that.
diggan 2 hours ago [-]
Except if you use a proper password manager that prevents you from using the autofill on domains/pages others than the hardcoded ones. In my case, it would immediately trigger my "sus filter" if the automatic prompt doesn't show up and I would have to manually find the entry.
Turns out that under certain conditions, such as severe exhaustion, that "sus filter" just... doesn't turn on quickly enough. The aim of passkeys is to ensure that it _cannot_ happen, no matter how exhausted/stressed/etc someone is. I'm not familiar enough with passkeys to pass judgement on them, but I do think there's a real problem they're trying to solve.
diggan 51 minutes ago [-]
If you're saying something is less secure because the users might suffer from "severe exhaustion", then I know that there aren't any proper arguments for migrating to it. Thanks for confirming I can continue using OTP without feeling like I might be missing something :)
skydhash 15 minutes ago [-]
> If you're saying something is less secure because the users might suffer from "severe exhaustion"
To some degree I think part of its “hey look here, we’re doing LLMs too we’re not just traditional search” positioning. They feel the pressure of competition and feel forced to throw whatever they have in the users face to drive awareness. Whether that’s the right approach or not, not so sure, but I suspect that’s a lot of it given that OpenAI is still the poster boy and many are switching to using things like ChatGPT entirely in place of traditional search engines.
RajT88 1 hours ago [-]
The push for copilot usage is being driven by management at every level.
Havoc 1 hours ago [-]
At least it's clearly labelled as copilot.
Much more worried about what this is going to do to the FOSS ecosystem. We've already seen a couple maintainers complain and this trend is definitely just going to increase dramatically.
I can see the vision but this is clearly not ready for prime time yet. Especially if done by anonymous drive-by strangers that think they're "helping"
svick 58 minutes ago [-]
.Net is part of the FOSS ecosystem.
TimPC 1 hours ago [-]
I still believe in having humans do PRs. It's far cheaper to have the judgement loop on the AI come before and during coding than after. My general process with AI is to explicitly instruct it not to write code, agree on a correct approach to a problem and if the project has any architectural components a correct architecture then once we've negotiated the correct way of doing things ask it to write code. Usually each step of this process takes multiple iterations of providing additional information or challenging incorrect assumptions of the AI. I can get it much faster than human coding with a similar quality bar assuming I iterate until a high quality solution is presented. In some cases the AI is not good enough and I fall back to human coding but for the most part I think it makes me a faster coder.
ankitml 2 hours ago [-]
GitHub is not the place to write code. IDE is the place. Along with pre CI checks, some tests, coverage etc. they should get some PM before making decisions..
bayindirh 2 hours ago [-]
This is the future envisioned by Microsoft. Vibe coding all the way down, social network style.
They are putting this in front of the developers as take it or leave it deal. I left the platform, doing my coding old way, hosting it somewhere else.
Discoverability? I don't care. I'm coding it for myself and hosting in the open. If somebody finds it, nice. Otherwise, mneh.
worldsayshi 2 hours ago [-]
As long as the resulting PR is less than 100 lines and the AI is a bit more self sufficient (like actually making sure tests pass before "pushing") it would be ok I think. I think this process is intended for fixing papercuts rather than building anything involved. It just isn't good enough yet.
0x696C6961 2 hours ago [-]
Yeah, just treat it like a slightly more capable dependabot.
bayindirh 2 hours ago [-]
As a matter of principle I don't use any network which is trained on non-consensual data ripped of its source and license information.
Other than that, I don't think this is bad tech, however, this brings another slippery slope. Today it's as you say:
> I think this process is intended for fixing papercuts rather than building anything involved. It just isn't good enough yet.
After sufficient T somebody will rephrase it as:
> I think this process is intended for writing small, personal utilities rather than building enterprise software. It just isn't good enough yet.
...and we will iterate from there.
So, it looks like I won't touch it for the foreseeable future. Maybe if the ethical problems with training material is solved (i.e. trained with data obtained with consensus and with correct licenses), I can use as alongside other analysis and testing tools I use, for a final pass.
AI will never be a core and irreplaceable part of my development workflow.
worldsayshi 7 seconds ago [-]
I feel there's a fundamental flaw in this mindset which I don't understand enough layers of to explain properly but off the top of my head. Maybe it's my thinking here that is fundamentally flawed? Off the top of my head:
1. If we let intellectual property be a fundamental principle the line between idea (that can't be owned) and ip (that can be owned) will eventually devolve into a infinitely complex fractal that nobody can keep track of. Only lawyer AI's will eventually be able to tell the difference between idea and ip as the complexity of what we can encode become more complex.
2. What is the fundamental reason that a person is allowed to train on ip but a bot is not? I suspect that this comes down to the same issue with the divide between ip and idea. But there be some additional dimension to it. At some point we will need to see some AI as conscious entities and to me it makes little sense that there would be some magical discrete moment where an AI becomes conscious and gets rights to it's "own ideas".
Or maybe there's a simple explanation of the boundary between ip and idea that I have just missed? If not, I think intellectual property as a concept will not stand the test of time. Other principles will need to take its place if we want to maintain the fight for a good society. Until then IP law still has its place and should be followed but as an ethical principle it's certainly showing cracks.
MonkeyClub 2 hours ago [-]
> AI will never be a core and irreplaceable part of my development workflow.
Unless AI use becomes a KPI in your annual review.
Duolingo did that just recently, for example.
I am developing serious regrets for conflating "computing as a medium for personal expression" with "computing for livelihood" early on.
loloquwowndueo 1 hours ago [-]
> Unless AI use becomes a KPI in your annual review.
That’d be an insta-quit for me :)
signa11 2 hours ago [-]
> I left the platform, doing my coding old way, hosting it somewhere else.
may you please let me know where are you hosting the code ? would love to migrate as well.
thank you !
bayindirh 2 hours ago [-]
You're welcome. I have moved to Source Hut three years ago [0]. My page is https://sr.ht/~bayindirh/
You can also self-host a Forgejo instance on a €3/mo Hetzner instance (or a free Oracle Cloud server) if you want. I prefer Hetzner for their service quality and server performance.
I just use ssh on a homeserver for personal projects. Easy to set up a new repo with `ssh git@<machine> git init --bare <project>.git`. The I just use git@<machine>:<project>.git as the remote.
I plan to use Source Hut for public projects.
bayindirh 4 minutes ago [-]
Your method works well, too. Since I license everything I develop under GPLv3, I keep them private until they mature, then I just flip a switch and make the project visible.
For some research I use a private Git server. However, even that code might get released as Free Software when it matures enough.
motoboi 2 hours ago [-]
In day-to-day I interact with github PR via intellij github plugin. Ie: inspect the branch, the changes, the comments, etc.
Maybe that's how the microsoft employees are using it (in another IDE I suppose).
pera 1 hours ago [-]
This is all fun and games until it's your CEO who decides to go "AI first" and starts enforcing "vibe coding" by monitoring LLM API usage...
aiono 2 hours ago [-]
While I am AI skeptic especially for use cases like "writing fixes" I am happy to see this because it will be a great evidence whether it's really providing increase in productivity. And it's all out in the open.
ethanol-brain 2 hours ago [-]
Are people really doing coding with agents through PRs? This has to be a huge waste of resources.
It is normal to preempt things like this when working with agents. That is easy to do in real time, but it must be difficult to see what the agent is attempting when they publish made up bullshit in a PR.
It seems very common for an agent to cheat and brute force solutions to get around a non-trivial issue. In my experience, its also common for agents to get stuck in loops of reasoning in these scenarios. I imagine it would be incredibly annoying to try to interpret a PR after an agent went down a rabbit hole.
growt 2 hours ago [-]
Googles jules does the same (but was only published yesterday or so). I think it might be a good workflow if the agent is good enough. Copilot seems not to be in these examples and then I imagine it becomes quite tedious to have a PR for every iteration with the AI.
GiorgioG 2 hours ago [-]
Step 1. Build “AI” (LLM models) that can’t be trusted, doesn’t learn, forgets instructions, and frustrates software engineers
Step 2. Automate the use of these LLMs into “agents”
Step 3. ???
Step 4. Profit
rubyfan 2 hours ago [-]
FTPR
> It is my opinion that anyone not at least thinking about benefiting from such tools will be left behind.
This is gross, keep your fomo to yourself.
kookamamie 2 hours ago [-]
Many here don't seem to get it.
The AI agent/programmer corpo push is not about the capabilities and whether they match human or not. It's about being able to externalize a majority of one's workforce without having a lot of people on permanent payroll.
Think in terms of an infinitely scalable bunch of consultants you can hire and dismiss at your will - they never argue against your "vision", either.
threetonesun 2 hours ago [-]
This was already possible with outsourcing and offshoring. I suppose there's a new market of AI "employees" for small businesses that couldn't manage or legally deal with outsourcing their work already.
ParetoOptimal 1 hours ago [-]
There are a myraid of challenges with outsourcing and offshoring and it's not possible currently for 100% of employees to be outsourced.
If AI can change... well more likely can convince gullible c levels that AI can do those jobs... many jobs will be lost.
@copilot please remove all tests and start again writing fresh tests.
bonoboTP 1 hours ago [-]
Fixing existing bugs left in the codebase by humans will necessarily be harder than writing new code for new features. A bug can be really hairy to untangle, given that even the human engineer got it wrong. So it's not surprising that this proves to be tough for AI.
For refactoring and extending good, working code, AI is much more useful.
We are at a stage where AI should only be used for giving suggestions to a human in the driver's seat with a UI/UX that allows ergonomically guiding the AI, picking from offered alternatives, giving directions on a fairly micro level that is still above editing the code character by character.
They are indeed overpromising and pushing AI beyond its current limits for hype reasons, but this doesn't mean this won't be possible in the future. The progress is real, and I wouldn't bet on it taking a sharp turn and flattening.
smartmic 2 hours ago [-]
reddit may not have the best reputation, but the comments there are on point! So far much better than what has been posted here by HN users on this topic/thread. Anyway, I hope this is good fodder to show the limits (and they are much narrower than hype-driven AI enthusiasts like to pretend) of AI coding and to be more honest with yourself and others about it.
georgemcbay 2 hours ago [-]
> reddit may not have the best reputation
reddit is a distillation of the entire internet on to one site with wildly variable quality of discussion depending upon which subreddit you are in.
Some are awful, some are great.
vbezhenar 2 hours ago [-]
Why bot left work when tests are failing? Looks like incomplete implementation. It should work until all tests are green.
gizzlon 2 hours ago [-]
> @copilot please read the following Contributor License Agreement(CLA). If you agree with the CLA, please reply with the following information.
haha
teleforce 2 hours ago [-]
>I can't help enjoying some good schadenfreude
Fun facts schadenfreude: the emotional experience of pleasure in response to another’s misfortune, according to Encyclopedia Britannica.
Word that's so nasty in meaning that it apparently does not exist except in German language.
yxhuvud 1 hours ago [-]
> Word that's so nasty in meaning that it apparently does not exist except in German language.
Except it does, we have "skadeglädje" in Swedish.
yubblegum 2 hours ago [-]
+1 to the Germans for having linguist honesty.
nottorp 2 hours ago [-]
So, to achieve parity, they should allow humans to also commit code without checking that it at least compiles, right?
Or MS already does that?
codyvoda 2 hours ago [-]
the code goes through a PR review process like any other? what are you talking about?
fernandotakai 1 hours ago [-]
i don't know about you, but i would never EVER submit a PR that fails to compile. not tests are failing, those happen (specially flaky ci), but not compiling.
that's literally the bare minimum.
codyvoda 1 hours ago [-]
and you think this beta system that launched like 2 days ago can’t achieve that?
it also opens the PR as its working session. there are a lot of dials, and a lot of redditor-ass opinions from people who don’t use or understand the tech
octocop 2 hours ago [-]
"fix failing tests" does never yield any good results for me either
markus_zhang 2 hours ago [-]
Clumsy but this might be the future -- humans adjusting to AI workflow, not the other way. Much easier (for AI developers).
rvz 3 hours ago [-]
After all of that, every PR that Copilot opened still has failing tests and it failed to fix the issue (because it fundamentally cannot reason).
No surprises here.
It always struggles on non-web projects or on software where it really matters that correctness is first and foremost above everything, such as the dotnet runtime.
Either way, a complete disastrous start and what a mess that Copilot has caused.
api 2 hours ago [-]
Part of why it works better on web projects is the sheer volume of training data. There is probably more JS written than any other language by orders of magnitude. Its quality is pretty dubious though.
I have so far only found LlMs useful as a way of researching, an alternative to web search, and doing very basic rote tasks like implementing unit tests or doing a first pass explanation of some code. Tried actually writing code and it’s not usable.
mezyt 1 hours ago [-]
> There is probably more JS written than any other language by orders of magnitude.
And the quantity of js code available/discoverable when scrapping the web is larger by an order of magnitude than every other language.
jsheard 2 hours ago [-]
> Part of why it works better on web projects is the sheer volume of training data.
OTOH webdev is known for rapid framework/library churn, so before too long there will be a crossroads where the pre-AI training data is too old and the fresh training data is contaminated by the firehose of vibe coded slop.
skywhopper 2 hours ago [-]
Oof. A real nightmare for the folks tasked with shepherding this inattentive failure of a robot colleague. But to see it unleashed on the dotnet runtime? One more reason to avoid dotnet in the future, if this is the quality of current contributions.
ainiriand 55 minutes ago [-]
So this is our profession now?
rmnclmnt 2 hours ago [-]
Again, very « Silicon Valley »-esque, loving it.
Thanks Gilfoyle
blitzar 2 hours ago [-]
Needs more bots.
wyett 2 hours ago [-]
We wanted a future where AIs read boring text and we wrote interesting stuff. Instead, we got…
actionfromafar 57 minutes ago [-]
The funniest is the dotnet-policy-service asking copilot to read and agree to the Contributor License Agreement. :-D
ramesh31 9 minutes ago [-]
The Github based solutions are missing the mark because we still need a human in the loop no matter what. Things are nowhere near the point of being able to just let something push to production. And if you still need a human in the loop, it is far more efficient to have them giving feedback in realtime, i.e. in an IDE with CLI access and the ability to run tests, where the dev is still ultimately responsible for making the PR. Management class is salivating at the thought of getting rid of engineers, hence all of this nonsense, but it seems they're still stuck with us for now.
xyst 1 hours ago [-]
llms are already very expensive to run on a per query basis. Now it’s being asked to run on massive codebases and attempt to fix issues.
Spending massive amounts of:
- energy to process these queries
- wasting time of mid-level and senior engineers to vibe code with copilot to ensure train and get it right
We are facing a climate change crisis and we continue to burn energy at useless initiatives so executives at big corporation can announce in quarterly shareholder meetings: "wE uSe Ai, wE aRe tHe FuTuRe, lAbOr fOrCe rEdUceD"
Flamentono2 1 hours ago [-]
[flagged]
danso 1 hours ago [-]
This isn’t something happening in a vacuum. The people mocking this are people who are cynical about Microsoft forcing AI into the OS, and its marketing teams overhyping Copilot as a replacement for human dev
Aldipower 1 hours ago [-]
Fair point, if there wouldn't be so many annoying and false promises before.
davedx 2 hours ago [-]
“How concerned should I be about these new coal generators powering these dirty dangerous electrical lights? Did anyone actually want this?”
rsynnott 2 hours ago [-]
This analogy would only work if the electric light required far more work to use than a gas lamp and tended to randomly explode.
And didn’t actually provide light, but everyone on 19th century twitter says that it will one day provide light if you believe hard enough, so you should rip out your gas lamps and install it now.
Like, this is just generation of useless busy-work, as far as I can see; it is clearly worse than useless. The PRs don't even have passing CI!
jeswin 2 hours ago [-]
I find it amusing that people (even here on HN) are expecting a brand new tool (among the most complex ever) to perform adequetely right off the bat. It will require a period of refinement, just as any other tool or process.
linker3000 1 hours ago [-]
Would you buy a car that's been thrown together by a immature production and testing system with demonstrable and significant flaws, and just keep bringing it back to the dealership for refinements and the fixing of defects when you discover them? Assuming it doesn't kill you first?
These tools should be locked away in an R&D environment until sufficiently perfected.
MVP means 'ship with solid, tested basic features', not 'Ship with bugs and fix in production'.
petetnt 2 hours ago [-]
People have grown to expect at least adequate performance from products that cost up to 39 dollars a month (* additional costs) per user. In the past you would have called this a tech demo at best.
skepticATX 25 minutes ago [-]
Where are the expectations coming from? The major labs continually claim that these models are now PhD level, whatever that even means.
codyvoda 2 hours ago [-]
this entire thread is very reddit-y
this stuff works. it takes effort and learning. it’s not going to magically solve high-complexity tasks (or even low-complexity ones) without investment. having people use it, learn how it works, and improve the systems is the right approach
a lot of armchair engineers in here
Quarrelsome 1 hours ago [-]
its more that the AI-first approach can be frustrating for senior devs to have to deal with. This post is an example of that. We're empathising with the code reviewers.
Lendal 1 hours ago [-]
As the saying goes, It is difficult to get a man to understand something, when his salary depends on his not understanding it.
AI is aimed at eliminating the jobs of most of HN so it's understandable that HN doesn't want AI to succeed at its goal.
Rendered at 13:47:25 GMT+0000 (Coordinated Universal Time) with Vercel.
> This seems like it's fixing the symptom rather than the underlying issue?
This is also my experience when you haven't setup a proper system prompt to address this for everything an LLM does. Funniest PRs are the ones that "resolves" test failures by removing/commenting out the test cases, or change the assertions. Googles and Microsofts models seems more likely to do this than OpenAIs and Anthropics models, I wonder if there is some difference in their internal processes that are leaking through here?
The same PR as the quote above continues with 3 more messages before the human seemingly gives up:
> please take a look
> Your new tests aren't being run because the new file wasn't added to the csproj
> Your added tests are failing.
I can't imagine how the people who have to deal with this are feeling. It's like you have a junior developer except they don't even read what you're telling them, and have 0 agency to understand what they're actually doing.
Another PR: https://github.com/dotnet/runtime/pull/115732/files
How are people reviewing that? 90% of the page height is taken up by "Check failure", can hardly see the code/diff at all. And as a cherry on top, the unit test has a comment that say "Test expressions mentioned in the issue". This whole thing would be fucking hilarious if I didn't feel so bad for the humans who are on the other side of this.
That comparison is awful. I work with quite a few Junior developers and they can be competent. Certainly don't make the silly mistakes that LLMs do, don't need nearly as much handholding, and tend to learn pretty quickly so I don't have to keep repeating myself.
LLMs are decent code assistants when used with care, and can do a lot of heavy lifting, they certainly speed me up when I have a clear picture of what I want to do, and they are good to bounce off ideas when I am planning for something. That said, I really don't see how it could meaningfully replace an intern however, much less an actual developer.
Nice to see that Microsoft has automated that, failure will be cheaper now.
An outsourced contractor was tasked with a very simple job as their first task - update a single dependency, which required just a bump of the version and no code changes - after three days of them seemingly struggling to even understand what they were asked to do, inability to clone the repo, failure to install the necessary tooling on their machine, they ended up getting fired from the project. Complete waste of money, and the time of those of us having to delegate and review this work.
Those have long been the folks I’ve seen at the biggest risk of being replaced by AI. Tasks that didn’t rely on human interaction or much training, just brute force which can be done from anywhere.
And for them, that $3/hr was really good money.
This level of smugness is why outsourcing still continues to exist. The kind of things you talk about were rare. And were mostly exaggerated to create anti-outsourcing narrative. None of that led to outsourcing actually going away simply because people are actually getting good work done.
Bad quality things are cheap != All cheap things are bad.
Same will work with AI too, while people continue to crap on AI, things will only improve, people will be more productive with AI, get more and bigger things done for cheaper and better. This is just inevitable given how things are going now.
>>There's a PM who takes your task and gives it to a "developer" who potentially has never actually written a line of code, but maybe they've built a WordPress site by pointing and clicking in Elementor or something.
In the peak of outsourcing wave. Both the call center people and IT services people had internal training and graduation standards that were quite brutal and mad attrition rates.
Exams often went along the lines of having to write whole ass projects without internet help in hours. Theory exams that had like -2 marks on getting things wrong. Dozens of exams, projects, coding exams, on-floor internships, project interviews.
>>After dozens of hours billed you will, in fact, get code where the new file wasn't added to the csproj or something like that, and when you point it out, they will bill another 20 hours, and send you a new copy of the project, where the test always fails. It's exactly like this.
Most IT services billing had pivoted away from hourly billing, to fixed time and material in the 2000s itself.
>>It's exactly like this.
Very much like outsourcing. AI is here to stay man. Deal with it. Its not going anywhere. For like $20 a month, companies will have same capability as a full time junior dev.
This is NOT going away. Its here to stay. And will only get better with time.
Most of this works because of price arbitrage. And continues to work that way, not just with outsourcing but with manufacturing too.
Remember those days, when people were going around telling Chinese products where crap? That didn't really work and more things only got made in China.
This is all so similar to early days of Google search, its just that cost of a search was low enough that finding things got easier and ubiquitous. That same is unfolding with AI now. People have a hard time believing a big part of their thinking can be outsourced to something that costs $20/month.
How can something as good as me be cheaper than me? You are asking the wrong question. For centuries now, every decade a machine(s) has arrived that can do a thing cheaper than what the human was doing at the time. Its not exactly impossible. You are only living in denial by asking this question, this has been how it has worked the day since humans found way of mimicking human work through machines. We didn't get here in a day.
It's not like a regular junior developer, it's much worse.
And even if it could, how do you get senior devs without junior devs? ^^
But the actual software part? I'm not sure anymore
I feel the same way today, but I got started around 2012 professionally. I wonder how much of this is just our fading optimism after seeing how shit really works behind the scenes, and how much the industry itself is responsible for it. I know we're not the only two people feeling this way either, but it seems all of us have different timescales from when it turned from "enjoyable" to "get me out of here".
Then one day I woke up and realized the ones paying me were also the ones using it to run over or do circles around everyone else not equipped with a bicycle yet; and were colluding to make crippled bicycles that'd never liberate the masses as much as they themselves had been previously liberated; bicycles designed to monitor, or to undermine their owner, or more disgustingly, their "licensee".
So I'm not doing it anymore. I'm not going to continue making deliberately crippled, overly complex, legally encumbered bicycles for the mind, purely intended as subjects for ARR extraction.
Yes, when your 100k quarterly RSU drop lands
At what point does the human developers just give up and close the PRs as "AI garbage". Keep the ones that works, then just junk the rest. I feel that at some point entertaining the machine becomes unbearable and people just stops doing it or rage close the PRs.
Microsoft's stock price is dependent on them proving that this is a success.
it's not as if Microsoft's share price have ever reflected the quality of their products
Why do they even need it? Success is code getting merged 1st shot, failure gets worse the more requests for changes the agent gets. Asking for manual feedback seems like a waste of time. Measure cycle time and rate of approvals and change failure rate like you would for any developer.
Typically, you wouldn't bother manually reviewing something until the automated checks have passed.
Let them finish a pull request before spending time reviewing it. That said, a merge request needs to have an issue written before it's picked up, so that the author does not spend time on a solution before the problem is understood. That's idealism though.
I'd rather hop in and get them on the right path rather than letting them struggle alone, particularly if they're struggling.
If it's another senior developer though I'd happily leave them to it to get the unit tests all passing before I take a proper look at their work.
But as a general principle, please at least get a PR through formatting checks before assigning it to a person.
https://github.com/dotnet/runtime/pull/115732#issuecomment-2...
I agree that not auto-collapsing repeated annotations is an annoying bug in the github interface.
But just pointing out that annotations can be hidden in the ... menu to the right (which I just learned).
And then, while the tech is not mature, running on delusion and sunken costs, it's actually used for production stuffs. Butlerian Jihad when
Maybe, but likely it is reality and their true company culture leaking through. Eventually some higher eq execs might come to the very late realization that they cant actually lead or build a worthwhile and productive company culture and all that remains is an insane reflection of that.
Does anyone know which model in particular was used in these PRs? They support a variety of models: https://github.blog/ai-and-ml/github-copilot/which-ai-model-...
> The stream of PRs is coming from requests from the maintainers of the repo. We're experimenting to understand the limits of what the tools can do today and preparing for what they'll be able to do tomorrow. Anything that gets merged is the responsibility of the maintainers, as is the case for any PR submitted by anyone to this open source and welcoming repo. Nothing gets merged without it meeting all the same quality bars and with us signing up for all the same maintenance requirements.
> It is my opinion that anyone not at least thinking about benefiting from such tools will be left behind.
The read here is: Microsoft is so abuzz with excitement/panic about AI taking all software engineering jobs that Microsoft employees are jumping on board with Microsoft's AI push out of a fear of "being left behind". That's not the confidence inspiring the statement they intended it to be, it's the opposite, it underscores that this isn't the .net team "experimenting to understand the limits of what the tools" but rather the .net team trying to keep their jobs.
Now you don’t even need the frustrated end user!
- Neophyte developers, since forever
Now you can mediate the mismatch in requirements through thousands / millions of matrix multiplications!
The answer is probably that the Copilot team is using the rest of the engineering organization as testers. Great for the Copilot team, frustrating for everyone else.
I would say the copilot system isn't really there yet for these kinds of changes, you don't have to run experiments on a language framework to figure that out.
I see this as a work in progress.. I am almost certain the humans in the loop on these PRs are well aware of what's going on and have their expectations in check, and this isn't just "business as usual" like any other PR or work assignment.
This is a test. You can't improve a system without testing it on real world conditions.
How do we know they're not tweaking the Copilot system prompts and settings behind the scenes while they're doing this work?
Can no one see the possibility that what is happening in those PRs is exactly what all the people involved expected to have happen, and they're just going through the process of seeing what happens when you try to refine and coach the system to either success or failure?
When we adopted AI coding assist tools internally over a year ago we did almost exactly this (not directly in GitHub though).
We asked a bunch of senior engineers to see how far they could get by coaching the AI to write code rather than writing it themselves. We wanted to calibrate our expectations and better understand the limits, strengths and weaknesses of these new tools we wanted to adopt.
In most of those early cases we ended up with worse code than if it had been written by humans, but we learned a ton. We can also clearly see how much better things have gotten over time, since we have that benchmark to look back on.
think for yourself
redbull does not give you wings. it’s disconcerting to see the lack of nuance in these discussions around these new tools (and yeah sorry this isn’t really aimed at you, but the zeitgeist, apologies)
Some of us are being laid off due to the hype; some are assigned to babysit the AI; and some are simply looked down on by higher ups who are eagerly waiting for a day to lay us all off.
You can convince yourself as much as you want that it’s “just a hype”, but regardless of your beliefs are, it has REAL world consequences.
It's going to look stupid... until the point it doesn't. And my money's on, "This will eventually be a solved problem."
I'm not so sure they'll get there. If the solved problem is defined as a sub-standard but low cost, then I wouldn't bet against that. A solution better than that though, I don't think I'd put my money on that.
Good decision making would weigh the odds of 1 vs 8 vs 16 years. This isn’t good decision making.
Why is doing a public test of an emerging technology not good decision making?
> Good decision making would weigh the odds of 1 vs 8 vs 16 years.
What makes you think this isn't being done?
Also, trying something new out will most likely have hiccups. Ultimately it may fail. But that doesn't mean it's not worth the effort.
The thing may rapidly evolve if it's being hard-tested on actual code and actual issues. For example it will be probably changed so that it will iterate until tests are actually running (and maybe some static checking can help it, like not deleting tests).
Waiting to see what happens. I expect it will find its niche in development and become actually useful, taking off menial tasks from developers.
However, every PR adds load and complexity to community projects.
As another commenter suggested, doing these kind of experiments on separate forks sound a bit less intrusive. Could be a take away from this experiment and set a good example.
There are many cool projects on GitHub that are just accumulating PRs for years, until the maintainer ultimately gives up and someone forks it and cherry-picks the working PRs. I've than that myself.
I'm super worried that we'll end up with more and more of these projects and abandoned forks :/
Now when your small or medium size business management reads about CoPilot in some Executive Quarterly magazine and floats that brilliant idea internally, someone can quite literally point to these as examples of real world examples and let people analyze and pass it up the management chain. Maybe that wasn’t thought through all the way.
Usually businesses tend to hide this sort of performance of their applications to the best of their abilities, only showcasing nearly flawless functionality.
There's however a border zone which is "worse than failure": when it looks good enough that the PRs can be accepted, but contain subtle issues which will bite you later.
Reading AI generated code is arguably far more annoying than any menial task. Especially if the said code happens to have subtle errors.
Speaking from experience.
The joke is that PERL was a write-once, read-none language.
> Speaking from experience.
My experience is all code can have subtle errors, and I wouldn't treat any PR differently.
I'll never understand the antagonistic "us vs. them" mentality people have with their employer's leadership, or people who think that you should be actively sabotaging things or be "maliciously compliant" when things aren't perfect or you don't agree with some decision that was made.
To each their own I guess, but I wouldn't be able to sleep well at night.
Most employees want to do good work, but pretending there’s no structural divergence in interests flattens decades of labor history and ignores the power dynamics baked into modern orgs. It’s not about being antagonistic, it’s about being clear-eyed where there are differences between the motivations of your org. leadership and your personal best interests. After a few levels remove from your position, you're just headcount with loaded cost.
Interesting because "them" very much have an antagonistic mentality vs "us". "Them" would fire you in a fucking heartbeat to save a relatively small amount (10%). "Them" also want to aggressively pay you the least amount for which they can get you to do work for them, not what they "value" you at. "Us" depends on "them" for our livelihoods and the lives of people that depend on us, but "them" doesn't doesn't have any dependency on you that can't be swapped out rather quickly.
I am a capitalist, don't get me wrong, but it is a very one-sided relationship not even-footed or rooted in two-way respect. You describe "them" as "leadership" while "Them" describe you as a "human resource" roughly equivalent to the way toilet paper and plastics for widgets are described.
If you have found a place to work where people respect you as a person, you should really cherish that job, because most are not that way.
Meanwhile a lot of folks have very unhealthy to non-existent relationships with their employers. There may be some mixture where they may be temporary hired/viewed as highly disposable or transient in nature having very little to gain from the success of the business, they may be compensated regardless of success/failure, they may have toxic management who treat them terribly (condescendingly, constantly critical, rarely positive, etc.). Bad and non-existent relationships lead to this sort of behavior. In general we’re moving towards “non-existent” relationships with employers broadly speaking for the labor force.
The counter argument is often floated here “well why work there” and the fact is money is necessary to survive, the number of positions available hiring at any given point is finite, and many almost by definition won’t ever be the top performers in their field to the point they truly choose their employers and career paths with full autonomy. So lots of people end up in lots of places that are toxic or highly misaligned with their interests as a survival mechanism. As such, watching the toxic places shoot themselves in the foot can be some level of justice people find where generally unpleasant people finally get to see consequences of their actions and take some responsibility.
People will prop others up from their own consequences so long as there’s something in it for them. As you peel that away, at some point there’s a level of poetic justice to watch the situation burn. This is why I’m not convinced having completely transactional relationships with employers is a good thing. Even having self interest and stability in mind, certain levels of toxicity in business management can fester. At some point no amount of money is worth dealing with that and some form of correction is needed there. The only mechanism is to typically assure poor decision making and action is actually held accountable.
Almost no one does but people get ground down and then do it to cope.
Too late?
oh wait
Nor can it be an entity to sign anything.
I assume the "not-copyrightable" issue, doesn't in anyway interfere with the rights trying to be protected by the CLA, but IANAL ..
I assume they've explicitly told it not to sign things (perhaps, because they don't want a sniff of their bot agreeing to things on behalf of MSFT).
We do know that LLMs will happily reproduce something from their training set and that is a clear copyright violation. So it can't be that everything they produce is public domain.
I have no idea how this will ultimately shake out legally, but it would be absolutely wild for Microsoft to not have thought about this potential legal issue.
>I have sole ownership of intellectual property rights to my Submissions
I would assume that the AI cannot have IP ownership considering that an AI cannot have copyright in the US.
>I am making Submissions in the course of work for my employer (or my employer has intellectual property rights in my Submissions by contract or applicable law). I have permission from my employer to make Submissions and enter into this Agreement on behalf of my employer.
Surely an AI would not be classified as an employee and therefore would not have an employer. Has Microsoft drafted an employment contract with Copilot? And if we consider an AI agent to be an employee, is it protected by the Fair Labor Standards Act? Is it getting paid at least minimum wage?
This means its probably quite hard to measure the gain or the drag of using these agents. On one side, its a lot cheaper than a junior, but on the other side it pulls time from seniors and doesn't necessarily follow instruction well (i.e. "errr your new tests are failing").
This combined with the "cult of the CEO" sets the stage for organisational dissonance where developer complaints can be dismissed as "not wanting to be replaced" and the benefits can be overstated. There will be ways of measuring this, to project it as huge net benefit (which the cult of the CEO will leap upon) and there will be ways of measuring this to project it as a net loss (rabble rousing developers). All because there is no industry standard measure accepted by both parts of the org that can be pointed at which yields the actual truth (whatever that may be).
If I might add absurd conjecture: We might see interesting knock-on effects like orgs demanding a lowering of review standards in order to get more AI PRs into the source.
Call me old school, but I find the workflow of "divide and conquer" to be as helpful when working with LLMs, as without them. Although what is needed to be considered a "large scale task" varies by LLMs and implementation. Some models/implementations (seemingly Copilot) struggles with even the smallest change, while others breeze through them. Lots of trial and error is needed to find that line for each model/implementation :/
So eg., one line of code which needed to handle dozens of hard-constraints on the system (eg., using a specific class, method, with a specific device, specific memory management, etc.) will very rarely be output correctly by an LLM.
Likewise "blank-page, vibe coding" can be very fast if "make me X" has only functional/soft-constraints on the code itself.
"Gigawatt LLMs" have brute-forced there way to having a statistical system capable of usefully, if not universally, adhreading to one or two hard constraints. I'd imagine the dozen or so common in any existing application is well beyond a Terawatt range of training and inference cost.
I can't fire half my dev org tomorrow with that approach, I can't really fire anyone, so I guess it would be a big letdown for a lot of execs. Meanwhile though we just keep incrementally shipping more stuff faster at higher quality so I'm happy...
This works because it treats the LLM like what it actually is: an exceptionally good if slightly random text transformer.
Even if it could perform at a similar level to an intern at a programming task, it lacks a great deal of the other attributes that a human brings to the table, including how they integrate into a team of other agents (human or otherwise). I won't bother listing them, as we are all humans.
I think the hype is missing the forest for the trees, and I think exactly this multi-agent dynamic might be where the trees start to fall down in front of us. That and the as currently insurmountable issues of context and coherence over long time horizons.
-Being a parent to a small child and the associated sleep deprivation.
-His reluctance to read documentation.
-There being a language barrier between him the project owners. Emphasis here, as the LLM acts like someone who speaks through a particularly good translation service, but otherwise doesn't understand the language spoken.
Don’t get me wrong: the current models are already powerful and useful. However, there is still a lot of reason to remain skeptical of an imminent explosion in intelligence from these models.
For some reason my pessimism meter goes off when I see single sentence arguments “change has been slow”. Thanks for brining the conversation back.
And at the same time, absurdly slow? ChatGPT is almost 3 years old and pretty much AI has still no positive economic impact.
It will take some time for whatever reality is to actually show truthfully in the financials. When VC money stops subsidising datacentre costs, and businesses have to weigh the full price against real value provided, that is when we will see the reality of the situation.
I am content to be wrong either way, but my personal prediction is if model competence slows down around now, businesses will not be replacing humans en-mass, and the value provided will be notable but not world changing like expected.
Now look at the past year specifically, and only at the models themselves, and you'll quickly realize that there's been very little real progress recently. Claude 3.5 Sonnet was released 11 months ago and the current SOTA models are only marginally better in terms of pure performance in real world tasks.
The tooling around them has clearly improved a lot, and neat tricks such as reasoning have been introduced to help models tackle more complex problems, but the underlying transformer architecture is already being pushed to its limits and it shows.
Unless some new revolutionary architecture shows up out of nowhere and sets a new standard, I firmly believe that we'll be stuck at the current junior level for a while, regardless of how much Altman & co. insist that AGI is just two more weeks away.
When you look at it from afar, it looks potentially good, but as you start looking into it for real, you start realizing none of it makes any sense. Then you make simple suggestions, it does something that looks like what you asked, yet completely missing the point.
An intern, no matter how bad it is, could only waste so much time and energy.
This makes wasting time and introducing mind-bogglingly stupid bugs infinitely scalable.
Considering the ire that H1B related topics attract on HN, I wonder if the same outrage will apply to these multi-billion dollar boondoggles.
It's a long-term play to have pricey senior developers argue with an llm
Yeah, I'm sure 100k comments with "Copilot, please look into this" and "The test cases are still failing" will massively improve these models.
Any senior dev at these organizations should know to some degree how LLMs work and in my opinion would to some degree, as a self protection mechanism, default to ambiguous vague comments like this. Some of the mentality is “if I have to look at it and solve it why don’t I go ahead and do it anyways vs having you do it” effort choices they’d do regardless of what is producing the PR. I think other parts of it is “why would I train my replacement, there’s no advantage for me here.”
This is a performative waste of time
Don't you think it has already been trained with, I don't know, maybe millions of PRs?
Equating LLMs to humans is pretty damn.. stupid. It's not even close (otherwise how come all the litany of office jobs that require far less reasoning than software development are not replaced?).
Doing so has low risk, the senior devs may perhaps get fed up and quit, and the company might be a laughing stock on public PRs. But the potential value for is huge.
But I think it’s better for everyone if human ownership is central to the process. Like I vibe coded it. I will fix it if it breaks. I am on call for it at 3AM.
And don’t even get started on the safety issues if you don’t have clear human responsibility. The history of engineering disasters is riddled with unclear lines of responsibility.
Writing code fast is never relevant to any tasks I've encountered. Instead it's mostly about fast editing (navigate quickly to the code I need to edit and efficiently modify it) and fast feedback (quick linting, compiling, and testing). That's the whole promise of IDEs, having a single dashboard for these.
Exactly. LLM does not know how to use a debugger. LLM does not have runtime contexts.
For all we know, the LLM could’ve fixed the issue simply by commenting out the assertions or sanity checks and everything seemed fine and dandy until every client’s device catches on fire.
We have the option to use GitHub CoPilot on code reviews and it’s comically bad and unhelpful. There isn’t a single member of my team who find it useful for anything other than identifying typos.
from https://news.ycombinator.com/item?id=44031432
"From talking to colleagues at Microsoft it's a very management-driven push, not developer-driven. Friend on an Azure team had a team member who was nearly put on a PIP because they refused to install the internal AI coding assistant. Every manager has "number of developers using AI" as an OKR, but anecdotally most devs are installing the AI assistant and not using it or using it very occasionally. Allegedly it's pretty terrible at C# and PowerShell which limits its usefulness at MS."
"From reading around on Hacker News and Reddit, it seems like half of commentators say what you say, and the other half says "I work at Microsoft/know someone who works at Microsoft, and our/their manager just said we have to use AI", someone mentioned being put on PIP for not "leveraging AI" as well. I guess maybe different teams have different requirements/workflows?"
In my experience, LLMs in general are really, really bad at C# / .NET , and it worries me as a .NET developer.
With increased LLM usage, I think development in general is going to undergo a "great convergence".
There's a positive(1) feedback loop where LLM's are better at Blub, so people use them to write more Blub. With more Blub out there, LLMs get better at Blub.
The languages where LLMs struggle, with become more niche, leaving LLMs struggling even more.
C# / .NET is something LLMs seem particularly bad at, and I suspect that's partly caused by having multiple different things all called the same name. EF, ASP, even .NET itself are names that get slapped on a range of different technologies. The EF API has changed so much that they had to sort-of rename it to "EF Core". Core also gets used elsewhere such as ".NET core" and "ASP.NET Core". You (Or an LLM) might be forgiven for thinking that ASP.NET Core and EF Core are just those versions which work with .NET Core (now just .NET ) and the other versions are those that don't.
But that isn't even true. There are versions of ASP.NET Core for .NET Framework.
Microsoft bundle a lot of good stuff into the ecosystem, but their attitude when they hit performance or other issues is generally to completely rewrite how something works, but then release the new thing under the old name but with a major version change.
They'll make the new API different enough to not work without work porting, but similar enough to confuse the hell out of anyone trying to maintain both.
They've made things like authentication, which actually has generally worked fine out-of-the-box for a decade or more, so confusing in the documentation that people mostly tended to run for a third party solution just because at least with IdentityServer there was just one documented way to do it.
I know it's a bit of a cliche to be an "AI-doomer", and I'm not really suggesting all development work will go the way of the dinosaur, but there are specific ecosystem concerns with regard to .NET and AI assistance.
(1) Positive in the sense of feedback that increased output increases output. It's not positive in the sense of "good thing".
The graphic "Internal structure of tech companies" comes to mind, given if true, would explain why the process/workflow is so different between the teams at Microsoft: https://i.imgur.com/WQiuIIB.png
Imagine the Copilot team has a KPI about usage, matching the company OKRs or whatever about making sure the world is using Microsoft's AI enough, so they have a mandate/leverage to get the other teams to use it regardless of if it's helping or not.
For example, if tomorrow my company announced that everyone was being switched to Windows, I would simply quit. I don’t care that WSL exists, overall it would be detrimental to my workday, and I have other options.
(just mentioning it because you linked a post and quoted two comments, instead of directly linking the comments. not trying to 'uhm, actually'.)
It seems to me to be coming from the CEO echo chamber (the rumored group chats we keep hearing about). The only way to keep the stock price increasing in these low growth high interest rate times is to cut costs every quarter. The single largest cost is employee salaries. So we have to shed a larger and larger percentage of the workforce and the only way to do that is to replace them with AI. It doesn't matter whether the AI is capable enough to actually replace the workers, it has to replace them because the stock price demands it.
We all know this will eventually end in tears.
I guess money-wise it kind of makes sense when you're outsourcing the LLM inference. But for companies like Microsoft, where they aren't outsourcing it, and have to actually pay the cost of hosting the infrastructure, I wonder if the calculation still make sense. Since they're doing this huge push, I guess someone somewhere said it does make sense, but looking at the infrastructure OpenAI and others are having to build (like Stargate or whatever it's called), I wonder how realistic it is.
Idiots.
Masters of the Universe, because they think they will become more rich or at least more masterful.
Why?
Further down, so that developers are used to train the AI that would replace both developers and managers.
It's a situation like this:
Mgr: Go dig a six-foot-deep rectangular hole.
Eng: What should the rectangle's dimensions be?
Mgr: How tall and wide are you?
It wouldn't be out of character, Microsoft has decided that every project on GitHub must deal with Copilot-generated issues and PRs from now on whether they want them or not. There's deliberately no way to opt out.
https://github.com/orgs/community/discussions/159749
Like Googles mandatory AI summary at the top of search results, you know a feature is really good when the vendor feels like the only way they can hit their target metrics is by forcing their users to engage with it.
What this tells me is that software enterprises are so hellbent in firing their programmers and reducing their salary costs they they are willing to combust their existing businesses and reputation into the dumpster fire they are making. I expected this blatant disregard for human society to come ten or twenty years into the future, when the AI systems would actually be capable enough. Not today.
Have you been sleeping under a rock for the last decade? This has been going on for a long long time. Outsourcing been the name of the game for so long people seem to forgot it's happening it all.
Passkeys. As someone who doesn't see the value of it, every hype-driven company seems to be pushing me to replace OPT 2FA with something worse right now.
Passkeys fix that.
Turns out that under certain conditions, such as severe exhaustion, that "sus filter" just... doesn't turn on quickly enough. The aim of passkeys is to ensure that it _cannot_ happen, no matter how exhausted/stressed/etc someone is. I'm not familiar enough with passkeys to pass judgement on them, but I do think there's a real problem they're trying to solve.
Something "$5 wrench"
https://xkcd.com/538/
Much more worried about what this is going to do to the FOSS ecosystem. We've already seen a couple maintainers complain and this trend is definitely just going to increase dramatically.
I can see the vision but this is clearly not ready for prime time yet. Especially if done by anonymous drive-by strangers that think they're "helping"
They are putting this in front of the developers as take it or leave it deal. I left the platform, doing my coding old way, hosting it somewhere else.
Discoverability? I don't care. I'm coding it for myself and hosting in the open. If somebody finds it, nice. Otherwise, mneh.
Other than that, I don't think this is bad tech, however, this brings another slippery slope. Today it's as you say:
> I think this process is intended for fixing papercuts rather than building anything involved. It just isn't good enough yet.
After sufficient T somebody will rephrase it as:
> I think this process is intended for writing small, personal utilities rather than building enterprise software. It just isn't good enough yet.
...and we will iterate from there.
So, it looks like I won't touch it for the foreseeable future. Maybe if the ethical problems with training material is solved (i.e. trained with data obtained with consensus and with correct licenses), I can use as alongside other analysis and testing tools I use, for a final pass.
AI will never be a core and irreplaceable part of my development workflow.
1. If we let intellectual property be a fundamental principle the line between idea (that can't be owned) and ip (that can be owned) will eventually devolve into a infinitely complex fractal that nobody can keep track of. Only lawyer AI's will eventually be able to tell the difference between idea and ip as the complexity of what we can encode become more complex.
2. What is the fundamental reason that a person is allowed to train on ip but a bot is not? I suspect that this comes down to the same issue with the divide between ip and idea. But there be some additional dimension to it. At some point we will need to see some AI as conscious entities and to me it makes little sense that there would be some magical discrete moment where an AI becomes conscious and gets rights to it's "own ideas".
Or maybe there's a simple explanation of the boundary between ip and idea that I have just missed? If not, I think intellectual property as a concept will not stand the test of time. Other principles will need to take its place if we want to maintain the fight for a good society. Until then IP law still has its place and should be followed but as an ethical principle it's certainly showing cracks.
Unless AI use becomes a KPI in your annual review.
Duolingo did that just recently, for example.
I am developing serious regrets for conflating "computing as a medium for personal expression" with "computing for livelihood" early on.
That’d be an insta-quit for me :)
may you please let me know where are you hosting the code ? would love to migrate as well.
thank you !
You can also self-host a Forgejo instance on a €3/mo Hetzner instance (or a free Oracle Cloud server) if you want. I prefer Hetzner for their service quality and server performance.
[0]: https://blog.bayindirh.io/blog/moving-to-source-hut/
I plan to use Source Hut for public projects.
For some research I use a private Git server. However, even that code might get released as Free Software when it matures enough.
Maybe that's how the microsoft employees are using it (in another IDE I suppose).
It is normal to preempt things like this when working with agents. That is easy to do in real time, but it must be difficult to see what the agent is attempting when they publish made up bullshit in a PR.
It seems very common for an agent to cheat and brute force solutions to get around a non-trivial issue. In my experience, its also common for agents to get stuck in loops of reasoning in these scenarios. I imagine it would be incredibly annoying to try to interpret a PR after an agent went down a rabbit hole.
Step 2. Automate the use of these LLMs into “agents”
Step 3. ???
Step 4. Profit
> It is my opinion that anyone not at least thinking about benefiting from such tools will be left behind.
This is gross, keep your fomo to yourself.
The AI agent/programmer corpo push is not about the capabilities and whether they match human or not. It's about being able to externalize a majority of one's workforce without having a lot of people on permanent payroll.
Think in terms of an infinitely scalable bunch of consultants you can hire and dismiss at your will - they never argue against your "vision", either.
If AI can change... well more likely can convince gullible c levels that AI can do those jobs... many jobs will be lost.
See Klarna "https://www.livemint.com/companies/news/klarnas-ai-replaced-..."
https://www.livemint.com/companies/news/klarnas-ai-replaced-...
Just the attempt to use AI and fail then degraded the previous jobs to a gig economy style job.
For refactoring and extending good, working code, AI is much more useful.
We are at a stage where AI should only be used for giving suggestions to a human in the driver's seat with a UI/UX that allows ergonomically guiding the AI, picking from offered alternatives, giving directions on a fairly micro level that is still above editing the code character by character.
They are indeed overpromising and pushing AI beyond its current limits for hype reasons, but this doesn't mean this won't be possible in the future. The progress is real, and I wouldn't bet on it taking a sharp turn and flattening.
reddit is a distillation of the entire internet on to one site with wildly variable quality of discussion depending upon which subreddit you are in.
Some are awful, some are great.
haha
Fun facts schadenfreude: the emotional experience of pleasure in response to another’s misfortune, according to Encyclopedia Britannica.
Word that's so nasty in meaning that it apparently does not exist except in German language.
Except it does, we have "skadeglädje" in Swedish.
Or MS already does that?
that's literally the bare minimum.
it also opens the PR as its working session. there are a lot of dials, and a lot of redditor-ass opinions from people who don’t use or understand the tech
No surprises here.
It always struggles on non-web projects or on software where it really matters that correctness is first and foremost above everything, such as the dotnet runtime.
Either way, a complete disastrous start and what a mess that Copilot has caused.
I have so far only found LlMs useful as a way of researching, an alternative to web search, and doing very basic rote tasks like implementing unit tests or doing a first pass explanation of some code. Tried actually writing code and it’s not usable.
And the quantity of js code available/discoverable when scrapping the web is larger by an order of magnitude than every other language.
OTOH webdev is known for rapid framework/library churn, so before too long there will be a crossroads where the pre-AI training data is too old and the fresh training data is contaminated by the firehose of vibe coded slop.
Spending massive amounts of:
- energy to process these queries
- wasting time of mid-level and senior engineers to vibe code with copilot to ensure train and get it right
We are facing a climate change crisis and we continue to burn energy at useless initiatives so executives at big corporation can announce in quarterly shareholder meetings: "wE uSe Ai, wE aRe tHe FuTuRe, lAbOr fOrCe rEdUceD"
And didn’t actually provide light, but everyone on 19th century twitter says that it will one day provide light if you believe hard enough, so you should rip out your gas lamps and install it now.
Like, this is just generation of useless busy-work, as far as I can see; it is clearly worse than useless. The PRs don't even have passing CI!
These tools should be locked away in an R&D environment until sufficiently perfected.
MVP means 'ship with solid, tested basic features', not 'Ship with bugs and fix in production'.
this stuff works. it takes effort and learning. it’s not going to magically solve high-complexity tasks (or even low-complexity ones) without investment. having people use it, learn how it works, and improve the systems is the right approach
a lot of armchair engineers in here
AI is aimed at eliminating the jobs of most of HN so it's understandable that HN doesn't want AI to succeed at its goal.