ntfy.sh v2.18.0 was written by AI

ueiqkkwhuwjw@lemmy.world · edit-2 1 month ago

d15d@feddit.org · 1 month ago

They are not even trusting it themselves. This is from the release notes:

> I’ll not instantly switch ntfy.sh over. Instead, I’m kindly asking the community to test the Postgres support and report back to me if things are working

Fuck that.

Mirror Giraffe@piefed.social · 1 month ago

Classic “test in production” strategy, very solid!

Railcar8095@lemmy.world · 1 month ago

Test in production is the best. We spent months warning about data bugs and nobody batted an eye (upstream bug, not our responsibility, but we noticed). When it was launched in prod, we just pointed out that the bug nobody had fixed was still there, and immediately a war room was formed and the bug was fixed within an hour.

It honestly seems more efficient to let shit hit the fan than to fight everybody to do their job.

Mirror Giraffe@piefed.social · 1 month ago

For sure, the song of the hero who fixed the production bug is oft sung at meetings, but the loser who prevented the bug to begin with gets no credit.

hornedfiend@piefed.social · edit-2 1 month ago

Testing in production is the most idiotic concept of the last 10 years or so, which is mainly driven by incompetence of project managers.

Imagine if you get sold a car by a company, for 100k, then it starts having major issues and the car company tells you: “we’ll fix it”.

While that does not necessarily apply to software or services or webapps, the logic still stands. You are selling bugs to people. Bugs that could have been caught with some risk management and planning.

Edit: F-ing iOS keyboard.

Railcar8095@lemmy.world · 1 month ago

> which is mainly driven by incompetence of project managers.

I completely agree. I work on an internal solution, which is part of a very large product. It’s not a live product, only part of a pipeline that runs on a predetermined schedule. Our bit is the only one with actual business/performance KPIs; most of the other teams measure only “user story/CR points”. If the other teams screw up, it will impact our performance unless we prove it’s their fault. And if it’s their fault, they open a US/bug, which improves their metrics (one more US closed).

Our team has to think ahead and try to do things well in one go, because our bugfixing doesn’t count as work. But our speed is measured against people who benefit from half-doing stuff. When we put in massive effort, we got complaints that we were slow. Now we put in less effort, and once in a blue moon we have to do a hotfix.

More often than not, when we have a production issue it is due to the other teams that run before us in the pipeline, so we even had to develop checks on our input because they won’t add checks to their outputs. And they won’t because that’s a CR that requires extra funding that’s not approved, but we had to create them for our own sanity.
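The input checks mentioned above could look something like this. It is only a rough sketch: the record fields (`id`, `amount`) and the `validate_batch` helper are made up for illustration, since the actual pipeline isn’t described here.

```python
from typing import Iterable

# Hypothetical schema: fields every upstream record is assumed to carry.
REQUIRED_FIELDS = ("id", "amount")


def validate_batch(records: Iterable[dict]) -> tuple[list[dict], list[str]]:
    """Split upstream records into usable rows and human-readable errors."""
    good, errors = [], []
    for i, rec in enumerate(records):
        missing = [f for f in REQUIRED_FIELDS if rec.get(f) is None]
        if missing:
            errors.append(f"record {i}: missing {', '.join(missing)}")
        elif not isinstance(rec["amount"], (int, float)) or rec["amount"] < 0:
            errors.append(f"record {i}: invalid amount {rec['amount']!r}")
        else:
            good.append(rec)
    return good, errors


good, errors = validate_batch([
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": None},
    {"id": 3, "amount": -5},
])
```

The point of returning the errors instead of raising on the first one is that the whole list can be handed to the upstream team as proof of where the bad data came from.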

Yes, I’m looking to move out haha

hornedfiend@piefed.social · 1 month ago

A project is as good as its weakest point. While people might get butthurt by getting pointed at, a project is a group effort. Segregated teams are always a problem and almost always become a vulnerability.

Given current microservice architectures, we all have to get along with each other, for the greater good and the interest of the customer.

You sell shit, you get shit back. You sell high quality products with less obvious faults, you profit in the long run.

But no: “Let’s test in production”…

Railcar8095@lemmy.world · 1 month ago

Again, I agree and I’ve fought for that. But this needs to be top to bottom. We have our budget slashed and morale in the ground across the board. Those who keep trying for the best fight a losing battle with those who have already given up trying.

If the bosses don’t care about the interest of the “customer”, I don’t either. I’ve already openly spoken to my team saying I’m now ready for things to blow up and get the attention we need from the ones really high up. I’m done working overtime because another team is already working overtime on something else, or because of some bullshit political 4D chess where they throw us under the bus for their failings or try to make their work ours.

Had an annoying day with these things, sorry for dumping this here haha

callmemagnus@lemmy.world · edit-2 1 month ago

Consider a donation to help the people providing you with the open source software you seem to depend upon.

Using a helper tool to perform tasks on code, whether it is AI or the IDE’s internal features, can reduce the workload of benevolent developers who have not asked you to use their software.

Maybe the language was not appropriate but get real. With the little revenue generated by the usage of people complaining, the use of AI agentic coding might be the only way to bring features without pushing benevolent devs to burnout.

Edit: to bring, not to being!

Mirror Giraffe@piefed.social · 1 month ago

You are completely correct, and to be honest I’ve tested commercial product features in prod as well on teams that have the capacity to handle it and make a living on it, unlike this maintainer.

I’m also experimenting heavily with vibe coding and I think it has many uses for a seasoned programmer while getting a lot of flak.

Of course there are issues and problems with it, but for me it has been helping out a lot.

november@piefed.blahaj.zone · 1 month ago

Hmm, no, I think I’ll just uninstall.

ubergeek77@lemmy.ubergeek77.chat · 1 month ago

What happened to “reviewed and heavily tested over 2-3 weeks” from the release notes? Maybe Claude wrote that too lol

patrick@lemmy.bestiver.se · 1 month ago

It looks like that tool is more or less built by a single developer (you already trust their judgment anyways!), and even though the code came through in a single PR it was a merge from a branch that had 79 separate commits: https://github.com/binwiederhier/ntfy/pull/1619

Also glancing through it a bit, huge portions of that are straightforward refactors or even just formatting changes caused by adding a new backend option.

I’m not going to say it’s fine, but they didn’t just throw Claude at a problem and let it rewrite 25k lines of code unnecessarily.

mudkip@lemdro.id · 1 month ago

Any AI usage immediately discredits the software for me, because it calls into question all of their past and future work.

blarg_dunsen@sh.itjust.works · 1 month ago

Oh boy, do I have bad news about 90% of the internet for you…

mudkip@lemdro.id · 1 month ago

Linus sent an email recently to the Kernel Mailing List trashing AI slop and rejecting AI generated patches. The fact that he used it to play around with a script doesn’t invalidate the fact that he distrusts code written by LLMs when it actually matters.

5gruel@lemmy.world · 1 month ago

you mean this statement? https://www.theregister.com/2026/01/08/linus_versus_llms_ai_slop_docs/?td=rt-3a

If yes, your statement does not really match what Linus said.

prenatal_confusion@feddit.org · 1 month ago

Wow a differentiated opinion on AI use :)

fccview@lemmy.world · 1 month ago

Yeah, I mean, with or without AI, I’ve always had just one big pull request for releases, from a stable release branch into the main branch. The release branch would be a merge of various branches, or just be worked on directly at various stages.

One big pull request doesn’t really mean anything.

sloppy_diffuser@sh.itjust.works · 1 month ago

Something like https://graphite.com/ to create stacked PRs that are reviewable probably would have helped. Can be replicated with local LLMs or remote AI providers with locally configured agentic workflows. Never used graphite personally, but I’ve seen some open source maintainers use it to split up large PRs.

johntash@eviltoast.org · 1 month ago

Huh, I was wondering how rrds would help…

Erik-Jan@fosstodon.org · 1 month ago

@ueiqkkwhuwjw just this quote at the start of the release notes

> 14,997 added lines of code, and 10,202 lines removed, all from one pull request

This is already a major red flag even without the ai stuff right? Can’t believe anyone would flaunt that like this.

dev_null@lemmy.ml · edit-2 1 month ago

The “single pull request” is a merge release from 79 separate commits. It’s the sum of all work, it doesn’t mean all of it was changed in one go.

notabot@piefed.social · 1 month ago

I’m assuming this is some sort of canary message to indicate that the code base has been compromised, the author can’t talk about it, and everyone should immediately stop using the service. Surely no-one would be unwise enough to commit this otherwise?

Even ignoring the huge red LLM flag, a 25kLOC delta in a single PR should be cause for instant rejection as there’s no way to fully understand or test it, let alone in 2-3 weeks.

ExFed@programming.dev · 1 month ago

> 25kLOC delta in a single PR should be cause for instant rejection

Not to pick at nits, but it would be VERY different if it was 1k lines added and 24k lines removed. There’s something extremely satisfying about removing 10k+ lines of unnecessary code.
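To put numbers on that, here is a quick sketch comparing the figures quoted from the release notes with the 1k/24k case. The `diff_stats` helper is hypothetical, just for illustration; it isn’t part of ntfy or any git tooling.

```python
def diff_stats(added: int, removed: int) -> dict:
    """Summarize a diff from its added/removed line counts."""
    return {
        "touched": added + removed,  # total lines a reviewer has to look at
        "net": added - removed,      # how much the code base grew or shrank
    }


# Figures quoted from the v2.18.0 release notes.
actual = diff_stats(added=14_997, removed=10_202)
# The "mostly deletions" scenario described above.
hypothetical = diff_stats(added=1_000, removed=24_000)

print(actual)        # {'touched': 25199, 'net': 4795}
print(hypothetical)  # {'touched': 25000, 'net': -23000}
```

Both diffs touch about 25k lines, but one grows the code base by roughly 4.8k lines while the other shrinks it by 23k.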

notabot@piefed.social · 1 month ago

Sure, that would be a little different, but unless you could make a convincing argument, backed up with a solid set of unit tests, at the least, as to why and how you were able to remove that much code whilst only adding a comparatively small amount, I’d still be inclined to reject it and ask for it to be broken down into smaller units.

Now, that explanation might be something along the lines of it being dead code that is not called from anywhere, or even that it was a patched version of an upstream library, and the patch is now included in that upstream, in which case, fair enough, good work, and thanks very much. As a rewrite or refactor though, it’s too big to sensibly review and needs breaking down into separate features.

ExFed@programming.dev · 1 month ago

Absolutely, the author needs to be able to reason about their changes, no matter what. However, the reason I think the two situations are fundamentally different is that it’s a lot easier to validate the existence of features than it is the non-existence of bugs or malicious behavior. The biggest risk to removing code is breaking preexisting features, whereas the biggest risk to adding code is introducing malicious behavior.

henfredemars@infosec.pub · edit-2 1 month ago

Definitely share your initial concern. Without strong review processes to ensure that every line of code follows the intent of the human developer, there’s no way of knowing what exactly is in there and the implications for the human users. And I’m not just talking about bugs.

They say it’s reviewed, but the temptation to blindly trust is there. In this case, the developer appears to have taken some care.

> The code was written by Cursor and Claude, but reviewed and heavily tested over 2-3 weeks by me. I created comparison documents, went through all queries multiple times and reviewed the logic over and over again. I also did load tests and manual regression tests, which took lots of evenings.

Let us hope so. Handle with care to ensure responsibility is not offloaded to a machine instead of a person.

Slotos@feddit.nl · 1 month ago

The size of that changeset means that it’s inherently unreviewable.

The commit history is something I’ve seen only in the PRs that even the most dysfunctional companies would demand a rewrite for.

Also, 2-3 weeks review? PostgreSQL support could be added in that time without the need for a damn “vibe check”. Hell, it would probably take less time than that.

Mirror Giraffe@piefed.social · 1 month ago

To be fair they would have needed to spend time testing the manual implementation as well.

The problem I see mainly is that even if this rolls out perfectly, the erratic and changing nature of LLMs still makes it pointless as a proof of concept. Next time Claude might fuck up in a fringe way that’s not covered by unit tests and is missed by manual tests.

On the other hand, I guess I’ve been guilty myself on numerous occasions of introducing fringe bugs into production code, but at least I learn from it.

Slotos@feddit.nl · 1 month ago

I made my statement as a BDD/TDD practitioner.

The core goal of software engineering is not to deliver said code, but to deliver it in a framework that lets others (and consequently me in a week’s time) contribute easily. This makes both future improvements and bug fixes easier.

Dumping a ~25,000-line changeset with a git history that’s almost designed to confuse is antithetical to both engineering and open source.

Jul (they/she)@piefed.blahaj.zone · 1 month ago

Yeah, it could easily have added a couple of lines of code that send everything to North Korean hackers because it found that in a bunch of repositories, or that log passwords to public logs, or other things an experienced developer would never do. “AI” only replicates what it sees most often, and as more spam and junk repos are added to its training data because “AI” companies are too concerned with profit to teach it properly, it could do tons of random stuff. It’s like training a developer by giving them random examples from the internet rather than specific ones. Of course they pick up bad habits. Even if it “works” it is almost never efficient or secure.

nfreak@lemmy.ml · 1 month ago

Definitely time to find an alternative. What the actual fuck is this

rozlav@lemmy.blahaj.zone · 1 month ago

there is this repo that lists some slopware : https://codeberg.org/small-hack/open-slopware maybe someone can add it

cecilkorik@piefed.ca · 1 month ago

I think there’s room for a little bit of nuance that page doesn’t do a great job of describing. In my opinion there’s a huge difference between volunteer maintainers using AI PR checks as a screening measure to ease their review burden and focusing their actual reviews on PRs that pass the AI checks, and AI-deranged lone developers flooding the code with “AI features” and slopping out 10kloc PRs for no obvious reason.

Just because a project is using AI code reviews or has an AGENTS.md is not necessarily a red flag. A yellow flag, maybe, but the fact that the Linux Kernel itself is on that list should serve as an example of why you can’t just kneejerk anti-AI here. If you know anything about Linus Torvalds you know he has zero tolerance for bad code, and the use of AI is not going to change that despite everyone’s fears. If it doesn’t work out, Linus will be the first one to throw it under the bus.

witten@lemmy.world · edit-2 13 days ago

deleted by creator

EarMaster@lemmy.world · 1 month ago

Well it’s AI slop then - at least by the definition of most users here.

witten@lemmy.world · edit-2 13 days ago

deleted by creator

baner@lemmy.zip · 1 month ago

Upvote this guy

Xylight@lemdro.id · 1 month ago

the linux kernel is on that list, bro it’s time to switch!

paequ2@lemmy.today · 1 month ago

Time to switch to Plan9!

VeryFrugal@sh.itjust.works · 1 month ago

Also Chrome, Firefox and Ladybird!

WhyJiffie@sh.itjust.works · 1 month ago

did not know that the serde developer tolnay is a military apologist. I’m disgusted. serde is a very good tool… I’ll think about what to do about this. such a shame…

osanna@lemmy.vg · 1 month ago

oh no. not ladybird! You were supposed to save us!

addie@feddit.uk · 1 month ago

Awesome page, thanks. Have bookmarked.

Harfbuzz though? That’s going to take some replacing. Hopefully someone will fork an earlier version. The thing that it does (accurate multi-script font shaping) is difficult to do; requires a lot of rule-of-thumb knowledge that’s unlikely to be possessed by a single person, needs a lot of collaboration.

LiveLM@lemmy.zip · edit-2 1 month ago

Look, if he wanted to introduce AI code, whatever, but doing it all at once in a 14k line change is crazy.

Surely it would be better to introduce AI by letting it handle misc changes here and there instead of starting with the “biggest release ever done” (his words), no?

not_IO@lemmy.blahaj.zone · 1 month ago

we’re all so fucked

Kevin@lemmy.world · 1 month ago

I just set up a ntfy server for Unified Push earlier this week to use with Matrix. Now I have to turn around and immediately replace it…

Starfighter@discuss.tchncs.de · edit-2 1 month ago

Same here. Literally just set it up and now this.

I hope the author will roll this back or someone else makes a fork. I don’t want to immediately switch technology to XMPP/Matrix/… and have to do it all over again.

lambalicious@lemmy.sdf.org · 1 month ago

You could, in the meantime, simply not upgrade to the version that uses AI.

Since, from what I’m seeing around, people are having issues looking for an alternative.

poVoq@slrpnk.net · edit-2 1 month ago

If you use ntfy mainly as a Unified Push distributor on Android, then I highly recommend switching to an XMPP client that can do the same.

ueiqkkwhuwjw@lemmy.world · 1 month ago

I was also using it for notifications but I’ll probably switch to E-Mail for that and find an alternative UP distributor.

hoppolito@mander.xyz · edit-2 1 month ago

Conversations is working very well on my phone as UP distributor.

卩卄卂丂乇@lemmy.8th.world · 1 month ago

Do you recommend an app?

poVoq@slrpnk.net · 1 month ago

The first three on this list can do it: https://joinjabber.org/docs/apps/android/

Explanation here: https://joinjabber.org/tutorials/service/unifiedpush/

osanna@lemmy.vg · 1 month ago

Sigh. Time to switch to gotify

GreenKnight23@lemmy.world · 1 month ago

been using EMQX plus an MQTT client on my phone for a few months now, I like it better than gotify since the app was chewing through my battery like a vampire.

it might be better now since my issues happened three-ish years ago.

SayCyberOnceMore@feddit.uk · 1 month ago

This EMQX?

Seems it’s no longer FOSS?

I’ve been using Gotify for a few notifications from Home Assistant and it doesn’t appear to be eating my battery.

It’s a little more responsive than ntfy - sometimes ntfy doesn’t alert for ages after the trigger (could be phone power saving on the wifi…), but then I also get re-alerts from yesterday… not had that with Gotify.

GreenKnight23@lemmy.world · 1 month ago

that’s the one.

FOSS or not, it still runs just fine on my infra. I prefer it over something like rabbitmq because it has a pretty slick admin webgui.

I’ll have to give gotify another try.

shirro@aussie.zone · 1 month ago

I can see the pragmatic appeal. Maintaining a lot of code for an open source project is thankless. Go is designed for idiots like me so it makes sense that an llm should be able to emit code that mostly works. There are classes of errors that are less likely in Go and the compiler and linting will prevent some foot guns and then it would have been tested.

Ethically I hate anything to do with the llm industry and all it represents. I hate the environmental impacts. The social impacts. The disregard for intellectual property. The devaluing of human effort. The scam economics. I won’t use anything touched by it on principle, and if that means walking away from a dead Internet, so be it. There are enough pre-2020s books, audiobooks, movies, music and code to keep me interested for the rest of my life.

gregmiranda@lemmy.ml · 1 month ago

That’s it. Fuck AI.

uzay@infosec.pub · 1 month ago

Oh ffs…

Thanks for the heads-up

Possibly linux@lemmy.zip · 1 month ago

I’d run for the hills

There are so many issues with AI

NoFun4You@lemmy.world · 1 month ago

Like ppl thinking skilled engineers cannot vet AI output. AI is pretty good for programming.

IphtashuFitz@lemmy.world · 1 month ago

I have a few decades programming experience, as a professional software engineer, an open source developer, and a DevOps engineer. There is no way in hell I would do a code review where 15k lines were added and a similar amount of lines removed without having a long discussion with the person who made those changes. I’d want to ask a lot of detailed questions about the changes, questions that an LLM isn’t likely to answer, and most definitely not questions I’d be inclined to try to type into an LLM to try to get an answer.

Over the years I’ve dealt with all manner of bugs, from overflows & underflows, to bad assumptions about logic flow, and much much more. The whole purpose of pointed questioning of the author is to be comfortable with decisions made in the code and to minimize the chances of all sorts of potential bugs.

NoFun4You@lemmy.world · 1 month ago

I think it largely depends on what you’re building. You’re not gonna get what you’ve got over there overnight with a few giant prompts.

mic_check_one_two@lemmy.dbzer0.com · 1 month ago

And yet there are cases like the Huntarr debacle, where the dev simply thought that adding “and make sure your code complies with best security practices” to their vibe code prompts actually made it secure.

They added 14k lines of code in a week, and ripped out 10k lines of existing code. That’s not something that a skilled programmer can reasonably vet in that amount of time. This is showing all the signs of AI slop, and none of the signs of debugged or vetted code.

NoFun4You@lemmy.world · 1 month ago

That’s a bit extreme

thedeadwalking4242@lemmy.world · 1 month ago

It’s not. That’s the problem. It actually sucks ass. It’s super low quality for anything more complex than a very simple CRUD app or a simple function. I say this as someone who’s a heavy LLM user. It’s just bad code. It makes all kinds of simple mistakes. Just because code compiles doesn’t mean it’s good or does what you need it to do.

NoFun4You@lemmy.world · 1 month ago

I think it largely depends on what you’re building. You’re not gonna build a company overnight with a few prompts, but it’s much more powerful than you’ve described.

thedeadwalking4242@lemmy.world · 1 month ago

It’s really not though. If you think it is, I really suggest you rethink your perspective on what maintainable, shippable code looks like. It’s basically automating copying from Stack Overflow. There are so many little considerations that come into development.

NoFun4You@lemmy.world · 1 month ago

Still sounds like something someone would say who has had a bad time and experience with generating code lol. It really isn’t that hard to be an engineer and to get what you want out of code generation.

Ohi@lemmy.world · 1 month ago

You’re absolutely right, and the vast majority of people on this platform seem to get offended by anything AI related. Software engineers have been reviewing code made by other people since the dawn of the craft. Guess what y’all, AI generated code looks exactly the same, if not better on the first pass at creating a thing.

Down vote me all you want homies. You’re living in a fantasy if you think all AI is slop. Sure, I can see how it’s ruining some content on the Internet, but for code related tasks, it’s going to dramatically change the world for the better.

NoFun4You@lemmy.world · 1 month ago

Ppl are fucked lol, I’m over here writing a lot of stuff with AI, maybe it’s not always perfect but nothing ever is, and without iteration or dedication to the craft you’re just gonna sit there be all upset and judgy because you’ve never seen it lol

MerryJaneDoe@lemmy.world · 17 days ago

I think you would need to first make the case that software is making the world a better place. So far, it’s got a spotty record…

Ohi@lemmy.world · 17 days ago

The same thing happened to music when GarageBand and similar tools lowered the effort required to produce quality tracks. It took power away from the old gatekeepers and gave it to people with ideas but not traditional access. AI is doing that to software now.

Release v2.18.0 · binwiederhier/ntfy