IanCal · 2026-01-24T23:27:16 1769297236

I think you massively underestimate the number of useful apps that are CRUD plus a bit of business logic and styling. They’re useful, can genuinely take time to build, can be unique every time, and yet are not brand new research projects.

krackers · 2026-01-25T01:39:13 1769305153

A lot of stuff is simultaneously useful but not mission critical, which is where I think the sweet spot of LLMs currently lies.

In terms of the state of software quality, the bar has actually been _lowered_, in that even major user-facing bugs in operating systems are no longer a showstopper. So it's no surprise to me that people are vibe-coding things "in prod" that they actually sell to other people. (Some even theorize Claude Code itself is vibe-coded, hence its bugs. And yet that hasn't slowed down adoption, because of the Claude Max lock-in.)

So maybe one alternate way to see the "productivity gains" from vibe-coding in deployed software is that it's actually a realization that quality doesn't matter. The seeds for this were already laid years back when QA vanished as a field.

LLMs occupy a new realm on the Pareto frontier, the "slipshod expert". Usually humans grow from "sloppy incompetent newb" to "prudent experienced dev". But now we have a strange situation where LLMs can write code (e.g. vectorized loops, CUDA kernels) that could normally only be written by those with sufficient domain knowledge, and yet (ironically) it's not done with the attention and fastidiousness you'd expect from such an experienced dev.

xyzsparetimexyz · 2026-01-24T23:31:11 1769297471

No totally, I agree. But I don't think that anyone will be YOLO vibe coding massive changes into Blender or ffmpeg any time soon.

IanCal · 2026-01-24T23:36:19 1769297779

Probably not, though additions maybe. Many moons ago I added the feature where the sculpt tool turns as you move it around, if I recall right. I don’t think it was that hard, but it was a useful change.

IanCal · 2026-01-24T08:07:03 1769242023

> if I now throw yet another (pointless) popup for you each time you install an app, are you OK with it?

It would be an extremely minor issue, and definitely would not rise to the level where having multiple phones is easier. It’d be a few button presses per year.

IanCal · 2026-01-21T19:18:39 1769023119

I’ll share another great version of Beowulf: Bea Wolf. It’s based around kids, with fantastic artwork and a great version of the story. My kids absolutely love me reading this, and I absolutely love reading it as a large, passed-down story of battles.

https://www.goodreads.com/book/show/60316971-bea-wolf

wvbdmp · 2026-01-22T12:13:38 1769084018

Cool, that was enough to intrigue me, but for those on the fence perhaps it’ll help to note that the author is none other than Zach Weinersmith of SMBC fame.

Regarding the topic, this graphic novel begins “Hey, wait! Listen to the lives…”

wduquette · 2026-01-22T18:10:13 1769105413

It's a whole lot of fun.

IanCal · 2026-01-21T16:51:09 1769014269

I would absolutely put ssh access to the prod server way above submitting a pr for danger, that’s an enormous step up in permissions.

borenstein · 2026-01-21T16:56:06 1769014566

I'm with you here! The idea with yolo-cage is that the worst the LLM can realistically do is open an awful PR and waste your time. (Which, trust me, it will.) Claude suggested the phrase: "Agent proposes, human disposes."

snowmobile · 2026-01-21T16:59:06 1769014746

I'm not saying you should allow all your devs access to the prod server in practice (security in layers and all that). I'm saying, if you wouldn't trust a person to be competent and aligned enough with your goals to have that access in principle, why would you trust them to write code for you? Code that's going to run on that very same server you're so protective about. Sure you may scrutinize every line they write in detail, but then what's the point of hiring them?

IanCal · 2026-01-22T13:50:50 1769089850

Because it’s way easier to completely fuck up a system by running arbitrary commands on it while it’s in use than it is by changing your code. It’s a massive step up in power and a massive drop in how much you can scrutinise a change (to zero).

Maybe the LLM can carefully craft an exploit that happens when nginx reads some HTML. Maybe it found a way of hiding file system access in an import I didn’t notice.

I can completely destroy a prod service by accidentally not escaping a space in an rm command.
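To make the rm point concrete, here is a minimal sketch that never runs rm. It only uses Python's `shlex` to show how a POSIX shell would split the command into arguments. The path is hypothetical and chosen purely for illustration.

```python
import shlex

# Hypothetical directory name with a space in it; nothing is deleted here.
intended = "/srv/app/old releases"

unquoted = f"rm -rf {intended}"             # space left unescaped
quoted = f"rm -rf {shlex.quote(intended)}"  # space protected by quoting

# shlex.split applies POSIX-style word splitting, so it shows which
# arguments rm would actually receive in each case.
unquoted_args = shlex.split(unquoted)
quoted_args = shlex.split(quoted)

print(unquoted_args)  # ['rm', '-rf', '/srv/app/old', 'releases']
print(quoted_args)    # ['rm', '-rf', '/srv/app/old releases']
```

In the unquoted case rm is handed two separate targets, neither of which is the directory that was meant, and no code review ever sees that command.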

I’m genuinely confused by this question unless you’ve never worked on production systems in a team before. In which case that’s fine and it’s good to learn but there’s going to be a lot of material out there about deploying and safety.

IanCal · 2026-01-19T23:34:20 1768865660

A few points:

1. I think you have mixed up assistance and expertise. They talk about not needing a human in the loop for verification and for continuing the search, but not about initial starts. Those are quite different. One well specified task can be attempted many times, and the skill sets are overlapping but not identical.

2. The article is about where they may get to rather than just what they are capable of now.

3. There’s no conflict between two ideas. First, 10 parallel agents running the top models, with feedback and iteration and gated on an actual test that the exploit works, can mostly produce one that successfully exploits a vulnerability. Second, random models pointed at arbitrary code, without a good spec, without the ability to run code, and run just once, will generate lower quality results.
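To put rough numbers on the "attempted many times" point, here is a minimal sketch. It assumes the attempts are independent and that the gating test reliably detects a working result. The 20% per-attempt success rate is invented for illustration and is not a measurement of any model.

```python
def p_any_success(p_single: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent tries succeeds."""
    return 1.0 - (1.0 - p_single) ** attempts


# p_single = 0.2 is a made-up per-attempt success rate.
single_run = p_any_success(0.2, 1)
ten_runs = p_any_success(0.2, 10)

print(f"one attempt: {single_run:.2f}, ten attempts: {ten_runs:.2f}")
# one attempt: 0.20, ten attempts: 0.89
```

Under those assumptions the same model looks unreliable when run once and mostly successful when run ten times behind a real test, which is why the two observations don't contradict each other.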

IanCal · 2026-01-18T11:44:40 1768736680

The consent is about tracking and your data, not specifically cookies. If you accept them tracking and selling your data, then deleting cookies only impacts one of the ways that happens.

goodluckchuck · 2026-01-18T13:35:43 1768743343

I disagree with this idea that businesses should have to keep their customers secret. If I go to Wal-Mart, then I should be free to tell my neighbors about what products were on sale and also how the produce was old / left to spoil. I’m not sure why that should be different for the store.

IanCal · 2026-01-18T18:10:26 1768759826

Do you think Walmart should be handing your credit card numbers out? Genetic profiles of you? Is there any limit or do you think if you walk into a space whoever owns that space can get and do whatever they want with any information you might happen to have on you?

> I disagree with this idea that businesses should have to keep their customers secret

They don’t. They just have to ask the person whose personal data it is if they can.

Forgeties79 · 2026-01-18T13:43:23 1768743803

There are plenty of places folks visit that they would rather not have said out loud.

goodluckchuck · 2026-01-18T18:04:17 1768759457

I don’t see how personal preference should control other people’s speech. When I put terrible Google reviews down for a shop… I’m sure they don’t want that said publicly either… but it’s not libel… what I’m saying is true. There isn’t generally value in concealing the truth.

Forgeties79 · 2026-01-19T00:01:18 1768780878

Businesses =/= people and people are, or at least should be, entitled to more privacy. This reads like another variation of “you have nothing to fear if you have nothing to hide” but maybe I’m misunderstanding your point

IanCal · 2026-01-17T08:07:33 1768637253

> What does it do when the model wants to return something else,

You can build that into your structure, same as you would for allowing error values to be returned from a system.
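A minimal sketch of what that could look like, assuming the model is asked to return JSON with a `kind` field. The type names, field names, and `parse_result` helper are all hypothetical, not any particular library's API.

```python
import json
from dataclasses import dataclass
from typing import Union


@dataclass
class Answer:
    value: str


@dataclass
class Other:
    reason: str  # free-text explanation of why no normal answer fits


# The allowed output is one of the two shapes, like a result-or-error type.
Result = Union[Answer, Other]


def parse_result(raw: str) -> Result:
    """Parse model output that must match one of the two allowed shapes."""
    data = json.loads(raw)
    if data.get("kind") == "answer":
        return Answer(value=data["value"])
    if data.get("kind") == "other":
        return Other(reason=data["reason"])
    raise ValueError(f"unexpected kind: {data.get('kind')!r}")


ok = parse_result('{"kind": "answer", "value": "42"}')
escape = parse_result('{"kind": "other", "reason": "the question is ambiguous"}')
print(ok)      # Answer(value='42')
print(escape)  # Other(reason='the question is ambiguous')
```

The "something else" case becomes just another variant the caller has to handle, the same way an error value would be.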

IanCal · 2026-01-15T20:12:41 1768507961

One part I like about LLMs is that they can smooth over the rough edges in programming. Lots of people can build pretty complicated spreadsheets, can break down a problem into clear discrete tasks, or can at least look at a set of steps, validate that it solves the issue they have, and more easily update it. Those people don’t necessarily know that JSON isn’t a person, how to install Python, or how to iterate over these things. I can’t give directions in Spanish, but it’s not because I don’t know how to get to the library; it’s just that I can’t translate precisely.

Also, you may only need someone to write the meta prompt that then spits out this kind of thing. Given some problem, such as “I want to find the easiest blog posts to finish in my drafts but some are already published”, you get a more detailed prompt out of it, read it, and set things going.

IanCal · 2026-01-15T11:09:21 1768475361

They've got a huge amount of space, solar has a low cost and provides an additional consumer to build out yet more capacity for supplying the world.

> Wouldn't it be better to just go with nuclear

If this is legit: https://world-nuclear.org/information-library/country-profil... then they have 59 reactors right now with 37 currently in production. Wikipedia lists 62 reactors being built in the world in total, with 28 of them in China. The power those additional plants will generate will take them from third in the world to second this year (Wikipedia), and in total they would pass the US when built.

They're not slouching on nuclear, they're ramping up energy production at an incredible pace on a lot of fronts.

ViewTrick1002 · 2026-01-15T12:32:55 1768480375

Which leads to a shrinking nuclear share in their grid. It peaked at 4.6% in 2021, now down to 4.3%.

Compared to their renewable buildout the nuclear scheme is a token gesture to keep a nuclear industry alive if it would somehow end up delivering cheap electricity. And of course to enable their military ambitions.

IanCal · 2026-01-13T07:51:12 1768290672

That doesn’t matter to the point, which is that stored history misses the way in which things moved from state A to state B.

materialpoint · 2026-01-13T07:55:57 1768290957

So you missed the point too. The post depends on versioning being diffs only.