A principal engineer at Google posted on Twitter that Claude Code did in an hour what the team couldn’t do in a year.

Two days later, after people freaked out, context was added: the team had built multiple versions over that year, each with its own trade-offs. All that context was given to the AI, and it produced a "toy" version. I can only assume it had similar trade-offs.

https://xcancel.com/rakyll/status/2007659740126761033#m

My experience has been similar to yours, and I think a lot of the hype comes from people like this Google engineer, who play into it and leave out the context. This sets expectations way out of line with reality and leads to frustration and disappointment.





That's because getting promoted requires thought leadership and fulfilling AI mandates. Hence the tweet from this PE at Google, another from one at Microsoft wanting to rewrite the entire C++ codebase in Rust, and a few other projects, also from MS, all about getting the right Markdown files, etc.

> A principal engineer at Google posted on Twitter that Claude Code did in an hour what the team couldn’t do in a year.

I’ll bring the tar if you bring the feathers.

That sounds hyperbolic, but how can someone say something so outrageously false?


As someone who worked at the company, I understood the meaning behind the tweet without the additional clarification. I think she assumed too much shared context when making the tweet.

A principal engineer at Google made a public post on the World Wide Web and assumed some shared Google/Claude context. Do you hear yourself?

Working in a large-scale org gets you accustomed to general problems in decision-making that aren't that obvious. I totally understood what she meant and nodded in my head with a "yeah, that tracks".

Maybe it helps them sleep at night.

People make mistakes; it's not that deep. The correct incentive to encourage is admitting them, and understanding and forgiving when necessary, because you don't want to encourage people to hide mistakes out of shame. That only makes things worse.

Especially considering that forgetting the delta between your shared context and someone else's is extremely common. And it's the least egregious mistake you can make when writing an untargeted promo post.


My bad. I will be more mindful tomorrow when someone at a big tech company yet again makes a mistake in the direction of AI hype. Maybe with a later addendum. Like journalists who write about a Fatal Storm In Houston, and you read down to the eighth paragraph and it turns out the fatalities were among pigeons.

> when writing an untargeted promo post.

lol.


> My bad. I will be more mindful tomorrow when someone at a big tech company yet again makes a mistake in the direction of AI hype.

Are you mad at them for playing the game, or mad that that's the game they have to play to advance at their company?

> Like journalists who write about a Fatal Storm In Houston, and you read down to the eighth paragraph and it turns out the fatalities were among pigeons.

I don't know; I guess I hold people who post on Twitter to self-promote, or who get attention because they work at $company, to a slightly different standard than I would hold a journalist writing a news article?


Who are you referring to here? If you follow the link, you will see that the Google engineer did not say that.

I am quoting the person that I responded to, who linked to this: https://xcancel.com/rakyll/status/2007659740126761033#m

> I’m not joking and this isn’t funny. We have been trying to build distributed agent orchestrators at Google since last year. There are various options, not everyone is aligned... I gave Claude Code a description of the problem, it generated what we built last year in an hour.

So I see one error. GP said "couldn't do". The engineer really said it matched what they'd already built.


The key words in the quote are "not everyone is aligned". It's not about execution ability.

May I ask about your level of experience and which AI you tried to use? I have a strong suspicion these two factors are rarely mentioned, which leads to miscommunication. For example, in my experience, up until recently you could get amazing results, but only if you had, let's say, 5+ years of experience AND were willing to pay at least $100/month for Claude Code AND followed some fairly trivial usage practices (e.g., using the "ultrathink" keyword, planning mode, etc.) AND weren't too lazy to actually read the output. Quite often people wouldn't meet one of those criteria and would call out the AI hype bubble.

From the very beginning everyone has told us "you are using the wrong model". Fast forward a year: the free models have become as good as last year's premium models, the results are still bad, and yet you hear the same message, "you are not using the latest model"… I just stopped trying the shiny new model each month and simply reevaluate the state of the art once a year, for my sanity. Or maybe my expectations are simply too high for these tools.

Are you sure you haven't moved the goalposts? The context here is "agentic coding", i.e. it does it all, while in the past the context was, to me anyway, "you describe the code you want, it writes it, and you check it's what you asked for". The latter does work on free models now.

When one is not happy with LLM output, an agentic workflow rarely improves quality, even though it may improve functionality. Now, instead of you making sure the LLM is on track at each step, it goes down a rabbit hole, at which point it's impossible to review the work, let alone make it do things your way.

There are people spending $5k a month on tokens. If your work generates 7-8 figures per year, that's peanuts, and companies will happily pay that per engineer.

This discussion is a request for positive examples to demonstrate any of the recent grandiose claims about AI-assisted development. Switching instead to attacking the credentials of posters only seems to supply evidence that there are no positive examples, only hype. It doesn't add to the conversation.

> would call out the AI hype bubble

Which is what it is: a tool that needs thousands of dollars and years of learning fees while being marketed as something that "replaces devs" in an instant. It is a tool, and when used judiciously by well-trained people, it works. To the extent that any large statistical text predictor would.


I’ve mostly used the $20-a-month Cursor plan, and I’ve gotten to the point where I can code huge things while rarely needing to do anything manually.

Yeah, that was bullshit (like most AI-related crap... lies, damn lies, statistics, AI benchmarks). Like saying my 5-year-old said words that would solve the Greenland issue in an hour. But the words were never put to the test, lol, just put on a screen, and everyone says woah!!! AI can't ship. That still needs humans.

That, uh, says a lot about Google, doesn't it?

Humans regularly design all of Uber, Google, YouTube, Twitter, WhatsApp, etc. in 45 minutes in system design interviews. So an AI designing some toy version is meh.

You're choosing to focus on specific hype posts (which were actually just misunderstandings of the original confusingly-worded Twitter post).

While ignoring the many, many cases of well-known and talented developers who give more context and say that agentic coding does give them a significant speedup, like Antirez (creator of Reddit), DHH (creator of RoR), Linus (creator of Linux), Steve Yegge, and Simon Willison.


Why not, in that case, provide an example to rebut and contribute, as opposed to knocking someone else's example, even if it was against the use of agentic coding?

Serious question: what kind of example would help at this point?

Here is a sample of (IMO) extremely talented and well-known developers who have expressed that agentic coding helps them: Antirez (creator of Reddit), DHH (creator of RoR), Linus (creator of Linux), Steve Yegge, Simon Willison. This is just randomly off the top of my head; you can find many more. None of them claim that agentic coding does a year's worth of work for them in an hour, of course.

In addition, pretty much every developer I know has used some form of GenAI or agentic coding over the last year, and they all say it gives them some form of speed-up, often a significant one. The "AI doesn't help me" crowd is, as far as I can tell, an online-only phenomenon. In real life, everyone has used it to at least some degree and finds it very valuable.


A lot of comments read like a knee-jerk reaction to the Twitter crowd claiming they vibe-code apps making $1M in two weeks.

As a designer I'm having a lot of success vibe coding small use cases, like an alternative to Lovable for prototyping in my design system and sharing prototypes easily.

All the devs I work with use Cursor; one of them (a frontend dev) told me most of the code is written by AI. In the real world, agentic coding is used massively.


Those are some high-profile (celebrity) developers.

I wonder if they have measured their results? I believe that the perceived speed-up from AI coding is often different from reality; this paper backs that idea: https://arxiv.org/abs/2507.09089. Can you provide data that contradicts this view, based on these (celebrity) developers or otherwise?


Almost off-topic, but this got me curious: how can I measure this myself? Say I want to put concrete numbers on this and actually measure it; how should I approach it?

My naive approach would be to just implement it twice, once together with an LLM and once without, but that has obvious flaws, most obviously that the order in which you do them impacts the results too much.

So how would I actually go about it and be able to provide data for this?


> My naive approach would be to just implement it twice, once together with an LLM and once without, but that has obvious flaws, most obviously that the order in which you do them impacts the results too much.

You'd get a set of 10-15 projects and a set of 10-15 developers. Each developer would implement each project twice, once with LLM assistance and once without. You'd counterbalance the order, ensuring half the developers did the LLM run first and the other half the traditional run first.

You'd only be able to detect large statistical effects, but that would be a good start.

If it's just you, then generate a list of potential projects, flip a coin for each to decide whether or not to use the LLM, and record how long it takes along with a bunch of other metrics that make sense to you.
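For concreteness, a minimal sketch of that coin-flip protocol in Python (the file name, field layout, and helper names are my own invention, just to make the idea runnable):

    import csv
    import random
    from datetime import datetime

    LOG = "llm_experiment.csv"  # hypothetical log file

    def assign_condition() -> str:
        # Flip the coin *before* starting the task, so the choice
        # can't be biased by how hard the task turns out to be.
        return random.choice(["llm", "no-llm"])

    def record(task: str, condition: str, minutes: float, notes: str = "") -> None:
        # Append one finished task to the log; analyze only after many trials.
        with open(LOG, "a", newline="") as f:
            csv.writer(f).writerow(
                [datetime.now().isoformat(), task, condition, minutes, notes]
            )

Over enough tasks you can compare mean time (or whatever metric you picked) between the two conditions; with one person and self-selected tasks you'll still only detect large effects, but at least the assignment is unbiased.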


The initial question was:

> I wonder if they have measured their results?

Which seems to indicate that there would be a suitable way for a single individual to measure this by themselves, which is why I asked.

What you're talking about is a study and beyond the scope of a single person, and also doesn't give me the information I'd need about myself.

> If it's just you then generate a list of potential projects and then flip a coin as to whether or not to use the LLM and record how long it takes along with a bunch of other metrics that make sense to you.

That sounds like I can just go by "yeah, feels like I'm faster", which I thought was exactly what the parent wanted to avoid...


> That sounds like I can just go by "yeah, feels like I'm faster", which I thought was exactly what the parent wanted to avoid...

No it doesn't, but perhaps I assumed too much context. Like, you probably want to look up the Quantified Self movement, as they do lots of social-science-like research on themselves.

> Which seems to indicate that there would be a suitable way for a single individual to be able to measure this by themselves, which is why I asked.

I honestly think "pick a metric you care about and then flip a coin to use an LLM or not" is the best you're gonna get within the constraints.


> Like, you probably want to look up the Quantified Self movement, as they do lots of social-science-like research on themselves.

I guess I was looking for something a bit more concrete, which one could apply themselves, and which would answer the "if they have measured their results? [...] Can you provide data that contradicts this view" part of the parent's comment.

> then flip a coin to use an LLM or not is the best you're gonna get within the constraints.

Do you think trashb, who asked the initial question above, would take the results of such an evaluation and say "Yeah, that's good enough and answers my question"?


I think it is a mix of ego and fear: basically "I'm too smart to be replaced by a machine" and "what am I gonna do if I'm replaced?".

The second part is something I think a lot about now after playing around with Claude Code, OpenCode, Antigravity and extrapolating where this is all going.


I agree it's about the ego. As for the other part, I am also trying to project a few scenarios in my head.

Wild guess no. 1: a large majority of software jobs will be complemented (mostly replaced) by AI agents, reducing the need for as many people doing the same job.

Wild guess no. 2: demand for creating software will increase, but demand for the software engineers creating that software will not follow the same multiplier.

Wild guess no. 3: we will have the smallest teams ever, with only a few people on board, perhaps leading to more companies being founded than ever before.

Wild guess no. 4: in the near future, the pool of software engineers as we know them today will be drastically downsized, and only the ones who can demonstrate that they bring substantial value over just using the AI models will remain relevant.

Wild guess no. 5: getting a job in software engineering will be harder than ever.


Nit: s/Reddit/Redis/

Though it is fun to imagine using Reddit as a key-value store :)


That is hilarious... and to prove the point of this whole comment thread, I created reddit-kv for us. It seems to work against a mock; I did not test it against Reddit itself, as I think that would violate the ToS. My prompts are in the repo.

https://github.com/ConAcademy/reddit-kv/blob/main/README.md
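For flavor, a minimal sketch of what the Reddit-as-KV idea could look like using PRAW (the class name and interface are my invention, not necessarily what reddit-kv actually does, and running it against real Reddit likely violates the ToS):

    import praw  # third-party Reddit API client

    class RedditKV:
        """Toy key-value store: put() creates a self-post titled with the
        key; get() returns the body of the newest post with that title."""

        def __init__(self, reddit: praw.Reddit, subreddit: str):
            self.sub = reddit.subreddit(subreddit)

        def put(self, key: str, value: str) -> None:
            # Every write is a new post, so the history doubles as a changelog.
            self.sub.submit(title=key, selftext=value)

        def get(self, key: str) -> str | None:
            # Reddit search is eventually consistent; reads may lag writes.
            for post in self.sub.search(f'title:"{key}"', sort="new"):
                if post.title == key:
                    return post.selftext
            return None

Latency, consistency, and rate limits are all terrible, which is of course the joke.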


Aaarg I was typing quickly and mistyped. :face-palm:

Thanks for the correction.


Citation needed. Talk, especially in the 'agentic age', is cheap.


