OK, I am gonna be the guy and put my skin in the game here. I kind of get the hype, but the experience with e.g. Claude Code (or GitHub Copilot previously, and others as well) has so far been pretty unreliable.
I have a Django project with 50 kLOC and Claude Code is pretty capable of understanding the architecture, style of coding, naming of variables, functions etc. Sometimes it excels on tasks like "replicate this non-trivial functionality for this other model and update the UI appropriately" and leaves me stunned. Sometimes it handles a tedious and laborious task like "replace this markdown editor with something modern, allowing fullscreen edits of content", makes an annoying mistake that only a visual check reveals, and is not able to fix it after 5 prompts. I feel like I am becoming a tester more than a developer and I do not like the shift. Especially since I do not like telling someone they made an obvious mistake and should fix it - it seems I do not care if it is human or AI, I just do not like incompetence I guess.
Yesterday I had to add some parameters to a very simple Falcon project and found out it had not been updated for several months and won't build due to some pip issues with pymssql. OK, this is a really marginal sub-project so I said - let's migrate it to uv, let's not get our hands dirty, and let Claude do it. He did splendidly, but in the Dockerfile he missed the "COPY server.py /data/" line even though I had asked him to change the path... Build failed, I updated the path myself and moved on.
And then you listen to very smart guys like Karpathy who rave about Tab, Tab, Tab, while not understanding the language or anything about the code they write. Am I getting this wrong?
I am really far far away from letting agents touch my infrastructure via SSH, access managed databases with full access privileges etc. and dread the day one of my silly customers asks me to give their agent permission to managed services. One might say the liability should then be shifted, but at the end of the day, humans will have to deal with the damage done.
My customer, who uses the whole codebase I am mentioning here, asked me if there is a way to provide "some AI" with item GTINs and let it generate photos, descriptions, etc. including metadata they handcrafted and extracted for years from various sources. While it looks like a nice idea, and for them a chance to decrease the staff count, I got the feeling they do not care about data quality anymore or do not understand the problems they are bringing upon themselves through errors nobody will catch until it is too late.
TL;DR: I am using Opus 4.5, it helps a lot, I have to keep being (very) cautious. Wake up call 2026? More like waking up from a hallucination.
Hate it as well. Why should I bother learning about zones and abstracting away ports, addresses, interfaces etc. only to find out pretty soon that my bare-metal server actually always needs fine-grained rules, at least from firewalld's point of view?
Do both. Using the provider's firewall service adds another level of defence. But hiccups may occur and firewall rules may briefly disappear (sync issues, upgrades, VM mobility issues), and your services may then become exposed. It happened to me in the past; I was "lucky" enough that no damage was done.
My client sells on Amazon in Europe and is constantly harassed for presumed IP infringement, safety issues etc., usually due to somebody else either incorrectly renaming an item or an item name containing some trigger like "life", "battery" or some other brand's name. I always wonder how examples like yours are possible there at all.
Sheer volume mostly. Lots of scammy companies create new accounts to sell products until someone complains, then they abandon the account and start a new one. Basically the same as most spam operations.
I have owned a Model S since 2018, now driving my second one (Raven).
The first one just had the poor manufacturing quality Tesla has always been known for.
The second one, however, is a disaster. So many sounds while turning the steering wheel, driving on tiny slopes, braking etc. forced me to convince the service center to replace the suspension, arms & half-shafts (all under warranty). None of these steps helped; the service center proposed trying another service center "where they may have more experience" and called the squeaking a feature of the vehicle.
I visited a third-party garage, got more information about possible sources and concluded it is probably impossible to fix.
So here I am, driving an $80k car that squeaks like a 30-year-old rusty Ford, attracting attention in parking lots.
Won't be buying another one for sure.
The error was to buy a second one after "the first one was just poor manufacturing".
I never saw manufacturing quality improve over time from car companies.
After my Nissan car started to have transmission problems that would cost thousands of dollars to fix (among various other small issues), I sold it as quickly as possible and swore I'll never touch the make again.
Subaru burned me on this. I bought my wife an Outback. It started to have transmission issues, with a full transmission failure at about 145k miles. This is after a life of small problems here and there that didn't really impact performance.
It was a known issue between 125 and 150k miles. Subaru's solution was to extend the warranty to 100k, as if that did anything at all.
We got rid of the broken one, and the one that I drove as well. I'll never go back. I loved those cars, but that's so shady.
I am too tall (long legs) to feel comfortable in the Model 3, the range of the 75D was a bit limiting when traveling with family, and I couldn't and still cannot imagine driving an ICE car again. The manufacturing did not bother me too much in the interior, as all cars I ever owned or rented had loose, squeaky plastics, and the body panel gaps were tolerable for me.
In this case, the 4090 is far more memory efficient thanks to ExLlamav2.
70B in particular is indeed a significant compromise on the 4090, but not as much as you'd think. 34B and down though, I think Nvidia is unquestionably king.
Doesn't running 70B in 24GB need 2 bit quantisation?
I'm no expert, but to me that sounds like a recipe for bad performance. Does a 70B model in 2-bit really outperform a smaller-but-less-quantised model?
2.65bpw, on a totally empty 3090 (and I mean totally empty).
I would say 34B is the performance sweetspot, yeah. There was a long period where all we had in the 33B range was llamav1, but now we have Yi and Codellamav2 (among others).
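For a rough sense of where the 2.65bpw figure comes from, here is a back-of-the-envelope sketch. It counts the weights only and ignores KV cache, activations and runtime overhead, so treat it as a lower bound; the function name is made up for illustration and "GB" is decimal.

```python
def weight_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in decimal GB.

    Assumption: every parameter is stored at the same average bit width;
    real quantised formats add some per-group metadata on top of this.
    """
    total_bits = params_billion * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9


if __name__ == "__main__":
    # Compare a 70B model at a few bit widths with a 34B model at 4 bpw.
    for params, bpw in ((70, 2.0), (70, 2.65), (70, 4.0), (34, 4.0)):
        print(f"{params}B @ {bpw} bpw: {weight_size_gb(params, bpw):.1f} GB")
```

By this estimate 70B at 2.65bpw is a bit over 23 GB of weights, which is why the card has to be totally empty, while 34B at 4bpw is around 17 GB and leaves room for context.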
Datacenters in my country usually had some rooms with tower servers 20 years ago; in fact, my first colo was for a tower server I brought in a large backpack:-). But density requirements, cold/hot aisles etc. prevailed and towers are generally considered inefficient for datacenter purposes.
And then you have the Hetzner datacenter, which probably everyone I know who runs DCs would ridicule, yet they would not be able to handle a fan replacement as quickly. I wonder how many rack server chassis are recycled each year because the manufacturer just won't let you reuse them with a new motherboard or power supply due to a new shape, design, port placement etc.
> I would hope that after a couple of hours downtime, they'd bring up a fresh machine with Ansible or whatever.
It is not just about a fresh machine, which hopefully sits in each datacenter. I can imagine they needed a clone of the system due to the design of the fly.io service, and that's where the "fun" begins.
That's funny, especially when one is constantly harassed by Amazon to prove one is not selling counterfeits ("hey, these Nike sneakers tons of others are selling are infringing some IP, fix it!"), dealing with products named "XYZ Winter Life Jacket" which Amazon immediately bans because they think it is a lifejacket and not a jacket, filling in forms for developers full of absurd questions, being unable to prove the identity of your employees if they do not have a recent utility bill in their name (wives in Europe often don't) etc. And don't even get me started on customers keeping ordered items, claiming they never received them, and Amazon ignoring DHL shipment tracking data - one of my clients got their account suspended for this and Amazon demanded that we provide a plan for preventing these situations...
Yes. It costs next to nothing to illicitly register hundreds of fake seller accounts. Legitimate sellers who get caught in the bureaucracy stay banned because they are too law-abiding to circumvent, while scammers just play statistics to keep ahead of the lockouts.
After a few years of this, what do you imagine the Amazon marketplace looks like? I don't think Amazon deliberately rewards scammers or punishes legit sellers, but the system they have built has that result.
I found some threads mentioning sync, so just my 5 cents - it is possible to create a vault on iCloud Drive and thus have the content synced across devices. I am not sure if the Sync feature provides more functionality, but I personally would not need anything else.