All of the above being said, I have two conspiracy theories:
The pricing on all of these is going to explode now that they’ve got everyone addicted to them.
Pretty soon the models will run a simple check: “Is this solution likely to work? If it’s >95% likely to work, feed the user a bogus result before fixing it,” so we’ll be paying for extra tokens without even realizing it’s happening.
So as much as I really like using the coding assistants, I’m also bracing for them to jump the shark and go the way of Netflix and other streaming platforms. Jack up the price and decrease the quality now that the masses can’t live without it!
I’ve had success with my trial of Claude Code. I wrote or improved plugins for WooCommerce that help me streamline my shop: Etsy to WooCommerce, my banner that shows deals, and a back-in-stock notification box with a double opt-in (that one costs a looot of money if you buy it, and it took 5 minutes). Same goes for the Filament Mixer and a new version of a Box Generator that I am currently working on (or rather the AI is)…
In general, I try to ASSume incompetence before maleficence. Maybe both are happening here.
Customer venting on https://www.reddit.com/r/GithubCopilot/ isn’t letting up. I’m ASSuming LLM providers thought they could optimize (algorithms/hardware) their way to SaaS-level profit margins, using IP stolen from humanity and orders-of-magnitude cheaper inference, before demand caught up with their capacity.
Cheaper inference tech seems to be coming, but not soon enough for the pockets of the VCs (subsidizing tokens) who desperately want their IPO exits and want to offload the COGS burden onto institutional/retail investors.
I have faith that dev customers are sharp and righteous enough to provide checks and balances: to notice, call out, and shut down shifty behavior.
I don’t understand finance stuff, or why LLM providers have such high valuations. Is their moat durable given the pace at which open-weight models are being released?
AI is still pretty good because it is in phase 1 of enshittification. Phase 3 is coming soon!
Btw, a good tip if you are still on Windows: use AI to switch to Linux. Get Arch, which is normally a bit of a pain to install, but with AI you can do it easily.
Running AI locally is quite a hobby. I have a decent server running, but would like to expand it for more capability.
I’ve had some success with coding using the local model, but I’ve had a LOT of success using local models for AI-driven inference in my projects.
One thing I do to save costs while coding is use the paid AI for the initial plan. I let it create a PLAN.md file where we work through 80% of the app. Then I hand that plan over to my local AI stack to implement. If the local model gets stuck trying to debug something, I send a very small and pointed prompt to the paid AI to fix just that one issue. Then I send it back to the local AI to continue.
Others have had more success than I have, but they’ve also invested a lot more in hardware for running the local models.
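The hand-off loop described above (paid model for the plan, local model for the implementation, paid model again only when the local one gets stuck) could be sketched roughly like this. It is only an illustration: `load_steps`, `run_plan`, `is_stuck`, and the idea that PLAN.md has one task per `- ` bullet are all hypothetical, and the two "models" are plain functions from prompt text to reply text that you would wire up to whatever stack you actually run.

```python
from typing import Callable, List, Tuple

# Hypothetical: each "model" is just a function from prompt text to reply text.
Model = Callable[[str], str]


def load_steps(plan_text: str) -> List[str]:
    """Pull tasks out of a plan. Assumes one task per "- " bullet (not specified in the thread)."""
    return [line[2:].strip() for line in plan_text.splitlines() if line.startswith("- ")]


def run_plan(
    steps: List[str],
    local_model: Model,
    paid_model: Model,
    is_stuck: Callable[[str], bool],
    max_local_tries: int = 2,
) -> List[Tuple[str, str, str]]:
    """Implement each step locally; ask the paid model only when the local one is stuck."""
    log = []
    for step in steps:
        reply = ""
        solved = False
        for _ in range(max_local_tries):
            reply = local_model(step)
            if not is_stuck(reply):
                solved = True
                break
        if solved:
            log.append(("local", step, reply))
            continue
        # Small, pointed prompt to the paid model: just this one issue.
        hint = paid_model(f"Fix only this issue: {step}\nLast attempt: {reply}")
        # Hand the hint back to the local model so it can continue.
        reply = local_model(f"{step}\nHint: {hint}")
        log.append(("paid+local", step, reply))
    return log
```

The point of keeping the escalation prompt that narrow is that paid tokens are only spent on the one thing the local model could not get past, not on re-reading the whole project.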
I’d really like to turn my computer into a local LLM box, giving it like a TB on an SSD (or whatever it needs), so I’d be able to integrate it into some tasks.
OpenCode (the open source AI coding agent) has some free models.
I was using the Big Pickle model in there and was quite happy with it, mostly for hacking/configuring stuff.
You can pay $10 per month and get a bigger quota. Mostly Chinese models.
I’ve used the free version for:
Configuring a non-standard router running OpenWrt. I was too lazy to do it myself, so I connected the board over the serial port and asked it to configure the network.
Making Linux kernel device tree fixes to get Wi-Fi working under Armbian running on some random Android TV box.
Creating an ESPHome configuration out of a Tuya smart socket firmware dump.
Getting an old Quadro GPU working with Ubuntu.
You run out of quota if you are trying to do heavy lifting, but it resets daily.
I’m now using it to make a DRO for the lathe, running on a CYD and using capacitive scales.
The advice is to run it on Linux or WSL. Don’t run it on Windows directly.
Running stuff locally would be nice, but it is too expensive now. You can build a strong CNC with servo motors for the cost of a GPU.
I use Claude Enterprise at work, and Opus has a 1M context, which is a big difference from the free models with smaller context. But you still need to set up claude.md, agents, and all of the stuff to get something reasonable out of it.
I’m still rocking my 2080 Super 8GB, so my options are limited but I’m gonna look into it some more! I do get a kick out of “making shit work” (as we all do here lol)… will report back if I get anything going on it
For AI, GPU memory is more important than speed. The more of the model you can cram into VRAM, the faster it runs.
I really want to try out the Intel B70 with 32GB, but the software side still looks like it needs a lot of work to get going. Vulkan does look promising.
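To put rough numbers on the VRAM point, here is a back-of-the-envelope sketch. It estimates weight size as parameter count times bits per weight, then checks how much of that fits on a card. The function names and the 1.5 GB overhead figure (context cache, buffers) are my own assumptions, and real usage varies with context length and quantization format, so treat the output as a ballpark only.

```python
def model_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in GB (1e9 bytes): params x bits / 8."""
    return params_billion * bits_per_weight / 8


def fraction_in_vram(
    params_billion: float,
    bits_per_weight: float,
    vram_gb: float,
    overhead_gb: float = 1.5,  # assumed guess for cache/buffers, not a measured value
) -> float:
    """Share of the weights (0.0 to 1.0) that fits in VRAM after reserving overhead."""
    usable = max(vram_gb - overhead_gb, 0.0)
    size = model_size_gb(params_billion, bits_per_weight)
    return min(usable / size, 1.0)


if __name__ == "__main__":
    # Example model sizes are illustrative, not recommendations.
    for name, params, vram in [("7B on 8GB", 7, 8), ("32B on 8GB", 32, 8), ("32B on 32GB", 32, 32)]:
        share = fraction_in_vram(params, 4, vram)
        print(f"{name}: 4-bit weights ~{model_size_gb(params, 4):.1f} GB, {share:.0%} fits in VRAM")
```

By this estimate a 4-bit 7B model fits comfortably on an 8GB card, while a 4-bit 32B model would only fit well under half of its weights there and the rest would have to run from system RAM, which is where the slowdown comes from.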