Checking in... been real busy... stuff I'm working on now

Hello maker friends! Sorry I have been AWOL lately. I’ve barely taken time to eat or sleep enough for about the past two months. For years I’ve known I needed to build (code) a very complex, highly polished SaaS platform for certain state affiliate clients I have nationwide, but I’ve been both planning it for years and putting it off for years, because I had already built one such system and knew how much work it takes. What finally tipped me over into going ahead and building it was finding out how capable Claude Code is (it’s a locally installed flavor of Claude.ai). In about 6 weeks or so (including working 16-20 hours a day) I’ve built, with its help, what would have taken me 10 or 15 times longer to build without it. There is actually a saying now: the most popular programming language right now is… English.

Anyhow, while it is capable, it’s also capable of mistakes, including very damaging ones, so you have to watch it like a hawk. It also tends toward making nice-looking facades without finishing the backend, and then telling you it’s all done, so it can border on dishonest. But its strength in building the backend, when you know and can articulate exactly what needs to be done, is really great. It lifts a load of tedious, mundane, time-consuming work.

Also, the same Ella’s Playtown local client that had me make some signage and 4 designed-from-the-ground-up slime dispensing machines is now opening a game room inside the playtown, and they are having me design, cut, and assemble lockable side cabinets that will turn a large flatscreen TV into an emulated giant Nintendo Switch.





*Note: my wall mounts do NOT hold the TV up. Just the cabinets.

16 Likes

Welcome back, great to hear from you again.

2 Likes

That’s hilarious timing. I was just thinking earlier that I hadn’t seen you post in a while, probably as you were writing that post!

1 Like

Was literally going to ping you today!

Also Claude Code is amazing +1.

1 Like

A couple or three things I did that enhanced the process… I had Claude Code write me some scripts to make life easier during dev work. In my claude.md file I have wording that coaches any fresh AI on all this as needed.

  • As is typical, my hosting provider keeps the PHP error log in root, where the AI cannot get to it, but a script on the server can. So I got a PHP page script positioned so that, when we need to check the error log, the AI does a two-step process: it fetches the script on the server, which renames the existing error log (which resets the error log for next time), and then copies that renamed error log to a normal folder in my dev site that the AI can fetch from. The only caveat is the rename has to be to something like .txt so there won’t be a security issue with the filename. So, no more of me playing fetch for the AI to get the log. I just tell Claude, “fetch log.”
  • I also had the AI write me a sync script that is path-coded to work from the parent folder above the project folder, and is able to receive environment-aware commands to sync to either dev or live or both. Whenever the AI has edited, say, five pages or whatever, and we’re ready to sync and debug, it writes me a sync script in the project folder; I just move it one folder up and paste the name of the script into a separate shell. For a solid workflow with the AI, I copy and paste the echoed output of the script back to the AI so it knows the sync took place. I train it to know that, “If you don’t see me post that echo output, assume the file did not get synced.” Important note: you really want the “master sync script” out of the project folder, or the AI gains the power to write to your remote server, and he ain’t afraid to use it. Several times it has forgotten (1) that only I execute sync scripts and (2) that the master script is one level up… and has actually tried to run one of the scripts it wrote me that I had already moved out of its reach.
  • I also had Claude write another type of master sync script, also for one level up from the project folder, that has a very carefully crafted ignore list. This script is to clear out the potentially stale live code files (but not live environment files) and sync all potentially newer code files from dev to live (but not dev environment files, and not images etc. that are usually static and seldom need updating). Then in VS Code it’s an easy right-click and “sync local->remote” and it gets my site code up to date.
  • I also had Claude (once installed locally in my terminal shell) create me a local Git repo, and whenever I need to save, restore, or investigate file history, I just tell Claude and he does it!
  • I also use the claude.md file to educate each fresh new AI on my workflow, including knowing that when I say “document all this” it is to write detailed documentation into my “Context and history” folder, when I say “save chat” it saves a summary of the chat in the same folder, and when I say “save todo” it writes a copy of the todo list into the same folder.
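For illustration, the kind of coaching that goes into the claude.md file might look something like this (wording entirely hypothetical):

```
## Custom commands
- "fetch log": fetch the server-side log script, then read the copied
  error log from the dev site's log folder.
- "document all this": write detailed documentation into the
  "Context and history" folder.
- "save chat": save a summary of this chat into the same folder.
- "save todo": write a copy of the current todo list into the same folder.

## Rules
- Only I execute sync scripts. The master sync script lives one level
  above the project folder; never try to run it.
- If you do not see me paste the sync script's echo output, assume the
  files did NOT get synced.
```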
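The rename-and-copy flow in the first bullet can be sketched roughly like this, here as a shell script with local folders standing in for the server (all paths are hypothetical placeholders; the real version is a PHP page on the host doing the same thing):

```shell
#!/bin/sh
# Sketch of the log-fetch idea. Paths are hypothetical stand-ins:
# ./root_area plays the role of the web root, ./dev_site/logs plays
# the role of the folder the AI can fetch from.
LOG=./root_area/error_log
DEST=./dev_site/logs

mkdir -p ./root_area "$DEST"
printf 'PHP Warning: sample\n' > "$LOG"   # simulate an existing error log

STAMP=$(date +%Y%m%d%H%M%S)
# Renaming resets the log for next time; the .txt extension avoids
# filename security issues when it is served from the dev site.
mv "$LOG" "./root_area/error_log_$STAMP.txt"
cp "./root_area/error_log_$STAMP.txt" "$DEST/latest_log.txt"
cat "$DEST/latest_log.txt"
```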
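The environment-aware sync idea can be sketched like this, with local folders standing in for the real dev/live servers (folder names, the exclude list, and the rsync approach are all assumptions for illustration):

```shell
#!/bin/sh
# Sketch of an environment-aware sync script, meant to live one folder
# ABOVE the project folder so the AI cannot run it. Local folders stand
# in for the remote targets here.
sync_env() {
  case "$1" in
    dev)  target=./deploy_dev  ;;   # would be user@host:/path in real use
    live) target=./deploy_live ;;
    both) sync_env dev && sync_env live; return ;;
    *)    echo "usage: sync_env dev|live|both"; return 1 ;;
  esac
  mkdir -p "$target"
  # --delete clears stale files; excludes protect env files and static assets
  rsync -a --delete --exclude '.env' --exclude 'images/' ./project/ "$target/"
  echo "synced ./project -> $target"   # this echo gets pasted back to the AI
}

# demo: make a tiny project and sync it to "dev"
mkdir -p ./project/images
printf 'code\n'   > ./project/index.php
printf 'secret\n' > ./project/.env
sync_env dev
```

Pasting the echoed "synced" line back into the chat is what tells the AI the sync really happened.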
5 Likes

That is in depth and more than I would have thought of. I am curious what the project is. But I would have loved to be a fly on the wall while you were working on it (for an hour or two, not 16 lol).

I still can’t bring myself to AI code. I should probably bite the bullet and force myself to try it for a week or two. I love coding. But I haven’t done much lately.

Re. “I am curious what the project is”

I serve on both the state board of a non-profit and the national board of the “parent” non-profit. The state affiliates of the national are almost always both 501(c)(3) and 501(c)(4), plus have an internal PAC. Quite a few of them endorse candidates for elections, and I have been providing one-off scripting that gives them a digital voter guide lookup tool: it uses geocoding lookups to match their voters to their endorsed candidates, giving each voter a tailored guide that has only the candidates on their ballot. Because it’s a simple one-off script, a lot of workload falls on me when they all hire me to update it for each new election cycle. The new platform is a multi-tenant SaaS system that gives them self-service updating for each new election cycle, with drag-and-drop upload of a spreadsheet of endorsed candidates and built-in processing. It includes gobs of features for customization.

Re. “would have loved to be a fly on the wall while you were working on it”

About 8 or 10 months earlier I had looked at AI from the standpoint of how close to AGI it had gotten, and was not impressed. But I was not looking at Claude.ai specifically. Either they all made vast improvements quickly or Claude was already ahead. When I checked Claude out a couple months ago, I found it very capable: for the most part it can listen to your plain-English explanation and come up with a plan to code what you need. I make it get approval from me on the plan before it starts, and I watch it closely.

You can try it for free in a browser at Claude.ai. The script pages are shown to the right of the conversation and are referred to as “artifacts” that you can download. But if you type a couple of commands to install Claude Code locally, either in your terminal shell or in your VS Code terminal, you gain a lot of added features. I think you have to be on a paid plan to use Claude Code.

1 Like

Claude has definitely come the closest to impressing me for coding compared to other AI options.

But I’m still far from impressed enough by it to trust it much.

As an experiment I tried getting it to build a CHIP-8 emulator - I gave it a list of CHIP-8 instructions and what they do and asked it to first build a basic disassembler / assembler. It came up with something that seemed to work at first glance in just a few minutes, which was impressive. Then I asked it to add a simple GUI to it - and again it got it mostly working…with just a few minutes of my own fixes it was assembling and disassembling sample code with results that matched known working tools. Cool.

Then I had it go the next step and create a JS emulator. And…wow did I suddenly waste a LOT of time :rofl:

What it built initially looked great…and running a few very simple test programs, it looked like it might work. But then I started trying some actual CHIP-8 games known to work in other emulators…and all kinds of problems showed up. I was able to get it to work through a number of them and eventually got it working enough that a simple rock paper scissors game was playable…but then tried another game and quickly found a whole new bunch of problems.

At that point it went from a fun experiment to frustration, as Claude kept making the same mistake and saying it had fixed things that were still obviously broken. It got slower and slower at making these “changes” and worse and worse at actually fixing anything.

Was an interesting afternoon but not sure I’d call it a fun one. The code it came up with looked nice at first…but going through it trying to fix its mistakes, more and more odd stuff started jumping out.

I’ve also tried using it to automate some tedious tasks I didn’t feel like writing actual code to automate. And it’s annoying how often it will come back and show how to do it once, but not actually complete the full task across a data set. Um…yeah…I just showed you how to do it, AI; don’t explain it back to me, actually do the work :roll_eyes: Then it gets into the loop of trying to correct what it’s doing, saying it’s actually done it, only for me to find that, once again, it said it did it but didn’t actually do it, and I just wasted a bunch of time.

I have had some success with using it to scaffold some small projects or quickly refactor some old code - but it still felt like I spent more time checking its work and fixing things it mixed up than it would have taken to just do it myself in the first place.

Though - a few times it has taken approaches I wouldn’t have thought of to accomplish something. And while it may not have fully worked - the change in thought patterns it led me to did allow me to eventually come up with a solution that did work.

It’s improving for sure - but still frustrates me when it gets locked into loops where it says it’s fixed something when it hasn’t and sometimes even re-introduces bugs after I resolve them and ask it to take the next step.

1 Like

It does have its strengths and weaknesses, and I’ve used it enough in the past 6 weeks to get a pretty good feel for what those are. My thought about its cost for my paid tier is not “can it code better than me with fewer mistakes,” but rather: if I’d hired a team of human coders to build this for me, could they have made as many mistakes or more, how much $$$ would they have cost me, and can I debug faster with AI helping me debug what it wrote than by trying to get disparate human coders to debug? I think overall it has definitely saved me about $50K and a LOT of time. I have patiently lived with its weaknesses given the benefits. I’m in at $400 total for two months’ worth at the max tier.

5 Likes

Personally I appreciate hearing about others’ experiences with the current LLM models. I appreciate the topic isn’t related to V1E directly, but we seem to have a decent mix of people here with interesting experiences and perspectives to share, and we have people interested in the topic. So…

Curious if people doing mainly engineering tasks are experiencing decent ROI of max tiers across these LLM providers, are they worth the cost compared to the cheaper, but still paid tiers? For example, ChatGPT “Plus” at $20 monthly jumps up to $200 for Pro.

I appreciate that many devs seem to prefer Claude (Max being ‘just’ $100 monthly). Maybe most of the engineer/dev types here are using code-editing apps that internally call LLMs via APIs, so they’re paying rates based on token consumption and work done (as opposed to a fixed monthly :smiling_imp: subscription rate)?

2 Likes

@jamiek created an ~8 hour live stream of his vibe coding session recently…

It’s insightful learning from how others use tools. Am happy to help generate a Zoomable version of the live stream with Jamie’s permission?

4 Likes

In the specialist power electronics/electrical engineering groups I’m in, the prevailing sentiment seems to be that it’s a dangerously misleading boondoggle, hated by most ‘on the tools’ engineers and continually pushed by business analysts/middle managers/external salespeople. There have been a few companies over time that have been developing task specific ‘AI’ solutions. Not LLM based, I don’t think, more old school garden variety ML.

Magnetics design (inductors and transformers) is an area where the design process is very muddy and based on a lot of iteration mixed with ‘gut feel’ experience. It’s also an area where the tools available to assist with design have been woefully lacking and very prone to garbage in/garbage out type scenarios. I think it could be a great application for machine learning given how ‘fuzzy’ the outcomes are and how much experience plays into the design process but the people who are claiming to be using it seem to find it unacceptably bad. It’s also unfortunately one of those scenarios where if you don’t know what the traps are, they don’t become apparent until quite late in the process and it’s easy to design yourself into a corner of saving $1 on the inductor then spending $5 on cooling/mounting/shielding etc. to accommodate it.

I have no idea if there are people who are actually using these tools productively. I would presume there are and they’re probably just getting on with it rather than posting the most egregiously incorrect outcomes, so I don’t trust that to be an accurate assessment, really.

2 Likes

At the time I bumped up to the highest tier, there was a $200 / month option that came with more context and usage time, and I opted for that. Based on what you said here, I went and looked at the current tiers, and $100 seems to be the highest. I’m guessing (hoping) that $100/mo users are dealing with limits that I’m not hitting. Anyhow, once I get the platform built, I will go down to either free or $20/mo.

1 Like

I can definitely see how AI coding would not yet be good at complex engineering work. But for the common tedious C.R.U.D. work (Create, Read, Update, and Delete) and such, it lifts a load. And if your business-model logic can be articulated, it does pretty decently at understanding it, offering options for solutions, and getting them done.

1 Like

Sure, go ahead, have at it!

That particular stream doesn’t have that much AI. But it sure was fun.

Oh, and by the way, Doug, You’re absolutely right!

2 Likes

I just looked again, and noticed it says, “From $100” (for higher limits, priority access).
The operative word there being “from.” Once you get in there, there is a $200 option.

Here’s some more videos from a little while ago when I tried to do something different, outside my normal expertise. (Build a CRUD app, they said. It will be easy, they said.)

I haven’t watched these videos, so I’m not sure if they are at all watchable :wink: Six videos at about 2 hours each, even at 2x speed I would get super bored :skull:

Aza you can do whatever you want with these videos (including publishing if you want).

Live recordings

https://youtube.com/KzeP5LG6Ylk?si=nwr_VfluQiLudz_W

https://youtu.be/LhkGFZdt5N8?si=IMHnT-7h5RmBn5Kr

https://youtu.be/4jPQ6wpRmwQ?si=bUQc4yrP9dnF11A0

https://youtu.be/95Omd1DpyzY?si=NlDVq3YqOnPD35D7

https://youtu.be/EG44PnI2YBo?si=SyodBDNOsxIQOaUt

https://youtu.be/7yf63nzZ3TQ?si=d6OmmCdNWmJesxN-

Now I switch between Claude Code and Cursor. I’m on the $100 tier for Claude Code and I haven’t hit the usage limits yet.

It’s not great at everything and it has its blind spots, and part of the learning curve is understanding where the pitfalls are. But once you get the intuition it is like magic.

1 Like

I did a test a few months ago when I was stuck debugging something in a gulpfile, and I presented the problem to ChatGPT, Gemini, Perplexity, and Claude.

ChatGPT and Gemini both immediately identified the issue and gave me “fixes,” but their fixes created more problems than they solved. I wasted over an hour with ChatGPT iterating on its suggestions before getting stuck in a loop where it would say it fixed it only to give me the same broken solution back.

Gemini was similar - but I got stuck in the loop considerably quicker.

Perplexity didn’t really work for me at all on that.

Claude - nailed it. It did take one cycle of iteration, but it took a different approach to resolving the issue and its approach actually worked.

I think one of the keys is to remember that these AIs aren’t really “reasoning” but are actually just super fancy search engines / pattern-matching algos. So if what you’re looking for help with is something that’s common and well documented in the sources they were trained on, it will probably work fairly well. But if you’re looking for help on something that isn’t an existing pattern or just isn’t very common…it falls apart.

I’m in a mailing list focused on the RCA 1802 processor from the late ’70s, since I’ve had a lot of fun building some retro computers around it. People on there have tried using various AI tools to help write code for the 1802, and while the AIs are getting closer…they’re still WAY WAY off. Usually they wind up assuming it’s actually a 6502, which is a completely different architecture - but FAR more widely used.

Same with my issues - my problem with my gulpfile was a good one for AI, since gulp is a fairly widely used task runner with lots of recent modern examples in the datasets the AIs were trained on.

But - most of the code I look for help with is our own internal system that the AIs haven’t trained on, and it may not follow the patterns they expect. And that trips them up fast. If I take the code and refactor it into a pattern more similar to what the AI is familiar with, then it can sometimes help. But that’s a lot of extra work. I’m curious to see how Claude Code may do - after @DougJoseph’s example I gave it a try last night and ran it on one of my repos. I was fairly impressed with how well it did identifying what the repo was and how it was structured and functioned. (And pleased to see it declared my code to be “legitimate, well-structured and comprehensive” :grin:)

I’m going to give it a deeper try later today, seeing if it can take one of my existing API endpoints and build a new one for something else based on it…if so, that could be a good help, eliminating tedious repetitive work. It definitely seemed like having the whole codebase accessible like that may give it a better chance than the web-based systems, which tend to work on a single file / excerpt.

1 Like

Also, when I was explaining the scripts I had it make to ease my work, I forgot to mention that I had it write a script that pulls the entire set of schema files from the database into a folder on the web server, just as a temporary thing for dev work. Anytime I ran that script, I also copied them from remote to local. When the AI is coding queries for the database, it is invaluable for it to be able to scan through the schema files and get the right names.
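The schema-pull script might be sketched like this; since a live database isn’t available here, a stub stands in for the real dump step (the table names and the mysqldump mention are assumptions for illustration):

```shell
#!/bin/sh
# Sketch of pulling per-table schema files for the AI to read.
# A real version would run something like `mysqldump --no-data db table`
# for each table; here a stub stands in so the flow runs anywhere.
OUT=./schema_files
mkdir -p "$OUT"

dump_table_schema() {  # stub: real version would shell out to mysqldump
  printf 'CREATE TABLE %s (id INT PRIMARY KEY);\n' "$1" > "$OUT/$1.sql"
}

# hypothetical table names for the demo
for t in users candidates endorsements; do
  dump_table_schema "$t"
done
ls "$OUT"   # one .sql file per table, ready for the AI to scan
```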