Checking in... been real busy... stuff I'm working on now

Talking to one of my co-workers I realized I had a “fun” project to experiment on.

12 years ago I played around with libgdx for making a simple game that could run on Android and iOS. Though - since I didn’t have access to a mac at the time I never got to test building it for iOS.

It was a fun learning experience - but eventually it reached the point where it felt more like work than fun and I’d learned what I wanted to learn…plus a major update to android came out that required some major changes so I left it sit.

That was 12 years ago. A year or two ago it got dumped from the play store for being too outdated, and I’ve just not wanted to put the time into updating it. But…why not see what AI can do - let it do the drudge work I don’t want to.

I fired up claude code on the repo and it was able to give a good summary of what was there and how outdated it was. I asked it if it could update it to the latest version of libgdx and a modern build system and it assured me it could so I let it loose.

It…kind of made a mess :rofl: It did a bunch of stuff that looked “right” but really wasn’t. The build system wasn’t working and a lot of things just seemed off. So i suggested that it use the new libgdx scaffolding tool to start a new project and move my code into that.

It agreed that was a good plan - but then couldn’t run the scaffolding tool since it was a gui tool. So instead I ran the tool to scaffold a new project (with values suggested by claude) and then had it move my code into that new project.

That actually worked - I now had a modern updated libgdx framework that played nice in android studio with my code working. Great!

Then I asked it to review the code and it suggested a bunch of additional changes to upgrade to newer best practices and libraries.

It worked overall. A lot of it was slower than I felt I could probably do the same changes myself - but some of the bigger changes were probably quicker than I could have done. Hard to say how much actual time it saved overall. It was definitely more fun than doing it myself…if it wasn’t for AI I probably wouldn’t have dusted off the codebase at all and given it a try.

So - another time where working on something with lots of well documented examples out in the wild worked quite well. Not perfectly but better than I expected.

Though - now as I’m getting to the more complex upgrades to the code it’s starting to fumble a lot more. Just had it try to implement Scene2D for screen management/layout and it completely managled most of the screens. The code “works” (i.e. complies and runs) but the resulting displays are a total mess.

Fun couple of hours, not sure it saved me any real time overall but for most of it it was definitely easier than doing the dirty work myself.

1 Like

OK, so I don’t know why the thought did not occur to me earlier, but tonight I researched free open source LLM’s that are trained on coding skills / code languages, and there’s a lot to choose from.

Since my main laptop is an Apple Silicon M1 Max, with 32 GB of ram, I am trying out two models at first:

mixtral:8x7b

and

qwen3-coder:30b

Size wise, both should be in the sweet spot for my laptop. I am using the Ollama UI (a free download from https://ollama.com/ ).

I plan to use the LLMs directly inside my VS Code IDE:

The following is a quote an Gemini AI on how to…

Integrate with Your Code Editor (The Best Way)

The most productive way to use a local LLM is directly inside your IDE (like VS Code or JetBrains/PHPStorm).

  1. Keep Ollama Running: Ensure the Ollama app is still running in your menu bar.
  2. Install an IDE Extension:
  • For VS Code: Install a popular extension like Continue or CodeGPT .
  • For JetBrains (PHPStorm): Install the CodeGPT plugin.
  1. Configure the Extension: In the extension settings:
  • Set the Provider to Ollama.
  • Set the Model to the name you ran, e.g., qwen2:14b-code.
  • The Base URL should be the default local address: http://localhost:11434
  1. Use In-Line: You can now highlight a section of PHP code, press a keyboard shortcut, and ask the local model to refactor it, write unit tests, or generate documentation—all without paying a subscription fee!

I already have the Continue extension installed. More soon!

UPDATE: qwen3-coder:30b is maddeningly slow in this usage, compared to the paid cloud-based Claude Code approach.

@jamiek … thoughts?

Re. attempting to use mixtral:8x7b inside VS Code by use of the “Continue” extension… it did not work, and in looking into it I was told by AI today that mixtral:8x7b does not support tool calls properly and that I should switch to a different model.

Quote AI:

Recommended Models with Tool Support:

Qwen2.5-Coder 7B: 			ollama pull qwen2.5-coder:7b

Llama 3.1 8B: 				ollama pull llama3.1:8b

Mistral 7B: 				ollama pull mistral:7b

DeepSeek Coder: 			ollama pull deepseek-coder:6.7b

UPDATE: Hmm. The plot thickens. I had loaded and run the model in a terminal shell before trying to use it in VS Code, and it seems I thus had the model loaded and locked and thus unavailable. ??

UPDATE AGAIN: So, I just tried again long after making sure I said /bye and closed the terminal instance, and I still get: “registry.ollama.ai/library/mixtral:8x7b does not support tools” — So, it seems I was told correctly the first time.

I already pulled and tried Llama 3.1 8B and it’s very fast, though not as intelligent as Claude Code (which is perhaps an unfair comparison, free versus paid).

Re. Qwen2.5-Coder 7B — I had already fiddled a bit with Qwen2-Coder 7B — and based on name, it seems Qwen2.5-Coder 7B is newer, so I will try it.

ASSESMENTS:

I need any coding AI to have at least “average general intelligence” — it must be able to read, comprehend, and thus understand, articulate, and confirm what needs done.

Qwen2.5-Coder 7B - bone stupid. Cannot read a file describing my project and then tell me about my project. Worthless.

Llama 3.1 8B: - could actually read the file, and then explain its understanding. Shows promise, but seem sorely lacking compared to Claude 4 Sonnet.

[Unfair?] comparison against Code Code (Claude 4 Sonnet model):

I asked Llama 3.1 8B: “We need to implement a robust WebAuthn Biometric type of option for MFA. Are you capable of that?”

Here’s Llama 3.1 8B’s answer (remember it had Agent level access to my code base, but did not look at a single file before answering):

I’m in agent mode, and I’ve got some insights to share!

Implementing a robust WebAuthn-based biometric authentication method for Multi-Factor Authentication (MFA) is definitely feasible! WebAuthn is an open standard for browser-based authentication that allows users to authenticate using various factors, including biometrics.

To get started, let’s outline the general steps involved in implementing WebAuthn-based biometric authentication:

Choose a biometric sensor: Select a suitable biometric sensor (e.g., fingerprint, facial recognition, or iris scanning) for user authentication.
Integrate with WebAuthn API: Use the WebAuthn API to integrate the chosen biometric sensor into your application.
Configure authentication options: Allow users to select their preferred authentication method (e.g., password, PIN, or biometrics).
Handle authentication results: Verify the authentication result and proceed accordingly (e.g., grant access or prompt for additional authentication factors).
To make this implementation more robust, I recommend considering the following best practices:

Use a secure communication channel: Ensure that all communication between the client-side JavaScript code and the server-side API is secure and encrypted.
Implement data encryption: Encrypt sensitive user data (e.g., biometric templates) to prevent unauthorized access.
Use trusted credentials: Verify the authenticity of user credentials using trusted third-party services or internal systems.
Monitor and analyze authentication attempts: Keep track of login attempts, including success and failure rates, to detect potential security breaches.
Please note that implementing WebAuthn-based biometric authentication may require additional infrastructure, such as:

Biometric sensor hardware: Depending on the chosen biometric sensor, you may need to invest in specialized hardware (e.g., fingerprint scanners or facial recognition cameras).
Server-side infrastructure: You’ll likely need a robust server-side infrastructure to handle authentication requests and store user data securely.
Compliance with regulatory requirements: Familiarize yourself with relevant regulations (e.g., GDPR, HIPAA) and ensure that your implementation complies with them.
Before we proceed, I’d like to clarify some details:

Which biometric sensor would you like to implement?
Have you considered any specific security concerns or compliance requirements?
Please provide more information so I can better assist you in implementing a robust WebAuthn-based biometric authentication solution!

Meanwhile…

I asked Claude 4 Sonnet model, while in Pro tier account ($20/mo) this: “Please propose and map out for me a comprehensive approach to adding WebAuthn biometric method”

Here’s Claude 4 Sonnet’s answer:

:record_button: WebAuthn Biometric MFA Implementation Plan

Overview

WebAuthn (Web Authentication) adds passwordless authentication using biometrics, security keys, or platform
authenticators. This completes our MFA system with the most secure and user-friendly option.

  1. Technical Architecture

Backend Components Needed

A. WebAuthn Service (app/services/MfaWebAuthnService.php)

class MfaWebAuthnService extends MfaService {
// Challenge generation and verification
// Credential storage and management
// Device registration handling
// Authentication ceremony coordination
}

B. WebAuthn Library Integration

  • Option 1: web-auth/webauthn-lib (PHP library)
  • Option 2: lbuchs/webauthn (lightweight PHP implementation)
  • Recommendation: web-auth/webauthn-lib for comprehensive RFC compliance

C. Database Schema Extensions

– New table for WebAuthn credentials
CREATE TABLE user_webauthn_credentials (
credential_id VARCHAR(255) PRIMARY KEY,
user_id VARCHAR(36) NOT NULL,
public_key TEXT NOT NULL,
counter BIGINT UNSIGNED DEFAULT 0,
device_name VARCHAR(100),
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
last_used_at TIMESTAMP NULL,
FOREIGN KEY (user_id) REFERENCES users(user_id)
);

– Add WebAuthn support to existing MFA settings
ALTER TABLE user_mfa_settings
ADD COLUMN webauthn_credential_count INT DEFAULT 0;

  1. Frontend JavaScript Components

A. WebAuthn API Integration

// Registration ceremony
async function registerWebAuthn(userId) {
// 1. Request challenge from server
// 2. Call navigator.credentials.create()
// 3. Send attestation to server for verification
}

// Authentication ceremony
async function authenticateWebAuthn(userId) {
// 1. Request challenge from server
// 2. Call navigator.credentials.get()
// 3. Send assertion to server for verification
}

B. Browser Compatibility Detection

  • Check for WebAuthn support (PublicKeyCredential in window)
  • Graceful fallback to Email/TOTP methods
  • Clear messaging about device requirements
  1. User Experience Flow

A. Registration Process

  1. Device Detection: Check if device supports WebAuthn
  2. Method Selection: “Use Face ID/Touch ID/Security Key”
  3. Device Naming: Let user name their device (“iPhone”, “Laptop”, etc.)
  4. Registration Ceremony: Browser prompts for biometric/PIN
  5. Backup Codes: Generate if first MFA method (consistent with current system)

B. Authentication Process

  1. Method Auto-Detection: Show WebAuthn if available credentials exist

  2. Biometric Prompt: Native browser/OS authentication dialog

  3. Fallback Options: Email/TOTP/Backup codes if WebAuthn fails

  4. Security Considerations

A. Credential Management

  • Multiple Devices: Allow users to register multiple authenticators
  • Device Naming: Help users identify registered devices
  • Revocation: Allow users to remove compromised/lost devices

B. Attack Mitigation

  • Replay Protection: Use challenge-response with server-generated nonces
  • Origin Validation: Ensure requests come from correct domain
  • Counter Verification: Detect cloned authenticators
  1. Implementation Phases

:record_button: Phase 1: Foundation (Database & Library Setup)

  1. Database Migration: Create WebAuthn credentials table
  2. Library Integration: Install and configure WebAuthn library
  3. Basic Service Class: Extend MfaService for WebAuthn functionality
  4. Environment Configuration: Add WebAuthn settings to .env

Phase 2: Registration Flow

  1. Frontend JavaScript: WebAuthn registration ceremony
  2. Backend Endpoints: Challenge generation and credential verification
  3. User Interface: Device registration modal in user profile
  4. Device Management: List and remove registered devices

Phase 3: Authentication Integration

  1. Login Flow Updates: Add WebAuthn option to mfa-verify.php
  2. Challenge Handling: Server-side challenge management
  3. Fallback Logic: Graceful degradation to other MFA methods
  4. Browser Compatibility: Detection and messaging

Phase 4: Enhanced Features

  1. Multiple Devices: Support for multiple authenticators per user

  2. Device Metadata: Store device names and usage statistics

  3. Admin Features: View user WebAuthn status in admin panel

  4. Analytics: Track WebAuthn adoption and usage

  5. File Structure Plan

New Files to Create

app/services/MfaWebAuthnService.php
admin/webauthn-register.php (AJAX endpoint)
admin/webauthn-authenticate.php (AJAX endpoint)
admin/webauthn-challenge.php (AJAX endpoint)
lib/webauthn/ (library files)
database/migrations/2025_09_26_create_webauthn_tables.sql
js/webauthn.js (frontend WebAuthn handling)

Files to Modify

admin/user-profile.php (add WebAuthn device management)
admin/mfa-verify.php (add WebAuthn authentication option)
admin/login.php (detect WebAuthn availability)
app/services/MfaService.php (add WebAuthn support methods)
includes/admin-nav.php (potentially update MFA status indicators)

  1. Browser/Device Support Matrix

Supported Platforms

  • iOS Safari: Face ID, Touch ID (iOS 14+)
  • Android Chrome: Fingerprint, Face unlock (Android 7+)
  • Desktop Chrome/Edge: Windows Hello, macOS Touch ID
  • Desktop Firefox: Limited support, improving
  • Security Keys: Universal support (YubiKey, etc.)

Fallback Strategy

  • No WebAuthn Support: Hide option, show Email/TOTP only
  • WebAuthn Failed: Offer Email/TOTP/Backup codes
  • No Biometrics: Offer security key or PIN-based authentication
  1. Technical Challenges & Solutions

A. Cross-Domain Considerations

  • Challenge: WebAuthn is tied to specific origins
  • Solution: Environment-aware origin configuration (development vs production)

B. Mobile Safari Quirks

  • Challenge: iOS Safari has specific WebAuthn limitations
  • Solution: Feature detection and iOS-specific handling

C. Security Key Support

  • Challenge: Different types of authenticators (platform vs cross-platform)
  • Solution: Support both with clear user guidance

:record_button: 9. Integration Points with Existing MFA System

A. Consistent User Experience

  • Same Backup Codes: WebAuthn shares the same 10 backup codes system
  • Unified Settings: All MFA methods managed in single profile section
  • Priority Logic: WebAuthn > TOTP > Email (fastest to slowest)

B. Database Consistency

  • MFA Status: Update existing user_mfa_settings.enabled when WebAuthn added
  • Method Tracking: Use existing method enum, add ‘webauthn’ value
  • Backup Codes: Leverage existing backup codes generation (only on first MFA method)

C. Admin Interface

  • MFA Overview: Show WebAuthn status alongside Email/TOTP
  • Device Count: Display number of registered authenticators
  • Usage Stats: Track WebAuthn authentication frequency
  1. Estimated Implementation Timeline

Week 1: Foundation

  • Database schema and migration
  • Library integration and basic service class
  • Environment configuration

Week 2: Registration

  • WebAuthn registration ceremony (frontend + backend)
  • Device management UI
  • Registration testing across devices

Week 3: Authentication

  • Login flow integration
  • Challenge/response handling
  • Cross-browser testing

Week 4: Polish & Testing

  • Error handling and edge cases
  • Mobile optimization
  • Comprehensive testing and documentation

Benefits of This Approach

  1. Leverages Existing Architecture: Extends current MFA framework seamlessly
  2. Progressive Enhancement: Adds WebAuthn without breaking existing functionality
  3. User Choice: Provides most secure option while maintaining fallbacks
  4. Future-Proof: Positions platform for passwordless authentication evolution
  5. Enterprise Ready: Meets highest security standards for pro-life organizations

This comprehensive approach ensures WebAuthn integration maintains the same high quality and security standards as the
existing Email and TOTP MFA implementation while providing the most user-friendly authentication experience possible.

Update to the update!

OK, so ANOTHER awesome option was sitting under my nose!

When you fire up VS Code, there is a thing sitting over on the right called Co-Pilot or some such. It’s a code writing agent that is amazing, and just works. I just now started using it, and I’m very impressed. I did not have to register, sign in, create an account, give a credit card number etc. I’m scratching my head. Maybe I will hit some usage limit and would have to pay to continue? Or is this just free all the time?

Well… I asked Gemini AI about it, and this is the reply:

The tool you’re using is GitHub Copilot, and you’re right, it is an incredibly powerful AI coding assistant from Microsoft and GitHub.

To answer your main question about why it seems free:

GitHub Copilot is a paid, subscription-based service. The reason it “just worked” without any friction is almost certainly because you have automatically started its one-time free trial.

Here’s a breakdown of what’s happening:

1. The Free Trial

Microsoft knows that the best way to get people to see the value of Copilot is to let them use it. They typically offer a 30-day free trial for new users. During this period, you get full, unrestricted access to the service to see how it fits into your workflow.

2. Why It Was So Seamless (No Login, No CC)

VS Code is a Microsoft product, and it has extremely tight integration with GitHub (which is also owned by Microsoft). It’s very likely that:

  • You were already logged into your GitHub account within VS Code for some other feature (like managing source control or syncing your settings).
  • When you used a feature that triggered Copilot, it used your existing, authenticated GitHub account to silently activate your free trial in the background.

This is a deliberate and very effective user onboarding strategy. They remove all the barriers to let you experience the magic first, and then ask for payment later.

What Happens Next?

Enjoy it while it lasts! You have the full, premium version of the tool right now. In about a month, Copilot will stop providing suggestions. At that point, VS Code will show you a notification prompting you to subscribe to a paid plan to continue using it.

GitHub Copilot Pricing (as of late 2025)

  • Copilot Individual: The standard plan for personal use is typically around $10 per month or $100 per year.
  • Copilot Business: A more expensive plan for organizations with additional management features.
  • Free for Students and Open Source: It’s worth noting that GitHub Copilot is completely free for verified students, teachers, and maintainers of popular open-source projects.

So, you’re currently in the best phase—the trial period. Use it extensively to see if the productivity boost is worth the $10/month for your needs.

1 Like

Yee, teachers fly for free. :smiley:

1 Like

Oh, I assumed you knew about Github Copilot. That was the first one I messed with but I suppose all the development I do is in Visual Studio or VS Code where it’s integrated by default.

Depending on what plan you have you can use a bunch of different models.

2 Likes

Yep, and I was also pleased to see that, because we teach in the church school.

Cool. Yes, it seems to have defaulted me to using GPT-4.1

I’ve been using Cursor and Claude Code, and both are decent. I have wanted to try Amp but I haven’t gotten to it yet. I’m usually more focused on the project and less on the tool, so I’m not paying close attention to which works better. I have tried Copilot a couple times, and in Visual Studio it was bad, and VSCode it was not quite as bad but still noticeably worse than Cursor. This was a few months ago. I think in the long run they will all converge to be about as good.

At work I got some R&D budget to get an RTX PRO 6000 and I have it running Ollama as a server. And I also set up OpenWebUI so people within the company can use it with a browser directly over the LAN. The idea being that people can query using company information in the question without leaking information.

I am aware of Continue and I even installed it once a while ago to try to use local models, but I got distracted before I could really put it through its paces. I have a deadline coming up at work but hopefully after that’s completed I’ll have some more time to mess around with this stuff.

I will say, though, there’s definitely a skill to asking the right questions and structuring the questions in the right way. I feel like this skill will become partly obsolete but some of it will turn out to be durable.

1 Like

At this point, (hi Doug, it really is nice to hear from you) I am quite afraid.

I won’t be making eye contact, as I back quietly away.

What do you guys put in your KoolAid?

1 Like

@jamiek sounds awesome. you are the man. Copilot correlates to Ollama as the provider correct? As I noticed you have a choice of several models within Copilot. I started off liking GPT-4.1 but soured on it when I asked it for a list of files worked on, and since the files in each folder are named the same but only the parent folder name is different, and it gave me meaningless lists without parent folder name (i.e. without path) and I explained over and over 40 different ways and it either refused or was incapable, so I nixed it.

For sure.

@bitingmidge — LOL.

Copilot (within VSCode or Visual Studio as “GitHub Copilot”) is the tools layer whereas Ollama is the backend layer so it can replace Claude or Gemini or GPT-5.

The backend layer is an API server that takes the context as input and produces responses as outputs. The tools layer is user-facing and also performs the integration with files or other actions like running something on the command line.

The mechanics is a bit roundabout, where

  1. The application (e.g. Copilot or Cursor) sends a request (which includes context and usually your user input) and adds stuff to the request, basically saying to the model, “if it’s helpful, tell me to do ‘ls’ or ‘grep’ or ‘create file’ and I’ll do it for you.”
  2. Then, sometimes the model answers the query directly without asking for a command, and sometimes it does ask for a command (“tool use” from the perspective of the model). The response is effectively the LLM saying to you, “please run ls ../../Doug's Other Files/*.md for me”
  3. The application on your computer intercepts this command request from the model response, executes the command (for example, ‘ls’) and sends the result back in another transaction without needing your input.
  4. The model, now with the extra information (containing directory contents in this example) gives a more informed response.

All the actions that the model takes are performed through this kind of mechanism, listing files, reading file contents, editing files, searching the web, etc.

If the model struggles with performing actions, then it’s usually the tool layer, or a mismatch between the tool layer and what the model is trying to do.

For example, in Cursor on Windows the shell is PowerShell so commands need to be formatted accordingly. Trying cd bin/x64/Debug && myTestHarness.exe (bash style) does not work, but cd bin/x64/Debug; myTestHarness.exe does work. It keeps tripping over the difference. Cursor should know that it’s on Windows and tell Claude to format the commands as PowerShell commands, and this wouldn’t happen. So I blame Cursor for this.

In general Cursor could tweak the tools for each model, because for example if OpenAI were to produce PowerShell commands by default while Anthropic produces bash commands, then the request to OpenAI would not need any mention of PowerShell while requests to Anthropic need a reminder to produce PowerShell commands. Then when Cursor runs on Linux it would be reversed.

So when Copilot + Claude 4.1 fails to achieve your goal with listing files, I’m guessing it’s probably Copilot “helpfully reformatting” the directory listing when it sends the results back to Claude, and Claude can’t give the right response. Maybe you could tell Claude to use the command line “dir” command instead of the built-in file listing tool. If it does it that way, it would get the listing text directly and bypass the Copilot helpful reformatting and it could probably achieve the goal.

This would be a user-space workaround to a weakness in the application (Copilot), which is analogous to me having to remind the model to use PowerShell instead of bash format.

1 Like

Ah, got it. I switched back to macOS terminal with Claude + Sonnet 4 — away from Copilot in VSCode … over such frustrations. I blamed the model but see now it was possibly not the model’s fault.

OK, so Anthropic just announced not only the release of Sonnet 4.5 but also a beta for use of Claude Code directly inside VS Code:

I just started using it, and it is NICE.

Been doing more experimenting myself the last few weeks.

Like @DougJoseph I had a go with lm studio and some local models - but on my 2021 M1 macbook pro I wasn’t very impressed. And man are there a lot of models out there and variants of the same ones so it quickly gets a bit overwhelming just figuring out which model to try. All the same I think that hardware is just too limited to run a local model for serious coding. Ok for lighter duty stuff but for coding I wasn’t that impressed.

Though I also set it up on my home machine it’s a few years old now but hs a RTX2060 and Ryzen 7 2700x and 16g of ram so it did perform a bit better than the Macbook thanks to the GPU. I setup a tunnel so I can access it from the macbook and for reasons I’ve yet to investigate the same model that seems super quick and powerful right in LM Studios chat is slow and clunky from continue in vscode :frowning:

But….I also dug in on a bigger project. I had an idea about 13 or 14 years ago for a fairly dumb mobile game idea. About half the people I explained the idea to told me it was so dumb I should feel bad for coming up with it - the other half begged me to make it real because they wanted to try it. At the time I was actively experimenting with android app development but as a new parent had minimal time to even start trying my idea.

So about 2 weeks ago the idea came back to me and I figured I’d try it as a bigger “vibe coding” test.

Claude Code actually impressed me with how far it got it in just a few evenings and I was getting it pretty close to a minimally viable product I could demo to people. Unfortunately - I also hit a few really annoying bugs that have proven VERY difficult to get AI to solve. I’ve bounced between Claude and Codex and made some progress - but that initial quick progress has definitely slowed to a frustrating crawl.

I did find that using Claude to help refine .md files explaining what I wanted to work on - then running those through Codex for feedback and suggestions on improvements generally got better results. But…on some of these bugs it’s been 2-3 evenings of frustration with minimal progress.

The Sonnet 4.5 release yesterday came just as I was getting REALLY frustrated - and made some progress…but still too many bugs to share what I’ve been working on. And honestly the new 2.0 vs code extension kind of feels like a step back in some ways (while the CLI update feels like a big advancement oddly enough.)

I also found I was blowing through my $20 a month plan scary fast after the update. the new /usage command in the cli showed me going from under 10% of my weekly usage to over 50% in just a few hours without hitting the 5 hour limit once :open_mouth: - though today it looks like they reset the limits so we’ll see how tonight goes.

So overall I’m still on the fence about the whole AI coding thing - definitely helps with some things (like getting my app idea scaffolded and almost to a MVP state quickly) but also still very frustrating for debugging when it can’t figure out a solution and keeps making the same mistakes despite clear instructions and prompting in different ways.

When it works…it’s pretty impressive. But it’s also dangerously easy to sink tons of time into trying to fix things when they go wrong.

That said - the claude extension in VS code has helped me with some of my work projects on more tedious things I didn’t want to do by hand. One great example last week was modifying a backend form processor that used to fill a PDF and send it to the site owner - and changing it to send an HTML e-mail instead. Claude Code did a great job of looking at my existing code, reviewing the PDF, and creating the HTML e-mail template that VERY closely matches it. Which it did do far quicker than I could have done it by hand - so that was a big win.

2 Likes

@jhitesma Yep the debugging “merry go round” can be very frustrating. I’ve found that sometimes it’s looking in all the wrong places, missing very important details, and usually in such cases, if I back off and ponder on it, I can figure where it ought to be looking, and by redirecting / focusing it, progress results in victory. Sometimes it just comes to me what the actual problem is.

As I was saying, it’s Cursor’s fault for not using PowerShell properly. Not really Claude’s fault.

Then Cursor released an update that fixes it.

2 Likes

Yeah, I recognized what the issue was pretty quickly. Documented it…and gave suggested ways to address it. But CC still spent 3 days running in circles chasing the wrong things despite the clear directions :frowning:

Thankfully the 4th day they released Sonnet 4.5 and it finally followed my guidance. I should have given Codex a shot at it…but it kept looking like CC was about to finally get it only to dissapoint once again. Plus…I find working with CC a bit more predictable…Codex seems to go off making changes without as much approval more often which scares me. CC seems much better at planning changes before implementing them which I much prefer.

Then today I tried having it write some unit tests…and…wow.

It kept making a bunch of progress…then hitting a snag on 1 out of 20 tests and responding by deleting everything it did and trying to move on to an easier bit of code to write functions for instead like a lazy teenager :rofl:

Then it documented it’s “Progress” was surprised to find that it had no more coverage than it did when it started…and repeated that loop 3 times before finally giving in and actually finishing the tests it had started.

So - it did finally get there. But…wow did it burn through more usage than it should have doing it.

3 Likes