noelwelsh 7 hours ago [-]
I wish people would describe in more detail the tasks they use LLMs to code. My experience is that simple components in an existing architecture are fine, but anything requiring architectural considerations quickly becomes a mess. On my projects (e.g. a ui framework), running multiple agents in parallel would just increase the speed at which it can stuff up the project.
germanptr 5 hours ago [-]
I get this question a lot, and I found it hard to answer briefly, so I ended up writing a longer post about how I work: https://www.trigosec.com/insights/mob-programming-for-one/
The short version is that I don’t let AI agents work unsupervised on my code. I treat them like participants in a mob programming session instead of autonomous developers. Different agents get different roles (implementer, reviewer, architect, security reviewer, etc.), and I stay involved throughout the process.
I also agree with your point about architecture. Generating isolated components is relatively easy; preserving and evolving the architectural boundaries across a larger codebase is much harder.
We’re still missing a good way to express and measure architectural quality. Until then, architecture-heavy work requires much closer supervision than implementation-heavy work.
pipes 28 minutes ago [-]
I found that this guy's stuff has really helped me:
It's taken _a lot_ of time and effort, but this is an example of what can be developed using LLMs alone.
You have to have dedication and a goal to reach, but you can absolutely build anything if you're building with the right foundations in mind.
ryanackley 4 hours ago [-]
I think the relevant question isn’t what can be built but the amount of effort in comparison to doing this the old fashioned way.
What do you think the productivity gain was from using an LLM? This question assumes you’re already an experienced developer.
motoroco 14 minutes ago [-]
There’s no free lunch, it takes time and effort still. And expertise if you need it to be robust.
In terms of velocity, let me offer some numbers. In 6 months I generated >150k lines of code and merged 10k PRs to ship and iterate on https://plotalong.app
I follow best practices and isolate agents to continuously deployed dev environments, semi-manually review PRs and gate the release process between multiple protected envs. The project is getting close to 500 end-to-end tests in Playwright.
That’s just working nights and weekends. Before AI, it took my team at the office 4 years to produce this much work. There are some qualitative differences but the speed and results are real
andai 3 hours ago [-]
n=1, but a friend of mine spent the last few months working on experimental music software with Claude. What he built is amazing and far beyond my abilities (I have been programming for 20 years). He doesn't know any programming.
In fact, it's far beyond what I would even attempt, because I've just spent two decades building up a data bank of how hard things are supposed to be.
He doesn't know it's supposed to be hard, so he just does it.
nullbio 7 hours ago [-]
It's great for people who are just maintaining something. Less so for someone building something from scratch, in the earlier phases.
Npovview 6 hours ago [-]
There are hour-long YouTube videos where people explain the process using a complex toy project. Search for them.
gnunicorn 4 hours ago [-]
Interestingly, despite it being much more detailed and a lot more process and procedure than what I currently do - which is more akin to the version 0 described, but in parallel - we come up at the same final problem: reviews and quality assurance.
I sign off the code I merged, part of company policy but also just to be sure it is actually decent. But reviewing has become the real draining bottleneck: even with stacked PRs, if they total 5-6k lines it is not a 5-minute job. Even if I brainstormed and set the plan, that's really the part that doesn't scale right now for me in this. But the author is very shy about that: either the changes aren't that big in the end or they trust the process enough to review in a more casual manner. Being equally untrusting I can't do that ...
philbo 2 hours ago [-]
For decades, engineers understood that large code reviews are harder than small ones. Out of both politeness and a desire to receive better code reviews, we learned to break our large changes into smaller chunks. Some engineers took things even further and replaced code reviews with pair programming. But then LLMs showed up and everyone seems to have forgotten those lessons.
They can still be applied now using coding agents, if you're willing to push back against the default setup and change your mode of thinking a little bit. Of course it doesn't help that an entire industry is dedicated to persuading us that maximizing token spend is the only way to get shit done.
I appreciate this probably seems like an extremist take, but I wrote some more about it here in case there's anybody out there who identifies with it: https://philbooth.me/blog/agentic-coding-and-mental-models
Agree with this completely. This push for more autonomy is, I think, completely the wrong direction for how to use LLMs.
I want less code to maintain not more that I don't even fully understand.
I think research and very supervised coding with lots of guardrails is the way to actually gain productivity from these tools.
strogonoff 4 hours ago [-]
Proper review should take longer than writing it yourself, because you need to know the correct solution, understand the proposed solution, and evaluate the difference between the two. When designing it yourself, you just need to know the correct solution and write it, and with modern high-level languages and IDEs with autocomplete writing it is hardly a bottleneck.
minihat 2 hours ago [-]
It is harder to solve a sudoku than verify a solution's correctness. I find similar benefits occasionally when coding with LLMs.
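To make the asymmetry concrete, here is a minimal sketch of a sudoku solution checker. It is only an illustration: the function names and the example grid are assumptions made up for this note, not something from the comment. Checking a filled 9x9 grid takes a handful of set comparisons, while producing one generally requires search.

```python
def is_valid_solution(grid):
    """Return True if grid is a complete, valid 9x9 sudoku solution."""
    # Reject anything that is not a 9x9 grid before indexing into it.
    if len(grid) != 9 or any(len(row) != 9 for row in grid):
        return False
    expected = set(range(1, 10))
    rows = [set(row) for row in grid]
    cols = [set(col) for col in zip(*grid)]
    boxes = [
        {grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3)}
        for br in range(0, 9, 3)
        for bc in range(0, 9, 3)
    ]
    # Every row, column, and 3x3 box must contain exactly the digits 1..9.
    return all(group == expected for group in rows + cols + boxes)


def example_grid():
    """Build one known-valid solution from a simple shifting pattern (hypothetical helper)."""
    return [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]
```

The checker never has to know how the grid was produced, which is the property the analogy relies on. Whether real software offers constraints this crisp is a separate question.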
skydhash 16 minutes ago [-]
Sudoku’s constraints are known and easy to build a harness for. Software has a more malleable structure. A harness is hard to build, and the test cases for the constraints can be numerous.
nisabek 3 hours ago [-]
If I'm attentive during spec/plan creation I sort of build this "expectation" of what the actual PR will look like, the mental model of it. Then it's somewhat easier to review.
But the mental load is brutal tbh, and still not sure if it's "worth it"
general1465 3 hours ago [-]
I am completely calm regarding AI and development.
First, nobody sane wants to give their domain IP to OpenAI/Anthropic. That's why local AI will eventually prevail and flourish: people who actually have some IP will have no problem buying a 10k+ EUR machine to run some pretty good models on it. However, if your main job is just doing CRUD stuff, then you are screwed.
Secondly, hallucination is really the Achilles' heel of every LLM. Sure, you can recreate an application which exists in thousands of variations on the internet, but the moment you try to go deeper into domain knowledge you will start struggling more and more.
Try to make a CAN driver for the ESP32: easy, it is probably going to work. Try to make a CAN driver for the STM32F7xx and the AI will start having problems, but it will probably be able to produce something that works after a lot of debugging. Now let's make a CAN driver for the MPC5555: the AI will start writing fairy tales about registers which do not exist. All of the processors above have reference manuals, and sometimes example git repositories, available on the open internet.
pydry 3 hours ago [-]
>Automating myself out of development
>I want to start by saying that I’m neither an AI-fanatic
Kind of like saying you are a fanatic before saying you aren't.
I don't think there's too much here (e.g. "spec driven development") that I haven't seen elsewhere.
yieldcrv 7 hours ago [-]
I don't know if I’m overly critical, but there’s gotta be a middle ground between totally AI-pilled people that otherwise have no talents, and control-freak veteran developers who can't let go
My current process also uses GitHub Projects in a normal scrum-style way, with many tickets written or fleshed out and their state managed by the LLM, and it doubles as the memory system
Completely leapfrogging all these other open and closed source concoctions and being more effective
But it's effective enough that I don’t need OP’s final form state of still approving everything
Auto-mode is fine. Worktrees are built into Claude Code now. I just tell it to classify tickets as sequential or parallelizable and spawn subagents to tackle all of the tickets in the todo list
They all get their own context window, so it's pretty perfect now
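The sequential-versus-parallel split can be pictured as grouping tickets into waves by dependency. The sketch below is only an illustration under assumptions: it supposes each ticket lists the tickets it depends on, and the ticket IDs and the `plan_waves` name are hypothetical. It does not describe how Claude Code or its subagents actually decide.

```python
from graphlib import TopologicalSorter


def plan_waves(dependencies):
    """Group tickets into waves; tickets within one wave have no unmet dependencies.

    dependencies maps a ticket ID to the ticket IDs it depends on (assumed input shape).
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if tickets depend on each other in a loop
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything runnable right now
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave finished to unlock the next one
    return waves


# Hypothetical ticket graph: T3 needs T1, T4 needs T1 and T2, T5 needs T4.
tickets = {
    "T1": [],
    "T2": [],
    "T3": ["T1"],
    "T4": ["T1", "T2"],
    "T5": ["T4"],
}
print(plan_waves(tickets))  # [['T1', 'T2'], ['T3', 'T4'], ['T5']]
```

Each inner list could be handed to parallel workers, while the outer list is processed in order.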
In the meantime I work in a couple of tabs of Claude Design for different flows of any client-side app. My philosophy has been that devs could pick up graphic and UI/UX design easily; it's just still a full-time job to make variations of layouts and portray their states.
UI/UX is not a full time job anymore.
And I use Claude chat to flesh out aspects of the overall idea
I think you may be overcomplicating your workflow in the concluding state.
Overall I agree that planning and intention is now most of the time, before a 10 subagent precision strike is initiated
thi2 4 hours ago [-]
There are tons of people, those are just not as vocal.
nisabek 3 hours ago [-]
Could be (the overcomplicating part), I'm just not yet comfortable losing the mental model of the final application. At least not in all types of tickets. Are you not seeing that?..
yieldcrv 45 minutes ago [-]
I focus on one side project at a time, alongside work applications
Both are giving me skillsets to excel in the other domain
I watch the subagents, push back on some choices, look at commits and glance at pull requests
uyhgbbhakusho 1 hours ago [-]
[dead]
ai_fry_ur_brain 3 hours ago [-]
All these people saying UI/UX is dead, then I see their designs and they're absolutely the worst (but they're always swearing by how incredible it is).
Sorry, access to an LLM (even if it could center a div reliably and make responsive designs, which it can't) does not give you taste or intuition, or make you good at building user interfaces. You people/sloppers have no idea the amount of sweat that gets poured into great UX.
It's insulting when you people say these things, and I'm not even a designer or frontend dev.
I actually think UI/UX designers and devs will be the last to fall. I will want beautiful products that were built by beautiful minds; that's how you will set yourself apart from the slop. And fortunately it will be even easier when 80% of everything is half-assed, cranked-out UI from LLM design tools. The contrast is already glaring.
yieldcrv 51 minutes ago [-]
I’ve seen that slop but
Claude Design has barely been out for a month
And it’s fulfilled my needs better than v0, lovable, playwright via LLM or just iterating in the coding LLM. I’ve worked with graphic designers my whole career and have also contracted design agencies to do style guides and collaborate on branding and layouts. I’ve gotten the output that I’m looking for with Claude Design
Eventually you’ll see examples, but it's not in my purview to publicly link any of my projects as being vibe coded
https://www.trigosec.com/insights/mob-programming-for-one/
https://youtu.be/-QFHIoCo-Ko?is=FYYdukWluYX3vdQL
Worth a watch.
The complete log of all prompts and commits is here: https://demo.buildermark.dev/projects/u020uhEFtuWwPei6z6nbN
https://demo.buildermark.dev/projects/u020uhEFtuWwPei6z6nbN/...
still show content of page 1
https://philbooth.me/blog/agentic-coding-and-mental-models