My Story of Adopting LLM Tools

Brief Intro

I remember when OpenAI released GPT-1, their first GPT model, in 2018. The transformer architecture behind it (introduced by Google researchers a year earlier) "changed the world," but for us at the time it was a curiosity with no practical applications. Then someone decided to wrap a model in a web chat, and we got ChatGPT. It became genuinely useful with GPT-3.5, when you still had to copy and paste text from the UI. It was very bad at code, but quite good at generic text.

Then we got powerful new models from different vendors. Google was usually a few steps behind, but now Gemini 3 and its siblings (Nano Banana Pro, etc.) are very powerful models. Anthropic was founded with a focus on AI safety, but later leaned heavily into engineering and coding use cases, and they've become one of the strongest in that space. Microsoft released their own model but uses Claude Code inside. Apple… well, something went wrong there. Meta released Llama in February 2023, and I believe they made the weights publicly available on purpose: the community produced a lot of fine-tuned models, and I'm almost sure Meta hoped for exactly that.

History of Usage

I remember when Anthropic released the first version of Claude Code in February 2025. It was interesting, and I started using it from day one. My usage pattern has gone through a big metamorphosis this year.

At first, I used it to write small chunks of code, mostly because it was quite expensive. When they introduced the Max plan with Claude Code included, I started using it more and more — writing tests, building PoCs, etc. Then at some point they introduced slash commands, sub-agents, MCP, and more. That was the turning point.

Now you can customize the workflow to your own style or the project's style. You can feed it proper context, add custom data sources, and so on.

I built a lot of custom tools and MCPs at my previous job. At one point I had around 120 MCP tools enabled, but that was eating context way too fast. When Claude Code added support for slash commands, I removed almost all MCPs and migrated to slash commands instead. A few weeks ago I even made my own Ralph loop implementation. It's still not public, and maybe it no longer needs to be, since there are plenty of nice implementations out there.
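To illustrate the kind of migration I mean: a custom slash command in Claude Code is just a Markdown file under `.claude/commands/`, and a tool that used to live in an MCP server can often be reduced to a prompt file like this (the command name and its contents here are invented for illustration):

```markdown
<!-- .claude/commands/review-pr.md (hypothetical example) -->
Review the changes on the current branch against main.

1. Run `git diff main...HEAD` and read the output.
2. Check the diff against the conventions described in CLAUDE.md.
3. Report issues grouped by severity; suggest fixes, but do not apply them.

Extra focus areas (optional): $ARGUMENTS
```

Typing `/review-pr security` then runs the file's prompt with `security` substituted for `$ARGUMENTS`, and unlike an always-on MCP tool it costs no context until you actually invoke it.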

To be honest, I wasn't using Claude Code only for code, though after the release of Claude Cowork that became a bit unnecessary.

What I wanted to say with this long text: I have a lot of experience with LLM tools.

Now

Now it's a different story. I think time spent without an agent running in the background is wasted time: you should almost always have one running.

Key change: I almost never write code via agents anymore. The biggest part of my usage now is gathering information, investigation, and the like.

I still use LLM tools to write code in a few cases:

  • Making a prototype
  • Trying something new or experimenting
  • Repeating something where I can show an example

What I’ve Noticed

  • Changing code has become a cheaper operation — you can experiment in the background with a proper plan while you’re busy with more important tasks.
  • You might notice it becomes harder to retain things, and yes, that's true. With a proper workflow it's too easy to just load context into your brain on demand. I don't know whether that's good or bad, but on the other hand you can do more in the same amount of time.
  • It amplifies whatever the user brings to it, stupidity and intelligence alike.
  • Simple things became simpler; hard things became even harder.

What Matters for Using LLM Tools Productively

This part is specifically about Claude Code.

Based on my experience:

  • It’s very bad on empty projects with no examples.
  • A big CLAUDE.md is bad. On really big projects, I noticed it's better to keep it quite short and use headings or <custom_tag> blocks to mark the important sections.
  • By "important sections" I mean short structure descriptions, instructions on how to get more information about the project, and so on.
  • Customize the workflow to project requirements.
  • If you want to make a PoC or experiment, use something like a brainstorm session to create a proper plan for the agent.
  • It's better to start each task from scratch. Yes, with a 1M-token context window you can load a lot of information in, but once context usage reaches ~70% you start getting unpredictable results.
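To make the "short CLAUDE.md" point concrete, here is a minimal sketch of what I mean. The directory names and tag names are invented for illustration; the point is the shape, not the specifics:

```markdown
# CLAUDE.md (illustrative sketch; paths and tags are invented)

<project_structure>
- api/: HTTP handlers, one file per resource
- core/: domain logic, no I/O here
- docs/adr/: architecture decision records
</project_structure>

<how_to_learn_more>
- Run `make help` to list available tasks.
- Read docs/adr/ before changing anything in core/.
- API schemas live in api/openapi.yaml; treat them as the source of truth.
</how_to_learn_more>

<rules>
- Keep this file short; point to other files instead of inlining details.
</rules>
```

The tags buy you two things: the agent can quote a section back unambiguously, and you can tell it "follow <rules>" without pasting the rules into every prompt.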

Overall I’m happy now — programming became fun again. Though, to be honest, when Anthropic goes down I think “phew, finally I can do something by hand in my Helix editor.” Not always, but I don’t feel locked into LLMs (at least for now).

In Conclusion

I still believe this is another tool that will transform many times in the future — just like third- and fourth-generation programming languages (does anyone remember what they promised?). You become more than just a software engineer: you become an engineer, scrum master, and product owner in one person, but for autonomous agents. It requires even more expertise, knowledge, and observability.

You can resist it, but why? Just imagine: instead of using a hand brush, you get a nice spray system for ink.

Yes, definitely, as I said earlier: it requires even more expertise and knowledge to use LLMs correctly and productively.