I'd argue that another key aspect is to break programs up into small independent units that can be verified in isolation, and to compose them into larger programs with contracts between them. I've had a pretty good experience using Claude with a framework where I express the program as a state graph, and each node is treated like a microservice that gets some input and produces some output. The workflow engine then verifies that the output matches the declared schema and decides which step to execute next. https://github.com/yogthos/mycelium
As the state travels across the graph, I keep a trace of the executed steps, which means that when an error happens, the agent has a lot more information than it normally would: it can see which decision points the code has already passed through, cross-reference that with the declared workflow, and quickly find where it screwed up.
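To make the idea concrete, here's a minimal sketch of a node-as-microservice with a schema-checked output and an execution trace. All of the names here are illustrative assumptions, not the actual mycelium API:

```python
# Sketch: a node is a plain function over state; the engine checks the
# node's declared output schema and records a trace entry for each step.
# Names are hypothetical, not the mycelium API.

def lookup_user(state: dict) -> dict:
    # Pretend database lookup; returns a new state, never mutates input.
    found = state.get("user_id") in {"u1", "u2"}
    return {**state, "user_found": found}

LOOKUP_SCHEMA = {"user_found": bool}  # declared contract for this node

def run_step(name, node, schema, state, trace):
    out = node(state)
    for key, typ in schema.items():  # verify output matches the contract
        if not isinstance(out.get(key), typ):
            raise ValueError(f"step {name!r} broke its contract on {key!r}")
    trace.append({"step": name, "keys": sorted(out)})  # execution trace
    return out

trace = []
state = run_step("lookup_user", lookup_user, LOOKUP_SCHEMA,
                 {"user_id": "u1"}, trace)
```

When a later step fails, the accumulated `trace` is exactly what gets handed to the agent alongside the declared workflow.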
The idea of workflow engines has been around for a long time, but they feel too awkward to use when you're writing code by hand. Writing conditional logic directly in the code keeps you in your flow, and having to jump out and declare it in config somewhere feels awkward. Coding agents completely change the dynamic though because they don't have that problem. If the LLM is writing the code, then I can just focus on ensuring the code meets the contract, while the agent can deal with the implementation details.
My general thesis here is that context rot, and the other problems agents often end up exhibiting, largely stem from the way we structure code, which isn't conducive to LLMs. Even small models that you can run locally are quite competent at writing small chunks of code, say 50–100 lines or so. And any large application can be broken up into smaller isolated components.
In particular, we can break applications up by treating them as state machines. For any workflow, you can draw out a state chart where nodes do some computation and the state transitions from one node in the graph to another. The problem with the traditional coding style is that we implicitly bake this graph into function calls. You have a piece of code that does some logic, like authenticating a user, and then it decides what code should run next. That creates coupling, because now you have to trace through the code to figure out what the data flow actually is. This is difficult for agents to do because it causes the context to grow in an unbounded way, leading to context rot. When an LLM has too much data in its context, it doesn't really know what's important or what to focus on, so it ends up going off the rails.
But now, let's imagine that we apply inversion of control here. Instead of having each node in the state graph call the others, why not pull that logic out? We pass a data structure around: each node gets it as input, does some work, and returns a new state. A separate conductor component manages the workflow, inspects the state, and decides which edge of the graph to take.
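The inversion can be sketched in a few lines: nodes never call each other, and a conductor walks a declarative edge table instead. Everything here (the graph shape, the router functions) is a hypothetical illustration, not any particular engine's API:

```python
# Sketch of the inversion of control: nodes are pure state transformers,
# and a conductor owns all the control flow. Hypothetical names throughout.

def fetch(state):
    return {**state, "value": 21}

def double(state):
    return {**state, "value": state["value"] * 2}

# Declarative graph: node name -> (function, router that picks the next edge)
GRAPH = {
    "fetch": (fetch, lambda s: "double"),
    "double": (double, lambda s: None),  # None marks a terminal node
}

def conduct(graph, start, state):
    node = start
    while node is not None:
        fn, router = graph[node]
        state = fn(state)     # the node does its work on the state
        node = router(state)  # the conductor picks the next edge
    return state

result = conduct(GRAPH, "fetch", {})  # fetch -> double -> done
```

Note that `fetch` and `double` know nothing about each other; the entire control flow lives in `GRAPH`, which is the part a human can read at a glance.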
The graph can be inspected visually, and it becomes easy for a human to tell what the business logic is doing. The graphs don't carry a lot of data either, because they're declarative: they're decoupled from the implementation details, which live in the logic of each node and are abstracted over by its API.
Going back to the user authentication example: the handler could get a parsed HTTP request, try to look up the user in the db, check whether the session token is present, and so on. It then updates the state to add the user, or sets a flag stating that the user wasn't found or wasn't authenticated. The conductor can then look at the result and decide to either move on to the next step or call the error handler.
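That split can be sketched like this: the handler only annotates the state, and a separate routing function makes the error-vs-continue decision. The user store, field names, and step names are all made up for the example:

```python
# Sketch: the auth node inspects the request and annotates the state;
# routing to the error handler is the conductor's job. Hypothetical names.

USERS = {"alice": "token-123"}  # stand-in for a database lookup

def authenticate(state):
    req = state["request"]
    user = req.get("user")
    if user not in USERS:
        return {**state, "error": "user_not_found"}
    if req.get("session_token") != USERS[user]:
        return {**state, "error": "not_authenticated"}
    return {**state, "user": user}

def route_after_auth(state):
    # Conductor-side decision: next step, or the error handler.
    return "handle_error" if "error" in state else "load_profile"

ok = authenticate({"request": {"user": "alice", "session_token": "token-123"}})
bad = authenticate({"request": {"user": "mallory"}})
```

The handler never decides what runs next, so it can be rewritten, tested, or regenerated by an agent without touching the rest of the flow.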
Now we basically have a bunch of tiny programs that know nothing about one another, and the agent working on each one has a fixed context that doesn't grow in an unbounded fashion. On top of that, we can put validation boundaries between the nodes, so the LLM can check that a component produces correct output, handles whatever side effects it needs to perform correctly, and so on. Testing becomes much simpler too, because you don't need to load the whole app; you can just test each component to make sure it fulfills its contract.
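Since a node is just a function over state, testing it against its contract needs nothing but the function itself. A sketch, with a made-up node:

```python
# Sketch: a node's contract can be tested in isolation, with no app
# bootstrapping at all. The node here is hypothetical.

def normalize_email(state):
    return {**state, "email": state["email"].strip().lower()}

def test_normalize_email_contract():
    out = normalize_email({"email": "  Alice@Example.COM ", "id": 7})
    assert out["email"] == "alice@example.com"  # produces normalized output
    assert out["id"] == 7                       # leaves unrelated state intact

test_normalize_email_contract()
```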
What's more, each workflow can be treated as a node in a bigger workflow, so the whole thing becomes composable. And the nodes themselves are like reusable Lego blocks, since the context is passed into them.
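The composability falls out of the fact that a workflow exposes the same shape as a node: state in, state out. A sketch with hypothetical steps:

```python
# Sketch: a whole workflow is node-shaped (state in, state out), so it
# can be dropped into a bigger workflow unchanged. Hypothetical names.

def add_one(state):
    return {**state, "n": state["n"] + 1}

def square(state):
    return {**state, "n": state["n"] ** 2}

def make_workflow(steps):
    # Returns something node-shaped that runs its steps in sequence.
    def workflow(state):
        for step in steps:
            state = step(state)
        return state
    return workflow

inner = make_workflow([add_one, square])  # computes (n + 1)^2
outer = make_workflow([inner, add_one])   # inner workflow used as a node

result = outer({"n": 2})  # (2 + 1)^2 + 1 = 10
```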
This whole idea isn't new; workflow engines have been around for a long time. The reason they never really caught on for general-purpose programming is that it doesn't feel natural to code that way. There's a lot of ceremony involved in creating the workflow definitions, writing contracts for them, and jumping between those and the implementation of the nodes. But the equation changes when we're dealing with LLMs: they have no problem doing tedious tasks like that, and all the ceremony helps keep them on track.
I find people tend to miss the productive aspects of Chinese state-led investments because they don't consider their value at scale. Take the HSR system: it has been derided time and again as wasteful, too expensive, and so on. Yet it has now become a key artery for trade and commerce across China. It allows goods to move at incredible speed, boosts tourism, and helps the overall development of many regions that otherwise wouldn't see much economic activity.
I'd argue you can have a much more precise definition than that. My definition of intelligence would be a system that has an internal simulation of a particular domain and uses this simulation to guide its actions within that domain. Being able to explain your actions derives directly from having a model of the environment.
For example, we all have an internal physics model in our heads that's built up through our continuous interaction with our environment. That acts as our shared context. That's why if I tell you to bring me a cup of tea, I have a reasonable expectation that you understand what I requested and can execute the action intelligently. You have a conception of a table, of a cup, of tea, and critically our conceptions are similar enough that we can both be reasonably sure we understand each other.
Incidentally, when humans end up talking about abstract topics, they often run into the exact same problem as LLMs: the shared context is missing, and they end up talking past each other.
The key problem with LLMs is that they currently lack this reinforcement loop. The system merely strings tokens together in a statistically likely fashion, but it doesn't really have a model of the domain it's working in to anchor them to.
In my opinion, stuff like agentic coding or embodiment with robotics moves us towards genuine intelligence. Here we have AI systems that have to interact with the world, and they get feedback when they do things wrong, so they can adjust their behavior accordingly.
Are you absolutely positive? The Danes and Germans have already sent troops, and France a nuclear sub and a frigate. The UK, Sweden, Finland, and others are preparing to send troops as well. Perhaps "small operation" is not a foregone conclusion.
80% of the place is frozen ice; you would need specialist units to fight there, and the US has never really fielded those. The Russians had units equipped for arctic conditions, they were locals, and they had unique armor meant for it (tracked, lighter, more durable).
Even China has more experience due to their conflicts in the Himalayas and Tibet.
You're not going to get a Stryker or an Abrams tank working there.
Good point. Ukraine example:
"Battlefield Challenges (Abrams): Despite being formidable, these tanks struggled against Russia's extensive drone warfare, leading to high attrition rates, with nearly 90% of the original U.S. fleet lost or damaged..."
Whatever military capability there is in NATO, it's clearly on the side of the US. The EU can't even produce basic things like artillery shells and explosives at this point. The UK can't even make steel.
I don't think America ever said that Greenland would be a state; the idea was that it would be an overseas territory of the US instead, which still sucks, but I don't think Trump wants 2 senators and a House member from a territory that is bound to be much more liberal than the average American state. Inuits vote Democrat even in Alaska.
In my view, this is the exact right approach. LLMs aren’t going anywhere, these tools are here to stay. The only question is how they will be developed going forward, and who controls them. Boycotting AI is a really naive idea that’s just a way for people to signal group membership.
Saying "I hate AI and I'm not going to use it" is really trendy and makes people feel like they're doing something meaningful, but it's just another version of trying to vote the problem away. It doesn't work. The real solution is to roll up our sleeves and build a version of this technology that's open, transparent, and community driven.