Thoughts on AI


As my AI usage has increased significantly over the past quarter or so, I have also begun thinking about it more. Beyond the practicalities — techniques, incantations, workflows, tools, subscriptions — I've been mulling over the approach I have, or the right approach one should have, when using these tools. Or, more simply put: what can these tools really do?

For one, what has been curious to me is the degree (and speed!) at which my premises and assumptions about this keep changing. Or rather, my willingness to update them (which, in some ways, is driven by fear, followed by excitement).

A rough outline of my usage: I went from moderate code completion on SuperMaven and Cursor, to deferring simple repetitive tasks, to using it extensively to research, investigate, and prototype, to deferring increasingly complex and ambiguous tasks to it (with varying degrees of hand-holding and continuous review). In essence, sliding right on the autonomy slider. Still, it is hard to reconcile my experience (or my peers') with the headlines that "software will be dead in 6mo-12mo".

Creativity has not been solved

Tasks that are inherently creativity-bound are still largely unsolved by any of the frontier offerings. This is drawn entirely from anecdotes (mine and close friends'): daily driving both Codex and Claude Code exposes the limitations of these clankers.

This can also be seen in the results, specifically in the output distribution of people using these tools. If creativity had been solved, or were close to being solved, we would be seeing a uniform explosion of meaningfully creative, high-quality contributions, but that has not been the case. The average case is that individuals produce more, and most output takes the shape of experimental, rudimentary, or amateur artifacts. Most, if not all, meaningful results come from shovel sellers⁠[1].

System understanding and accountability

The idea that systems can remain completely opaque while being operated entirely through their observed behavior is... foreign to many people, myself included, but it is still very likely what an end state[2] looks like. In 2026, there is no disputing that. However, there seems to be very little evidence that we will get there in the short term (a 1-5y window). We are, after all, "6mo-12mo away from a complete automation of the field". With that in mind, given two individuals — one who deeply understands software systems and specific architectures, and one who doesn't — it is rather obvious that the former would be chosen over the latter to own and lead any particular system. What can be argued is how much this preference is now worth, and that the cost of ramping up understanding is now lower than it has ever been. In other words, this new dynamic surely disturbs the status quo of the software engineering labour market, which, at this point, is also somewhat common sense.

Bryan Cantrill presented an insightful analysis in a recent talk⁠[3] about the trust dynamics embedded in every layer of society, engineering systems included. His mention of the infamous IBM quote cements a fundamental aspect of society that AI does not change:

A computer can never be held accountable. Therefore, a computer must never make a management decision.

Extrapolating from the above, experienced and competent software engineers will still be in demand as the next-to-best system owners, though, in a conservative model, the growth in total market demand for software engineers likely decreases over time. It is therefore still valuable to be competent in software, but the market is now more akin to a zero-sum game, assuming things evolve at the pace of the past 2y.

The frontier

The engineering of the future does seem interesting, as cheap and intelligent codegen unlocks and makes feasible projects and enterprises that would otherwise be laughed off.

My mind has been marinating on some crazy ideas since stumbling on this great piece about StrongDM (a company that spends over $1,000/day per engineer) and their technique for generating ambitious software at a larger scale. It is, indeed, fascinating to read their reports of commanding a swarm whose software is never reviewed (only its outputs are observed) to build a system that tests access-permission correctness. It reminded me of stories of folks doing similar cloning stunts around the time Ralph Wiggum was discovered, which is what Cursor and Claude had also been doing by building a browser and a GCC clone autonomously.

These systems and approaches all share a common denominator - they all require concretely defined artifacts that can serve as objective sources of truth (e.g., "can we build Linux?", "what is our coverage of the wpt suite?", "do we have API parity with GitHub.com?", etc.).
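The pattern above can be sketched as a loop that never inspects generated code, only checks it against an objective oracle. This is a minimal toy illustration, not StrongDM's actual setup: the `slugify` spec and candidate snippets are hypothetical stand-ins for whatever concrete artifact serves as the source of truth.

```python
import os
import subprocess
import sys
import tempfile

def objective_check(candidate_source: str) -> bool:
    """Accept or reject a generated artifact purely by observed behavior.

    We never read the candidate's code; we only run it against an
    objective spec (here, a single hypothetical assertion) in a
    fresh subprocess and observe pass/fail.
    """
    with tempfile.TemporaryDirectory() as workdir:
        with open(os.path.join(workdir, "candidate.py"), "w") as f:
            f.write(candidate_source)
        # The only signal we trust: does the artifact satisfy the spec?
        spec = (
            "from candidate import slugify\n"
            "assert slugify('Hello World') == 'hello-world'"
        )
        result = subprocess.run(
            [sys.executable, "-c", spec],
            cwd=workdir,
            capture_output=True,
        )
        return result.returncode == 0

# Imagine these came from a swarm of generators; keep the first that passes.
candidates = [
    "def slugify(s): return s",                            # fails the oracle
    "def slugify(s): return s.lower().replace(' ', '-')",  # passes
]
accepted = next(c for c in candidates if objective_check(c))
```

The design point is that trust lives entirely in the oracle: the harness scales to any number of unreviewed candidates because the acceptance criterion is a concrete, checkable artifact rather than human review.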