Aaron Dong

25.08.25
Sam Altman has recently decried investor overexcitement about artificial intelligence. The AGI and human-replacement narrative that he and a subset of the scientific and speculative-investment community have pushed for the past few years has become factually untenable. There are two important takeaways from this period: 1) the cultural shift in how we access and create information, and 2) what is possible with machine learning. Many have started delegating information discovery and processing to text generators; it has become the path of least resistance. There is a reason our imaginations have been enraptured by text generators: we have glimpsed what is possible with machine learning beyond analytics and domain-specific modelling. The latent space has proven to hold richer properties than previously thought.

15.08.25
The release of Gemma 3 270M is indicative of the times. The trend in large language model development is no longer toward bigger models but toward smaller, more efficient ones. Model distillation, inference optimization, and parameter efficiency have been addressed by open-source small language models (e.g. Gemma, Llama, DeepSeek, Qwen) for some time. At any scale, the fundamental question remains whether instruction-following without explicit program synthesis will ever be reliable.
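Of those techniques, distillation is the easiest to state precisely: the classic soft-target formulation trains the small student model to match the large teacher's temperature-softened output distribution. A minimal numpy sketch of that loss follows; the logit vectors are invented placeholders, and a real pipeline would take them from actual teacher and student forward passes.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-softened softmax; higher T spreads probability mass."""
    z = z / T
    z = z - z.max()          # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Soft-target distillation loss: KL(teacher || student) at temperature T,
    scaled by T^2 so gradient magnitudes stay comparable across temperatures."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    return (T ** 2) * np.sum(p_t * (np.log(p_t) - np.log(p_s)))

# Placeholder logits for illustration only -- not from any real model.
teacher = np.array([2.0, 1.0, 0.1])
student = np.array([1.5, 1.2, 0.3])
loss = distillation_loss(student, teacher)
```

The loss is zero exactly when the student reproduces the teacher's distribution, and in practice it is mixed with an ordinary cross-entropy term on ground-truth labels.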

12.08.25
With the release of GPT-5, the era of test-time compute and pre-training scaling is over. It is now clear that no further gains in utility are possible through the previous paradigm. Agents are an attempt at real-world applications but are inevitably constrained by the reliability of their most unreliable component. Test-time training must be the new frontier: it will unlock the ability to generalize that is key to intelligent, adaptive systems. Real-world reliability requires generalizable, adaptive systems that update and learn at runtime.
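The runtime-learning idea can be made concrete. Below is a toy numpy sketch of test-time training: a "pretrained" model adapts a copy of its shared encoder on a self-supervised reconstruction objective for each individual test input before predicting with a frozen task head. Every weight, dimension, and hyperparameter here is invented for illustration; real systems apply the same pattern to neural networks with a genuine pretraining stage.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "pretrained" weights -- illustrative stand-ins, not a real model.
d_in, d_h = 8, 4
W = rng.normal(size=(d_h, d_in)) * 0.5   # shared encoder (adapted at test time)
v = rng.normal(size=d_h)                 # frozen task head
D = rng.normal(size=(d_in, d_h)) * 0.5   # decoder for the auxiliary task

def recon_loss(W_enc, x):
    """Self-supervised auxiliary objective: reconstruct x from its encoding."""
    return 0.5 * np.sum((D @ (W_enc @ x) - x) ** 2)

def recon_grad_W(W_enc, x):
    """Analytic gradient of the reconstruction loss w.r.t. the encoder."""
    err = D @ (W_enc @ x) - x
    return np.outer(D.T @ err, x)

def predict_with_ttt(x, steps=50, lr=0.005):
    """Adapt a copy of the encoder on this single test input, then predict."""
    W_t = W.copy()
    for _ in range(steps):
        W_t = W_t - lr * recon_grad_W(W_t, x)
    return v @ (W_t @ x), W_t

x_test = rng.normal(size=d_in)
y_hat, W_adapted = predict_with_ttt(x_test)
before = recon_loss(W, x_test)
after = recon_loss(W_adapted, x_test)
# The auxiliary loss on this input drops after adaptation; the hope is that
# the adapted features also improve the main prediction under distribution shift.
```

The design point is that the self-supervised loss needs no label at runtime, which is what makes per-input adaptation possible at all.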