Large Language Models: A Series
Building products on LLMs and AI generally.
The Rise of Transparency
Finding signal in the firehose.
Link: Maggie Appleton on Gas Town and Coding Agent Orchestration
Maggie was already perhaps the best writer on the intersection of engineering and design, but now that she’s joined GitHub Next, she’s also extremely keyed in to where tools for coding are going. Her piece on Gas Town and orchestrating coding agents is sharp and worth reading in full.
As the pace of software development speeds up, we’ll feel the pressure intensify in other parts of the pipeline: thoughtful design, critical thinking, user research, planning and coordination within teams, deciding what to build, and whether it’s been built well.
The most valuable tools in this new world won’t be the ones that generate the most code fastest. They’ll be the ones that help us think more clearly, plan more carefully, and keep the quality bar high while everything accelerates around us.
We’ve known for a couple of years now that faster coding would make non-coding work an increasing bottleneck, and now it’s happening. Deciding what to build – and whether it’s been built well – was already one of the most important tasks on a software team.
But in the face of tools that can add anything to your product, desirable or not, this judgement becomes the core of the work.
A Box of Many Inputs
On browsers, local classifiers, and Roger Rabbit.
Link: Why is ChatGPT for Mac So… Bad?
Last week I wrote an exploration of Ben Thompson’s recent question, “Why is the ChatGPT Mac app so good?” A lot of people on the internet, it turns out, do not agree with this premise!
Many folks have been having problems with ⌘C not copying text. Hacker News sees the app as “not good at all”, to the point that my post about it being better than the alternatives was flagged off the site. X doesn’t like it either.
Beyond the bugs I mentioned in last week’s post, I’ve recently been plagued by a ChatGPT Mac bug of my own: every time I start a new chat, the app pre-fills the text field with the first input from my previous new chat.
All of this led me to an informative post by one of OpenAI’s Mac developers, Stephan Casas:
nearly everyone who works on the ChatGPT macOS app has been stretched thin, and hard at work building Atlas.
[…]
i’m thankful that our users appreciate our decision to develop a native app just as much as i’m thankful for the heightened expectations they hold because we did so
He apparently merged a fix this week for the copy-paste bug that has been troubling many folks, which is promising.
Something implied in last week’s article that’s worth saying explicitly: although many good Mac apps are native, being native is neither necessary nor sufficient for being a great app.
While OpenAI is investing more in desktop apps than any other model lab, it has much to do before it can transcend “better than the alternatives” and achieve “great.”
Why is ChatGPT for Mac So Good?
Claude, Copilot, and making a good desktop app.
Spending Too Much Money on a Coding Agent
On making use of large thinking models.
Post-Chat UI
How LLMs are making traditional apps feel broken.
The Era of Tab Continuation
Press tab to complete your work.
It’s Good for Apple, and Okay for You
Apple Intelligence, so far.
Testing the Untestable
The four phases of automated evals for LLM-powered features.
Link: Infer, an AI Eng Meetup in Vancouver
Next week, we’ll be kicking off a new speaker series in Vancouver called Infer. The goal of the meetup is to bring together folks who are doing great AI engineering work, so we can learn from one another.
The format will be familiar to folks who have attended my previous meetups: two speakers, often one visiting from out of town, with time to chat afterward. Events will happen roughly every two months, whenever we have compelling topics lined up.
If you’re building LLM-powered apps in Vancouver, you can subscribe to our event on Luma. There are still a few spots open for our first “beta” event on October 9th, and we’ll be hosting another during NeurIPS in December.
There’s something electric about getting smart people who are working in a rapidly-changing field in a room together. I recommend it.
Starting Forestwalk
A wild startup appears.
Pushing the Frontier
If – and when – GPT-5 might eat your lunch
LLMs Aren’t Just “Trained On the Internet” Anymore
A path to continued model improvement.
From Chatbot to Everything Engine
A curious design constraint signals an ambitious future.
Going Way Beyond ChatGPT
Techniques for building products on LLMs today.
32K of Context in Your Pocket
A wild large-context LLM appears.
A 175-Billion-Parameter Goldfish
The problem and opportunity of language model context.