Some hidden time sinks of the writing process
At 8am, two days ago, I started writing about how A/B testing could lead LLMs to retain users instead of helping them. I had some notes beforehand, but I still went from an empty page to a full draft of what I wanted to say by about 2pm. Unfortunately, the writing process is more than just dumping my thoughts in the editor and hitting “Publish”.
This is a short note on issues in the second half of the writing process that I ran into while writing on the Internet; it seems useful to write them down to be able to point to them in the future.
The post-release phase requires lots of time and energy
Clicking “Publish” is not the end of the writing process. Writing on the Internet means doing several ancillary tasks after the post is published, such as posting on social media, responding to comments, etc.
If I publish a post that gets any level of engagement, responding will take time and energy, and not responding leaves the post in a weird state where new readers see the unanswered questions around it.
The goal of writing is ultimately to get people to read it (unless you want to write for the AIs). This means that people engaging with the post should be the default outcome, not some kind of unplanned contingency. I remember, early in my PhD, going for drinks with a friend and not being able to enjoy it because there was a Twitter shitstorm around my paper, and I felt obligated to track what was going on. Nowadays, I would be able to let it go until the next day; but it would still be a distraction.
Lesson: be ready to spend time engaging with people who respond to your post on social media, email, etc.
LLMs vary widely in how helpful they are with various tasks
While I always write the first draft of every post myself (I write to get my ideas out, not the LLMs!), using LLMs for editing and feedback before publication is fair game.
After I wrote a messy 1700-word draft of the A/B testing post, I expected to have something publishable in an hour or two using LLM help to polish it up.
However, the writing suggestions I got from some frontier models were abysmal. Here are the decisions I have made so far:
- GPT-5 is hereby banned from touching my writing, even to fix grammar.
- Claude Sonnet 4.5 in Claude Code is good for a restricted set of writing tasks. Here is what I tend to use it for in particular:
  - implementing Scott Alexander’s nonfiction writing advice, which I put in my CLAUDE.md file;
  - as an interactive Grammarly of sorts: I ask it to help me write certain sentences better, and reject its suggestions if they are not helpful.
- Claude 4.1 Opus and Gemini Deep Think are useful for actual, meaningful high-level feedback. I still do not like their default prose at all: it is not how I would write things, and it is not the kind of text I want to read.
- GPT-4.5 (deprecated, only available in ChatGPT Pro at $200/mo) is the model with the most natural writing style; but I still reject most of its suggestions.
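For concreteness, here is a minimal sketch of what the writing section of such a CLAUDE.md file can look like. The rules below are my loose paraphrase of Scott Alexander’s nonfiction advice, not a quote, and the exact wording is illustrative:

```markdown
## Writing feedback rules

When asked to review or edit my prose:

- Vary sentence length; break up runs of long sentences.
- Cut filler words ("very", "really", "basically") unless they carry meaning.
- Prefer concrete examples over abstract claims.
- Suggest minimal edits rather than rewriting whole paragraphs,
  and briefly explain each suggestion.
- Never change my voice or add hedging I did not write.
```

Claude Code reads this file automatically at the start of a session, so the rules apply to every editing request without restating them.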
Some people manage to get way more mileage out of LLMs in writing than I do. My sense is that, to get there, I need to provide way more examples of my writing style. I hope to report where I stand after this month of posting.
Lesson: just pasting text and asking LLMs to fix it is of course not going to work.
Feedback from people does not get you the advice you are looking for
I attended a writing feedback circle with a draft about the philosophy of red-teaming LLM defenses. The circle proposed many ideas for how to expand the post in different directions, and even some research directions; but once I sat down with the notes from the feedback session, I realized there was no advice on how to reword the post to make the original argument clearer.
Lesson: to get feedback you can act on, ask the people reading your draft precise questions.
Your idea might have already been written about, but it’s hard to find
My friend Adrià wrote an excellent post called Statistical tests are complicated because their inventors did not have fast computers. Unfortunately, the same point had been made in 2011 in the well-known post There is only one test.
I knew there was a post on the same topic, but didn’t recall the title or the author. What surprised me is that OpenAI Deep Research did not find it either!
My prompt was:
Find me a post making this very same point. I know I have read it. post. You can replace statistical test formulas by sampling when you have computers. It’s not a Substack.
I managed to find it with a bit more work. However, what this shows is that you cannot really rely on 2025-level LLMs to do literature search well.
Lesson: literature search for non-academic work can take a lot of effort, and while Deep Research tools are the state of the art, as of 2025 they have many false negatives.