The Goldilocks Era of AI
Everyone knows that AI will lead to mass unemployment. What this article presupposes is... maybe it won't?
It seems like, on a very basic level, we should want AI that is maximally assistive to humans but stops just short of being intelligent enough to cause widespread disruption to human life.
Or at least that’s just my opinion, man.
The AI that we have now is pretty cool. I don't agree with people who say that it's inaccurate, or hallucinates too much, or is damaging to artists¹. I think it's pretty incredible.
But it's incredible precisely because it still has so many limitations. It's still complementary to humans, as opposed to a situation where humans become a vestigial appendage in the whole "thinking" thing.
One of the ways that I was able to bring the podcast back was to use AI to radically improve episode production. My prior flow was… OK, I am embarrassed to say exactly how time-intensive it was. But I'm going to go through it anyway. It went:
Record interview
Roughly clip each interview answer into its own audio file. A 90-minute interview might yield 70 clipped files. This required actually listening to every second of every clip - multiple times - to get the clip bounds right.
Roughly assemble the show order.
Write narrative, which would again require me to go back and re-listen to every clip so that I could be sure the narrative would be relevant to the guest audio.
Experience crippling writer’s block.
Write really bad voiceover, then take weeks to replace it with slightly less bad voiceover.
Finely clip every audio file to get down to the desired run time. This was an extra step, on top of the rough clipping pass.
Record voiceover. Edit the VO in a similarly time-intensive manner, since it often took me multiple takes to get it right. So every voiceover section was done 2-3 times and had to be edited.
Assemble the final show.
I am also embarrassed to say that it wasn’t apparent to me how many of these steps could be streamlined. But eventually I got it.
Even though all of this stuff does require a bit of meticulous attention, it's also pretty formulaic. I was going through the same steps in each episode, just in a really ass-backwards way.
So I started coding up a Podcast Assistant. It uses AI to turn audio clips into transcripts, and then creates audio blocks that can be rearranged by drag-and-drop. It also makes the editing - i.e., deleting the bad stuff - much easier. And because I'm now looking at words instead of listening to audio, it's much faster. When I'm writing, I can just see words on a screen, so I know what each clip is about².
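For anyone curious, the transcription step is less exotic than it sounds. Here's a minimal sketch using the open-source Whisper package (the file names and the block structure are placeholders for illustration, not my actual tool):

```python
import whisper

# Load the open-source Whisper speech-to-text model
# (smaller models are faster, larger ones are more accurate).
model = whisper.load_model("base")

def transcribe_clips(clip_paths):
    """Turn a list of audio clips into text blocks that can be rearranged."""
    blocks = []
    for path in clip_paths:
        result = model.transcribe(path)
        blocks.append({"file": path, "text": result["text"].strip()})
    return blocks

# Example: transcribe a couple of interview clips (placeholder file names).
for block in transcribe_clips(["clip_01.mp3", "clip_02.mp3"]):
    print(block["file"], "->", block["text"][:80])
```

Once every clip is a block of text, rearranging the show order is just moving text around instead of scrubbing through audio.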
I also use AI to help with the writing. It can’t write an episode because it can’t write in my voice. Not even close. But it can help me through the writer’s block. I have it set up to review the interview summary and suggest broad concepts. Most of the suggestions are not good. It reallllly wants me to talk about The Black Swan!
I also have it set up to take what I've written and suggest ways it might be polished. Almost all of the suggestions get discarded because they sound too AI-y. If your goal is a conversational tone and the AI suggests polish that makes it less conversational, that's going in the wrong direction.
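The "writing assistant" part is really just prompting. Something roughly in the spirit of this sketch, using the Anthropic Python SDK since Claude is one of the models I already lean on (the prompt wording, function names, and model name here are illustrative, not my exact setup):

```python
from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def suggest_concepts(interview_summary: str) -> str:
    """Ask the model for broad episode concepts based on an interview summary."""
    prompt = (
        "Here is a summary of a podcast interview:\n\n"
        f"{interview_summary}\n\n"
        "Suggest 5 broad concepts the narration could explore. "
        "Keep each suggestion to one sentence."
    )
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",  # substitute whichever model you use
        max_tokens=500,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text

def suggest_polish(draft_voiceover: str) -> str:
    """Ask for notes on a draft, knowing most of them will be ignored."""
    prompt = (
        "Review this podcast voiceover draft and note sections that could be "
        "tightened or cut. Keep the tone conversational; do not rewrite it:\n\n"
        f"{draft_voiceover}"
    )
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",
        max_tokens=500,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text
```
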
However, I have to say that just getting a note about a possible change is often enough. Part of writing is just realizing: "This section sucks!"
It’s basically like having a writing assistant… and one who doesn’t mind their ideas being ignored.
I programmed the Podcast Assistant in Cursor, which is incredible. I am a terrible programmer. I can’t even write spaghetti code. But Cursor can program well enough to make a functioning tool.
There are also other AI tools I'm using. I use Gemini Deep Research to get… deep research. Then I can import that research into NotebookLM and it will create an audio summary for me. So I just type something I want to learn about into Deep Research, and within an hour I have a 40-minute audio summary. That's basically magic.
I’m also using AI for music. I used to spend lots of time trying to find the right music for episodes. But usually you’re just hoping for something that is slightly percussive, with a decent tempo, to keep things going. Now I can just tell Suno the kind of music I’m looking for. Once you have the prompting down, the time spent becomes trivial.
Actually, that reminds me of another process improvement I am making. I used to choose a different theme song for every episode. I did this in part because it was fun, and in part because it made each episode feel like its own thing. But going forward I am just going to ditch that part to cut down on production time. Now we will have one song. I am calling it “Arbs Never Sleep” - and it was made in Suno. I hope that eventually the sound quality will improve, but I still like it for now.
To summarize, I have been using the following tools to help with the podcast:
Gemini Deep Research
NotebookLM
Cursor / Claude / Gemini for coding
Whisper AI voice recognition model
Suno
To get back to the title of this post, I hope that we stay in the Goldilocks zone for AI, and that it gets better without getting too good. It's not hard to imagine all of the ways that really powerful AI could be used in conflict with humanity³. So for now I just use my tools and hope that the people who expect AGI any minute are wrong.
¹ I produced a song in Suno, then sent it to a violinist and paid them to perform it. I never would have done this before Suno.
² Someone on Twitter pointed out that these features have been available in off-the-shelf software. I think most users would probably be fine with something like Descript, for instance. However, my value calculation is basically: will I find enough points of friction over time that can be removed from the process to justify the time spent coding a custom solution? I'm basically certain that I can. I know this because I've done the same in the past for other processes that I run. I started doing this in 2016, when I created a custom GIS for my business because the off-the-shelf GIS solutions had lots of small points of friction I wanted to remove.
³ Every major global regime **really** wants its own AI precisely because it realizes how dangerous it would be not to have one. It's not hard to get from there to "really bad stuff happening."