Howdie folks, I finally pushed my first audiobook over the finishing line. Now, I’m by no means an expert at any of this stuff, but boy did I learn a lot from the process. I love learning from other people’s mistakes, so I thought I would share 8 lessons I learned from narrating and producing my first audiobook.
If you would like to listen to a copy of my first audiobook, 13 Steps to Evil: How to Craft a Superbad Villain, you can use the links below. Note, the audiobook is wide and you should be able to purchase it from any store you usually use. I’m still trying to figure out how to find the other store links.
ONE – YOU’RE ALWAYS GOING TO BE A NEWB
So, here’s the thing. I have voice acting experience, I used to have an agent and I worked many contracts as a teen using my voice. Plus I’ve been podcasting for two years. So, naively—or perhaps arrogantly—I figured I’d wipe the floor with this “audiobook” exercise. Oh you sweet, sweet, naive fool, Sacha.
It doesn’t matter how long you’re in this game, if you keep doing new things, if you keep trying to grow and develop, if you keep experimenting then you’re going to stumble across things you haven’t done before. You’re going to make mistakes, and believe me, I think I must have made all of them on this journey. But that means you’re going to be taken right back to the beginning again.
I’d almost forgotten what it was like to be a newbie in this world. Which is terrifying because 1. I don’t feel like I’ve been in it that long really—and when did I start feeling so damn old—I’ve only been full-time two years. 2. Shit changes so fast I have whiplash.
I’d forgotten how frustrating it is to be new and doing something for the first time and not having all the answers. I stumbled multiple times, thinking I’d taken something to the finish line only to discover I had eighty-five thousand other tasks that needed completing first.
It was strange, I loved and hated it. But what it did do, was give me a much more significant sense of achievement completing each task than I’d been getting with things I knew the ropes for.
A takeaway for me: remember not to get too comfortable.
TWO – BOOTH BUILDING – WOBBLY WHEELS AND DOGS
We built a booth to record in. So if you’re not able to do that you might not find this lesson as useful. But, my incredibly talented wife built me a 3ft by 3ft… umm… phone booth style box. The base was 3ft square but of course it was taller than that. I’m 5’5” so it’s maybe 6ft tall. It lives in the garage, which is an exterior building with electrics. The original plan was to have it in the house, but after having built it I’m so pleased we situated it outside, not least because it would be a fucking giant eyesore in my office. But also because after all the homeworking, I’d never have gotten the silence and muted sound I needed in the house. Something to consider if you have members of the family at home.
The booth is made out of MDF, drilled together with screws and situated on an MDF base. We cut a hole out of one panel to make a door and hung it back on with hinges. Door hanging was a bit of a saga: we didn’t quite get it right and it’s a bitch to shut, but it functions.
Then my sister in law had the genius idea of using the spare carpet we’d saved from when we had our carpets redone after we bought the house last year. So we lined the interior with the spare carpet, and I’d actually advise everyone to do this if you build a booth because it was a great initial layer. It kept the booth a fraction warmer, and even with just carpet, the sound was muffled.
I then bought three acoustic panels. One for each side (except the door which remained just carpeted). And because I am a whore for branding, I paid extra to have them in purple. I then used acoustic foam to fill in the gaps between carpet segments near the ceiling.
I kitted out the inside of the booth with a mic stand and a sound shield and that’s about it.
Oh, I also used battery operated push LED lights and popped them around the ceiling, but they weren’t bright enough, so I ended up with a light clipped to the mic stand.
Last, in terms of the build, we drilled a small hole in the side of the booth after it was built and we’d decided where it was going to live in the garage so that we could pull USB wires through to connect the mic to the laptop.
The biggest lesson I learned was about wheels. We decided to attach locking wheels to the base of the booth so we could move it and take it with us if we ever moved house, and lock it in place to stop it moving. This was a great idea in theory, but it lifted the booth off the floor, meaning the base platform would creak if any chunky butts stepped on it. And being a chunky butt, this wreaked havoc with the audio. I essentially had to stay absolutely motionless while narrating, which is difficult when you’re aggressive with gestures.
We fixed this by placing wooden planks under the booth that filled the gap and strengthened the base.
THREE – KIT AND CABOODLE
What else do you need? I use:
- A sound shield (AmazonUS, AmazonUK)
- Pop filter / wind shield (AmazonUS, AmazonUK)
- Mic stand (AmazonUS, AmazonUK)
- A Blue Yeti mic and USB wire connecting (AmazonUS, AmazonUK)
- Amadeus Pro (Audacity is free and suitable for Mac or PC)
- Karl Hughes Audio Mastering Services
- WeTransfer for file sending
- Infinite supply of honey, lemon and ginger tea
- Your inner diva
FOUR – MOUTH SOUNDS SUCK
It is astonishing to me how many fucking sounds our mouths make. Clicks and tuts and clucks and claps and sticky wet noises and pops. And that’s the shit you don’t even mean to do. Here’s what I know:
- Stay the motherfuck away from dairy. It makes your mouth clag and the sticky sounds rife
- Eat before you work. I didn’t on many occasions and it led to gurgles and belly rumbles which powerful mics pick up and also meant I tired quicker than on days I ate.
- Having gotten rid of myriad sounds, that nearly drove me to insanity, I will never understand why people like ASMR
- Ginger, honey and lemon can help your throat when you start to get sore
- If possible, work in 1-2 hour sprints. But start smaller than that.
- Put a seat outside your booth or have cushions for when you take breaks
- Stand, always. I recorded a couple of chapters sitting—thinking I’d be able to work for longer—and all it led to were more fucking mouth sounds and an abundance of editing hours slapped on my total. Sitting changes the shape of your airways and the distance between you and the mic. Stand bitches or behold the copious amount of hours you’ll need in post production.
FOUR – IT’S A PERFORMANCE… SO PERFORM
Recording audiobooks is a performance. You get to be a diva, an actress, a fucking daaahling for the day. HAVE FUN. It’s so much fun talking your words and putting all of your sarcasm and expression and nuanced voice into the words you wrote. If you’re narrating you can say the words exactly as you intended them to be said. It’s a lot of fun, but, it’s also tiring.
I found after an hour or two in the booth, I was done in. My throat kinda ached, even though I wasn’t hoarse, and I was tired enough that if the literary Gods had allowed, I’d have napped. Alas the bastards smited me with more work. Such is life.
I definitely feel like I learned a tough lesson with this one. I almost got the performance right, I wasn’t quite as relaxed as I wanted to be and I think the next audiobook will be even better because I’ve gone through this one.
EXAMPLE IN AUDIO SACHA READS TWO VERSIONS OF THE FOLLOWING QUOTE
“What it means, is that when you create your villain, whatever traits you do show, need to be in your face. Like a red-light district’s glowing streets only louder and with big red fire truck sirens that blast ‘come get me, sugar’ in your reader’s face.”
ONE WHERE SHE STRAIGHT READS AND ONE WHERE SHE PERFORMS
FIVE – RECORDING IS FIFTY SHADES OF SAVAGE… BUT I KINDA LIKED IT
Okay, down to business: how do you record?
You and Your Fashion Diva
Wear cotton fabric. Jogging bottoms, loose jumpers. Don’t wear nylon or jeans or anything that rustles when you move. Remove bracelets and any loose jewelry and watches that tick. And whatever you do, don’t drink milky coffee, don’t eat cereal with milk, or porridge or chocolate unless you’re recording ASMR in which case off you pop sunshine and have fun.
Black coffee, water, and ginger, honey and lemon tea.
I ate oat based flapjacks – not the healthiest but if you make your own they’re a darn sight better. If I needed a sip of water, I tended to take it in the booth. I’d leave the recording going, as when you edit you need to listen to the whole thing anyway and it’s very obvious from the waveforms where the sound drops in and out. I left hot drinks out of the booth as it gave me an opportunity to step out and rest for five.
Podcasts have lots of stray breaths in them and we accept that because it’s a conversational medium. There’s laughing and piggy snorts and gasps and deep breaths.
But audiobooks are not like that. Now, there are lots of different methods for dealing with breathing. I listened to the sample of Addie LaRue on Audible, narrated by Julia Whelan, who often breathes on the end of words. I tended to: breathe. Hold and pause for one beat. Narrate. The reason I did that is it put a gap between the breath and the narration, which made cutting it out easier in post.
SACHA READS THE SAME EXAMPLE WITH BREATHING AT THE START AND END
“There’s laughing and piggy snorts and gasps and deep breaths.”
I tried to cut as many breaths out as possible. Most audiobooks don’t have many. Though there are always some. Especially in longer sentences. I actually think I may have cut too many out as some of the sentences may be narrated too quickly without that pause. Something to take on board for the next one.
Editing them out is a techy aspect and something you’ll have to wrangle with your editing software. I use Amadeus Pro, and I literally highlighted what I wanted deleting and hit the delete button. Simple as that.
Speaking / Narrating
One of the requirements Karl had was for me to be quiet for a few seconds at the start and end of each track. This allowed him to gather a baseline “room tone” against which he could match the levels for the rest of the audio.
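For the technically curious, here is a rough sketch of what "measuring room tone" can mean in numbers. It is not Karl's process or anything from Amadeus Pro. The function names (`rms_dbfs`, `room_tone_level`) and the three-second lead-in are made up for illustration, and it assumes the audio has already been loaded as a list of floats between -1.0 and 1.0.

```python
import math


def rms_dbfs(samples):
    """Return the RMS level of float samples (range -1.0 to 1.0) in dBFS."""
    if not samples:
        raise ValueError("need at least one sample")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence has no measurable level
    return 10 * math.log10(mean_square)


def room_tone_level(samples, sample_rate, seconds=3.0):
    """Measure the RMS level of the quiet lead-in at the start of a track.

    `seconds` is a hypothetical default; use however long you stayed silent.
    """
    lead_in = samples[: int(sample_rate * seconds)]
    return rms_dbfs(lead_in)


# Synthetic example: 3 s of very quiet "room" followed by 1 s of louder "speech".
rate = 1000
track = [0.001] * (3 * rate) + [0.25] * rate
print(round(room_tone_level(track, rate), 1))  # -60.0
```

The lower (more negative) that number, the quieter the booth. Comparing it across sessions is one way to spot that something in the set up has changed.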
I’m still not quite sure why speaking is so exhausting but I was genuinely surprised at how exhausted I was after sessions. I suppose because I’m thinking about breathing and speaking and performing and not moving or crinkling fabric and not popping P’s and myriad other things.
Anyhoo, on to the technical stuff:
I try to stand about a foot from the mic. I don’t know if that’s what everyone else does, but because I over enunciate, I have to be a little further back to avoid popping P’s.
SACHA GIVES AUDIO EXAMPLE OF POPPED ‘P’ AND ‘P’ WITHOUT POPPING
“This is popping your P’s and this is not popping your P’s.”
Ps, Ths and Ws are a bitch if you over enunciate. They will pop and fry your audio even with a windshield / pop filter. I got to a point where I almost let go of the end of the word so it didn’t over exert the mic.
I also start with my mouth open. So I breathe, open mouth, pause, narrate. This got rid of clicks and clacks from saliva when you open your mouth to start a word.
Speak using your normal voice, make sure the gain is relatively low. You’d be surprised at how low it should be, I was. I paid for a 1 hour consultation with Karl. He got me to record a few minutes of samples and then told me how to adjust my gain and levels to make sure I was producing the audio quality needed for ACX and Findaway etc. This was invaluable. One of the biggest lessons I learned was that I had the mic facing the wrong way. I didn’t have a clue. I must have been podcasting like that too! It would have screwed the audio completely had I gotten all the way through. That consult was definitely the best money I spent in this process.
If I made mistakes while in the booth, I would usually cuss like a violent sailor—something that made for a delightful outtake reel for my Patrons. And then I’d take a breath and say the offending line again and continue narrating. When I wasn’t turning the air blue, I’d blow raspberries or click my fingers as these all created spikes in the waveforms which were easily identifiable when it came to editing.
If you hear dogs, cars, planes or anything in the background, then your mic can hear it too. The number of times I had to stop for fucking dogs yapping at birds was unreal. The rumble of cars on the street 30m away was also audible. I couldn’t believe how powerful the mic was. It truly was a pain in the ass. I tended to record first thing in the morning straight after the school run as the roads were quieter and all the day job folks had fucked off. Kids were at school and the only people left at home were old codgers and creative randos.
Pace is the most difficult aspect for me. As a podcaster, I’m used to speaking normally, if a little fast, but audiobooks are considerably slower. I’ll be honest and say I still need to work on slowing down. I think if I had slowed down more, I might not have had so many errors to fix in the editing rounds. I don’t have any good advice on this other than to say listen to a lot of audiobooks at normal 1x speed. Buy the book and audiobook combo. Then try to read along (speaking aloud, not reading in your head) at the same time to match their pace.
What I will say is that some parts to me (more so in non-fiction perhaps than fiction) felt like they should be faster if the tone I’d implied was faster. Like a run on sarcastic sentence I would speed up for performance purposes.
One of the biggest lessons for me, though, is speed. I do feel if I’d slowed down—not something that’s a natural inclination for me—I may have made fewer mistakes. So this is one I want to work on for the next book.
I tried to limit to 1-2 hour recording bursts to keep my voice healthy. As I’m not a professional narrator, I haven’t built the vocal muscle to go for eight hours. I would always finish a chapter or summary chapter in the session. So sometimes I finished in less time than others. The reason for that is I wanted the background base audio sound to match for each chapter and also, I hate leaving things unfinished so I wanted to end each session with something complete.
I mixed up the long chapters in with the shorter chapters to keep me motivated. I’m sure some people record front to back, but I don’t even write that way so wasn’t about to record that way. Matching a short with a long chapter helped my motivation.
Given this was nonfiction, and first person nonfiction at that, the only real voice I had to do was mine. That said, when I read titles or subtitles I did kind of drop my tone a little and made it slightly flatter to indicate that it was a title and not normal text.
SACHA READS THIS SECTION TITLE IN FLATTER TITLE TONE AND NORMAL NARRATING TONE
“FIVE – RECORDING IS FIFTY SHADES OF SAVAGE… BUT I KINDA LIKED IT”
My natural tone for narrating my own words is very rollercoaster up and down, using the full vocal range. But I used a slightly less rollercoaster tone when reading out quotes. And where there was a silly voice required or a villain voice I’d just go deep or gravelly.
I am no expert at voices, but there are some resources at the bottom. I particularly recommend Storyteller by Lorelei King and the Rebel Author Podcast episode 104 with narrator Jillian Yetter. They both talk about voices.
SIX – EDITING IS SATAN’S FAVORITE FORM OF TORTURE
There I was smug as a fucking button when I finished recording. Off I fucked back to the computer thinking this was about to be done and dusted mate. But oh, fucking no. The editing took longer than the blasted recording. I had to snip out the repeated lines, the fuck ups, the breathing, swearing, drink slurping, burping, belly rumbling.
After the 87 years I spent editing, I rather rapidly learned what the waveforms of individual sounds looked like. This enabled me to pick them out faster. For example, my breathing tends to look like a flat-ish mound. I usually have a click when I start a sentence with a word beginning with an ‘O’, which shows up as a very small vertical stick shape, like an “I”. Claggy mouth sounds looked like a series of those shapes, followed by a sort of weird trapezium / bent in the middle triangle shape at the start of words, and those needed cutting out too.
For me personally, I edited all the mistakes out in post. If I made an error while recording I blew a raspberry or clicked or shouted “wanker” or some other obscenity to blow the audio and cause a little waveform spike. That helped me identify the mistakes more easily when I was back at the computer.
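I found those spikes by eye in the waveform, but the same idea can be sketched in code. This is purely illustrative and not a feature of Amadeus Pro or Audacity. `find_marker_spikes`, the 0.9 threshold and the one-second gap are all hypothetical, and it assumes the audio is a list of floats between -1.0 and 1.0.

```python
def find_marker_spikes(samples, sample_rate, threshold=0.9, min_gap_seconds=1.0):
    """Return timestamps (in seconds) where a loud marker spike begins.

    A sample counts as loud when its absolute value reaches `threshold`.
    Loud samples within `min_gap_seconds` of the previous loud sample are
    merged into the same marker, so one raspberry gives one timestamp.
    """
    markers = []
    last_loud_time = None
    for index, sample in enumerate(samples):
        if abs(sample) < threshold:
            continue
        time = index / sample_rate
        if last_loud_time is None or time - last_loud_time >= min_gap_seconds:
            markers.append(time)
        last_loud_time = time
    return markers


# Synthetic example: quiet narration with two deliberate "mistake" spikes.
rate = 10
audio = [0.1] * 50
audio[12] = 0.95
audio[13] = -0.97   # same spike, still ringing
audio[40] = 1.0
print(find_marker_spikes(audio, rate))  # [1.2, 4.0]
```

The threshold only works if the marker noise is clearly louder than the narration, which is exactly why shouting an obscenity does the job.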
I would open each file and just start listening from the beginning. When there was a problem, a click clack titty whack sound that shouldn’t be there, I’d pause the audio and edit.
Where there were speech mistakes, I would make a cut in the audio. The fucked up phrase was then separated from the good audio. It basically made my audio files look like zebras. But it made my life a lot easier in the booth when I was editing. The other thing I did was sticky tab and underline the sentence in a paper copy of the book. If you prefer digital you could do the same in an ebook. The sticky tabs meant I could flip straight to the error and hit record.
The first time I went to re-record the errors, I jumped in the booth and just hit record. Recording all the errors back to back in one go.
This was an epic failure. Don’t do this.
The tones of my voice weren’t right and didn’t match how I’d said it previously, and somehow the gain had been shunted up. So the levels were off (though some of that can be fixed in mastering) but the mismatched voice was not fixable.
Cue attempt two.
I opened one chapter, listened to the audio right before the mistake, listened to the mistake itself so I knew what was wrong, and then, when I was satisfied with what I needed to do and knew the tone of voice needed, I hit record, jumped in the booth and re-recorded the error. I made sure to say the phrase 2-3 times just in case the first attempt wasn’t good enough (and often it wasn’t). Then I stopped, jumped out of the booth and listened to what I’d just recorded. I’d cut out the ones I didn’t like and stitch the good take back into the good audio. When there were no more zebra stripes left, the file was finished in edits.
No matter how optimistic you are (and by this point I was borderline hysterical wanting the sodding thing done), your proofer is going to pick up errors. I had proofed as I’d done all the corrections, listening from the start of the track to the end. But I was still surprised how many errors there were. Mostly I’d left repeated phrases in the audio from corrections. But there were other mistakes: most often she would pick up where I’d said something not quite exactly as I’d written it in the book.
You have a choice – go back and re-record, or change the ebook. Given my levels of hysteria, all bar three of those errors resulted in changing the ebook. I could not be fucked and unless it changed the meaning of the sentence, I just changed the ebook. Why do you do this? So that the book syncs with Whispersync. I’ve been told that some people purposely try to stop Whispersync because you earn less per purchase. But honestly, I don’t have enough experience to comment on that.
SEVEN – UPLOADING AND PUBLISHING
This was a world of fun. Not.
Thankfully, I had Karl mastering so he had sorted all the levels and requirements for ACX and the other stores. I had to weigh up the best use of my time and I couldn’t be arsed learning any more technical stuff.
I did have to fill out all the W8 tax forms again. Always a source of anxiety for me. And I have to say, I found the whole uploading a bit stressful. It reminded me very much of uploading my first book. Most of the back end is more or less identical to an ebook back end. But it’s the whole journey of navigating a new thing that just caused anxiety. So I definitely have a new found empathy for first time writers again.
In terms of distributors, I chose ACX, went direct to Kobo and then used Findaway Voices for everything else. Not everyone will want multiple dashboards, but I use multiple dashboards as a wide author.
EIGHT – GREEN EYED DOUGH MONSTER
Cost of Production
The booth build came in at around £250-300. The most expensive parts were the sheets of MDF and the acoustic panels. The panels were around £100, including VAT and shipping. The MDF was a little over £100. And then I already had the carpet scraps and the foam. When I threw in screws, wheels and odd bits, the total totted up to around £250-300.
My microphone was £130 and I paid around £50 ish for the audio software, although you don’t have to do that as there’s free software like Audacity.
I also paid around £100 for consults and mastering.
I’m sure I’m forgetting things, but for the audio set up, it came in under £500. That isn’t counting the endless hours I spent doing it. But I won’t have the set up costs to pay for again next time.
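If you like seeing the sums laid out, here is a quick tally of the figures above. It rests on one assumption: that "the audio set up" means the booth, mic and software only, with consults and mastering counted separately. The stand, sound shield and pop filter aren't priced in this post, so they are left out.

```python
# Figures from this post, in GBP, as (low, high) ranges.
setup_costs = {
    "booth build": (250, 300),
    "microphone": (130, 130),
    "audio software": (50, 50),
}
# Assumption: consults and mastering are services, not part of the "set up".
service_costs = {
    "consults and mastering": (100, 100),
}


def total_range(costs):
    """Add up the low and high ends of each (low, high) cost range."""
    low = sum(lo for lo, _ in costs.values())
    high = sum(hi for _, hi in costs.values())
    return low, high


setup_total = total_range(setup_costs)
overall_total = total_range({**setup_costs, **service_costs})

print(setup_total)    # (430, 480)
print(overall_total)  # (530, 580)
```

On that reading the one-off set up stays under £500, and the services take the first book to somewhere between £530 and £580.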
I struggled with pricing. While Findaway does give pricing suggestions, I felt they were more representative of fiction pricing. And also not so representative of what I found on the Audible store. In the end, I asked friends and went to the US Audible store and checked what audiobooks in my genre with a similar running time were priced at. Then I matched their pricing.
This was harder for library pricing (an option on Findaway; you don’t get distribution to libraries if you’re exclusive to Audible). I did try to google and looked in a few books but came up with no resources. I followed a similar model to ebook library pricing, which was 2-4 times the price of one copy. I plumped for double because I’d rather the price be slightly lower for my first audiobook to encourage library borrows than worry about smashing bank.
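As a tiny worked example of that 2-4 times rule: the retail price below is invented for illustration (it is not the actual price of my audiobook), and `library_price` is just a hypothetical helper, not anything Findaway provides.

```python
def library_price(retail_price, multiplier=2):
    """Return a library price using the 2-4x rule of thumb described above."""
    if not 2 <= multiplier <= 4:
        raise ValueError("multiplier should be between 2 and 4")
    return round(retail_price * multiplier, 2)


example_retail = 10.00  # made-up retail price, for illustration only
print(library_price(example_retail))                 # 20.0 (double, the option I chose)
print(library_price(example_retail, multiplier=4))   # 40.0 (top of the range)
```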
There’s one other distribution mechanism I want to do, which is selling direct. I recently purchased Joanna Penn and Mark Lefebvre’s The Relaxed Author in audio direct from her website and was surprised at how smooth the interface was.
To do this, it will cost an additional $20 a month using Bookfunnel. So I’m still trying to understand whether to do this now or wait until I’ve got a second audiobook out and it’s more likely that the $20 will be covered. Once you start these subscriptions it’s hard to undo them because you establish a precedent that your audience can buy from certain places—another reason to be cautious when hopping from wide to exclusive and back again.
WAS IT WORTH IT?
Until I’ve had income come in, I can’t really tell you whether this was financially worth it, but I will update on that once it’s been out for a while, or perhaps after I’ve published another one or two and I’m able to do discounts and things, then I can really get a lay of the land financially.
What I can tell you is that it was invaluable to learn a new skill. If you like learning, I highly recommend doing it even if you do it once. I also loved the performative aspect in the booth. It was fun and playful and I hadn’t felt that buzz of getting to play while working for a while.
Would I do it again for nonfiction? Absolutely. Would I record fiction? Not sure. I think I need to practice voices more, but as yet, I’m not sure I have the confidence to do that, and not sure whether the amount of time I’d have to expend would be worth it. If I get to this point, I might look to employing an editor so I only have to be in the booth as I think my time might be better spent wording or being a voice than the countless hours I spent in post.
It probably, in all honesty, took me far longer than it should have to produce, edit and upload the audiobook. I hate to admit that because I really thought it would be faster. Lots of estimates suggest 3 hours for every 1 hour of finished audio. Given 13 Steps to Evil is 4.5 hours, that should have been around 13.5 hours. Maybe 1-2 working days.
There’s no fucking way it only took me that long. Granted this was the first time I’d done it and I made a bazillion mistakes. But I know I spent two full working days editing on-screen, because I did it in one big chunk. Plus there was a smattering of hours around that for on-screen editing. Recording I don’t even know. I did it in such small chunks and didn’t record the times so I don’t know. Re-recording edits took probably one and a half full working days. I did a big chunk of four hours on the last day and spent a week doing 2 hours every morning. So that sounds about right.
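To put rough numbers on that gap, here is the rule of thumb next to the hours I can actually account for. Two things are assumptions rather than facts from my records: an 8 hour working day, and "a week" of mornings meaning five. Recording time is left out because I never logged it.

```python
FINISHED_HOURS = 4.5        # running time of 13 Steps to Evil
RULE_OF_THUMB_RATIO = 3     # 3 hours of work per 1 hour of finished audio
WORKING_DAY_HOURS = 8       # assumption, not from my records
MORNINGS_IN_WEEK = 5        # assumption: "a week" of mornings means five

# What the rule of thumb predicts for the whole job.
estimated_hours = FINISHED_HOURS * RULE_OF_THUMB_RATIO

# What I can account for: two full days of on-screen editing, plus
# re-recording (one 4 hour chunk and 2 hours each morning for a week).
editing_hours = 2 * WORKING_DAY_HOURS
rerecording_hours = 4 + 2 * MORNINGS_IN_WEEK
known_hours = editing_hours + rerecording_hours

print(estimated_hours)  # 13.5
print(known_hours)      # 30
```

So even before counting the original recording sessions or the odd extra hours of editing, the known total is already more than double the estimate.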
This is not a small endeavor to pursue. Perhaps for more efficient and effective people it might be fast. But I think I’ll also be much faster next time, so that’s some reassurance at least. Audio is sticking around and growing year on year, so I do think the investment is worth it.
BONUS – STRATEGIC QUESTIONS TO ASK YOURSELF
Going all in to narrate and produce your own audiobook is full on. So here are some of the strategic things I weighed up before committing to this process:
- Which are my best selling books? I started with villains. Villains and Prose are evenly matched but villains was shorter and felt like the easier option to start with. Prose will be next. I started with the best selling books because they are the most likely ones to sell in audio.
- Did I have a history of selling that book well and consistently? Yes, therefore the time and money investment would likely be returned on audio too. Heroes is my worst selling nonfiction book and so that will be the last audiobook I do. If I’m going to invest the time in narrating I would like to see a return on that investment
- Do I have some level of technical capability? You don’t need much. If you can publish a book and work out all the dashboards and fire fight your way through formatting then you can edit and produce an audiobook.
- Am I interested in doing more than one audiobook? You may only want to do one, and that’s cool. I was interested in narrating all my nonfiction and I was happy to give up the time to do that. So setting up properly and making a plethora of fuck ups on the first one felt like the right thing to do because even if the first audiobook didn’t recoup its money, the second or third might—much like publishing books.
- Do you like performing or being a little bit silly? This is important because narrating is hard work, and if you’re not interested in that performative aspect it might not be for you, and that’s okay, your time is better spent elsewhere.
- Do you have an established audio audience? Now, not everyone will and this isn’t a do or die question. But as a podcaster, you can safely assume that those that listen to podcasts in audio are also likely to listen to audiobooks, therefore you have a head start with audience building.
- Are there any parts of this process you can outsource? Obviously if you’re narrating you can’t get rid of that, but can you have someone else do the mastering? Could you pay someone to edit for you? Would an upfront consult to help you establish the correct set up be worthwhile?
That’s it. I hope you found this a useful bonus episode, and if you did, please do share it with friends or other creatives who might find it useful.
If you’re interested in listening to the finished audiobook, then you can purchase 13 Steps to Evil: How to Craft a Superbad Villain from all the usual places or request a copy from your library.
FURTHER RESOURCES FOR YOU
Narrated by the Author: How to Produce an Audiobook on a Budget by Renee Conoulty
Audio for Authors by Joanna Penn
Storyteller: How to be an Audiobook Narrator by Lorelei King
Writing for Audiobooks: Audio-First for Flow and Impact: Author Advice from Radio Writing by Jules Horne