Audio Books – How to Get Your Audio Approved by ACX and Amazon: a Complete Step-By-Step Guide

Turtle Beach desk microphone

UPDATE: Thanks to an email from a reader, this page has been revised to note that files should be saved in WAV format while you are working with them, even if your final upload to Audible is MP3.

Hi guys. My audio for Finish the Damn Book! was 99% approved on the first try, and I did it without studio time or soundproofing. I’m going to show you how I did it, and how to get your book released. This is a comprehensive instruction set, so get your notebook out and be ready to jot down some ideas. If you are interested in doing audiobooks and you have your own recording office at home, you might be interested in something similar to Thinktanks for more information on how to soundproof your office.

If you’ve thought about doing an audio book, then this guide is for you. I didn’t pay for studio time, or an expert digital mastering wizard. I did it all myself. I didn’t even soundproof the room, and I’m going to walk you through the process of creating an audio book. It isn’t easy, and it isn’t fast, but it worked for me. The only issue I had initially was the sound levels. Whenever I went to listen back to audio I had recorded, it would play back quietly, even with the sound turned up. So I had a look at installing a new audio driver which happened to do the job. Other than that, everything else went well.

I submitted Finish the Damn Book! to ACX a couple weeks ago, and I was expecting the worst. They review each book individually, and listen to all of the audio for blatant errors and problems. Much to my surprise, the report that came back listed two things. 1. I needed to make sure my name was on the cover logo (this was omitted because I simply captured a square from my cover that had the title on it). 2. ONE of my files had some annoying background noise that I missed. In fact, it was a 60 second snippet marking the first section of the book, so I simply re-recorded it. I’m not going to upload the new files until next week because I’m worried about the audio book releasing before the ebook, which would be weird, and there is no way to control the launch of your audio book on ACX. At least not at the time of this writing. (I called them this morning.)

That said, I have every confidence that Finish the Damn Book! will be released on Audible, and that’s pretty goddamn awesome. I actually have a system in place now (after hours and hours of shuffling around) for getting audio books to work. So, enough of this nonsense, let’s get this party started.

Equipment Needs

The most important piece of equipment you can have is a program called Audacity. Don’t argue with me on this, it’s free, and I’ll be using it for the remainder of this tutorial. Now, I’m running Linux Mint, so I simply opened a console and did a “sudo apt-get install audacity” and my computer took care of the rest. If you are running a Debian-based distro, it should be in your repositories. For the rest of the world that has no idea what I just said, the link above has download files for Mac, Windows, and Linux, so it will run on just about anything. There’s probably a phone app in the works, that wouldn’t surprise me one bit.

I’m going to talk a little bit in this guide about not having studio-grade equipment, but I would say get a microphone. Something capable of directional recording. I’m really happy with my Turtle Beach mic, and it was about $100 at Best Buy. You can get one cheaper on Amazon. In fact, this thing worked magic for me, and I did all of my recording in “blue circle” mode, which is actually a non-directional setting. It has several modes, so I pulled up my audio settings, went to the microphone tab, and swapped modes until I had the lowest amount of ambient noise.

Recording was so easy, in fact, that I started wondering if the mic was even necessary. I recorded a little snippet with my laptop mic, and I can assure you, a good microphone is absolutely necessary. You can get this from Graham Slee HiFi as part of a headset or whatever, but please make sure that you drop a little money, or borrow a mic from someone who has a good one.

Then there’s the pop filter. You probably want one of these. I tried a couple things that didn’t work, because I didn’t want to wait on shipping. A pop filter is basically a fabric circle that goes between your face and the microphone, and reduces sound spikes that can happen when a word ends in ‘p.’ They will probably cut down on clicks and stuff too. I didn’t have one, so I had to move the mic to the side and aim it to where it wasn’t pointed directly at my mouth. That’s a little hack, and it worked okay for me, but I’m not making promises. I was also very conscious of popping noises while recording. I honestly don’t know enough about these to recommend one, but I assume they are mostly the same, so get a cheapie.

With soundproofing, you can get as carried away as you want. They make little boxes of sound proof stuff that you can put around your microphone to cancel out some noise, or you could staple foam all around your recording area, and lock yourself in a little foam pod in the middle of the already soundproof room. Check out Soundproof Panda to find out more ways you can soundproof a room. I’m going to show you how to eliminate a lot of ambient noise, but in general, find the quietest spot you can. I went in my writing lab, shut the door, and put duct tape over the vent. It worked well enough, and the mic didn’t pick up the chirping birds outside, so I may have dodged a bullet there.

Quiet as you can get is what you want.

Set up Audacity

This is something that you don’t want to screw up, so I’ll include screen-shots. Audible is very picky about what they want, and if you don’t give it to them, they will reject your files. (I couldn’t upload my first batch because I exported the files wrong, luckily a minor hiccup). Install Audacity real quick, and then come back.

The first thing that you want to look at is your filters. Make sure that you have them. Mine installed automatically, iirc, with what I needed, but you can download filter packs if any of these are missing. Click on the “Effect” menu item and ensure that you have the following filters: Amplify, Compressor, and Noise Reduction. Those are the big needs. Equalizer, Change Tempo, and Click Removal are nice to have, too, but not absolutely essential, and to keep this guide short, I won’t be covering them in detail.

You probably have these settings without needing to do anything, so we’ll move on (if one is missing, then Google search Audacity plugins). Next go to Edit->Preferences or hit Ctrl-P.

Go to Quality, and set the sample rate to 44100 Hz (44.1 kHz) as shown. This is an Audible requirement, so it’s important.

Select “interface” on the left and check your meter dB range. -60 works because Audible requires a sound floor of -60. -72 might be better, and if I had to do it again, I would definitely pick a lower setting. What this does is change the meter readout so that you can see finer sounds. I’m setting it at -84 for this tutorial, but I used -60 when I recorded my book. If it’s at -60, then your meter should be dead when you aren’t speaking after you finish processing. Even if you have a little ambient noise while recording, you can fix it, but too much will be a problem, so we’ll do a test run before you start recording chapters.

The next thing that you want to check is on the panel itself. We are looking at the input devices. Just above the timer bar, the far right option, is your input device. Make sure that this is your microphone and not your PC onboard mic. You can usually tell the difference in the amount of noise coming through on the dB meter, and as you can see by that green bar on mine, I’m not writing this blog post from a soundproof room.

You can set the other bar to either stereo or mono, both are acceptable, but use only one or the other for all of your tracks. Double check the Project rate in the lower left-hand corner, it should be 44100. That’s it, you’re ready. Let’s record something. Press the red dot. I’m going to record a line from the Hitchhiker’s Guide to the Galaxy. I’m going to stop and record a “room noise” track as well. All you have to do is fast forward to the end of the recording and hit the record button again.
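If you ever want to double-check a saved WAV file outside of Audacity, a few lines of Python can read the header for you. This is only a minimal sketch using Python's standard `wave` module. The names `describe_wav` and `make_silent_wav` are my own inventions for illustration, and the silent in-memory file exists only so the example runs without a real recording.

```python
import io
import wave

def describe_wav(source):
    """Return sample rate, channel count, and duration (seconds) of a WAV file.

    `source` may be a file path or a binary file-like object.
    """
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        frames = wav.getnframes()
    return {"rate": rate, "channels": channels, "seconds": frames / rate}

def make_silent_wav(seconds=1, rate=44100, channels=1):
    """Build a silent 16-bit WAV in memory (hypothetical stand-in for a real file)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * rate * seconds * channels)
    buffer.seek(0)
    return buffer

info = describe_wav(make_silent_wav())
print(info)
```

With a real file you would pass its path instead, for example `describe_wav("01_opening_credits.wav")` (a made-up filename), and confirm that the rate reads 44100 and the channel count is the same for every file in the book.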


This is the scary part, and on a long piece of story you are going to make some mistakes. My recommendation: if you mess up, clap your hands in front of the mic (the clap leaves a tall spike in the waveform that is easy to spot later), take a pause and a breath, and then start again on the last sentence or paragraph. I sometimes stop the recorder and start a new track, and then merge them later. When I do this, I will click and drag over the mistake, and then Ctrl-K or Ctrl-X to remove the unwanted fragment. Then I start again.

Audible wants each labeled section of your book to be a different file, so you will need a different audio file for your title and copyright info, the foreword, each chapter, afterword, bio, and each appendix. Basically, if it’s a headline, or if it has a mention in the table of contents, then it needs to have its own file. I recommend listening to a couple audio books with the hard copy right in front of you. That’s pretty much what I did. Make the intro sound like other audio books, and check out Audible’s guidelines. They are super helpful.

To combine fragments, simply select the tracks you would like to merge (or hit Ctrl-a to grab all of them) and then go to Tracks->Mix and Render. It might take some time to figure out your preferred method of lining things up for editing, so again, do lots of short bits first, and get into a post-processing rhythm. Experiment with moving the microphone around, reading speed, etc, until you are comfortable enough to start logging your book. The tempo feature I mentioned earlier can help if you read a certain section too fast or too slow, but don’t get too carried away with it.

That’s really it. Recording (believe it or not) is the easy part. Now, let’s combine these tracks, and get on with the demonstration…

Generic Editing

Okay, so this is step one of post processing for me. I’m playing back the dead noise section to get an idea of how much noise is floating around me in this living room. My noise floor is hovering around -54 with a spike up to -30 (probably a mouse click). You can play a small section of track by highlighting what you want to hear, and hitting the green play button. If you are listening for noise, it’s good to crank your computer speakers all the way up so you can catch the clicks. There are quite a few visible in the above picture.

I’ve found click removal to be quite effective, but I’m not going to apply that filter now because I want to show you how to remove them manually, just in case, but in general, click removal would have saved me hours of editing time, and it’s a really awesome feature.

So, the first thing that you want to do is get rid of that pesky background noise. Remember, Audible wants a sound floor of -60dB for room noise, so anything over that in white space is bad, okay?
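To put a number on what -60dB means, here is a rough sketch of how a level like that can be computed from raw samples. It assumes the floor is measured as an RMS level relative to full scale on 16-bit audio. The function name `rms_dbfs` and the sample values are made up for illustration, and Audacity's own meters may not do the math in exactly this way.

```python
import math

def rms_dbfs(samples, full_scale=32768):
    """RMS level of integer PCM samples in dB relative to full scale (16-bit by default)."""
    if not samples:
        raise ValueError("need at least one sample")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # pure digital silence has no measurable level
    return 20 * math.log10(math.sqrt(mean_square) / full_scale)

# Hypothetical room noise: samples wobbling at about 33 out of a possible 32768.
quiet_room = [33, -33] * 1000
print(round(rms_dbfs(quiet_room), 1))

# The same wobble ten times larger reads about 20 dB louder.
noisy_room = [330, -330] * 1000
print(round(rms_dbfs(noisy_room), 1))
```

The takeaway is how tiny the numbers are: room noise has to stay at roughly a thousandth of full scale to sit at the -60 line, which is why a single mouse click stands out so clearly.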

First, I need a clean section, so I’ll zoom in (hold Ctrl and wheel-mouse it) and kill some of these clicks in my white-noise section. Highlight a tiny area around the click, and cut it from the recording.

I prefer to use Ctrl-K for this rather than Ctrl-X, because ‘x’ copies whatever you remove to the program’s clipboard. Actually, if you can get a nice long stretch with no clicks, then that is best for noise reduction. 3-5 seconds should do for most ambient noise.

Highlight your room noise, the flattest part of your recording, and then go to Effect->Noise Reduction. You are going to click the little button near the top of the pop-up that says “Get Noise Profile.” Now, prepare yourself, because something weird is going to happen. The pop-up is going to disappear, and it will look like nothing happened. The coders could have done better with this part, but it’s okay. Audacity has captured your room noise selection. Now use Ctrl-A to select the entire track, and go back to the Noise Reduction menu item. Make sure that “Reduce” is checked at the bottom of the popup, and click “Okay.”

The longer your track, the longer this will take, but when it finishes your room noise should be gone. At least, most of it. Now let’s check our sound floor again. The bottom meter is for playback, you can see that this has cut my ambient noise level to around -63 to -66 over the “room noise” section, and that filter is applied across the whole file, so I’m good with Audible at this point, and this wasn’t in a particularly quiet room. (sorry, I didn’t have it playing when I took the screenshot, but there is a blue line on the meter that shows the loudest point in that bit of recording after you play it) This filter will also not get rid of barking dogs, chirping birds, or Amazon deliveries showing up outside, so if you hear annoying noises while recording, it’s best to stop and re-record that section. Actually, I’m pretty sure this has removed some annoying noises from my audio before, but definitely don’t bank on it.

If you are still over -60, then you need to find a quieter room. Don’t confuse that with clicks. Clicks can be killed just like I showed you. Stretch the window as big as it will go, so that you can see them, and zoom in a bit. Clicks in room-noise are pretty easy to find and eliminate, and click removal will get rid of most of them automatically.

Now the Tedious Part

Now that part is done. You need to go through the whole track and remove unwanted breathing noises, and any left-over clicks. Sometimes clicks will happen in your speech, and those are bloody annoying to get rid of. My trick on those is to zoom way in. The sound will look like a sine wave, and what you want to do is either highlight it and apply Effect->Amplify, which I’ll discuss in a second, or remove one whole wave (one crest and one trough) where it looks “scratchy.” Removing talking clicks is tricky and will take some experimentation, but as you get comfortable with the program you can learn some tricks. You can also avoid these pesky clicks in the first place by being properly hydrated while recording. I haven’t found the proper mix yet, but if your tongue is too dry or too wet, then you are going to get strange artifacts in your speech. You also need to take frequent breaks from talking to get your voice back to normal.

As for the breathing parts: in the waveform, these look like fuzzy spots where there should be a straight line. Unfortunately, my sample here was the first time ever that I didn’t get a breathing noise in my file, go figure. Anyway, if you watch the waveform while you listen, you will see them. They are like little caterpillars, and once you get good at spotting them, you can clean up a whole file without even listening to it. I find that comes after about 10 hours of audio editing. This is how I fix them.

I highlight about 0.3 seconds of room noise, and I copy it. Then I go to each breathing bit, and highlight just the breath. I paste my room noise with Ctrl-V, and presto, no more breathing. There is one warning which you should definitely be listening for until you get the hang of this. F’s will make one of these little caterpillar spots before normal speech-looking waves, and S’s will make one after a sentence. I draw out my snake-letters a bit, so sometimes I will cut around half of these little caterpillars the same way, but you need to be very careful with the cut and paste features. They can destroy a track very quickly. Make frequent back-ups until you figure out what you can get away with, and zoom in close enough to see what you are cutting. It’s going to take some time, so don’t rush it.

The reason I said 0.3 seconds, is because most pauses sound best to me at 0.2-0.5 seconds. I will pick a range that sounds good for a comma break, and I will actually use it to space my sentences. One paste for a comma, two for a period. It works pretty well, and it’s all room noise, so it sounds clean. This lets me correct some of my pacing issues while I remove clicks and breathing.
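For the curious, the pause trick is simple arithmetic. A quick sketch, assuming the 44100 project rate used in this guide; the helper name `seconds_to_samples` is just something I made up for the example.

```python
SAMPLE_RATE = 44100  # samples per second, matching the project rate used in this guide

def seconds_to_samples(seconds, rate=SAMPLE_RATE):
    """Number of samples that make up a pause of the given length."""
    return round(seconds * rate)

comma_pause = seconds_to_samples(0.3)   # one paste of room noise
period_pause = 2 * comma_pause          # two pastes, back to back

print(comma_pause, "samples for a comma")
print(period_pause, "samples for a period,", period_pause / SAMPLE_RATE, "seconds")
```

Two pastes of a 0.3 second clip gives a 0.6 second break at the end of a sentence, so if that feels too long for your reading style, copy a slightly shorter clip.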

Also, keep an eye out for particularly noisy speech in the file. If one word is really loud compared to everything around it, highlight the word and select Effect->Amplify. I only use the entry at the very top. I will set it at -1, -2, -3, -4, or -6, depending, and apply the setting. The peaks of the waveform should be somewhat consistent. I don’t worry about the quieter parts, as long as they don’t sound too quiet when I play back the audio, but the loud parts will mess up not only the listening experience, but some of the following steps. Trim them down with Amplify so that you don’t have one or two spikes that are louder than everything else in the file. Fixing these is actually much easier than it sounds. Don’t be afraid to experiment with Amplify. Just do it, and if you don’t like the results, then Ctrl-Z and try again. Save your file before you start messing with it. In fact, save your file as often as you possibly can. I’ve had some crashing problems on the laptop with this program, especially when Facebook is open on my browser.
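If you are wondering what those negative numbers actually do to the waveform, decibels convert to a plain multiplier. This is a small sketch of the standard amplitude formula, not Audacity's code; `db_to_gain` and `amplify` are names I picked for illustration.

```python
def db_to_gain(db):
    """Convert a decibel change into a linear amplitude multiplier."""
    return 10 ** (db / 20)

def amplify(samples, db):
    """Scale every sample by the given number of decibels (negative makes it quieter)."""
    gain = db_to_gain(db)
    return [s * gain for s in samples]

# The settings mentioned above, shown as multipliers on the waveform height.
for db in (-1, -2, -3, -4, -6):
    print(f"{db} dB -> multiply amplitude by {db_to_gain(db):.3f}")
```

The handy rule of thumb that falls out of this: -6 cuts the height of the waveform roughly in half, so the smaller settings are gentle nudges by comparison.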

All Done? You Wish! Post Processing

Listen to the track again. Chances are you missed a breath or a click somewhere. Again, this is a time consuming process, but if you don’t want your audio files rejected, then you need to do it. I spent 30-40 hours editing 2 hours of audio. I could probably do it a lot faster now that I’m learning Audacity a little bit, but it’s a steep learning curve, and there is an art to it, so learn how to paint that audio file.

But the fun isn’t over yet. Just like it takes a lot of work to get camera settings and framing right for a picture, post processing is where the magic happens. Lucky for you, I’m going to make this simple. Select the whole track, and go to Effect->Compressor. This is going to normalize your file, and fix a couple of other minor issues as well. I’m not sure of everything that it’s doing, but it’s audio-book sorcery. The biggest thing to look for here, is the overall shape of your waveform. In general, the same zoom level on each of your files should look more or less the same. This is where those annoying oversize peaks are going to kill you. If you have one spot in the file that is too loud, then compressor will use it as a cut-off, and the rest of that track will be lower volume than the rest of your book. That’s very annoying to readers.
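To take a little of the sorcery out of it, here is the basic idea behind a compressor in general: anything louder than a threshold gets pulled back toward that threshold. This is a deliberately simplified sketch, and it is not how Audacity implements its Compressor effect, which also has timing and gain options I'm ignoring. The function name and the threshold and ratio values are assumptions chosen only for illustration.

```python
def compress_level(level_db, threshold_db=-12.0, ratio=2.0):
    """Output level for a given input level under a simple hard-knee compressor curve."""
    if level_db <= threshold_db:
        return level_db  # at or below the threshold, nothing changes
    # Above the threshold, every `ratio` dB of input yields only 1 dB of output.
    return threshold_db + (level_db - threshold_db) / ratio

for level in (-30, -12, -6, 0):
    print(level, "dB in ->", compress_level(level), "dB out")
```

Quiet passages pass through untouched while loud ones are squeezed, which is why the overall waveform ends up looking more even after the effect runs.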

Also, if your audio file isn’t very loud to begin with, you might want to use Normalize instead of, or at least before, Compressor. Normalize just changes the volume of the whole track by one fixed amount, so that the loudest peak lands at the level you set.

As far as settings, leaving things default works pretty well, most of the time. When you do a Normalize effect though, you want to watch out for your noise peaks. Try to keep them under -3dB. I don’t think Audible will eat your lunch for this, but make sure that setting isn’t too high, like -1dB. The louder the cut-off, the more chance of distortion.
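Here is a rough sketch of what peak normalizing boils down to, assuming floating-point samples between -1 and 1 and a -3dB target. It is an illustration of the idea rather than Audacity's actual code, and `normalize_peak` plus the sample values are made up.

```python
def normalize_peak(samples, target_db=-3.0, full_scale=1.0):
    """Scale samples so the loudest one lands at target_db relative to full scale."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # silence stays silence
    target_amplitude = full_scale * 10 ** (target_db / 20)
    gain = target_amplitude / peak
    return [s * gain for s in samples]

quiet_take = [0.05, -0.20, 0.10, -0.02]      # hypothetical quiet recording
louder = normalize_peak(quiet_take)
print(max(abs(s) for s in louder))           # the new peak

with_spike = quiet_take + [0.90]             # same take plus one loud spike
barely_changed = normalize_peak(with_spike)
print(abs(barely_changed[1]))                # the old peak after normalizing
```

Notice the second case: one stray spike eats up almost all of the available headroom, so the rest of the track hardly gets any louder. That is the practical reason to tame oversized peaks with Amplify before this step.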

Another thing you want to do is trim your opening and closing noise. There’s some wiggle room on this, but I make sure to have 1.5 seconds of room noise at the beginning of each file, and 2.5 seconds at the end. It doesn’t have to be exact, just be consistent. Copy, cut, and paste that clean room noise to fill the gaps, and check it again for tiny pops and clicks, even if you don’t see them on the wave form. Audible will be listening to this white space with bat-like hearing. Trust me. Anything over -60dB, and you are probably going to be informed about it.

Did I miss anything? Probably, but that should be enough to at least turn you into a waveform editing machine for a while. Let’s move on to exporting your audible copies.

Export as MP3

This is important. They won’t accept .wav files. I tried, lol. I’ve updated this section to include this important point. A member from the Audacity forum wrote me about this post with the following:

You can always convert from perfect quality WAV to lesser quality MP3 (or anything else). Once you have MP3 compression distortion, it’s permanent and editing an MP3 in Audacity will create more.

Therefore, the settings below are for the FINAL export to MP3. And in fact, another reader mentioned in the comments that exporting to MP3 on a Windows machine may require an additional plug-in. Here’s the thing. There are a thousand programs out there to rip WAV files to MP3, and WAV files are higher quality (they also take up much more hard drive space). Heeding the suggestion above, it would be prudent to do all of your fiddly work and mastering with WAV files, and remember to save often in case you get sudden glitches or crashes.

This is actually what I did when I mastered FTDB, but I did it with the intention to publish the WAV files. Good to know that I was following a best practice, and it’s really good advice. If you want to put your audio book out in the future to another platform (especially a CD or magnetic tape) then WAV files are going to be what you want for the best possible quality. Ever notice when you rip a CD that the MP3 files are much smaller than the record space on the original purchased CD? Yeah, that’s why. MP3 and MP4 are compressed formats, like the difference between a JPG and a RAW or PNG file for pictures. It looks nice at a glance, but when you go in to start fiddling with individual pixels, you’ll run into problems.

When it is time to do the final MP3 export, these are the settings you want. Check everything on this window, or your audio isn’t going to pass the upload requirements.

I’m recording as stereo, so if you have a mono file, your channel mode will likely look different. Again, more important to be consistent than to be right on that front. But everything else here is chosen for a reason. You need Bit Rate Mode set to Constant and Quality set to 192 kbps. Less and they won’t take it, bigger and your files are going to be unnecessarily huge. I’m not just talking about longer uploads here, I’m also talking longer downloads for your readers. Set it at 192, don’t argue with me! 😛
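To see what the bit rate means for file size, here is some back-of-the-envelope math. It is a sketch that ignores tags and headers, and the WAV figure assumes 16-bit stereo at 44100 Hz, which may not match your own export settings. The function names are mine.

```python
def mp3_megabytes(minutes, kbps=192):
    """Approximate size of a constant-bit-rate MP3, ignoring tags and headers."""
    bytes_per_second = kbps * 1000 / 8
    return bytes_per_second * minutes * 60 / 1_000_000

def wav_megabytes(minutes, rate=44100, channels=2, bytes_per_sample=2):
    """Approximate size of uncompressed PCM audio (16-bit stereo assumed)."""
    return rate * channels * bytes_per_sample * minutes * 60 / 1_000_000

print(round(mp3_megabytes(60), 1), "MB for one hour of 192 kbps MP3")
print(round(wav_megabytes(60), 1), "MB for the same hour as WAV")
print(round(mp3_megabytes(60, kbps=320), 1), "MB for one hour at 320 kbps")
```

Because the bit rate is constant, the MP3 size depends only on running time, which makes it easy to estimate upload sizes for a whole book before you export anything.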

The upload file type needs to be mp3. Double check to make sure that ACX and Audible haven’t updated this requirement. You might also notice that most of my filenames start with a two-digit number. That’s so I can make sure they stay in order. That part isn’t a requirement, but a habit formed in a decade of working as a data-monkey. I use a similar file structure on many of my draft files when I’m writing a book. If you want to adopt that tactic, and you have 80+ chapters, you might want to use 3 digits instead.
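If you would rather generate the numbered names than type them, something like this would do it. It is only a sketch: the section titles are placeholders, `numbered_filenames` is a name I invented, and it only widens to three digits once you actually pass 99 sections, so bump the minimum width yourself if you want to switch earlier.

```python
def numbered_filenames(sections, extension="mp3"):
    """Prefix each section title with a zero-padded index so files sort in book order."""
    width = max(2, len(str(len(sections))))  # two digits minimum, more if needed
    names = []
    for index, title in enumerate(sections, start=1):
        slug = "_".join(title.lower().split())  # "Chapter 1" -> "chapter_1"
        names.append(f"{index:0{width}d}_{slug}.{extension}")
    return names

# Placeholder section list, in the order the sections appear in the book.
sections = ["Opening Credits", "Foreword", "Chapter 1", "Chapter 2", "Closing Credits"]
for name in numbered_filenames(sections):
    print(name)
```

The zero padding is what matters: without it, a file manager will happily sort chapter 10 ahead of chapter 2.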


That’s pretty much it. Well that and hours of editing, recording, and mastering. If you have a fiction book that you want to record, you will also need to be careful about voices for different characters and such. I’m pretty sure I will never do the audio for my own fiction, or anyone else’s, like, ever. Well, maybe a flash piece or something, I dunno. This process was troublesome enough for non-fiction. But I learned a lot doing it and wanted to share. Hopefully this will help some of you looking to produce your own audio book on ACX soon.

I wanted to make this guide, because when I started poking around for information, there were a lot of things that were unclear, and even more that were regurgitated bullshit freelance content from word mills. Hopefully this was informative and clear, and more than enough to help you figure it out. The end-game here is that my system isn’t the best in the world. This should be considered a primer at best, and a minimum viable product. This worked for me to get my book listed on ACX, so I know the process works, but many portions could probably be improved quite a bit. Still, this is something to let you know that you don’t need to dump thousands of dollars in studio time (though if you have it, it would help) or have the greatest equipment on Earth. Audacity is free, it’s powerful, and it has so many features that I haven’t even touched on here. Learning to use it well will make whatever equipment you can scrounge up work 10 times better.

It’s funny saying that, because on the phone with ACX today, the guy actually mentioned that my very low error rate was pretty amazing for a first try, and that he’d spoken with people who did the home studio thing, spent thousands of dollars, and had to resubmit their files over and over and over again, sometimes barely getting them published at all. The thing is, the equipment alone won’t do it, but it’ll definitely help. I blame my mic for the ease with which I got this done, and I swear it’s been specially made to ignore dogs and birds (though I haven’t fully tested this theory). If I had the time, energy and money to soundproof my little office, I’d still have to deal with horn beeps and airplanes flying overhead.

Each piece of the recording process is important. Doing the best you can with each piece will make the experience easier. I definitely recommend looking up some voice advice online about recording audio books. There are some tricks out there that you can do. In my case, I learned the software, and invested in the best mic I could find at local stores, without going to a music shop or a $2000 studio mic, of course. I spent time tinkering with and learning Audacity. I experimented with clipping files until I found some tricks that worked for me. My voice isn’t perfect, the recording isn’t perfect (I’m sure there’s no such thing as a perfect recording), but I just accomplished something that many, many people fail at day after day, and my total budget, aside from the laptop, was about $100 and a book to read.

The Audio version of FTDB will be out sometime after the Ebook releases. I’m not sending in the final revision until next week, so that I don’t risk it being released early. I talked to ACX about that too, and made a suggestion of one possible fix for that dilemma.

Good luck with your stories. Again, I hope this was helpful. If you have any questions let me know. I’ll try my best to answer them.


Author: spottedgeckgo

Writer. Making my living on my pen, and working to turn a raw chunk of land into a future homestead.