Audiobook Recording and Technical Specifications for Self-Narrating Authors

Self-narrating your audiobook requires a clean recording environment, an appropriate microphone, basic DAW skills, and files that meet ACX and Findaway's strict technical specifications. This guide covers every step from setting up your recording space through mastering and packaging your final files for delivery.

Updated on June 18, 2026 by Randall Wood

Self-narrating your audiobook eliminates narrator cost and gives you complete creative control over the reading experience. It also requires skills — recording, editing, and mastering audio — that most authors are not starting with. The learning curve is real, and the technical specifications for audiobook delivery are strict.

This guide covers the complete self-narration workflow: setting up a recording environment, selecting and using a microphone, recording and editing in a DAW (Digital Audio Workstation), mastering your files to meet ACX and Findaway Voices specifications, and packaging your final files for delivery.

This guide covers the technical production side of audiobook creation. For manuscript preparation, pronunciation guides, front and back matter decisions, and reviewing finished audio, see HT27 (How to Format Your Manuscript for Audiobook Production).

Is Self-Narration Right for You?

Self-narration is the right choice if you:

  • Have a voice that naturally suits your book's genre and tone

  • Are willing to invest time in learning recording basics — typically 3–5 hours of recording/editing per finished hour of audio

  • Have a quiet recording space or can create one

  • Write nonfiction, memoir, or narrative nonfiction where author voice is part of the product

Self-narration is harder to pull off for:

  • Fiction with large character casts requiring significant voice differentiation

  • Authors whose natural reading voice does not match their book's genre energy

  • Authors who produce at high volume where the time investment becomes prohibitive

If self-narration isn't right for your specific project, the HT16 (ACX) and HT17 (Findaway Voices) guides cover working with professional narrators. Both paths are legitimate — the goal is the best audiobook for your readers, not any particular production method.

The Technical Requirements You're Working Toward

Before setting up equipment, understand the specifications your finished files must meet. These are non-negotiable — ACX and Findaway Voices both run automated QC and will reject files that don't comply.

| Spec | Requirement | Notes |
|---|---|---|
| Format | MP3 | Required for both ACX and Findaway delivery |
| Bitrate | 192 kbps minimum, CBR | Constant bitrate, not variable bitrate (VBR) |
| Sample rate | 44.1 kHz | |
| Channels | Mono | Stereo accepted but mono is standard for narration |
| RMS (loudness) | -18 dB RMS average target | Some platforms accept -23 to -18 dB range |
| Peak level | -3 dB maximum | No clipping; hard ceiling |
| Noise floor | -60 dB or lower | Background noise measured in silence between words |
| Opening/closing silence | 0.5–1 second room tone | Each file starts and ends with recorded silence |
| File naming | Sequential, consistent convention | e.g., TitleAuthor_Part01.mp3 |


These specs are your mastering targets. Every decision in your recording and post-production workflow is aimed at producing files that hit these numbers. Keep them visible while you work.
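
As a rough illustration, the targets listed above can be written down as constants and compared against the readings your DAW or metering plugin gives you. This is a minimal sketch in Python, not an official ACX or Findaway tool; the names `SPEC` and `check_levels` are hypothetical, and the measured values are assumed to come from your own metering.

```python
# Hypothetical helper: compare measured levels (in dB) against the targets
# listed above. Names and structure are illustrative only.
SPEC = {
    "rms_min_db": -23.0,          # quiet end of the range some platforms accept
    "rms_max_db": -18.0,          # loud end of the range (the stated target)
    "peak_max_db": -3.0,          # hard ceiling
    "noise_floor_max_db": -60.0,  # measured in silence between words
}


def check_levels(rms_db, peak_db, noise_floor_db, spec=SPEC):
    """Return a pass/fail boolean for each of the three level metrics."""
    return {
        "rms": spec["rms_min_db"] <= rms_db <= spec["rms_max_db"],
        "peak": peak_db <= spec["peak_max_db"],
        "noise_floor": noise_floor_db <= spec["noise_floor_max_db"],
    }


# Example readings typed in by hand from a meter.
print(check_levels(rms_db=-20.1, peak_db=-3.5, noise_floor_db=-63.0))
```

A file is only ready for export when all three entries come back `True`.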

Step 1: Your Recording Environment

The recording environment is the most important variable in home audio production. A mediocre microphone in an excellent room produces better results than an excellent microphone in a bad room. Address your space before you buy equipment.

What Makes a Room Bad for Recording

Hard, parallel surfaces (walls, floors, ceilings) create room echo — the reverberant sound that makes a recording sound like it was made in a bathroom. Open rooms with lots of glass and hardwood sound terrible. The problem is reflections: your voice hits the walls and bounces back to the microphone, creating a smeared, echoey sound that is not correctable in post-production.

Room echo is the primary reason home recordings fail ACX's quality review — not microphone quality or technical issues.

Creating a Treated Recording Space

You do not need a professional recording booth. You need a space with enough soft, sound-absorbing material to prevent reflections from reaching your microphone:

  • Closet full of hanging clothes — genuinely one of the best home recording environments. Clothes absorb reflections on all sides. Record inside the closet with the door closed or mostly closed

  • Tent or blanket fort method: build a small enclosure around yourself and your microphone using moving blankets or thick comforters. It looks like overkill, but it is acoustically very effective

  • Mattress recording — place a mattress vertically behind you, hang blankets to the sides, and record in the enclosed space

  • Treated corner — cover the corner behind your recording position with moving blankets or acoustic foam panels; record facing the treated corner

Test your space before recording: clap sharply once and listen. In a bad room, you'll hear a metallic, echoey ring after the clap. In a well-treated space, the clap sounds dull and dead — that's what you want.

⚠ Room echo cannot be reliably removed in post-production. Plugins that attempt to remove reverb damage audio quality significantly. Treat your recording space before you record. If your first recording test sounds echoey, fix the environment — not the recording.

Background Noise

The noise floor specification (-60 dB or lower) measures background noise during silence between words. Sources of background noise to eliminate:

  • HVAC and air conditioning: turn the system off for the full length of each recording session

  • Refrigerator hum — move away from kitchen; close intervening doors

  • Computer fan noise — position your computer outside the recording space if possible; use a laptop if it runs quieter than your desktop

  • Street noise — record during quiet hours; use your closet or treated space which attenuates external sound

  • Phone notifications — silence all devices

  • Pets — record when they are settled; have them in another room

After setting up your space, record 30 seconds of silence and measure the noise floor in your DAW before beginning any narration. If the noise floor exceeds -60 dB, identify and address the source.
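
Your DAW's meter does this measurement for you, but it may help to see what the number means. The sketch below computes the RMS level of a block of samples in dB relative to full scale; it assumes the samples are floats between -1.0 and 1.0, and the function name `rms_dbfs` is made up for this example.

```python
import math


def rms_dbfs(samples):
    """RMS level in dBFS for float samples in the range -1.0 to 1.0."""
    if not samples:
        raise ValueError("no samples to measure")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # pure digital silence
    # 10*log10 of the mean square equals 20*log10 of the RMS amplitude.
    return 10 * math.log10(mean_square)


# Simulated room tone: a tiny alternating signal standing in for hiss.
room_tone = [0.0005, -0.0005] * 1000
level = rms_dbfs(room_tone)
print(f"Noise floor: {level:.1f} dBFS, passes -60 dB target: {level <= -60.0}")
```

A real recording of silence will give a less tidy figure than this simulated signal, but the pass condition is the same: the measured level should sit at or below -60 dB.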

Step 2: Microphone Selection

Authors self-narrating on a limited budget often overthink microphone selection. For home recording, a quality mid-range microphone in a treated space consistently outperforms a premium microphone in an untreated space. Solve the room first.

USB Microphones (Recommended for Most Authors)

USB microphones connect directly to your computer without additional equipment. They are the simplest setup for authors who are not planning a long-term audio production operation:

| Microphone | Approx. price | Notes |
|---|---|---|
| Audio-Technica AT2020USB+ | ~$150 | Consistently recommended for voice recording |
| Blue Yeti X | ~$150 | Popular; USB; multiple pickup patterns |
| Rode NT-USB Mini | ~$99 | Compact; good frequency response for voice |
| Shure MV7 | ~$250 | USB + XLR; USB for simplicity now; XLR for upgrade path |


XLR Microphones (For Authors Committed to Audio Production)

XLR microphones require an audio interface (a small hardware device that connects the microphone to your computer via USB and handles the analog-to-digital conversion). The additional equipment adds cost but provides better audio quality and more upgrade flexibility:

| Microphone | Approx. price | Notes |
|---|---|---|
| Audio-Technica AT2020 (XLR) | ~$100 | Plus audio interface: Focusrite Scarlett Solo (~$120) |
| Rode NT1 | ~$250 | Low noise floor; excellent for voice recording |
| Shure SM7B | ~$400 | Industry standard for podcasting and narration |


For authors narrating one book to test self-narration, a USB microphone is the right choice. For authors planning to narrate multiple books, the XLR path with an audio interface produces better results and provides a cleaner upgrade path.

Microphone Technique

How you position yourself relative to the microphone matters as much as which microphone you use:

  • Position the microphone 6–10 inches from your mouth at approximately mouth level or slightly above — pointing slightly down toward your mouth prevents plosives (explosive P and B sounds) from hitting the capsule directly

  • Use a pop filter (a mesh screen between your mouth and the microphone) to reduce plosives — $15–20 at most audio retailers, or make one from a nylon stocking stretched over a wire hanger frame

  • Never position the microphone directly in front of your lips — offset it slightly to one side or angle it to reduce P and B plosive impact

  • Record your first chapter with the microphone at a test position, then listen back and adjust before continuing

Step 3: Digital Audio Workstations (DAW)

A DAW is the software where you record, edit, and process your audio. You don't need the most powerful option — you need one you can learn quickly.

DAW Options for Self-Narrating Authors

| DAW | Cost | Notes |
|---|---|---|
| Audacity | Free, open-source | Widely used; functional for basic narration; steeper learning curve than paid options |
| GarageBand | Free (Mac only) | Excellent for beginners; clean interface; professional results |
| Reaper | $60 (discounted license) | Affordable; highly capable; widely used for audiobook production |
| Adobe Audition | ~$55/month (Creative Cloud) | Professional; steep cost unless already on CC |
| Logic Pro | $199 one-time (Mac only) | Professional; excellent tools; high upfront cost |


Start with Audacity (free) or GarageBand (free on Mac) before investing in paid software. The skills transfer between DAWs — learn to record, edit, and process audio in one, and you can use the same workflow in a more powerful DAW later if needed.

Step 4: Recording Your Audiobook

Recording Setup in Your DAW

  • Set your recording sample rate to 44.1 kHz and bit depth to 24-bit (you'll export to MP3 for delivery, but recording at higher resolution gives you more headroom while editing and processing)

  • Set your input gain so your voice peaks in the -12 to -6 dB range during recording — never allow the recording level to touch 0 dB (clipping)

  • Check your noise floor in 30 seconds of recorded silence before beginning narration — verify it's well below -60 dB

  • Create a new audio track for each chapter file
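
The input-gain guidance above can be expressed as a simple check on a test take. The sketch below is only an illustration under stated assumptions: samples are floats between -1.0 and 1.0, the list is not empty, and the function names `peak_dbfs` and `classify_take` are hypothetical.

```python
import math


def peak_dbfs(samples):
    """Loudest sample in dBFS for float samples in the range -1.0 to 1.0."""
    peak = max(abs(s) for s in samples)
    return float("-inf") if peak == 0 else 20 * math.log10(peak)


def classify_take(samples, low_db=-12.0, high_db=-6.0):
    """Label a test take against the -12 to -6 dB recording window."""
    peak = peak_dbfs(samples)
    if peak >= 0.0:
        return "clipped"    # touched full scale: lower the input gain and redo
    if peak > high_db:
        return "too hot"    # little headroom left for louder passages
    if peak < low_db:
        return "too quiet"  # raising the level later also raises the noise
    return "ok"


print(classify_take([0.35, -0.2, 0.1]))
```

Run a check like this on a short test passage read at your normal narration volume, then adjust the gain knob and repeat until the take lands in the window.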

Recording Technique

  • Read from your audio script — not from your printed book. Have the audio script visible (second monitor, tablet, or printed) with scene breaks and front matter decisions already marked

  • Do not try to record a full chapter in one pass without errors. Record in natural sections — complete thoughts, paragraphs, or scenes — and re-record any stumbles immediately

  • For stumbles: pause, take a breath, skip back a sentence, and re-read from a clean start point. Leave the mistake in the recording — you'll edit it out in post-production

  • Maintain consistent distance from the microphone throughout — backing away changes your tone quality

  • Drink room-temperature water during sessions — cold water and dairy products affect vocal quality

  • Limit recording sessions to 90 minutes maximum before taking a break. Vocal fatigue affects performance quality before you consciously notice it

Step 5: Editing Your Recordings

Editing removes mistakes, stumbles, long pauses, and background noise intrusions. This is where the difference between a rough recording and a professional-quality chapter file is made.

Basic Editing Workflow

  • Listen to the chapter in full with your script alongside — follow along and mark any sections with issues

  • Cut any false starts and re-reads, leaving only the best take of each passage

  • Remove long pauses (more than 0.5–1 second of silence where there should be none) by selecting and deleting the excess silence

  • Cut any significant background noise intrusions — a car horn, a dog bark, an AC unit cycling on mid-sentence

  • Do not remove all breathing — some breath sounds are natural and humanizing. Remove only heavy, distracting breaths before a sentence begins
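
Finding over-long pauses by ear is slow, so a scan like the following can help you build a list of places to check. It is a deliberately simplified sketch: it treats any sample below a fixed amplitude as silence, whereas real tools usually measure short windows of audio. The function name `find_long_pauses` and its default values are assumptions for this example.

```python
def find_long_pauses(samples, sample_rate, threshold=0.001, max_pause_s=1.0):
    """Return (start_s, end_s) pairs for silent runs longer than max_pause_s.

    A sample counts as silent when its absolute value is below `threshold`
    (an amplitude of 0.001 corresponds to -60 dBFS).
    """
    pauses = []
    run_start = None
    for i, s in enumerate(samples):
        if abs(s) < threshold:
            if run_start is None:
                run_start = i  # a silent run begins here
        elif run_start is not None:
            if (i - run_start) / sample_rate > max_pause_s:
                pauses.append((run_start / sample_rate, i / sample_rate))
            run_start = None
    # Handle a silent run that lasts until the end of the audio.
    if run_start is not None:
        if (len(samples) - run_start) / sample_rate > max_pause_s:
            pauses.append((run_start / sample_rate, len(samples) / sample_rate))
    return pauses


# Toy data at 10 samples per second: 1 s speech, 2 s silence, 1 s speech.
audio = [0.5] * 10 + [0.0] * 20 + [0.5] * 10
print(find_long_pauses(audio, sample_rate=10))
```

Treat the output as a list of candidates rather than a cut list: a long pause may be a deliberate scene break, and intentional room tone at the end of a file that runs past the limit will be flagged as well.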

Noise Reduction

If your recording has consistent low-level background noise (HVAC hum, computer fan), apply noise reduction using your DAW's noise gate or noise reduction plugin:

  • In Audacity: select 0.5 seconds of pure background noise (recorded before your narration starts), run Effect > Noise Reduction > Get Noise Profile, then select the full track and apply Noise Reduction at a conservative setting (8–12 dB reduction)

  • In Reaper: use the ReaFIR plugin in Subtract mode, which follows a similar profile-capture workflow

⚠ Apply noise reduction conservatively. Aggressive noise reduction (excessive dB reduction) creates a metallic, artifact-laden sound that is immediately obvious to trained listeners. A slight residual noise floor is preferable to an over-processed recording. If your noise floor is consistently above -60 dB even after conservative noise reduction, the solution is a better recording environment — not more aggressive post-processing.

Step 6: Mastering to Specification

Mastering is the final processing step that brings your audio to the required technical specifications. The three key targets: RMS loudness (-18 dB), peak (-3 dB maximum), and noise floor (-60 dB or lower).

Step 6A: Normalize Peak Level

Peak normalization sets your loudest moment to a specified level. In your DAW, apply peak normalization to -3.0 dB. This ensures no audio exceeds the -3 dB ceiling required by ACX and Findaway.
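
Under the hood, peak normalization is a single multiplication. The sketch below shows the arithmetic, assuming float samples between -1.0 and 1.0; `peak_normalize` is a hypothetical name, and your DAW's built-in normalize effect is what you would use in practice.

```python
def peak_normalize(samples, target_db=-3.0):
    """Scale samples so the loudest one sits exactly at target_db (dBFS)."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # silence: nothing to scale
    target_linear = 10 ** (target_db / 20)  # -3 dB is roughly 0.708 of full scale
    gain = target_linear / peak
    return [s * gain for s in samples]


normalized = peak_normalize([0.1, -0.5, 0.25])
print(max(abs(s) for s in normalized))
```

Because every sample is multiplied by the same factor, the quiet parts move by exactly as many decibels as the loud parts. That is why normalization alone cannot fix a file whose average level is out of range.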

Step 6B: Check and Adjust RMS Loudness

RMS (Root Mean Square) measures average loudness — the perceived volume of your narration. After peak normalization, check your RMS level:

  • In Audacity: Analyze > Contrast — measures RMS of selected audio

  • In Reaper: the SWS Loudness meter reads RMS

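  • Target: -18 dB RMS, which is the loud end of the -23 to -18 dB range noted in the specifications. Landing a decibel or two below -18 leaves a safety margin

If your RMS is too low (quieter than the target), your narration is too quiet. Apply amplification (Effect > Amplify in Audacity) to bring the RMS up, then re-check your peak. Amplification raises the peak and the RMS by the same number of decibels, so if the peak now exceeds -3 dB, use a limiter or light compression to bring down the loudest moments.

If your RMS is too high (louder than -18 dB), your narration is too loud on average. Lower the overall level (a negative Amplify value in Audacity) until the RMS falls within range. The peak drops by the same amount, which keeps it safely under the -3 dB ceiling.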
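
A plain gain change moves RMS and peak by the same number of decibels, so it is worth working out in advance whether hitting the RMS target will push the peak over the ceiling. The sketch below does that arithmetic. It assumes non-silent float samples between -1.0 and 1.0, the name `plan_rms_gain` is hypothetical, and the default target of -20 dB is an assumption chosen to sit inside the -23 to -18 dB range.

```python
import math


def rms_db(samples):
    """Average (RMS) level in dBFS; assumes the audio is not pure silence."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square)


def peak_db(samples):
    """Loudest sample in dBFS; assumes the audio is not pure silence."""
    return 20 * math.log10(max(abs(s) for s in samples))


def plan_rms_gain(samples, target_rms_db=-20.0, peak_ceiling_db=-3.0):
    """Return (gain_db, resulting_peak_db, needs_limiting) for a gain change."""
    gain_db = target_rms_db - rms_db(samples)
    resulting_peak_db = peak_db(samples) + gain_db  # peak moves with the gain
    needs_limiting = resulting_peak_db > peak_ceiling_db
    return gain_db, resulting_peak_db, needs_limiting


# A mostly quiet take with one loud spike, e.g. a cough or a shouted line.
spiky_take = [1.0] + [0.01] * 999
print(plan_rms_gain(spiky_take))
```

When `needs_limiting` comes back `True`, gain alone cannot satisfy both targets. The loudest moments have to be reduced first (by editing, compression, or a limiter) before the overall level is raised.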

ACX Check Plugin (Audacity)

The ACX Check plugin is a free Audacity add-on that checks your audio against ACX's technical specifications in one operation — measuring RMS, peak, and noise floor simultaneously and telling you pass or fail for each. Install it from the Audacity plugin repository:

  • Download ACX Check from the Audacity forum or plugin sites

  • Install following Audacity's plugin installation instructions

  • After editing and mastering, run ACX Check on your complete chapter file

  • Fix any failing measurements before export

ACX Check is the most practical QC tool available for self-narrating authors using Audacity. Run it on every file before export.

Step 7: Export to MP3

After mastering, export your chapter file as MP3.

| Setting | Value | Notes |
|---|---|---|
| Format | MP3 | |
| Bitrate | 192 kbps CBR | Constant bitrate; set this explicitly, not 'auto' or VBR |
| Sample rate | 44.1 kHz | |
| Channels | Mono (stereo also accepted) | Mono reduces file size with no quality loss for voice |
| Naming | Sequential naming convention | e.g., TitleAuthor_Part01_OpeningCredits.mp3 |


⚠ Verify CBR (constant bitrate) is selected at export — not VBR (variable bitrate). VBR is the default setting in some DAWs and Audacity configurations, and VBR files are a common automated QC failure on both ACX and Findaway Voices. In Audacity: File > Export > Export as MP3 > set 'Bit Rate Mode' to 'Constant' and quality to 192 kbps before exporting.
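
If you prefer to batch-convert mastered WAV files outside your DAW, a command-line encoder can apply the same settings to every file. The sketch below builds such a command for ffmpeg, which is not part of the workflow described in this guide and is used here purely as an assumed alternative. The helper name `build_mp3_export_command` and the file names are hypothetical. It relies on ffmpeg's libmp3lame encoder producing constant-bitrate output when given a fixed `-b:a` value, which is worth confirming against the ffmpeg version you have installed.

```python
def build_mp3_export_command(src_wav, dst_mp3, bitrate_kbps=192,
                             sample_rate=44100, channels=1):
    """Build an ffmpeg argument list for a mono, 44.1 kHz, fixed-bitrate MP3."""
    return [
        "ffmpeg",
        "-i", src_wav,               # mastered source file
        "-ac", str(channels),        # 1 = mono
        "-ar", str(sample_rate),     # 44100 Hz
        "-codec:a", "libmp3lame",    # MP3 encoder
        "-b:a", f"{bitrate_kbps}k",  # fixed bitrate instead of a VBR quality level
        dst_mp3,
    ]


cmd = build_mp3_export_command("TitleAuthor_Part01.wav", "TitleAuthor_Part01.mp3")
print(" ".join(cmd))
```

The function only builds the command. To run it you would pass the list to `subprocess.run(cmd, check=True)` on a machine with ffmpeg installed, then confirm the result with the same QC steps as any other export.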

Step 8: Quality Control Before Submission

After exporting all chapter files as MP3, do a final QC pass before uploading to ACX or Findaway:

  • Play the first 30 seconds and last 30 seconds of every file — catch any export errors

  • Listen to 2–3 random minutes from the middle of each chapter — check for editing artifacts, noise intrusions, or level inconsistencies

  • Run ACX Check (or equivalent tool) on every file and confirm that all three metrics (RMS, peak, noise floor) pass

  • Verify file naming is sequential and consistent across all files

  • Check that every file opens and closes with 0.5–1 second of room tone

  • Count your files against your chapter structure — opening credits, all chapters, closing credits. Confirm no files are missing
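
Counting files by eye gets error-prone on a long book. The sketch below checks a list of file names for gaps in the part numbers. It assumes the `_Part01` style naming used in this guide's examples, and the function name `missing_parts` is made up for illustration.

```python
import re


def missing_parts(filenames, expected_count, pattern=r"_Part(\d+)"):
    """Return the part numbers from 1..expected_count that have no file."""
    found = set()
    for name in filenames:
        match = re.search(pattern, name)
        if match:
            found.add(int(match.group(1)))
    return [n for n in range(1, expected_count + 1) if n not in found]


files = [
    "TitleAuthor_Part01_OpeningCredits.mp3",
    "TitleAuthor_Part02.mp3",
    "TitleAuthor_Part04.mp3",
]
print(missing_parts(files, expected_count=4))
```

Set `expected_count` to your number of chapters plus two, to account for the opening and closing credits.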

Step 9: Complete File Set for Delivery

Your complete audiobook delivery package:

| Item | Requirement | Notes |
|---|---|---|
| Opening credits | 1 file | Title + Author + Narrator statement, 5–15 seconds |
| Chapter files | One per chapter | Named sequentially |
| Closing credits | 1 file | Title + Author + Narrator + Copyright statement |
| Total file count | Chapters + 2 | |
| Total finished hours | Estimate before starting | 80,000 word novel ≈ 8–9 finished hours |


Upload your complete file set to ACX or Findaway Voices. ACX's automated QC runs immediately on upload and returns results quickly. Findaway's QC takes 24–48 hours. Address any flagged files — the error reports are specific enough to diagnose exactly what needs correction.

Audiobook royalties from self-produced titles on ACX and Findaway sync into ScribeCount once your accounts are connected. Authors who produce their own audio — eliminating narrator cost — typically see significantly higher audiobook margins than traditionally narrated titles. Tracking those margins in ScribeCount alongside your ebook and print royalties shows the full financial picture of your self-narrated catalog.

Time Investment: Realistic Expectations

Set honest expectations for self-narration time before committing:

| Factor | Estimate | Notes |
|---|---|---|
| Recording ratio | 3:1 to 5:1 | 3–5 hours of recording/editing per finished hour |
| Standard novel | 8–10 finished hours | 24–50 hours total production time |
| Learning curve | Higher for first book | First chapter takes 2–3× longer than subsequent chapters |
| Editing ratio | Roughly 1:1 with recording | Editing takes approximately as long as recording |
| Mastering and QC | 0.5 hours per finished hour | |


A self-narrated novel is a 24–50 hour project for an experienced author. For your first book, expect it to take longer. Factor this into your publishing schedule before committing to self-narration for a time-sensitive launch.
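
To apply these ratios to your own manuscript, a back-of-envelope calculation is enough. The sketch below assumes a narration pace of 9,300 words per finished hour, a figure chosen to match the "80,000 words ≈ 8–9 finished hours" estimate given earlier. Your own pace may differ, and the function name `estimate_production` is hypothetical.

```python
def estimate_production(word_count, words_per_finished_hour=9300,
                        ratio_low=3.0, ratio_high=5.0):
    """Return (finished_hours, low_total_hours, high_total_hours)."""
    finished_hours = word_count / words_per_finished_hour
    return (
        finished_hours,
        finished_hours * ratio_low,   # optimistic: 3 hours of work per finished hour
        finished_hours * ratio_high,  # conservative: 5 hours of work per finished hour
    )


finished, low, high = estimate_production(80_000)
print(f"{finished:.1f} finished hours, {low:.0f} to {high:.0f} hours of production")
```

For a first book, treat the conservative end as the starting point rather than the ceiling.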

Common Self-Narration Mistakes

  • Recording in an untreated room with reflective surfaces — room echo cannot be fixed in post-production

  • Using variable bitrate (VBR) instead of constant bitrate (CBR) at export — common automated QC failure

  • Not addressing background noise before recording — HVAC hum, refrigerator noise, computer fan consistently push noise floor above -60 dB

  • Applying aggressive noise reduction to cover environment problems — creates artifact-laden audio worse than the original noise

  • Not running ACX Check on every file before upload — discovering QC failures after upload is slower than catching them before

  • Not leaving room tone at the start and end of each file — required by both ACX and Findaway

  • Not connecting ACX and Findaway to ScribeCount — losing visibility into self-narrated audiobook income


Self-narrating your audiobook is a learnable skill that takes most authors two or three recording sessions to feel comfortable with. The technical requirements are strict but understandable: treat your room, record consistently, edit honestly, master to the RMS and peak targets, export at CBR 192 kbps, and run ACX Check before every upload. The authors who do this well produce audiobooks that listeners cannot distinguish from professionally narrated titles, at a fraction of the cost.

-Randall Wood
