Audiobook Recording and Technical Specifications for Self-Narrating Authors

Self-narrating your audiobook eliminates narrator cost and gives you complete creative control over the reading experience. It also requires skills (recording, editing, and mastering audio) that most authors have not yet developed. The learning curve is real, and the technical specifications for audiobook delivery are strict.

This guide covers the complete self-narration workflow: setting up a recording environment, selecting and using a microphone, recording and editing in a DAW (Digital Audio Workstation), mastering your files to meet ACX and Findaway Voices specifications, and packaging your final files for delivery.

The focus here is the technical production side of audiobook creation. For manuscript preparation, pronunciation guides, front and back matter decisions, and reviewing finished audio, see HT27 (How to Format Your Manuscript for Audiobook Production).

Is Self-Narration Right for You?

Self-narration is the right choice if you:

Have a voice that naturally suits your book's genre and tone
Are willing to invest time in learning recording basics — typically 3–5 hours of recording/editing per finished hour of audio
Have a quiet recording space or can create one
Write nonfiction, memoir, or narrative nonfiction where author voice is part of the product

Self-narration is harder to pull off for:

Fiction with large character casts requiring significant voice differentiation
Authors whose natural reading voice does not match their book's genre energy
Authors who produce at high volume where the time investment becomes prohibitive

If self-narration isn't right for your specific project, the HT16 (ACX) and HT17 (Findaway Voices) guides cover working with professional narrators. Both paths are legitimate — the goal is the best audiobook for your readers, not any particular production method.

The Technical Requirements You're Working Toward

Before setting up equipment, understand the specifications your finished files must meet. These are non-negotiable — ACX and Findaway Voices both run automated QC and will reject files that don't comply.

Field / Spec	Value / Requirement	Notes
Format	MP3	Required for both ACX and Findaway delivery
Bitrate	192 kbps minimum, CBR	Constant bitrate — not variable bitrate (VBR)
Sample rate	44.1 kHz
Channels	Mono	Stereo accepted but mono is standard for narration
RMS (loudness)	-23 to -18 dB RMS	Accepted range; aim near the middle (about -20 dB) so small variations still pass
Peak level	-3 dB maximum	No clipping — hard ceiling
Noise floor	-60 dB or lower	Background noise measured in silence between words
Opening/closing silence	0.5–1 second room tone at the start; 1–5 seconds at the end (ACX)	Each file starts and ends with recorded silence
File naming	Sequential, consistent convention	e.g., TitleAuthor_Part01.mp3

These specs are your mastering targets. Every decision in your recording and post-production workflow is aimed at producing files that hit these numbers. Keep them visible while you work.
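One way to keep the targets visible is to write them down as a small checklist. The sketch below is illustrative Python, not an ACX or Findaway tool: the names `SPEC` and `check_measurements` are invented for this example, and the RMS window uses the -23 to -18 dB range noted in the table.

```python
# Hypothetical checklist of the mastering targets from the table above.
SPEC = {
    "rms_min_db": -23.0,          # quietest accepted average level
    "rms_max_db": -18.0,          # loudest accepted average level
    "peak_max_db": -3.0,          # hard ceiling for peaks
    "noise_floor_max_db": -60.0,  # background noise in silence
}

def check_measurements(rms_db, peak_db, noise_floor_db, spec=SPEC):
    """Return pass/fail results for one mastered file's measured levels."""
    return {
        "rms": spec["rms_min_db"] <= rms_db <= spec["rms_max_db"],
        "peak": peak_db <= spec["peak_max_db"],
        "noise_floor": noise_floor_db <= spec["noise_floor_max_db"],
    }

# Example values as you might read them off your DAW's meters (made up).
print(check_measurements(rms_db=-20.1, peak_db=-3.5, noise_floor_db=-64.0))
```

The measurements themselves still come from your DAW or a QC plugin; this only records what "pass" means for each number.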

Step 1: Your Recording Environment

The recording environment is the most important variable in home audio production. A mediocre microphone in an excellent room produces better results than an excellent microphone in a bad room. Address your space before you buy equipment.

What Makes a Room Bad for Recording

Hard, parallel surfaces (walls, floors, ceilings) create room echo — the reverberant sound that makes a recording sound like it was made in a bathroom. Open rooms with lots of glass and hardwood sound terrible. The problem is reflections: your voice hits the walls and bounces back to the microphone, creating a smeared, echoey sound that is not correctable in post-production.

Room echo is the primary reason home recordings fail ACX's quality review — not microphone quality or technical issues.

Creating a Treated Recording Space

You do not need a professional recording booth. You need a space with enough soft, sound-absorbing material to prevent reflections from reaching your microphone:

Closet full of hanging clothes — genuinely one of the best home recording environments. Clothes absorb reflections on all sides. Record inside the closet with the door closed or mostly closed
Tent or blanket fort method: build a small enclosure around yourself and your microphone using moving blankets or thick comforters. It looks like overkill but is acoustically very effective
Mattress recording — place a mattress vertically behind you, hang blankets to the sides, and record in the enclosed space
Treated corner — cover the corner behind your recording position with moving blankets or acoustic foam panels; record facing the treated corner

Test your space before recording: clap sharply once and listen. In a bad room, you'll hear a metallic, echoey ring after the clap. In a well-treated space, the clap sounds dull and dead — that's what you want.

⚠ Room echo cannot be reliably removed in post-production. Plugins that attempt to remove reverb damage audio quality significantly. Treat your recording space before you record. If your first recording test sounds echoey, fix the environment — not the recording.

Background Noise

The noise floor specification (-60 dB or lower) measures background noise during silence between words. Sources of background noise to eliminate:

HVAC and air conditioning: switch the system off for the full length of each recording session
Refrigerator hum — move away from kitchen; close intervening doors
Computer fan noise — position your computer outside the recording space if possible; use a laptop if it runs quieter than your desktop
Street noise — record during quiet hours; use your closet or treated space which attenuates external sound
Phone notifications — silence all devices
Pets — record when they are settled; have them in another room

After setting up your space, record 30 seconds of silence and measure the noise floor in your DAW before beginning any narration. If the noise floor exceeds -60 dB, identify and address the source.
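If you are curious what the noise floor number actually measures, the arithmetic is short. The sketch below is a minimal illustration, assuming you exported your 30 seconds of silence as a 16-bit PCM WAV file; the function names are invented here, and the hum is synthetic stand-in data. Your DAW's meter may use a slightly different reference, so treat small differences as expected.

```python
import array
import math
import sys
import wave

def read_wav_16bit(path):
    """Read a 16-bit PCM WAV file and return samples scaled to [-1.0, 1.0)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        raw = wav.readframes(wav.getnframes())
    ints = array.array("h")
    ints.frombytes(raw)
    if sys.byteorder == "big":
        ints.byteswap()  # WAV sample data is little-endian
    return [i / 32768.0 for i in ints]

def rms_dbfs(samples):
    """RMS level in dB relative to full scale for samples in [-1.0, 1.0]."""
    if not samples:
        raise ValueError("no samples")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # pure digital silence
    return 10 * math.log10(mean_square)

# Stand-in for 1 second of "silence" with a faint 60 Hz hum (made-up data).
hum = [0.0005 * math.sin(2 * math.pi * 60 * n / 44100) for n in range(44100)]
level = rms_dbfs(hum)
print(f"{level:.1f} dB", "OK" if level <= -60 else "too noisy")
```

For a real test clip you would call `rms_dbfs(read_wav_16bit("silence_test.wav"))`, where the filename is whatever you exported.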

Step 2: Microphone Selection

Authors self-narrating on a limited budget often overthink microphone selection. For home recording, a quality mid-range microphone in a treated space consistently outperforms a premium microphone in an untreated space. Solve the room first.

USB Microphones (Recommended for Most Authors)

USB microphones connect directly to your computer without additional equipment. They are the simplest setup for authors who are not planning a long-term audio production operation:

Microphone	Approximate price	Notes
Audio-Technica AT2020USB+	~$150	Consistently recommended for voice recording
Blue Yeti X	~$150	Popular; USB; multiple pickup patterns
Rode NT-USB Mini	~$99	Compact; good frequency response for voice
Shure MV7	~$250	USB + XLR; USB for simplicity now; XLR for upgrade path

XLR Microphones (For Authors Committed to Audio Production)

XLR microphones require an audio interface (a small hardware device that connects the microphone to your computer via USB and handles the analog-to-digital conversion). The additional equipment adds cost but provides better audio quality and more upgrade flexibility:

Microphone	Approximate price	Notes
Audio-Technica AT2020 (XLR)	~$100	Pair with an audio interface such as the Focusrite Scarlett Solo (~$120)
Rode NT1	~$250	Low noise floor; excellent for voice recording
Shure SM7B	~$400	Industry standard for podcasting and narration

For authors narrating one book to test self-narration, a USB microphone is the right choice. For authors planning to narrate multiple books, the XLR path with an audio interface produces better results and provides a cleaner upgrade path.

Microphone Technique

How you position yourself relative to the microphone matters as much as which microphone you use:

Position the microphone 6–10 inches from your mouth at approximately mouth level or slightly above — pointing slightly down toward your mouth prevents plosives (explosive P and B sounds) from hitting the capsule directly
Use a pop filter (a mesh screen between your mouth and the microphone) to reduce plosives — $15–20 at most audio retailers, or make one from a nylon stocking stretched over a wire hanger frame
Never position the microphone directly in front of your lips — offset it slightly to one side or angle it to reduce P and B plosive impact
Record your first chapter with the microphone at a test position, then listen back and adjust before continuing

Step 3: Digital Audio Workstations (DAW)

A DAW is the software where you record, edit, and process your audio. You don't need the most powerful option — you need one you can learn quickly.

DAW Options for Self-Narrating Authors

DAW	Cost	Notes
Audacity	Free, open-source	Widely used; functional for basic narration; steeper learning curve than paid options
GarageBand	Free (Mac only)	Excellent for beginners; clean interface; professional results
Reaper	$60 (discounted license)	Affordable; highly capable; widely used for audiobook production
Adobe Audition	~$55/month (Creative Cloud)	Professional; steep cost unless already on CC
Logic Pro	$199 one-time (Mac only)	Professional; excellent tools; high upfront cost

Start with Audacity (free) or GarageBand (free on Mac) before investing in paid software. The skills transfer between DAWs — learn to record, edit, and process audio in one, and you can use the same workflow in a more powerful DAW later if needed.

Step 4: Recording Your Audiobook

Recording Setup in Your DAW

Set your recording sample rate to 44.1 kHz and bit depth to 24-bit (you'll convert to 16-bit and MP3 for delivery, but record at higher resolution)
Set your input gain so your voice peaks in the -12 to -6 dB range during recording — never allow the recording level to touch 0 dB (clipping)
Check your noise floor in 30 seconds of recorded silence before beginning narration — verify it's well below -60 dB
Create a new audio track for each chapter file
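The -12 to -6 dB recording window is easier to judge once you see how a peak reading is computed. The following is a rough sketch in Python with made-up sample values; `peak_dbfs` and `gain_advice` are hypothetical helper names, and in practice your DAW's input meter does this for you.

```python
import math

def peak_dbfs(samples):
    """Peak level in dB relative to full scale for samples in [-1.0, 1.0]."""
    peak = max(abs(s) for s in samples)
    return float("-inf") if peak == 0 else 20 * math.log10(peak)

def gain_advice(peak_db, low=-12.0, high=-6.0):
    """Compare a test take's peak with the -12 to -6 dB recording window."""
    if peak_db > high:
        return "too hot: turn the input gain down"
    if peak_db < low:
        return "too quiet: turn the input gain up"
    return "ok"

test_take = [0.02, -0.35, 0.31, -0.1]  # made-up samples from a test read
peak = peak_dbfs(test_take)
print(round(peak, 1), gain_advice(peak))
```

A sample value of 1.0 corresponds to 0 dB, the clipping point, so a loudest sample of 0.35 sits comfortably inside the window at roughly -9 dB.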

Recording Technique

Read from your audio script — not from your printed book. Have the audio script visible (second monitor, tablet, or printed) with scene breaks and front matter decisions already marked
Do not try to record a full chapter in one pass without errors. Record in natural sections — complete thoughts, paragraphs, or scenes — and re-record any stumbles immediately
For stumbles: pause, take a breath, skip back a sentence, and re-read from a clean start point. Leave the mistake in the recording — you'll edit it out in post-production
Maintain consistent distance from the microphone throughout — backing away changes your tone quality
Drink room-temperature water during sessions — cold water and dairy products affect vocal quality
Limit recording sessions to 90 minutes maximum before taking a break. Vocal fatigue affects performance quality before you consciously notice it

Step 5: Editing Your Recordings

Editing removes mistakes, stumbles, long pauses, and background noise intrusions. This is where the difference between a rough recording and a professional-quality chapter file is made.

Basic Editing Workflow

Listen to the chapter in full with your script alongside — follow along and mark any sections with issues
Cut any false starts and re-reads, leaving only the best take of each passage
Remove long pauses (more than 0.5–1 second of silence where there should be none) by selecting and deleting the excess silence
Cut any significant background noise intrusions — a car horn, a dog bark, an AC unit cycling on mid-sentence
Do not remove all breathing — some breath sounds are natural and humanizing. Remove only heavy, distracting breaths before a sentence begins
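Finding long pauses by ear works, but the idea can also be sketched in code. The example below is a simplified illustration, not how any particular DAW implements silence detection: it flags stretches where every sample stays below a threshold for longer than one second. The threshold of 0.003 (roughly -50 dB) and the function name are assumptions chosen for the example.

```python
def find_long_pauses(samples, sample_rate, threshold=0.003, max_pause_s=1.0):
    """Return (start_seconds, end_seconds) for quiet runs longer than max_pause_s."""
    pauses = []
    run_start = None
    for i, s in enumerate(samples):
        if abs(s) < threshold:
            if run_start is None:
                run_start = i          # a quiet stretch begins here
        elif run_start is not None:
            if (i - run_start) / sample_rate > max_pause_s:
                pauses.append((run_start / sample_rate, i / sample_rate))
            run_start = None           # speech resumed
    # Handle a quiet stretch that runs to the end of the file.
    if run_start is not None:
        if (len(samples) - run_start) / sample_rate > max_pause_s:
            pauses.append((run_start / sample_rate, len(samples) / sample_rate))
    return pauses

# Toy signal at 100 samples per second: 1 s speech, 2.5 s pause, 0.5 s speech.
signal = [0.5] * 100 + [0.0] * 250 + [0.5] * 50
print(find_long_pauses(signal, sample_rate=100))  # [(1.0, 3.5)]
```

A list like this only tells you where to look. Whether a pause is dramatic timing or dead air is still an editing judgment.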

Noise Reduction

If your recording has consistent low-level background noise (HVAC hum, computer fan), apply noise reduction using your DAW's noise gate or noise reduction plugin:

In Audacity: select 0.5 seconds of pure background noise (recorded before your narration starts), run Effect > Noise Reduction > Get Noise Profile, then select the full track and apply Noise Reduction at a conservative setting (8–12 dB reduction)
In Reaper: use the ReaFIR plugin in Subtract mode, which follows a similar profile-capture workflow

⚠ Apply noise reduction conservatively. Aggressive noise reduction (excessive dB reduction) creates a metallic, artifact-laden sound that is immediately obvious to trained listeners. A slight residual noise floor is preferable to an over-processed recording. If your noise floor is consistently above -60 dB even after conservative noise reduction, the solution is a better recording environment — not more aggressive post-processing.

Step 6: Mastering to Specification

Mastering is the final processing step that brings your audio to the required technical specifications. The three key targets: RMS loudness (-23 to -18 dB), peak (-3 dB maximum), and noise floor (-60 dB or lower).

Step 6A: Normalize Peak Level

Peak normalization sets your loudest moment to a specified level. In your DAW, apply peak normalization to -3.0 dB. This ensures no audio exceeds the -3 dB ceiling required by ACX and Findaway.
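Behind the menu command, peak normalization is a single gain calculation. Here is a minimal sketch in Python with made-up sample values; `normalize_peak` is a name invented for illustration, and your DAW's Normalize effect performs the equivalent step.

```python
import math

def normalize_peak(samples, target_db=-3.0):
    """Scale samples so the loudest one sits at target_db (dB full scale)."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples), 0.0       # silence: nothing to scale
    current_db = 20 * math.log10(peak)
    gain_db = target_db - current_db    # positive means "turn up"
    factor = 10 ** (gain_db / 20)       # convert dB to a linear multiplier
    return [s * factor for s in samples], gain_db

chapter = [0.1, -0.4365, 0.25]          # made-up samples; peak is about -7.2 dB
normalized, gain_db = normalize_peak(chapter)
print(round(gain_db, 1))                # 4.2
```

Every sample is multiplied by the same factor, so the relationship between loud and quiet passages is unchanged. Only the overall level moves.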

Step 6B: Check and Adjust RMS Loudness

RMS (Root Mean Square) measures average loudness — the perceived volume of your narration. After peak normalization, check your RMS level:

In Audacity: Analyze > Contrast — measures RMS of selected audio
In Reaper: the SWS Loudness meter reads RMS
Target: about -20 dB RMS (near the middle of the accepted -23 to -18 dB range)

If your RMS is too low (quieter than -23 dB), your narration is too quiet. Apply amplification (Effect > Amplify in Audacity) to bring the RMS up, then re-check your peak. Because the peak was already normalized to -3 dB, added gain will usually push it past the ceiling, so follow the amplification with a limiter set to -3 dB.

If your RMS is too high (louder than -18 dB), your narration is too loud. Reduce the gain (a negative Amplify value in Audacity) until the RMS falls inside the range.
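The interaction between RMS and peak is simple arithmetic: any gain you add to raise the average level raises the peak by the same amount. The sketch below works through one case in Python, assuming a target of -20 dB RMS (inside the -23 to -18 dB range from the specification table); the function name and the example readings are invented for illustration.

```python
def plan_rms_gain(rms_db, peak_db, target_rms_db=-20.0, ceiling_db=-3.0):
    """Work out the gain needed to reach the RMS target and its effect on the peak."""
    gain_db = target_rms_db - rms_db    # positive means "turn up"
    new_peak_db = peak_db + gain_db     # gain shifts the peak by the same amount
    return {
        "gain_db": gain_db,
        "new_peak_db": new_peak_db,
        "needs_limiter": new_peak_db > ceiling_db,
    }

# A quiet chapter after peak normalization: RMS -24 dB, peak -3 dB.
print(plan_rms_gain(rms_db=-24.0, peak_db=-3.0))
# {'gain_db': 4.0, 'new_peak_db': 1.0, 'needs_limiter': True}
```

In this example the 4 dB needed to fix the RMS would put the peak at +1 dB, which is why a quiet file usually needs a limiter or gentle compression rather than amplification alone.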

ACX Check Plugin (Audacity)

The ACX Check plugin is a free Audacity add-on that checks your audio against ACX's technical specifications in one operation — measuring RMS, peak, and noise floor simultaneously and telling you pass or fail for each. Install it from the Audacity plugin repository:

Download ACX Check from the Audacity forum or plugin sites
Install following Audacity's plugin installation instructions
After editing and mastering, run ACX Check on your complete chapter file
Fix any failing measurements before export

ACX Check is the most practical QC tool available for self-narrating authors using Audacity. Run it on every file before export.

Step 7: Export to MP3

After mastering, export your chapter file as MP3.

Field / Spec	Value / Requirement	Notes
Format	MP3
Bitrate	192 kbps CBR	Constant bitrate — set this explicitly, not 'auto' or VBR
Sample rate	44.1 kHz
Channels	Mono (stereo also accepted)	Mono reduces file size with no quality loss for voice
Naming	Sequential naming convention	e.g., TitleAuthor_Part01_OpeningCredits.mp3

⚠ Verify CBR (constant bitrate) is selected at export — not VBR (variable bitrate). VBR is the default setting in some DAWs and Audacity configurations, and VBR files are a common automated QC failure on both ACX and Findaway Voices. In Audacity: File > Export > Export as MP3 > set 'Bit Rate Mode' to 'Constant' and quality to 192 kbps before exporting.
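If you prefer to script the export, the same settings can be expressed as a command for ffmpeg. This is an assumption on my part: ffmpeg is not part of the workflow described in this guide, and the sketch presumes you have it installed with the libmp3lame encoder. The function name and file names are hypothetical.

```python
def mp3_export_command(wav_path, mp3_path, bitrate_kbps=192):
    """Build an ffmpeg command for a mono, 44.1 kHz, constant-bitrate MP3."""
    return [
        "ffmpeg",
        "-i", wav_path,
        "-ac", "1",                   # one channel (mono)
        "-ar", "44100",               # 44.1 kHz sample rate
        "-codec:a", "libmp3lame",     # LAME MP3 encoder
        "-b:a", f"{bitrate_kbps}k",   # fixed bitrate; -q:a would select VBR instead
        mp3_path,
    ]

command = mp3_export_command("chapter01_mastered.wav", "TitleAuthor_Part02.mp3")
print(" ".join(command))
# To run it for real: import subprocess; subprocess.run(command, check=True)
```

The block only builds and prints the command, so you can inspect it before running anything. Whichever tool you use, confirm the bitrate mode of the finished file before uploading.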

Step 8: Quality Control Before Submission

After exporting all chapter files as MP3, do a final QC pass before uploading to ACX or Findaway:

Play the first 30 seconds and last 30 seconds of every file — catch any export errors
Listen to 2–3 random minutes from the middle of each chapter — check for editing artifacts, noise intrusions, or level inconsistencies
Run ACX Check (or equivalent tool) on every file and confirm that all three metrics (RMS, peak, noise floor) pass
Verify file naming is sequential and consistent across all files
Check that every file opens with 0.5–1 second of room tone and closes with at least 1 second (ACX allows up to 5 seconds at the end)
Count your files against your chapter structure — opening credits, all chapters, closing credits. Confirm no files are missing
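The naming and counting checks in this list are tedious by hand and easy to automate. Below is a small sketch, assuming your files follow the `TitleAuthor_Part01_OpeningCredits.mp3` pattern used as an example in this guide; the pattern, the function name, and the sample file names are all illustrative, so adjust them to your own convention.

```python
import re

# Assumed convention: <prefix>_Part<NN>[_<label>].mp3
PART_PATTERN = re.compile(r"^(?P<prefix>.+?)_Part(?P<num>\d{2,})(?:_[^.]+)?\.mp3$")

def check_file_set(filenames, chapter_count):
    """Return a list of problems found in a delivery file set (empty if none)."""
    problems = []
    expected_total = chapter_count + 2  # chapters + opening and closing credits
    if len(filenames) != expected_total:
        problems.append(f"expected {expected_total} files, found {len(filenames)}")
    prefixes, numbers = set(), []
    for name in filenames:
        match = PART_PATTERN.match(name)
        if match is None:
            problems.append(f"unexpected name: {name}")
            continue
        prefixes.add(match["prefix"])
        numbers.append(int(match["num"]))
    if len(prefixes) > 1:
        problems.append(f"inconsistent prefixes: {sorted(prefixes)}")
    if sorted(numbers) != list(range(1, len(numbers) + 1)):
        problems.append("part numbers are not sequential from 01")
    return problems

files = [
    "TitleAuthor_Part01_OpeningCredits.mp3",
    "TitleAuthor_Part02_Chapter01.mp3",
    "TitleAuthor_Part03_Chapter02.mp3",
    "TitleAuthor_Part04_ClosingCredits.mp3",
]
print(check_file_set(files, chapter_count=2))  # []
```

An empty list means the set looks complete. It does not replace listening to the files.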

Step 9: Complete File Set for Delivery

Your complete audiobook delivery package:

Item	Quantity	Notes
Opening credits	1 file	Title + Author + Narrator statement, 5–15 seconds
Chapter files	One per chapter	Named sequentially
Closing credits	1 file	Title + Author + Narrator + Copyright statement
Total file count	Chapters + 2
Total finished hours	Estimate before starting	An 80,000-word novel ≈ 8–9 finished hours

Upload your complete file set to ACX or Findaway Voices. ACX's automated QC runs immediately on upload and returns results quickly. Findaway's QC takes 24–48 hours. Address any flagged files — the error reports are specific enough to diagnose exactly what needs correction.

Audiobook royalties from self-produced titles on ACX and Findaway sync into ScribeCount once your accounts are connected. Authors who produce their own audio — eliminating narrator cost — typically see significantly higher audiobook margins than traditionally narrated titles. Tracking those margins in ScribeCount alongside your ebook and print royalties shows the full financial picture of your self-narrated catalog.

Time Investment: Realistic Expectations

Set honest expectations for self-narration time before committing:

Factor	Estimate	Notes
Recording ratio	3:1 to 5:1	3–5 hours of recording/editing per finished hour
Standard novel	8–10 finished hours	24–50 hours total production time
Learning curve	Higher for first book	First chapter takes 2–3× longer than later chapters
Editing ratio	Roughly 1:1 with recording	Editing takes approximately as long as recording
Mastering and QC	0.5 hours per finished hour

A self-narrated novel is a 24–50 hour project for an experienced author. For your first book, expect it to take longer. Factor this into your publishing schedule before committing to self-narration for a time-sensitive launch.
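To turn these ratios into a number for your own manuscript, a rough calculation is enough. The sketch below assumes a narration pace of about 9,300 words per finished hour, a figure chosen here only because it places an 80,000-word novel inside the 8–9 hour estimate given earlier; your own pace may differ, and the function name is hypothetical.

```python
def estimate_hours(word_count, words_per_finished_hour=9300):
    """Estimate finished audio length and the 3:1 to 5:1 production range."""
    finished = word_count / words_per_finished_hour
    return {
        "finished_hours": finished,
        "production_low": finished * 3,   # 3:1 ratio
        "production_high": finished * 5,  # 5:1 ratio
    }

est = estimate_hours(80_000)
print(f"{est['finished_hours']:.1f} finished hours, "
      f"{est['production_low']:.1f} to {est['production_high']:.1f} production hours")
# 8.6 finished hours, 25.8 to 43.0 production hours
```

Time a test chapter, divide its word count by its finished length, and substitute that pace for the assumed one to get an estimate that reflects your own reading speed.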

Common Self-Narration Mistakes

Recording in an untreated room with reflective surfaces — room echo cannot be fixed in post-production
Using variable bitrate (VBR) instead of constant bitrate (CBR) at export — common automated QC failure
Not addressing background noise before recording — HVAC hum, refrigerator noise, computer fan consistently push noise floor above -60 dB
Applying aggressive noise reduction to cover environment problems — creates artifact-laden audio worse than the original noise
Not running ACX Check on every file before upload — discovering QC failures after upload is slower than catching them before
Not leaving room tone at the start and end of each file — required by both ACX and Findaway
Not connecting ACX and Findaway to ScribeCount — losing visibility into self-narrated audiobook income

Self-narrating your audiobook is a learnable skill that takes most authors two or three recording sessions to feel comfortable with. The technical requirements are strict but understandable — treat your room, record consistently, edit honestly, master to the RMS and peak targets, export at CBR 192 kbps, and run ACX Check before every upload. The authors who do this well produce audiobooks that readers cannot distinguish from professionally narrated titles at a fraction of the cost.

-Randall Wood
