Audio Settings
Adjust audio settings including bitrate, sample rate, channels & volume. Optimize audio files for quality or file size. Professional audio customization.
Adjust pitch, volume, and output bitrate on any audio file — independently, in any combination — and download a clean re-rendered file with all three applied in one pass.
How to Use This Tool (30 Seconds)
1. Upload Your Audio File: Click the upload zone and select your source file. MP3, WAV, FLAC, OGG, and M4A are all accepted as input formats, up to 200 MB.
2. Adjust Pitch: Drag the pitch slider up or down in semitone increments. The range runs from −12 semitones (one octave down) to +12 semitones (one octave up). Pitch changes without altering playback speed.
3. Adjust Volume: Set the output volume level as a percentage of the original. 100% is unchanged, below 100% reduces loudness, and above 100% amplifies. A clipping warning is shown if the signal would exceed 0 dBFS.
4. Set Output Bitrate: Choose your target bitrate from the dropdown: 64, 96, 128, 192, 256, or 320 kbps. This controls the file size and quality of the exported audio.
5. Preview and Download: Hit 'Play Preview' to hear all three adjustments applied live, then click 'Apply & Download' to export the final file with every setting baked in.
The Formulas Behind Pitch, Volume and Bitrate
Each of the three settings operates on a distinct audio property using a separate processing formula:
// Pitch shift — frequency ratio from semitone offset
frequencyRatio = 2 ^ (semitones / 12)
Example: +7 semitones → 2^(7/12) = 1.498× frequency
Example: −12 semitones → 2^(−12/12) = 0.5× (one octave down)
// Volume — linear gain applied to each sample
outputSample = inputSample × (volumePercent / 100)
Clipping occurs when |outputSample| > 1.0 (0 dBFS ceiling, positive or negative)
// Bitrate — output file size estimate
fileSizeMB = (bitrate kbps × duration seconds) ÷ 8,000
Pitch shifting uses the phase vocoder algorithm — it transforms the audio into the frequency domain using a Short-Time Fourier Transform (STFT), scales the frequency bins by the target ratio, then reconstructs the time-domain signal. This separates pitch from tempo, so a +5 semitone shift raises the key without making the audio play faster — unlike the naive approach of simply resampling at a different rate, which changes both simultaneously.
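The three formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual code: the function names are hypothetical, and samples are assumed to be floats normalised to the range −1.0 to 1.0.

```python
from typing import List, Tuple


def frequency_ratio(semitones: float) -> float:
    """Frequency multiplier for a pitch shift given in semitones."""
    return 2.0 ** (semitones / 12.0)


def apply_gain(samples: List[float], volume_percent: float) -> Tuple[List[float], bool]:
    """Scale every sample linearly and report whether any result would clip.

    Assumes samples are normalised so that 1.0 corresponds to 0 dBFS.
    """
    gain = volume_percent / 100.0
    scaled = [s * gain for s in samples]
    would_clip = any(abs(s) > 1.0 for s in scaled)
    return scaled, would_clip


def estimate_file_size_mb(bitrate_kbps: float, duration_seconds: float) -> float:
    """Approximate encoded size in megabytes (ignores container overhead)."""
    return bitrate_kbps * duration_seconds / 8000.0


if __name__ == "__main__":
    print(round(frequency_ratio(7), 3))      # 1.498
    print(frequency_ratio(-12))              # 0.5
    print(apply_gain([0.5, -0.8], 150)[1])   # True: -0.8 * 1.5 exceeds the ceiling
    print(estimate_file_size_mb(128, 180))   # 2.88 (three minutes at 128 kbps)
```

Note that the clipping check compares the absolute value of each sample, since a large negative sample clips just as a large positive one does.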
Settings Reference — Range, Unit & Practical Impact
| Setting | Range | Unit | Neutral Value | Practical Impact |
|---|---|---|---|---|
| Pitch | −12 to +12 | Semitones | 0 | ±1 octave range; 1 semitone = one piano key step |
| Volume | 0% to 200% | Percent | 100% | Above 100% risks clipping if source peaks near 0 dBFS |
| Bitrate | 64 – 320 | kbps | 128 kbps | Higher = better quality + larger file size |
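The ranges in the table can be expressed as a small validated settings object. This is only a sketch under assumptions: `AudioSettings` and its field names are hypothetical and chosen for illustration, and the defaults mirror the Neutral Value column.

```python
from dataclasses import dataclass

# Bitrate options listed in the tool's dropdown.
ALLOWED_BITRATES_KBPS = (64, 96, 128, 192, 256, 320)


@dataclass(frozen=True)
class AudioSettings:
    """Hypothetical container for the three settings, with range checks."""
    pitch_semitones: int = 0
    volume_percent: float = 100.0
    bitrate_kbps: int = 128

    def __post_init__(self) -> None:
        if not -12 <= self.pitch_semitones <= 12:
            raise ValueError("pitch must be between -12 and +12 semitones")
        if not 0 <= self.volume_percent <= 200:
            raise ValueError("volume must be between 0% and 200%")
        if self.bitrate_kbps not in ALLOWED_BITRATES_KBPS:
            raise ValueError(f"bitrate must be one of {ALLOWED_BITRATES_KBPS}")


if __name__ == "__main__":
    print(AudioSettings())  # neutral settings
    print(AudioSettings(pitch_semitones=5, volume_percent=120, bitrate_kbps=192))
```

Rejecting out-of-range values up front keeps invalid combinations from reaching the processing stage.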
Pitch Semitone Quick Reference
| Shift | Effect | Typical Use |
|---|---|---|
| −12 st | 1 octave down | Deepest male voice range |
| −5 st | Lower key | Male-to-baritone shift for songs |
| −2 st | Subtle deepening | Barely noticeable on speech |
| 0 st | Original pitch | No change applied |
| +2 st | Slight lift | Common key shift for covers |
| +5 st | Noticeable raise | Female vocal range shift |
| +7 st | Perfect fifth up | Harmonic interval shift |
| +12 st | 1 octave up | Chipmunk range, high vocal effects |
⚡ Pro Tip
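If you are boosting volume above 100% and hearing distortion, the problem is peak clipping, not average loudness. A file can have a low average volume but still contain transient peaks near 0 dBFS, and applying 150% gain clips those peaks hard. The fix is to budget headroom before you boost: the total gain must stay below 1 ÷ peak level. A file whose loudest peak sits at −6 dBFS (about 0.5 of full scale) can go up to roughly 200% safely, while one that already peaks at 0 dBFS cannot be boosted at all without clipping. Pulling the level down first, for example to 75% (about −2.5 dB), creates headroom only for whatever gain you add afterwards: 75% followed by 133% simply returns the peaks to where they started. This discipline, called gain staging, is what mastering engineers apply before any amplitude processing. When you need more loudness than the headroom allows, a limiter or compressor is required.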
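To see how much boost a file can take, measure its peak and convert that to a maximum safe volume percentage. The sketch below assumes samples normalised to the range −1.0 to 1.0; the helper names are hypothetical and not part of the tool.

```python
import math
from typing import Sequence


def percent_to_db(volume_percent: float) -> float:
    """Convert a linear volume percentage (> 0) to a gain in decibels."""
    return 20.0 * math.log10(volume_percent / 100.0)


def peak_dbfs(samples: Sequence[float]) -> float:
    """Level of the loudest sample relative to full scale."""
    peak = max(abs(s) for s in samples)
    return float("-inf") if peak == 0.0 else 20.0 * math.log10(peak)


def max_safe_volume_percent(samples: Sequence[float]) -> float:
    """Largest volume setting that keeps every sample at or below 0 dBFS."""
    peak = max(abs(s) for s in samples)
    return float("inf") if peak == 0.0 else 100.0 / peak


if __name__ == "__main__":
    quiet_clip = [0.1, -0.5, 0.25]
    print(round(peak_dbfs(quiet_clip), 2))       # -6.02
    print(max_safe_volume_percent(quiet_clip))   # 200.0
    print(round(percent_to_db(75), 1))           # -2.5
```

In this example the loudest sample is 0.5 of full scale, so any setting up to 200% stays below the ceiling.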
Frequently Asked Questions
Q: Does pitch shifting change the speed or duration of the audio?
No. This tool uses phase vocoder pitch shifting, which changes the frequency content independently of playback speed. The duration of the output file is identical to the source regardless of how many semitones are applied. Speed-based pitch changes — like simply resampling — are not used here.
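For contrast, the duration change that naive resampling would cause is easy to compute. The helper below is a hypothetical illustration of the approach this tool avoids, not something it runs.

```python
def resampled_duration(duration_seconds: float, semitones: float) -> float:
    """Duration after a speed-based pitch change (naive resampling).

    Raising pitch by resampling plays the audio faster, so the clip
    shortens by the same frequency ratio that the pitch rises.
    """
    ratio = 2.0 ** (semitones / 12.0)
    return duration_seconds / ratio


if __name__ == "__main__":
    print(resampled_duration(60.0, 12))    # 30.0  (one octave up halves the length)
    print(resampled_duration(60.0, -12))   # 120.0 (one octave down doubles it)
```

With phase vocoder shifting, the same 60-second clip stays 60 seconds long at any semitone setting.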
Q: How many semitones equal one octave?
Exactly 12 semitones equal one octave. Setting pitch to +12 doubles the fundamental frequency of every sound in the file. Setting it to −12 halves all frequencies. Each individual semitone step represents a frequency ratio of 2^(1/12) ≈ 1.0595 — approximately a 5.95% frequency increase per step.
Q: Why does my audio distort when I increase the volume above 100%?
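Increasing volume amplifies every sample in the file. If any samples are already near the 0 dBFS digital ceiling, the amplified values exceed the maximum representable level and clip, producing hard distortion. To avoid this, keep the total gain within the file's available headroom: a source that peaks at −6 dBFS tolerates up to about 200%, while one that peaks at 0 dBFS clips with any boost. Reducing the level first (for example to 75%) only helps if the combined gain of both steps stays within that headroom.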
Q: Does changing the bitrate affect pitch or volume?
No. Bitrate only controls the amount of data used to encode one second of audio — it affects file size and compression quality, not the tonal or loudness properties of the audio. Pitch and volume changes are applied to the audio signal before bitrate encoding.
Q: What is the best bitrate for voice-only audio after adjustments?
96–128 kbps is sufficient for speech and voice recordings after pitch or volume adjustments. Human speech occupies a narrow frequency range (85–8,000 Hz) that does not require the high bitrates needed for music. Use 128 kbps as a safe default for any adjusted voice file shared online.
Q: Can I shift pitch without re-encoding the entire file?
No. Pitch shifting modifies the actual audio sample data — the frequency content of every frame changes. This requires a full decode, process, and re-encode cycle. There is no metadata-only shortcut for pitch like there is for video rotation flags.
Q: What happens if I set volume to 0%?
Setting volume to 0% produces a valid but completely silent audio file. The file retains its format, duration, and bitrate — only the amplitude of every sample is set to zero. This is useful for creating silent audio placeholders or syncing with video where no audio track is needed.