Apple iPhone 12 Audio review: Subtle improvements to sound

A few weeks ago, Apple unveiled no fewer than four new iPhones, ranging in both size and price. Beyond sharing the same A14 Bionic chipset (“the fastest chipset in a smartphone”) and OLED displays, this new generation is above all Cupertino’s first-ever 5G-equipped lineup. In this larger-than-usual family, the iPhone 12 comes third in line, after the iPhone 12 Pro Max and the iPhone 12 Pro, and before the aptly named iPhone 12 Mini.

On the audio side of things, the Apple iPhone 12 doesn’t seem to bring much change to its predecessors’ recipe. Apple’s latest phone packs two speakers for stereo playback, two microphones for background noise reduction, and Dolby Atmos technology that promises sounds moving “around you in 3D space, so you feel like you’re inside the action.” Also worth noting is that audio zoom is available when filming videos.

Audio specifications include:

  • Two front-facing speakers (one at the bottom, one at the top of the device)
  • Microphones with noise cancellation
  • Audio zoom
  • Dolby Atmos surround sound
  • User‑configurable maximum volume limit

About DXOMARK Audio tests: For scoring and analysis in our smartphone audio reviews, DXOMARK engineers perform a variety of objective tests and undertake more than 20 hours of perceptual evaluation under controlled lab conditions. This article highlights the most important results of our testing. Note that we evaluate both Playback and Recording using only the device’s built-in hardware and default apps. (More details are available in our separate articles on the Playback and Recording protocols.)

Test summary

With an overall score of 73, the iPhone 12 ranks among the best devices for audio performance we have tested so far. It is only 3 points away from our top-scoring phone to date, the Xiaomi Mi 10 Pro, and just above the previous series’ flagship phone, the 11 Pro Max.

In playback testing, the iPhone 12 is among the best for almost every sub-attribute. It delivers a great timbre performance, powerful dynamics (even at soft volumes), a rather immersive and realistic spatial rendering, and highly satisfying maximum volume. While more than adequate for playing music or watching movies, Apple’s latest iPhone performs best in gaming mode, thanks to the clever positioning of its speakers, which are nearly impossible to occlude.

Apart from the inverted stereo in landscape mode, the phone’s only shortcomings when playing back audio are its occasional resonances and distortion in the low end of the spectrum, which affect bass precision and overall bass quality from nominal to maximum volumes.

The iPhone 12 performed equally well in Recording tests, with an exemplary frequency response, sharp attack and efficient punch, very good overall spatial attributes, and good loudness in every tested scenario. It also recorded very natural and pleasing backgrounds. The phone fared poorest for artifacts, whose sub-score was considerably lowered by the phone’s aggressive compression and slight distortion at loud volumes, in addition to its easy-to-occlude microphones.

Sub-scores explained

The DXOMARK Audio overall score of 73 for the iPhone 12 is derived from its Playback and Recording scores and their respective sub-scores. In this section, we’ll take a closer look at these audio quality sub-scores and explain what they mean for the user.



Timbre tests measure how well a phone reproduces sound across the audible tonal range and take into account bass, midrange, treble, tonal balance, and volume dependency.

In the timbre area, the iPhone 12 performed particularly well for movies.

The iPhone 12’s very good timbre sub-score is due to precise high-ends and consistent tonal balance from softest to loudest volumes. The phone fares better when playing back audio in landscape mode, especially when listening to music and watching movies.

As shown in the graph above, low-end extension is deeper than that of the Oppo Find X2 Pro. That said, its quality is impaired by occasional bass resonances. Further, upper midrange frequencies are too prominent, which results in a less balanced tonal reproduction than that of the iPhone 11.


DXOMARK’s dynamics tests measure how well a device reproduces the energy level of a sound source, and how precisely it reproduces bass frequencies.

Despite being affected by bass resonances and distortion at both nominal and maximum volumes, bass precision is among the best we’ve measured to date, thanks to the speakers’ deep low-end extension. Attack is also good, especially at nominal volume, and so is punch, even at soft listening levels.


Sub-attributes for perceptual spatial tests include localizability, balance, distance, and wideness.

The precision of the high-end frequencies provides even better localizability of the sound sources than the previous generation. Balance between the left and right channels is well centered, while distance perception is realistic.

While the sound field’s wideness in its purest sense is good, its associated sub-score is impaired by the fact that the stereo is inverted in landscape mode (left and right channels are swapped).
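To make "inverted" concrete, here is a minimal Python sketch of what swapping left and right means for stereo sample data. It assumes interleaved samples in the order left, right, left, right; the function name `swap_stereo_channels` and the sample values are hypothetical. It illustrates the symptom only and says nothing about how the iPhone 12 actually routes audio to its speakers.

```python
def swap_stereo_channels(interleaved):
    """Swap left and right in an interleaved [L0, R0, L1, R1, ...] sample list."""
    if len(interleaved) % 2 != 0:
        raise ValueError("interleaved stereo data needs an even number of samples")
    swapped = list(interleaved)
    # Even indices hold the left channel, odd indices the right channel.
    swapped[0::2] = interleaved[1::2]
    swapped[1::2] = interleaved[0::2]
    return swapped


# Hypothetical frames: positive values on the left, negative on the right.
frames = [0.1, -0.9, 0.2, -0.8, 0.3, -0.7]
inverted = swap_stereo_channels(frames)
```

Applying the swap a second time restores the original order, which is why a listener hears the same content but with sources placed on the opposite side.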


Volume tests measure both the overall loudness a device is able to reproduce and how smoothly volume increases and decreases based on user input.

Another core strength of the iPhone 12 is its playback volume performance, which offers very satisfying maximum volume as well as consistent volume steps for the human ear.

Additionally, minimum volume is not too low, which allows dynamic content such as classical music to remain intelligible.

Hip-Hop     Classical
75.4 dBA    72.4 dBA


Artifacts tests measure how much source audio is distorted when played back through a device’s speakers. Distortion can occur both because of sound processing in the device and because of the quality of the speakers.

The iPhone 12’s performance is only average when it comes to controlling playback artifacts. Both noise and spectral artifacts (specifically, bass distortion) are noticeable at nominal and, especially, maximum volumes. That said, temporal artifacts remain discreet, and the speakers are almost impossible to occlude when watching movies or playing games.



The iPhone 12’s timbre score when recording is only one point lower than that of the top-scoring phone in this category, which is none other than the iPhone SE.

The iPhone 12’s recordings exhibit a very well tuned tonal balance and good bass definition. The high quality of midrange frequencies adds a natural rendering of voices in the mix, while the overall timbre performance remains consistent even when recording in loud environments.

However, a slight lack of high-end extension generates a loss of clarity that can make voices sound slightly nasal, especially when recording memos.


Compared to its predecessor, the iPhone 12’s recording dynamics show an improved signal-to-noise ratio. Background sounds are quieter, and the sound envelope is well preserved.

In loud environments, our perceptual tests showed good attack for both instruments and voices. However, male voice intelligibility is slightly impaired by the previously mentioned lack of high-end extension.


The iPhone 12’s microphones do a fairly good job at representing sound sources in space. Videos recorded with the main camera offer good localizability for voices and wide sound fields. Selfie videos also ensure good wideness as well as realistic distance rendering. In memos, while distance is also highly realistic, localizability of sound sources is quite poor.


While nominal loudness is average for videos filmed with the main camera and for memos, it is a little higher for selfie videos and meeting room scenarios. On average, loudness is higher than what iPhone 11 or Oppo Find X2 recordings provide. Additionally, maximum acceptable level is very good.

Here are our test results, measured in LUFS (Loudness Units relative to Full Scale); as a reference, we expect loudness levels to be above -24 LUFS for recorded content:

Meeting       Life Video    Selfie Video  Memo
-24.1 LUFS    -21.7 LUFS    -19.7 LUFS    -19.1 LUFS
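As a rough illustration of how these figures compare with the -24 LUFS reference, the following Python sketch computes each scenario's margin in loudness units (LU). The values are copied from the results above; the names `REFERENCE_LUFS`, `measured_lufs`, and `margin_to_reference` are hypothetical and are not part of any DXOMARK tooling.

```python
# Reference level stated in the review: recorded content is expected
# to be louder than -24 LUFS.
REFERENCE_LUFS = -24.0

# Measured values copied from the results above.
measured_lufs = {
    "Meeting": -24.1,
    "Life Video": -21.7,
    "Selfie Video": -19.7,
    "Memo": -19.1,
}


def margin_to_reference(level_lufs, reference=REFERENCE_LUFS):
    """Return how many LU a level sits above (+) or below (-) the reference."""
    return round(level_lufs - reference, 1)


margins = {name: margin_to_reference(v) for name, v in measured_lufs.items()}

for name, margin in margins.items():
    status = "above" if margin > 0 else "at or below"
    print(f"{name:<13}{measured_lufs[name]:>6.1f} LUFS  {margin:+.1f} LU ({status} reference)")
```

With these figures, the meeting scenario lands 0.1 LU under the reference, while the other three scenarios clear it by 2.3 to 4.9 LU.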


Unlike in playback, the iPhone 12’s recordings exhibit aggressive temporal artifacts: when filming in loud environments, pumping is particularly noticeable in both life and selfie videos. Further, slight distortion appears on shouting voices, and microphones can easily be occluded by the user’s hands. In our meeting room scenario, our sound engineers noticed no compression; however, they heard slight distortions.


Background rendering with the iPhone 12’s microphones sounds particularly natural, comfortable, and non-aggressive. Frequency ranges (bass/mids/treble) are very well balanced, and few artifacts are noticeable overall. The performance is equally good for life and for selfie videos.


Discreetly but surely, Apple improves the audio quality of its iPhone series. While no new earth-shattering audio technology is advertised, nearly every sub-score is higher than the previous generation’s — especially in recording, where the iPhone 12 performed consistently better than its predecessor. Apart from certain audio and user artifacts, the iPhone 12 is a solid performer across all our tested attributes and use cases, with a great timbre performance for both playback and recording, powerful dynamics, immersive and realistic spatial attributes, highly satisfying volume attributes, and natural background renderings.



Playback pros

  • Great timbre performance, with precise high-ends and good low-end extension
  • Tonal balance remains consistent regardless of the volume.
  • Sharp attack and good punch, even at soft volumes
  • Good wideness of the sound field, great balance between the left/right channels, and realistic distance rendering


Playback cons

  • Low-end resonances and distortion from nominal to maximum volume slightly impair the overall bass quality and precision.
  • Left and right channels are inverted in landscape mode.



Recording pros

  • Great timbre performance
  • Sharp attack and good punch
  • Good preservation of the sound envelope
  • Sound stage is wide, distance is realistic, and localizability is precise.
  • Loudness is good in every use case.


Recording cons

  • The microphones are too easy to occlude.
  • When recording in loud environments, aggressive compression and slight distortion are both noticeable.
