We put the Xiaomi 12 through our rigorous DXOMARK Audio test suite to measure its performance both at recording sound using its built-in microphones, and at playing audio back through its speakers.
In this review, we will break down how it fared in a variety of tests and several common use cases.
The Xiaomi 12’s audio was quite similar to the 12 Pro’s performance, even though the 12 has stereo speakers, compared with the 12 Pro’s four-speaker setup. When it came to volume, the Xiaomi 12 actually received a higher score than the Pro version. But the Pro did better in other key areas like playback dynamics. In playback, the Xiaomi 12 scored well in the gaming use case. Playback and recording showed good wideness and few artifacts. Both playback and recording lacked bass for a premium-segment device. But overall, the Xiaomi 12 remains a steady performer with no big flaws and some very good points for all types of playback and recording scenarios.
Key audio specifications include:
- Stereo speakers: top left – bottom left
- No jack audio output
- Dolby Atmos, tuned by Harman Kardon
Playback pros:
- Pleasant tonal balance at maximum volume
- Snappy attack and good punch across all volume steps
- Good wideness
- Mostly artifact-free
Playback cons:
- Tonal balance lacks bass and brilliance, resulting in a quite muffled overall sonority
- First volume step is a bit too quiet
Recording pros:
- Very good Spatial performance, with especially good wideness in general and flawless localizability
- Very good Dynamics performance, with good SNR and envelope rendition
- Natural timbre overall, especially with recorder app
- Pretty clean artifacts-wise, except for some clipping
Recording cons:
- Severe lack of bass and limited high-end extension
- Lack of low-midrange and brightness
- Less convincing timbre performance in high-SPL
About DXOMARK Audio tests: For scoring and analysis in our smartphone audio reviews, DXOMARK engineers perform a variety of objective tests and undertake more than 20 hours of perceptual evaluation under controlled lab conditions.
The following section gathers key elements of the exhaustive tests and analyses performed in DXOMARK laboratories. Detailed performance evaluations in the form of reports are available upon request. Do not hesitate to contact us.
DXOMARK engineers test playback through the smartphone speakers, whose performance is evaluated in our labs and in real-life conditions, using default apps and settings.
The Xiaomi 12’s timbre performance is very good, especially at high volume. The tonal balance lacks some bass and treble, however, which puts the emphasis on the high-midrange and makes the inconsistency of the low-midrange obvious. That combination makes the Xiaomi 12 sound thin and muffled. Timbre is much more satisfying at maximum volume thanks to added brightness, despite occasional, slightly aggressive resonances in the high-midrange. Still, the overall lack of bass results in a thin tonal balance, which improves as the volume increases. Dynamics performance is fine, even though attack is hindered by distortion. Bass precision and punch are quite good at every volume level, despite the lack of bass and low-midrange energy.
In the spatial attribute, the device offered decent localizability, except in classical music, where specific instruments were hard to pinpoint. Wideness was good and on par with the Mi 11 series, but the audio did not rotate accordingly in inverted landscape orientation, in both music and movie applications. Otherwise, the Xiaomi 12 offered good balance, with a realistic rendition of distance and localizability of voices. The maximum volume step was satisfying, but intelligibility at the minimum volume step was subpar, which will probably affect the listening experience with content that has a broad dynamic range.
When it comes to artifacts, the Xiaomi 12 is nearly free of them, except for a bit of distortion. The device’s main drawback in this attribute is that its speakers are easily occluded when the phone is held, because of their placement at the top left and bottom left of the device.
The Timbre score represents how well a phone reproduces sound across the audible tonal range and takes into account bass, midrange, treble, tonal balance, and volume dependency. It is the most important attribute for playback.
The Dynamics score measures the accuracy of changes in the energy level of sound sources, for example, how precisely a bass note or the impact of a drum hit is reproduced.
The sub-attributes for spatial tests include pinpointing a specific sound's location, its positional balance, distance, and wideness.
The Volume score represents the overall loudness of a smartphone and how smoothly volume increases and decreases based on user input.
| Device | | |
|---|---|---|
| Xiaomi 12 | 73.6 dBA | 69.7 dBA |
| Xiaomi 11T | 71.3 dBA | 68.6 dBA |
| Google Pixel 6 | 74.8 dBA | 69.7 dBA |
The Artifacts score measures the extent to which the sound is affected by various types of distortion. The higher the score, the less noticeable the disturbances in the sound. Distortion can occur because of sound processing in the device and because of the quality of the speakers.
DXOMARK engineers test recording by evaluating the recorded files on reference audio equipment. Those recordings are done in our labs and in real-life conditions, using default apps and settings.
The Xiaomi 12 showed its strength in recording, with a subscore that was slightly better than the 12 Pro’s. Timbre stood out for its very natural tonal balance in simulated use cases, but it was less natural in the electronic concert, life video, and selfie video scenarios. In the main video app, timbre was affected by dull treble and a very limited high-end extension, regardless of how the phone was held and of the SPL scenario. Timbre was much better when using the recorder app, but tonal balance was focused mostly on the midrange, depending on the content.
Dynamics performance was very good for a device in this segment, with a full envelope and very little compression or pumping getting in the way of the audio experience at high SPL. As with most Xiaomi devices, spatial performance offered great wideness in life videos, with precise localizability and realistic distances in most use cases. Selfie videos, on the other hand, had somewhat more limited wideness and a narrower stereo scene.
The Xiaomi 12 delivered excellent performance in maximum loudness tests, showing nearly no distortion at high SPLs, such as at electronic concerts. It had a few noticeable artifacts, such as slight pumping and clipping on loud outbursts. While the microphones are not easily occluded, fingers near or on a microphone do sound a bit loud in the recordings. Background noises, however, were natural-sounding and not intrusive.
The Timbre score represents how well a phone captures sounds across the audible tonal range and takes into account bass, midrange, treble, and tonal balance. It is the most important attribute for recording.
The Dynamics score measures the accuracy of changes in the energy level of sound sources, for example, how precisely a voice's plosives (the p's, t's and k's, for example) are reproduced. The score also considers the signal-to-noise ratio (SNR), for example, how loud the main voice is compared to the background noise.
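To make the SNR idea concrete, here is a minimal Python sketch of the textbook definition: the ratio of the RMS level of the main voice to the RMS level of the background noise, expressed in decibels. This is only an illustration of the general formula, not DXOMARK's measurement procedure; the function names and the sample values are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of linear amplitude samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def snr_db(voice, noise):
    """Signal-to-noise ratio in decibels: 20 * log10(RMS voice / RMS noise)."""
    return 20.0 * math.log10(rms(voice) / rms(noise))


# Hypothetical amplitude samples, for illustration only (not DXOMARK data).
voice = [0.5, -0.5, 0.5, -0.5]
noise = [0.05, -0.05, 0.05, -0.05]

# The voice is ten times the noise amplitude, which corresponds to about 20 dB.
print(round(snr_db(voice, noise), 1))
```

A higher value means the main voice stands further above the background noise.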
The sub-attributes for spatial tests include pinpointing a specific sound's location, its positional balance, distance, and wideness on the recorded audio files.
The Volume score represents how loud audio is normalized on the recorded files and how the device handles loud environments, such as electronic concerts, when recording.
| Device | Meeting | Life Video | Selfie Video | Memo |
|---|---|---|---|---|
| Xiaomi 12 | -28.8 LUFS | -21.6 LUFS | -19.9 LUFS | -23.4 LUFS |
| Xiaomi 11T | -26.8 LUFS | -21.2 LUFS | -19.5 LUFS | -20.3 LUFS |
| Google Pixel 6 | -27.8 LUFS | -17.9 LUFS | -16.3 LUFS | -19.8 LUFS |
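The loudness figures in the table above can be compared directly, because a difference between two LUFS values is expressed in loudness units (LU), and 1 LU corresponds to 1 dB. The short Python sketch below computes how much quieter or louder the Xiaomi 12's recordings are than the Google Pixel 6's in each use case. The values are copied from the table; the dictionary and function names are hypothetical and chosen only for this illustration.

```python
# Loudness values copied from the table above, in LUFS.
LOUDNESS_LUFS = {
    "Xiaomi 12": {"Meeting": -28.8, "Life Video": -21.6, "Selfie Video": -19.9, "Memo": -23.4},
    "Xiaomi 11T": {"Meeting": -26.8, "Life Video": -21.2, "Selfie Video": -19.5, "Memo": -20.3},
    "Google Pixel 6": {"Meeting": -27.8, "Life Video": -17.9, "Selfie Video": -16.3, "Memo": -19.8},
}


def loudness_gap_lu(device, reference, use_case):
    """Loudness of `device` minus loudness of `reference`, in LU (1 LU = 1 dB)."""
    return LOUDNESS_LUFS[device][use_case] - LOUDNESS_LUFS[reference][use_case]


# A negative gap means the Xiaomi 12 recording is quieter than the reference.
for use_case in LOUDNESS_LUFS["Xiaomi 12"]:
    gap = loudness_gap_lu("Xiaomi 12", "Google Pixel 6", use_case)
    print(f"{use_case}: {gap:+.1f} LU")
```

For example, the Xiaomi 12's life video recordings come out 3.7 LU quieter than the Pixel 6's, while the gap in the meeting use case is only 1.0 LU.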
The Artifacts score measures the extent to which the recorded sounds are affected by various types of distortion. The higher the score, the less noticeable the disturbances in the sound. Distortion can occur because of sound processing in the device and the quality of the microphones, as well as user handling, such as how the phone is held.
The Background score evaluates how naturally the various sounds around a voice blend into the video recording file. For example, when recording a speech at an event, the background should not interfere with the main voice, yet it should provide some context of the surroundings.