
Vivo X51 5G Audio review: A single speaker can’t work wonders


Last October, Vivo made a noteworthy entrance into the European market with the X51, a slightly tweaked version of a premium smartphone already known in Asia as the X50. While the marketing emphasis is certainly on its innovative camera setup, the phone has more than one trick up its sleeve: powered by an AK4377A independent audio chip, the X51 promises to “immerse you in a world of music and sound.” When recording videos, it also offers Autozoom and 3D Sound Tracking to “follow the sound coming from a chosen subject, no matter where it goes.”

We put the Vivo X51 through our rigorous DXOMARK Audio test suite to measure its performance both at recording sound using its built-in microphones, and at playing audio back through its single built-in speaker. In this review, we will break down how it fared in a variety of tests and several common use cases.

Audio specifications include:

  • One side-firing speaker on the bottom
  • Autozoom and 3D Sound Tracking technologies
  • No headphone jack

About DXOMARK Audio tests: For scoring and analysis in our smartphone audio reviews, DXOMARK engineers perform a variety of objective tests and undertake more than 20 hours of perceptual evaluation under controlled lab conditions. This article highlights the most important results of our testing. Note that we evaluate both Playback and Recording using only the device’s built-in hardware and default apps.

Test summary


Vivo X51 5G
Audio: 53
Playback: 51
Recording: 61

With an overall score of 53, the Vivo X51 5G places toward the lower end of our DXOMARK Audio rankings, well behind the Xiaomi Mi 10 Pro, the top-scoring phone to date with a score of 76.

The phone’s Playback score is heavily brought down by its single-speaker design and by its lack of high- and low-end extension. This results in a midrange-focused rendering, nonexistent wideness, an off-center left/right balance, and weak dynamics. While maximum volume is decent, minimum volume isn’t well tuned, causing a loss of intelligibility for dynamic content (such as classical music or movies). That said, the phone has one impressive skill: it does an excellent job of managing sonic artifacts, both spectral (distortion) and temporal (compression), especially from quiet to nominal volumes.

Recording a life video with the Vivo X51 5G

Audio recorded through the Vivo X51 5G’s microphones doesn’t have much more to offer. In recording, too, the frequency response is shrunk down to a midrange-focused reproduction, which also affects the sound envelope. Spatial performance is slightly superior in the recording area, thanks to good localizability and above-average directivity. However, stereo is inverted in life videos (landscape mode), and an elevated level of background noise impairs the distance rendering as well as the signal-to-noise ratio. Finally, in loud environments, recordings exhibit compression and distortion, but otherwise, the phone is as effective at controlling artifacts when recording audio as it is when playing it back.

Sub-scores explained

The DXOMARK Audio overall score of 53 for the Vivo X51 5G is derived from its Playback and Recording scores and their respective sub-scores. In this section, we’ll take a closer look at these audio quality sub-scores and explain what they mean for the user.

Playback

Timbre

Vivo X51 5G: 54
Best: Black Shark 4 Pro (82)

Timbre tests measure how well a phone reproduces sound across the audible tonal range and take into account bass, midrange, treble, tonal balance, and volume dependency.

The Vivo X51 5G’s playback timbre performance is below average, with a tonal balance that lacks both high- and low-end extension. This produces a particularly midrange-focused sound. Classical music, pop rock, and movies fare slightly better than other genres, but still remain below average.

Music playback frequency response

That said, compared to similar mono devices, the overall rendering is fairly clear.

Dynamics

Vivo X51 5G: 59
Best: Black Shark 4 Pro (81)

DXOMARK’s dynamics tests measure how well a device reproduces the energy level of a sound source, and how precisely it reproduces bass frequencies.

Sound played back through the X51 5G’s single speaker exhibits middling dynamics: attack is weak and lacks sharpness, and both bass precision and punch are impaired by the lack of low-end extension. At maximum volume, noticeable compression occasionally appears, which further weakens the punch.

Spatial

Vivo X51 5G: 27
Best: Black Shark 4 Pro (82)

Sub-attributes for perceptual spatial tests include localizability, balance, distance, and wideness.

In the Spatial sub-category, the X51 5G is only one up from the lowest-scoring phone in this category, namely the Honor 20 Pro. Since it is a mono device, the sound field’s wideness is nonexistent, and its left/right balance is drastically shifted to the right side of the device (in landscape mode).

Localizability of the sound sources is average when evaluated against similarly built devices, but quite poor in comparison to stereo smartphones. The distance rendering, however, is reasonably realistic.

Volume

Vivo X51 5G: 55
Best: Black Shark 4 Pro (79)

Volume tests measure both the overall loudness a device is able to reproduce and how smoothly volume increases and decreases based on user input.

Playback volume isn’t one of the X51 5G’s strong suits either. Minimum volume doesn’t allow dynamic content such as classical music or movies to remain intelligible. Further, as shown in the graph above, volume could increase in a more consistent and natural manner (as is the case with LG’s V60 ThinQ 5G). Maximum volume is average for a mono device.

Device              Hip-Hop     Classical
Vivo X51 5G         71.3 dBA    70.4 dBA
Oppo Reno4 Pro 5G   75.4 dBA    72.1 dBA
LG V60 ThinQ 5G     75.3 dBA    71.8 dBA
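
To put the levels in the table above into perspective, here is a minimal Python sketch that computes the gap in decibels between two devices and converts it to a sound pressure ratio using the standard relation ratio = 10^(gap / 20). The dictionary layout and the function names are illustrative assumptions, not part of the DXOMARK protocol; only the dBA values come from the table.

```python
# Levels in dBA, copied from the table above.
# The data layout and helper names are hypothetical, for illustration only.
levels_dba = {
    "Vivo X51 5G": {"hip_hop": 71.3, "classical": 70.4},
    "Oppo Reno4 Pro 5G": {"hip_hop": 75.4, "classical": 72.1},
    "LG V60 ThinQ 5G": {"hip_hop": 75.3, "classical": 71.8},
}


def level_gap_db(device, reference, genre):
    """Return how many dB louder `device` is than `reference` for a genre."""
    gap = levels_dba[device][genre] - levels_dba[reference][genre]
    return round(gap, 1)


def pressure_ratio(gap_db):
    """Convert a level difference in dB to a sound pressure ratio."""
    return 10 ** (gap_db / 20)


if __name__ == "__main__":
    for device in ("Oppo Reno4 Pro 5G", "LG V60 ThinQ 5G"):
        for genre in ("hip_hop", "classical"):
            gap = level_gap_db(device, "Vivo X51 5G", genre)
            print(f"{device} vs Vivo X51 5G ({genre}): "
                  f"+{gap} dB, pressure ratio {pressure_ratio(gap):.2f}")
```

Under these assumptions, the Oppo Reno4 Pro 5G measures 4.1 dB above the X51 5G on the hip-hop sample, which corresponds to roughly 1.6 times the sound pressure.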

Artifacts

Vivo X51 5G: 90
Best: Asus ROG Phone 5 (93)

Artifacts tests measure how much source audio is distorted when played back through a device’s speakers. Distortion can occur both because of sound processing in the device and because of the quality of the speakers.

Unlike every other playback sub-attribute, the X51 5G delivers a stellar performance in terms of artifacts management, tied with our best-scoring phone so far.

Except for occasional pumping at maximum volume, almost no type of artifact is noticeable — whether temporal (compression), spectral (distortion), or noise. The Vivo X51 fares best when playing games, where no user-induced artifact (such as possible occlusion of the speaker by the user’s fingers or palms) is encountered. Movie and music playback are not far behind, with excellent sub-scores as well.

The speaker isn’t easy to occlude when playing games.

Recording

Timbre

Vivo X51 5G: 58
Best: Black Shark 4 Pro (86)

Similarly to the playback timbre performance, the overall reproduction is focused on midrange frequencies and suffers from a lack of high-end extension in both selfie and life videos — particularly noticeable when filming in urban surroundings. In loud environments, bass and low-mids are also recessed (especially in selfie videos), which results in slightly nasal midrange frequencies.

Life video frequency response

That said, the X51 5G fares well in our meeting use case, thanks to a harmonious tonal balance.

Meeting recordings exhibit good tonal balance.

Dynamics

Vivo X51 5G: 61
Best: OnePlus 8 (78)

The X51 5G delivers a rather low signal-to-noise ratio, due to elevated background noise. Recessed treble impairs the envelope reproduction, which in turn impairs the sharpness of plosives (sounds like “p” and “b”). When recording in loud environments, the envelope is also affected by noticeable distortion and compression.

Spatial

Vivo X51 5G: 58
Best: Black Shark 4 Pro (78)

The X51 5G fares better when recording spatial attributes, thanks to decent localizability of the sound sources (albeit slightly impaired by the lack of high-end extension) and above-average directivity. Selfie videos, for instance, provide a particularly well-suited directivity pattern, which helps attenuate sound sources outside the field of view; the other side of that coin is that the sound field is consequently very narrow. The default setting when recording life videos, however, does not implement such directivity. (It is available by switching from “surround recording” to the “front recording” or “back recording” option in the video app settings.)

Also note that stereo is inverted in life video recordings. Distance rendering is broadly affected by the lack of treble and by loud background noise, in which voices are easily drowned out.

Selfie videos benefit from a directivity pattern which focuses on the user.

The 3D Sound Tracking technology undoubtedly has an effect on the spatialization in recorded videos. The target sound does indeed follow moving and speaking subjects: balance is quite precise, which helps with the localization of sound sources. That said, the feature is quite tricky to activate: besides requiring the user to tap on the subject to track it, the device doesn’t make it clear if the tracking is on or off. Additionally, depending on the recording, the balance between left and right channels sometimes seems to be inverted: when the subject is going to the left, the sound occasionally goes… to the right.

Volume

Vivo X51 5G: 62
Best: Xiaomi Mi 10S (89)

While the nominal loudness of recorded content is only average, the maximum level reachable without disturbing artifacts is good. This adds up to a decent recording volume performance for the Vivo X51 5G. Here are our test results, measured in LUFS (Loudness Units relative to Full Scale); as a reference, we expect loudness levels to be above -24 LUFS for recorded content:

Device              Meeting       Life Video    Selfie Video   Memo
Vivo X51 5G         -29.2 LUFS    -24.1 LUFS    -20.5 LUFS     -23.7 LUFS
Oppo Reno4 Pro 5G   -30 LUFS      -22.4 LUFS    -20.8 LUFS     -23.5 LUFS
LG V60 ThinQ 5G     -24 LUFS      -15.6 LUFS    -15.4 LUFS     -19.2 LUFS
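
The comparison against the -24 LUFS reference can be sketched in a few lines of Python. This is only an illustration: the dictionary layout and the function name are hypothetical, and treating a reading of exactly -24 LUFS as meeting the reference is an assumption, since the text only says levels are expected to be "above" it.

```python
# Reference loudness stated in the text, in LUFS.
REFERENCE_LUFS = -24.0

# Measurements copied from the table above (LUFS).
# The data layout and helper name are hypothetical, for illustration only.
loudness_lufs = {
    "Vivo X51 5G": {"meeting": -29.2, "life_video": -24.1,
                    "selfie_video": -20.5, "memo": -23.7},
    "Oppo Reno4 Pro 5G": {"meeting": -30.0, "life_video": -22.4,
                          "selfie_video": -20.8, "memo": -23.5},
    "LG V60 ThinQ 5G": {"meeting": -24.0, "life_video": -15.6,
                        "selfie_video": -15.4, "memo": -19.2},
}


def below_reference(device):
    """List the use cases in which a device is quieter than the reference.

    Assumption: a reading exactly at the reference counts as meeting it.
    """
    return [use_case
            for use_case, level in loudness_lufs[device].items()
            if level < REFERENCE_LUFS]


if __name__ == "__main__":
    for device in loudness_lufs:
        print(device, "falls short in:", below_reference(device) or "none")
```

With these values, the Vivo X51 5G falls short of the reference in the meeting and life video use cases, the latter by only 0.1 LU.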

Artifacts

Vivo X51 5G: 81
Best: Asus ROG Phone 5 (88)

When recording, too, the Vivo X51 5G does an excellent job at keeping sonic artifacts to a minimum… except in high-SPL scenarios. From soft to nominal volumes, both spectral and temporal artifacts are very well controlled, with only slight distortion and clipping on shouting voices. In loud environments, however, distortion and hissing become noticeable, especially on midrange-focused content.

Background

Vivo X51 5G: 28
Best: Apple iPhone XS Max (58)

The Vivo X51 5G’s background performance is quite limited by its tonal imbalance, which results in a perceivable loss of details, and therefore a lack of realism. That said, our experts noticed no artifacts during our background recording tests, except when filming selfie videos.

Conclusion

Despite being a high-end smartphone, the Vivo X51 5G can’t quite keep its promise of immersing the listener in a world of music and sound. Although it proves impressively effective at keeping sonic artifacts under control, its single-speaker design, significant tonal imbalance, and elevated background noise affect its timbre, spatial, and dynamics performances in both playback and recording.

Playback

Pros

  • Tonal balance is fairly clear compared to similar mono devices.
  • Almost no noticeable artifacts

Cons

  • Lack of low- and high-end extension, leading to a midrange-focused sound reproduction.
  • Poor spatial performance (especially for wideness and balance) due to the single-speaker design.
  • Limited attack, punch, and bass precision due to midrange-focused tonal balance.
  • Minimum volume isn’t well tuned, resulting in a lack of intelligibility for dynamic content.

Recording

Pros

  • Good tonal balance in meeting use case
  • Few noticeable artifacts overall

Cons

  • Inverted stereo in life video
  • Midrange-focused tonal balance and lack of high- and low-end extension
  • Envelope affected by lack of treble, as well as by compression and distortion when recording in loud environments.
