
Baidu Xiaodu Smart Speaker Ultimate Edition Speaker review: Struggles at playback


After launching the intriguing Raven H smart speaker in 2017, which reportedly sold a disappointing number of units, Baidu went to the other extreme. In June 2018, the Chinese company released the Xiaodu, a low-cost smart speaker that went on to sell millions of units in only a few months. By the end of 2019, this growth had led Baidu to overtake Alibaba in the Chinese smart speaker market and to edge past Google globally, becoming Amazon’s closest smart speaker rival.

In May 2020, Baidu rolled out a second edition of the Xiaodu, called the Xiaodu Smart Speaker Ultimate Edition. While the updated version boasts an impressive list of connected features and voice recognition technologies, Baidu says little about its audio playback abilities. In fact, the company gives no information at all, except that the very compact speaker integrates “modeling technology to ensure advanced acoustics” and is able to “automatically synthesize the parents’ tones to tell over 500 stories for children.”

To find out more about this best-seller’s audio performance, we put the Baidu Xiaodu Smart Speaker Ultimate Edition through our rigorous DXOMARK Wireless Speaker test suite. In this review, we will break down how it fared at audio playback in a variety of tests and several common use cases.

Key specifications include:

    • Single upward-firing speaker
    • Bluetooth 4.2
    • No battery
    • DuerOS assistant
    • 300g

Test conditions:

    • Tested with Motorola G8
    • Communication protocol used: Bluetooth
    • Firmware version:

About DXOMARK Wireless Speaker tests: For scoring and analysis in our wireless speaker reviews, DXOMARK engineers perform a variety of objective tests and undertake more than 20 hours of perceptual evaluation under controlled lab conditions. This article highlights the most important results of our testing. Note that we evaluate playback using only the device’s built-in hardware. (For more details about our Speaker protocol, click here.) The Baidu Xiaodu Smart Speaker Ultimate Edition falls into the Essential category of devices in the DXOMARK Speaker rankings.

Test summary


With a global score of 48, the Baidu Xiaodu is among the lowest-scoring speakers we have tested so far. With the exception of the artifacts category, all of its sub-scores rank last in our database. Its overall playback performance is heavily hampered by a tonal imbalance resulting from absent bass, recessed treble, and a lack of low-midrange frequencies. Consequently, bass precision and punch performances are both particularly poor.

The Baidu Xiaodu’s overall playback performance is impaired by a severe tonal imbalance.

Since the Xiaodu is built around a single upward-firing speaker, wideness is essentially non-existent. And to cap it all, Baidu’s low-cost speaker delivers insufficient maximum volume and inconsistent volume steps.

The good news is that undesirable artifacts are under control. Indeed, the Baidu Xiaodu produces fairly clean sound from soft to nominal volumes, and very few artifacts at maximum volumes (granting that its maximum volume is considerably lower than that of most of the other tested speakers). It also ensures good localizability, and realistic distance rendering for vocal content.

Sub-scores explained

The DXOMARK Speaker overall score of 48 for the Baidu Xiaodu Smart Speaker Ultimate Edition is derived from a range of sub-scores. In this section, we will take a closer look at these audio quality sub-scores and explain what they mean for the user, and we will show some comparison data from two of the Xiaodu’s principal competitors in the Essential category, the TMall Genie X5 and the Yandex Station.

Playback attribute comparisons


Timbre

Best: Bowers & Wilkins Formation Wedge (152)

DXOMARK timbre tests measure how well a speaker reproduces sound across the audible tonal range and takes into account bass, midrange, treble, tonal balance, and volume dependency.

Playback timbre comparison

Most notable in our timbre tests of the Baidu Xiaodu is the deficit of high- and low-end extension. In other words, bass is essentially absent, and treble is critically recessed.

Low-mids are also lacking; as shown in the graph below, the frequency response starts dropping below 300 Hz, and plummets below 200 Hz.

Music playback frequency response

This means that the speaker reproduces a particularly narrow frequency range centered on the high-mids, resulting in an overall nasal sound that is ill-suited both to music of any genre and to watching movies. With such a timbre performance, it is safe to say that most children would hardly recognize their “synthesized parents’ tones.”


Dynamics

Best: Bowers & Wilkins Formation Wedge (137)

Our dynamics tests measure how well a device reproduces the energy level of a sound source, taking into account attack, bass precision, and punch.

Playback dynamics comparison

The Xiaodu’s dynamics sub-score is also quite weak. While attack remains acceptable thanks to the prevalence of high-mids, the lack of low-mids impairs punch, and unsurprisingly, bass precision is severely affected by the near-absence of bass.


Spatial

Best: Bowers & Wilkins Formation Wedge (111)

Our spatial tests measure a speaker’s ability to reproduce stereo sound in all directions, taking into account localizability, balance, wideness, distance, and directivity.

Playback spatial comparison
Sound is evenly spread over 360° due to the Xiaodu’s single upward-firing speaker design.

In the spatial area as well, Baidu’s best-selling speaker leaves something to be desired. Its sub-score of 62 is heavily impaired by an absence of wideness and unrealistic distance rendering except for vocal content such as podcasts.

Playback directivity

Because of the Xiaodu’s single upward-firing speaker design, its directivity performance is great — meaning that sound spreads out evenly at 360° around the speaker. Furthermore, its small form factor and sound field narrowness ensure good localizability of the various sound sources.


Volume

Best: Yandex Station (136)

Our volume tests measure both the maximum loudness a speaker is able to produce and how smoothly volume increases and decreases based on user input.

Playback volume comparison
Playback volume consistency comparison

In volume testing, the Xiaodu performs poorly, with a maximum volume unable to reach the target SPL (sound pressure level) for our protocol’s party scenario. This is why you will see only two distortion (THD) curves in the artifacts section further below, since the loud and maximum volumes are identical. Additionally, as shown in the graph above, the volume steps are far from being consistent.
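One simple way to put a number on step consistency is to look at how much the level changes from one volume step to the next, and how much those changes vary. The Python sketch below is an illustration only, not DXOMARK's protocol: the function names are hypothetical, and the two lists of readings are made-up examples, not measurements of the Xiaodu.

```python
from statistics import mean, pstdev


def step_sizes(levels_db):
    """Level change (in dB) between each pair of consecutive volume steps."""
    return [b - a for a, b in zip(levels_db, levels_db[1:])]


def step_consistency(levels_db):
    """Summarize how evenly the level rises across the volume steps."""
    steps = step_sizes(levels_db)
    return {
        "mean_step_db": mean(steps),
        "spread_db": pstdev(steps),      # 0 means perfectly even steps
        "largest_step_db": max(steps),
        "smallest_step_db": min(steps),
    }


# Hypothetical readings (dB) at successive volume steps, for illustration only.
even_steps = [40, 43, 46, 49, 52, 55]     # every step adds 3 dB
uneven_steps = [40, 41, 47, 48, 55, 55]   # steps of 1, 6, 1, 7, and 0 dB

if __name__ == "__main__":
    print("even:  ", step_consistency(even_steps))
    print("uneven:", step_consistency(uneven_steps))
```

Both example lists cover the same 15 dB range with the same average step, but the second one jumps and stalls, which is the kind of behavior a listener notices as inconsistent volume steps.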

Baidu’s Xiaodu was unable to reach our target sound pressure level for the party scenario.

Here are a few SPL values measured at maximum volume with our pink noise signals and sample music recordings:

| Speaker | Correlated pink noise | Uncorrelated pink noise | Hip-hop | Classical | Latin | Asian pop |
|---|---|---|---|---|---|---|
| Baidu Xiaodu Smart Speaker Ultimate Edition | 67 dBA | 64.1 dBA | 65 dBA | 57.8 dBA | 66.7 dBA | 60.2 dBA |
| TMall Genie X5 | 75.4 dBA | 73.8 dBA | 72.1 dBA | 67.1 dBA | 73.9 dBA | 66.3 dBA |
| Yandex Station | 91.7 dBA | 89 dBA | 86.7 dBA | 79.4 dBA | 89.2 dBA | 80.2 dBA |
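To get a feel for what these gaps mean, the maximum-volume levels listed above can be turned into level differences and sound-pressure ratios, using the standard relation that a difference of 20 dB corresponds to a tenfold change in sound pressure. The short Python sketch below does this with the published figures. The function and variable names are our own, and because the levels are A-weighted, the ratios apply to A-weighted sound pressure rather than to perceived loudness.

```python
# Maximum-volume levels in dBA, copied from the figures above.
TRACKS = ["Correlated pink noise", "Uncorrelated pink noise",
          "Hip-hop", "Classical", "Latin", "Asian pop"]
LEVELS_DBA = {
    "Baidu Xiaodu": [67.0, 64.1, 65.0, 57.8, 66.7, 60.2],
    "TMall Genie X5": [75.4, 73.8, 72.1, 67.1, 73.9, 66.3],
    "Yandex Station": [91.7, 89.0, 86.7, 79.4, 89.2, 80.2],
}


def pressure_ratio(delta_db):
    """Convert a level difference in dB to a sound-pressure ratio."""
    return 10 ** (delta_db / 20)


def gaps_to_reference(reference="Baidu Xiaodu"):
    """Return {speaker: [(track, delta_db, ratio), ...]} relative to the reference."""
    ref_levels = LEVELS_DBA[reference]
    gaps = {}
    for speaker, levels in LEVELS_DBA.items():
        if speaker == reference:
            continue
        gaps[speaker] = [
            (track, round(level - base, 1), round(pressure_ratio(level - base), 1))
            for track, level, base in zip(TRACKS, levels, ref_levels)
        ]
    return gaps


if __name__ == "__main__":
    for speaker, rows in gaps_to_reference().items():
        print(speaker)
        for track, delta_db, ratio in rows:
            print(f"  {track:<24} +{delta_db:.1f} dB (about {ratio:.1f}x sound pressure)")
```

On correlated pink noise, for example, the Yandex Station plays 24.7 dB louder than the Xiaodu, which works out to roughly 17 times the sound pressure, while the TMall Genie X5 is 8.4 dB louder.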


Artifacts

Best: Sonos Five (133)

Our artifacts tests measure how much source audio is distorted when played back, along with other sound artifacts such as noise, pumping effects, and clipping. Distortion and other artifacts can occur both because of sound processing and because of the quality of the speakers.

Playback artifacts comparison

In light of its performance in most of our other attribute categories, the Xiaodu’s control of artifacts comes as a pleasant surprise: its sub-score is only two points away from that of the top-scoring speaker in this category, the Amazon Echo Studio. But let’s not forget that most artifacts are typically triggered at loud volumes, often by low- and high-end frequencies. Since the Xiaodu severely lacks bass and treble, and since its maximum volume is fairly weak, controlling undesirable artifacts is a considerably easier task for it.

Thus from soft to nominal levels, no temporal artifacts are perceivable, and very few spectral artifacts can be heard. While compression and distortion certainly become more noticeable at loud and maximum volumes (especially in our bathroom, outdoor, and party use cases), they remain within an acceptable range.

Playback total harmonic distortion

Our testers perceived no user artifacts; however, the Bluetooth latency is quite significant, which makes the wireless connection unsuitable for watching videos, unless the player offers the possibility of adjusting the delay manually.


Conclusion

The Baidu Xiaodu’s single upward-firing speaker delivers a low-scoring performance, with significant tonal imbalance, considerable lack of bass and treble, poor dynamics, absent wideness, and weak maximum volume. The only bright side is that both spectral and temporal artifacts are well under control. In light of all this, it is fair to say that Baidu’s best-selling smart speaker is undoubtedly more smart than speaker.


Pros

  • With only one upward-firing speaker, sound is evenly distributed around the speaker.
  • Undesirable sound artifacts are kept under control.
  • Localizability of sound sources is decent.


Cons

  • Timbre suffers from a severe tonal imbalance, with bass, low-mids, and treble critically lacking.
  • The produced sound field is particularly narrow.
  • Overall dynamics performance is poor.
  • Maximum volume is well below average and volume steps are inconsistent.
