Why blind listening tests are flawed


This may sound like pure flame war bait - but here it is anyway. Since rebuilding my system from scratch, and auditioning everything from preamps to amps to DACs to interconnects to speaker cable, etc., it seems clearer than ever.

I notice that I'm easily fooled when comparing bad and great-sounding gear during blind auditions. Most would say "That should tell you that the quality of the gear is closer than you thought. Trust it".

But it's the process of blind listening tests that's causing the confusion, not a case of what I prefer to believe or justify to myself. And I think I know why it happens.

Understanding the sound of audio gear is a process of accumulating memories. You can listen to, say, new speakers for weeks and love them, until you start hearing something that bothers you, and eventually you can't stand them anymore.

Subconsciously you're building a library of impressions that continues to fill in the blanks of the overall sound. When all the holes are filled - you finally have a very clear grasp of the sonic signature. But we know that doesn't happen overnight.

This explains why many times you'll love how something sounds until you don't anymore. Anyone experience that? I have - with all 3 B&W speaker upgrades I've made in my life, just to name a few.

Swapping out gear short term for blind listening tests is therefore counterproductive for accurately understanding the characteristics of any particular piece or system, because it interrupts the accumulation of impressions and becomes subtractive rather than additive. Confusion becomes the guaranteed outcome instead of clarity. In fact it's a systematic unlearning of the sound characteristics, as the impression accumulation is randomized. Wish I could think of a simpler way of saying that.

OK, this is getting even further out there, but I also believe that when you're listening while looking at equipment, there are certain anchors that accumulate too. You may hear a hi-hat that sounds shimmering, and subconsciously that impression is associated with some metallic color or other visual aspect of the equipment you happen to be watching or remembering.

By looking at (or even mentally picturing) your equipment over time, you have an immediate association with its sound. Sounds strange, but I've noticed this happening myself - and I have no doubt it speeds up the process of getting a peg on the overall sound character.

Obviously blind tests would void that aspect too, resulting in less information rather than more for comparison.

Anyone agree with this? I don't remember hearing this POV before, but I'm sure many others have stated it because, of course, it happens to be true. ;)
larrybou
If you can't tell the difference with your eyes closed, then don't close your eyes when you listen.
Art Dudley has an interesting comparison in this month's Stereophile.

Two of the examples are brilliant.

a. You don't ask an art expert to determine whether a painting is a forgery using a blind ABX test.

b. A blind "sip" test showed Pepsi to be preferred in both Pepsi's and Coke's own tests, which caused Coke to launch the ill-fated New Coke. But there's a difference between a sip and a full can of the drink: while some may prefer Pepsi's (and New Coke's) sweeter taste in a sip, it's different when you drink an entire can.
The Coke/Pepsi/New Coke tests are a great way to illustrate the limitations of blind comparison tests. The only way you can really tell what people prefer is to see what they choose over the long run.

Interestingly, after 27 years of digital dominance, analog-chain LPs have been roaring back. Listeners are voting with their wallets, which is much more reliable than contrived short-term tests.

Short-term tests ignore the mechanism of mental schemas, whereby we build mental models of everything we sense. A short test ignores the mind's need to build schemas for the concepts involved (e.g., sonic signatures, musical values, etc.) and to compare their virtues over time.
Art Dudley's piece is a big disappointment to me. He constructs a straw man model of blind testing and flails away at it. I simply do not get his analogy about art forgery. Is there anyone out there saying you should examine visual art blindfolded? Plus he ignores the fact that peer-reviewed scientific (non-subjectivist) testing is used in forgery investigation. Dudley's straw man model is limited to rapid switching, and he is correct that such quick switching or short samples can be misleading, but he refuses to explore blind testing with long-term sampling. Blind testing with extended sample times is probably a very effective methodology for judging the quality of audio equipment.

The New Coke switch is probably the most studied business case ever. Pepsi knew what the outcome of their challenge would be. Coke knew it too. A sweeter drink is equivalent to an audio test where one sample is consistently louder than the other. The real question has always been why Coke reformulated. The answer to that question is still being debated, but what is undeniable is that Coke got more press attention and marketing buzz with the intro of New Coke and the reversion to Classic Coke than any consumer products company has ever received. The popular notion is that New Coke was a fiasco, but it actually revitalized Coke.