Reviews with all double blind testing?


In the July 2005 issue of Stereophile, John Atkinson discusses his debate with Arnold Krueger, who, Atkinson suggests, fundamentally wants only double blind testing of all products in the name of science. Atkinson goes on to discuss his early advocacy of such methodology and his realization that the conclusion that all amps sound the same, as the result of such testing, proved incorrect in the long run. Atkinson’s double blind test involved listening to three amps, so it apparently was not the typical same-or-different comparison favored by advocates of blind testing.

I have been party to three blind tests and several “shootouts,” which were not blind tests and thus resulted in each component having advocates, as everyone knew which was playing. None of these ever resulted in a consensus. Two of the three db tests were same or different comparisons. Neither of these resulted in a conclusion that people could consistently hear a difference. One was a comparison of about six preamps. Here there was a substantial consensus that the Bozak preamp surpassed more expensive preamps, with many designers of those preamps involved in the listening. In both cases there were individuals who were at odds with the overall conclusion, and in no case were those involved a random sample. In all cases there were no more than 25 people involved.

I have never heard of an instance where “same versus different” methodology ever concluded that there was a difference, but apparently comparisons of multiple amps and preamps, etc. can result in one being generally preferred. I suspect, however, that those advocating db mean only “same versus different” methodology. Do the advocates of db really expect that the outcome will always be that people can hear no difference? If so, is it the conclusion that underlies their advocacy rather than the supposedly scientific basis for db? Some advocates claim that, were a db test to find people capable of hearing a difference, they would no longer be critical, but is this sincere?

Atkinson puts it this way: the double blind test advocates want to be right rather than happy, while their opponents would rather be happy than right.

Tests of statistical significance also get involved here as some people can hear a difference, but if they are insufficient in number to achieve statistical significance, then proponents say we must accept the null hypothesis that there is no audible difference. This is all invalid as the samples are never random samples and seldom, if ever, of a substantial size. Since the tests only apply to random samples and statistical significance is greatly enhanced with large samples, nothing in the typical db test works to yield the result that people can hear a difference. This would suggest that the conclusion and not the methodology or a commitment to “science” is the real purpose.
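
The sample-size part of this argument can be made concrete with a short calculation. The sketch below is not taken from any published test; it assumes a single listener doing a same/different style trial 16 times, a 5% one-sided significance level, and a hypothetical listener who really is right 70% of the time. All of those numbers, and the function names, are illustrative assumptions.

```python
from math import comb

def p_value_at_least(correct: int, trials: int, chance: float = 0.5) -> float:
    """One-sided exact binomial p-value: P(X >= correct) if the listener is guessing."""
    return sum(comb(trials, k) * chance**k * (1 - chance)**(trials - k)
               for k in range(correct, trials + 1))

def min_correct_for_significance(trials: int, alpha: float = 0.05) -> int:
    """Smallest number of correct answers whose p-value falls below alpha."""
    for correct in range(trials + 1):
        if p_value_at_least(correct, trials) < alpha:
            return correct
    return trials + 1  # significance is unreachable with this few trials

def power(trials: int, true_rate: float, alpha: float = 0.05) -> float:
    """Probability that a listener with the given true hit rate reaches significance."""
    needed = min_correct_for_significance(trials, alpha)
    return sum(comb(trials, k) * true_rate**k * (1 - true_rate)**(trials - k)
               for k in range(needed, trials + 1))

needed_16 = min_correct_for_significance(16)   # answers needed out of 16
power_16 = power(16, 0.70)                     # assumed 70% listener, 16 trials
power_100 = power(100, 0.70)                   # same listener, 100 trials
```

Under these assumptions the listener needs 12 correct out of 16, and a listener who genuinely hears the difference 70% of the time would reach that bar somewhat less than half the time, while with 100 trials the same listener would almost always reach it. This only illustrates the effect of sample size; it says nothing about the separate objection that the listeners are not a random sample.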

Without db testing, the advocates suggest those who hear a difference are deluding themselves, the placebo effect. But were we to use db but other than the same/different technique and people consistently choose the same component, would we not conclude that they are not delusional? This would test another hypothesis that some can hear better.

I am probably like most subjectivists, as I really do not care what the outcomes of db testing might be. I buy components that I can afford and that satisfy my ears as realistic. Certainly some products satisfy the ears of more people, and sometimes these are not the positively reviewed or heavily advertised products. Again it strikes me, at least, that this should not happen in the world that the objectivists see. They see the world as full of greedy charlatans who use advertising to sell expensive items which are no better than much cheaper ones.

Since my occupation is as a professor and scientist, some among the advocates of double blind might question my commitment to science. My experience with same/different double blind experiments suggests to me a flawed methodology. A double blind multiple component design, especially with a hypothesis that some people are better able to hear a difference, would be more pleasing to me, but even here, I do not think anyone would buy on the basis of such experiments.

To use Atkinson’s phrase, I am generally happy and don’t care if the objectivists think I am right. I suspect they have to have all of us say they are right before they can be happy. Well tough luck, guys. I cannot imagine anything more boring than consistent findings of no difference among wires and components, when I know that to be untrue. Oh, and I have ordered additional Intelligent Chips. My, I am a delusional fool!
tbg

Showing 17 responses by qualia8

This Rouvin-Pabelson exchange is fascinating. I agree with Pabelson on just about everything. Perhaps that is because I'm an academic (I'm a philosopher, but I'm also part of the cognitive science faculty b/c of my courses on color and epistemology). Anyway, I'm no psychologist, but I am aware of the powerful external forces shaping perceptual evaluation. So I am especially leery of those extra-acoustical mechanisms, which are, by their very nature, hidden from us.

SOME RELEVANT PSYCHOLOGICAL MECHANISMS TO BEAR IN MIND.

To start with, there's the endowment effect. The experiment takes place at a three-day conference. At the beginning of the conference, everyone is given a mug. At the end of the conference, the organizers offer to buy the mugs back for a certain price. Turns out, people want something like $8 (can't remember the exact number) to give their mug back. But other groups at different conferences are not given the mug; it is sold to them. Turns out, the price they are willing to *pay* for the mug is, like $1. Conclusion: people very quickly come to think the things they have are worth more than things they don't have, but could acquire.

This may seem to run counter to our constant desire to swap out and upgrade in search of perfect sound, but it explains the superlatives that people use -- "best system I've ever heard," "sounds better than most systems costing triple"-- when describing mediocre systems they happen to own. (Other explanations for this are also possible, of course.)

Our audiophiliac tendencies are also in part explained by the "choice" phenomenon: when you are faced with a wide variety of options, you're not as happy with any of them as you otherwise would be. When subjects are offered three kinds of chocolate on a platter, they're pretty happy with their choice. But when they're offered twenty kinds, they're less happy even when they pick the identical chocolate. That's us!

Another endowment-like effect, though, and this is what got me to write this post, is one that happens after making a purchasing or hiring decision. After making the decision, say, to hire person A over person B, a committee will rate person A *much* higher than prior to the hiring decision, when person B was still an option. In other words, we affirm our choices after making them.

This phenomenon is more pronounced the more sacrifices you make in the course of the decision-making process. In other words, if you went all out to get candidate A, you'll think he's even better. Women know this intuitively. It's called playing hard to get.

In the audio realm, when you spend a couple grand on cables, your listening-evaluation mechanisms will *make* the sound better, because you have sacrificed for it.

So *this* made me wonder whether really expensive cables *do* sound better, to those who know what they cost and who made the sacrifice of buying them. If so, then those cables are worth every penny to those who value that listening experience. DBT cannot measure this difference, because it's not a physical difference in the sound. But it is still a *real* difference in the perceptual experiences of the listener. In the one case (expensive cables), your perceptual system is all primed and ready to hear clarity, depth, soundstage, air, presence, and so on. In the other case (cheap cables), your perceptual system is primed to hear grain, edge, sibilance, and so on. And hear them you do!

Best of all would be forgeries, *faked* expensive cables your wife could buy, knowing they were fakes, and stashing the unspent thousands in a bank account. You'd get to "hear" all of this wonderful detail, thinking you were broke, but years later, you'd have a couple hundred grand in your retirement fund!

Sorry for the rambling post, but I am interested to hear what Pabelson has to say. You are missing out, Pabelson. Knowing about the extra-acoustical mechanisms, you cannot "hear" the benefits of expensive cables. It's all ruined for you, as if you discovered your "wonderful" antidepressants were just pricey sugar pills.
I teach a course on the philosophy of color and color perception. One of the things I do is show color chips that are pairwise indistinguishable. I show a green chip together with another green chip that is indistinguishable. Then, I take away the first chip and show a third green chip that is indistinguishable from the second. And then I toss the second chip and introduce a fourth chip, indistinguishable from the third. At this point, I bring back the first green chip and compare it with the fourth. The fourth chip now looks bluish by contrast, and is easily distinguished from the original. How does that happen? We don't notice tiny differences, but they add up to noticeable differences. We can be walked, step-wise, from any color to any other color without ever noticing a difference, provided our steps are small enough!

Same for sound, I bet. That's why I don't understand the obsession with pair-wise double-blind testing of individual components. Comparing two amps, alone, may not yield a discriminable difference. Likewise, two preamps might be pairwise indiscriminable. But the amp-pre-amp combos (there will be four possibilities) may be *noticeably* different from one another. I bet this happens, but the tests are all about isolating one component and distinguishing it from a competitor, which is exactly wrong!
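
The chip demonstration and the amp/preamp conjecture can both be sketched with a simple threshold model. This is only an illustration: it assumes a single "just-noticeable difference" (JND) on a one-dimensional scale, made-up values for the chips and components, and (for the audio case) that the contributions of amp and preamp simply add, which is exactly the part that would need testing.

```python
JND = 1.0  # assumed just-noticeable difference, in arbitrary perceptual units

def distinguishable(a: float, b: float, jnd: float = JND) -> bool:
    """Two stimuli can be told apart only if they differ by at least one JND."""
    return abs(a - b) >= jnd

# Hypothetical hue values for the four green chips; each step is 0.4 JND.
chips = [0.0, 0.4, 0.8, 1.2]
adjacent_pairs = [distinguishable(chips[i], chips[i + 1]) for i in range(3)]
first_vs_fourth = distinguishable(chips[0], chips[3])

# Hypothetical offsets that two amps and two preamps add to the sound,
# assumed (for illustration only) to combine additively.
amps = {"amp1": 0.0, "amp2": 0.6}
preamps = {"pre1": 0.0, "pre2": 0.6}
amps_alone = distinguishable(amps["amp1"], amps["amp2"])
preamps_alone = distinguishable(preamps["pre1"], preamps["pre2"])
combos = distinguishable(amps["amp1"] + preamps["pre1"],
                         amps["amp2"] + preamps["pre2"])

print(adjacent_pairs, first_vs_fourth)    # neighbours blend, endpoints do not
print(amps_alone, preamps_alone, combos)  # single swaps hidden, double swap heard
```

In this toy model no adjacent pair of chips is distinguishable, yet the first and fourth are; likewise neither the amp swap nor the preamp swap is detectable alone, but swapping both is.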

The same goes for wire and cable. It may be difficult to discern the result of swapping out one standard power cord or set of ic's or speaker cables. But replace all of them together and then test the completely upgraded set against the stock setup and see what you've got. At least, I'd love to see double-blind testing that is holistic like this. I'd take the results very seriously.

From the holistic tests, you can work backward to see what is contributing to good sound, just as you can eventually align all color chips in the proper order, if presented with the whole lot of them. But what needs to be compared in the first place are large chunks of the system. Even if amp/pre-amp combos couldn't be distinguished, perhaps amp/pre-amp combos with different cabling could be (even though none of the three elements used distinguishable products!). I want to see this done. Double blind.

In short: unnoticeable differences add up to *very* noticeable differences. Why this non-additive nature of comparison isn't at the forefront of the subjectivist/objectivist debate is a complete mystery to me.

-Troy
My point was not to call into question the efficacy of blind testing. I am quite in favor of it. Even when only one element of a system is varied, the results are interesting, and valuable. For instance, if I can pairwise distinguish speakers (blindly) of $1K and $2K, but not be able to distinguish similarly priced amps, or powercords, or what have you, then my money is best spent on speakers. Likewise, if preamps are more easily distinguishable than amps, I'll put my money there. A site that's interesting in this regard is:

http://www.provide.net/~djcarlst/abx_data.htm

I never said DBT is ineffective. It's just that *most* testing ignores the phenomenon that I cited: sameness of sound is intransitive, i.e., a=b, b=c, but not a=c. If the question is whether a certain component contributes to the optimal audio system, this phenomenon can't be ignored.

Of course scientists studying psychoacoustics are already aware of the phenomenon. I don't think I'm making a contribution to the science here. But the test you cite above is an exception, and for the most part, A/B comparisons are done while swapping single components, not large parts of the system. This is fine, when you *do* discover differences. Because then you know they're significant. But when you don't find differences, it's indeterminate whether there are no differences to be found OR the differences won't show up until other similar adjustments are made elsewhere in the system.

But I am *very much* in favor of blind testing, even in the pair-wise fashion. For instance, I want to know what the minimum amount of money is that I could spend to match the performance of a $20K amp in DBT. Getting *that* close to a 20K amp would be good enough for me, even if the differences between my amp and it will show up with, say, simultaneously swapping a $1K preamp with a $20K preamp. So where's that point of auditorily near-enough for amps?

I've also learned from DBT where I want to spend my extremely limited cash: speakers first, then room treatment, then source/preamp, then amp, then ic's and such. I'll invest in things that make pair-wise (blind) audible differences over (blind) inaudible differences any day.

Still, for other people here, who are after the very best in sound, only holistic testing matters. Their question (not mine) is whether quality cabling makes any auditory difference at all, in the very best of systems. Same for amps.

Take a system like Albert Porter's. Blindfold Mr. Porter. If you could swap out all the Purist in his system and put in Radio Shack, and *also* replace his amps with the cheapest amps that have roughly similar specs, without his being able to tell, that would be very surprising. But I haven't seen tests like that... the one you mention above excepted.
the golden-eared: an anecdote

i am a glenn gould fan. according to his biographers, gould could reliably distinguish between playback devices (blind) in the studio, which were indistinguishable to everyone else involved in the studio. gould was special in many ways. it wouldn't surprise me if the anecdote were true.

however, i'm not glenn gould. i'll spend my money on components that are distinguishable by ordinary folks like me.
One more question for Pabelson:

Since you've obviously read a lot more DBT stuff than I have, I'm interested to know: what's your system? (Or, what components do you think match up well against really really expensive ones?)
Pabelson:

I think we haven't nearly exhausted all of the non-acoustic mechanisms in play, but the ones you mention are certainly among them, and probably more relevant than the ones I mentioned. My general point was that the little bit of psychology I have studied makes me awfully wary of the "objectivity," or context-independence, of my own perceptual judgments of quality.

It's good to hear you still take a lot of joy in the audio hobby. It remains unknown whether you can take *as much* joy as you would if you weren't such a skeptic!
To the doubters of DBT:

Women are fairly recent additions to professional orchestras. For years and years, professional musicians insisted they could hear the difference between male and female performers, and that males sounded better. Women were banished to the audience. The practice ended only after blind listening tests showed that no one could discern the sex of a performer.

Surely, these studies had as many flaws as blind cable comparisons. Probably more, since they involved live performances by individual people, which are inevitably idiosyncratic.

Would the DBT doubters here have been lobbying to keep women out of orchestras even after the tests? Or would they, unlike the professional musicians of the day, never have heard the difference in the first place?
Study proposal:

I don't know if any studies of the following kind have been done. But if not, then one should be done.

Materials: two sets of cheap cables -- cosmetically different, and a set of expensive cables that look just like the cheap ones.

First experiment(s): subjects are introduced to the two sets of cheap cables and told the one is a very expensive $15K cable, the other a $15 cable. Descriptions of each cable, in lavish audiophile prose, are printed on glossy tri-fold with nice pictures, and given to the subjects. The "expensive" cable is praised to the heavens and the "cheap" cable is described modestly.

Then the cables are used (not blind) alternately, to play back a variety of music. Subjects are then asked to rate their listening experiences, both quantitatively, and also qualitatively.

To eliminate the worry about cosmetic differences in the cheap cables making a difference, you could do the test twice, once with cable A being the "cheap" one, and once with cable B being the "cheap" one.

Second experiment(s): do the first experiment but with one expensive cable and one cheap cable that look the same. Do it first by telling the truth about the cables, but then, in the second case, by telling the subjects that the expensive cable is cheap and the cheap cable is expensive.

Here, nothing is blind. Subjects are all looking at the equipment, and can even observe, from a little distance, the cables being hooked up. But if the DBT guys are right, and it's all hype, we should expect in the first experiment that the introductions to the cables will lead subjects to favor whatever happens to be described as the more expensive cable, both quantitatively and in their qualitative descriptions, even though the cables are basically identical cheap cables. In the second experiment, we should expect that when subjects are told the true values of the cables, their judgments favor the more expensive one, but also that, when lied to, they prefer the cheaper cable *just as much* as they preferred the expensive one.

If DBT proponents are wrong, you should expect that subjects will rate the cheap (identical) cables about the same, and that in the second experiment, they will vastly prefer the expensive cable when truthfully described, and when lied to, either still prefer the expensive cable (contrary to what they're being told) or prefer the cheap one, but only by a little.
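
The design and the two sets of predictions above can be laid out as a small table in code. This is only a sketch of the proposal as described: the cable names (A, B, X, Y), the hypothesis labels, and the function are made up for illustration, and the "audible" prediction is simplified to "the truly expensive cable wins", although the post also allows a slight preference for the mislabeled cheap cable in the deception condition.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Condition:
    experiment: int                  # 1 = two cheap cables, 2 = one cheap, one expensive
    labeled_expensive: str           # cable the subjects are TOLD is expensive
    truly_expensive: Optional[str]   # cable that really is expensive (None in experiment 1)

CONDITIONS = [
    Condition(1, "A", None),  # cheap cable A presented as the $15K cable
    Condition(1, "B", None),  # roles swapped to cancel out cosmetic differences
    Condition(2, "X", "X"),   # truthful labels
    Condition(2, "Y", "X"),   # labels swapped: subjects are lied to
]

def predicted_preference(cond: Condition, hypothesis: str) -> Optional[str]:
    """Return the cable each hypothesis predicts subjects will favor.

    "expectation": preference follows the price label (the DBT proponents' view).
    "audible":     preference follows the real cable (the DBT doubters' view);
                   None means no consistent preference is predicted.
    """
    if hypothesis == "expectation":
        return cond.labeled_expensive
    if hypothesis == "audible":
        return cond.truly_expensive
    raise ValueError(f"unknown hypothesis: {hypothesis}")

expectation_predictions = [predicted_preference(c, "expectation") for c in CONDITIONS]
audible_predictions = [predicted_preference(c, "audible") for c in CONDITIONS]
```

The two hypotheses agree only in the truthful condition of the second experiment, which is why the deception conditions carry the weight of the test.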

The point is, we don't need to have people "blind" to do the tests.

And if the cables were manufactured especially for this purpose, you could do the testing through the mail, with in-home trials over a long period of time. Wonder what the results would be?
Btw, Tbg, like your former self, I am a modestly paid Assistant Prof. who would dearly love to subsist on cheap electronics. Right on the mark!

What do you think of my suggested experiment? Not blind, but with deception alternated with truth-telling about the values of the cables played?

And Pabelson: do you know if an experiment like this has been performed? I have grad student friends in psych who could do it pretty easily. But there's no point if it's already been done.
WARNING: LONG POST -- LIFE HISTORY AND ITS ILLUSTRATION OF BIASES -- YOU MAY WANT TO SKIP

I went through grad school with a $150 boombox. As a classical music lover, I obviously wasn't happy with it, but what was I going to do? Sell my '87 Buick and walk? Audio was not *that* important to me. So it wasn't until I got a job that I decided to invest a little something in a decent 'stereo'. Still, I was married and my wife was in law school, racking up debts. Not knowing anything about hifi, I decided to get a simple home theatre setup. I went to Best Buy, dropped three hundred on a Yamaha receiver, and another couple hundred on a 5.1 speaker set-up. ($100 off b/c I bought the two together.) Some cheap cables, and I was headed home to set up my new rig. Hooray! And man, this thing came with a sub!

You know what happened, of course. The system actually did pretty well with movies. I don't care all that much about HT being perfect. Eminem's 'Eight Mile' was the first movie I watched with the new setup, and it ROCKED! Highs were crystal clear. Very sharp. And the bass, or actually, it was mid-bass, b/c that sub doesn't go too low, was nice and full in my apartment. Gladiator was great too. Cool.

Then I popped in my cds. I wanted *so* badly to like what I heard. After all, my wife was already pissed that I had spent $500. "$500? And you don't like it? What's wrong with you? If you're going to be so picky, you should have gone to law school rather than taking forever to write a dissertation." (I still wasn't done at the time.) She didn't even know about the extra 100 I had spent on a Monster surge protector and cables.

But it sounded terrible. Mid-range just sucked. There's no other way to say it. And treble portions were highly highly annoying.

I stuck with the system for the next year or so. After a separation from my wife, I did what any lover of sound would do, and finally allowed myself in the local hifi shop. (It was only two blocks from my apartment.) I walked in with the idea of purchasing new monitors for the fronts and leaving the rest of the 5.1 system in place. After I explained my situation, the staff (quite helpful, really) suggested Paradigm monitor bookshelves. They were a few hundred bucks and sounded great in the store. There it was -- lifelike voices, not the tinny, metallic sounds I heard at home. Ahhhh!!!

Before I left with the Monitors, one of the sales guys said he had a pair of Studio v.3's I should listen to before making a purchase. Well... I listened, and *wow*. Incredibly accurate sound. It was nothing you could hear with any combination of Best Buy equipment. I bought the speakers on a pretty hefty discount, with stands, and charged home to listen.

The improvement *was* dramatic, don't get me wrong, but still not anything like I heard in the store. Hmmm... Could it be the other stuff in my system? Nah... Cd players are all the same. And amps too. The Yamaha was rated *way* above the requirements for my new Paradigms. And so what if my source was an old dual VHS/dvd player? Bits is bits, right? So it must be my room.

I spent another several months trying to like the sound. Very quickly, I discovered that speaker positioning mattered, and room treatment too. I made a lot of adjustments, but my sound was never *smooth*, as it was in the store. Hmmm...

About this time, I started researching audio. I was relieved to find out people liked my Paradigms for a "budget" speaker. "Budget? Seriously?" I thought. But everyone seemed to think that source and amplification were also important. And there was this thing called a "preamp".

I went back to the audio store and tried some better receivers -- Pioneers with room eq -- but the sound still wasn't to my liking. Sure it was loud, dynamic, and even full. But it left me cold. I pointed to some shiny gear across the room. "What about that?"

"Oh, you don't want that. It's just two-channel. You want home theatre, right?"

"Well yeah, but first and foremost, I want something that sounds good."

So he played me a Musical Fidelity integrated and cd player (around $1,500 each), with my Paradigms. Unbelievable. I just sat and listened for about two hours, entranced, letting the music work its magic on me.

I couldn't afford the MF, but there was a demo Classe, which sounded very similar in the store, only for significantly less cash. I brought that home and auditioned it. Definite improvement on the Yamaha, or so I thought.

I bought it and sold the sub + sats on Ebay. Now I was *there*, right? No. I still got annoyed. But closer. Definitely closer.

Anyway, about this time, I discovered Audiogon. I also started talking to my brother-in-law, who had tried dozens and dozens of combinations of amps and speakers to get vocal music right. I realized I was only at the beginning. I was just getting started on the audio path. Damn. I thought I could just walk into Best Buy, walk out, and be done with it. I had no idea this would be a hobby, and a long-lasting costly hobby at that.

Anyway, I still have the Classe. And now I wonder whether it actually sounds any better than my old Yamaha. Even if it doesn't, objectively speaking, I think it does, subjectively. Because it's a really pretty amp. It has this super-heavy milled steel remote and the display, volume knob, and everything, just ooze quality. (Ok, the outputs don't. They seem cheap.) I can't help but look at my setup when I listen, and I much prefer looking at the Classe.

Maria Callas had a magnificent voice, but she was also hot, and I'm sure that added to the experience of opera-goers of the time. Speaking for myself, I prefer a grotesquely fat and ugly soprano who sounds good to a waifish beauty who sounds strained, BUT, other things being equal, a beautiful soprano actually *sounds* better in the typical soprano role. I once saw Angela Gheorghiu in the role of Micaëla in Carmen at the Met. Gorgeous coloratura soprano, but also, she was beautiful, at least from the cheap seats where I sit. Took the breath out of my chest. I bet Gheorghiu wouldn't prove that much better than her fatter and uglier peers in blind comparison. But at the opera, you ain't blindfolded.

Maybe what happened when I looked across that showroom and spotted the shiny MF gear was just love. Just as hunger is the best sauce, love makes things sound better. A *lot* better.
Wattsboss:

I *do* like your analogy of the beauty that grows on you. I've had that experience, as well as its opposite -- the superficial beauty that fades quickly (or immediately upon conquest). True. Typically, it's because facial expressions take on a representational character; they come to stand for the moods and traits of the person. And in the case of a good-to-the-core person, that goodness starts to shine through. In the hot-bitchy type I usually go for, the nastiness gets associated with what I previously thought was cute.

Anyway, it may be that the beauty of an audio system takes time to appreciate fully. But distinguishing between looks doesn't take time, even if the full evaluation of those looks does. Maybe the analogy here is identical twins who no one can tell apart initially, but whose family and close friends can... immediately.

After all, I'm not sure I could distinguish the sounds of two violins immediately, in the hands of a skilled violinist. Each violin makes a wide range of sounds, and I'm not sure what's due to the violin and what's due to the violinist. Yet one violin might be $1K and the other $10K, because violinists themselves can immediately hear the difference. Maybe it's like this with audio. But I have no reason to think so, given the studies I've read, in which audiophiles who are familiar with the equipment do no better than non-audiophiles (who are also familiar with the equipment).

Also: there is a long-term in-home disguised cable experiment going on right now. It has a few more months. We'll see how that goes.
The standoff between Pabelson and Tbg reminds me of the stalemate between the external-world skeptic and the dogmatist.

Skeptic: You don't know that you're not a brain in a vat of nutrients, being stimulated by a computer simulation, carefully monitored by a team of scientists, to think you're in a real, concrete world... the world you *think* you're in. Since you don't know you're not a brain-in-a-vat, you don't know anything mundane about the external world, e.g., that you have two hands.

Dogmatist: I know I have two hands! If I know I have two hands, then I know I am not a handless brain-in-a-vat. Therefore, I know I am not a handless brain-in-a-vat.

One man's modus ponens is another man's modus tollens, as the saying goes.

(For non-logicians, modus ponens is: If P then Q. P. Therefore Q. Modus tollens is If P then Q. Not-Q. Therefore, not-P.)
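
For anyone who wants to check that both forms really are valid, a brute-force truth table does it. This is a generic sketch, not anything specific to the audio dispute; the helper names are made up, and a third, invalid form (affirming the consequent) is included only for contrast.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    """Material conditional: 'if P then Q' is false only when P is true and Q is false."""
    return (not p) or q

def valid(premises, conclusion) -> bool:
    """A form is valid if the conclusion holds in every row where all premises hold."""
    return all(conclusion(p, q)
               for p, q in product([True, False], repeat=2)
               if all(premise(p, q) for premise in premises))

# If P then Q. P. Therefore Q.
modus_ponens = valid([implies, lambda p, q: p], lambda p, q: q)

# If P then Q. Not-Q. Therefore not-P.
modus_tollens = valid([implies, lambda p, q: not q], lambda p, q: not p)

# If P then Q. Q. Therefore P. (invalid, shown for contrast)
affirming_consequent = valid([implies, lambda p, q: q], lambda p, q: p)

print(modus_ponens, modus_tollens, affirming_consequent)
```

Both argument forms pass, which is the point of the saying: the logic is equally good in either direction, so the disagreement has to be about which premise to give up.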

Pabelson: DBT shows no audible difference between cables, therefore there is no audible difference.

Tbg: There is an audible difference between cables, therefore, DBT is flawed.

Logic alone (formal logic) cannot settle the dispute, any more than logic can settle the skeptic/dogmatist dispute.

But in this case, it's odd to think of Tbg's favored cables being a/b'ed with cheapos, without his being able to tell the difference, and then, only when told the true identity of the cables, his insistence that there *is* a perceivable difference. Very odd.

Here's a question for the doubters of DBT-ing. Given that there are perceptual biases at work (expectation, confirmation, endowment effect, etc.) how would one test for such biases? That is, what *would* count as two components sounding the same?

Suppose you have two amps that are identical except one of them has a beetle put inside and the beetle runs around, I don't know, defecating in there. And then reviewers praise the beetle effect: "Widened the soundstage by meters! You don't need golden ears to hear this one!" How would you go about evaluating the beetle effect?
Pabelson: we've said the same thing... what *would* count as a test of audible difference if not dbt? How would we ever know a beetle in the box wouldn't make an audible difference? Or what if we were simply to change the faceplate on an amp. Nothing more. Would that change the audible sound? How would you know? What if the reviewers rave?

Shadorne: I said much the same earlier in this thread. Why anyone, whether or not they think DBT is the *final* word, would ignore DBT as a way of determining where to spend their own money (speakers, room treatment first, then other stuff) is beyond me. In other words, I really don't understand someone who would spend more on power cords, conditioners, and interconnects than speakers, given the DBT results. And there are plenty such people!
Shadorne:

Another way of making your point is this. Even if ABX tests do not reveal all audible differences, somehow, they do reveal *degrees* of difference. Components that ABX as different, and clearly so, are different to a *greater* degree than components that are indistinguishable under ABX conditions. Therefore, they are more deserving of audiophile evaluation. Likewise, ABX-distinguishable gear that is perceived as clearly better in DBT-ing, is more deserving of audiophile cash than gear that is not perceived as clearly better in DBT-ing.

This is independent of whether or not there is some perceivable difference between components that are ABX indistinguishable. (Although I still can't understand how that could be.)

Yet, ABX opponents seem to ignore this more modest lesson. They reject ABX as a way of ultimately distinguishing components, and therefore decide it is unworthy as a reviewer tool at all, even in deciding where to drop their cash. Why?
An explanation of why we pick up auditory differences closely spaced in time but not those spaced out over time:

The auditory system works like most of our perceptual systems, by detecting differences and similarities, rather than absolute values. What we detect, for the most part, are differences from a norm or differences within a scene itself (synchronically). The norm gets set contextually, by relevant background cues. This is more evolutionarily advantageous than detecting absolute qualities, because the range of difference we can represent is much smaller than the range of possible absolute value differences. By setting a base rate relevant to the situation and representing only sameness and difference from the base rate, one can represent differences across the whole spectrum of absolute values, without using the informational space to encode for each value separately.

For instance, we can detect light in incredibly small amounts -- only a few photons -- and also at the level of millions of photons striking the retina, but we can't come close to representing that kind of variation in absolute terms. We don't have enough hardware. What does our visual system do? Well, the retina fires at a base rate, which adjusts to the prevailing lighting condition. Below that is seen as darker, above that is seen as lighter. A great heuristic.
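
The base-rate idea can be illustrated with a toy model. This is not a physiological model of the retina or the ear: it simply assumes the "response" is the input minus a running baseline, and that the baseline drifts toward the current input at an arbitrary rate. The signal values and the rate are made up.

```python
def adapted_responses(signal, rate=0.5):
    """Report each input as a deviation from a baseline that adapts to recent input."""
    baseline = signal[0]
    responses = []
    for x in signal:
        responses.append(x - baseline)     # what gets "reported": difference from the norm
        baseline += rate * (x - baseline)  # the norm then drifts toward the current level
    return responses

abrupt = [10.0] * 5 + [12.0] * 5               # level jumps by 2 units in a single step
gradual = [10.0 + 0.2 * i for i in range(11)]  # same 2-unit change spread over ten steps

peak_abrupt = max(abs(r) for r in adapted_responses(abrupt))
peak_gradual = max(abs(r) for r in adapted_responses(gradual))

print(peak_abrupt, peak_gradual)
```

In this sketch the abrupt change produces a large deviation from the baseline, while the same total change introduced gradually never produces more than a small one, because the baseline keeps catching up. That is the sense in which differences closely spaced in time stand out and differences spread over time do not.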

As it gets completely dark, you don't see black, but what is called "brain grey", because there is no variation at all from the background norm. You see almost the same color in full lighting when covering both eyes with ping pong balls, to diffuse the light into a uniform field. With no differences detected, the field goes to brain grey.

Ask yourself why the television screen looks grey when it's not on, but black when you're watching a wide-screen movie. Black is a contrast color and true black only exists in the presence of contrast. Same for brown and olive and rust.

Same for happiness, actually. The psych/econ literature on happiness shows that most traumatic or sought after events are mere blips on the happiness meter, as we simply shift base rates in response, adjusting to the new conditions. Happiness is primarily a measure of immediate changes, bumps above base rate. So minor things, like good weather and people saying a friendly hello, are more tightly correlated with happiness than major conditions like having the job or the car you've been wanting.

Think about pitch. We can tell whether pitch is moving, but only the lucky few have any sense of absolute pitch... and this is usually a skill developed with a lot of feedback and practice. Why? Because it's more useful and economical to encode changes in pitch than absolute pitch values.

Far from cleansing the auditory taste of one note from one's mind and then playing another, you need to play them immediately back to back for comparison purposes. Perhaps you can switch the order around to eliminate after-effects.

By the way... wine-lovers *do* take blind taste tests. And experts can readily identify ingredients in wine, as well as many other objectively verifiable qualities. So it is perhaps not the best analogy for audiophiles who cannot do the same, and won't deign to try.

Several people here seem to mistake the purpose of DBT. The purpose is not necessarily finding the "best" component, although that may be the case, for instance, in Harman's speaker testing. The point is often simply to see if there is any audible difference whatsoever between components. As Pabelson noted way, way back in this thread, if two systems differ with respect to *any* fancy audiophile qualities (presentation, color, soundstage, etc.) then they will be distinguishable. And if they are distinguishable, that will show up in DBT. Ergo, if two systems are NOT distinguishable with DBT, they do not differ with respect to any fancy audiophilic qualities. (That's modus tollens.)
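Spelled out step by step, with letters introduced here only as shorthand: let $Q$ be "the two systems differ in some audiophile quality", $P$ be "the two systems are distinguishable by ear", and $T$ be "the difference shows up in DBT".

```latex
\begin{align*}
&(1)\quad Q \rightarrow P        && \text{premise: any difference in quality is audible} \\
&(2)\quad P \rightarrow T        && \text{premise: any audible difference shows up in DBT} \\
&(3)\quad Q \rightarrow T        && \text{from (1) and (2), chaining the conditionals} \\
&(4)\quad \neg T                 && \text{observed: no difference shows up in DBT} \\
&(5)\quad \therefore\ \neg Q     && \text{from (3) and (4), modus tollens}
\end{align*}
```

The form is valid, so anyone who rejects (5) has to reject premise (1) or premise (2).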

So, if two amps cannot be distinguished unless you're looking at the faceplates, why buy the more expensive one? Now who finds fault with that reasoning?

It's not a matter of "I like one kind of sound, that other guy likes another kind of sound, so to each his own." If no one can distinguish two components, then our particular tastes in sound are irrelevant. There's just no difference to be had.

Tgb:

All of us here are interested in one thing: the truth. If DBT is a fundamentally flawed methodology, its results are no guide to the truth about what sounds good. So if the studies are all flawed, and there are audible differences between amplifiers with virtually the same specs, even if, somehow, no one can detect those differences without looking at the amps, then I'm with you. Likewise, if there isn't anything fundamentally wrong with the studies, and they strongly indicate that certain components are audibly indistinguishable, then you should be with me.

Your own perceptions -- "I can hear a difference and my tastes are all that matters" -- should not trump science any more than your own experiences in general should trump science. I remember seeing ads with athletes saying "Smoking helps me catch my wind." I also recall people saying how smoking kept them healthy and helped them live long. Their personal experiences with smoking did not trump the scientific evidence, though. Trusting such experiences over the evidence is just superstition. The Pennsylvania Dutch used to think that if you didn't eat doughnuts on Fastnacht Day, you'd have a poor crop. Someone had that experience, no doubt. But it was just an accident. Science is supposed to sort accident from true lawful generalization. It's supposed to eliminate bias, as far as possible, in our individual judgments and take us beyond the realm of the anecdote.

Now, if your perception of one component bettering another is blind, then ok. But if you're looking at the amp, then, given what we know about perception, your judgments aren't worth a whole lot.

So... are the studies all flawed? Well, certainly some of the studies are flawed. But, as Pabelson said, the studies all point to the same conclusions. And there are lots of studies, all flawed in different ways. Accident? Probably not.

Compare climate science. Lots of models of global temperatures over the next hundred years and they differ by a wide margin from each other (10 degrees). They're all flawed models. But they all agree there's warming. To say that the models are flawed isn't enough to dismiss the science as a whole. Same in psychoacoustics.

Long story short: there's no substitute for wading through all of the studies. I haven't done this, but I've read several, and I didn't see how the minor flaws in methodology could account for no one's being able to distinguish cables, for instance.