In defense of ABX testing


We audiophiles need to get ourselves out of the Stone Age, reject mythology, and say goodbye to superstition. That goes especially for the reviewers, who do us a disservice by endlessly writing articles claiming the latest tweak or gadget revolutionized the sound of their system. Likewise, any reviewer who claims that ABX testing is not applicable to high end audio needs to find a new career path. As with anything, there is a right way to do it and many wrong ways. Hail Science!

Here's an interesting thread on the hydrogenaudio website:

http://www.hydrogenaud.io/forums/index.php?showtopic=108062

This caught my eye in particular:

"The problem with sighted evaluations is very visible in consumer high end audio, where all sorts of very poorly trained listeners claim that they have heard differences that, in technical terms are impossibly small or non existent.

The corresponding problem is that blind tests deal with this problem of false positives very effectively, but can easily produce false negatives."
psag

To offset results that are not accurate, you can increase the number of tests, or tries, each subject takes. So, for example, if you were testing whether a difference can be heard between a silver cable and a copper cable, all other things being equal, maybe have them listen to 50 or 100 samples. Maybe they can get lucky and guess correctly for 5 or 10, but getting lucky across 100 is highly unlikely.
01-21-15: Zd542
50 to 100 samples, you say? Why not make it 100 to 200? How many years do you expect your ABX listening test experiment to take?

Just curious, have you ever A/B compared 2 or 3 cables to one another? More than 3 or 4 at a time? Could you hear audible differences between the cables?

50 to 100 samples.... Do you believe there are people in the world that can tell which key of a piano is struck on a tuned grand piano in a blind test? Do you think their brain learned the sound of each key in the span of a week or so, or even a few months or so? How about in a year?
.

This bleeding thing is a bit off topic, but since it keeps coming up, I thought I'd clarify that issue (because I have no good opinion on the ABX thing).

During ancient and medieval times doctors believed in "the humor theory". It's pretty complicated and a bit funny by modern standards, but the short explanation is that the blood carries liquids called "humors". A sick person has bad humors in their blood and you have to let it out. Thus "blood letting". A person in good humors is healthy.

This theory faded out over the 1700s and 1800s, and the new and wonderful "germ theory" of disease took its place in the later 1800s.

For medical testing, doctors would normally draw a sample of the blood and check the color and taste. Good humors taste good, I imagine. =-}

http://en.wikipedia.org/wiki/Humorism

I read a lot of ancient writings...
"50 to 100 samples, you say? Why not make it 100 to 200? How many years do you expect your ABX listening test experiment to take?

Just curious, have you ever A/B compared 2 or 3 cables to one another? More than 3 or 4 at a time? Could you hear audible differences between the cables?"

Yes, I have. I did an experiment a few years ago and compared a pair of AQ Cheetah ICs to a pair of AQ Panther ICs. Both cables are identical except for the conductors themselves. One silver, one copper. The goal was to see if a difference could be heard between the 2 metals, and nothing else. It wasn't about which one sounded better, just whether there was a difference. Four of us took the test, and each of us listened to 100 samples of a 10-second audio clip, which took around 30-40 minutes per person.
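
To put rough numbers on why the sample count matters, here is a minimal Python sketch of the odds of scoring well by pure guessing. It assumes every trial is an independent 50/50 guess. The function name `chance_of_guessing` and the 80% pass mark are assumptions for illustration, not figures from the test described above.

```python
from math import comb

def chance_of_guessing(trials, correct, p=0.5):
    """Probability of getting at least `correct` answers right out of
    `trials` when each answer is an independent guess with success rate p."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(correct, trials + 1)
    )

# Compare a short test with a long one at the same (hypothetical) 80% pass mark.
for trials, correct in [(10, 8), (100, 80)]:
    odds = chance_of_guessing(trials, correct)
    print(f"{trials} trials, {correct}+ correct by guessing: {odds:.2e}")
```

With 10 trials, a listener who hears nothing still gets 8 or more right about 5% of the time. With 100 trials, getting 80 or more right by luck is well under one in a million, which is why a long run is so much harder to pass by guessing.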

"50 to 100 samples.... Do you believe there are people in the world that can tell which key of a piano is struck on a tuned grand piano in a blind test? Do you think their brain learned the sound of each key in the span of a week or so, or even a few months or so? How about in a year? "

Actually yes, and I can prove it. My brother has something called perfect pitch. He can tell with 100% accuracy what any note or chord played on any instrument is, and whether it's in tune or not. I don't have it myself, but if you have ever played an instrument, you can develop something called relative pitch. It's not as good as perfect pitch, but it's a skill that can be learned. For me, I needed to develop the skill somewhat when I played drums in school. If you have ever seen kettle drums, or timpani, they have to be tuned to a certain note when you play them. That is what the foot pedal is for: it sets the tension on the drum head. Anyway, you have to be able to set the drums to different notes while the band is playing. To do this, you tap the drum very lightly (because the band is playing) and hopefully tune it to the right note before you need to play it. It's not an easy thing to do, but it can be learned.

Four of us took the test, and each of us listened to 100 samples of a 10-second audio clip, which took around 30-40 minutes per person.
01-22-15: Zd542
And the findings, results, of the listening test?

Just curious, what is the thinking behind needing so many samples for your listening test?
.
This to me is very indicative of the people in power, or the ones that are the "experts", wanting to stay that way. Remember the attitude in the sixties and early seventies regarding wines, and how the "experts" continuously stated that French wines were the best and everyone else's were not very good? It wasn't until the "Judgment of Paris" in 1976 that the world saw how dramatically opinions changed once the tasting was blind. There is absolutely no scientifically logical reason why blind testing isn't the best comparison method.

Of course it has to be an apples to apples comparison. To me, this means price point testing. Just like cars. Pick a price point, get the equipment that falls within that price range, and go at it. But tube lovers will pick tube equipment most of the time when they know ahead of time what they are hearing. The same is true for solid state lovers. But blind testing, within price points? Let's see what the experts say then. But the "experts" don't want to do that, because it would show people that many of them (absolutely not all of them) are frauds.

If tests are not done scientifically and are based only on "opinions", they really aren't real to me. How does one measure whether the equipment accurately reproduced the sound stage depth? Dimensionality? Etc. I hear many opinions from the reviewers, but based on what? What criteria? Are you going by memory in your opinions and comparisons? Or did you listen intently and then switch out that amp with another (without changing anything else) and listen again?

I have read some reviews that do exactly that, and the equipment they are reviewing is compared to similar equipment within the price point. That is all right with me. But I still prefer a blind A/B comparison test to really identify the sonic differences in an unbiased way.

enjoy