In defense of ABX testing


We audiophiles need to get ourselves out of the Stone Age, reject mythology, and say goodbye to superstition. Especially the reviewers, who do us a disservice by endlessly writing articles claiming the latest tweak or gadget revolutionized the sound of their system. Likewise, any reviewer who claims that ABX testing is not applicable to high-end audio needs to find a new career path. As with anything, there is a right way and many wrong ways. Hail Science!

Here's an interesting thread on the hydrogenaudio website:

http://www.hydrogenaud.io/forums/index.php?showtopic=108062

This caught my eye in particular:

"The problem with sighted evaluations is very visible in consumer high end audio, where all sorts of very poorly trained listeners claim that they have heard differences that, in technical terms are impossibly small or non existent.

The corresponding problem is that blind tests deal with this problem of false positives very effectively, but can easily produce false negatives."
psag
Jea48, I think if they did it the way you recommend, then we'd have nothing to argue about. :-)
One of the difficulties in behavioral research is precise definition of stimulus-response. The more complex the stimulus, the less precise the definition. In ABX testing, the definition of the cumulative response is trivial: the subject can reliably hear a difference or not. But because the stimulus is likely to be imprecisely defined, absence of reliably making the distinction does not mean there is no difference. For music, Gestalt seems all too relevant. That's one of the reasons we know so much about what a rat is likely to do in a maze and so little about what a kid is likely to do in a classroom.
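
To put the false-negative point in concrete terms, here is a minimal sketch in plain Python (the 60% hit rate and the trial counts are assumptions chosen for illustration, not data from any real test). It finds how many correct answers an ABX run of n trials needs before "just guessing" can be rejected at the usual 5% level, then asks how often a listener who genuinely hears a subtle difference would still fall short of that bar:

```python
from math import comb

def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more hits in n trials."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def critical_hits(n, alpha=0.05):
    """Fewest correct answers out of n that would reject 'just guessing' at level alpha."""
    for k in range(n + 1):
        if binom_sf(k, n, 0.5) <= alpha:
            return k
    return n + 1

# Assumption for illustration: a listener who truly hears the difference on 60% of trials.
TRUE_HIT_RATE = 0.6
for n in (16, 50, 100):
    k = critical_hits(n)
    miss = 1 - binom_sf(k, n, TRUE_HIT_RATE)  # probability the test misses this listener
    print(f"{n} trials: need {k}+ correct; chance of a false negative ~ {miss:.0%}")
```

Even at 100 trials, this hypothetical listener fails the significance bar more than a third of the time, which is exactly the false negative the Hydrogenaudio quote warns about.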

db
"LOL, but doctors in that time period of history didn't know any better."

That was my point. The only thing they knew for sure about blood was that if you lose too much of it, you die. Armed with only that info, I stand by my statement. lol. Maybe there was some logic to it, but I don't see it. I guess it's possible that every time a doctor saw a bleeding, injured person, he thought that was the body getting rid of some excess blood. Why blame it on the arrow stuck in the arm?

About the rest of your post, my comments on all this, in context, are in reference to the thread the OP mentioned on Hydrogen Audio. The complaint was that reviewers were listening to audio components and then basing their reviews on what they heard, without doing any type of scientific listening tests. They've been complaining about this very thing for years. My comment was that if these types of tests are so damn important, then just do them already. Even if it's just a few tests to show us all how to do it. Instead, it's just year after year of complaining, and they never do anything. That said, if they can come up with some kind of useful tool to help better evaluate audio equipment, I know I would be interested in seeing it. Why not? My personal opinion is that they don't have the guts to do anything they talk about. There's always the chance they would be wrong. They have too much invested in the argument.

"In ABX testing, the definition of the cumulative response is trivial: can reliable hear a difference or not. But because the stimulus is likely to be imprecisely defined, absence of reliably making the distinction does not mean there is no difference."

I agree with that. My view is that you would have to tell the test subjects what they're listening for. There's really no way around it; they have to know. To offset results that are not accurate, you can increase the number of trials each subject takes. So, for example, if you were trying to test whether a difference can be heard between a silver cable and a copper cable, all other things being equal, maybe have them listen to 50 or 100 samples. They might get lucky and guess correctly over 5 or 10 trials, but doing so over 100 is highly unlikely. Not only that, under the same scenario, you can tell them exactly what they should be listening for. If there is really no difference to be heard, then over enough trials the test subjects' results will have to trend toward a 50/50 split. It won't matter what they think, or are sure, they can hear.
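
To put rough numbers on the "getting lucky" point, here is a short Python sketch; the pass marks of 8/10, 40/50, and 80/100 are arbitrary choices for illustration. It computes the probability of reaching each mark by pure coin-flip guessing:

```python
from math import comb

def p_by_luck(k, n):
    """Probability of k or more correct out of n ABX trials by pure coin-flip guessing."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

print(f"{p_by_luck(8, 10):.2%}")     # 8 of 10 by luck: about 5.5%
print(f"{p_by_luck(40, 50):.4%}")    # 40 of 50 by luck: about 0.002%
print(f"{p_by_luck(80, 100):.2e}")   # 80 of 100 by luck: on the order of 1e-9
```

This is the 50/50 trend in action: a guesser can survive a short test, but the odds collapse as trials accumulate.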

01-21-15: Zd542
"To offset results that are not accurate, you can increase the number of trials each subject takes. So, for example, if you were trying to test whether a difference can be heard between a silver cable and a copper cable, all other things being equal, maybe have them listen to 50 or 100 samples. They might get lucky and guess correctly over 5 or 10 trials, but doing so over 100 is highly unlikely."
50 to 100 samples, you say? Why not make it 100 to 200? How many years do you expect your ABX listening test experiment to take?

Just curious, have you ever A/B compared 2 or 3 cables to one another? More than 3 or 4 at a time? Could you hear audible differences between the cables?

50 to 100 samples.... Do you believe there are people in the world who can tell which key of a piano is struck on a tuned grand piano in a blind test? Do you think their brain learned the sound of each key in the span of a week or so, or even a few months? How about in a year?