How can a 40 watt amp outshine a 140 watt amp


My question is this: I see $6,000 integrated amplifiers rated at 40 watts per channel. How is that better than my Pioneer Elite SC-35 at 140 watts per channel? What am I going to hear that's different with, let's say, a Manley Labs Stingray II? I obviously don't understand the basics involved, and if someone could explain or point me in the right direction, I would greatly appreciate it.

I would like to set up a nice two-channel analog system. I really can't afford the aforementioned Stingray, so what is "out there" in the 2.5 grand range?
mystertee
So, I've resisted responding here for quite a while, but temptation has overcome me. My first thought is that the challenge is pretty silly, in that it relies on crippling the better amp, or jacking around with the lesser amp, to make the two sounds as hard to tell apart as possible. What's left is exactly what most listeners don't care about, which is whether the sound fits some arbitrary standard, rather than whether it sounds lifelike.

Several comments above have made similar points, so I'll add another observation that is a little more technical. The test requires 24 judgements to be correct. If we are willing to assume that each judgement is statistically independent (arguable, but not terribly germane), then the probability of passing the test if you can detect exactly no difference between the amps is .5 raised to the 24th power, or roughly .00000006, a pretty stringent test.

That is, if the probability of choosing the better amp is exactly .5 (we are just flipping a coin to make our choice), then the probability of passing the test (by chance) is less than .0000001. Let's call the probability of detecting a difference on any given trial "p", and the probability of passing the test "P". In our example, p=.5 and P<.0000001, if there is exactly no difference between the amps.

Now, suppose there is a small, but hard to detect, difference between the amps. Since we have to introduce a variable source signal (music), we cannot just compare one sine wave signal to another, and we are unable to compare the amps to each other with 100% accuracy. The music thus introduces uncertainty into the comparison. If this uncertainty is large, or the difference between the amps is small, p will be near .5 (might as well flip a coin). If the uncertainty is small, and the difference between the two amps is large, then p will approach 1.0.

Note that the challenge in effect requires p to equal 1.0, since every trial has to be correct. In other words, the challenge is based on the assumption that ANY difference in amps should make it possible to detect a difference in EVERY case. Looking at it from this point of view, the fact that the challenge has never been overcome is just a statistical artifact of the design of the challenge. For those who have had a stat course, the design has almost no statistical power when the signals from the amps are pretty close, or the uncertainty introduced into the signal by the music is large. The design is strongly (!) biased in favor of the null hypothesis of no difference.

Suppose we allow there to be some difference between the amps, but not enough to be detected every time, say, a p value of .6, meaning that we can only detect the difference about 60% of the time. Now, P (the probability of winning the challenge) is .6 raised to the 24th power, which is less than .00001, still very unlikely. But notice that there is a real difference between the amps. It's obscured by our jacking around with the signals, our confusion induced by the variability of the music, and the fact that we require perfect performance on each trial, but the difference between the amps is still very real.

What situation would lead us to be able to pass the test more often than not? For P to exceed .5, we would have to be able to detect the difference on any given trial more than 97% of the time (p greater than .97), an extraordinary level of performance for an ambiguous stimulus.
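
For anyone who wants to check the arithmetic, here is a minimal Python sketch. It assumes, as above, 24 independent trials that must all be correct, so P is simply p raised to the 24th power. The function names are made up for illustration and are not part of the challenge itself.

```python
TRIALS = 24  # number of judgements that must all be correct, per the post above


def pass_probability(p: float, trials: int = TRIALS) -> float:
    """P: chance of getting every trial right, assuming independent trials."""
    return p ** trials


def per_trial_rate_needed(target: float, trials: int = TRIALS) -> float:
    """Per-trial rate p needed for the pass probability P to reach `target`."""
    return target ** (1.0 / trials)


if __name__ == "__main__":
    # Per-trial detection rates to try; the .9 and .99 rows are extra illustrations.
    for p in (0.5, 0.6, 0.9, 0.97, 0.99):
        print(f"p = {p:.2f}  ->  P = {pass_probability(p):.8f}")

    # Per-trial rate required to pass the whole test more often than not.
    print(f"p needed for P = .5: {per_trial_rate_needed(0.5):.4f}")
```

Running it prints P for several per-trial rates, which shows how quickly the all-or-nothing requirement drives P toward zero unless p is very close to 1.0.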

The bottom line is that the challenge is primarily a statistical artifact based on the fallacy of accepting the null hypothesis. We cannot conclude that there is exactly no difference between the amps, because we can never prove that p is exactly .5. All we have proven is that we can set up an experiment with enough ambiguity, and so little statistical power, that the result is a foregone conclusion. The prize money is safe for quite some time.
Thanks, Bryon. An intellectually honest way of setting up the experiment would have been to test whether a listener ever gets it right more often than chance. A totally different analysis, and a money-losing proposition for him, I suspect.
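
A rough sketch of what that "better than chance" analysis could look like, in Python. This is only an illustration under stated assumptions: a one-sided binomial test over the same 24 independent trials, with a 5% significance level that is my choice and not anything specified in the challenge. The function names are hypothetical.

```python
from math import comb


def tail_probability(n: int, k: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more correct answers."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def critical_count(n: int, alpha: float = 0.05) -> int:
    """Smallest score that pure guessing (p = .5) reaches with probability <= alpha."""
    for k in range(n + 1):
        if tail_probability(n, k, 0.5) <= alpha:
            return k
    return n + 1  # no score is rare enough under guessing at this alpha


if __name__ == "__main__":
    n = 24
    k = critical_count(n)
    print(f"Scores of {k} or more out of {n} beat chance at the 5% level")
    # Power: how often a listener with per-trial rate p would reach that score,
    # compared with how often they would get all 24 right.
    for p in (0.6, 0.7, 0.8, 0.9):
        print(f"p = {p:.1f}: reaches {k}+ with prob {tail_probability(n, k, p):.3f}, "
              f"perfect score with prob {tail_probability(n, n, p):.6f}")
```

Under these assumptions the cutoff comes out to 17 or more correct out of 24, well short of the perfect score the challenge demands.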
My objection is simple, and to my thinking it is the 'trick' behind being unable to distinguish between 2 amps: the equaliser. As soon as that is inserted in line, it is MODIFYING the sound of the amplifier it's in line with to be identical to the other one, as far as electronic testing is concerned. He cheats by removing that difference.