Are our 'test' records adequate?


Most of us have some favourite records with which to check the health of our systems, or to assess a new component within our systems.
These records are often carried with us whenever we wish to assess a completely foreign system in a different environment. I have my favourite ‘test’ records, some of which I continue to use even after 30 years. I know them (or parts of them) so intimately that I feel confident in my ability to assess a component or complete system after just one listen.
I know other audiophiles who have specialised their ‘test’ records to such an extent that they have different discs to evaluate for Voice, Bass, Large Orchestral, Chamber, Piano, Strings, Drums, Jazz, Rock.
Almost invariably, these vinyl discs are superbly recorded and sound stunning, not just on very fine systems, but also on average systems.
Of course, because each of us knows his own discs so intimately, it is possible to assess the 'omissions' in a foreign system by memory, often to the puzzlement of those to whom the discs are not so well known and to whom the sound had been thoroughly satisfying and impressive.
But I have begun to wonder recently whether this is in fact the most reliable method of evaluating components and systems.
I am sure most of us have heard records on our systems which are almost unlistenable or certainly unpleasant, and we have simply placed these discs on the 'never to be played' shelf of our storage unit.
But perhaps some of these records might be more revealing than our fabulously recorded 'test' material?
For some time I have been disturbed by two records in my collection which, despite their fame, have sounded poor (in various parts) even after improvements to my turntable, speakers, amplifiers and cartridges.

Harvest by Neil Young on Reprise (7599-27239-1) has some nicely recorded tracks (Out On The Weekend, Harvest, Heart Of Gold) as well as two tracks (Alabama, Words) which have confounded me with their leanness, lack of real bass, vocal distortion and complete lack of depth. The album was recorded at four different venues with three different Producers and those two tracks share the same Producers and venues.
After mounting a Continuum Copperhead arm as well as a DaVinci 12" Grandezza on my Raven AC-3 and carefully setting arm/cartridge geometries with the supplied Wally Tractor and Feikert disc protractor, I was actually able to listen to these tracks without flinching, and could now clearly ascertain the 'out-of-key' harmonies of Stephen Stills together with the clearly over-dubbed lead guitar boosted above the general sound level on the right channel and the completely flat soundstage.

Respighi Pines of Rome (Reiner on the Classic Records re-issue of the RCA LSC-2436) had always brought my wife storming down the hallway at the 'screeching' Finale whilst I scrambled for the volume control to save my bleeding ears.
Again with the two stellar arms and strict geometry, the 117 musicians could not hide the shrill, thin and overloaded recording levels of the horns (particularly the trumpets).
But the wife stayed away and my volume level remained unchanged.

My wonderfully recorded 'test' records had sounded just fine with my previous Hadcock arm but it's only now, when two 'horror' discs can be appreciated, that I truly believe my system 'sings'.
Perhaps we could re-listen to some 'horror' discs in our collection and, with some adjustments to our set-up, make them, if not enjoyable, at least listenable?
halcro
good topic. i invite you to read this interesting piece written some time ago. it is called:

Are you on the road to audio hell?

The proposed system: comparison by contrast.

When auditioning only two playback systems using the usual method (comparison by reference) we will have at least 50% chance of choosing the one which is the more accurate. However, evaluations of single components willy-nilly test the entire playback chain; therefore efforts to choose the more accurate component are compounded by the likelihood that we will be equally uncertain as to the accuracy of each of the system’s associated components if for no other reason than that they were chosen by a method which only guarantees prejudice. How can we have any confidence that having chosen one component by such a method that its presence in the system won’t mislead us when evaluating other components in the playback chain, present or future?

The way to sort out which system or component is more accurate is to invert the test. Instead of comparing a handful of recordings -presumed to be definitive- on two different systems to determine which one coincides with our present feeling about the way that music ought to sound, play a larger number of recordings of vastly different styles and recording technique on two different systems to hear which system reveals more differences between the recordings. This is a procedure which anyone with ears can make use of, but requires letting go of some of our favoured practices and prejudices.

In more detail, it would go something like this: Line up about two dozen or so recordings of different kinds of music – pop vocal, orchestral, jazz, chamber music, folk, rock, opera, piano – music you like, but recordings with which you are unfamiliar. (It is very important to avoid your favourite ‘test’ recordings presuming that they will tell you what you need to know about some performance parameter or another, because doing so will likely only serve to confirm or deny an expectation based on prior ‘performances’ you have heard on other systems or components. More later.) First with one system and then the other, play through complete numbers from all these in one sitting. (The other system may be entirely different or have only one variable such as cables, amplifier or speakers.)

The more accurate system is the one which reproduces more differences – more contrast between the various program sources.

To suggest a simplified example, imagine a 1949’s wind-up phonograph playing recordings of Al Jolson singing ‘Swanee’ and the Philadelphia Orchestra playing Beethoven. The playback from these recordings will be more alike than LP versions of these very recordings played back through a reasonably good modern audio system. Correct? What we are after is a playback system which maximises those differences. Some orchestral recordings, for example, will present stages beyond the confines of the speaker borders, others tend to gather between the speakers; some will seem to articulate instruments in space; others present them in a mass as if perceived from a balcony; some will present the winds recessed deep into the orchestra; others up front; some will indulge the bass drum with tremendous power; others barely distinguish between the character of timpani and bass drum. It is of absolutely no consequence that these differences may have resulted from performing style or recording methodology and manufacture, or that they may have completely misrepresented the actual live event. Therefore when comparing speaker systems, it would be a mistake to assume that one which always presents a gigantic stage well beyond the confines of the speakers, for example, is more accurate. You might like –or even prefer- what that system does to staging, but the other speaker, because it is realising differences between recordings, is very likely more accurate; and in respect of the other variables from recording to recording, will turn out to be more revealing of the performance.

Some pop vocal recordings present us with resonant voices, others dry; some as part of the instrument texture, others envelop us leaving the accompanying instruments and vocals well in the background; some are nasal, some gravelly, some metallic; others warm. The old method –Comparison by Reference- would have us respond positively to that playback system, which together with the associated ‘reference’ recording, achieves a pre-conceived notion of how vocal is presented and how it sounds in relation to the instruments in regard to such parameters as relative size, shape, level, weight, definition, etc. Over time we find ourselves preferring a particular presentation of pop vocal (or orchestral balance, or rock thwack, or jazz intimacy, or piano percussiveness- you name it) and infer a correctness when approximated by certain recordings. We then compound our mistake by raising these recordings to reference status (pace prof. Johnson), and seek this ‘correct’ presentation from every system we later evaluate; and if it isn’t there, we are likely to dismiss that system as incorrect. The problem is that since neither recording nor playback system was accurate to begin with, the expectation that later systems should comply is dangerous. In fact, if their presentations are consistently similar, then they must be inaccurate by definition simply because either by default or intention no two recordings are exactly similar, and while there are other important criteria which any satisfactory audio component or system must satisfy –absence of fatigue being one of the most essential- very little is not subsumed by the new method of comparison offered here.

Peter Quortrup AUDIOPHILE UK edition February 1994
Tuboo,

Good stuff.

I agree with the process outlined as an effective way to really differentiate and select a good playback system.
We then compound our mistake by raising these recordings to reference status (pace prof. Johnson), and seek this ‘correct’ presentation from every system we later evaluate; and if it isn’t there, we are likely to dismiss that system as incorrect.
Some very perceptive reflections Tuboo.
Thanks
thanks for reply.
it intrigued me in 1994 the moment i read this piece and it still inspires me in 2009.
the more you read this piece the more it dawns on you.
Mr Quortrup is quite a special figure.
i'm a great advocate of this protocol but there is no single 'law' to apply for enjoying the music.
nevertheless this single quote was like an audiophile bomb to me when i first read it:

The more accurate system is the one which reproduces more differences – more contrast between the various program sources.

and it still is, i can't escape it.
Very interesting, Turboo. Thank you for posting that.

The more widely used method, listening to a few reference recordings and comparing the sound to a retained memory of what something should sound like, can be useful for *analyzing* a system's or component's ability to reproduce a specific trait. For instance, at last year's RMAF we were interested in the capabilities of two particular components. Based on their designs, we had predicted that both might have difficulty reproducing fast, complex yet delicate harmonics. We therefore brought along one LP which contains a lot of such material.

One of the components performed as we expected, failing miserably. The other surprised us, happily, and might be a candidate for our system in the future.

Having passed that test, the next logical step would be to extensively A/B the second component in our system using Quortrup's protocol.

We recently had a visit from another A'goner. After playing several of his favorite/reference LP's we subjected him to a variety of music that he'd never listened to before. I didn't let him take my LP's, but he did make some notes. :-) Perhaps he'll find similar or identical recordings and compare the degree of differences in his system vs. ours.

Good stuff...