I listen mainly to music that was originally generated using non-electro-acoustic sources.
So I too would like perfectly accurate sound (unless it be from the echo-affected seats in the Royal Albert Hall, in which case I would happily accept a measure of echo cancellation).
The situation could be different for "audiophiles" who do not listen to live, acoustically-generated music. I would prefer to call these people audiorasts (aka pederasts).
Unfortunately, no part of the reproduction chain is perfect, and this can easily lead to specification difficulty. For example, some of the best systems I have heard use speakers whose total output acoustic power correlates very well over the frequency range with on-axis density - but whose intrinsic frequency response is not that flat; accordingly the amplifier frequency response has to be modified (with quite a fine resolution) to correct the overall response*. We are in effect degrading the amplifier to correct for the defects in the speaker - measurements of either on its own would not look good on paper.
But the hardware is not the only problem - some "sound engineers" modify the microphone balance and even frequency response during the course of a piece of music. Even when they do this only while the featured instrumentalist is silent (i.e. in order to reduce background noise) you get a distracting change in the reproduced acoustic; but it also seems that some sound engineers think they know better than the conductors. BBC Proms producers please note.
*In case anyone thinks otherwise, I am not referring here to any widely advertised system.
Being involved in instrumentation design, I was always a bit skeptical that a given amplifier or speaker could sound better or worse than its measured results suggested.
One day on a whim, I hooked up my distortion meters to the actual speaker terminals of my living room stereo, and I was in for a rude awakening. The distortion readings were a couple of orders of magnitude higher than for a pure resistor load! Not only that, but the frequency response drooped quite significantly above about 6 kHz using lamp cord.
When I tested various other power amplifiers, I found that their performance under a speaker load bore no resemblance to their performance under a resistor load.
Performance also varied wildly with the particular speaker systems used. A full range electrostatic by Quad made many amplifiers go unstable and most perform very badly. Not surprisingly, the Quad amplifier seemed unaffected by actual speaker loads.
Speaker cables did have a significant effect on frequency response and distortion.
So if ear tests don't correspond with instrument tests, you're not testing properly.
How can you be certain that you are measuring the correct things, and that you know what "accurate" really means with respect to sound quality? Here are a few historic examples where listeners uncovered gaps in the measurement methodologies available at the time:
Early SS amplifiers measured better than their tube counterparts, but many audio buffs found the sound to be clearly inferior. The culprit? Very high levels of crossover distortion with high peak/average ratios. Clearly audible. THD analyzers of the day notched out the fundamental and measured the average or RMS level of the residual. Thus a favorable objective measurement of a subjectively bad amplifier. We now know to look at the distortion harmonic spectra at all amplifier power output levels and can easily uncover these flaws.
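To put rough numbers on the crossover point, here is a minimal sketch in Python. It assumes a crude dead-zone model of crossover distortion (the output is simply zero within +/-d of the zero crossing), which is my own simplification and not a model of any particular amplifier; the function names and the 0.02 dead zone are hypothetical. It notches out the fundamental, as the old analyzers did, and reports both the RMS ratio they would have shown and the peak/RMS ratio of the residual.

```python
import math

def dead_zone(x, d):
    """Crude crossover model (an assumption): zero output within +/-d of zero."""
    if abs(x) <= d:
        return 0.0
    return x - math.copysign(d, x)

def residual_stats(amplitude, d, n=4096):
    """Return (rms_thd_ratio, residual_crest_factor) for one sine cycle."""
    phase = [2 * math.pi * k / n for k in range(n)]
    y = [dead_zone(amplitude * math.sin(p), d) for p in phase]
    # Project the output onto the fundamental (a single-bin DFT).
    a = 2 / n * sum(v * math.sin(p) for v, p in zip(y, phase))
    b = 2 / n * sum(v * math.cos(p) for v, p in zip(y, phase))
    # The residual is whatever remains once the fundamental is notched out.
    res = [v - a * math.sin(p) - b * math.cos(p) for v, p in zip(y, phase)]
    res_rms = math.sqrt(sum(r * r for r in res) / n)
    res_peak = max(abs(r) for r in res)
    fund_rms = math.hypot(a, b) / math.sqrt(2)
    return res_rms / fund_rms, res_peak / res_rms

# Same dead zone, full level versus 20 dB down.
thd_loud, crest_loud = residual_stats(amplitude=1.0, d=0.02)
thd_quiet, crest_quiet = residual_stats(amplitude=0.1, d=0.02)
print(f"full level: THD {100 * thd_loud:.2f}%, residual peak/RMS {crest_loud:.2f}")
print(f"-20 dB:     THD {100 * thd_quiet:.2f}%, residual peak/RMS {crest_quiet:.2f}")
```

Even in this toy model the single RMS figure looks far better at full level than 20 dB down, and the residual is peakier than a sine. A real amplifier with feedback would behave differently in detail, so treat the numbers as illustrative only.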
Jitter is another example of an audible defect uncovered by careful listeners and widely panned by many engineers as unmeasurable and so irrelevant. It wasn't until John Meyer developed and published a measurement methodology and started to correlate jitter with defects in reproduced sound that people started to take notice.
Historically these issues have been solved by savvy engineers and trained listeners who work together to achieve a beneficial result. Sometimes the engineer/listener is the same person, sometimes not.
The use of "audiofool" terminology is derogatory and prejudicial towards a group of individuals who historically have been able to uncover issues with audio reproduction that the then-current engineering measurement methodologies could not.
I'm not defending green pens, mpingo discs, quantum sinks, and the like. Far from it.
I'm suggesting there is a middle ground, and that broad-brush dismissal of the audibility of unmeasurable differences has historically been a false position to hold.
Most published speaker measurements lie like a rug.
A speaker enclosure has 3 fundamental dimensions, each of which produces resonances wherever it spans a whole number of half wavelengths, so the series runs right up the audio spectrum. The Q of these resonances will typically range from 10 to 20. If one dimension is the same as, or a multiple of, another, the resonances coincide and are boosted. To spread the peaks evenly up the audio spectrum, the fundamental dimensions of the enclosure should relate as the cube root of two.
To avoid revealing these nasty resonant peaks, speakers are typically tested with pink noise so the high Q resonances have no chance to build. Unfortunately music consists of frequencies held for significant time and the resonances will obviously color the music to the ear.
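The arithmetic behind the enclosure resonances can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: it counts only axial standing waves between parallel walls (f = n * c / 2L), ignores tangential and oblique modes and any damping, takes c = 343 m/s, and reads "relate as the cube root of two" as dimensions in the ratio 1 : 2^(1/3) : 2^(2/3). The 0.3 m base dimension and the function names are hypothetical.

```python
SPEED_OF_SOUND = 343.0  # m/s, air at roughly 20 C (assumed)

def axial_modes(dims_m, f_max):
    """Sorted (frequency_hz, axis, order) for axial modes up to f_max."""
    modes = []
    for axis, length in enumerate(dims_m):
        n = 1
        # A mode occurs wherever the dimension holds n half wavelengths.
        while n * SPEED_OF_SOUND / (2 * length) <= f_max:
            modes.append((n * SPEED_OF_SOUND / (2 * length), axis, n))
            n += 1
    return sorted(modes)

def coincident_pairs(modes, tol_hz=1.0):
    """Count neighbouring modes from different axes closer than tol_hz."""
    return sum(
        1
        for lo, hi in zip(modes, modes[1:])
        if lo[1] != hi[1] and hi[0] - lo[0] < tol_hz
    )

base = 0.3  # hypothetical internal dimension in metres
cube = (base, base, base)
staggered = (base, base * 2 ** (1 / 3), base * 2 ** (2 / 3))

for name, dims in (("cube", cube), ("cube-root-of-two", staggered)):
    modes = axial_modes(dims, f_max=2000.0)
    print(name, [round(f) for f, _, _ in modes],
          "coincident pairs:", coincident_pairs(modes))
```

With the cube, every resonance lands on the same frequency for all three axes; with the staggered box the same number of fundamentals is spread out and none coincide below 2 kHz in this example.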
Every manufacturer wants his specs to look good. Testing realistically would significantly worsen the numbers. So most speaker and amplifier specifications are meaningless to real world performance.
So if ear tests don't correspond to instrument tests, you're not testing honestly.
Speaker and especially headphone transducers deteriorate rapidly with age due to temperature and humidity effects as well as fatigue. This deterioration doesn't have a direct relationship to original specifications or cost.
This is why a transducer that ages gracefully can sound much better to the ears than one that doesn't.
Hanging a curtain, rolling up a rug, or turning the speakers at an angle will noticeably change the perceived sound in the room. A person's listening experience is subjective - measurements are not. So? The best advice I've been given is - "listen to the system, buy what you liked". Take into account that what you heard in the padded listening room in the shop will sound different in your tiled-floor living room.
I pretty much agree with everything that has been written here so far...
So anything I add is intended as clarification rather than contradiction.
I think I can be certain that the things I measure are correct. On the other hand, I cannot be certain that I am measuring everything that is important. Similarly, I cannot be certain that the way I am measuring is appropriate, albeit physiological work means the situation should gradually be improving.
That was the case when crossover distortion was routinely ignored; however, the period during which accepted measurements ignored this is long gone - and audiorasts still claim that transistor amplifiers are never the equal of valve.
Known measurements that appear to be little-understood (or underused) at the present are mostly on speakers and on listeners.
Speaker issues include:
hangover, frequency-uniformity of angular distribution, and distortion (most speaker tests only measure high-level distortion - a potential mistake, particularly when organic and other so-called intelligent materials are used).
To return to "accuracy", I agree that listener tests are critical. But they need to be blind tests. The environment and performance method are also critical. For my money all tests should sporadically reference live music. The background is that the human nervous system is quite plastic, and for short-term tests people tend to prefer what they have become accustomed to. (The best counter-example was when a researcher brought a noisy change-over switch that was not connected in any way. In the absence of a difference in the sound, the difference between the "clunks" switching one way and another appeared to colour listeners' preferences.)
Another audible flaw where the measurements had to catch up to trained ears dates from the 1970s: transient intermodulation distortion (TIM or TIMD). It wasn't until Prof. Marshall Leach (1940-2010) mathematically defined it in 1977, tracing it to (lack of) amplifier slew rate, that it could be pinned down. As he explained it to us in his EE4026 Audio Engineering class at Georgia Tech, amplifiers that have a high slew rate can still have TIMD; those that have slow slew rates *will* have it.
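For a feel of what "slow" means here, a minimal sketch of the textbook sine-wave bound: a sine of peak voltage V at frequency f has a maximum slope of 2 * pi * f * V, so an amplifier needs at least that slew rate just to reproduce it undistorted. This is only the necessary floor, not Leach's TIM analysis, and the 100 W / 8 ohm / 20 kHz figures and function names are my own illustrative assumptions.

```python
import math

def peak_voltage(power_w, load_ohm):
    """Peak sine voltage that delivers power_w (average) into a resistive load."""
    return math.sqrt(2 * power_w * load_ohm)

def min_slew_rate_v_per_us(freq_hz, v_peak):
    """Steepest slope of a sine of the given frequency and peak, in V/us."""
    return 2 * math.pi * freq_hz * v_peak / 1e6

v_pk = peak_voltage(power_w=100, load_ohm=8)            # 40 V peak
floor = min_slew_rate_v_per_us(freq_hz=20_000, v_peak=v_pk)
print(f"{v_pk:.1f} V peak needs at least {floor:.2f} V/us at 20 kHz")
```

An amplifier below this figure cannot follow a full-power 20 kHz sine at all; one above it has merely cleared the first hurdle, which fits the point that a high slew rate is necessary but not sufficient.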
Three other factors come into play in the human auditory system for normal hearing people:
1) Ear canal resonance will vary greatly in both center frequency and Q: for my "guesstimating" for hearing aids and in-ear monitors (when I'm not using a probe tube mic, i.e. real ear measurement), I use a figure of 2 kHz for men and 2.7 kHz for women, with a Q of 15, but I vary that figure based on otoscopic examination of the external auditory meatus texture. For more on this, google "Marshall Chasin";
2) You'll experience about 5% THD from the non-linear motion of the ossicular chain as it transforms tympanic membrane vibration into launching acoustic waves into the perilymph via the oval window;
3) Non-linear neural firing based on intensity, including direct stimulation of cochlear inner hair cells by the basilar membrane above 60dB HL (hearing level).
Editor, The Hearing Blog
As I recollect, the Acoustical Company (among others) was using both instantaneous and continuous signal slew testing at least as far back as the design phase of the Quad 405 (early 1970s). They also tested with realistic speaker loads (in addition to the electrostatics, which might be regarded as unreasonable).
Leach was formalising what he and the more meticulous of his colleagues had been doing for some years.
Other than maybe telling you what can and can't be ignored, I'm not clear about the relevance of the auditory system to this discussion - as it's a fixed part of the chain (albeit variable from person to person). The parts I think significant are the impact of sound level (to get subjective test conditions right), and (for design purposes only) of masking (to give an indication of criticality).
I once attended a course on studio recording at a well known recording studio.
I was appalled at what was taught about "equalizing" the sound. The teacher tweaked every knob on each parametric equalizer for each sound channel for maximum "punch", to the point of what I would call piercing. Monitoring was done at deafening sound levels. We were informed that this was standard studio recording practice.
This "equalizing" process was then repeated when it came time to make the master CD by another expert in this field.
It would appear that to earn his keep, a recording engineer is expected to adjust every single knob on his mixing board.
It's small wonder that so many recordings sound terrible when played on good audio equipment. How can one possibly judge the sound of an audio system when the frequency and phase response of most sound sources has been so messed up?