MPEG Layer 3 Caste Evolution
by, 05-15-2012 at 11:26 PM
Soon compressed audio codecs are going to sound like this if we don't tickle our spines and start walking proud. Hawk Software - HawkVoice codecs
Of course, we can't be a planet of people that stupid (stupid enough to make music that compressed), though we almost started being so.
We may have gotten away with ultra-compressing voice for telecommunication; a phone call back in the 1940s would've sounded like today's AM radio stations minus the enhancements. But these trends of highly complex digital algorithms for sound manipulation aren't staying away from music. Evidence of this is clear in the loudness war.
And in the creation of countless DSPs, A/D/A converters, VSTs, DAWs, and the whole procreated manner of sampled music.
It seems as if we want to make everything as man-made as possible :/
Just as man learned to kill with a rock, the reason for this could be that our ears actually have acoustic chambers that can, in theory, phase out frequencies, cut them into rudimentary thirds, and perform all manner of physical manipulation. Just as with our hands we manipulated mud to form geometric shapes, it would only make logical sense to create music that follows our rules of acoustic physics, if only for our brain organ to be more comfortable within our environments.
It wouldn't be surprising if there is a resonant vibration that, if applied deep enough and close enough to our eardrum, could induce a 'hypnotic' state. Pain is induced by high air pressure levels, so why not pleasure?
Our ears protect themselves from high-pressure situations, just as we use dynamic range control or normalization before anything goes up a wire into a set of earbuds. And so we evolved our ears past the breach of the earlobe.
Early recordings couldn't even imitate the stereo capacity of human hearing. Slowly, we developed the technology further.
Codecs that layer audio have been around since the days of mp3PRO; these attempt to bring back lost 'high-bandwidth audio information', kind of the same way the outermost part of the ear is sensitive to wind augmenting a sound. And we've advanced DSP technology, and spoon-fed protein pills to the silicon business from all this D/A conversion fandango, to a rather critical point.
Our ears are evolving, and we are making tools that, just like the spear, will progress into rail-gun prototypes.
And yet all this is in the name of good business.
It seems that we've been developing technologies and sciences rather well. Though an ignorance towards music, or better stated, towards overall sound quality, is beginning to seep into the next generation of children.
Consider that the human body constantly adapts and builds up behaviors and niches, and that a child is raised by a complicated web of social interactions, which now are also online, through mediums such as twatter or, more rarely, TV.
The community, or better stated the world, is raising the 'mp3-boomers' to build up a trait where the ears aren't as attentive as they could be, though they will have a very high threshold for high air pressure.
Then the term audiophile begins to replace aristocrat, and jester is replaced by hipster. And to make things worse, the word musician becomes a separate race of Homo sapiens, and a minority at that.
Just like a man with one leg knows the quickest way across town, and just like I can eat spicy chili peppers because I've spent years building up to eating jalapeños like they're Skittles, kids that are raised in a world of compressed audio won't know the difference between today's standards of a good recording and tomorrow's ideal.
Even we're forgetting.
So, back to the technical aspects of sound engineering. We hear 1 kHz through 5 kHz very well.
That's 4,000 frequencies that stand out like a pin prick in the ear.
When we go from PCM data to something like .aac, we compress it.
Most MPEG Layer 3 compression runs at a ratio of about 10:1. Take that compression and apply it to this rich little spectrum so well perceived by the human ear, and we are left with 400 frequencies.
Take into account that sound is ANALOG, which means there are decimal points, and that from 1000.00 to 1004.00 you just counted another 400 frequencies with the hundredths alone (to explain: 1000.01, 1000.02, 1000.03, and so on). We all know this goes further.
And our ears are believed to detect down to the micropascal level. We lose a lot.
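A quick sanity check on the back-of-envelope arithmetic above, sketched in Python. The band edges and the 10:1 ratio are the figures from this post, not codec internals:

```python
# Back-of-envelope numbers from the argument above; a sketch, not codec math.
band_lo_hz, band_hi_hz = 1000, 5000          # the range the ear hears best
whole_steps = band_hi_hz - band_lo_hz        # 4000 whole-hertz steps

ratio = 10                                   # typical MP3 compression, 10:1
surviving = whole_steps // ratio             # crude "thinning" -> 400

# Counting hundredth-of-a-hertz steps from 1000.00 up to 1004.00:
fine_steps = round((1004.00 - 1000.00) * 100)   # also 400

print(whole_steps, surviving, fine_steps)    # 4000 400 400
```

The "thinning" here is of course a caricature; a real encoder throws away bits per frequency line, not whole frequencies, but the orders of magnitude are the point.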
Sadly it doesn't stop there. This area of 400 viable frequencies gets cut down even further, because the end audio file has to cover the spectrum of 0-20 kHz (though some roll off from 15 kHz). But because of ISO and IEC standards, we don't get screwed over as badly as we would if no standards were imposed.
These ratios are thought out, by smart men, to create a space where frequencies don't mask each other.
But most current permutations of lossy codecs are proving to mask a great deal of musical content. And usually they mask with the voice and the kick. Who knows why; must be the evolved taste in music.
When MPEG Layer 3 was finally up and pushing for market space (it launched iTunes, dammit), one song came out that set the guidelines for how to use the codec.
Tom's Diner (Suzanne Vega) was engineered by the original coders to do a rather good job of depicting the space available. The song shows in detail what an ideal instrument-and-vocal set sounds like, and just exactly how much high end is available for the bass before it starts masking the vocals. It also shows the higher frequency range reserved for effects.
And today (well, last week), 'I Follow Rivers' (if you don't know the song, I wish I was you) follows the same template as originally created.
Of course, the newer encoding ensures that the pianos here have a proper attack. (The song did have pianos, right?)
The problem with these compressed recordings, however, is that coarticulation in the human voice requires very precise timing, exact frequency replication, and, furthermore, sound pressure levels with a very fast and dynamic range of play.
An encoded recording, say, will use the same exact phonics throughout a song, whilst the human version of a phonic will never be replicated the same way twice. We don't calculate lip resistance or mouth moisture when we speak; that's why the quantization that occurs on phonics damages the human element of recordings.
Encoded/compressed audio quantizes timing as well. This is a problem because if there is a gap of 1-10 milliseconds between sequential sounds, we will perceive them as one continuous sound or loop; everyone but Superman has this handicap. The threshold to hear separate sounds is past 10 milliseconds for most people. Quantization may also move a sound around by 1-2 ms. Someone will say: but that doesn't matter, it's too spread out.
A song has a lot of milliseconds, and when you start moving all of them around, it changes the song, period.
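To put those milliseconds into samples, here's a tiny sketch. The 44.1 kHz rate is my assumption (CD audio); the 2 ms smear and the ~10 ms fusion threshold are the figures from the text:

```python
# How big a 1-2 ms timing smear is at CD sample rate; figures from the text.
sample_rate = 44100
samples_per_ms = sample_rate / 1000      # 44.1 samples every millisecond

shift_samples = int(2 * samples_per_ms)  # a 2 ms shift moves ~88 samples
fusion_gap = int(10 * samples_per_ms)    # the ~10 ms fusion threshold: 441

print(shift_samples, fusion_gap)         # 88 441
```

So a "tiny" 2 ms smear is already dozens of samples of misplaced attack, well under the fusion threshold, which is exactly why it blends in rather than being heard as an error.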
If these speech acoustic properties are offset by meticulously quantized compression, then the psychological interpretation could be afflicted; that would be the conclusion.
And in most cases it is.
For example, listen to a vinyl of Ella Fitzgerald, then the same thing on YouTube. For some people she might seem younger and thinner in the YouTube version!
This is because of how and what the human ear perceives. That's why we can approximate the age of anyone we converse with, for example.
The ongoing sale of HD and Blu-ray and Dolby Digital extra-plus gold certified is just technology moving along. Just like in the days of Memorex.
Even though YouTube uses the latest incarnation of the AAC codec, which carries multiple-object data and has been slaved over for the past 10-20 years, it's still based on a masking theory from 1986 that doesn't appreciate infrasound or room acoustics as much as it should. It's all one big compromise. If we want more music, we have to manage it better.
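The masking idea itself is simple to caricature: a loud tone raises the audibility threshold of nearby frequencies, so a quiet neighbor can be discarded. The spreading slope below is an illustrative number of my own, not the value any real psychoacoustic model uses:

```python
import math

# Toy simultaneous masking: a loud tone raises the audibility threshold
# of nearby frequencies, so the encoder can throw a quiet neighbor away.
def masked_threshold_db(masker_hz, masker_db, probe_hz, slope_db_per_octave=27.0):
    # Threshold falls off as the probe moves away from the masker, in octaves.
    octaves = abs(math.log2(probe_hz / masker_hz))
    return masker_db - slope_db_per_octave * octaves

# A 70 dB tone at 1 kHz next to a 60 dB probe at 1.1 kHz:
threshold = masked_threshold_db(1000, 70, 1100)
print(threshold > 60)   # True: the probe sits under the raised threshold
```

Real models add absolute hearing thresholds, critical-band warping, and tonal-vs-noise distinctions, but the core move is this subtraction, and anything under the curve never makes it into the file.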
This whole compressed-audio thing has been sold to the public very well; few people have ever complained, and literally no one did a thing about it.
We literally ripped the pages out of books made of sound and sold them for more.
And some of the marketing seems as if it may have been devised by a god, with how perfectly orchestrated it's been. Or maybe Sony wanted to put RCA out of business?
The industrial truth, and the legal aspect is more realistic.
The reason for the loudness war is a simple one at that: some audio is hard to compress (compress as in encode, not as in make it louder) because of its randomness and sharp attacks.
We have to cut things out, to the limits of the code. As we close off more than half of the audible spectrum, a song has to be prepared, when it comes to mastering, for its summoning into a compressed format.
We have to compensate for what will be lost, and bring the levels of what will remain to a place where the end result won't flutter and sound like a wrenched fart.
We set a good level so frames have an easier job of containing volume, we kill transients to create less object data, we add reverb to bring back high-end...and so forth.
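Those tricks can be caricatured in a few lines. This is a toy chain of my own (a one-pole smoother plus a tanh soft limiter), not anyone's actual mastering workflow:

```python
import math

def smooth(samples, strength=0.5):
    # One-pole lowpass: blurs sharp attacks -- the "kill transients" step.
    out, prev = [], 0.0
    for s in samples:
        prev += strength * (s - prev)
        out.append(prev)
    return out

def soft_clip(x, ceiling=0.8):
    # tanh limiter: rounds peaks off so frames hold a steadier volume.
    return ceiling * math.tanh(x / ceiling)

spiky = [0.0, 0.0, 1.0, -1.0, 0.2, 0.0]         # a made-up click
prepared = [soft_clip(s) for s in smooth(spiky)]
print(max(abs(s) for s in prepared) < 0.8)      # True: peaks tamed
```

The encoder now sees a dull, predictable signal it can fit into its frames, which is the whole point, and also the whole tragedy.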
The engineering bit is:
576 time-domain samples are taken per granule (about 13 milliseconds at 44.1 kHz), which then get transformed into 576 frequency-domain samples, which get cut down to short blocks of 192 in case of a transient (to keep the temporal spread of quantization noise accompanying the transient contained). As in: that automation curve you made will lose a point or two during the conversion, simply because there is a limit to how much data can be taken.
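For reference, here is that block arithmetic spelled out. The 576/192 figures are MP3's own; the rest is simple division, assuming a 44.1 kHz sample rate:

```python
# MP3 windowing numbers, as arithmetic; a sketch, not an encoder.
sample_rate = 44100
long_block = 576     # frequency lines per granule (long window)
short_block = 192    # lines per short window, switched in around transients

long_ms = 1000 * long_block / sample_rate    # ~13.06 ms of audio per granule
short_ms = 1000 * short_block / sample_rate  # ~4.35 ms: tighter in time

# Coarser in frequency, though: one long-block line spans roughly
line_hz = sample_rate / (2 * long_block)     # ~38.3 Hz

print(round(long_ms, 2), round(short_ms, 2), round(line_hz, 1))
```

That is the time/frequency trade in one screen: short blocks pin the transient down in time at the cost of frequency detail, which is exactly the compromise item 4 in the list below complains about.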
And yes, transients used to make mp3s sound worse. That's why they have a bad name in music today. And back in the days of wax recording, they made the needles jump off the track.
Other published problems with how MPEG Layer 3 imposes errors on music include:
1.Bit-rates can cause artifacts/slow-downs when changing intensity of information domains.
2.May cause smearing of percussive sounds, not to be confused with percussion.
3.Due to the tree structure of the filter banks, pre-echo problems are worsened.
4.The combined impulse response of the filter banks does not, and cannot, provide an optimum solution in time/frequency resolution.
5.The combining of the two filter banks' outputs creates aliasing problems that must be handled partially by the "aliasing compensation" stage.
6. Excess energy to be encoded in the frequency domain created by the aliasing compensation decreases coding efficiency.
7.Frequency resolution is limited by the small long block window size, which decreases coding efficiency.
8.There is no scale factor band for frequencies above 15.5 kHz.
9.Joint stereo is done only on a frame-to-frame basis.
10.Internal handling of the bit reservoir increases encoding delay.
11.Encoder/decoder overall delay is not defined, which means there is no official provision for gapless playback.
A quote from an ISO document:
"At the present time, methods to automatically convert natural sound into synthetic or multi-object descriptions are not mature: therefore, more immediate solutions will involve interactively authoring the content stream in some way."
And the list of 'ghost in the machine' type BS errors that show up when trying to make a scale model of the original audio is HUGE!
The whole compress-audio-to-feed-the-masses thing worked in the '90s and early 2000s, but now that hard drives are getting bigger, we no longer have to put up with errors: correlation, rounding, encoding, frame, resolution, and whatever else keeps a PCM from sounding good.
Nowadays most portable thing-a-dings can play .wav.
The advance in storage capacity is finally allowing us to hold an .mp3 library's worth of songs in a lose-none codec (as in raw form), something a lot of people dreamed about back when Napster was 'screwing musicians'. But we all know Napster was not for music, or games, or books. It was for pictures and movies of kitty cats.
Mastering for lossless audio, however, is slowly evolving too.
The term adaptive mastering involves real-time monitoring of the conversion process:
Pro-Codec Product Overview
Now imagine this: you're mixing away, you've got the master running nice and smooth, and you plug this up on your out bus.
You make a note of every frame that gets compressed wrong, and you re-code the codec at the script level to follow the dynamics of each song individually. In other words, each encoding is individual to the song.
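A sketch of that idea: walk the mix frame by frame, round-trip each frame through the encoder, and note where it comes back wrong. The quantizer below is a crude stand-in of mine, not a real codec:

```python
FRAME = 4   # toy frame size

def encode_decode(frame, bits=3):
    # Stand-in for a lossy round trip: snap samples to 2**bits levels.
    levels = 2 ** bits
    return [round(s * levels) / levels for s in frame]

def flag_bad_frames(samples, tolerance=0.05):
    bad = []
    for i in range(0, len(samples), FRAME):
        frame = samples[i:i + FRAME]
        err = max(abs(a - b) for a, b in zip(frame, encode_decode(frame)))
        if err > tolerance:
            bad.append(i // FRAME)   # this frame needs special treatment
    return bad

mix = [0.01, 0.02, 0.01, 0.0,        # quiet frame: survives the trip
       0.93, -0.88, 0.73, -0.6]      # spiky frame: comes back wrong
print(flag_bad_frames(mix))          # [1]
```

A real version would swap in an actual encoder for the round trip and a perceptual error measure for the plain difference, but the monitoring loop is the same shape.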
User - Universal Mastering
That would be... where we should be today at mastering and engineering. But sadly, we are lazy and we didn't get there; also, it's an inappropriate amount of work, considering that we are almost at the stage of listening to full multi-tracks instead of that compressed garbage.
Now though, we are at a point where things are extremely peculiar: we use compression ratios of 1.5:1 to 2:1!
Why the hell!
That's millions of dollars spent investing, and countless hours coding and testing, just for 3.5:1?
Would it not make more logical, economical, and clinical sense to just use the raw data without compressing it at all?
The industry is milking the .mp3 craze very heavily, man. Dolby TrueHD, Meridian Lossless, and WAVs seem to be fighting it out for dominance now.
FLAC may be able to recompile an original ISO from its format, and Ogg might be nice, but these three goliaths really do sound good.
We seem to have gotten to the point where encoding music is not needed, in terms of collecting, distributing, and processing. But we're still doing it... We're still buying it.
A normal person, though, gets confused now.
Remember when HD-Ready and Full HD came out? A lot of people got conned into buying a TV that lasted only two months before they felt like they actually needed 1080p and not just 480p.
LaserDisc vs VHS: yet another 'war' that claimed a few bucketfuls of cash.
Blu-ray Audio Explained
Mainstream advertising is very deceptive. That's why the government protects us by making ISO information public, for example.
Yet if a company wants to sell a product, it must follow these rules, and these rules separate a con from a technology.
Because of simple principles: if you're gonna be dumb, you gotta be tough...
So now that we've almost reached the pearly gates of acoustic treatment for our earbuds, and my cell phone is loaded with 2 gigs of .wav music: what's next, and how long before songs come as separate multi-tracks loading up into WMP 13?
Are we going to develop a sort of discrimination against people who can't stop listening to .mp3s? Just as 12-year-old Johnny doesn't understand why grandpa still listens to his old 8-track cassettes, or how some people hoard CDs while others hoard Blu-rays (and some unfortunates hoard HD DVDs), there is a small gap between everyone who has been at a technological forefront.
Luckily, we care more about what people listen to than what they listen on. If grandpa was listening to the Spice Girls on 8-track, little Johnny might visit less often.
a lot of wiki
NewMusicBox » The Musical Ear
Apple - iPod touch - View technical specifications for iPod touch.
Techniques in Speech Acoustics - Jonathan Harrington, Steve Cassidy - Google C
MP3 vs Uncompressed
a thing to massage your head | Flickr - Photo Sharing!
Manfred R. Schroeder
Yet still, this whole encode-music-and-sound thing did have plenty of German folk working on it. Maybe it's some sort of Nazi conspiracy.