Broadcast News
11/11/2016
What Is The Future For Immersive Audio?
Peter Poers, Managing Director at Jünger Audio, looks at production efforts versus consumer experience.
Introduction
Along with the evolution of higher resolution in video images, a new way of creating and delivering audio content will be required and is already on the way. All the changes for future audio systems are covered by the general title "Next Generation Audio" (NGA). In addition to the very common existing channel based or fixed-mix audio formats, more audio channels will need to be added to make a difference.
Obviously, the creation and delivery of one layer surround sound (horizontally, with the listener surrounded by audio elements) will no longer meet future expectations. Following the successful introduction of 3D audio in cinemas by Dolby® Atmos and in VR applications, we can expect that immersive audio programs will become part of the delivery for future TV formats. Of course, the consumer at home can't expect to have a listening experience similar to that of a large scale multi-speaker cinema theatre. However, spatial audio effects can be delivered by using additional height channels reproduced by separate speakers or special 3D sound bars (better called sound projectors as opposed to ordinary external amplifier/loudspeaker combinations), or by using headphones driven by 3D virtualization software. That will give immersive audio a realistic chance to become a standard feature of home entertainment systems in the near future.
The future NGA based surround sound formats adopted by TV Broadcast and OTT will typically be a maximum of 7.1 + 4H channels – in total, up to 11.1 speakers (as referenced in DVB NGA survey May 2015), arranged as a mid layer surround array of up to seven speaker positions and up to four height speakers on a top layer plus the sub-woofer for low frequency effects.
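The speaker count quoted above can be checked with a short sketch. The function name and the way the layout string is parsed are illustrative assumptions only; they are not taken from any codec specification.

```python
def speaker_count(layout: str) -> tuple:
    """Parse a layout label such as '7.1 + 4H'.

    Returns (full_range_speakers, lfe_channels). The 'H' suffix marks
    height speakers; the digit after the dot is the LFE (sub-woofer) count.
    """
    full_range = 0
    lfe = 0
    for part in layout.replace(" ", "").split("+"):
        if part.upper().endswith("H"):
            # Height layer, e.g. '4H' adds four speakers on the top layer
            full_range += int(part[:-1])
        else:
            # Mid layer, e.g. '7.1' is seven speakers plus one LFE channel
            main, _, sub = part.partition(".")
            full_range += int(main)
            lfe += int(sub) if sub else 0
    return full_range, lfe


print(speaker_count("7.1 + 4H"))  # (11, 1), i.e. "11.1"
print(speaker_count("5.1 + 2H"))  # (7, 1)
```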
A higher channel count is not the only element that will define the next generation of audio format technology; the presence of audio objects is another. At the moment, audio programs are typically produced and mixed in their final reproduction surround sound audio format. That can be, for example, 5.1 or 5.1 + 2H or even 7.1 + 4H. The mix is created and finalised and is then ready for delivery. We can call this type of program mix a channel based immersive audio format. For the NGA formats, there will be the additional introduction of audio objects. Audio objects are typically discrete mono or stereo audio channels that will be rendered to the reproduction audio mix in the final receiver audio decoder. With this method, these audio elements can remain as objects, with individual changes applied only at the very end of the audio delivery path.
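As a rough illustration of what "rendering an object in the receiver" means, the sketch below mixes a mono object into a channel bed using a constant-power pan between the front left and right channels. This is a simplified assumption for illustration; the renderers in real NGA decoders are defined by the codec vendors and are considerably more elaborate. The names `pan_gains` and `render_object` are hypothetical.

```python
import math


def pan_gains(position: float) -> dict:
    """Constant-power pan. position runs from -1.0 (left) to +1.0 (right)."""
    angle = (position + 1.0) * math.pi / 4.0
    return {"L": math.cos(angle), "R": math.sin(angle)}


def render_object(bed: dict, obj_samples: list, gains: dict) -> dict:
    """Return a copy of the bed with the mono object mixed in.

    bed maps channel names to lists of samples; gains maps channel names
    to linear gains. Channels without a gain entry are left untouched.
    """
    out = {ch: list(samples) for ch, samples in bed.items()}
    for ch, gain in gains.items():
        for i, sample in enumerate(obj_samples):
            out[ch][i] += gain * sample
    return out


# A silent three-channel bed and a two-sample commentary object, panned center
bed = {"L": [0.0, 0.0], "R": [0.0, 0.0], "C": [0.0, 0.0]}
mix = render_object(bed, [1.0, 0.5], pan_gains(0.0))
```

Because the object stays separate until this step, the receiver can still change its level or position before the final mix is produced.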
Another element for defining the NGA formats is the use of metadata. All of the existing new audio codec systems (e.g. MPEG-H, Dolby AC-4, DTS:X) use an extensive set of metadata to describe audio program details, to optimize production workflow, to control audio encoding and to allow optimum audio performance at the final receiver device decoder. Besides controlling and monitoring audio content in the process of program production, the generation of metadata is a most important step for introducing and launching next generation audio formats. Working with metadata will be essential to "authoring" audio programs in new formats.
Workflow considerations – introduction of a "side car" device
The next generation immersive and personalized audio formats will require changes in the audio production workflow. New procedures for managing object based encoded content, and for the personalization of services through the selection of alternative audio objects (such as commentator languages), need to be defined. Of course, loudness control during production and the loudness definition for the final output formats are other aspects to consider. The NGA formats will offer a new surround sound experience, and the use of upmix, format rendering and downmix algorithms will be essential for creating and monitoring the audio programs.
Some additional tools and changes to existing production environments will be required to be able to create audio content for these new audio formats. One of the important aspects to give the new formats a good chance to succeed will be to minimize the cost of transition. Production costs on the professional side cannot be raised significantly without running the risk of the industry rejecting the new formats. The use of existing digital production infrastructures will be essential to begin content creation for new formats in the near future.
One particular new supporting tool will become most important for different workflow areas: the Multichannel Monitoring & Authoring Unit (MMA). This tool must combine audio interfacing, audio computing and metadata authoring in a unique way and will be the key to starting production for immersive audio encoding systems or technologies. It will host intellectual property elements from the codec vendor of choice to perform codec specific features and processing. In addition, and depending on the workflow situation, sophisticated audio processing features such as surround upmix and loudness control may be integrated as options.
Monitoring the immersive audio content will require rendering and downmix, especially if the local speaker setup is not capable of reproducing the higher order audio formats. It is strongly recommended to also monitor (or emulate) lower order speaker setups to verify the result of rendering and downmix for home reproduction in environments with different speaker installations. The metadata controlling these processes must also be verified so that the settings are correct for optimum performance.
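To make the downmix step concrete, here is a minimal sketch of folding one 5.1 sample frame down to stereo. The -3 dB (about 0.707) coefficients for the center and surround channels are a commonly used default and are assumed here for illustration; in NGA systems the actual coefficients would normally be carried in the metadata. Leaving out the LFE channel is also an assumption of this sketch.

```python
import math

# About -3 dB, a commonly used attenuation for center and surround channels
MINUS_3_DB = 1.0 / math.sqrt(2.0)


def downmix_51_to_stereo(frame: dict,
                         center_gain: float = MINUS_3_DB,
                         surround_gain: float = MINUS_3_DB) -> tuple:
    """Fold one 5.1 sample frame (keys L, R, C, LFE, Ls, Rs) to (left, right).

    The LFE channel is ignored in this sketch.
    """
    left = frame["L"] + center_gain * frame["C"] + surround_gain * frame["Ls"]
    right = frame["R"] + center_gain * frame["C"] + surround_gain * frame["Rs"]
    return left, right


frame = {"L": 1.0, "R": 0.0, "C": 1.0, "LFE": 1.0, "Ls": 0.0, "Rs": 1.0}
left, right = downmix_51_to_stereo(frame)
```

Running a check like this against different coefficient settings is one way to emulate what a listener with a stereo-only setup would receive.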
Immersive – by the introduction of audio objects
The addition of audio objects is the key for delivering a personalized audio experience. In the case of personalized audio, certain separated audio tracks will be mixed to the final receiver audio format based on decisions made by the end user. The user might select certain objects to use and might also define the mixing ratio between the audio bed and the objects. One example of this application will be dialogue enhancement. There will also be advantages for multi-language programs from object based technology. Several commentary tracks – not just different languages but also different presenters and perspectives – can be delivered within the same audio mix. Additional descriptive audio tracks can be mixed to any possible output format. Of course just one mix can be monitored at any one time. The limits for possible gain changes available for the viewer must be set and will be part of the metadata structure. The final audio format will be determined by the channel count of the audio bed. Depending on the channel order of the audio bed, a rendering procedure and downmix will be required for lower order audio formats. If the final format isn't 3D immersive, the personalized objects will typically be mixed to the center channel and/or to the front stereo side channels.
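The idea that the viewer's gain changes are bounded by limits set in the metadata can be sketched as follows. The `ObjectMetadata` fields and the example limits are hypothetical and chosen only for illustration; they do not reproduce the metadata syntax of MPEG-H, Dolby AC-4 or DTS:X.

```python
from dataclasses import dataclass


@dataclass
class ObjectMetadata:
    """Hypothetical per-object metadata authored during production."""
    name: str
    min_gain_db: float  # lowest gain the viewer may select
    max_gain_db: float  # highest gain the viewer may select


def apply_user_gain(meta: ObjectMetadata, requested_db: float) -> float:
    """Clamp the viewer's requested gain to the authored limits."""
    return max(meta.min_gain_db, min(meta.max_gain_db, requested_db))


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


# Dialogue enhancement: the viewer asks for +12 dB, the metadata allows +9 dB
dialogue = ObjectMetadata(name="commentary_en", min_gain_db=-6.0, max_gain_db=9.0)
granted_db = apply_user_gain(dialogue, 12.0)
object_gain = db_to_linear(granted_db)
```

The receiver would then apply `object_gain` to the selected object before mixing it with the audio bed.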
Conclusion 1
It will take more time for the market for NGA formats at home to become established. The time frame will be set by codec releases from known vendors, by the technical preparation of professional production and distribution networks (content creation and delivery), and finally by support from the consumer industry in implementing codecs and offering sophisticated reproduction systems (home theater systems, 3D sound projectors, 3D binaural headphone virtualization).
But never forget: many people are still quite happy with the "easy listening" experience, with no interaction needed on their part to select from a list of available audio tracks. Another limiting factor is the practicality of receiving immersive audio for viewers globally. Just a fraction of consumers will have the chance to use the higher order audio formats in their home or when out and about using mobile devices. For the majority of countries, we should expect that just 5 to 10% of households will be prepared and capable of using real 5.1 or higher order audio formats (17% of German households had 5.1 AV home systems in 2014, according to Verband Deutscher Musikindustrie). All the others will get immersive and surround sound content just as a (rendered) stereo downmix.
Conclusion 2
One question remains. What is the real definition of immersive with the new audio formats? And who will get the most out of it? Immersive can mean very different things to different people, and it is not necessarily just a case of hearing sounds from above! Simple, well done audio recordings, in a simple format that delivers a meaningful audio experience, can be really immersive! In recent years, the quality of audio productions has not improved in terms of natural and good sounding audio content. We are living in a world where many audio programs no longer represent the dynamic range and the structure that such content should typically offer. Whilst in previous decades audio professionals did their best to overcome the technical limitations, now that we have all-digital technologies, we cannot maintain the audio quality of the content anymore! Loudness is largely solved, but as we see in many cases now, speech intelligibility is often worse than ever.
Yes, there is some audio from above and it is surrounding us, but by nature we do not focus on listening from above. So I guess the third dimension in audio cannot alone be the motivation to move to modern and new codec systems. Many common codecs in use today are from an old generation. New codecs can bring technical improvements and a higher audio quality level at lower bit rates. Object based audio (OBA) and the option to personalize delivered audio content may be more attractive for many consumers, even if they do not really improve the delivery and performance of audio programs. Three dimensional audio and object based audio will both require changes to production and delivery. Now is the time to discuss and explore how to move forward in the direction of creating a new audio experience.
www.jungeraudio.com
This article is also available to read at BFV online as part of this issue's Audio feature here, page 33.
(JP/LM)