Tracks (Vocals, Guitars, Drums):
Each track represents an audio element (e.g., vocals, guitars, drums) that is part of the overall sound mix. These tracks are treated as individual audio objects.
Panning as Metadata (XYZ Coordinates):
The position of each audio object is defined by XYZ coordinates, which represent spatial metadata. These coordinates determine the object’s position in a 3D sound field, allowing the audio to move dynamically within the space.
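As a minimal sketch of the idea above, an audio object can be modeled as a track name plus three coordinates. The class name `AudioObject`, the `move_to` helper, and the value ranges are assumptions made for illustration, not part of any Dolby API. The axis meanings follow the convention used by the Dolby Atmos panner (X left-right, Y front-back, Z height).

```python
from dataclasses import dataclass


@dataclass
class AudioObject:
    """One track plus its positional metadata (hypothetical structure)."""
    name: str
    x: float = 0.0  # assumed range: -1.0 (left) to +1.0 (right)
    y: float = 0.0  # assumed range: -1.0 (back) to +1.0 (front)
    z: float = 0.0  # assumed range: 0.0 (ear level) to +1.0 (overhead)

    def move_to(self, x: float, y: float, z: float) -> None:
        """Update the position, clamping each axis to its assumed range."""
        self.x = max(-1.0, min(1.0, x))
        self.y = max(-1.0, min(1.0, y))
        self.z = max(0.0, min(1.0, z))


# Three tracks, each carrying its own position.
mix = [
    AudioObject("vocals", x=0.0, y=1.0, z=0.0),
    AudioObject("guitars", x=-0.7, y=0.5, z=0.0),
    AudioObject("drums", x=0.0, y=-0.2, z=0.3),
]

# Move the guitars to the right; the out-of-range height is clamped to 1.0.
mix[1].move_to(0.7, 0.5, 2.0)
```

The audio itself is left out here on purpose: the point is that position lives alongside the track as data, not inside the signal.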
Object Audio Renderer (OAR):
The OAR is responsible for rendering the spatial audio. It takes the audio objects and their positional metadata (XYZ coordinates) and converts them into a spatial audio experience that can be played through multiple speakers in a 3D space.
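To make the rendering step concrete, here is a rough sketch that turns one object position into a gain per speaker. It uses simple distance-based amplitude panning, which is a stand-in technique and not Dolby's actual rendering algorithm. The speaker layout, the `render_gains` function, and the `rolloff` parameter are all hypothetical.

```python
import math

# Hypothetical speaker layout: name -> (x, y, z), with x left-right,
# y front-back, z height. Not an official Dolby layout.
SPEAKERS = {
    "L": (-1.0, 1.0, 0.0),
    "R": (1.0, 1.0, 0.0),
    "Ls": (-1.0, -1.0, 0.0),
    "Rs": (1.0, -1.0, 0.0),
    "Ltop": (-1.0, 0.0, 1.0),
    "Rtop": (1.0, 0.0, 1.0),
}


def render_gains(position, speakers=SPEAKERS, rolloff=2.0):
    """Return a gain per speaker for an object at `position`.

    Closer speakers get larger gains. Gains are normalized so that
    their squares sum to 1, which keeps the total power constant.
    """
    weights = {}
    for name, speaker_pos in speakers.items():
        distance = math.dist(position, speaker_pos)
        # The small constant avoids division by zero when the object
        # sits exactly on a speaker.
        weights[name] = 1.0 / (distance ** rolloff + 1e-6)
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {name: w / norm for name, w in weights.items()}


# An object placed at the front-left speaker is played almost
# entirely by that speaker.
gains = render_gains((-1.0, 1.0, 0.0))
loudest = max(gains, key=gains.get)
```

The same object and metadata could be passed to a different `speakers` dictionary without changing the mix, which is the practical benefit of rendering at playback time.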
XYZ Coordinates (3D Space):
The XYZ axes represent the 3D space where audio objects are positioned. In the Dolby Atmos panner, X is the left-right axis, Y is the front-back axis, and Z is height (bottom to top). Together they let a sound be placed anywhere in the room, including overhead.
Objects:
Each audio element, like vocals, guitars, and drums, is treated as an independent object. This allows them to be moved freely within the sound field without being tied to a specific speaker channel.
Metadata Handling:
The metadata (panning and position) for each object can change over time. The renderer reads the current values during playback, so a sound can move smoothly through the environment around the listener.
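One way to picture time-varying metadata is as a list of timestamped positions, with the position between two timestamps found by linear interpolation. This is only an illustrative sketch: the keyframe format and the `position_at` function are assumptions, and real renderers may store and smooth metadata differently.

```python
from bisect import bisect_right


def position_at(keyframes, t):
    """Return the interpolated (x, y, z) position at time `t` in seconds.

    `keyframes` is a list of (time_seconds, (x, y, z)) sorted by time.
    Before the first keyframe or after the last one, the nearest
    keyframe's position is held.
    """
    times = [time for time, _ in keyframes]
    i = bisect_right(times, t)
    if i == 0:
        return keyframes[0][1]
    if i == len(keyframes):
        return keyframes[-1][1]
    (t0, p0), (t1, p1) = keyframes[i - 1], keyframes[i]
    alpha = (t - t0) / (t1 - t0)
    return tuple(c0 + alpha * (c1 - c0) for c0, c1 in zip(p0, p1))


# A guitar that sweeps from front-left to front-right over four seconds.
guitar_path = [
    (0.0, (-1.0, 1.0, 0.0)),
    (4.0, (1.0, 1.0, 0.0)),
]

halfway = position_at(guitar_path, 2.0)  # midpoint of the sweep
```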
Independent Object Manipulation:
By treating each sound as an object, Dolby Atmos allows for more precise and flexible sound placement compared to traditional channel-based systems, where sounds are fixed to specific speakers.
Multiple Speakers Output:
The object audio renderer sends the audio signals to multiple speakers arranged around the listener. Dolby Atmos supports a range of speaker configurations, including those with overhead (height) speakers, which add the vertical dimension to the immersive experience.
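Speaker configurations are commonly written as labels such as 5.1.2 or 7.1.4, where the three numbers count the ear-level speakers, the subwoofers, and the overhead speakers. The small helper below, a hypothetical `parse_layout` function written for illustration, splits such a label into its parts.

```python
def parse_layout(label):
    """Split a layout label such as "7.1.4" into speaker counts.

    A two-part label such as "5.1" is treated as having no overhead
    speakers.
    """
    parts = [int(part) for part in label.split(".")]
    if len(parts) == 2:
        parts.append(0)
    if len(parts) != 3:
        raise ValueError(f"unexpected layout label: {label!r}")
    ear_level, subwoofers, overhead = parts
    return {
        "ear_level": ear_level,
        "subwoofers": subwoofers,
        "overhead": overhead,
        "total": ear_level + subwoofers + overhead,
    }


home = parse_layout("7.1.4")
legacy = parse_layout("5.1")
```

Here `home["overhead"]` is 4 and `legacy["overhead"]` is 0, so the same object-based mix would be rendered with height on the first system and without it on the second.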