author | Takashi Iwai <tiwai@suse.de> | 2012-01-12 03:59:14 -0500 |
---|---|---|
committer | Takashi Iwai <tiwai@suse.de> | 2012-01-12 03:59:14 -0500 |
commit | 627b79628f56c3deeb17dec1edf6899b49552fa4 (patch) | |
tree | deac8b2cce5d70708fa944a270ee031f069226d8 /Documentation/sound | |
parent | 29abceb67f8a230da806db4ed73899595bd2ae76 (diff) | |
parent | 8c3f5d8a9b7d0d8506bc2a0525e012eae02b1853 (diff) |
Merge branch 'topic/misc' into for-linus
Diffstat (limited to 'Documentation/sound')
-rw-r--r-- | Documentation/sound/alsa/compress_offload.txt | 188 |
1 files changed, 188 insertions, 0 deletions
diff --git a/Documentation/sound/alsa/compress_offload.txt b/Documentation/sound/alsa/compress_offload.txt
new file mode 100644
index 000000000000..c83a835350f0
--- /dev/null
+++ b/Documentation/sound/alsa/compress_offload.txt
@@ -0,0 +1,188 @@ | |||
1 | compress_offload.txt | ||
2 | ===================== | ||
3 | Pierre-Louis.Bossart <pierre-louis.bossart@linux.intel.com> | ||
4 | Vinod Koul <vinod.koul@linux.intel.com> | ||
5 | |||
6 | Overview | ||
7 | |||
8 | Since its early days, the ALSA API was defined with PCM support or | ||
9 | constant-bitrate payloads such as IEC61937 in mind. Arguments and | ||
10 | returned values in frames are the norm, making it a challenge to | ||
11 | extend the existing API to compressed data streams. | ||
12 | |||
13 | In recent years, audio digital signal processors (DSP) were integrated | ||
14 | in system-on-chip designs, and DSPs are also integrated in audio | ||
15 | codecs. Processing compressed data on such DSPs results in a dramatic | ||
16 | reduction of power consumption compared to host-based | ||
17 | processing. Support for such hardware has not been very good in Linux, | ||
18 | mostly because of a lack of a generic API available in the mainline | ||
19 | kernel. | ||
20 | |||
21 | Rather than requiring a compatibility break with an API change of the | ||
22 | ALSA PCM interface, a new 'Compressed Data' API is introduced to | ||
23 | provide a control and data-streaming interface for audio DSPs. | ||
24 | |||
25 | The design of this API was inspired by the 2-year experience with the | ||
26 | Intel Moorestown SOC, with many corrections required to upstream the | ||
27 | API in the mainline kernel instead of the staging tree and make it | ||
28 | usable by others. | ||
29 | |||
30 | Requirements | ||
31 | |||
32 | The main requirements are: | ||
33 | |||
34 | - Separation between byte counts and time. Compressed formats may have | ||
35 | a header per file, per frame, or no header at all. The payload size | ||
36 | may vary from frame to frame. As a result, it is not possible to | ||
37 | reliably estimate the duration of audio buffers when handling | ||
38 | compressed data. Dedicated mechanisms are required to allow for | ||
39 | reliable audio-video synchronization, which requires precise | ||
40 | reporting of the number of samples rendered at any given time. | ||
41 | |||
42 | - Handling of multiple formats. PCM data only requires a specification | ||
43 | of the sampling rate, number of channels and bits per sample. In | ||
44 | contrast, compressed data comes in a variety of formats. Audio DSPs | ||
45 | may also provide support for a limited number of audio encoders and | ||
46 | decoders embedded in firmware, or may support more choices through | ||
47 | dynamic download of libraries. | ||
48 | |||
49 | - Focus on main formats. This API provides support for the most | ||
50 | popular formats used for audio and video capture and playback. It is | ||
51 | likely that as audio compression technology advances, new formats | ||
52 | will be added. | ||
53 | |||
54 | - Handling of multiple configurations. Even for a given format like | ||
55 | AAC, some implementations may support multichannel AAC but only stereo | ||
56 | HE-AAC. Likewise, WMA10 level M3 may require too much memory and CPU | ||
57 | cycles. The new API needs to provide a generic way of listing these | ||
58 | formats. | ||
59 | |||
60 | - Rendering/Grabbing only. This API does not provide any means of | ||
61 | hardware acceleration, where PCM samples are provided back to | ||
62 | user-space for additional processing. This API focuses instead on | ||
63 | streaming compressed data to a DSP, with the assumption that the | ||
64 | decoded samples are routed to a physical output or logical back-end. | ||
65 | |||
66 | - Complexity hiding. Existing user-space multimedia frameworks all | ||
67 | have existing enums/structures for each compressed format. This new | ||
68 | API assumes the existence of a platform-specific compatibility layer | ||
69 | to expose, translate and make use of the capabilities of the audio | ||
70 | DSP, e.g. Android HAL or PulseAudio sinks. By construction, regular | ||
71 | applications are not supposed to make use of this API. | ||
72 | |||
73 | |||
74 | Design | ||
75 | |||
76 | The new API shares a number of concepts with the PCM API for flow | ||
77 | control. Start, pause, resume, drain and stop commands have the same | ||
78 | semantics no matter what the content is. | ||
79 | |||
80 | The concept of a memory ring buffer divided into a set of fragments is | ||
81 | borrowed from the ALSA PCM API. However, only sizes in bytes can be | ||
82 | specified. | ||
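The byte-only bookkeeping described above can be sketched as follows. This is a minimal illustration, not the kernel's implementation: the `byte_ring` structure and the `ring_*` helpers are hypothetical names introduced here, and the only assumption taken from the text is that the ring is a set of equally sized fragments counted in bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping for a ring buffer sized purely in bytes. */
struct byte_ring {
	size_t fragment_size;    /* bytes per fragment */
	size_t fragments;        /* number of fragments in the ring */
	uint64_t bytes_written;  /* total bytes committed by the application */
	uint64_t bytes_consumed; /* total bytes taken out by the DSP side */
};

/* Total ring size in bytes. */
static size_t ring_capacity(const struct byte_ring *r)
{
	return r->fragment_size * r->fragments;
}

/* Free space in bytes that the application may still fill. */
static size_t ring_avail(const struct byte_ring *r)
{
	assert(r->bytes_written >= r->bytes_consumed);
	return ring_capacity(r) -
	       (size_t)(r->bytes_written - r->bytes_consumed);
}

/* Commit up to len bytes and return how many were accepted. */
static size_t ring_commit(struct byte_ring *r, size_t len)
{
	size_t avail = ring_avail(r);
	size_t n = len < avail ? len : avail;

	r->bytes_written += n;
	return n;
}

/* Byte offset inside the ring where the next write would land. */
static size_t ring_write_offset(const struct byte_ring *r)
{
	return (size_t)(r->bytes_written % ring_capacity(r));
}
```

Nothing in this sketch refers to frames or durations. With compressed data, the number of bytes in the ring says nothing reliable about how much playback time it holds.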
83 | |||
84 | Seeks/trick modes are assumed to be handled by the host. | ||
85 | |||
86 | The notion of rewinds/forwards is not supported. Data committed to the | ||
87 | ring buffer cannot be invalidated, except when dropping all buffers. | ||
88 | |||
89 | The Compressed Data API does not make any assumptions on how the data | ||
90 | is transmitted to the audio DSP. DMA transfers from main memory to an | ||
91 | embedded audio cluster or to an SPI interface for external DSPs are | ||
92 | possible. As in the ALSA PCM case, a core set of routines is exposed; | ||
93 | each driver implementer will have to write support for a set of | ||
94 | mandatory routines and possibly make use of optional ones. | ||
95 | |||
96 | The main additions are: | ||
97 | |||
98 | - get_caps | ||
99 | This routine returns the list of audio formats supported. Querying the | ||
100 | codecs on a capture stream will return encoders; decoders will be | ||
101 | listed for playback streams. | ||
102 | |||
103 | - get_codec_caps | ||
104 | For each codec, this routine returns a list of capabilities. The intent | ||
105 | is to make sure all the capabilities correspond to valid settings, and | ||
106 | to minimize the risks of configuration failures. For example, for a | ||
107 | complex codec such as AAC, the number of channels supported may depend | ||
108 | on a specific profile. If the capabilities were exposed with a single | ||
109 | descriptor, it may happen that a specific combination of | ||
110 | profiles/channels/formats is not supported. Likewise, embedded DSPs | ||
111 | have limited memory and CPU cycles, so it is likely that some | ||
112 | implementations make the list of capabilities dynamic and dependent on | ||
113 | existing workloads. In addition to codec settings, this routine returns | ||
114 | the minimum buffer size handled by the implementation. This information | ||
115 | can be a function of the DMA buffer sizes, the number of bytes required | ||
116 | to synchronize, etc., and can be used by userspace to define how much | ||
117 | needs to be written in the ring buffer before playback can start. | ||
118 | |||
119 | - set_params | ||
120 | This routine sets the configuration chosen for a specific codec. The | ||
121 | most important field in the parameters is the codec type; in most | ||
122 | cases decoders will ignore other fields, while encoders will strictly | ||
123 | comply with the settings. | ||
124 | |||
125 | - get_params | ||
126 | This routine returns the actual settings used by the DSP. Changes to | ||
127 | the settings should remain the exception. | ||
128 | |||
129 | - get_timestamp | ||
130 | The timestamp becomes a multiple-field structure. It lists the number | ||
131 | of bytes transferred, the number of samples processed and the number | ||
132 | of samples rendered/grabbed. All these values can be used to determine | ||
133 | the average bitrate, figure out if the ring buffer needs to be | ||
134 | refilled, or estimate the delay due to decoding/encoding/IO on the DSP. | ||
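As a rough illustration of how the three timestamp counters might be combined, the sketch below derives an average bitrate, a playback position and a DSP-side delay. The `offload_tstamp` structure and its field names are hypothetical and do not reproduce the kernel's definition. The sampling rate is passed in separately, on the assumption that it is known from the stream parameters.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical timestamp with the three counters listed in the text. */
struct offload_tstamp {
	uint64_t bytes_transferred; /* compressed bytes handed to the DSP */
	uint64_t samples_processed; /* samples decoded so far */
	uint64_t samples_rendered;  /* samples actually played out */
};

/* Estimated average bitrate in bits per second; 0 if nothing is decoded. */
static uint64_t avg_bitrate_bps(const struct offload_tstamp *t,
				uint32_t rate_hz)
{
	if (t->samples_processed == 0)
		return 0;
	return t->bytes_transferred * 8 * rate_hz / t->samples_processed;
}

/* Playback position in milliseconds, usable for audio-video sync. */
static uint64_t rendered_ms(const struct offload_tstamp *t, uint32_t rate_hz)
{
	assert(rate_hz > 0);
	return t->samples_rendered * 1000 / rate_hz;
}

/* Delay in milliseconds between decoding and rendering on the DSP. */
static uint64_t dsp_delay_ms(const struct offload_tstamp *t, uint32_t rate_hz)
{
	assert(rate_hz > 0);
	assert(t->samples_processed >= t->samples_rendered);
	return (t->samples_processed - t->samples_rendered) * 1000 / rate_hz;
}
```

The bitrate is only an estimate, since bytes handed to the DSP may not all have been decoded yet. The playback position relies on the rendered sample count alone, which is why the byte and sample counters are reported separately.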
135 | |||
136 | Note that the list of codecs/profiles/modes was derived from the | ||
137 | OpenMAX AL specification instead of reinventing the wheel. | ||
138 | Modifications include: | ||
139 | - Addition of FLAC and IEC formats | ||
140 | - Merge of encoder/decoder capabilities | ||
141 | - Profiles/modes listed as bitmasks to make descriptors more compact | ||
142 | - Addition of set_params for decoders (missing in OpenMAX AL) | ||
143 | - Addition of AMR/AMR-WB encoding modes (missing in OpenMAX AL) | ||
144 | - Addition of format information for WMA | ||
145 | - Addition of encoding options when required (derived from OpenMAX IL) | ||
146 | - Addition of rateControlSupported (missing in OpenMAX AL) | ||
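To illustrate why capabilities are reported as several descriptors with profiles packed into bitmasks, here is a small sketch. The `codec_descriptor` structure, the `PROFILE_*` bits and `config_supported()` are hypothetical names chosen for this example. They do not come from the API headers or from OpenMAX AL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical profile bits; one bit per profile keeps descriptors compact. */
#define PROFILE_AAC_LC (1u << 0)
#define PROFILE_HE_AAC (1u << 1)

/* Hypothetical descriptor: one valid combination of channels and profiles. */
struct codec_descriptor {
	uint32_t max_channels; /* highest channel count for these profiles */
	uint32_t profiles;     /* bitmask of PROFILE_* values */
};

/* Return true if any descriptor allows this profile with this channel count. */
static bool config_supported(const struct codec_descriptor *d, size_t n,
			     uint32_t profile, uint32_t channels)
{
	size_t i;

	/* Exactly one profile bit must be requested. */
	assert(profile != 0 && (profile & (profile - 1)) == 0);

	for (i = 0; i < n; i++) {
		if ((d[i].profiles & profile) &&
		    channels >= 1 && channels <= d[i].max_channels)
			return true;
	}
	return false;
}
```

With two descriptors, one allowing `PROFILE_AAC_LC` up to 6 channels and one allowing `PROFILE_HE_AAC` up to 2 channels, a request for 6-channel HE-AAC is rejected. A single merged descriptor would have wrongly advertised it as valid.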
147 | |||
148 | Not supported: | ||
149 | |||
150 | - Support for VoIP/circuit-switched calls is not the target of this | ||
151 | API. Support for dynamic bit-rate changes would require a tight | ||
152 | coupling between the DSP and the host stack, limiting power savings. | ||
153 | |||
154 | - Packet-loss concealment is not supported. This would require an | ||
155 | additional interface to let the decoder synthesize data when frames | ||
156 | are lost during transmission. This may be added in the future. | ||
157 | |||
158 | - Volume control/routing is not handled by this API. Devices exposing a | ||
159 | compressed data interface will be considered as regular ALSA devices; | ||
160 | volume changes and routing information will be provided with regular | ||
161 | ALSA kcontrols. | ||
162 | |||
163 | - Embedded audio effects. Such effects should be enabled in the same | ||
164 | manner, no matter if the input was PCM or compressed. | ||
165 | |||
166 | - Multichannel IEC encoding. Unclear if this is required. | ||
167 | |||
168 | - Encoding/decoding acceleration is not supported as mentioned | ||
169 | above. It is possible to route the output of a decoder to a capture | ||
170 | stream, or even implement transcoding capabilities. This routing | ||
171 | would be enabled with ALSA kcontrols. | ||
172 | |||
173 | - Audio policy/resource management. This API does not provide any | ||
174 | hooks to query the utilization of the audio DSP, nor any preemption | ||
175 | mechanisms. | ||
176 | |||
177 | - No notion of underrun/overrun. Since the bytes written are compressed | ||
178 | in nature and data written/read doesn't translate directly to | ||
179 | rendered output in time, this API does not deal with underrun/overrun; | ||
180 | these may be dealt with in a user library. | ||
181 | |||
182 | Credits: | ||
183 | - Mark Brown and Liam Girdwood for discussions on the need for this API | ||
184 | - Harsha Priya for her work on intel_sst compressed API | ||
185 | - Rakesh Ughreja for valuable feedback | ||
186 | - Sing Nallasellan, Sikkandar Madar and Prasanna Samaga for | ||
187 | demonstrating and quantifying the benefits of audio offload on a | ||
188 | real platform. | ||