OpenMediaLib User and Development Guide
- OpenMediaLib User and Development Guide
- Introduction
- High Level Use
- Reverse Polish Notation
- Applying RPN to Video/Audio
- Clip Modifications
- Compositing
- Playlists
- Stack Manipulations
- Advanced Stack Usage
- Aspect Ratio Considerations
- The Encoding Filter Graph
- Compositing Revisited
- Really, Really Advanced Stack Usage
- General Audio Issues
- Python
- Interpolation
- Threading
Audio Resampling
Audio resampling was introduced earlier. It is worth revisiting here, since its requirements reveal a good deal about the 'separated' graph requirement.
An important point to note: a resampler is typically implemented using band-pass filters, and band-pass filters are accumulative. Their output depends on how much information has already passed through them.
In fact, an ideal band pass filter requires an infinite amount of past and future information – a requirement that is, of course, entirely impossible to provide.
A compromise is obviously needed and in OML, the compromise is simply that the resampler requires 2 or 3 audio frames as input.
Throughout OML there is an implied contractual agreement that a fetch method from any node in the graph will generate a frame which consists of the requested image and/or audio components – in order for a node to do that, it must be able to request as many [similarly formed] frame objects from its connected inputs as it needs to satisfy the demand.
In the resampling case, a request for frame N will result in the resampler requesting 3 frames from its input – those being N – 1, N, N + 1. The audio samples from each are concatenated, the entirety is resampled and the relevant samples are extracted.
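The fetch described above can be sketched as follows. This is a minimal illustration and not OML code: `Frame`, `fetch_source`, `fetch_resampled` and `resample_linear` are hypothetical names, and linear interpolation stands in for a real band-limited resampler purely to keep the example short. Neighbouring frames that do not exist are simply skipped, which is the edge case noted at the end of this section.

```python
from typing import Callable, List, Optional

Frame = List[float]  # hypothetical: the mono audio samples of one frame


def resample_linear(samples: List[float], out_len: int) -> List[float]:
    """Stand-in resampler: linear interpolation, NOT a band-limited filter."""
    if out_len <= 0 or not samples:
        return []
    last = len(samples) - 1
    step = len(samples) / out_len
    out = []
    for i in range(out_len):
        pos = min(i * step, last)  # clamp to the final input sample
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


def fetch_resampled(fetch: Callable[[int], Optional[Frame]],
                    n: int, ratio: float) -> Frame:
    """Resample frame n using frames n - 1, n and n + 1 as context."""
    window: List[float] = []
    start = length = 0  # where frame n's samples sit inside the window
    for idx in (n - 1, n, n + 1):
        frame = fetch(idx)
        if frame is None:  # no neighbour at the edges of the source
            continue
        if idx == n:
            start, length = len(window), len(frame)
        window.extend(frame)
    if length == 0:
        raise IndexError(f"frame {n} is not available")
    # Resample the whole window, then keep only the part belonging to frame n.
    resampled = resample_linear(window, round(len(window) * ratio))
    return resampled[round(start * ratio):round((start + length) * ratio)]


# Hypothetical source: 4 frames of 4 samples each, forming a ramp 0..15.
source = [[float(4 * f + s) for s in range(4)] for f in range(4)]


def fetch_source(i: int) -> Optional[Frame]:
    return source[i] if 0 <= i < len(source) else None


if __name__ == "__main__":
    print(fetch_resampled(fetch_source, 1, 2.0))
```

Because `fetch_resampled` keeps no state between calls, asking for frame 2 directly yields the same samples as asking for it after frames 0 and 1, which is the determinism property the resampler is designed around.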
Obviously, this is not going to result in an 'ideal' resampling mechanism – but since the 'ideal' is impossible, it provides a very good deterministic approximation. Again, determinism is an important concept here – when we seek to an arbitrary point in the graph, the results and processing overhead are identical to that of sequential playout.
So, how does this relate to the perceived advantages in the separated audio filter graph?
As mooted previously, the audio graph doesn't strictly need to be computed in lockstep with the video frame rate.
As a justification, assume for a minute that we had a filter which does interlaced field extraction from a video source. If you're unfamiliar with the interlace concept, just think of it like this: an interlaced PAL video input has 25 frames per second, but interlace implies two 'fields' per frame, hence 50 fields per second. Each field provides an image, so to satisfy the implied contractual agreement, the audio would need to be redistributed such that half of each frame's samples are allocated to each field (and two new 'frames' would be generated).
Halving the number of samples in the 3 frames affects the result of the resampling: the more samples provided, the 'better' the result, where 'better' implies 'more pleasing to the ear'. By treating the two as separate graphs with different quantisation, the results improve and the number of operations is reduced. Alternatively, the number of frames fetched for resampling could be increased, so in essence it's six of one and half a dozen of the other really :-).
Note that at the 'edges' the resampler may only have 2 frames to work with (i.e. a request for frame 0 doesn't have a neighbouring left-hand frame).
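To put rough numbers on this, the sketch below counts the samples available to the resampler in each case. The 48 kHz sample rate is an assumption chosen for illustration (the text does not specify one), and `samples_per_unit` is a hypothetical helper.

```python
SAMPLE_RATE = 48_000  # assumption: 48 kHz audio, not stated in the guide


def samples_per_unit(sample_rate: int, units_per_second: int) -> int:
    """Audio samples carried by one frame (or field) at the given rate."""
    return sample_rate // units_per_second


# PAL frames: 25 per second, resampler window of 3 frames.
frame_window = 3 * samples_per_unit(SAMPLE_RATE, 25)

# Extracted fields: 50 per second, same window of 3 'frames'.
field_window = 3 * samples_per_unit(SAMPLE_RATE, 50)

# Fetching 6 fields instead of 3 restores the original context.
wide_field_window = 6 * samples_per_unit(SAMPLE_RATE, 50)

if __name__ == "__main__":
    print(frame_window, field_window, wide_field_window)
```

Under that assumption the 3-frame window halves in size once the graph runs at field rate, and doubling the number of fetched fields brings it back to the frame-rate figure, at the cost of twice as many fetches per request.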
