Tuesday, October 12, 2021

Some notes on using libswresample

FFmpeg is one of the finest pieces of open source software, a Swiss Army knife of multimedia used by virtually every media transcoder and player in the world. Yet it has no guides on how to use its API, and the examples it provides are out of date.

Out of the entire framework, libswresample is probably the most miserable part to work with. Its documentation is unclear about how resampling should be done, and because the examples are outdated, nobody can figure out the elegant way to do it.

Ironically, the elegant way, which needs no swr_get_delay() calculation, is found in FFmpeg's fork, Libav. Combining that with the modern libswresample AVFrame API, I was able to get it working.

Despite all this, libswresample interoperates with the rest of the FFmpeg framework and handles sample-format conversion, channel rematrixing, and up/downmixing, which most other libraries don't offer all at once.

Initialization

Initialization is done with swr_alloc_set_opts() or the AVOptions API. You can still change the resampler parameters after you've configured it: set them through the AVOptions API, then call swr_init() again.

There's a long list of granular options you can pass. If you're a DSP nerd and a performance freak and you want to configure the shit out of it, do it through the AVOptions API, and if you're insane you can explore the internal API.
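// headers needed by the snippets in this post (put these at the top of your file)
#include <libavutil/channel_layout.h>
#include <libavutil/frame.h>
#include <libavutil/opt.h>
#include <libswresample/swresample.h>

int ret;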
SwrContext *swr = swr_alloc_set_opts(NULL,  // we're allocating a new context
                      AV_CH_LAYOUT_STEREO,  // out_ch_layout
                      AV_SAMPLE_FMT_S16,    // out_sample_fmt
                      44100,                // out_sample_rate
                      AV_CH_LAYOUT_MONO,    // in_ch_layout
                      AV_SAMPLE_FMT_S16P,   // in_sample_fmt
                      48000,                // in_sample_rate
                      0,                    // log_offset [ignore these,
                      NULL);                // log_ctx     we don't need them]

// this post uses a return value of -1 to signal an error
if(swr == NULL)   // allocation failed
  return -1;

ret = swr_init(swr);

if(ret != 0)
  return -1;

AVFrame* out = av_frame_alloc();

out->format = AV_SAMPLE_FMT_S16;
out->channel_layout = AV_CH_LAYOUT_STEREO;
out->sample_rate = 44100;
out->nb_samples = 666;
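
The context above takes planar mono input (AV_SAMPLE_FMT_S16P) and produces packed stereo output (AV_SAMPLE_FMT_S16). In a planar format every channel lives in its own buffer, while in a packed format the samples of all channels alternate in a single buffer. Here is a minimal standalone sketch of what the packed layout looks like. It uses no FFmpeg at all, the function name is made up for illustration, and it does no resampling or gain scaling, so don't take it as what libswresample computes internally.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration only, not a libswresample function:
 * copy one mono plane into a packed (interleaved) stereo buffer
 * laid out as L0 R0 L1 R1 ...
 * `stereo` must have room for 2 * nb_samples values. */
static void mono_plane_to_packed_stereo(const int16_t *mono, int16_t *stereo,
                                        size_t nb_samples)
{
    for (size_t i = 0; i < nb_samples; i++) {
        stereo[2 * i]     = mono[i];  /* left channel  */
        stereo[2 * i + 1] = mono[i];  /* right channel */
    }
}
```

With a single channel the planar and packed layouts are byte-for-byte the same, so the difference only starts to matter on the stereo output side.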

Conversion

This assumes you already get AVFrames from your decoder or from anywhere else. You pass each frame to the resampler, which puts the resampled data in a FIFO buffer that you can read from at any time. Usually you keep feeding data in and taking data out; otherwise the FIFO buffer will eat up your precious memory.

It's a good idea to buffer it, but you can also retrieve it all at once as shown here.

You need to keep calling this piece of code over and over, so put it in a loop, obviously.
// once your input supply ends, stop feeding it data.
{
  ret = swr_convert_frame(swr, NULL, in);

  // if in a rare circumstance the audio config changes, reconfigure then convert.
  // you can ignore this if you can assure that your input never changes.
  if(ret == AVERROR_INPUT_CHANGED) {
    av_opt_set_channel_layout(swr, "icl", in->channel_layout, 0);
    av_opt_set_sample_fmt(swr, "isf", in->format, 0);
    av_opt_set_int(swr, "isr", in->sample_rate, 0);

    // changed options only take effect after re-initializing the context
    // (this may drop samples still buffered inside it)
    ret = swr_init(swr);
    if(ret == 0)
      ret = swr_convert_frame(swr, NULL, in);
  }

  if(ret != 0)
    return -1;
}

// keep taking data out until it is exhausted.
ret = swr_convert_frame(swr, out, NULL);
if(ret != 0)
  return -1;
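
To get a feel for how much data piles up in the FIFO per input frame, you can estimate the output sample count from the two sample rates. The sketch below is plain C with no FFmpeg dependency, and estimate_out_samples is a hypothetical helper, not a libswresample function. It does the same kind of rounded-up rescale that the older FFmpeg examples do by hand with av_rescale_rnd() and swr_get_delay(), which is exactly the bookkeeping swr_convert_frame() lets you skip.

```c
#include <stdint.h>

/* Hypothetical helper, not part of libswresample: upper bound on the number
 * of output samples produced by `in_samples` new input samples plus `delay`
 * samples still buffered in the resampler (both counted at the input rate). */
static int64_t estimate_out_samples(int64_t in_samples, int64_t delay,
                                    int in_rate, int out_rate)
{
    int64_t total = in_samples + delay;
    /* integer version of ceil(total * out_rate / in_rate) */
    return (total * out_rate + in_rate - 1) / in_rate;
}
```

For the 48000 Hz to 44100 Hz setup used in this post, estimate_out_samples(1024, 0, 48000, 44100) gives 941, so a 1024-sample input frame turns into at most 941 output samples.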

Packup

After all your work is done, call swr_free(&swr) and av_frame_free(&out) to make sure nothing leaks.

It may all sound simple and straightforward, but believe me, reaching this conclusion took me many hours of browsing examples, testing, and debugging. It's a shame that the developers never bothered to properly document such a widely used library-cum-program.

I hope you now don't have to go through the pain I did '='
