The Challenge of Fixing FLAC Seek

Note: I have tried to keep the language as generic as possible, so please pardon the occasional technical jargon and detail. Some GStreamer knowledge will help you follow along a little better.

Once upon a time, when I was very new at this company, I was assigned a task that had been attempted a few times before but never completed. With only that information and almost no background on the issue, I expected it to be a challenging task. However, in my position, I had no right to give up :). So I decided: “Let’s take up the challenge and live up to it”.

Here is the task: “Lossless formats like FLAC, ALAC, WAVE etc. can play, but seek functionality does not work on such streams”. That should be simple, huh? Yes, prima facie, it sounds so. But remember, I had been told that this problem had defeated every attempt so far.

I decided to take it head-on. Let’s enable seek functionality and see what error it gives. And here came the first challenge: I needed to understand how seeking differs for FLAC compared with more common formats like MP3 and AAC. Some study showed that when a FLAC file is encoded, the encoder inserts only a few seek points into the stream. To seek within the stream, we need to snap to the nearest seek point. GStreamer provides an option to do so. First target achieved easily.
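The idea of snapping to the nearest seek point can be sketched in a few lines. This is a toy model only, not GStreamer code: the `snap_to_nearest` function and the `seek_table` values are hypothetical, and seek points are assumed to be a sorted list of timestamps in seconds. In GStreamer itself this kind of behaviour is requested through seek flags such as `GST_SEEK_FLAG_SNAP_NEAREST`; the post does not say which option was actually used.

```python
from bisect import bisect_left


def snap_to_nearest(seek_points, target):
    """Return the seek point closest to target (ties go to the earlier point).

    seek_points must be sorted in ascending order.
    """
    if not seek_points:
        raise ValueError("stream has no seek points")
    i = bisect_left(seek_points, target)
    if i == 0:
        return seek_points[0]          # target is before the first point
    if i == len(seek_points):
        return seek_points[-1]         # target is after the last point
    before, after = seek_points[i - 1], seek_points[i]
    return before if target - before <= after - target else after


# Hypothetical seek table: one seek point every 10 seconds.
seek_table = [0.0, 10.0, 20.0, 30.0]
print(snap_to_nearest(seek_table, 17.0))  # 20.0
print(snap_to_nearest(seek_table, 12.5))  # 10.0
```

With only a few seek points, the position actually reached can be far from the one requested, which is why the density of seek points matters.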

To improve my chances of hitting a seek point close enough to the requested position, I wanted to increase the number of seek points. The existing GStreamer pipeline for playing all lossless formats re-encoded the data into FLAC. Simply put, the pipeline already contained a FLAC encoder. This readily allowed me to increase the number of seek points. Achieving two milestones so quickly gave me the confidence to move ahead. But who knew that the road ahead would not be as simple as the first two steps.

I was new to the company and still learning many other aspects of the application, so tasks that now seem trivial took me extra time. Context switching to higher-priority tasks added overhead. Anyway, that is how small companies work, and I was well acquainted with this mode of working, so it caused little frustration.

Fast forward a couple of fortnights. I re-created the issue and collected the GStreamer logs. Now came the biggest challenge. GStreamer can produce a huge volume of logs, but making sense of them is a skill that is very difficult to acquire. Unfortunately, I was still new to it. After many days of hard work, false leads, and investigations along the wrong lines, I started noticing some patterns. I was slowly inching towards the root cause of the issue. But the battle was far from over.

I found that the seek event was not being forwarded beyond a particular element. After a few tweaks, I tried to forcibly forward the seek event to the next element. It did not work. I hadn’t given up yet, because there were still paths left to explore.

On further investigation, I found that the seek event was not being forwarded because the audio encoder was receiving more data than it could handle. This overflow of data caused the GStreamer pipeline to hang due to mismatched timestamps. Initially this did not look like an issue, but after a lot of investigation along this thread, I confirmed that it was the problem. I did not know the solution.

I tried a few hacks to drop the data at some point in the pipeline, or to simply ignore it, but nothing worked. I was determined to solve this issue at the root with the best possible solution. Then I found one place where the seek event would flush (drop) the data by itself. WOW!! IT WORKED!!

I rushed to my mentor, now my CEO, to discuss this solution. I explained my changes to him and why they fixed the issue. But my excitement over a working solution was short-lived. He explained that even though it worked, the place where the fix went was too generic to modify. Since it was the encoder base class code, it was highly unlikely to contain such a gross bug. He asked me to find a better solution.

Phew!! Back to work!! I again started analyzing the logs to understand what exactly should happen when flush is called. This was tricky because the implementation I was looking for did not exist at all. It took me a good while to figure this out. Finally, I worked out that the FLAC encoder holds on to some data from the previous buffer, and this data is not dropped even when a flush is requested, because flush itself is not implemented in the FLAC encoder code. The best solution would be to add functionality to flush this data out of the FLAC encoder, and to invoke it when the base class is actually flushed. Since this solution avoids major modification of the encoder base class code, it is much safer to implement.
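The shape of that fix can be illustrated with a small model. Everything here is hypothetical and deliberately simplified: `BaseEncoder`, `ToyFlacEncoder`, `on_flush` and `BLOCK_SIZE` are invented names for illustration and are not GStreamer's classes or the actual patch. The sketch assumes an encoder that consumes samples in fixed-size blocks and carries any remainder over to the next call.

```python
class BaseEncoder:
    """Toy stand-in for a generic encoder base class (not GStreamer's API)."""

    def __init__(self):
        self.pending = []  # buffers queued in the base class

    def push(self, samples):
        self.pending.append(list(samples))

    def flush(self):
        # Drop what the base class holds, then let the subclass drop its own state.
        self.pending.clear()
        self.on_flush()

    def on_flush(self):
        """Hook for subclasses; does nothing by default."""


class ToyFlacEncoder(BaseEncoder):
    BLOCK_SIZE = 4  # hypothetical block size, in samples

    def __init__(self):
        super().__init__()
        self.leftover = []  # samples carried over from the previous buffer

    def encode_pending(self):
        """Return complete blocks; keep the incomplete tail for the next call."""
        samples = self.leftover + [s for buf in self.pending for s in buf]
        self.pending.clear()
        full = len(samples) - len(samples) % self.BLOCK_SIZE
        self.leftover = samples[full:]
        return [samples[i:i + self.BLOCK_SIZE]
                for i in range(0, full, self.BLOCK_SIZE)]

    def on_flush(self):
        # The missing piece in this model: discard stale pre-seek samples.
        self.leftover.clear()


enc = ToyFlacEncoder()
enc.push([1, 2, 3, 4, 5, 6])
print(enc.encode_pending())  # [[1, 2, 3, 4]]  (5 and 6 are held back)
enc.flush()                  # a seek arrives: old data must go
enc.push([10, 11, 12, 13])
print(enc.encode_pending())  # [[10, 11, 12, 13]]
```

Without the `on_flush` override, the second call would return `[[5, 6, 10, 11]]`, mixing samples from before and after the seek. The base class stays generic, and only the subclass that holds extra state learns how to discard it.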

Now it was just a matter of implementing the required functionality, testing it, and sending it for review again. There were a few minor issues in testing, which I was able to iron out quickly. I eagerly discussed this new solution with my mentor, and this time he was content.

Finally, all the hard work and patience paid off. A long-pending and much-needed feature, seeking in lossless file formats, was now available in the application.

– Nishit Jain

2019-01-28 | Categories: Blogs, Technology
