
Clarification: Wasted bits vs "bits that could be wasted", about ietf-wg-cellar/flac-specification

Comments (15)

ktmf01 commented on August 13, 2024

I do not agree with the notion that 'wasted bits' as they exist in WAV or AIFF should not be called wasted but wasteable. The current doc says this quite clearly:

In this specification, these least-significant zero bits are referred to as wasted bits-per-sample or simply wasted bits. They are wasted in a sense that they contain no information, but are stored anyway.

I don't see how those bits, as they are in WAV or AIFF, could be called waste-able. They are wasted bits: they exist but are not used = wasted.

FLAC can signal these bits and not store them. I agree it isn't very clear this is purely optional. Maybe it is a good idea to augment the following bit of text

The wasted bits-per-sample flag in a subframe header is set to 1 if such wasted bits are present in that subframe. If this is the case, the number of wasted bits-per-sample (k) minus 1 follows the flag in an unary encoding. For example, if k is 3, 0b001 follows. If k = 0, the wasted bits-per-sample flag is 0 and no unary coded k follows.

to something like

The FLAC format can take advantage of these wasted bits by signalling their presence and coding the subframe without them. The wasted bits-per-sample flag in a subframe header is set to 1 if a subframe is encoded ignoring a certain number of wasted bits. If this is the case, the number of wasted bits-per-sample (k) minus 1 follows the flag in an unary encoding. For example, if k is 3, 0b001 follows. If k = 0, the wasted bits-per-sample flag is 0 and no unary coded k follows.
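To make the flag-plus-unary layout concrete, here is a minimal sketch. It models the bit stream as a string of "0"/"1" characters, and the function names are hypothetical, not taken from any FLAC implementation. It assumes the unary code is a run of zero bits terminated by a one bit, which matches the "if k is 3, 0b001 follows" example.

```python
def encode_wasted_bits(k: int) -> str:
    """Return the wasted-bits field of a subframe header as a bit string."""
    if k < 0:
        raise ValueError("k must be non-negative")
    if k == 0:
        return "0"  # flag cleared, no unary code follows
    # Flag set, then k - 1 in unary: (k - 1) zero bits and a terminating one.
    return "1" + "0" * (k - 1) + "1"


def decode_wasted_bits(bits: str) -> tuple[int, int]:
    """Return (k, number of bits consumed) read from the start of `bits`."""
    if bits[0] == "0":
        return 0, 1
    zeros = 0
    while bits[1 + zeros] == "0":
        zeros += 1
    # One flag bit, `zeros` zero bits, one terminating one bit.
    return zeros + 1, zeros + 2


print(encode_wasted_bits(3))  # flag "1" followed by "001"
```

A decoder would then read the subframe at the frame's bit depth minus k and shift each decoded sample left by k.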

I do agree with the following:

First: it must be possible to avoid the "wasted" phrase elsewhere, as "bits will be wasted on frame headers" in 4.1. Like, for example the following sentence: "If the block size is too small, leading to a large number of frames, an excessive number of bytes will be spent on frame headers"?

from flac-specification.

H2Swine commented on August 13, 2024

I do agree that your suggestion is an improvement, and possibly it is enough. But "exist but are not used = wasted" means you double-use terminology. Possibly even quadruple-use it:

I don't see how those bits, as they are in WAV or AIFF, could be called waste-able. They are wasted bits: they exist but are not used = wasted.

But those are not encoded as wasted bits, if you store it in a 14 bit FLAC file? Problem is, we are kinda stuck with using "wasted bits" for that particular flag and the subsequent unary-coded number. If the source wastes bits in different ways - say through a stupidly big JUNK chunk - that is then something else.

To "clarify the confusion", let me give a few more examples - all assumed mono not to invoke any mid/side subframe questions, and all FLAC encoding without --keep-foreign-metadata so that the encoded file has no knowledge of the source file format and how it stores the signal.

Let's say a signal is 12 bits in the sense that any bit beyond the 12th is zero.

If it is stored in a 12-bit FLAC file, does it then have any wasted bits?
No?
If it were a 12-bit WAVE file (specified as 12, but spending 2 bytes per sample), would it then have wasted bits?
If it were that 14 bit WAVE/AIFF file of 9.2.2, we would agree it had wasted bits. But how many and which ones? If you want to store it in a 14 bit FLAC file, all frames referring bit depth to streaminfo, then it is fair to say that bits 13 and 14 are "wasted" in the source file. Bits 15 and 16 are ... well, for the purpose of the FLAC specification we have just spent the word "wasted" and should maybe call them something else?
If you were encoding that 14 bit FLAC file, would it then have wasted bits?

When you have a 16 bit WAVE container storing a 14 bit signal of which 12 are potentially nonzero, but the FLAC file stores it as 14 bit with wasted-bits-per-sample (k) being 1, what is then the "wasted bits" as per the specification's definition?
I'd say that number is k = 1 ... the number of bits "reclaimed" by that particular method. Bits 15 and 16 are surely reclaimed too, but by STREAMINFO and not by this method.


ktmf01 commented on August 13, 2024

Wasted bits are a property of the uncompressed digital audio. One cannot tell from a single sample; it is a property of a group or block of samples. In FLAC, uncompressed digital audio is generally only used as a last resort. So, to me, the only way a FLAC file can have wasted bits is if samples are stored uncompressed (verbatim) and the wasted bits flag in a subframe is not used.

FLAC subframes have a wasted bits per subframe flag, followed by the number of wasted bits in the block corresponding to that subframe. In my view, this is no different from a frame having a sample rate, a bit depth or a number of channels: it is of course the audio that has these properties, not the encoded data itself.

So, the data in a FLAC subframe does not have wasted bits, much like it does not have a bit depth or a sample rate. However, the audio encoded in that frame does. Like bit depth, wasted bits have an effect on how data is encoded, but it is not a property of that subframe.
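As an illustration of "a property of a block, not of a single sample", an encoder could count the zero least-significant bits shared by every sample in the block. This is only a sketch: the function name is hypothetical, and returning 0 for an all-zero block is an assumption made here, not something stated in this thread.

```python
def wasted_bits(samples: list[int], bit_depth: int) -> int:
    """Count the zero least-significant bits common to all samples in a block."""
    mask = (1 << bit_depth) - 1
    combined = 0
    for s in samples:
        combined |= s & mask  # view each sample as bit_depth-wide two's complement
    if combined == 0:
        return 0  # assumption: an all-zero block is treated as having no wasted bits
    # Isolate the lowest set bit; its position is the number of trailing zeros.
    return (combined & -combined).bit_length() - 1


print(wasted_bits([8], 16))     # one sample alone suggests 3
print(wasted_bits([8, 4], 16))  # but the block as a whole only shares 2
```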

If it is stored in a 12-bit FLAC file, does it then have any wasted bits?
No?

You must first define what you mean by "a FLAC file having wasted bits". If the digital audio stored in it consistently uses the LSBs of those 12 bits to convey information (i.e. they are non-zero for long stretches of samples) then no subframe in that file can signal a non-zero number of wasted bits. Is that what you mean?

If it were a 12-bit WAVE file (specified as 12, but spending 2 bytes per sample), would it then have wasted bits?

Depends on how the samples are aligned. In the most common case, compatible with most devices, the file will have 4 wasted bits throughout. In case the samples are aligned with the LSB, most players will output a rather quiet rendition.

If it were that 14 bit WAVE/AIFF file of 9.2.2, we would agree it had wasted bits. But how many and which ones? If you want to store it in a 14 bit FLAC file, all frames referring bit depth to streaminfo, then it is fair to say that bits 13 and 14 are "wasted" in the source file. Bits 15 and 16 are ... well, for the purpose of the FLAC specification we have just spent the word "wasted" and should maybe call them something else?

Like the previous case, it could have 2 wasted bits or 4, depending on alignment. FLAC subframes might or might not signal that.

If you were encoding that 14 bit FLAC file, would it then have wasted bits?

You'll have to define what wasted bits are in a FLAC file. If the samples are stored uncompressed (verbatim), you could say the FLAC file has wasted bits, but I don't see the relevance. Wasted bits are only interesting in a file when it is input to a compressor, not output.


ktmf01 commented on August 13, 2024

Please review #215. I think it should provide the distinction you proposed.


H2Swine commented on August 13, 2024

Got back from a long weekend and see that I am late to the party. I think the wording "sounds" weird, but English is not my native tongue and it is more important that it is clearer, and I think it is. But does it still have that issue about 14 bit AIFF in the introductory explanation? I would argue that it isn't a good way to exemplify or explain the FLAC feature, as it is rather a feature of the source container (which is "kinda irrelevant") than a feature of the signal.


ktmf01 commented on August 13, 2024

and see that I am late to the party.

There's no rush.

But does it still have that issue about 14 bit AIFF in the introductory explanation?

I agree that AIFF may not be the best example. The reason I chose AIFF is that WAV does not explicitly define this kind of padding as far as I know, whereas AIFF specifically states the number of valid bits. Perhaps this is better:

Most uncompressed audio file formats can only store audio samples with a bit depth that is an integer number of bytes. Samples of which the bit depth is not an integer number of bytes are usually stored in such formats by padding them with least significant zero bits to a bit depth that is an integer number of bytes. For example, shifting a 14-bit sample left by 2 pads it to a 16-bit sample, which then has two zero least-significant bits. In this specification, these least-significant zero bits are referred to as wasted bits-per-sample or simply wasted bits. They are wasted in a sense that they contain no information, but are stored anyway.
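A small sketch of that padding, for readers who want to see the bits. Padding with least-significant zero bits corresponds to a left shift by the difference in bit depth. The helper name is hypothetical.

```python
def pad_sample(sample: int, source_depth: int, container_depth: int) -> int:
    """Pad a signed sample with zero LSBs so it fills a wider container."""
    if container_depth < source_depth:
        raise ValueError("container must be at least as wide as the source")
    return sample << (container_depth - source_depth)


padded = pad_sample(0x1FFF, 14, 16)  # largest positive 14-bit sample
print(hex(padded))    # 0x7ffc
print(padded & 0b11)  # 0: the two least-significant bits carry no information
```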


H2Swine commented on August 13, 2024

My issue with the example is not that it is AIFF, but whether this is at all "wasted bits behaviour". By that I mean wasted bits-per-sample "in the FLAC format sense" that the subsection is all about, not in the sense that there are bits that encode no information.

If FLAC saves a 14-bit noisy signal as a 14-bit FLAC file, it does not utilize the wasted bits feature: the wasted bits flag is off. The fact that some other format has to spend two bytes to encode the data - "achieving a compression ratio of 1.143" - is kinda irrelevant to the FLAC format, and especially to this particular way to encode, which is the functionality described in that particular subsection.
On the other hand, if someone opens a CDDA signal in some editor and saves it as a 24 bit WAVE/AIFF file with 3 bytes per sample - no dithering no nothing - then FLAC can store it as a 24 bit signal, taking advantage of the waste using this particular format feature.
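The two scenarios can be contrasted with a short sketch. The sample data is synthetic and chosen only so that the result is deterministic, and the helper name is hypothetical.

```python
def trailing_zero_bits(samples: list[int]) -> int:
    """Zero least-significant bits shared by all samples (0 for an all-zero block)."""
    combined = 0
    for s in samples:
        combined |= s
    return (combined & -combined).bit_length() - 1 if combined else 0


# Scenario 1: a 14-bit signal that uses all of its bits, stored as 14-bit FLAC.
signal_14 = list(range(-8192, 8192, 37))
k_14 = trailing_zero_bits(signal_14)  # 0: the wasted bits flag stays off

# Scenario 2: a 16-bit (CDDA-like) signal saved in a 24-bit container.
signal_16 = list(range(-32768, 32768, 101))
in_24_bit_container = [s << 8 for s in signal_16]
k_24 = trailing_zero_bits(in_24_bit_container)  # 8 wasted bits per sample

print(k_14, k_24, 24 - k_24)  # subframes in scenario 2 code 24 - 8 = 16 bits
print(round(16 / 14, 3))      # the 1.143 ratio of a 16-bit container to 14 bits
```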

(Gut feeling says I would rather use the term "utilize 3 wasted bits" than "use 3 wasted bits". Maybe also "employ" is better but nah, you aren't putting them to work. Surely "exploit" could convey the meaning hadn't that been another instance of a phrase that has already been taken.)


H2Swine commented on August 13, 2024

Separate question, so separate comment - and I should probably have fine-read the specification better.

Suppose I have a 14 bit signal (mono again), I want to store the fact that it is 14 bits in order to decode it into an AIFF file as above, but I want to maintain compatibility the Appendix C6 way.
Being a little bit creative, I write 14 bits to STREAMINFO, to signify that it should be output as a 14 bit AIFF file; then I encode every subframe as 16 bits with two wasted bits. Stream is subset, anything that picks up mid-stream (including, the very first audio frame!) will pass 16 bits down the playback chain, and everyone is happy ... or?

Question that then shows up, how would a decoder interpret such a file?

The way I intended, as reflected above?
As "starting out as a 14 bit signal, but during the stream it changes to 16"?
As the previous suggestion except that no, it cannot change before the seventeenth sample as per the format's minimum sample count - let's err out?
Err out for other reasons?

And ... is the specification clear enough that it should? Appendix C6 suggests to pad samples, it says nothing about setting STREAMINFO.

Edit: Oh, and while reading 8.2 I found two instances of the string "lesser than". Correct is "less than".


ktmf01 commented on August 13, 2024

The fact that some other format has to spend two bytes to encode the data - "achieving a compression ratio of 1.143" - is kinda irrelevant to the FLAC format, and especially to this particular way to encode, which is the functionality described in that particular subsection.

It seems to me that without this context, it will be rather hard for a lot of people not familiar with the limitations of AIFF/WAV to understand why a feature like this makes sense and what its purpose is. Sure, why the feature exists is irrelevant to the format itself, but I think it is helpful for understanding what is happening here.

On the other hand, if someone opens a CDDA signal in some editor and saves it as a 24 bit WAVE/AIFF file with 3 bytes per sample - no dithering no nothing - then FLAC can store it as a 24 bit signal, taking advantage of the waste using this particular format feature.

Sure, that is another reason why this feature is useful. Still, I think the "context" as the document contains makes more sense from a technical perspective.

Being a little bit creative, I write 14 bits to STREAMINFO, to signify that it should be output as a 14 bit AIFF file; then I encode every subframe as 16 bits with two wasted bits

Seems unnecessarily complicated to me. Sure, could be. There is nothing specific in the document explaining that the numbers have to be the exact same, but the section on streaminfo says it contains information about the stream, not about the source. I think it is quite a stretch to say streaminfo can describe the source (14 bps) instead of the stream (16 bps).

As "starting out as a 14 bit signal, but during the stream it changes to 16"?

I'm pretty sure ffmpeg does this. As far as I know it refuses to decode anything in which the bps changes. libFLAC is rather lenient in this regard, as the user of libFLAC has to take care of it. flac throws an error.


H2Swine commented on August 13, 2024

Not unlikely I have gotten everything on its head here, but:

the 14 bit source file is precisely where you wouldn't use the functionality, and at the very least it has confused me. A "better" context would be where the playback chain flat out refuses the signal if it isn't "altered by right-padding". Here is one such example: https://source.android.com/docs/core/audio/usb#hostAudio .
As for the "unnecessarily complicated" ... I think this is the way to make it Subset?
Final bullet item: well what does the spec now say? Is it unique how to interpret it? In one way, it is desirable that "this FLAC bit stream should be decoded to this and nothing else", but viewed the other way: "this is an oddball case where you might not know what WAVE headers should be, but sure any implementation that will play it, will sound the same" is kinda a neat compromise.


ktmf01 commented on August 13, 2024

the 14 bit source file is precisely where you wouldn't use the functionality, and at the very least it has confused me. A "better" context would be where the playback chain flat out refuses the signal if it isn't "altered by right-padding". Here is one such example: https://source.android.com/docs/core/audio/usb#hostAudio .

With the proposed change of May 9th, I'm no longer talking about a 14-bit source file, that is the whole point.

As for the "unnecessarily complicated" ... I think this is the way to make it Subset?

I meant it as an unnecessarily complicated way to read the spec. The spec says streaminfo describes the FLAC file stream, not the input. The spec also says decoder behaviour in the case of streaminfo differing from the frames is unspecified, and that a decoder refusing to play is acceptable.

Final bullet item: well what does the spec now say? Is it unique how to interpret it? In one way, it is desirable that "this FLAC bit stream should be decoded to this and nothing else", but viewed the other way: "this is an oddball case where you might not know what WAVE headers should be, but sure any implementation that will play it, will sound the same" is kinda a neat compromise.

The spec says the streaminfo should agree with the frame headers.
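For illustration, a strict decoder's check could look roughly like the sketch below. The classes and the function are hypothetical and do not mirror libFLAC or ffmpeg. The sketch also assumes every frame header carries explicit values, ignoring the case where a header refers back to streaminfo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamInfo:
    sample_rate: int
    channels: int
    bits_per_sample: int


@dataclass(frozen=True)
class FrameHeader:
    sample_rate: int
    channels: int
    bits_per_sample: int


def mismatches(info: StreamInfo, header: FrameHeader) -> list[str]:
    """List the properties on which a frame header disagrees with streaminfo."""
    fields = ("sample_rate", "channels", "bits_per_sample")
    return [f for f in fields if getattr(info, f) != getattr(header, f)]


# The "creative" file from earlier: 14 bits in streaminfo, 16-bit frames.
info = StreamInfo(sample_rate=44100, channels=1, bits_per_sample=14)
frame = FrameHeader(sample_rate=44100, channels=1, bits_per_sample=16)
print(mismatches(info, frame))  # ['bits_per_sample']
```

Whether to abort on a non-empty result or carry on with the frame header values is the decoder's choice. Note that the wasted bits count of a subframe plays no part in this comparison.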


H2Swine commented on August 13, 2024

1: Mea culpa, I thought of it as additional text and not replacement.

2 and 3:

decoder behaviour in the case of streaminfo differing from the frames is unspecified
streaminfo should agree with the frame headers

I read the following from section 9 as explicitly permitting frame headers to be set independently of streaminfo?
"Each frame header stores the audio sample rate, number of bits per sample and number of channels independently of the streaminfo metadata block and other frame headers. This was done to permit multicasting of FLAC files but it also allows these properties to change mid-stream."
(Editing missing negation:) "independently" doesn't necessarily mean "with no implications whatsoever". I take the following to mean then that streaminfo will no longer agree with frame headers when properties have changed:
"Also, since the streaminfo metadata block can only accommodate a single set of properties, it is only valid for part of such an audio stream."

But if all this does look clear enough ... close the issue?


ktmf01 commented on August 13, 2024

I think the main issue here is the normative spec is (mostly) written from a decoder perspective, and appendix C is written from an encoder perspective. When parsing, everything is possible and everything should be handled. When decoding, certain parts can be skipped/omitted. This is what the quoted part says: everything can change, but decoders are free to stop processing on such a change.

From an encoder perspective, I don't see how one can read the spec and think it is a good idea to create a file in which the streaminfo contains data that doesn't agree with any of the frame headers. Sure, it can be done, but what is the point? Specifically this part:

Being a little bit creative, I write 14 bits to STREAMINFO, to signify that it should be output as a 14 bit AIFF file; then I encode every subframe as 16 bits with two wasted bits. Stream is subset, anything that picks up mid-stream (including, the very first audio frame!) will pass 16 bits down the playback chain, and everyone is happy ... or?

This doesn't make sense to me. The 14 bits in streaminfo do not signify anything about having to output to a 14-bit AIFF file, because streaminfo doesn't say anything about input or output of whatever transcoding process, it says something about the FLAC stream.

So, I'm at a loss what should be clarified in the spec. The spec already says streaminfo correlates to the FLAC stream, not to other audio formats. The spec says they can differ indeed, but it also explicitly says a decoder can choose to abort decoding when they differ, i.e. having them differ is probably not a good idea.


H2Swine commented on August 13, 2024

The 14 bits in streaminfo do not signify anything about having to output to a 14-bit AIFF file, because streaminfo doesn't say anything about input or output of whatever transcoding process, it says something about the FLAC stream.

Doesn't it say that the FLAC stream encodes 14-bit PCM audio?
Edit1: at least up to whenever it changes. Edit2: That at least informs about what it should decode to?
Of course that doesn't say it has to decode to a certain file. My point was, if you encode to get your audio back, and decode to get your audio back - then it stands to reason that once you read 14 bit from STREAMINFO, then aha this goes into a 14-bit file also upon decoding?


ktmf01 commented on August 13, 2024

To me that's like saying a WAVE fmt chunk says how you should transcode it to AIFF. Or that an AIFF COMM chunk says how you should encode it to FLAC. It says what audio is contained. Obviously that information is important when you transcode.

So, no, you can't have the streaminfo say 14 bits (because that's what AIFF it came from) but have the frame headers say 16 bits (because that is more common). The streaminfo says something about the frames, not about AIFF. Like fmt says something about WAVE data and COMM says something about AIFF.

