ietf-wg-cellar / flac-specification

The Free Lossless Audio Codec (FLAC) Specification.

Home Page: https://xiph.org/flac/format.html

License: Other

flac-specification's People

Contributors

ablwr, dericed, jeromemartinez, ktmf01, mcr, mulattokid, privatezero, retokromer, ruuda

flac-specification's Issues

UTF-8 coding?

In this section: https://github.com/privatezero/flac_markdown/blob/master/flac.md#coded-number

It's talking about UTF-8, and UCS-2 aka UTF-16, so which encoding format does it use?

(btw sorry about the formatting, idk why the UTF encoding process is bolded)

UTF-8 encoding: The highest (leftmost) bits of the leading byte are set to 1 to indicate the number of bytes in the code point. A leading 0 bit means the byte is ASCII compatible and therefore has 7 remaining bits; otherwise two, three, or four 1 bits followed by a 0 bit designate the size of the code point in bytes.

Example: 🦄 is U+1F984, which is encoded in UTF-8 as the bytes 0xF0 0x9F 0xA6 0x84. The leading byte, 0xF0, says that there are 4 bytes in this code point. All subsequent bytes have 0b10 in their top 2 bits, which marks them as continuation bytes, and you skip those bits when decoding.

(0xF0 & 0x07) << 18 = 0x000000
 +
(0x9F & 0x3F) << 12 = 0x01F000
 +
(0xA6 & 0x3F) <<  6 = 0x000980
 +
(0x84 & 0x3F) <<  0 = 0x000004
 =
0x1F984
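
The arithmetic above can be checked with a short Python sketch. The function name `decode_utf8_scalar` is made up for illustration, and the sketch assumes a well-formed sequence of 1 to 4 bytes (it does not reject overlong forms or surrogates).

```python
def decode_utf8_scalar(data: bytes) -> int:
    """Decode one well-formed UTF-8 sequence (1-4 bytes) to its code point."""
    lead = data[0]
    if lead < 0x80:               # 0xxxxxxx: ASCII, 7 payload bits
        return lead
    if lead >> 5 == 0b110:        # 110xxxxx: 2-byte sequence
        length, value = 2, lead & 0x1F
    elif lead >> 4 == 0b1110:     # 1110xxxx: 3-byte sequence
        length, value = 3, lead & 0x0F
    elif lead >> 3 == 0b11110:    # 11110xxx: 4-byte sequence
        length, value = 4, lead & 0x07
    else:
        raise ValueError("not a valid leading byte")
    for byte in data[1:length]:
        if byte >> 6 != 0b10:     # continuation bytes look like 10xxxxxx
            raise ValueError("expected a continuation byte")
        value = (value << 6) | (byte & 0x3F)
    return value


print(hex(decode_utf8_scalar(bytes([0xF0, 0x9F, 0xA6, 0x84]))))  # 0x1f984
```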

UCS-2, aka UTF-16 before surrogate pairs, is just a plain 16-bit value with no special encoding.

UTF-16 (often misidentified as UCS-2) works like this: if the code point is at most 0xD7FF, or at least 0xE000 and at most 0xFFFF, it is stored with its own value; otherwise it's split like this:

Since we're encoding the same unicorn from above, we need 2 code units because 0x1F984 is above 0xFFFF.

So, we take the code point and subtract 0x10000 to get 0xF984. For the low surrogate, we take 0xF984 mod 0x400 to get 0x184, then add 0xDC00 to get 0xDD84.

For the high surrogate, we take 0xF984 and divide it by 0x400 to get 0x003E, then add 0xD800 to get 0xD83E.
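
The same split can be written as a small Python sketch. The helper name `utf16_code_units` is hypothetical; it follows the steps described in this post and nothing FLAC-specific.

```python
def utf16_code_units(code_point: int) -> list:
    """Return the UTF-16 code unit(s) for one Unicode code point."""
    if code_point <= 0xFFFF:
        if 0xD800 <= code_point <= 0xDFFF:
            raise ValueError("surrogate values are not code points")
        return [code_point]              # stored unchanged
    offset = code_point - 0x10000        # 20-bit value
    high = 0xD800 + (offset >> 10)       # same as dividing by 0x400
    low = 0xDC00 + (offset & 0x3FF)      # same as taking mod 0x400
    return [high, low]


print([hex(u) for u in utf16_code_units(0x1F984)])  # ['0xd83e', '0xdd84']
```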

So which encoding are we actually using? Because that wording is very confusing.

Document how total samples in stream should be used

I've seen FLAC files in the wild where a sample count that is not a multiple of the STREAMINFO block size is signalled using total samples, but the last frame does not use a different block size. For example, ffmpeg currently seems to decode all frame samples in this case and ignores total samples.

Also, it may be good to make clear what the MD5 sum should be based on: the total samples or all frame samples?
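
One possible decoder policy, sketched in Python, is to drop anything past the signalled total. This is only an illustration of the question being asked, not behaviour the specification is known to require; `trim_to_total_samples` is a hypothetical helper, and a total of 0 is treated as "unknown".

```python
def trim_to_total_samples(frames, total_samples):
    """Flatten per-frame sample lists and cut at the STREAMINFO total.

    frames: list of lists of samples for one channel.
    total_samples: value from STREAMINFO; 0 is treated as "unknown".
    """
    flat = [sample for frame in frames for sample in frame]
    if total_samples == 0:
        return flat                      # nothing to trim against
    return flat[:total_samples]


# Two full 4-sample frames, but only 6 samples are signalled.
print(trim_to_total_samples([[1, 2, 3, 4], [5, 6, 7, 8]], 6))  # [1, 2, 3, 4, 5, 6]
```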

APPLICATION metadata block: riff (BWF), aiff and w64

The FLAC format has room for application specific data, which may or may not have an open specification. I haven't heard of widespread use of any of them and I'd say most of them are of no value to cellar.

There are however three of these that are used by the flac command line utility that store RIFF (WAVE), AIFF and W64 metadata. As the cellar wg is specifically targeting archival usage, are the metadata capabilities of WAVE and AIFF used to some extent in archival applications? Would it be of any use to include a definition of these riff, aiff and w64 application metadata blocks, despite not technically being part of the FLAC specification?

edit: or perhaps such metadata would (from a cellar perspective) be better stored directly in the Matroska metadata?

list of authors is unclear

We need to acknowledge Xiph.org Foundation (by person name) as an author, and then we need to make sure that former contributors are acknowledged.

add description

In my opinion, we should add a description for this repository, but I do not have the permission to make (a PR for) this.

Document mapping into containers

As we did for FFV1, we should document mapping into common containers in order to be clear about how to do so.

"fLaC", METADATA_BLOCK_STREAMINFO, METADATA_BLOCK in track header
FRAME in container "blocks".

And document the known issues, e.g. METADATA_BLOCK_STREAMINFO information is sometimes not relevant (when the file has been "cut").

METADATA_BLOCK_STREAMINFO MD5 signature field is underspecified

In the reference implementation, the MD5 signature appears to cover the raw, undecoded byte stream. For example, if encoding starts with a WAVE format file, the signature is computed over a stream of signed, 16-bit little-endian words. This is contrary to the rest of the specification, which is based on network byte order.
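
A minimal Python sketch of the computation as described here, assuming 16-bit samples that are interleaved across channels and packed as signed little-endian words. The function name `md5_of_pcm16` is hypothetical, and other bit depths are not covered.

```python
import hashlib
import struct


def md5_of_pcm16(channels):
    """MD5 over interleaved, signed 16-bit little-endian samples.

    channels: list of equal-length lists, one list of samples per channel.
    """
    md5 = hashlib.md5()
    for frame in zip(*channels):         # one sample from each channel
        md5.update(struct.pack("<%dh" % len(frame), *frame))
    return md5.hexdigest()


left = [1, -1]
right = [2, -2]
print(md5_of_pcm16([left, right]))
```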

At this point, it is probably best not to specify a computation method for this field, and to say that it is implementation-defined and may be used for decoding consistency checks. This would also avoid mentioning MD5, which should make the security people happy (although of course MD5 is not used in a security-relevant way here).

cross references

As noted by @dericed on the CELLAR list, cross references need to be formatted for better rendering in the RFC.

A cross reference such as SUBFRAME_VERBATIM won't render as expected in a plain text RFC, but would simply render to something like "Section X.X.X".

In EBML we use markdown such as
See [the section on Element Data Size](#element-data-size) for rules that apply to elements of unknown length.
so that in the RFC this renders to
See Section 7 for rules that apply to elements of unknown length.
and in the markdown it renders to

See the section on Element Data Size for rules that apply to elements of unknown length.

Rename repo to flac-specification?

I suggest to "normalize" repo names, and I think that the way Matroska and EBML repos are named are easier for searches, so I suggest to rename "Cellar-FLAC" repo to "flac-specification".

Unclear license of this specification

The original HTML file claims it is licensed under the GNU Free Documentation License. I do not think this is something that will be acceptable to the IETF. Will the document be re-licensed under different terms? A liberal license that would permit reusing specification fragments in program code would be best (in other words, not the IETF default license).

RICE coding

Can we stop calling it Rice coding? People naturally think you mean unary coding, instead of Exponential-Golomb coding, which is what you actually use.

fix semantics in frame_footer section

In the html version of the specification the FRAME_FOOTER and SUBFRAME are on the same level, see https://xiph.org/flac/format.html#frame_footer. But in the markdown, the FRAME_FOOTER section, https://github.com/privatezero/flac_markdown/blob/master/flac.md#frame_footer, contains a list of SUBFRAME components that aren't in the HTML.

Also, in the HTML the phrase "The SUBFRAME_HEADER specifies which one." refers to a list of 4 types of subframe contents, but in the markdown that phrase just appears next to the first of the 4 listed types of subframe, so the relationship between "The SUBFRAME_HEADER specifies which one." and the list of what is specified is lost.

broken list in FRAME_HEADER

In the markdown, after the phrase <3> Sample size in bits:, the subsequent list is supposed to be nested one level deeper, but it is at the same level in the markdown.

Suggested improvements of the current specs

Posted on 2017-06-06 in [flac-dev]:

Hi all,

I'm jumping in on this thread to make a few remarks about the spec. I
implemented a FLAC decoder by only looking at the spec, and I have a few
notes that would have saved me a lot of time if the spec had mentioned
them. They are obvious in hindsight, of course.

* If the channel assignment includes a difference channel, then the
subframe for that channel has one extra bit per sample in order to
encode the difference.

* The number of bits per sample for a subframe, is the number of bits
per sample of the frame, minus the number of wasted bits per sample of
the subframe (and possibly plus one for a difference channel).

I hope this helps future implementers.

Kind regards,

Ruud van Asseldonk
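
The second note in the mail above amounts to a one-line rule. A Python sketch, with the hypothetical helper name `subframe_bits_per_sample`:

```python
def subframe_bits_per_sample(frame_bps, wasted_bits, is_difference_channel):
    """Bits per sample actually coded in one subframe.

    frame_bps: bits per sample of the frame.
    wasted_bits: wasted bits per sample signalled for this subframe.
    is_difference_channel: True if this subframe carries a difference channel.
    """
    extra = 1 if is_difference_channel else 0
    return frame_bps - wasted_bits + extra


print(subframe_bits_per_sample(16, 0, True))   # 17
print(subframe_bits_per_sample(24, 2, False))  # 22
```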

endianness

At the second paragraph of Format it's stated that «All numbers are big-endian coded.» Is this true? I am asking, because my test encoder/decoder works well also with little endian codings, like pcm_s24le which BTW is also signed.
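
As a concrete illustration of what big-endian coding means for a header field: the fq dump further down this page shows the bytes 10 00 for minimum_block_size, reported as 4096. The Python sketch below reads those two bytes both ways. It says nothing about the byte order of the PCM input handed to an encoder, which is a separate matter.

```python
raw = bytes([0x10, 0x00])                  # bytes shown for minimum_block_size

as_big = int.from_bytes(raw, "big")        # most significant byte first
as_little = int.from_bytes(raw, "little")  # least significant byte first

print(as_big)     # 4096, matching the value fq reports
print(as_little)  # 16
```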

Missing specification of wasted bits-per-sample value

It is not immediately obvious how and when to apply the wasted bits-per-sample value during decoding. It's possible to guess that a left shift is involved, but it is unclear whether this happens after or before LPC decoding. (The impact on the bits-per-sample value of the subframe is also unspecified, see #83.)
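
A sketch of one reading, in Python: the subframe is decoded at the reduced bit depth and the left shift is applied to the finished samples, i.e. after prediction. That ordering is an assumption made for illustration, which is exactly the point this issue asks the specification to settle. `restore_wasted_bits` is a hypothetical name.

```python
def restore_wasted_bits(samples, wasted_bits):
    """Shift fully decoded subframe samples back up by the wasted bits."""
    return [sample << wasted_bits for sample in samples]


print(restore_wasted_bits([1, -3, 0], 2))  # [4, -12, 0]
```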

Included fixed LPC parameters

I'm not sure how common knowledge these fixed parameters are (the Wikipedia page on linear predictive coding does not list them), but if they are not uniquely defined in the literature, they need to be included in the specification.
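
For reference, a Python sketch of the fixed predictors for orders 0 to 4. The coefficient table is an assumption here (the usual polynomial predictors, which is what the reference implementation is generally understood to use) and should be checked against libFLAC before being relied on. The names `FIXED_COEFFICIENTS` and `restore_fixed` are made up for this example.

```python
# Coefficients applied to the previous samples, most recent first.
FIXED_COEFFICIENTS = {
    0: [],
    1: [1],
    2: [2, -1],
    3: [3, -3, 1],
    4: [4, -6, 4, -1],
}


def restore_fixed(order, warmup, residuals):
    """Rebuild samples from warm-up samples and residuals."""
    coefficients = FIXED_COEFFICIENTS[order]
    samples = list(warmup)               # len(warmup) must equal order
    for residual in residuals:
        prediction = sum(c * samples[-1 - i] for i, c in enumerate(coefficients))
        samples.append(prediction + residual)
    return samples


# A straight line is predicted exactly by order 2, so residuals are zero.
print(restore_fixed(2, [1, 2], [0, 0, 0]))  # [1, 2, 3, 4, 5]
```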

Should decoder limits of current implementations be included in the specification?

I have come across an interoperability problem that I'm not sure should be included in the specification.

For a few years I've been working on a new analysis method to implement in libFLAC to improve compression. This is still too slow for day-to-day use, but it is currently quite useful for research purposes. However, now I've stumbled upon a file created with this modified encoder that is spec-compliant, 13% smaller than the file compressed by an unmodified libFLAC (which indicates the problem isn't caused by being inefficient) but cannot be decoded by ffmpeg and presumably relies on undefined behaviour (signed integer overflow to wrap around) in the libFLAC decoder.

So here is the problem: current decoders are apparently limited to using residuals between 2^31-1 and -(2^31). This is unlikely to change, because the problem is very specific to a certain encoder combined with a specially crafted signal. The real problem to me here seems that the maximum residual sample value isn't bounded by the spec. Current implementations simply assume it fits a 32-bit signed integer.

It could be added to the spec that encoders MUST NOT create residuals that fall outside 2^31-1 and -(2^31) and a file with such residuals is considered invalid. If this is too strong this restriction could be limited to files with a bitdepth of 24 bits or less.

This is in line with what ffmpeg does currently:
https://github.com/FFmpeg/FFmpeg/blob/2e82c610553efd69b4d9b6c359423a19c2868255/libavcodec/flacdec.c#L266-L268
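
The proposed restriction is easy to state as a check. A Python sketch, with the hypothetical helper `residuals_fit_int32`, using exactly the bounds named above:

```python
INT32_MIN = -(2 ** 31)
INT32_MAX = 2 ** 31 - 1


def residuals_fit_int32(residuals):
    """True if every residual fits a 32-bit signed integer."""
    return all(INT32_MIN <= r <= INT32_MAX for r in residuals)


print(residuals_fit_int32([0, -5, 2 ** 31 - 1]))  # True
print(residuals_fit_int32([2 ** 31]))             # False
```

An encoder could run such a check before emitting a subframe and fall back to another prediction mode when it fails; that fallback is a design suggestion, not something the thread specifies.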

Frame number reset handling

Hi, I've encountered a group of FLAC files (from the same album) that all show the same peculiar behaviour. They all have an end-of-header frame number that behaves like this:

  • From frame 0 to 2048, the frame number resets back to 0 each time it would reach 256
  • At frame 2048 the resetting stops: the number goes from 255 to 2048 and after that seems normal

I can't find anything in the FLAC specification about how this should be handled.

The xiph decoder seems to ignore the resets and decodes all the samples. ffmpeg seems to not like it and throws away samples at each reset. I should also note that the files have an invalid samples MD5, but I think that is unrelated to this. It's a known issue with the "Switch Plus" flac encoder; from what I can see the frames are fine. I have other switch pro files with invalid MD5 where the frame number does not reset.

Unfortunately I can't share these files because of copyright, and I haven't yet been able to generate files with the same behaviour. But here is a session with fq showing the relevant parts, and also output from decoding with xiph flac and ffmpeg and then counting the number of samples decoded. There are 13692480-13594624 = 97856 samples missing, and looking at the output it looks like ffmpeg throws away 3 frames per reset, which roughly matches the number of resets: 97856 / 4096 / 3 = 7.963541666666667.

$ fq -o line_bytes=10 -i . <redacted>.flac

# display all metadata
flac> .metadatablocks[] | d
    │00 01 02 03 04 05 06 07 08 09│0123456789│.metadatablocks[0]{}: (flac_metadatablock)
0x00│            00               │    .     │  last_block: false
0x00│            00               │    .     │  type: "streaminfo" (0)
0x00│               00 00 22      │     .."  │  length: 34
0x00│                        10 00│        ..│  minimum_block_size: 4096
0x0a│10 00                        │..        │  maximum_block_size: 4096
0x0a│      00 00 00               │  ...     │  minimum_frame_size: 0
0x0a│               00 00 00      │     ...  │  maximum_frame_size: 0
0x0a│                        0a c4│        ..│  sample_rate: 44100
0x14│42                           │B         │
0x14│42                           │B         │  channels: 2
0x14│42 f0                        │B.        │  bits_per_sample: 16
0x14│   f0 00 d0 ee 40            │ ....@    │  total_samples_in_stream: 13692480
0x14│                  fd fd 1c 53│      ...S│  md5: "fdfd1c5314bbfe1a8adf8b0b96683bba" (raw bits)
0x1e│14 bb fe 1a 8a df 8b 0b 96 68│.........h│
0x28│3b ba                        │;.        │
    │00 01 02 03 04 05 06 07 08 09│0123456789│.metadatablocks[1]{}: (flac_metadatablock)
0x28│      84                     │  .       │  last_block: true
0x28│      84                     │  .       │  type: "vorbis_comment" (4)
0x28│         00 00 b9            │   ...    │  length: 185
    │                             │          │  comment{}: (vorbis_comment)
0x28│                  00 00 00 00│      ....│    vendor_length: 0
    │                             │          │    vendor: ""
0x32│05 00 00 00                  │....      │    user_comment_list_length: 5
    │                             │          │    user_comments[0:5]:
    │                             │          │      [0]{}:
0x32│            26 00 00 00      │    &...  │        length: 38
0x32│                        43 4f│        CO│        comment: "COPYRIGHT=Switch Plus (c) NCH Software"
0x3c│50 59 52 49 47 48 54 3d 53 77│PYRIGHT=Sw│
0x46│69 74 63 68 20 50 6c 75 73 20│itch Plus │
0x50│28 63 29 20 4e 43 48 20 53 6f│(c) NCH So│
0x5a│66 74 77 61 72 65            │ftware    │
    │                             │          │      [1]{}:
0x5a│                  26 00 00 00│      &...│        length: 38
0x64│45 4e 43 4f 44 45 44 42 59 3d│ENCODEDBY=│        comment: "ENCODEDBY=Switch Plus (c) NCH Software"
0x6e│53 77 69 74 63 68 20 50 6c 75│Switch Plu│
0x78│73 20 28 63 29 20 4e 43 48 20│s (c) NCH │
0x82│53 6f 66 74 77 61 72 65      │Software  │
    │                             │          │      [2]{}:
0x82│                        06 00│        ..│        length: 6
0x8c│00 00                        │..        │
0x8c│      47 45 4e 52 45 3d      │  GENRE=  │        comment: "GENRE="
    │                             │          │      [3]{}:
0x8c│                        26 00│        &.│        length: 38
0x96│00 00                        │..        │
0x96│      50 55 42 4c 49 53 48 45│  PUBLISHE│        comment: "PUBLISHER=Switch Plus (c) NCH Software"
0xa0│52 3d 53 77 69 74 63 68 20 50│R=Switch P│
0xaa│6c 75 73 20 28 63 29 20 4e 43│lus (c) NC│
0xb4│48 20 53 6f 66 74 77 61 72 65│H Software│
    │                             │          │      [4]{}:
... <some metadata redacted> ...
*   │until 0xe6.7 (37)            │          │

# display frame number for frames 252-259 and 2045-2050
flac> (.frames[252:260][], .frames[2045:2051][]).header.end_of_header | d
        │00 01 02 03 04 05 06 07 08 09│0123456789│.frames[252].header.end_of_header{}:
0x2ed808│                     c3 bc   │       .. │  frame_number: 252
        │00 01 02 03 04 05 06 07 08 09│0123456789│.frames[253].header.end_of_header{}:
0x2f0152│                  c3 bd      │      ..  │  frame_number: 253
        │00 01 02 03 04 05 06 07 08 09│0123456789│.frames[254].header.end_of_header{}:
0x2f36fe│            c3 be            │    ..    │  frame_number: 254
        │00 01 02 03 04 05 06 07 08 09│0123456789│.frames[255].header.end_of_header{}:
0x2f662e│c3 bf                        │..        │  frame_number: 255
        │00 01 02 03 04 05 06 07 08 09│0123456789│.frames[256].header.end_of_header{}:
0x2f9270│         c0 80               │   ..     │  frame_number: 0
        │00 01 02 03 04 05 06 07 08 09│0123456789│.frames[257].header.end_of_header{}:
0x2fbb2e│                  c0 81      │      ..  │  frame_number: 1
        │00 01 02 03 04 05 06 07 08 09│0123456789│.frames[258].header.end_of_header{}:
0x2fe96e│         c0 82               │   ..     │  frame_number: 2
        │00 01 02 03 04 05 06 07 08 09│0123456789│.frames[259].header.end_of_header{}:
0x301cea│                     c0 83   │       .. │  frame_number: 3
         │00 01 02 03 04 05 06 07 08 09│0123456789│.frames[2045].header.end_of_header{}:
0x17f5ba6│   c3 bd                     │ ..       │  frame_number: 253
         │00 01 02 03 04 05 06 07 08 09│0123456789│.frames[2046].header.end_of_header{}:
0x17f8342│               c3 be         │     ..   │  frame_number: 254
         │00 01 02 03 04 05 06 07 08 09│0123456789│.frames[2047].header.end_of_header{}:
0x17faebc│                        c3 bf│        ..│  frame_number: 255
         │00 01 02 03 04 05 06 07 08 09│0123456789│.frames[2048].header.end_of_header{}:
0x17fdd38│                  e0 a0 80   │      ... │  frame_number: 2048
         │00 01 02 03 04 05 06 07 08 09│0123456789│.frames[2049].header.end_of_header{}:
0x18009ac│                           e0│         .│  frame_number: 2049
0x18009b6│a0 81                        │..        │
         │00 01 02 03 04 05 06 07 08 09│0123456789│.frames[2050].header.end_of_header{}:
0x1804034│                        e0 a0│        ..│  frame_number: 2050
0x180403e│82                           │.         │

# collect all frame number deltas that are not 1
flac> [.frames[].header.end_of_header.frame_number] | delta | map(select(. != 1))
[
  -255,
  -255,
  -255,
  -255,
  -255,
  -255,
  -255,
  1793
]
$ flac -f -F -d -o <redacted>.flac.xiph.wav <redacted>.flac

flac 1.3.3
Copyright (C) 2000-2009  Josh Coalson, 2011-2016  Xiph.Org Foundation
flac comes with ABSOLUTELY NO WARRANTY.  This is free software, and you are
welcome to redistribute it under certain conditions.  Type `flac' for details.

<redacted>.flac: ERROR, MD5 signature mismatch
$ ffmpeg -y -i <redacted>.flac <redacted>.flac.ffmpeg.wav
Input #0, flac, from '<redacted>.flac':
  Metadata:
    COPYRIGHT       : Switch Plus (c) NCH Software
    ENCODEDBY       : Switch Plus (c) NCH Software
    PUBLISHER       : Switch Plus (c) NCH Software
    TITLE           : <redacted>
  Duration: 00:05:10.49, start: 0.000000, bitrate: 1057 kb/s
  Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Stream mapping:
  Stream #0:0 -> #0:0 (flac (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to '<redacted>.flac.ffmpeg.wav':
  Metadata:
    ICOP            : Switch Plus (c) NCH Software
    ENCODEDBY       : Switch Plus (c) NCH Software
    PUBLISHER       : Switch Plus (c) NCH Software
    INAM            : <redacted>
    ISFT            : Lavf58.76.100
  Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s16, 1411 kb/s
    Metadata:
      encoder         : Lavc58.134.100 pcm_s16le
[wav @ 0x7fcfa861a800] Non-monotonous DTS in output stream 0:0; previous: 1028096, current: 4096; changing to 1028096. This may result in incorrect timestamps in the output file.
[wav @ 0x7fcfa861a800] Non-monotonous DTS in output stream 0:0; previous: 1028096, current: 0; changing to 1028096. This may result in incorrect timestamps in the output file.
[wav @ 0x7fcfa861a800] Non-monotonous DTS in output stream 0:0; previous: 1028096, current: 4096; changing to 1028096. This may result in incorrect timestamps in the output file.
[wav @ 0x7fcfa861a800] Non-monotonous DTS in output stream 0:0; previous: 1028096, current: 8192; changing to 1028096. This may result in incorrect timestamps in the output file.
... <lots of similar logs removed>
size=   53104kB time=00:05:10.59 bitrate=1400.7kbits/s speed= 137x
video:0kB audio:53104kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000287%

$ ffmpeg -v verbose -i <redacted>.flac.xiph.wav -f null - 2>&1 | grep "frames encoded"
  Output stream #0:0 (audio): 13372 frames encoded (13692480 samples); 13372 packets muxed (54769920 bytes);
$ ffmpeg -v verbose -i <redacted>.flac.ffmpeg.wav -f null - 2>&1 | grep "frames encoded"
  Output stream #0:0 (audio): 13276 frames encoded (13594624 samples); 13276 packets muxed (54378496 bytes);
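
The frame numbers in the fq output above can be reproduced by hand. The Python sketch below decodes the UTF-8-like coded number, assuming the scheme simply extends the UTF-8 bit pattern to as many as 7 bytes and performs no shortest-form check. `decode_coded_number` is a hypothetical helper, not a function from fq, libFLAC or ffmpeg.

```python
def decode_coded_number(data: bytes) -> int:
    """Decode a UTF-8-like coded number (1 to 7 bytes, no validation
    of continuation bytes or overlong forms)."""
    lead = data[0]
    if lead < 0x80:                      # single byte, 7 payload bits
        return lead
    length = 0
    mask = 0x80
    while lead & mask:                   # count the leading 1 bits
        length += 1
        mask >>= 1
    if length < 2 or length > 7:
        raise ValueError("invalid leading byte")
    value = lead & (mask - 1)            # payload bits of the leading byte
    for byte in data[1:length]:
        value = (value << 6) | (byte & 0x3F)
    return value


print(decode_coded_number(bytes([0xC3, 0xBC])))        # 252
print(decode_coded_number(bytes([0xC0, 0x80])))        # 0
print(decode_coded_number(bytes([0xE0, 0xA0, 0x80])))  # 2048
```

Note that c0 80 is a two-byte encoding of 0, which strict UTF-8 would reject as overlong. Whether the coded number in a FLAC frame header permits such forms is not settled in this thread.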

logic mismatch

To my reading, this section (a list of block types separated by 'or's which is labelled with an obligation) doesn't match the original format.html. Here it appears that the phrase "The block data must match the block type in the block header." only relates to the last item in the list, whereas it doesn't read that way in the original. Needs some adjustment so that both read the same way.

units

Currently there is a mix between numbers and units having and not having a space in between (e.g. 48KHz and 16 bit). I suggest standardising on a space, which is easier to read. If you agree, I'm happy to go through the document and change it where needed.

change in meaning in a METADATA_BLOCK reference

In the markdown is the line

METADATA_BLOCK Zero or more metadata blocks

but the original format.html places an asterisk after METADATA_BLOCK. I'm unclear as to the semantic difference here, but I think that there is one.

Missing signedness indication

In the following places of the specification, signed values are used implicitly (the default are signed values):

  • The constant value in SUBFRAME_CONSTANT.
  • The warm-up samples in SUBFRAME_FIXED.
  • The warm-up samples in SUBFRAME_LPC.
  • The sample data in SUBFRAME_VERBATIM.
  • The Golomb encoding (see this comment).

Except for the last item, these values are encoded as two's complement.
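
For the two's complement cases, reading a signed value of arbitrary width comes down to sign extension. A minimal Python sketch, where `sign_extend` is a name chosen for this example:

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of an unsigned value as two's complement."""
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


print(sign_extend(0xFFFF, 16))  # -1
print(sign_extend(0x7FFF, 16))  # 32767
print(sign_extend(0x8000, 16))  # -32768
```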

Golomb-Rice (from flac-dev)

Posted on 2017-06-06 in [flac-dev]:

Andrew,

I think it is neither Rice Coding nor Exponential Golomb Coding. The one used in FLAC is Golomb-Rice coding, which is almost optimal for the Laplace (exponential) statistical distribution of residuals after modelling.

Best regards,

Federico
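
To make the distinction concrete, here is a Python sketch of Golomb-Rice decoding with parameter k. The bit layout is an assumption for illustration: a unary quotient written as zero bits terminated by a one bit, then k remainder bits, with signed values folded so that even codes are non-negative and odd codes are negative. The function `decode_rice` and its string-of-bits input are hypothetical conveniences.

```python
def decode_rice(bits: str, k: int, count: int) -> list:
    """Decode `count` signed values from a string of '0'/'1' characters."""
    pos = 0
    values = []
    for _ in range(count):
        quotient = 0
        while bits[pos] == "0":          # unary part: count the zeros
            quotient += 1
            pos += 1
        pos += 1                         # skip the terminating 1
        remainder = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        folded = (quotient << k) | remainder
        if folded % 2 == 0:              # even codes: 0, 1, 2, ...
            values.append(folded >> 1)
        else:                            # odd codes: -1, -2, -3, ...
            values.append(-((folded + 1) >> 1))
    return values


print(decode_rice("10010101100111", 2, 4))  # [0, -1, 3, -4]
```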

add RFC2119 section and implement the terminology

An RFC2119 section SHOULD be added just after the introductory section, such as

# Notation and Conventions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [@!RFC2119].

All current uses of those keywords within the specification SHOULD be reviewed to ensure that they use the same meaning as RFC2119 defines, see https://www.ietf.org/rfc/rfc2119.txt. If so, then the term SHOULD be changed to all caps; if not, the term should be changed so as not to confuse its meaning with RFC2119.

Different name for metadata blocks

In the list of definitions blocks and subblocks refer to unencoded PCM audio, frames and subframes to encoded audio. Metadata blocks might be a confusing term. Perhaps another term can be used?

Ideas:

  • Metadata chunks. Chunks are a term used in WAV for metadata as well, seems to me like a good fit
  • Metadata elements. Sounds 'too small' to me, as these blocks can be up to 16MiB in size and quite complex
  • Metadata frames. Like in ID3. Frames is already used for encoded audio
  • Metadata structure. Sounds like there cannot be more than one of them
  • Metadata tracks. Also confusing in audio
  • Metadata attachments
  • Metadata segments
  • Metadata sections
  • Metadata slices
  • Metadata fragments. Sounds like part of it has been lost

Please comment

anchors in markdown

Currently anchors linking to section headers with underscores in them (such as FRAME_HEADER) are broken in markdown.

The current anchors do, however, create working internal links in the html generated by the makefile.

LPC decoding details missing

I think some of the LPC decoding details should be included in the specification:

  • Apparently, it is expected that the decoding happens as if with infinite precision.
  • Is there an expectation that it is possible to implement subset streams with just a 32x32→32 multiplier if the decoded stream is 16 bit?
  • What does it mean if the quantized linear predictor coefficient shift in SUBFRAME_LPC is negative? I don't think the reference implementation performs a left shift in this case.
  • I don't think it is obvious where to apply the quantization shift.

It may make sense to include an explicit formula for LPC decoding, so that it's clear whether there is a minus sign involved or not.
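
One candidate formula, written as a Python sketch. It assumes a non-negative shift, applies the shift to the whole sum of products (an arithmetic right shift), and adds the residual with a plus sign. Those three choices are assumptions to be confirmed against the reference implementation, and `restore_lpc` is a hypothetical name. Python integers are unbounded, so the sketch behaves "as if with infinite precision".

```python
def restore_lpc(warmup, coefficients, shift, residuals):
    """Rebuild samples: s[n] = r[n] + (sum(c[i] * s[n-1-i]) >> shift)."""
    samples = list(warmup)               # len(warmup) == len(coefficients)
    for residual in residuals:
        acc = sum(c * samples[-1 - i] for i, c in enumerate(coefficients))
        samples.append(residual + (acc >> shift))
    return samples


# Coefficients [4, -2] with shift 1 behave like the predictor 2*s[n-1] - s[n-2].
print(restore_lpc([1, 2], [4, -2], 1, [0, 0]))  # [1, 2, 3, 4]
```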

Including a list of open-source implementations?

Would it be of any benefit to include a list of (high-quality) open-source decoder implementations? Or would this be something too volatile? I'd think of something like this:

  • libFLAC (C and C++, BSD-like license)
  • ffmpeg (C, GNU LGPL)
  • Firefox (C, MPL)
  • dr_libs (C. Public Domain/MIT No Attribution)
  • Claxon (Rust, Apache 2.0)
  • jFLAC (Java, LGPL)

uniform copyright notice

In the version draft-weaver-cellar-flac-00, page 1 says:

Copyright (c) 2019 IETF Trust and the persons identified as the
document authors.  All rights reserved.

and on page 29:

Copyright (c) 2000-2009 Josh Coalson, 2011-2014 Xiph.Org Foundation

I suggest unifying the copyright notice and having it only once in the document.

describe method for expressing hex and binary data

I suggest documenting/acknowledging a method for expressing binary and hex values. Currently the context can be inferred in most places (if not all) by understanding the number of bits described, but I think having expressions such as 0b0000 or 0x0000 would be clearer than 0000, which is ambiguous (does it mean binary, hex, decimal, etc?).
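
The ambiguity is easy to demonstrate in Python: the same digit string yields three different values depending on the assumed base, while a prefixed literal is unambiguous.

```python
text = "1001"

as_binary = int(text, 2)     # 9
as_decimal = int(text, 10)   # 1001
as_hex = int(text, 16)       # 4097

# With a prefix, base 0 lets the parser pick the base from the literal.
prefixed_binary = int("0b1001", 0)  # 9
prefixed_hex = int("0x1001", 0)     # 4097

print(as_binary, as_decimal, as_hex)
```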

Missing specification of bits-per-sample adjustment for subframes

The reference implementation increases the number of bits per sample by one in the following cases:

  • For the first subframe if the channel assignment is 0b1001.
  • For the second subframe if the channel assignment is 0b1000 or 0b1010.

Not doing this results in an implementation which is not interoperable.
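
The two cases listed above can be captured in a small lookup. A Python sketch follows; the table and the function `subframe_bit_depths` are illustrative names, and the mapping is taken only from the bullets in this issue.

```python
# Channel assignment code -> index of the subframe that gets one extra bit.
EXTRA_BIT_SUBFRAME = {
    0b1000: 1,   # second subframe
    0b1001: 0,   # first subframe
    0b1010: 1,   # second subframe
}


def subframe_bit_depths(frame_bps, channel_assignment, channel_count):
    """Bits per sample for each subframe, before any wasted-bits adjustment."""
    extra_index = EXTRA_BIT_SUBFRAME.get(channel_assignment)
    return [
        frame_bps + 1 if index == extra_index else frame_bps
        for index in range(channel_count)
    ]


print(subframe_bit_depths(16, 0b1001, 2))  # [17, 16]
print(subframe_bit_depths(16, 0b1010, 2))  # [16, 17]
```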

Document what is allowed to change between frames

For example, can one assume that, for a valid FLAC stream, the sample rate, channel configuration and channel count must be the same for all frames and must also match what is specified in STREAMINFO?
