
js-car's Introduction

@ipld/car


Content Addressable aRchive format reader and writer

Install

$ npm i @ipld/car

Example

// Create a simple .car file with a single block and that block's CID as the
// single root. Then read the .car and fetch the block again.

import fs from 'fs'
import { Readable } from 'stream'
import { CarReader, CarWriter } from '@ipld/car'
import * as raw from 'multiformats/codecs/raw'
import { CID } from 'multiformats/cid'
import { sha256 } from 'multiformats/hashes/sha2'

async function example() {
  const bytes = new TextEncoder().encode('random meaningless bytes')
  const hash = await sha256.digest(raw.encode(bytes))
  const cid = CID.create(1, raw.code, hash)

  // create the writer and set the header with a single root
  const { writer, out } = await CarWriter.create([cid])
  Readable.from(out).pipe(fs.createWriteStream('example.car'))

  // store a new block, creates a new file entry in the CAR archive
  await writer.put({ cid, bytes })
  await writer.close()

  const inStream = fs.createReadStream('example.car')
  // read and parse the entire stream in one go, this will cache the contents of
  // the car in memory so is not suitable for large files.
  const reader = await CarReader.fromIterable(inStream)

  // read the list of roots from the header
  const roots = await reader.getRoots()
  // retrieve a block, as a { cid:CID, bytes:Uint8Array } pair from the archive
  const got = await reader.get(roots[0])
  // also possible: for await (const { cid, bytes } of await CarBlockIterator.fromIterable(inStream)) { ... }

  console.log(
    'Retrieved [%s] from example.car with CID [%s]',
    new TextDecoder().decode(got.bytes),
    roots[0].toString()
  )
}

example().catch((err) => {
  console.error(err)
  process.exit(1)
})

Will output:

Retrieved [random meaningless bytes] from example.car with CID [bafkreihwkf6mtnjobdqrkiksr7qhp6tiiqywux64aylunbvmfhzeql2coa]

See the examples directory for more.

Usage

@ipld/car is consumed through factory methods on its different classes. Each class represents a discrete set of functionality. You should select the classes that make the most sense for your use-case.

Please be aware that @ipld/car does not validate that block data matches the paired CIDs when reading a CAR. See the verify-car.js example for one possible approach to validating blocks as they are read. Any CID verification requires that the hash function used to generate the CID be available; the CAR format does not restrict the allowable multihashes.

CarReader

The basic CarReader class is consumed via:

import { CarReader } from '@ipld/car/reader'
import { CarBufferReader } from '@ipld/car/buffer-reader'

Or alternatively: import { CarReader } from '@ipld/car'. CommonJS require will also work for the same import paths and references.

CarReader is useful for relatively small CAR archives as it buffers the entirety of the archive in memory to provide access to its data. This class is also suitable in a browser environment. The CarReader class provides random-access get(key) and has(key) methods as well as iterators for blocks() and cids().

CarReader can be instantiated from a single Uint8Array or from an AsyncIterable of Uint8Arrays (note that Node.js streams are AsyncIterables and can be consumed in this way).

CarBufferReader works exactly the same way as CarReader but all methods are synchronous.

CarIndexedReader

The CarIndexedReader class is a special form of CarReader and can be consumed in Node.js only (not in the browser) via:

import { CarIndexedReader } from '@ipld/car/indexed-reader'

Or alternatively: import { CarIndexedReader } from '@ipld/car'. CommonJS require will also work for the same import paths and references.

A CarIndexedReader provides the same functionality as CarReader but is instantiated from a path to a CAR file and also adds a close() method that must be called when the reader is no longer required, to clean up resources.

CarIndexedReader performs a single full-scan of a CAR file, collecting a list of CIDs and their block positions in the archive. It then performs random-access reads when blocks are requested via get() and the blocks() and cids() iterators.
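The indexing step described above can be pictured as building a lookup table from CID to block position. The following is a minimal sketch under stated assumptions, not the class's actual implementation: `buildIndex` is a hypothetical helper, and `entries` is any AsyncIterable of `{ cid, blockLength, blockOffset }` objects (with the library, a CarIndexer would be the natural source).

```javascript
// Build an in-memory lookup table keyed by the CID's string form.
async function buildIndex (entries) {
  const index = new Map()
  for await (const { cid, blockLength, blockOffset } of entries) {
    const key = cid.toString()
    // Choice made for this sketch: keep the first occurrence of a repeated CID.
    if (!index.has(key)) {
      index.set(key, { blockLength, blockOffset })
    }
  }
  return index
}
```

A `has()` check then becomes `index.has(cid.toString())`, and a `get()` becomes a positional read of `blockLength` bytes at `blockOffset` in the file.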

This class may be suitable for random-access (primarily via has() and get()) to relatively large CAR files.

CarBlockIterator and CarCIDIterator

import { CarBlockIterator } from '@ipld/car/iterator'
// or
import { CarCIDIterator } from '@ipld/car/iterator'

Or alternatively: import { CarBlockIterator, CarCIDIterator } from '@ipld/car'. CommonJS require will also work for the same import paths and references.

These two classes provide AsyncIterables to the blocks or just the CIDs contained within a CAR archive. These are efficient mechanisms for scanning an entire CAR archive, regardless of size, if random-access to blocks is not required.

CarBlockIterator and CarCIDIterator can be instantiated from a single Uint8Array (see CarBlockIterator.fromBytes() and CarCIDIterator.fromBytes()) or from an AsyncIterable of Uint8Arrays (see CarBlockIterator.fromIterable() and CarCIDIterator.fromIterable())—note that Node.js streams are AsyncIterables and can be consumed in this way.

CarIndexer

The CarIndexer class can be used to scan a CAR archive and provide indexing data on the contents. It can be consumed via:

import { CarIndexer } from '@ipld/car/indexer'

Or alternatively: import { CarIndexer } from '@ipld/car'. CommonJS require will also work for the same import paths and references.

This class is used within CarIndexedReader and is only useful in cases where an external index of a CAR needs to be generated and used.

The index data can also be used with CarReader.readRaw() to fetch block data directly from a file descriptor using the index data for that block.

CarWriter

A CarWriter is used to create new CAR archives. It can be consumed via:

import { CarWriter } from '@ipld/car/writer'

Or alternatively: import { CarWriter } from '@ipld/car'. CommonJS require will also work for the same import paths and references.

Creation of a CarWriter involves a "channel", or a { writer:CarWriter, out:AsyncIterable<Uint8Array> } pair. The writer side of the channel is used to put() blocks, while the out side of the channel emits the bytes that form the encoded CAR archive.

In Node.js, you can use the Readable.from() API to convert the out AsyncIterable to a standard Node.js stream, or it can be directly fed to a stream.pipeline().
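The stream.pipeline() route can be sketched as follows. To keep the snippet free of package dependencies, `fakeOut()` is a stand-in for the out side of a writer channel (any AsyncIterable of Uint8Arrays behaves the same way), and `writeToFile` is a hypothetical helper name.

```javascript
import fs from 'node:fs'
import { pipeline } from 'node:stream/promises'

// Stand-in for the `out` side of a CarWriter channel.
async function * fakeOut () {
  yield new Uint8Array([1, 2, 3])
  yield new Uint8Array([4, 5])
}

// Drain an AsyncIterable<Uint8Array> into a file. The returned Promise
// resolves once every chunk has been flushed and the file stream has finished.
async function writeToFile (out, filePath) {
  await pipeline(out, fs.createWriteStream(filePath))
}
```

With a real channel, the call would be `writeToFile(out, 'example.car')`, started before or alongside the put() calls so that the writer side is drained as it produces data.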

API

class CarReader

Properties:

  • version (number): The version number of the CAR referenced by this reader (should be 1 or 2).

Provides blockstore-like access to a CAR.

Implements the RootsReader interface: getRoots(). And the BlockReader interface: get(), has(), blocks() (defined as a BlockIterator) and cids() (defined as a CIDIterator).

Load this class with either import { CarReader } from '@ipld/car/reader' (const { CarReader } = require('@ipld/car/reader')). Or import { CarReader } from '@ipld/car' (const { CarReader } = require('@ipld/car')). The former will likely result in smaller bundle sizes where this is important.

async CarReader#getRoots()

  • Returns: Promise<CID[]>

Get the list of roots defined by the CAR referenced by this reader. May be zero or more CIDs.

async CarReader#has(key)

  • key (CID)

  • Returns: Promise<boolean>

Check whether a given CID exists within the CAR referenced by this reader.

async CarReader#get(key)

  • key (CID)

  • Returns: Promise<(Block|undefined)>

Fetch a Block (a { cid:CID, bytes:Uint8Array } pair) from the CAR referenced by this reader matching the provided CID. In the case where the provided CID doesn't exist within the CAR, undefined will be returned.

async * CarReader#blocks()

  • Returns: AsyncGenerator<Block>

Returns a BlockIterator (AsyncIterable<Block>) that iterates over all of the Blocks ({ cid:CID, bytes:Uint8Array } pairs) contained within the CAR referenced by this reader.

async * CarReader#cids()

  • Returns: AsyncGenerator<CID>

Returns a CIDIterator (AsyncIterable<CID>) that iterates over all of the CIDs contained within the CAR referenced by this reader.

async CarReader.fromBytes(bytes)

  • bytes (Uint8Array)

  • Returns: Promise<CarReader>

Instantiate a CarReader from a Uint8Array blob. This performs a decode fully in memory and maintains the decoded state in memory for full access to the data via the CarReader API.

async CarReader.fromIterable(asyncIterable)

  • asyncIterable (AsyncIterable<Uint8Array>)

  • Returns: Promise<CarReader>

Instantiate a CarReader from an AsyncIterable<Uint8Array>, such as a modern Node.js stream. This performs a decode fully in memory and maintains the decoded state in memory for full access to the data via the CarReader API.

Care should be taken for large archives; this API may not be appropriate where memory is a concern or the archive is potentially larger than the amount of memory that the runtime can handle.

async CarReader.readRaw(fd, blockIndex)

  • fd (fs.promises.FileHandle|number): A file descriptor from the Node.js fs module. Either an integer, from fs.open() or a FileHandle from fs.promises.open().

  • blockIndex (BlockIndex): An index pointing to the location of the Block required. This BlockIndex should take the form: {cid:CID, blockLength:number, blockOffset:number}.

  • Returns: Promise<Block>: A { cid:CID, bytes:Uint8Array } pair.

Reads a block directly from a file descriptor for an open CAR file. This function is only available in Node.js and not a browser environment.

This function can be used in connection with CarIndexer which emits the BlockIndex objects that are required by this function.

The user is responsible for opening and closing the file used in this call.
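Conceptually, a raw read of this kind is a positional read of blockLength bytes at blockOffset. The sketch below shows that idea with plain Node.js file APIs; it is not the library's implementation, the helper names are hypothetical, and it returns only the bytes rather than a full Block.

```javascript
import fs from 'node:fs'

// Read `blockLength` bytes starting at `blockOffset` from an open FileHandle.
async function readBytesAt (fileHandle, { blockLength, blockOffset }) {
  const bytes = new Uint8Array(blockLength)
  let read = 0
  // A single read() may return fewer bytes than requested, so loop until full.
  while (read < blockLength) {
    const { bytesRead } = await fileHandle.read(
      bytes, read, blockLength - read, blockOffset + read
    )
    if (bytesRead === 0) throw new Error('Unexpected end of file')
    read += bytesRead
  }
  return bytes
}

// The caller owns the file: open it, read, and always close it afterwards.
async function readBlockBytes (filePath, blockIndex) {
  const fileHandle = await fs.promises.open(filePath, 'r')
  try {
    return await readBytesAt(fileHandle, blockIndex)
  } finally {
    await fileHandle.close()
  }
}
```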

class CarIndexedReader

Properties:

  • version (number): The version number of the CAR referenced by this reader (should be 1).

A form of CarReader that pre-indexes a CAR archive from a file and provides random access to blocks within the file using the index data. This class is only available in Node.js and not a browser environment.

For large CAR files, using this form of CarReader can be significantly more efficient in terms of memory. The index consists of a list of CIDs and their location within the archive (see CarIndexer). For large numbers of blocks, this index can also occupy a significant amount of memory. In some cases it may be necessary to expand the memory capacity of a Node.js instance to allow this index to fit (e.g. by running with NODE_OPTIONS="--max-old-space-size=16384").

As a CarIndexedReader instance maintains an open file descriptor for its CAR file, an additional CarIndexedReader#close method is attached. This must be called to fully clean up resources after use.

Load this class with either import { CarIndexedReader } from '@ipld/car/indexed-reader' (const { CarIndexedReader } = require('@ipld/car/indexed-reader')). Or import { CarIndexedReader } from '@ipld/car' (const { CarIndexedReader } = require('@ipld/car')). The former will likely result in smaller bundle sizes where this is important.

async CarIndexedReader#getRoots()

  • Returns: Promise<CID[]>

See CarReader#getRoots

async CarIndexedReader#has(key)

  • key (CID)

  • Returns: Promise<boolean>

See CarReader#has

async CarIndexedReader#get(key)

  • key (CID)

  • Returns: Promise<(Block|undefined)>

See CarReader#get

async * CarIndexedReader#blocks()

  • Returns: AsyncGenerator<Block>

See CarReader#blocks

async * CarIndexedReader#cids()

  • Returns: AsyncGenerator<CID>

See CarReader#cids

async CarIndexedReader#close()

  • Returns: Promise<void>

Close the underlying file descriptor maintained by this CarIndexedReader. This must be called for proper resource clean-up to occur.

async CarIndexedReader.fromFile(path)

  • path (string)

  • Returns: Promise<CarIndexedReader>

Instantiate a CarIndexedReader from a file with the provided path. The CAR file is first indexed with a full pass that collects CIDs and block locations. This index is maintained in memory. Subsequent reads operate on a read-only file descriptor, fetching the block from its in-file location.

For large archives, the initial indexing may take some time. The returned Promise will resolve only after this is complete.

class CarBlockIterator

Properties:

  • version (number): The version number of the CAR referenced by this iterator (should be 1).

Provides an iterator over all of the Blocks in a CAR. Implements a BlockIterator interface, or AsyncIterable<Block>. Where a Block is a { cid:CID, bytes:Uint8Array } pair.

As an implementer of AsyncIterable, this class can be used directly in a for await (const block of iterator) {} loop, where the iterator is constructed using CarBlockIterator.fromBytes or CarBlockIterator.fromIterable.

An iteration can only be performed once per instantiation.
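Because of the single-pass restriction, code that needs to visit the results more than once has to either construct a fresh iterator or buffer the results. A minimal sketch of the buffering option follows; `collect` is a hypothetical helper, and `fakeBlocks()` is a stand-in for an iterator over blocks.

```javascript
// Stand-in for a one-shot AsyncIterable of blocks.
async function * fakeBlocks () {
  yield { cid: 'cid-1', bytes: new Uint8Array([1]) }
  yield { cid: 'cid-2', bytes: new Uint8Array([2]) }
}

// Drain an AsyncIterable once into an array that can be revisited freely.
async function collect (iterable) {
  const items = []
  for await (const item of iterable) {
    items.push(item)
  }
  return items
}
```

Buffering gives up the memory advantage of streaming, so for large archives constructing a new iterator from the source is likely the better trade-off.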

CarBlockIterator also implements the RootsReader interface and provides the getRoots() method.

Load this class with either import { CarBlockIterator } from '@ipld/car/iterator' (const { CarBlockIterator } = require('@ipld/car/iterator')). Or import { CarBlockIterator } from '@ipld/car' (const { CarBlockIterator } = require('@ipld/car')).

async CarBlockIterator#getRoots()

  • Returns: Promise<CID[]>

Get the list of roots defined by the CAR referenced by this iterator. May be zero or more CIDs.

async CarBlockIterator.fromBytes(bytes)

  • bytes (Uint8Array)

  • Returns: Promise<CarBlockIterator>

Instantiate a CarBlockIterator from a Uint8Array blob. Rather than decoding the entire byte array prior to returning the iterator, as in CarReader.fromBytes, only the header is decoded and the remainder of the CAR is parsed as the Blocks are yielded.

async CarBlockIterator.fromIterable(asyncIterable)

  • asyncIterable (AsyncIterable<Uint8Array>)

  • Returns: Promise<CarBlockIterator>

Instantiate a CarBlockIterator from an AsyncIterable<Uint8Array>, such as a modern Node.js stream. Rather than decoding the entire byte array prior to returning the iterator, as in CarReader.fromIterable, only the header is decoded and the remainder of the CAR is parsed as the Blocks are yielded.

class CarCIDIterator

Properties:

  • version (number): The version number of the CAR referenced by this iterator (should be 1).

Provides an iterator over all of the CIDs in a CAR. Implements a CIDIterator interface, or AsyncIterable<CID>. Similar to CarBlockIterator but only yields the CIDs in the CAR.

As an implementer of AsyncIterable, this class can be used directly in a for await (const cid of iterator) {} loop, where the iterator is constructed using CarCIDIterator.fromBytes or CarCIDIterator.fromIterable.

An iteration can only be performed once per instantiation.

CarCIDIterator also implements the RootsReader interface and provides the getRoots() method.

Load this class with either import { CarCIDIterator } from '@ipld/car/iterator' (const { CarCIDIterator } = require('@ipld/car/iterator')). Or import { CarCIDIterator } from '@ipld/car' (const { CarCIDIterator } = require('@ipld/car')).

async CarCIDIterator#getRoots()

  • Returns: Promise<CID[]>

Get the list of roots defined by the CAR referenced by this iterator. May be zero or more CIDs.

async CarCIDIterator.fromBytes(bytes)

  • bytes (Uint8Array)

  • Returns: Promise<CarCIDIterator>

Instantiate a CarCIDIterator from a Uint8Array blob. Rather than decoding the entire byte array prior to returning the iterator, as in CarReader.fromBytes, only the header is decoded and the remainder of the CAR is parsed as the CIDs are yielded.

async CarCIDIterator.fromIterable(asyncIterable)

  • asyncIterable (AsyncIterable<Uint8Array>)

  • Returns: Promise<CarCIDIterator>

Instantiate a CarCIDIterator from an AsyncIterable<Uint8Array>, such as a modern Node.js stream. Rather than decoding the entire byte array prior to returning the iterator, as in CarReader.fromIterable, only the header is decoded and the remainder of the CAR is parsed as the CIDs are yielded.

class CarIndexer

Properties:

  • version (number): The version number of the CAR referenced by this reader (should be 1).

Provides an iterator over all of the Blocks in a CAR, returning their CIDs and byte-location information. Implements an AsyncIterable<BlockIndex>. Where a BlockIndex is a { cid:CID, length:number, offset:number, blockLength:number, blockOffset:number }.

As an implementer of AsyncIterable, this class can be used directly in a for await (const blockIndex of iterator) {} loop. Where the iterator is constructed using CarIndexer.fromBytes or CarIndexer.fromIterable.

An iteration can only be performed once per instantiation.

CarIndexer also implements the RootsReader interface and provides the getRoots() method.

Load this class with either import { CarIndexer } from '@ipld/car/indexer' (const { CarIndexer } = require('@ipld/car/indexer')). Or import { CarIndexer } from '@ipld/car' (const { CarIndexer } = require('@ipld/car')). The former will likely result in smaller bundle sizes where this is important.

async CarIndexer#getRoots()

  • Returns: Promise<CID[]>

Get the list of roots defined by the CAR referenced by this indexer. May be zero or more CIDs.

async CarIndexer.fromBytes(bytes)

  • bytes (Uint8Array)

  • Returns: Promise<CarIndexer>

Instantiate a CarIndexer from a Uint8Array blob. Only the header is decoded initially; the remainder is processed and emitted via the iterator as it is consumed.

async CarIndexer.fromIterable(asyncIterable)

  • asyncIterable (AsyncIterable<Uint8Array>)

  • Returns: Promise<CarIndexer>

Instantiate a CarIndexer from an AsyncIterable<Uint8Array>, such as a modern Node.js stream. Only the header is decoded initially; the remainder is processed and emitted via the iterator as it is consumed.

class CarWriter

Provides a writer interface for the creation of CAR files.

Creation of a CarWriter involves the instantiation of an input / output pair in the form of a WriterChannel, which is a { writer:CarWriter, out:AsyncIterable<Uint8Array> } pair. These two components form what can be thought of as a stream-like interface. The writer component (an instantiated CarWriter), has methods to put() new blocks and close() the writing operation (finalising the CAR archive). The out component is an AsyncIterable that yields the bytes of the archive. This can be redirected to a file or other sink. In Node.js, you can use the Readable.from() API to convert this to a standard Node.js stream, or it can be directly fed to a stream.pipeline().

The channel will provide a form of backpressure. The Promise from a put() won't resolve until the resulting data is drained from the out iterable.

It is also possible to ignore the Promise from put() calls and allow the generated data to queue in memory. This should be avoided for large CAR archives due to the memory costs and potential for memory overflow.
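To make the backpressure behaviour concrete, here is a toy channel in which the writer's Promise resolves only once the consumer has pulled the chunk from the out iterable. It is a simplified illustration of the pattern, not the library's internal channel; `createChannel` and its internals are hypothetical names.

```javascript
function createChannel () {
  const queue = [] // pending { chunk, resolve } entries
  let wakeReader = null // resolver for a consumer waiting on an empty queue
  let closed = false

  const wake = () => {
    if (wakeReader) {
      wakeReader()
      wakeReader = null
    }
  }

  const writer = {
    // Resolves only when the consumer takes this chunk from `out`.
    put (chunk) {
      return new Promise((resolve) => {
        queue.push({ chunk, resolve })
        wake()
      })
    },
    async close () {
      closed = true
      wake()
    }
  }

  async function * drain () {
    while (true) {
      if (queue.length > 0) {
        const { chunk, resolve } = queue.shift()
        resolve() // the chunk has been handed over: release the writer
        yield chunk
      } else if (closed) {
        return
      } else {
        await new Promise((resolve) => { wakeReader = resolve })
      }
    }
  }

  return { writer, out: drain() }
}
```

Awaiting each put() keeps at most one chunk queued. Ignoring the returned Promises lets the queue grow without bound, which is the memory risk noted above.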

Load this class with either import { CarWriter } from '@ipld/car/writer' (const { CarWriter } = require('@ipld/car/writer')). Or import { CarWriter } from '@ipld/car' (const { CarWriter } = require('@ipld/car')). The former will likely result in smaller bundle sizes where this is important.

async CarWriter#put(block)

  • block (Block): A { cid:CID, bytes:Uint8Array } pair.

  • Returns: Promise<void>: The returned promise will only resolve once the bytes this block generates are written to the out iterable.

Write a Block (a { cid:CID, bytes:Uint8Array } pair) to the archive.

async CarWriter#close()

  • Returns: Promise<void>

Finalise the CAR archive and signal that the out iterable should end once any remaining bytes are written.

async CarWriter.create(roots)

  • roots (CID[]|CID|void)

  • Returns: WriterChannel: The channel takes the form of { writer:CarWriter, out:AsyncIterable<Uint8Array> }.

Create a new CAR writer "channel" which consists of a { writer:CarWriter, out:AsyncIterable<Uint8Array> } pair.

async CarWriter.createAppender()

  • Returns: WriterChannel: The channel takes the form of { writer:CarWriter, out:AsyncIterable<Uint8Array> }.

Create a new CAR appender "channel" which consists of a { writer:CarWriter, out:AsyncIterable<Uint8Array> } pair. This appender does not consider roots and does not produce a CAR header. It is designed to append blocks to an existing CAR archive. It is expected that out will be concatenated onto the end of an existing archive that already has a properly formatted header.

async CarWriter.updateRootsInBytes(bytes, roots)

  • bytes (Uint8Array)

  • roots (CID[]): A new list of roots to replace the existing list in the CAR header. The new header must take up the same number of bytes as the existing header, so the roots should collectively be the same byte length as the existing roots.

  • Returns: Promise<Uint8Array>

Update the list of roots in the header of an existing CAR as represented in a Uint8Array.

This operation is an overwrite; the total length of the CAR will not be modified. A rejection will occur if the new header will not be the same length as the existing header, in which case the CAR will not be modified. It is the responsibility of the user to ensure that the roots being replaced encode as the same length as the new roots.

The byte array passed in as an argument will be modified and also returned upon successful modification.

async CarWriter.updateRootsInFile(fd, roots)

  • fd (fs.promises.FileHandle|number): A file descriptor from the Node.js fs module. Either an integer, from fs.open() or a FileHandle from fs.promises.open().

  • roots (CID[]): A new list of roots to replace the existing list in the CAR header. The new header must take up the same number of bytes as the existing header, so the roots should collectively be the same byte length as the existing roots.

  • Returns: Promise<void>

Update the list of roots in the header of an existing CAR file. The first argument must be a file descriptor for a CAR file that is open in read and write mode (not append), e.g. fs.open or fs.promises.open with 'r+' mode.

This operation is an overwrite; the total length of the CAR will not be modified. A rejection will occur if the new header will not be the same length as the existing header, in which case the CAR will not be modified. It is the responsibility of the user to ensure that the roots being replaced encode as the same length as the new roots.

This function is only available in Node.js and not a browser environment.

class CarBufferWriter

A simple CAR writer that writes to a pre-allocated buffer.

CarBufferWriter#addRoot(root, options)

  • root (CID)

  • options

  • Returns: CarBufferWriter

Add a root to this writer, to be used to create a header when the CAR is finalized with close().

CarBufferWriter#write(block)

  • block (Block): A { cid:CID, bytes:Uint8Array } pair.

  • Returns: CarBufferWriter

Write a Block (a { cid:CID, bytes:Uint8Array } pair) to the archive. Throws if there is not enough capacity.

CarBufferWriter#close([options])

  • options (object, optional)

    • options.resize (boolean, optional)
  • Returns: Uint8Array

Finalize the CAR and return it as a Uint8Array.

CarBufferWriter.blockLength(block)

  • block (Block)

  • Returns: number

Calculates the number of bytes required to store the given block in a CAR. Useful for estimating the size of an ArrayBuffer for the CarBufferWriter.
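For intuition about where that number comes from, the sketch below assumes the CARv1 layout, in which each block is stored as an unsigned varint length prefix followed by the CID bytes and the block data. The function names are hypothetical and this is an illustration of the arithmetic, not the library's source.

```javascript
// Number of bytes needed to encode `n` as an unsigned varint (7 bits per byte).
function varintLength (n) {
  let length = 1
  while (n >= 0x80) {
    n = Math.floor(n / 0x80)
    length++
  }
  return length
}

// Size of one CAR section: varint(CID length + data length) | CID | data.
function sectionLength ({ cid, bytes }) {
  const payload = cid.bytes.length + bytes.length
  return varintLength(payload) + payload
}

// Total buffer size for a header of known length plus a list of blocks.
function requiredCapacity (headerLength, blocks) {
  return blocks.reduce((total, block) => total + sectionLength(block), headerLength)
}
```

The header length passed to `requiredCapacity` would come from one of the header-length helpers documented below.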

CarBufferWriter.calculateHeaderLength(rootLengths)

  • rootLengths (number[])

  • Returns: number

Calculates header size given the array of byteLength for roots.

CarBufferWriter.headerLength({ roots })

  • options (object)

    • options.roots (CID[])
  • Returns: number

Calculates header size given the array of roots.

CarBufferWriter.estimateHeaderLength(rootCount[, rootByteLength])

  • rootCount (number)

  • rootByteLength (number, optional)

  • Returns: number

Estimates header size given a count of the roots and the expected byte length of the root CIDs. The default length works for a standard CIDv1 with a single-byte multihash code, such as SHA2-256 (i.e. the most common CIDv1).

CarBufferWriter.createWriter(buffer[, options])

  • buffer (ArrayBuffer)

  • options (object, optional)

    • options.roots (CID[], optional)
    • options.byteOffset (number, optional)
    • options.byteLength (number, optional)
    • options.headerSize (number, optional)
  • Returns: CarBufferWriter

Creates a synchronous CAR writer that can be used to encode blocks into a given buffer. Optionally, you can pass byteOffset and byteLength to specify a range inside the buffer to write into. If the CAR file is going to have roots, you need to either pass them under options.roots (from which the header size will be calculated) or provide options.headerSize to allocate the required space in the buffer. You may also provide both known roots and headerSize to allocate space for roots that may not be known ahead of time.

Note: Incorrect headerSize may lead to copying bytes inside a buffer which will have a negative impact on performance.

async decoder.readHeader(reader[, strictVersion])

  • reader (BytesReader)

  • strictVersion (number, optional)

  • Returns: Promise<(CarHeader|CarV2Header)>

Reads header data from a BytesReader. The header may either be in the form of a CarHeader or CarV2Header depending on the CAR being read.

async decoder.readBlockHead(reader)

  • reader (BytesReader)

  • Returns: Promise<BlockHeader>

Reads the leading data of an individual block from CAR data from a BytesReader. Returns a BlockHeader object which contains { cid, length, blockLength } which can be used to either index the block or read the block binary data.

decoder.createDecoder(reader)

  • reader (BytesReader)

  • Returns: CarDecoder

Creates a CarDecoder from a BytesReader. The CarDecoder is an async interface that will consume the bytes from the BytesReader to yield a header() and either blocks() or blocksIndex() data.

decoder.bytesReader(bytes)

  • bytes (Uint8Array)

  • Returns: BytesReader

Creates a BytesReader from a Uint8Array.

decoder.asyncIterableReader(asyncIterable)

  • asyncIterable (AsyncIterable<Uint8Array>)

  • Returns: BytesReader

Creates a BytesReader from an AsyncIterable<Uint8Array>, which allows for consumption of CAR data from a streaming source.

decoder.limitReader(reader, byteLimit)

  • reader (BytesReader)

  • byteLimit (number)

  • Returns: BytesReader

Wraps a BytesReader in a limiting BytesReader which limits maximum read to byteLimit bytes. It does not update pos of the original BytesReader.

class CarBufferReader

Properties:

  • version (number): The version number of the CAR referenced by this reader (should be 1 or 2).

Provides blockstore-like access to a CAR.

Implements the RootsBufferReader interface: getRoots(). And the BlockBufferReader interface: get(), has(), blocks() and cids().

Load this class with either import { CarBufferReader } from '@ipld/car/buffer-reader' (const { CarBufferReader } = require('@ipld/car/buffer-reader')). Or import { CarBufferReader } from '@ipld/car' (const { CarBufferReader } = require('@ipld/car')). The former will likely result in smaller bundle sizes where this is important.

CarBufferReader#getRoots()

  • Returns: CID[]

Get the list of roots defined by the CAR referenced by this reader. May be zero or more CIDs.

CarBufferReader#has(key)

  • key (CID)

  • Returns: boolean

Check whether a given CID exists within the CAR referenced by this reader.

CarBufferReader#get(key)

  • key (CID)

  • Returns: Block|undefined

Fetch a Block (a { cid:CID, bytes:Uint8Array } pair) from the CAR referenced by this reader matching the provided CID. In the case where the provided CID doesn't exist within the CAR, undefined will be returned.

CarBufferReader#blocks()

  • Returns: Block[]

Returns a Block[] of the Blocks ({ cid:CID, bytes:Uint8Array } pairs) contained within the CAR referenced by this reader.

CarBufferReader#cids()

  • Returns: CID[]

Returns a CID[] of the CIDs contained within the CAR referenced by this reader.

CarBufferReader.fromBytes(bytes)

  • bytes (Uint8Array)

  • Returns: CarBufferReader

Instantiate a CarBufferReader from a Uint8Array blob. This performs a decode fully in memory and maintains the decoded state in memory for full access to the data via the CarBufferReader API.

CarBufferReader.readRaw(fd, blockIndex)

  • fd (number): A file descriptor from the Node.js fs module. An integer, from fs.open().

  • blockIndex (BlockIndex): An index pointing to the location of the Block required. This BlockIndex should take the form: {cid:CID, blockLength:number, blockOffset:number}.

  • Returns: Block: A { cid:CID, bytes:Uint8Array } pair.

Reads a block directly from a file descriptor for an open CAR file. This function is only available in Node.js and not a browser environment.

This function can be used in connection with CarIndexer which emits the BlockIndex objects that are required by this function.

The user is responsible for opening and closing the file used in this call.

License

Licensed under either of

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

js-car's People

Contributors

achingbrain, alanshaw, dependabot[bot], gobengo, gozala, hugomrdias, mikeal, olizilla, rvagg, semantic-release-bot, shogunpanda, web3-bot

js-car's Issues

Add support for WebStreams

Background

I'm exploring verified (trustless) retrieval of CAR files (See olizilla/ipfs-get#15) in browsers.

I discovered that there's no way to instantiate a CarReader from a ReadableStream returned from the Fetch API in the browser.

This is because Readable Web Streams are not iterable by default.

Adding support for Web Streams

For us to be able to promote verified retrieval in browsers, it would be nice to have full CAR support in browsers. For this, we need a way to instantiate a CAR Reader in the browser from a Readable Web Streams.

My current workaround is to use @achingbrain's browser-readablestream-to-it library.
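For reference, the same kind of adaptation can be sketched with only the standard Web Streams reader API. This is a minimal sketch of the idea, not the code of the library mentioned above; `streamToIterable` is a hypothetical name.

```javascript
// Adapt a WHATWG ReadableStream (e.g. `response.body` from fetch) into an
// AsyncIterable of its chunks.
async function * streamToIterable (stream) {
  const reader = stream.getReader()
  try {
    while (true) {
      const { done, value } = await reader.read()
      if (done) return
      yield value
    }
  } finally {
    // Runs on normal completion and on early exit from the consuming loop.
    reader.releaseLock()
  }
}
```

The result could then be passed to an API that accepts an AsyncIterable of Uint8Arrays, for example `CarReader.fromIterable(streamToIterable(response.body))`.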

Very strange issue with `writer.close` (silently closes entire node process)

Using @ipld/car version 5.1.1

This simple code snippet is self explanatory, and happens as the console.logs suggest... though you would expect that it doesn't:

import { CarWriter } from "@ipld/car";

const main = async () => {
    console.log('wtf');
    const { writer, out } = await CarWriter.create();
    console.log('lol');
    await writer.close();
    console.log('this never prints'); // this really never prints
}

main()
    .then(() => {
        console.log('done'); // this also never prints
    })
    .catch(e => {
        console.error('ERROR:', e); // no errors either
    })

REPL to prove I'm not lying:
https://replit.com/@GabrielSpeed/ToughRadiantMarketing#index.js
(Also happens with BunJS https://replit.com/@GabrielSpeed/CreepyGigaGnudebugger#index.ts)

ETA: this happens even if the CAR is created with roots, and also even if the CAR is created with roots and the block containing the root is actually in the CAR.

Basically this happens no matter what state the CarWriter is in. Not sure why or how, but I will tell you the bug is going to be incredibly hard to solve, because it prints no error message or anything at any level.

Seems to have something to do with methods on the _encoder of the CarWriter. Something about the way the _mutex and _encoder play together (and that IteratorChannel_Writer class) is not working out well, and seems to be a problem similar to this: https://stackoverflow.com/questions/64897131/async-for-loop-causes-the-function-to-silently-terminate

Premature exit of reader when consuming an iterable stream with empty blocks

I encountered this issue when trying out web3.storage - I hit it when trying to store some data that had some zero length files in it.

web3-storage/web3.storage#302

The data:

https://bafybeiatccvommze36x5cy44m22qz4al63riz4p6bnxw4pns6g3lbw3aky.ipfs.dweb.link/

I ended up writing a test program so I could test all the permutations. Here it is (along with results):

https://github.com/jimpick/js-car-zero-length-blocks

It appears that when consuming a streaming iterable, it sometimes will abort reading prematurely if the latest block has a zero length payload.

Can't resolve import when using @ipld/car from jest

I'm getting an error when trying to import @ipld/car from a jest test:

    Cannot find module 'multiformats/cid' from 'node_modules/@ipld/car/cjs/lib/decoder.js'

    Require stack:
      node_modules/@ipld/car/cjs/lib/decoder.js
      node_modules/@ipld/car/cjs/lib/reader-browser.js
      node_modules/@ipld/car/cjs/lib/reader.js
      node_modules/@ipld/car/cjs/car.js
      test.js

      at Resolver.resolveModule (node_modules/jest-resolve/build/index.js:306:11)

I've created a reproduction repository: https://github.com/matheus23/ipld-car-jest-bug

The command to reproduce the error is: yarn && yarn start


Context: For fission we're trying to use .car files to set up js-ipfs with test fixtures.

reading from a file object

Hi,
I have a CAR file object in JavaScript and want to read it using js-car, but I keep getting an "unexpected end of the file" error. Here is the code I am trying:

let arrayBuffer = await files[0].arrayBuffer();
let bytes=new Uint8Array(carFile); 
const reader = await CarReader.fromBytes(bytes) //throws error here
const indexer = await CarIndexer.fromBytes(bytes) //throws error here

I also tried this:

let str = await files[0].stream() 
const reader = await CarReader.fromIterable(files[0].stream()) //throws error here

and none of them work. However, with the same file, this code works:

const inStream = fs.createReadStream('test.car')
const reader = await CarReader.fromIterable(inStream)

I checked and I know that CarReader.fromBytes needs a Uint8Array, and I am sure files[0] is not null. Does anyone know what I am missing here?
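
One detail worth checking in the first snippet: the typed array is built from `carFile`, while the file contents were read into `arrayBuffer`. If `carFile` is undefined, or is something other than the `ArrayBuffer` (for example the `File` object itself), `new Uint8Array(carFile)` produces an empty array, and parsing zero bytes would plausibly be reported as an unexpected end of data. A minimal sketch of the conversion, using only the standard `Blob`/`File` API (`fileToBytes` is a made-up name):

```javascript
// Read a File (or any Blob) fully into memory as a Uint8Array.
// `fileToBytes` is a hypothetical helper, not part of @ipld/car.
async function fileToBytes(file) {
  const arrayBuffer = await file.arrayBuffer()
  return new Uint8Array(arrayBuffer)
}
```

The result could then be passed to `CarReader.fromBytes(bytes)`, for example with `const bytes = await fileToBytes(files[0])`.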

dag-json support for CarReader.fromBytes

I'm trying to get a CarReader for an encoded dag-json object, but when trying this:

      import { encode } from '@ipld/dag-json'

...


      let encoded = encode(objToStore);
      let carReader = await CarReader.fromBytes(encoded);

I'm getting this error:
IPFS Error: CBOR decode error: too many terminals, data makes no sense

Is it possible to create a CAR or CarReader from a dag-json?

CarWriter.fromIterable(roots, blockIterator)

@alanshaw had an API suggestion over at web3-storage/ipfs-car#74

const out = await CarWriter.fromIterable([root], blockstore.blocks())

Some things to be resolved:

  1. Should it be named fromBlockIterable() to make absolutely clear that it wants a {cid, bytes} iterator, given that every other fromIterable() in the API currently takes a Uint8Array iterator? This API would be able to take a CarBlockIterator as input, so we have prior art for that name. For example: fs.createWriteStream('out.car', CarWriter.fromIterable([], await CarBlockIterator.fromIterable(fs.createReadStream('in.car')))) (transferring roots is possible but it would take more than one line!).
  2. Does it need an await and, if so, what is it waiting on? Looking at the code, I think CarWriter.create() is now sync and doesn't need the await, even though I see I've used it in the README! I don't think there's any good reason why CarWriter.fromIterable() couldn't return out straight away, so maybe the await is entirely unnecessary here. The AsyncIterable protocol gives us everything we need to set up async constructions and should also allow proper error propagation regardless of where it happens in this chain.

`CarBufferWriter` should be exported by index-browser

│ [!] Error: 'CarBufferWriter' is not exported by node_modules/.pnpm/@[email protected]/node_modules/@ipld/car/src/index-browser.js, imported by node_modules/.pnpm/@[email protected]/node_modules/@ucanto/transp
│ https://rollupjs.org/guide/en/#error-name-is-not-exported-by-module
│ node_modules/.pnpm/@[email protected]/node_modules/@ucanto/transport/src/car/codec.js (2:26)
│ 1: import * as API from '@ucanto/interface'
│ 2: import { CarBufferReader, CarBufferWriter } from '@ipld/car'

Proposal: Sync Writer API

For dotStorage [email protected] we use CARs more or less as network packets. Specifically we want to allocate n bytes for a packet and keep writing blocks until desired size is reached. Then send that CAR out and start writing new blocks into a new CAR.

Unfortunately, the current writer API is not ideal for such a use case and imposes unnecessary asynchrony. Here is the code I found myself writing.

import * as CAR from "@ipld/car";
import { CID } from "multiformats/cid";

const encode = async ({ blocks, roots = [CID.parse("bafkqaaa")] }) => {
  const { writer, out } = CAR.CarWriter.create(roots);
  // Not awaited: the output is only drained by the loop below.
  for (const block of blocks) {
    writer.put(block);
  }
  writer.close();

  // Collect the encoded chunks and track the total size.
  const chunks = [];
  let byteLength = 0;
  for await (const chunk of out) {
    chunks.push(chunk);
    byteLength += chunk.byteLength;
  }

  // Copy the chunks into one contiguous buffer.
  const buffer = new Uint8Array(byteLength);
  let byteOffset = 0;
  for (const chunk of chunks) {
    buffer.set(chunk, byteOffset);
    byteOffset += chunk.byteLength;
  }
  return buffer;
};

It would be nice if we had another writer API to support such a use case, one that would not impose async reads from the output. Maybe something like:

declare function createWriter(buffer:ArrayBuffer, options?:WriterOptions): SyncBlockWriter

interface WriterOptions {
   roots?: CID[] // defaults to []
   byteOffset?: number // defaults to 0
   byteLength?: number // defaults to buffer.byteLength
   rootsCapacity?: number // defaults to total byteLength used by passed roots
}

interface SyncBlockWriter {
  /**
   * Throws an error if total root count is greater than root capacity specified at creation.
   */
  addRoots(roots:CID[]): void
  /**
   * Throws an error if buffer does not have enough capacity.
   */
  write(block: Block):void
  /**
   * Returns `Uint8Array` view into provided `ArrayBuffer` containing CAR bytes. Returned `Uint8Array` is a view
   * into a provided `ArrayBuffer` that was written.
   */
  close(): Uint8Array
}
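
To make the capacity semantics of this proposal concrete, here is a minimal sketch of just the buffer bookkeeping. It assumes nothing about CAR encoding: `createBufferWriter` is a hypothetical name, it writes opaque byte chunks rather than a CAR header and blocks, and it leaves out `addRoots`:

```javascript
// Hypothetical sketch of the proposed capacity checks; no CAR encoding here.
function createBufferWriter(buffer, options = {}) {
  const byteOffset = options.byteOffset ?? 0
  // Assumption: default to the space remaining after byteOffset.
  const byteLength = options.byteLength ?? buffer.byteLength - byteOffset
  const target = new Uint8Array(buffer, byteOffset, byteLength)
  let written = 0

  return {
    // Throws if the chunk does not fit into the remaining capacity.
    write(bytes) {
      if (written + bytes.byteLength > target.byteLength) {
        throw new RangeError('Buffer does not have enough capacity')
      }
      target.set(bytes, written)
      written += bytes.byteLength
    },
    // Returns a view over the written part of the provided ArrayBuffer.
    close() {
      return target.subarray(0, written)
    }
  }
}
```

Because `close()` returns a view rather than a copy, the caller can send the bytes out and reuse or discard the underlying `ArrayBuffer` without any asynchronous reads.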

CarReader.fromIterable for large files

The documentation shows how to create a carReader when files are small as per below:

import fs from 'fs'
import { CarReader } from '@ipld/car'

const inStream = fs.createReadStream('example.car');
// read and parse the entire stream in one go, this will cache the contents of
// the car in memory so is not suitable for large files.
const reader = await CarReader.fromIterable(inStream);

Question: If I have a file at C:\output.car that is, say, 10GB and larger than my available RAM, how can I create a reader for this large .car file?

I think it should be one of these methods, but there's no sample code on how to use it...

import { CarBlockIterator } from '@ipld/car/iterator';
// or
import { CarCIDIterator } from '@ipld/car/iterator';

My purpose is to upload many images from an NFT collection... I feel like adding an example of how to read a large .car file in the documentation would be helpful...
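
A minimal sketch of the block-at-a-time consumption pattern this question is asking about. `summarizeBlocks` is a hypothetical helper that accepts any async iterable of `{ cid, bytes }` pairs, and the commented usage assumes that `CarBlockIterator.fromIterable()` accepts a Node.js read stream the same way `CarReader.fromIterable()` does:

```javascript
// Visit blocks one at a time. This loop retains nothing between iterations,
// so it does not accumulate the archive contents in memory.
async function summarizeBlocks(blocks) {
  let count = 0
  let totalBytes = 0
  for await (const { bytes } of blocks) {
    count += 1
    totalBytes += bytes.byteLength
  }
  return { count, totalBytes }
}

// Assumed usage with @ipld/car in Node.js (not executed here):
//   import fs from 'fs'
//   import { CarBlockIterator } from '@ipld/car/iterator'
//   const blocks = await CarBlockIterator.fromIterable(fs.createReadStream('C:\\output.car'))
//   console.log(await summarizeBlocks(blocks))
```

For an upload workflow, the body of the loop would hand each block to the uploader instead of counting bytes.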

What is the proper way to get CAR content without the `carHeader`

I'm trying to fetch and parse the content from an IPFS gateway as follows:

      const contentVerified = await this.getAndVerifyContentFromGateway(hash);
      console.log("header: ", await contentVerified.carReader.getRoots());
      const carBlockIterator = new CarBlockIterator(
        contentVerified.carReader.version,
        await contentVerified.carReader.getRoots(),
        contentVerified.carReader.blocks()
      );
      for await (const block of carBlockIterator) chunks.push(block.bytes);
      
      const buffer = Buffer.concat(chunks);
      return buffer.toString("utf8");

The result is the data well parsed with the carHeader still in the first position. Printing the buffer, the carHeader looks like

<Buffer 0a 90 17 08 02 12 88 17....

Instead of crudely removing the first block, is there a proper way to omit the header, or another way to parse the content without the header?
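
The bytes printed above may not be a CAR header at all. Read as protobuf, `0a 90 17` is a length-delimited field 1, and `08 02 12 88 17` is a varint field followed by another length-delimited field, which is the shape of a dag-pb node wrapping UnixFS data. That reading is an inference from the byte pattern, not something the report confirms. A small self-contained sketch for inspecting such a prefix (`readVarint` and `readProtobufField` are illustrative helpers, not library functions):

```javascript
// Read an unsigned LEB128 varint starting at `offset`.
function readVarint(bytes, offset = 0) {
  let value = 0
  let shift = 0
  let pos = offset
  while (pos < bytes.length) {
    const byte = bytes[pos++]
    value += (byte & 0x7f) * 2 ** shift
    if ((byte & 0x80) === 0) return { value, next: pos }
    shift += 7
  }
  throw new Error('Unexpected end of data while reading varint')
}

// Decode one protobuf field key plus its varint value or payload length.
// Only wire types 0 (varint) and 2 (length-delimited) are handled.
function readProtobufField(bytes, offset = 0) {
  const key = readVarint(bytes, offset)
  const fieldNumber = Math.floor(key.value / 8)
  const wireType = key.value % 8
  if (wireType === 0) {
    const { value, next } = readVarint(bytes, key.next)
    return { fieldNumber, wireType, value, next }
  }
  if (wireType === 2) {
    const { value: length, next } = readVarint(bytes, key.next)
    return { fieldNumber, wireType, length, next }
  }
  throw new Error(`Unsupported wire type ${wireType}`)
}

const prefix = Uint8Array.of(0x0a, 0x90, 0x17, 0x08, 0x02, 0x12, 0x88, 0x17)
const outer = readProtobufField(prefix) // field 1, length-delimited, 2960 bytes
const type = readProtobufField(prefix, outer.next) // field 1, varint value 2
const data = readProtobufField(prefix, type.next) // field 2, length-delimited, 2952 bytes
console.log(outer, type, data)
```

If the blocks are indeed dag-pb/UnixFS, decoding them with a UnixFS-aware library such as ipfs-unixfs-exporter, rather than concatenating raw block bytes, would be the usual route.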

No "exports" main defined

Getting the following error when executing the .js outputted from npx tsc:

Error [ERR_PACKAGE_PATH_NOT_EXPORTED]: No "exports" main defined in /[path]/node_modules/@ipld/car/package.json
    at new NodeError (node:internal/errors:400:5)
    at exportsNotFound (node:internal/modules/esm/resolve:361:10)
    at packageExportsResolve (node:internal/modules/esm/resolve:641:13)
    at resolveExports (node:internal/modules/cjs/loader:538:36)
    at Module._findPath (node:internal/modules/cjs/loader:607:31)
    at Module._resolveFilename (node:internal/modules/cjs/loader:1025:27)
    at Module._load (node:internal/modules/cjs/loader:885:27)
    at Module.require (node:internal/modules/cjs/loader:1105:19)
    at require (node:internal/modules/cjs/helpers:103:18)
    at Object.<anonymous> (/[path]/out/handleMessage.js:14:15) {
  code: 'ERR_PACKAGE_PATH_NOT_EXPORTED'
}

I'm importing like this:
import { CarReader } from '@ipld/car'

My tsconfig is as follows:

{
  "compilerOptions": {
    "module": "commonjs",
    "esModuleInterop": true,
    "target": "es6",
    "outDir": "./out",
    "sourceMap": true,
    "strict": true,
    "noImplicitAny": true,
    "moduleResolution": "node",
    "baseUrl": ".",
    "paths": {
      "*": ["node_modules/*", "src/types/*"]
    },
    "skipLibCheck": true,
    "forceConsistentCasingInFileNames": true
  },
  "include": ["src/**/*.ts"],
  "exclude": ["node_modules", "**/*.spec.ts"]
}

I'm sure I'm making a stupid mistake but am lost. Any help would be greatly appreciated.

Proposal: New API to create a writer with unknown root

In many cases we want to frame block sets into CAR files and, once a CAR reaches a certain size, write a root block and end the frame. This use case is not supported by the current interface, as you need to know the root ahead of time, which we do not.

It is still possible to create a CAR writer with a fake root, buffer the output into memory and then use updateRootsInBytes, but that is a really awkward interface.

I would like to propose an alternative CarWriter interface that would better support the outlined use case.

export class CarWriter2 extends CarWriter {
  /**
    * Create a car writer with given root capacity. No blocks will be emitted into `out` until
    * number of roots matching the `count` are added.
    */
  static createWithRootCapacity(byteLength:number):{ writer:CarWriter2, out:AsyncIterable<Uint8Array> }
  /**
   * Throws an error if total root count is greater than root capacity specified at creation.
   */
  addRoots(roots:CID[]): void
  
  /**
   * Promise fails if root capacity has not been met (not enough roots had been added)
   */
  close(): Promise<void>
}

In practice I expect we could just amend current implementation as opposed to having a separate class as in the sketch above.
