rubensworks / rdf-parse.js

Parses RDF from any serialization

License: MIT License

TypeScript 95.19% JavaScript 4.81%
rdf linked-data rdfjs parser streaming hacktoberfest

rdf-parse.js's Introduction

RDF Parse

This library parses RDF streams based on content type (or file name) and outputs RDF/JS-compliant quads as a stream.

This is useful in situations where you have RDF in some serialization, and you just need the parsed triples/quads, without having to concern yourself with picking the correct parser.

The following RDF serializations are supported:

Name | Content type | Extensions
TriG | application/trig | .trig
N-Quads | application/n-quads | .nq, .nquads
Turtle | text/turtle | .ttl, .turtle
N-Triples | application/n-triples | .nt, .ntriples
Notation3 | text/n3 | .n3
JSON-LD | application/ld+json, application/json | .json, .jsonld
RDF/XML | application/rdf+xml | .rdf, .rdfxml, .owl
RDFa and script RDF data tags HTML/XHTML | text/html, application/xhtml+xml | .html, .htm, .xhtml, .xht
Microdata | text/html, application/xhtml+xml | .html, .htm, .xhtml, .xht
RDFa in SVG/XML | image/svg+xml, application/xml | .xml, .svg, .svgz
SHACL Compact Syntax | text/shaclc | .shaclc, .shc
Extended SHACL Compact Syntax | text/shaclc-ext | .shaclce, .shce

Internally, this library makes use of RDF parsers from the Comunica framework, which enable streaming processing of RDF.

The underlying parsers are fully spec-compliant.

Installation

$ npm install rdf-parse

or

$ yarn add rdf-parse

This package also works out-of-the-box in browsers via tools such as webpack and browserify.

Require

import rdfParser from "rdf-parse";

or

const rdfParser = require("rdf-parse").default;

Usage

Parsing by content type

The rdfParser.parse method takes a text stream containing RDF in any supported serialization, together with an options object, and returns an RDF/JS stream that emits RDF quads.

const textStream = require('streamify-string')(`
<http://ex.org/s> <http://ex.org/p> <http://ex.org/o1>, <http://ex.org/o2>.
`);

rdfParser.parse(textStream, { contentType: 'text/turtle', baseIRI: 'http://example.org' })
    .on('data', (quad) => console.log(quad))
    .on('error', (error) => console.error(error))
    .on('end', () => console.log('All done!'));
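
When all quads are needed at once, the event-based stream can be wrapped in a promise. The sketch below is not part of rdf-parse: collectStream is a hypothetical helper that works with any stream exposing 'data', 'error', and 'end' events, and it is demonstrated on a plain Node.js object stream so that it runs without extra dependencies.

```javascript
const { Readable } = require('node:stream');

// Hypothetical helper (not part of rdf-parse): gathers every item
// emitted by a stream into an array.
function collectStream(stream) {
  return new Promise((resolve, reject) => {
    const items = [];
    stream
      .on('data', (item) => items.push(item))
      .on('error', reject)
      .on('end', () => resolve(items));
  });
}

// Stand-in for a quad stream; with rdf-parse this would be the
// return value of rdfParser.parse(textStream, options).
const demoStream = Readable.from(['quad1', 'quad2']);
collectStream(demoStream).then((items) => console.log(items.length));
```

Buffering gives up the memory benefits of streaming, so this pattern is probably best kept for small documents.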

Parsing by file name

Sometimes the content type of an RDF document is unknown. In those cases, you can provide the path or URL of the document instead, and the content type will be determined from its file extension.

For example, Turtle documents can be detected using the .ttl extension.

const textStream = require('streamify-string')(`
<http://ex.org/s> <http://ex.org/p> <http://ex.org/o1>, <http://ex.org/o2>.
`);

rdfParser.parse(textStream, { path: 'http://example.org/myfile.ttl', baseIRI: 'http://example.org' })
    .on('data', (quad) => console.log(quad))
    .on('error', (error) => console.error(error))
    .on('end', () => console.log('All done!'));
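
Extension-based detection can be pictured as a lookup from file extension to content type. The following is a simplified sketch and not the library's actual implementation: contentTypeFromPath and the abbreviated EXTENSIONS table are hypothetical and cover only a few of the formats listed earlier.

```javascript
// Abbreviated, illustrative mapping; the real library supports more extensions.
const EXTENSIONS = new Map([
  ['ttl', 'text/turtle'],
  ['turtle', 'text/turtle'],
  ['nt', 'application/n-triples'],
  ['nq', 'application/n-quads'],
  ['trig', 'application/trig'],
  ['jsonld', 'application/ld+json'],
  ['rdf', 'application/rdf+xml'],
]);

// Hypothetical helper: returns the content type for a path or URL,
// or undefined when the extension is not in the table.
function contentTypeFromPath(path) {
  // Drop any query string or fragment before looking at the extension.
  const clean = path.split(/[?#]/)[0];
  const dot = clean.lastIndexOf('.');
  if (dot === -1) return undefined;
  return EXTENSIONS.get(clean.slice(dot + 1).toLowerCase());
}

console.log(contentTypeFromPath('http://example.org/myfile.ttl'));
```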

Getting all known content types

With rdfParser.getContentTypes(), you can retrieve a list of all content types for which a parser is available. Note that this method returns a promise that can be await-ed.

rdfParser.getContentTypesPrioritized() returns an object instead, with content types as keys, and numerical priorities as values.

// An array of content types
console.log(await rdfParser.getContentTypes());

// An object of prioritized content types
console.log(await rdfParser.getContentTypesPrioritized());
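
One use for the prioritized object is building an HTTP Accept header for content negotiation. This is a minimal sketch, assuming the priorities are numbers between 0 and 1; toAcceptHeader is a hypothetical helper and the sample priorities are made up for illustration.

```javascript
// Hypothetical helper: turns { contentType: priority } into an Accept header value.
function toAcceptHeader(priorities) {
  return Object.entries(priorities)
    // Highest priority first.
    .sort(([, a], [, b]) => b - a)
    // A q-value of 1 is the default and can be omitted;
    // other values are limited to three decimals.
    .map(([type, q]) => (q >= 1 ? type : `${type};q=${Number(q.toFixed(3))}`))
    .join(',');
}

// Made-up priorities; real values would come from
// rdfParser.getContentTypesPrioritized().
const sample = {
  'application/ld+json': 0.9,
  'text/turtle': 1,
  'application/rdf+xml': 0.8,
};
console.log(toAcceptHeader(sample));
```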

Obtaining prefixes

Using the 'prefix' event, you can obtain the prefixes declared in documents in formats such as Turtle and TriG.

rdfParser.parse(textStream, { contentType: 'text/turtle' })
    .on('prefix', (prefix, iri) => console.log(prefix + ':' + iri))
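
Collected prefixes are useful for abbreviating IRIs when pretty-printing. The sketch below assumes the prefixes have already been gathered into a plain object of strings; compactIri is a hypothetical helper, and whether the iri argument of the event is a string or an RDF/JS term is treated as an assumption in the comment.

```javascript
// Hypothetical helper: abbreviates an IRI using collected prefixes.
// `prefixes` maps a prefix label to its namespace IRI.
function compactIri(iri, prefixes) {
  for (const [prefix, namespace] of Object.entries(prefixes)) {
    if (iri.startsWith(namespace)) {
      return `${prefix}:${iri.slice(namespace.length)}`;
    }
  }
  // No matching namespace: fall back to the full IRI in angle brackets.
  return `<${iri}>`;
}

// Prefixes as they might be recorded in a 'prefix' listener, e.g.
//   .on('prefix', (prefix, iri) => { prefixes[prefix] = iri.value ?? String(iri); })
// (assumption: iri is either a string or an object with a .value property).
const prefixes = { ex: 'http://ex.org/', foaf: 'http://xmlns.com/foaf/0.1/' };
console.log(compactIri('http://ex.org/s', prefixes));
```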

Obtaining contexts

Using the 'context' event, you can obtain all contexts (@context) when parsing JSON-LD documents.

Multiple contexts can be found, and the context values that are emitted correspond exactly to the context value as included in the JSON-LD document.

rdfParser.parse(textStream, { contentType: 'application/ld+json' })
    .on('context', (context) => console.log(context))

License

This software is written by Ruben Taelman.

This code is released under the MIT license.

rdf-parse.js's Issues

An in-range update of @types/n3 is breaking the build 🚨

The devDependency @types/n3 was updated from 1.1.3 to 1.1.4.

🚨 View failing branch.

This version is covered by your current version range and after updating it in your project the build failed.

@types/n3 is a devDependency of this project. It might not break your production code or affect downstream projects, but probably breaks your build or test tools, which may prevent deploying or publishing.

Status Details
  • continuous-integration/travis-ci/push: The Travis CI build could not complete due to an error (Details).

FAQ and help

There is a collection of frequently asked questions. If those don’t help, you can always ask the humans behind Greenkeeper.


Your Greenkeeper Bot 🌴

An in-range update of @types/jest is breaking the build 🚨


☝️ Important announcement: Greenkeeper will be saying goodbye 👋 and passing the torch to Snyk on June 3rd, 2020! Find out how to migrate to Snyk and more at greenkeeper.io


The devDependency @types/jest was updated from 25.1.3 to 25.1.4.

🚨 View failing branch.

This version is covered by your current version range and after updating it in your project the build failed.

@types/jest is a devDependency of this project. It might not break your production code or affect downstream projects, but probably breaks your build or test tools, which may prevent deploying or publishing.

Status Details
  • continuous-integration/travis-ci/push: The Travis CI build could not complete due to an error (Details).

FAQ and help

There is a collection of frequently asked questions. If those don’t help, you can always ask the humans behind Greenkeeper.


Your Greenkeeper Bot 🌴

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

Open

These updates have all been created already. Click a checkbox below to force a retry/rebase of any.

  • Update Comunica monorepo packages to v3 (major) (@comunica/actor-http-fetch, @comunica/actor-http-proxy, @comunica/actor-rdf-parse-html, @comunica/actor-rdf-parse-html-microdata, @comunica/actor-rdf-parse-html-rdfa, @comunica/actor-rdf-parse-html-script, @comunica/actor-rdf-parse-jsonld, @comunica/actor-rdf-parse-n3, @comunica/actor-rdf-parse-rdfxml, @comunica/actor-rdf-parse-shaclc, @comunica/actor-rdf-parse-xml-rdfa, @comunica/bus-http, @comunica/bus-init, @comunica/bus-rdf-parse, @comunica/bus-rdf-parse-html, @comunica/config-query-sparql, @comunica/core, @comunica/mediator-combine-pipeline, @comunica/mediator-combine-union, @comunica/mediator-number, @comunica/mediator-race, @comunica/runner)
  • Update dependency arrayify-stream to v2
  • Update dependency componentsjs-generator to v4
  • Update dependency typescript to v5
  • Click on this checkbox to rebase all open PRs at once

Detected dependencies

npm
package.json
  • @comunica/actor-http-fetch ^2.0.1
  • @comunica/actor-http-proxy ^2.0.1
  • @comunica/actor-rdf-parse-html ^2.0.1
  • @comunica/actor-rdf-parse-html-microdata ^2.0.1
  • @comunica/actor-rdf-parse-html-rdfa ^2.0.1
  • @comunica/actor-rdf-parse-html-script ^2.0.1
  • @comunica/actor-rdf-parse-jsonld ^2.0.1
  • @comunica/actor-rdf-parse-n3 ^2.0.1
  • @comunica/actor-rdf-parse-rdfxml ^2.0.1
  • @comunica/actor-rdf-parse-shaclc ^2.6.2
  • @comunica/actor-rdf-parse-xml-rdfa ^2.0.1
  • @comunica/bus-http ^2.0.1
  • @comunica/bus-init ^2.0.1
  • @comunica/bus-rdf-parse ^2.0.1
  • @comunica/bus-rdf-parse-html ^2.0.1
  • @comunica/config-query-sparql ^2.0.1
  • @comunica/core ^2.0.1
  • @comunica/mediator-combine-pipeline ^2.0.1
  • @comunica/mediator-combine-union ^2.0.1
  • @comunica/mediator-number ^2.0.1
  • @comunica/mediator-race ^2.0.1
  • @rdfjs/types *
  • readable-stream ^4.3.0
  • stream-to-string ^1.2.0
  • @comunica/runner ^2.0.3
  • @types/jest ^29.0.0
  • @types/n3 ^1.10.4
  • arrayify-stream ^1.0.0
  • componentsjs-generator ^3.0.1
  • jest ^29.0.0
  • jest-rdf ^1.7.0
  • manual-git-changelog ^1.0.1
  • pre-commit ^1.2.2
  • rdf-data-factory ^1.1.0
  • rdf-dereference ^2.1.0
  • rdf-quad ^1.5.0
  • rdf-test-suite ^1.18.0
  • streamify-string ^1.0.1
  • ts-jest ^29.0.0
  • ts-node ^10.9.1
  • tslint ^6.0.0
  • tslint-eslint-rules ^5.3.1
  • typescript ^4.0.0

  • Check this box to trigger a request for Renovate to run again on this repository

Action Required: Fix Renovate Configuration

There is an error with this repository's Renovate configuration that needs to be fixed. As a precaution, Renovate will stop PRs until it is resolved.

Error type: undefined. Note: this is a nested preset so please contact the preset author if you are unable to fix it yourself.

Listed RDF serialization support does not match `CONTENT_MAPPINGS`

In particular

private static readonly CONTENT_MAPPINGS: { [id: string]: string } = {
  ttl : "text/turtle",
  turtle : "text/turtle",
  nt : "application/n-triples",
  ntriples : "application/n-triples",
  nq : "application/n-quads",
  nquads : "application/n-quads",
  rdf : "application/rdf+xml",
  rdfxml : "application/rdf+xml",
  owl : "application/rdf+xml",
  n3 : "text/n3",
  shc : "text/shaclc",
  shaclc : "text/shaclc",
  shce : "text/shaclc-ext",
  shaclce : "text/shaclc-ext",
  trig : "application/trig",
  jsonld : "application/ld+json",
  json : "application/ld+json",
};

does not include any of the RDFa or microdata content types.
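
The gap can be made concrete by comparing the extensions listed in the README against the keys of the mapping. This is a small sketch with both lists transcribed by hand from the text on this page; missingExtensions is a hypothetical helper.

```javascript
// Extensions from the README's list of supported serializations.
const documented = [
  'trig', 'nq', 'nquads', 'ttl', 'turtle', 'nt', 'ntriples', 'n3',
  'json', 'jsonld', 'rdf', 'rdfxml', 'owl',
  'html', 'htm', 'xhtml', 'xht', 'xml', 'svg', 'svgz',
  'shaclc', 'shc', 'shaclce', 'shce',
];

// Keys of CONTENT_MAPPINGS as quoted in this issue.
const mapped = new Set([
  'ttl', 'turtle', 'nt', 'ntriples', 'nq', 'nquads', 'rdf', 'rdfxml', 'owl',
  'n3', 'shc', 'shaclc', 'shce', 'shaclce', 'trig', 'jsonld', 'json',
]);

// Hypothetical helper: extensions that are documented but have no mapping.
function missingExtensions(listed, known) {
  return listed.filter((ext) => !known.has(ext));
}

console.log(missingExtensions(documented, mapped));
```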

parse method is undefined

I did npm install rdf-parse and ended up with

"rdf-parse": "^2.1.1",

in my package.json. Then doing:

import rdfParser from "rdf-parse"
....
rdfParser.parse(someStream, { contentType: 'text/turtle'})

I end up with TypeError: rdfParser.parse is not a function
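
One possible cause, consistent with the rdfParser.default.parse call that appears in another issue on this page, is that the importing toolchain hands back the module object rather than its default export. As a diagnostic sketch only (unwrapDefault and the fake module objects below are hypothetical), the parser can be located defensively:

```javascript
// Hypothetical helper: returns whichever object actually carries parse().
function unwrapDefault(imported) {
  if (imported && typeof imported.parse === 'function') return imported;
  if (imported && imported.default && typeof imported.default.parse === 'function') {
    return imported.default;
  }
  throw new TypeError('No parse() function found on the imported module');
}

// Fake module shapes standing in for the two ways the import can resolve.
const parser = { parse: () => 'parsed' };
const direct = parser;               // default export already unwrapped
const wrapped = { default: parser }; // module object with a default property

console.log(unwrapDefault(direct) === unwrapDefault(wrapped));
```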

An in-range update of @comunica/actor-rdf-parse-jsonld is breaking the build 🚨

The dependency @comunica/actor-rdf-parse-jsonld was updated from 1.9.0 to 1.9.2.

🚨 View failing branch.

This version is covered by your current version range and after updating it in your project the build failed.

@comunica/actor-rdf-parse-jsonld is a direct dependency of this project, and it is very likely causing it to break. If other packages depend on yours, this update is probably also breaking those in turn.

Status Details
  • continuous-integration/travis-ci/push: The Travis CI build could not complete due to an error (Details).

FAQ and help

There is a collection of frequently asked questions. If those don’t help, you can always ask the humans behind Greenkeeper.


Your Greenkeeper Bot 🌴

An in-range update of @comunica/actor-rdf-parse-xml-rdfa is breaking the build 🚨

The dependency @comunica/actor-rdf-parse-xml-rdfa was updated from 1.9.0 to 1.9.2.

🚨 View failing branch.

This version is covered by your current version range and after updating it in your project the build failed.

@comunica/actor-rdf-parse-xml-rdfa is a direct dependency of this project, and it is very likely causing it to break. If other packages depend on yours, this update is probably also breaking those in turn.

Status Details
  • continuous-integration/travis-ci/push: The Travis CI build could not complete due to an error (Details).

FAQ and help

There is a collection of frequently asked questions. If those don’t help, you can always ask the humans behind Greenkeeper.


Your Greenkeeper Bot 🌴

Module causes Error when using Vite

Hi Ruben!

First of all, thank you very much for your work! I really appreciate your contributions to the RDF eco-system!

I'm using Vite as my build tool and I'm having an issue with running a build with the rdf-parse module. Whenever I'm loading the served application in a browser, I'm getting the following error:

Uncaught TypeError: Object.defineProperty called on non-object
    at Function.defineProperty (<anonymous>)
    at One (index-dd57c101.js:128:20209)
    at Z4 (index-dd57c101.js:128:21277)
    at tc (index-dd57c101.js:128:23249)
    at index-dd57c101.js:128:15488

pointing to this particular expression

Object.defineProperty(wd,"__esModule",{value:!0})

My vite.config.ts looks like this

import {defineConfig} from 'vite'
import vue from '@vitejs/plugin-vue'
import * as path from "path";
import { NodeModulesPolyfillPlugin } from '@esbuild-plugins/node-modules-polyfill'

// https://vitejs.dev/config/
export default defineConfig({
    build: {
        target: "es2020",  // https://github.com/sveltejs/kit/issues/859
        commonjsOptions: {
            include: ['tailwind-config.cjs', 'node_modules/**'],
        },
    },
    optimizeDeps: {
        esbuildOptions: {
            target: "es2020",
            // Enable esbuild polyfill plugins https://stackoverflow.com/q/73954820
            plugins: [
                NodeModulesPolyfillPlugin()
            ]
        },
        include: ['tailwind-config'],
    },
    plugins: [vue()],
    assetsInclude: ['**/*.md'],
    resolve: {
        alias: {
             ...
        },
    },
})

I think that I'm encountering the same issue as described here.

Do you have an idea what this issue could be here?

An in-range update of @comunica/actor-rdf-parse-html-rdfa is breaking the build 🚨

The dependency @comunica/actor-rdf-parse-html-rdfa was updated from 1.9.0 to 1.9.2.

🚨 View failing branch.

This version is covered by your current version range and after updating it in your project the build failed.

@comunica/actor-rdf-parse-html-rdfa is a direct dependency of this project, and it is very likely causing it to break. If other packages depend on yours, this update is probably also breaking those in turn.

Status Details
  • continuous-integration/travis-ci/push: The Travis CI build could not complete due to an error (Details).

FAQ and help

There is a collection of frequently asked questions. If those don’t help, you can always ask the humans behind Greenkeeper.


Your Greenkeeper Bot 🌴

An in-range update of @comunica/actor-rdf-parse-n3 is breaking the build 🚨

The dependency @comunica/actor-rdf-parse-n3 was updated from 1.9.2 to 1.9.3.

🚨 View failing branch.

This version is covered by your current version range and after updating it in your project the build failed.

@comunica/actor-rdf-parse-n3 is a direct dependency of this project, and it is very likely causing it to break. If other packages depend on yours, this update is probably also breaking those in turn.

Status Details
  • continuous-integration/travis-ci/push: The Travis CI build failed (Details).

FAQ and help

There is a collection of frequently asked questions. If those don’t help, you can always ask the humans behind Greenkeeper.


Your Greenkeeper Bot 🌴

An in-range update of @comunica/actor-rdf-parse-html is breaking the build 🚨

The dependency @comunica/actor-rdf-parse-html was updated from 1.9.0 to 1.9.2.

🚨 View failing branch.

This version is covered by your current version range and after updating it in your project the build failed.

@comunica/actor-rdf-parse-html is a direct dependency of this project, and it is very likely causing it to break. If other packages depend on yours, this update is probably also breaking those in turn.

Status Details
  • continuous-integration/travis-ci/push: The Travis CI build could not complete due to an error (Details).

FAQ and help

There is a collection of frequently asked questions. If those don’t help, you can always ask the humans behind Greenkeeper.


Your Greenkeeper Bot 🌴

browserify or webpack: action.input.on is not a function

I hoped that just running browserify ./node_modules/rdf-parse/index.js -o rdfParser.js -s rdfParse would work, but I get the following error when executing the code below in the browser:

Uncaught (in promise) TypeError: action.input.on is not a function
    at ActorRdfParseHtml.js:101

    import rdfParser from "rdf-parse";
    var accept = 'application/trig;q=1.0,application/n-quads,application/ld+json;q=.9,application/rdf+xml;q=.8,text/turtle,application/n-triples';

    const myInit = {
      method: 'GET',
      headers: { 'accept': accept },
      mode: 'cors',
      cache: 'default'
    };

    try {
      const response = await fetch(url, myInit);

      if (response.status !== 200) {
        throw new Error(await response.text());
      }

      const quads = [];
      rdfParser.parse(response.body, { contentType: response.headers.get('content-type'), baseIRI: response.url })
        .on('data', (quad) => quads.push(quad))
        .on('error', (error) => reject(error))
        .on('end', async () => {
          resolve(quads);
        });
    } catch (e) {
      console.error(e);
      reject(e);
    }
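
A plausible cause is that response.body from the fetch API is a WHATWG ReadableStream, which has no .on() method, whereas the parser expects a Node.js-style stream. The sketch below shows one way to bridge the two in recent Node.js versions (18 or later is assumed) using Readable.fromWeb; toNodeStream is a hypothetical helper, and a browser bundle would need an equivalent conversion from a stream polyfill instead.

```javascript
const { Readable } = require('node:stream');

// Hypothetical helper: returns a stream with .on(), converting a
// WHATWG ReadableStream when necessary.
function toNodeStream(body) {
  if (typeof body.on === 'function') return body; // already Node-style
  return Readable.fromWeb(body);
}

// Stand-in for response.body: a web stream yielding one Turtle chunk.
const webStream = new ReadableStream({
  start(controller) {
    controller.enqueue(
      new TextEncoder().encode('<http://ex.org/s> <http://ex.org/p> <http://ex.org/o>.')
    );
    controller.close();
  },
});

// The converted stream is what would be handed to rdfParser.parse.
const nodeStream = toNodeStream(webStream);
console.log(typeof nodeStream.on);
```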

Remove default export

As shown in #46, default exports are not working well in ESM, so we should remove them in the next major version.

Having trouble passing JSON-LD context using options

Apologies for not having bottomed out tracing, but I got turned around in the action dispatching code and the translation from TS to JS. I'm trying to parse JSON-LD with the context passed in through options rather than specified inline in the data stream. All of the documentation seems to suggest that this should be doable via a context property on the options object - however, it doesn't seem to be picking that member up in the parser because my returned quad stream has 0 members.

Here's the test that I'm trying to run to verify...

context.only('when loading a JSON-LD graph from an in-memory string', () => {
    const jsonLdInput = '{"@id": "policy/f9ee1b01-5eb4-4fa7-91ac-6877e00e32e6","@type": "Policy","expenseTypeLimit": {"expenseType": { "@id": "expense-type/fe382b00-003f-4bd5-a490-79b052b89446" },"employee": { "@id": "user/d20a1b6b-6e9f-4bc9-b313-dd58f0a5d106" },"blockingExpenseTypeLimit": {"@type": "xsd:double","@value": "100.10"},"warningExpenseTypeLimit": {"@type": "xsd:double","@value": "50.00"}}}';
    const context = {
        '@vocab': "https://schema.acme.com/",
        '@base': 'https://graph.acme.com/',
        'xsd': 'http://www.w3.org/2001/XMLSchema#'
    };

    let data;

    beforeEach(() => {
        data = Readable.from(jsonLdInput);
    });

    it('should have the correct number of nodes', async () => {
        const ds = await load({ contentType: contentTypes.jsonld, context }, data);
        ds.size.should.eql(6);
    });
});

and here's the code that it's exercising that uses rdf-parse

export const load = R.curry(async (options, data) => {
  return new Promise((resolve, reject) => {
    const ds = rdf.dataset();

    rdfParser.default.parse(data, options)
    .on('data', q => {
      ds.add(quadMod(options)(q));
    })
    .on('error', e => {
      // console.error(e);
      reject(new Error('Unable to parse data as an RDF dataset'));
    })
    .on('end', () => {
      debugger
      resolve(ds);
    });
  });
});

Again, apologies if this is just stupid user error, but I've been stuck on it long enough now that I wanted to reach out and ask for help.

OWL support?

Thanks for making this library! Any plans to support OWL?

An in-range update of @comunica/actor-rdf-parse-jsonld is breaking the build 🚨

The dependency @comunica/actor-rdf-parse-jsonld was updated from 1.6.6 to 1.7.1.

🚨 View failing branch.

This version is covered by your current version range and after updating it in your project the build failed.

@comunica/actor-rdf-parse-jsonld is a direct dependency of this project, and it is very likely causing it to break. If other packages depend on yours, this update is probably also breaking those in turn.

Status Details
  • continuous-integration/travis-ci/push: The Travis CI build failed (Details).

FAQ and help

There is a collection of frequently asked questions. If those don’t help, you can always ask the humans behind Greenkeeper.


Your Greenkeeper Bot 🌴

Prefix events

A feature request for developers:

When working with data in JavaScript, and in order to pretty-print it later on, it’s useful to know what prefixes were used in the original file. Could we emit prefix events whenever a parser encounters a new prefix? Of course, this would only be implemented for those serializations that support prefixes.

Potential stream piping problem

The problem described here may occur in this lib as well: rdfjs/rdfxml-streaming-parser.js#35

It appears to occur sometimes with streamify-string.

Reproduce as follows:

const rdfParser = require("rdf-parse").default;
const Fetcher = require("./lib/NodeHttpFetcher.js"); // the one from the ldfetch project
//const textStream = require('streamify-string');
var textStream = require('string-to-stream')

var fetcher = new Fetcher("application/ld+json");

fetcher.get('http://n076-12.wall1.ilabt.iminds.be/geonames-suffix-tree/node0.jsonld').then((response) => {
  console.log(response);
  rdfParser.parse(textStream(response.body, {encoding: 'utf-8', objectMode: true}),  { contentType: "application/ld+json", baseIRI: "http://ex.org/" })
    .on('data', (quad) => console.log(quad))
    .on('error', (error) => {
      console.error(error)
    })
    .on('end', () => {
      console.log('end');
    });

});

Needless circular import between index and engine-default

The index module re-exports both the named exports from lib/RdfParser and the default export from engine-default. This makes sense.

The engine-default module imports RdfParser from the index. This does not make sense; it could import that class directly from lib/RdfParser instead. This also breaks rollup builds.

I would submit a pull request with a fix, but the engine-default module is automagically generated from an RDF DSL that I don't sufficiently understand (it does look cool, though!). Could you help me out, @rubensworks?

For the time being, I work around this by importing the default from rdf-parse/engine-default instead of from rdf-parse.
