ruby-rdf / rdf
RDF.rb is a pure-Ruby library for working with Resource Description Framework (RDF) data.
Home Page: http://rubygems.org/gems/rdf
License: The Unlicense
Hello,
Since I updated to rdf-0.3.2, when I run:
require 'rdf'
src = %{
<http://rdf.rubyforge.org/RDF/Writer.html#insert_graph> <http://www.w3.org/1999/02/22-rdf-syntax-ns#label> "Writer#insert_graph test" .
}
reader = RDF::Reader.for(:ntriples).new(src)
graph = RDF::Graph.new << reader
RDF::Writer.open("insert_graph.nt") do |writer|
writer.insert_graph graph
end
it raises:
insert_graph.rb:11: protected method `insert_graph' called for #<RDF::NTriples::Writer:0x1020d2548> (NoMethodError)
from /Library/Ruby/Gems/1.8/gems/rdf-0.3.2/lib/rdf/writer.rb:186:in `call'
from /Library/Ruby/Gems/1.8/gems/rdf-0.3.2/lib/rdf/writer.rb:186:in `initialize'
from /Library/Ruby/Gems/1.8/gems/rdf-0.3.2/lib/rdf/writer.rb:155:in `new'
from /Library/Ruby/Gems/1.8/gems/rdf-0.3.2/lib/rdf/writer.rb:155:in `open'
from /Library/Ruby/Gems/1.8/gems/rdf-0.3.2/lib/rdf/writer.rb:154:in `open'
from insert_graph.rb:10
If I use the method #write_graph instead, it works as expected, but the source code (lib/rdf/writer.rb:284) says:
# @deprecated replace by `RDF::Writable#insert_graph`
Am I missing something?
Thanks!
I just tracked down an issue in Spira that could have been found sooner if we had a repository that performed validation before writing things down: a predicate was being saved as a string. It would be useful for testing if we had a version of RDF::Repository that performed input validation.
So I am thinking something like this:
RDF::Validating::Repository.new
or
RDF::Repository.new(:validate => true)
Whereupon:
RDF::Repository << RDF::Statement.new(RDF::DC.title, "a string", "another string")
#=> RDF::TypeError: Statement predicate must respond to #to_uri
If I implemented either of these, is that something you'd want to have available in core?
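To make the idea concrete, here is a minimal sketch of a store that validates on insert. SketchStatement, UriLike and ValidatingStore are hypothetical stand-ins, not RDF.rb classes, and the only check shown is the one from the example above (the predicate must respond to #to_uri).

```ruby
# Hypothetical stand-ins for illustration; these are not RDF.rb classes.
SketchStatement = Struct.new(:subject, :predicate, :object)

# A tiny URI-like object, assumed only for this example.
UriLike = Struct.new(:value) do
  def to_uri
    self
  end
end

class ValidatingStore
  def initialize
    @statements = []
  end

  # Appends a statement after checking that its predicate looks like a URI.
  def <<(statement)
    unless statement.predicate.respond_to?(:to_uri)
      raise TypeError, "Statement predicate must respond to #to_uri"
    end
    @statements << statement
    self
  end

  def count
    @statements.size
  end
end
```

A string predicate is rejected before anything is stored, so a test run against such a store would fail at the point of the bad insert rather than later.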
I just did a local install of the latest checked in source (0.3.0.pre). A simple vocabulary expansion results in a "can't modify frozen object" error.
[rdf] irb
ruby-1.9.2-p0 > require 'rdf'
=> true
ruby-1.9.2-p0 > RDF::FOAF.to_uri
RuntimeError: can't modify frozen object
from /Users/gregg/.rvm/gems/ruby-1.9.2-p0/gems/rdf-0.2.3/lib/rdf/util/cache.rb:58:in `define_finalizer'
from /Users/gregg/.rvm/gems/ruby-1.9.2-p0/gems/rdf-0.2.3/lib/rdf/util/cache.rb:58:in `define_finalizer!'
from /Users/gregg/.rvm/gems/ruby-1.9.2-p0/gems/rdf-0.2.3/lib/rdf/util/cache.rb:93:in `[]='
from /Users/gregg/.rvm/gems/ruby-1.9.2-p0/gems/rdf-0.2.3/lib/rdf/model/uri.rb:57:in `intern'
from /Users/gregg/.rvm/gems/ruby-1.9.2-p0/gems/rdf-0.2.3/lib/rdf/vocab.rb:93:in `to_uri'
from (irb):2
from /Users/gregg/.rvm/rubies/ruby-1.9.2-p0/bin/irb:17:in `<main>'
Since interned RDF::URI instances are global to a Ruby process, being shared across different threads and varying use cases, they should be immutable in more than just principle.
The way to ensure this is for RDF::URI.intern to call #freeze whenever it constructs a new URI instance, which will then cause Ruby to throw a RuntimeError: can't modify frozen object exception if somebody inadvertently tries to modify a returned URI object.
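The interning-plus-freeze idea can be sketched as follows. InternedURI is a hypothetical stand-in for RDF::URI, and the cache is a plain Hash rather than the library's own cache class.

```ruby
# Hypothetical stand-in for RDF::URI, showing interning with #freeze.
class InternedURI
  CACHE = {}

  attr_accessor :value

  def initialize(value)
    @value = value.to_s.dup
  end

  # Returns one shared, frozen instance per distinct string.
  def self.intern(str)
    CACHE[str.to_s] ||= new(str).freeze
  end

  # Freeze the wrapped string too, so it cannot be mutated in place.
  def freeze
    @value.freeze
    super
  end
end

uri = InternedURI.intern("http://example.org/")
begin
  uri.value = "http://example.org/changed"
rescue RuntimeError => e
  # On current Rubies this is a FrozenError, a subclass of RuntimeError.
  puts "rejected: #{e.class}"
end
```

Note that freezing the object alone only blocks reassignment of instance variables; the wrapped string has to be frozen as well to stop in-place edits.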
URI for XHTML vocabulary should be "http://www.w3.org/1999/xhtml" not "http://www.w3.org/1999/xhtml#".
This does lead to unfortunate URIs based on CURIEs or QNames, but that's required.
The namespace name http://www.w3.org/1999/xhtml is intended for use in various specifications such as:
Recommendations:
XHTML™ 1.0: The Extensible HyperText Markup Language
XHTML Modularization
XHTML 1.1
XHTML Basic
XHTML Print
XHTML+RDFa
RDF::Mutable does not open URIs via load:
RDF::Repository.load('http://datagraph.org/jhacker/foaf.nt')
Errno::ENOENT: No such file or directory - http://datagraph.org/jhacker/foaf.nt
from /opt/local/lib/ruby/gems/1.8/gems/rdf-0.1.1/lib/rdf/reader.rb:107:in `initialize'
...
Addressable::URI 2.2.0 adds some important fixes to URI format checking. If another gem includes Addressable 2.2.0, RDF will fail when loading with the following:
RubyGem version error: addressable(2.2.0 not ~> 2.1.2) (Gem::LoadError)
The following works in 1.8 but not 1.9 (forgive the invalid ntriples as input):
require 'rdf'
s = RDF::NTriples.unserialize '<http://openlibrary.org/b/OL3M> <http://RDVocab.info/Elements/titleProper> "Jhūlā." '
RDF::NTriples.serialize(s)
1.8:
ben:rdf ben$ irb
>> require 'rdf'
=> true
>> s = RDF::NTriples.unserialize '<http://openlibrary.org/b/OL3M> <http://RDVocab.info/Elements/titleProper> "Jhūlā." '
=> #<RDF::Statement:0x90bbb8(<http://openlibrary.org/b/OL3M> <http://RDVocab.info/Elements/titleProper> "Jhūlā." .)>
>> RDF::NTriples.serialize(s)
=> "<http://openlibrary.org/b/OL3M> <http://RDVocab.info/Elements/titleProper> "Jh\305\253l\304\201." .\n"
1.9:
ben:rdf ben$ irb1.9
irb(main):001:0> require 'rdf'
=> true
irb(main):002:0> s = RDF::NTriples.unserialize '<http://openlibrary.org/b/OL3M> <http://RDVocab.info/Elements/titleProper> "Jhūlā." '
=> #<RDF::Statement:0x93f260(<http://openlibrary.org/b/OL3M> <http://RDVocab.info/Elements/titleProper> "Jhūlā." .)>
irb(main):003:0> RDF::NTriples.serialize(s)
=> "<http://openlibrary.org/b/OL3M> <http://RDVocab.info/Elements/titleProper> \"Jhūlā.\" .\n"
Ruby's Time class can represent either a datetime or just a time by itself. Currently, however, RDF.rb treats Time instances as if they always straightforwardly mapped to the XSD.time datatype. This is clearly wrong, as the following demonstrates:
>> RDF::Literal.new(Time.parse("2010-12-31T12:34:56Z"))
=> #<RDF::Literal::Time:0x80f9f378("12:34:56Z"^^<http://www.w3.org/2001/XMLSchema#time>)>
We need additional logic in RDF::Literal.new to ensure we correctly map Time instances to the XSD.dateTime datatype when the object in question contains a date component as well.
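For reference, the two lexical forms in question can be produced with the Ruby standard library alone. This sketch only shows the formatting; how RDF::Literal.new should decide which of the two applies is left open, and the variable names are illustrative.

```ruby
require 'time'

t = Time.parse("2010-12-31T12:34:56Z").utc

# Lexical form that keeps the date component (xsd:dateTime).
date_time_form = t.xmlschema

# Lexical form that drops the date component (xsd:time); the literal "Z"
# suffix is only correct because t is in UTC.
time_form = t.strftime("%H:%M:%SZ")

puts date_time_form  # 2010-12-31T12:34:56Z
puts time_form       # 12:34:56Z
```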
The current implementation of Reader/Writer#prefix takes an optional uri to associate with the prefix. In fact, this may not be a URI at all. The only requirement is that when the prefix value is attached to a suffix, the result is a URI. Consider these rules from RDF/XML, used for creating the prefix mappings required for defining predicate relationships:
An XML namespace-qualified name (QName) has restrictions on the legal characters such that not all property URIs can be expressed
as these names. It is recommended that implementors of RDF serializers, in order to break a URI into a namespace name and a local
name, split it after the last XML non-NCName character, ensuring that the first character of the name is a Letter or '_'. If the
URI ends in a non-NCName character then throw a "this graph cannot be serialized in RDF/XML" exception or error.
One of the RDFa tests verifies that, without prefix mappings, dc:title will be treated as a URI, not a CURIE. It is, in fact, a valid URI. Following the process outlined above, you come up with a prefix mapping of "dc:", which, when applied to the suffix "title", regenerates the original URI "dc:title".
The change needed to #prefix would be to not cast the uri parameter to an RDF::URI, but just intern it as a string:
def prefix(name, uri = nil)
name = name.to_s.empty? ? nil : (name.respond_to?(:to_sym) ? name.to_sym : name.to_s.to_sym)
uri.nil? ? prefixes[name] : prefixes[name] = (uri.respond_to?(:to_sym) ? uri.to_sym : uri.to_s.to_sym)
end
After installing the rdf gem I got this error:
NameError: uninitialized constant RDF
Example shown here:
$ irb -rrdf
>> Gem.loaded_specs['rdf'].version
=> #<Gem::Version "0.3.0">
>> RDF.type.qname.join(':')
=> "http://www.w3.org/1999/02/22-rdf-syntax-ns#__prefix__:type"
>>
RDF::Reader has a resource leak bug. It must close its input file after finishing a given block. If no block is given when instantiating, the close method should be called explicitly. I found this issue when I tried to load more than 200 Turtle files into a repository.
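The behaviour being asked for is the usual block-with-ensure pattern. The sketch below uses a hypothetical SketchReader class and is not the code from the linked patch.

```ruby
require 'stringio'

# Hypothetical reader wrapper; not the actual RDF::Reader implementation.
class SketchReader
  # With a block: yields the reader and always closes it afterwards.
  # Without a block: returns the reader, and the caller must call #close.
  def self.open(io)
    reader = new(io)
    return reader unless block_given?
    begin
      yield reader
    ensure
      reader.close # runs even if the block raises
    end
  end

  def initialize(io)
    @io = io
  end

  def each_line(&block)
    @io.each_line(&block)
  end

  def close
    @io.close unless @io.closed?
  end
end
```

With this shape, loading hundreds of files in a loop holds at most one open handle at a time.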
My patch is here:
http://github.com/fumi/rdf/commit/53957144f354eca9dab39e3a9ddbe620d0dbef86
The RDF vocabulary is defined and usable but not actually documented.
Consider the following:
RDF::Literal.new("10", :datatype => "http://www.w3.org/2001/XMLSchema#integer").datatype.inspect
Note that this is a string, and not a URI. This is because Literal.new does a case comparison by first typecasting the datatype to a URI, but not using that type-casted value in the instantiation of a subclass.
open-uri has a :proxy option - we currently can't use rdf.rb for a client as their internal network uses a proxy to get out (yes, they're consuming their own data...).
The current implementation of RDF::Literal has some default handling for dates, floats, and so forth, but it's somewhat inflexible and not extensible. The system ought to provide a way for different XSD types to do different things with different Ruby classes, so that one could, for example, get an XSD.float as a Rational, or an XSD.XMLLiteral as a parsed Nokogiri object.
The attached code runs the same test three times, each time with a larger source file. The test consists of: create a new graph, load the source document into the graph, identify a list of concept resources, and query for the rdfs:label of each concept resource. The time taken for the last step grows out of proportion to the size of the input document.
Here's the output I get on my machine:
ian@rowan-15 $ ruby rdf_misc_tests.rb
Loaded suite rdf_misc_tests
Started
Initializing with account-code.ttl
... parsing complete in 1.1s producing 4711 triples
... got code list root, now indexing
... got 587 concepts to index in 0.1s
... collected names in 17.3s.
4241.37 triples/sec parsing, 5579.79 resources/sec query, collected 34.00 names/sec
.Initializing with programme-object-group-code.ttl
... parsing complete in 3.1s producing 15895 triples
... got code list root, now indexing
... got 1985 concepts to index in 0.4s
... collected names in 207.1s.
5086.36 triples/sec parsing, 5476.99 resources/sec query, collected 9.59 names/sec
.Initializing with programme-object-code.ttl
... parsing complete in 16.7s producing 38855 triples
... got code list root, now indexing
... got 4855 concepts to index in 0.9s
... collected names in 1286.2s.
2333.01 triples/sec parsing, 5188.34 resources/sec query, collected 3.77 names/sec
.
Finished in 1533.469101951 seconds.
3 tests, 0 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications
100% passed
Note before running the test that the last step takes over 20 minutes. For reference, I'm using Ruby 1.9.1 on a four-core 64-bit Linux machine with 8 GB of memory. Ruby version says:
ian@rowan-15 $ ruby -v
ruby 1.9.1p378 (2010-01-10 revision 26273) [x86_64-linux]
~/workspace/coins/ruby/bugrep
I'm using the following version of RDF.rb:
ian@rowan-15 $ gem list --local | grep rdf
rdf (0.2.1)
rdf-raptor (0.4.0)
rdf_context (0.5.6)
Ah. Just realised that I can't attach a file to this issue report (unless I'm missing something on github). Code is here: http://iandickinson.me.uk/download/rdf-ruby-perftest.tar
Consider allowing a vocabulary to be assigned to a URI, such as might happen from uri = RDF::FOAF.name, which could have a side-effect of setting uri.vocab to RDF::FOAF. This would remove the O(N!) lookup of the URI's vocabulary. Also, a URI#vocab method would be useful in determining the assigned vocabulary of a given URI.
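A rough sketch of the side-effect described above, using hypothetical VocabURI and SketchVocab classes in place of RDF::URI and RDF::Vocabulary:

```ruby
# Hypothetical stand-ins; RDF::URI and RDF::Vocabulary are not used here.
class VocabURI
  attr_reader :value
  attr_accessor :vocab

  def initialize(value)
    @value = value
  end
end

class SketchVocab
  def initialize(base)
    @base = base
  end

  # Builds a term URI and records this vocabulary on it as a side effect.
  def [](term)
    uri = VocabURI.new("#{@base}#{term}")
    uri.vocab = self
    uri
  end
end

foaf = SketchVocab.new("http://xmlns.com/foaf/0.1/")
name = foaf[:name]
puts name.vocab.equal?(foaf) # the vocabulary is read back directly, with no scan
```

This only helps for URIs created through a vocabulary; a URI built from a plain string would still need a lookup.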
$ irb -rrdf
>> p = RDF::URI('http://www.w3.org/ns/rdfa#')
=> #<RDF::URI:0x810d2150(http://www.w3.org/ns/rdfa#)>
>> p.join('term')
=> #<RDF::URI:0x810d0670(http://www.w3.org/ns/rdfa/term)>
I would expect that the result would be:
http://www.w3.org/ns/rdfa#term
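For comparison, here is what Ruby's standard library does with the same inputs. This uses the stdlib URI module, not RDF::URI#join, so it only illustrates the difference between RFC 3986 reference resolution and plain concatenation; neither result matches the RDF.rb output shown above.

```ruby
require 'uri'

base = 'http://www.w3.org/ns/rdfa#'

# RFC 3986 reference resolution: the base fragment is dropped and the last
# path segment ("rdfa") is replaced by the relative reference.
resolved = URI.join(base, 'term').to_s

# Plain concatenation, which is what vocabulary-style term expansion needs.
appended = base + 'term'

puts resolved  # http://www.w3.org/ns/term
puts appended  # http://www.w3.org/ns/rdfa#term
```

So the expected result is what concatenation gives, which suggests the operation wanted here is term expansion rather than reference resolution.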
Consider using the String#rdf_escape and String#rdf_unescape monkey patches. They properly deal with going from UTF-8 to escaped ASCII and back, somewhat based on JSON utf8_to_json.
# coding: utf-8
require 'iconv'
class String
#private
# "Borrowed" from JSON utf8_to_json
RDF_MAP = {
"\x0" => '\u0000',
"\x1" => '\u0001',
"\x2" => '\u0002',
"\x3" => '\u0003',
"\x4" => '\u0004',
"\x5" => '\u0005',
"\x6" => '\u0006',
"\x7" => '\u0007',
"\b" => '\b',
"\t" => '\t',
"\n" => '\n',
"\xb" => '\u000B',
"\f" => '\f',
"\r" => '\r',
"\xe" => '\u000E',
"\xf" => '\u000F',
"\x10" => '\u0010',
"\x11" => '\u0011',
"\x12" => '\u0012',
"\x13" => '\u0013',
"\x14" => '\u0014',
"\x15" => '\u0015',
"\x16" => '\u0016',
"\x17" => '\u0017',
"\x18" => '\u0018',
"\x19" => '\u0019',
"\x1a" => '\u001A',
"\x1b" => '\u001B',
"\x1c" => '\u001C',
"\x1d" => '\u001D',
"\x1e" => '\u001E',
"\x1f" => '\u001F',
'"' => '\"',
'\\' => '\\\\',
'/' => '/',
} # :nodoc:
if defined?(::Encoding)
# Funky way to define constant, but if parsed in 1.8 it generates an 'invalid regular expression' error otherwise
eval %(ESCAPE_RE = %r([\u{80}-\u{10ffff}]))
else
ESCAPE_RE = %r(
[\xc2-\xdf][\x80-\xbf] |
[\xe0-\xef][\x80-\xbf]{2} |
[\xf0-\xf4][\x80-\xbf]{3}
)nx
end
# Convert a UTF8 encoded Ruby string _string_ to an escaped string, encoded with
# UTF16 big endian characters as \U????, and return it.
#
# \\:: Backslash
# \':: Single quote
# \":: Double quote
# \n:: ASCII Linefeed
# \r:: ASCII Carriage Return
# \t:: ASCII Horizontal Tab
# \uhhhh:: character in BMP with Unicode value U+hhhh
# \U00hhhhhh:: character in plane 1-16 with Unicode value U+hhhhhh
def rdf_escape
string = self + '' # XXX workaround: avoid buffer sharing
string.gsub!(/["\\\/\x0-\x1f]/) { RDF_MAP[$&] }
if defined?(::Encoding)
string.force_encoding(Encoding::UTF_8)
string.gsub!(ESCAPE_RE) { |c|
s = c.dump.sub(/\"\\u\{(.+)\}\"/, '\1').upcase
(s.length <= 4 ? "\\u0000"[0,6-s.length] : "\\U00000000"[0,10-s.length]) + s
}
string.force_encoding(Encoding::ASCII_8BIT)
else
string.gsub!(ESCAPE_RE) { |c|
s = Iconv.new('utf-16be', 'utf-8').iconv(c).unpack('H*').first.upcase
"\\u" + s
}
end
string
end
# Unescape characters in strings.
RDF_UNESCAPE_MAP = Hash.new { |h, k| h[k] = k.chr }
RDF_UNESCAPE_MAP.update({
?" => '"',
?\\ => '\\',
?/ => '/',
?b => "\b",
?f => "\f",
?n => "\n",
?r => "\r",
?t => "\t",
?u => nil,
})
if defined?(::Encoding)
UNESCAPE_RE = %r(
(?:\\[\\bfnrt"/]) # Escaped control characters, " and /
|(?:\\U00\h{6}) # 6 byte escaped Unicode
|(?:\\u\h{4}) # 4 byte escaped Unicode
)x
else
UNESCAPE_RE = %r((?:\\[\\bfnrt"/]|(?:\\u(?:[A-Fa-f\d]{4}))+|\\[\x20-\xff]))n
end
# Reverse operation of escape
# From JSON parser
def rdf_unescape
return '' if self.empty?
string = self.gsub(UNESCAPE_RE) do |c|
case c[1,1]
when 'U'
raise RdfException, "Long Unicode escapes not supported in Ruby 1.8" unless defined?(::Encoding)
eval(c.sub(/\\U00(\h+)/, '"\u{\1}"'))
when 'u'
bytes = [c[2, 2].to_i(16), c[4, 2].to_i(16)]
Iconv.new('utf-8', 'utf-16').iconv(bytes.pack("C*"))
else
RDF_UNESCAPE_MAP[c[1]]
end
end
string.force_encoding(Encoding::UTF_8) if defined?(::Encoding)
string
rescue Iconv::Failure => e
raise RdfException, "Caught #{e.class}: #{e}"
end
end
Please can you pass the URI being loaded as :base_uri to readers, so that it is possible to write:
graph = RDF::Graph.load('http://rdfa.digitalbazaar.com/test-suite/test-cases/xhtml1/0001.xhtml')
Instead of:
graph = RDF::Graph.load('http://rdfa.digitalbazaar.com/test-suite/test-cases/xhtml1/0001.xhtml', :base_uri => 'http://rdfa.digitalbazaar.com/test-suite/test-cases/xhtml1/0001.xhtml')
This is the current behavior for non-canonical literals in HEAD:
irb(main):024:0* x = RDF::Literal.new("001", :datatype => RDF::XSD.integer)
=> #<RDF::Literal::Integer:0xb97094("001"^^<http://www.w3.org/2001/XMLSchema#integer>)>
irb(main):025:0> y = x.canonicalize
=> #<RDF::Literal::Integer:0xb96356("1"^^<http://www.w3.org/2001/XMLSchema#integer>)>
irb(main):026:0> y == x
=> true
irb(main):027:0> y.eql? x
=> true
Is this intended? I realized while doing the canonicalize option for rdf-isomorphic that this is the behavior, but this would mean it's not needed.
After creating a query through SPARQL::Client's #select, #prefix, and #where methods I get different results from calling #each_solution vs. #solutions.each/#solutions.size on the query.
Specifically #each_solution is always empty.
c.f. https://gist.github.com/766954 for a runnable example.
Create a new ad-hoc vocabulary such as the following:
foo = RDF::Vocabulary.new("http://foo.com#")
Running Vocabulary.each(&:to_s) should return the newly created vocabulary. This is necessary if you want to be able to use it for URI#qname, for example. Note that if you name the anonymous class, such as
RDF::FOO = Class.new(Vocabulary.new("http://foo.com#"))
It will be enumerated. Perhaps either have a #name= method, or some other way to assign the ad-hoc vocabulary a name. Borrowing from ActiveSupport#constantize:
"RDF::FOO".constantize = Class.new(Vocabulary.new("http://foo.com#"))
Just as literals must be escaped to be represented as valid RDF strings, URIs must also be escaped.
Consider making the following change:
def format_uri(uri, options = {})
"<%s>" % escaped(uri_for(uri))
end
Here are specs I've used:
describe "utf-8 escaped" do
{
%(http://a/D%C3%BCrst) => %(<http://a/D%C3%BCrst>),
%(http://a/D\u00FCrst) => %(<http://a/D\\u00FCrst>),
%(http://b/Dürst) => %(<http://b/D\\u00FCrst>),
%(http://a/\u{15678}another) => %(<http://a/\\U00015678another>),
}.each_pair do |uri, dump|
it "should dump #{uri} as #{dump}" do
RDF::URI.new(uri).to_ntriples.should == dump
end
end
end
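A minimal sketch of the escaping those specs expect, assuming Ruby 1.9 or later and UTF-8 input. escape_non_ascii is a hypothetical helper, not the escaped method referenced in format_uri above, and it leaves percent-encoded sequences alone, as the first spec requires.

```ruby
# Hypothetical helper, assuming Ruby 1.9+ and a UTF-8 encoded string.
def escape_non_ascii(str)
  str.each_char.map do |ch|
    cp = ch.ord
    if cp < 0x80
      ch                      # ASCII passes through unchanged
    elsif cp <= 0xFFFF
      format('\\u%04X', cp)   # character in the BMP
    else
      format('\\U%08X', cp)   # character outside the BMP
    end
  end.join
end

puts "<%s>" % escape_non_ascii("http://b/D\u00FCrst")
```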
Instead of contaminating our Ruby code with camelCased monstrosities such as:
FOAF.firstName #=> RDF::URI("http://xmlns.com/foaf/0.1/firstName")
RDFS.seeAlso #=> RDF::URI("http://www.w3.org/2000/01/rdf-schema#seeAlso")
OWL.sameAs #=> RDF::URI("http://www.w3.org/2002/07/owl#sameAs")
XSD.dateTime #=> RDF::URI("http://www.w3.org/2001/XMLSchema#dateTime")
...we ought to be able to stick with Ruby conventions and say:
FOAF.first_name #=> RDF::URI("http://xmlns.com/foaf/0.1/firstName")
RDFS.see_also #=> RDF::URI("http://www.w3.org/2000/01/rdf-schema#seeAlso")
OWL.same_as #=> RDF::URI("http://www.w3.org/2002/07/owl#sameAs")
XSD.date_time #=> RDF::URI("http://www.w3.org/2001/XMLSchema#dateTime")
There's no reason we can't transparently support both naming conventions.
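One way this could work is to camelize the method name before expanding it. The sketch below uses a hypothetical SnakeVocab class that returns plain strings rather than RDF::URI instances.

```ruby
# Hypothetical vocabulary stand-in that returns plain strings.
class SnakeVocab
  # Term-like method names; names starting with "to_" are excluded so that
  # implicit conversions such as to_ary or to_str are not hijacked.
  TERM = /\A(?!to_)[a-z][A-Za-z0-9_]*\z/

  def initialize(base)
    @base = base
  end

  # Accepts either fooBar or foo_bar and always expands to the camelCase term.
  def [](name)
    "#{@base}#{camelize(name.to_s)}"
  end

  def method_missing(name, *args)
    return super unless args.empty? && TERM.match?(name.to_s)
    self[name]
  end

  def respond_to_missing?(name, include_private = false)
    TERM.match?(name.to_s) || super
  end

  private

  # first_name -> firstName; names without underscores are left untouched.
  def camelize(name)
    name.gsub(/_([a-z0-9])/) { $1.upcase }
  end
end

foaf = SnakeVocab.new("http://xmlns.com/foaf/0.1/")
puts foaf.first_name
puts foaf.firstName
```

One caveat with this approach: a vocabulary term that genuinely contains an underscore would be rewritten too, so real terms like that would need an explicit exception list.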
The recent round of RDF::Literal updates left XSD.string in a strange place. Strings are an implicit default type. Thus, currently, RDF::Literal handles language directly, which shouldn't be the case, as it's only defined on strings.
I'd like to factor out Strings into their own RDF::Literal::String class, and further, to return for the Ruby version of the literal not an instance of String but of a subclass thereof, which contains language data. This will make round-tripping easier and let me cleanly solve Spira issue 15 at http://github.com/datagraph/spira/issues/#issue/15.
If I do this, will you merge it, or is there a reason that Strings are the way they are?
I'm having a problem accessing the RDF::RDF vocabulary. The following program fails:
require 'rdf'
puts "#{RDF::RDF.first}"
with:
ian@rowan-15 $ ruby rdf-ns-2.rb
rdf-ns-2.rb:4:in `': uninitialized constant RDF::RDF (NameError)
I think this is because the autoload isn't being triggered for RDF::RDF. If I manually force a load of the RDF vocabulary:
require 'rdf'
require 'rdf/vocab/rdf'
puts "#{RDF::RDF.first}"
then other things break:
ian@rowan-15 $ ruby rdf-ns-2.rb
/var/lib/gems/1.9.1/gems/rdf-0.2.1/lib/rdf/vocab.rb:83: warning: toplevel constant URI referenced by RDF::RDF::URI
/var/lib/gems/1.9.1/gems/rdf-0.2.1/lib/rdf/vocab.rb:83:in `[]': undefined method `intern' for URI:Module (NoMethodError)
from /var/lib/gems/1.9.1/gems/rdf-0.2.1/lib/rdf/vocab.rb:74:in `block in property'
from rdf-ns-2.rb:4:in `'
I'm pretty sure I'm doing something wrong, but for the time being I've resorted to defining my own RDF Namespace object, to avoid having to touch RDF::RDF.
Take this RDF_Mutable spec:
it "should not insert a statement twice" do
@repository.insert(@statements.first)
@repository.insert(@statements.first)
@repository.count.should == 1
end
That is fine and good. But if I alter the second insert by adding (or changing) the context of the Statement object, I would expect @repository.count.should == 2. Yes, it is the same s-p-o, but in two different contexts. But with the RDF::Repository base implementation, the answer is still 1. Drilling down, that is because the == operator for Statement objects throws away the context.
There are a variety of fixes for this, and some of them are certainly wrong, so I combed through RDF.rb to pick out behaviors of note around context handling and offer them up here with my thoughts on what a correct fix would be.
First off, Statement objects behave explicitly as a triple with these methods:
And they behave as a quad with these methods, closely related to those above:
I gather from the rdf-spec that the equality methods are intentionally as they are, though I think I disagree with their current behavior. I think a Statement should always be treated as a quad, with the meaning of the context bit refined. I see two conflated API uses of the context: "I have a context, or I don't have a context" versus "I don't care about the context". The current behavior of the == method is a problem because it injects the I-don't-care semantics into places where the I-do-or-I-don't-have-a-context distinction needs to be faithfully preserved, such as adding the same s-p-o into two different contexts of an RDF::Repository. The I-don't-care case shows up mostly in query-style APIs, such as Enumerable.has_statement?, and should be intentionally handled there.
My proposal would be to make all the Statement methods listed above under the triple-like behavior quad-like, to introduce a default context value of boolean false for statements with no defined context, and to keep the explicit value of nil for the I-don't-care case, consistent with the use of nil as a wildcard for s-p-o in various other query-oriented parts of the API. I'm fairly certain that will break some existing downstream things, so I'm putting this out for feedback and counterproposals.
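To make the proposal easier to discuss, here is a minimal model of it. Quad and QuadStore are hypothetical and do not reflect the RDF::Statement or RDF::Repository implementations; false means "no context" and nil means "don't care", as described above.

```ruby
# Hypothetical model of the proposal; not RDF::Statement.
# context == false : the statement has no context
# context == nil   : "don't care" wildcard, only meaningful in patterns
Quad = Struct.new(:subject, :predicate, :object, :context) do
  # Triple-style comparison that ignores the context.
  def same_triple?(other)
    subject == other.subject && predicate == other.predicate && object == other.object
  end

  # Pattern match where nil in any position of the pattern acts as a wildcard.
  def matches?(pattern)
    to_a.zip(pattern.to_a).all? { |mine, wanted| wanted.nil? || mine == wanted }
  end
end

class QuadStore
  def initialize
    @quads = []
  end

  # The duplicate check uses Struct#==, which compares all four members.
  def insert(quad)
    @quads << quad unless @quads.include?(quad)
    self
  end

  # Deletes every stored quad matching the pattern.
  def delete(pattern)
    @quads.reject! { |q| q.matches?(pattern) }
    self
  end

  def count
    @quads.size
  end
end
```

In this model the same s-p-o can live in two contexts, a context-free statement can be deleted without touching the others, and passing nil still removes the triple everywhere.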
So, on to some specific observations...
RDF::Mutable
Mutable.insert --- Rejects statements for which Statement.valid? is false. Valid admits statements without a context, which conspires to create a problem with Mutable.delete.
Mutable.delete --- Context is currently treated as a wildcard if not supplied. The problem: A statement without a context is valid to insert, but you cannot isolate it to delete it without also taking the same triple out of other contexts. If statements with no context have a distinct value for the context, say the boolean false, they could be distinguished from an explicit "don't care" value of nil.
Mutable.update --- Implies a delete, so must behave consistently. Current behavior tosses the context on the delete, which is certainly a bug.
RDF::Enumerable
Enumerable.has_statement? --- The base class implementation is Enumeration.include? so the meaning is dictated by the == method of Statement, which currently discards the context. It behaves the same as Enumerable.has_triple?, which is not what I'd expect if I supply a Statement with an explicit context. Like Mutable.delete, this method should be able to verify both the existence of a statement with specific context, and a triple with no context (context == false), and with an explicit nil context, behave like a wildcard.
Enumerable.triples, Enumerable.each_triple -- If we cast away the context, the same triple may appear more than once. Is that a problem?
RDF::Graph --- The has_statement?, insert_statement and delete_statement implementations all depend on the Statement.== method, which throws away the context. This happens to work for Graph because the context is coerced to the same value for all statements going in, so they would match if == was a quad match.
RDF::Repository --- Like Graph, the base class implementation depends on the Statement.== method and makes the latent bugs in Graph actual bugs.
Repository.has_statement? --- See Enumerable.has_statement.
Repository.insert_statement --- The duplicate check discards the context, so only one context can contain a given triple, which is a bug (and, incidentally, what led me into investigating all this).
Feedback welcome.
What do you think about adding the following method to RDF::Enumerable? Makes it super easy to serialise something...
def dump(args)
RDF::Writer.for(*args).dump(self)
end
Should be okay, but need to start running the specs against Rubinius 1.0 in addition to MRI 1.8/1.9 and JRuby.
XMLLiterals need to be treated differently than other literals. In particular, it is necessary for XML and RDF readers to add namespace definitions to XMLLiterals. Also, equivalence tests look for two semantically equivalent XMLLiterals that are textually different to be equivalent; this is best handled by canonicalizing XMLLiterals.
Requirements are defined more specifically for RDFa [1], but should apply to all readers. Many tests look for equivalence of XMLLiterals that are defined somewhat differently, so the real thing to do is to perform an exclusive canonicalization [2]. See also in RDF Concepts [3].
In rdf-rdfxml this is handled incompletely by transferring namespaces and performing a partial rewrite of the XML. See Literal.xmlliteral in rdf-rdfxml. A more complete solution would involve using the c14n module from libXML2, which is not usable directly through the standard Ruby bindings (it is implemented at [4]).
RdfContext deals with this by performing a partial transformation with namespace transfer and minimal rewriting, and putting the burden on the literal comparison (which could be done in rdf-isomorphic) by turning each XML Literal into a hash using ActiveSupport::XmlMini.parse and doing hash comparison.
[1] http://www.w3.org/TR/rdfa-core/#s_xml_literals
[2] http://www.w3.org/TR/2002/REC-xml-exc-c14n-20020718/
[3] http://www.w3.org/TR/rdf-concepts/#section-XMLLiteral
[4] http://rubygems.org/gems/coupa-libxml-ruby
The RDF::Query
and RDF::Query::Solution
classes are meant for implementing BGP queries. Let's try and finish the implementation thereof in time for RDF.rb 0.3.0.
Hello,
Please can you make it easier to enumerate the available serialisers. It is currently quite difficult to get the name, extensions and mime-type for each of the serialisers.
<link rel="alternate" type="application/rdf+xml" href="http://dbpedia.org/data/Oxford.rdf" title="Structured Descriptor Document (RDF/XML format)" />
<link rel="alternate" type="text/rdf+n3" href="http://dbpedia.org/data/Oxford.n3" title="Structured Descriptor Document (N3/Turtle format)" />
<link rel="alternate" type="application/json+rdf" href="http://dbpedia.org/data/Oxford.jrdf" title="Structured Descriptor Document (RDF/JSON format)" />
<link rel="alternate" type="application/json" href="http://dbpedia.org/data/Oxford.json" title="Structured Descriptor Document (RDF/JSON format)" />
It would be great to be able to do this:
>> f = RDF::Format.for(:ntriples)
=> RDF::NTriples::Format
>> f.name
=> "N-Triples"
>> f.content_types.first
=> "text/plain"
>> f.file_extensions.first
=> "nt"
nick.
I have a gist displaying the issue:
https://gist.github.com/675777
Basically, a local copy of a file fetched remotely works, but fetching the file directly, via RDF::Repository.load('http://...'), fails with encoding issues. Perhaps a bug in open_uri?
Support for the other RDF collection types can wait until someone actually needs them, but RDF::List is pretty crucial. Dealing with rdf:List structures in the form of blank nodes is just painful.
We laid the groundwork for collection support earlier in ensuring that we always first check that an object responds to #each_statement before we check for #each, which becomes important with containers that return non-statements from #each. Let's build from there.
RDF places limitations on the lexical value of typed literals [1]. Values must belong to the lexical space of the relevant datatype. XML Schema defines the value space of various primitive datatypes [2].
RDF::Literal should implement a #valid? method to verify the validity of typed literals.
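A minimal sketch of such a check, covering three datatypes only. valid_literal? and LEXICAL_SPACE are hypothetical names, and the patterns follow the strict XML Schema lexical spaces, so a mixed-case boolean such as "tRuE" is rejected here even though a lenient reader might normalize it.

```ruby
# Hypothetical validity check covering three datatypes only.
XSD_NS = "http://www.w3.org/2001/XMLSchema#"

LEXICAL_SPACE = {
  "#{XSD_NS}integer" => /\A[+-]?\d+\z/,
  "#{XSD_NS}decimal" => /\A[+-]?(\d+(\.\d*)?|\.\d+)\z/,
  "#{XSD_NS}boolean" => /\A(true|false|1|0)\z/
}.freeze

# Returns true when the lexical value fits the datatype's lexical space.
# Datatypes without a known pattern are treated as valid (an assumption).
def valid_literal?(value, datatype)
  pattern = LEXICAL_SPACE[datatype.to_s]
  pattern.nil? || pattern.match?(value.to_s)
end

puts valid_literal?("001", "#{XSD_NS}integer")
puts valid_literal?("abc", "#{XSD_NS}integer")
```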
Specs for various different datatypes are implemented in RdfContext, the relevant mapping information is included here.
xsd:decimal:
"1" => %("1.0"^^<http://www.w3.org/2001/XMLSchema#decimal>),
"-1" => %("-1.0"^^<http://www.w3.org/2001/XMLSchema#decimal>),
"1." => %("1.0"^^<http://www.w3.org/2001/XMLSchema#decimal>),
"1.0" => %("1.0"^^<http://www.w3.org/2001/XMLSchema#decimal>),
"1.00" => %("1.0"^^<http://www.w3.org/2001/XMLSchema#decimal>),
"+001.00" => %("1.0"^^<http://www.w3.org/2001/XMLSchema#decimal>),
"123.456" => %("123.456"^^<http://www.w3.org/2001/XMLSchema#decimal>),
"2.345" => %("2.345"^^<http://www.w3.org/2001/XMLSchema#decimal>),
"1.000000000" => %("1.0"^^<http://www.w3.org/2001/XMLSchema#decimal>),
"2.3" => %("2.3"^^<http://www.w3.org/2001/XMLSchema#decimal>),
"2.234000005" => %("2.234000005"^^<http://www.w3.org/2001/XMLSchema#decimal>),
"2.2340000000000005" => %("2.2340000000000005"^^<http://www.w3.org/2001/XMLSchema#decimal>),
"2.23400000000000005" => %("2.234"^^<http://www.w3.org/2001/XMLSchema#decimal>),
"2.23400000000000000000005" => %("2.234"^^<http://www.w3.org/2001/XMLSchema#decimal>),
"1.2345678901234567890123457890" => %("1.2345678901234567"^^<http://www.w3.org/2001/XMLSchema#decimal>),
xsd:boolean
"true" => %("true"^^<http://www.w3.org/2001/XMLSchema#boolean>),
"false" => %("false"^^<http://www.w3.org/2001/XMLSchema#boolean>),
"tRuE" => %("true"^^<http://www.w3.org/2001/XMLSchema#boolean>),
"FaLsE" => %("false"^^<http://www.w3.org/2001/XMLSchema#boolean>),
"1" => %("true"^^<http://www.w3.org/2001/XMLSchema#boolean>),
"0" => %("false"^^<http://www.w3.org/2001/XMLSchema#boolean>),
xsd:integer
"01" => %("1"^^<http://www.w3.org/2001/XMLSchema#integer>),
"1" => %("1"^^<http://www.w3.org/2001/XMLSchema#integer>),
"-1" => %("-1"^^<http://www.w3.org/2001/XMLSchema#integer>),
"+1" => %("1"^^<http://www.w3.org/2001/XMLSchema#integer>),
xsd:double
"1" => %("1.0E0"^^<http://www.w3.org/2001/XMLSchema#double>),
"-1" => %("-1.0E0"^^<http://www.w3.org/2001/XMLSchema#double>),
"+01.000" => %("1.0E0"^^<http://www.w3.org/2001/XMLSchema#double>),
"1." => %("1.0E0"^^<http://www.w3.org/2001/XMLSchema#double>),
"1.0" => %("1.0E0"^^<http://www.w3.org/2001/XMLSchema#double>),
"123.456" => %("1.23456E2"^^<http://www.w3.org/2001/XMLSchema#double>),
"1.0e+1" => %("1.0E1"^^<http://www.w3.org/2001/XMLSchema#double>),
"1.0e-10" => %("1.0E-10"^^<http://www.w3.org/2001/XMLSchema#double>),
"123.456e4" => %("1.23456E6"^^<http://www.w3.org/2001/XMLSchema#double>),
xsd:date, xsd:dateTime and xsd:time are implemented as follows:
contents.is_a?(Time) ? contents.strftime("%H:%M:%S%Z").sub(/\+00:00|UTC/, "Z") : contents.to_s
contents.is_a?(DateTime) ? contents.strftime("%Y-%m-%dT%H:%M:%S%Z").sub(/\+00:00|UTC/, "Z") : contents.to_s
contents.is_a?(Date) ? contents.strftime("%Y-%m-%d%Z").sub(/\+00:00|UTC/, "Z") : contents.to_s
RdfContext also implements a Duration class that transforms integer milliseconds and floating point seconds into XSD format: [+1]PYYYYMMDDTHHMMSS.MMM
[1] http://www.w3.org/TR/rdf-concepts/#section-Literal-Value
[2] http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#built-in-primitive-datatypesg
The N-Triples spec says that a node is identified as '_:' name, where name is [A-Za-z][A-Za-z0-9]*. However, on Ruby 1.8.7 from a recent Ubuntu distro, Node.new creates identifiers with a dash in them, which the N-Triples serializer incorrectly passes on to an output file, e.g.:
_:g-605660708 <http://www.w3.org/2000/01/rdf-schema#label> "Movie Tickets" .
This is kinda nasty, since rapper will reject them, thus breaking any serialization to other formats, too.
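A small sketch of one way to keep generated names inside that grammar, assuming the dash comes from a negative integer id. node_name_for is a hypothetical helper, not the Node.new implementation.

```ruby
# Pattern for the name part of an N-Triples node ID, as quoted above.
NODE_NAME = /\A[A-Za-z][A-Za-z0-9]*\z/

# Hypothetical helper: derives a node name from an integer such as an object id.
# Taking the absolute value avoids the "-" that a negative id would introduce.
def node_name_for(id)
  "g#{id.abs}"
end

name = node_name_for(-605660708)
puts "_:#{name}"  # _:g605660708
```

Taking the absolute value could in principle map two different ids to the same name, so a real fix would need another way to keep names unique.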
This is RDF.rb 0.2.0.1.
http://lists.w3.org/Archives/Public/public-rdf-ruby/2010May/0024.html
Implement content negotiation in RDF.rb clients. Ideally with q= values for each of the supported parsers.
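As an illustration of the q= part, here is a sketch that builds an Accept header value from a table of content types and weights. accept_header is a hypothetical helper, and the types and weights are examples rather than what RDF.rb's readers actually register.

```ruby
# Hypothetical helper; the content types and weights below are examples only.
def accept_header(preferences)
  preferences
    .sort_by { |_type, q| -q }
    .map { |type, q| q >= 1.0 ? type : format("%s;q=%.1f", type, q) }
    .join(", ")
end

header = accept_header({
  "text/plain" => 0.5,
  "application/rdf+xml" => 1.0,
  "text/turtle" => 0.8
})
puts header  # application/rdf+xml, text/turtle;q=0.8, text/plain;q=0.5
```

The weights would presumably come from each reader's format declaration, so that a client prefers the formats its installed parsers handle best.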
I would like to be able to do this:
repo = RDF::Repository.new
repo.load('http://www.bbc.co.uk/programmes/b00jnwlc#programme')
repo.each { |s| s.inspect! }
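To make the q= idea concrete, here is a minimal sketch of building an Accept header from a table of preferences. The media types and weights are assumptions chosen for illustration and do not reflect which parsers RDF.rb actually ships; `accept_header` is a hypothetical helper.

```ruby
# Hypothetical preference table: media type => quality value.
PREFERENCES = {
  "application/rdf+xml"   => 1.0,
  "text/turtle"           => 0.9,
  "application/n-triples" => 0.8,
  "*/*"                   => 0.1
}

# Build an Accept header value, most preferred type first. A weight of 1.0
# is the default, so the q parameter is omitted for it.
def accept_header(preferences)
  preferences
    .sort_by { |_type, q| -q }
    .map { |type, q| q >= 1.0 ? type : format("%s;q=%.1f", type, q) }
    .join(", ")
end

puts accept_header(PREFERENCES)
```

The client would send this value in the Accept header of its request and then choose a reader from the Content-Type of the response.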
When I run
require 'rdf'
graph = RDF::Graph.new
s = RDF::URI.new("http://gemcutter.org/gems/rdf")
p = RDF::DC.creator
o = RDF::URI.new("http://ar.to/#self")
graph << RDF::Statement.new(s, p, o)
graph.each do |elem|
puts elem.inspect
end
RDF::Writer.for(:ntriples).new("spec/data/output.nt") do |writer|
graph.each_statement do |statement|
writer << statement
end
end
I get this:
c:/ruby/lib/ruby/gems/1.8/gems/rdf-0.0.9/lib/rdf/writer.rb:248:in `puts': private method `puts' called for "spec/data/output.nt":String (NoMethodError)
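The message suggests that the writer calls `puts` on whatever object it was given as its output, and here that object is the path String rather than an IO. The sketch below reproduces the failure with a hypothetical stand-in class (`LineWriter` is not RDF.rb's writer) and shows that passing an IO-like object avoids it.

```ruby
require 'stringio'

# Hypothetical stand-in for a writer that prints each line to its output.
class LineWriter
  def initialize(output)
    @output = output
    yield self if block_given?
  end

  def <<(line)
    # Kernel#puts is private, so this raises NoMethodError for a String.
    @output.puts(line)
    self
  end
end

error = nil
begin
  LineWriter.new("spec/data/output.nt") { |writer| writer << "<s> <p> <o> ." }
rescue NoMethodError => e
  error = e.message
end
puts error

# Passing an IO-like object works; File.open(path, "w") would do the same.
buffer = StringIO.new
LineWriter.new(buffer) { |writer| writer << "<s> <p> <o> ." }
puts buffer.string
```

By analogy, opening the file first and handing the resulting IO to the writer is the likely workaround for the code in this report.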
Please can you add the WGS84 Geo Positioning vocabulary:
http://www.w3.org/2003/01/geo/wgs84_pos#
I believe it is a good candidate for inclusion as a standard vocabulary because it is in the top 10 of the popular-prefixes list produced by DERI:
http://prefix.cc/popular
URI joining and normalization are not well documented, but the rules can be inferred from various W3C tests. They are best described in RFC 3986 section 5.2 [1]. Much of this is handled by Addressable::URI#join.
The following specs were created when developing RdfContext to ensure proper normalization of joined URIs:
describe "normalization" do
{
%w(http://foo ) => "http://foo/",
%w(http://foo a) => "http://foo/a",
%w(http://foo /a) => "http://foo/a",
%w(http://foo #a) => "http://foo/#a",
%w(http://foo/ ) => "http://foo/",
%w(http://foo/ a) => "http://foo/a",
%w(http://foo/ /a) => "http://foo/a",
%w(http://foo/ #a) => "http://foo/#a",
%w(http://foo# ) => "http://foo/", # Special case for Addressable
%w(http://foo# a) => "http://foo/a",
%w(http://foo# /a) => "http://foo/a",
%w(http://foo# #a) => "http://foo/#a",
%w(http://foo/bar ) => "http://foo/bar",
%w(http://foo/bar a) => "http://foo/a",
%w(http://foo/bar /a) => "http://foo/a",
%w(http://foo/bar #a) => "http://foo/bar#a",
%w(http://foo/bar/ ) => "http://foo/bar/",
%w(http://foo/bar/ a) => "http://foo/bar/a",
%w(http://foo/bar/ /a) => "http://foo/a",
%w(http://foo/bar/ #a) => "http://foo/bar/#a",
%w(http://foo/bar# ) => "http://foo/bar",
%w(http://foo/bar# a) => "http://foo/a",
%w(http://foo/bar# /a) => "http://foo/a",
%w(http://foo/bar# #a) => "http://foo/bar#a",
%w(http://foo/bar# #D%C3%BCrst) => "http://foo/bar#D%C3%BCrst",
%w(http://foo/bar# #Dürst) => "http://foo/bar#D%C3%BCrst",
}.each_pair do |input, result|
it "should create <#{result}> from <#{input[0]}> and '#{input[1]}'" do
RDF::URI.new(input[0]).join(input[1].to_s).normalize.to_s.should == result
end
end
end
Note that rules for URIs are different from rules for namespace declarations. A URI can/should be canonicalized (e.g. http://foo.com => http://foo.com/) but a namespace should not (e.g., @prefix foo: http://foo.com#. foo:a foo:b foo:c. => http://foo.com#a http://foo.com#b http://foo.com#c).
[1] http://tools.ietf.org/html/rfc3986#page-30
W3C rdfcore xmlbase tests: http://www.w3.org/2000/10/rdf-tests/rdfcore/xmlbase/
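For readers without Addressable at hand, the reference-resolution half of these specs can be tried with Ruby's standard `uri` library. This is a sketch using a different library from the one the specs target: it covers only RFC 3986 section 5.2 joining, not the normalization cases in the table (such as adding a trailing slash to http://foo or percent-encoding Dürst). `join_reference` is a hypothetical wrapper.

```ruby
require 'uri'

# Hypothetical wrapper around the standard library's RFC 3986 resolution.
def join_reference(base, reference)
  URI.join(base, reference).to_s
end

# A few base/reference pairs taken from the table above.
[
  %w(http://foo/bar a),
  %w(http://foo/bar/ a),
  %w(http://foo/bar/ /a),
  %w(http://foo/bar #a)
].each do |base, reference|
  puts "<#{base}> + '#{reference}' => <#{join_reference(base, reference)}>"
end
```

The pairs show the key distinction in the table: a relative path replaces the last segment of the base path, while a fragment-only reference keeps the whole path.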
Hi,
I am trying to run a basic SPARQL query on a Sesame-based RDF store.
I can connect to an RDF store on Sesame and print out all of the results, but that's about it. I'm having some trouble with the documentation for doing more advanced things (I'm pretty new to Ruby, but not to coding).
Here is my simple query:
#SELECT ?title
#WHERE
#{
# <http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> ?title .
#}
So far I have the following:
puts "Trying a different method for test"
urlTest = "http://localhost:8080/openrdf-sesame/repositories/test"
repositoryTest = RDF::Sesame::Repository.new(urlTest)
repositoryTest.each {|x| puts x} #(&block)
puts "run a query:"
queryTest = RDF::Query.new( urlTest )
puts "New query instantiated"
query.select(:title)
puts "Title selected from query"
query.each {|x| puts x} #(&block)
puts "Query results printed out"
Thanks in advance,
Bryan
gkellogg noticed that RDF::Literal does not support #anonymous? or #unlabeled?, which are currently defined only on RDF::URI and RDF::Node.
I implemented #anonymous? on RDF::Literal and RDF::Graph and sent a pull request. I'm not sure you'll agree with the semantics for Graph, but I think it's what we want.
A serious bug slipped through to the 0.1.0 release's N-Triples writer implementation:
NameError: undefined local variable or method `node' for #<RDF::NTriples::Writer:0x1023c6628>
rdf-0.1.0/lib/rdf/ntriples/writer.rb:36:in `format_node'
rdf-0.1.0/lib/rdf/writer.rb:226:in `format_value'
rdf-0.1.0/lib/rdf/ntriples/writer.rb:26:in `write_triple'
rdf-0.1.0/lib/rdf/ntriples/writer.rb:26:in `map'
rdf-0.1.0/lib/rdf/ntriples/writer.rb:26:in `write_triple'
rdf-0.1.0/lib/rdf/writer.rb:199:in `write_statement'
rdf-0.1.0/lib/rdf/writer.rb:163:in `<<'
This affects the serialization of any statements that contain blank nodes. Fix coming up ASAP.
Prompted by a recent contribution to fix Ruby 1.9 enumerator compatibility (to be included in RDF.rb 0.1.8), I'm investigating what it will take to ensure that our use of enumerators is safe and compatible with all Ruby baseline versions that we wish to support (that is, 1.8.2+ and 1.9.x).
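One common idiom for this, shown here as a sketch rather than as the approach RDF.rb settled on, is to return an enumerator whenever an iterator method is called without a block. `StatementList` is a hypothetical class used only for illustration.

```ruby
# Hypothetical container illustrating the "enumerator unless block" idiom.
class StatementList
  include Enumerable

  def initialize(*statements)
    @statements = statements
  end

  def each
    # Without a block, hand back an external enumerator instead of failing.
    # (On Ruby versions before 1.8.7 this is believed to need
    # `require 'enumerator'` first; that is an assumption, not verified here.)
    return enum_for(:each) unless block_given?

    @statements.each { |statement| yield statement }
    self
  end
end

list = StatementList.new(:a, :b, :c)
enumerator = list.each
puts enumerator.next
puts list.map { |statement| statement.to_s }.join(",")
```

Because the block form still yields directly, callers that always pass a block see no change in behaviour.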
Literal#language is currently transformed into a constant. http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#dfn-plain-literal indicates that a plain literal may have a language tag as defined in RFC 3066, normalized to lower case. This includes tags with a primary subtag and a subtag, such as "en-us". Using options[:language].to_sym disallows this, because :en-us is not a valid Ruby symbol literal.
Also, note that normalization should force the language value to lower-case.
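A small sketch of the normalization being asked for follows. `normalize_language` is a hypothetical helper, not RDF::Literal's actual code. It also shows that a hyphenated tag can still be held as a symbol if it is built from a string, even though the bare literal `:en-us` would parse as `:en - us`.

```ruby
# Hypothetical helper: language tags compare case-insensitively, so store
# them in lower case.
def normalize_language(tag)
  tag.to_s.downcase
end

tag = normalize_language("en-US")
puts tag

# A hyphenated tag needs the quoted symbol form; String#to_sym produces it.
symbol = tag.to_sym
puts symbol.inspect
```

Keeping the tag as a lower-cased string sidesteps the symbol syntax question entirely, which may be the simpler option.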