Linked Data has proved to be an effective way to look up and navigate information at web scale. It is also emerging as a compelling architectural basis for web applications [[RWLD]]. The principles for designing Linked Data web applications are standardized by the W3C Linked Data Platform Specification 1.0 [[!LDP]].
RDF is at the core of Linked Data. The design principles for Linked Data [[LDDI]] include the following rule:
When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)
This emergence of RDF as an important application development technology for Linked Data and other domains has brought to light a serious gap in the associated technology stack, namely the absence of a way to describe the integrity constraints imposed by an application on the RDF documents it processes. Here the term "integrity constraint" denotes any characteristic of data required by or enforced by an application. Common examples of integrity constraints include the mandatory presence of properties, the cardinality of relations, and restrictions on the allowed values of properties.
Well-established programming technologies such as relational databases, object-oriented programming languages, and XML utilize closed-world assumptions to provide schemas or type definitions that encode integrity constraints. Applications utilizing RDF and open-world assumptions may need a similar mechanism to constrain the open-world for a particular context, and frequently start to use RDFS or OWL for this purpose. However, this is a misapplication of those technologies.
RDFS and/or OWL are used to define OSLC vocabularies, which specify the terms (classes and properties) and their semantic meaning. Vocabularies do not define constraints on what can be asserted; rather, they define the meaning of what was asserted, often through the use of reasoners that introduce inferred assertions. However, there are also situations where a vocabulary needs to be constrained in order to be used in a specific context for a specific purpose. RDFS and OWL are often not useful for this purpose because, unless the vocabularies are carefully designed, reasoners can introduce unintended assertions that are not consistent with the specified purpose.
This document describes the Resource Shape specification, a high-level RDF vocabulary for describing the shape of RDF resources. Here the term "RDF resource" is used to denote an LDPR resource that has an RDF document among its set of available representations. The term "shape" is used because it is often useful to visualize an RDF document as a topological object, namely as a graph consisting of nodes interconnected by arcs. Throughout this document the terms "RDF resource", "RDF document", and "RDF graph" are used more or less interchangeably, albeit somewhat imprecisely.
The shape of an RDF graph includes both a description of its expected contents (properties, types) within some operational context (e.g. GET, POST) and the integrity constraints that must be satisfied by the contents in that context.
Resource shapes specify a standard way of describing resources and constraints on resources that may be used in defining, among other things, OSLC domain resources. OSLC domains may use other means in addition to or instead of resource shapes, such as W3C [[SHACL]] or [[JSONSchema]]. OSLC servers should describe their resources using resource shapes if they wish to integrate with OSLC Core 3.0 clients or servers that are expecting resource shapes. OSLC domain specifications use shapes to specify the constraints on their domain vocabulary elements that must be supported by servers.
This specification begins with a brief discussion of the use cases and requirements that the OSLC Resource Shape Specification addresses. It then describes all aspects of the current specification in detail. Finally, it concludes with some recommendations for extensions based on implementation experience.
Terminology uses and extends the terminology and capabilities of OSLC Core Overview, W3C Linked Data Platform [[!LDP]], W3C's Architecture of the World Wide Web [[WEBARCH]], and Hypertext Transfer Protocol [[!HTTP11]].
This specification introduces the following terminology:
A shape applies to an associated resource if its oslc:describes property matches the type of the associated resource, or if the shape has no oslc:describes property, in which case it applies to all resources associated with that shape. For all shapes that "apply to" or describe an associated resource, that resource should satisfy the constraints defined by those shapes. See Associating and Applying Shapes for the definition of how shapes are associated with resources, which associated shapes apply to a resource, and what it means if multiple shapes apply to a single resource.
Sample resource representations are provided in text/turtle format [[TURTLE]].
The following common URI prefixes are used throughout this specification:
This section briefly describes the main use cases that motivate an RDF graph shape language. For a more complete discussion of this topic, refer to the W3C RDF Validation Workshop [[RDFVAL]].
Any application that provides programmatic services should also provide application programming interface (API) documentation to programmers so that they can consume those services. Applications that process RDF, including Linked Data web applications, are no different than other applications in this respect.
Although API documentation is normally directed towards human programmers, there are also important use cases where other programs need to understand the API. For example, consider a generic form-building tool that can generate a user interface for creating or updating resources. Such a tool needs to understand the expected contents of the resource so it can generate a user interface. It also needs to understand the applicable integrity constraints so it can validate user input before sending it to the service provider or server. Furthermore, the server may apply different constraints for creation versus update operations.
Users must understand the contents of an RDF dataset in order to write SPARQL queries. Understanding the integrity constraints can help users write better SPARQL queries. For example, if a property is known to always be present, then there is no need to wrap it in an OPTIONAL clause.
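As a non-normative sketch of this point, the following Python fragment generates SPARQL triple patterns from occurrence information and wraps a pattern in OPTIONAL only when the property may be absent. The helper names `triple_pattern` and `build_query` are hypothetical, the property names are borrowed from the bug tracking example later in this document, and prefix declarations are omitted from the generated query for brevity.

```python
# Occurrence values (from the OSLC Core vocabulary) that allow a property to be absent.
OPTIONAL_OCCURS = {"oslc:Zero-or-one", "oslc:Zero-or-many"}


def triple_pattern(subject_var, prop, occurs):
    """Return a SPARQL pattern for one property, using OPTIONAL only if needed."""
    var = "?" + prop.split(":")[-1]
    pattern = f"{subject_var} {prop} {var} ."
    if occurs in OPTIONAL_OCCURS:
        return f"OPTIONAL {{ {pattern} }}"
    return pattern


def build_query(rdf_type, properties):
    """Build a SELECT query from (property, occurs) pairs taken from a shape."""
    body = [f"  ?r a {rdf_type} ."]
    body += ["  " + triple_pattern("?r", p, occurs) for p, occurs in properties]
    return "SELECT * WHERE {\n" + "\n".join(body) + "\n}"


query = build_query(
    "oslc_cm:ChangeRequest",
    [("dcterms:title", "oslc:Exactly-one"), ("oslc_cm:status", "oslc:Zero-or-one")],
)
print(query)
```

Because the title is declared as always present, its pattern is emitted without OPTIONAL, while the optional status is wrapped.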
Query building tools can take advantage of shape information. For example, consider a query builder that allows the user to define a query that filters results by selecting allowed values from a list. In principle, the query builder could dynamically query the dataset to determine the list of values present. However, such a query could be slow if the dataset was large. In addition, only those values present at the time of the query would appear in the list. In contrast, if the query builder had access to shape information, then it could avoid the potentially expensive query and present the user with the complete list of allowed values, whether or not they were actually present in the dataset at that time.
Resource indexers may be looking for certain types of structured data. For example, a web crawler might be indexing product descriptions and pricing for a marketplace application. The web crawler could provide a shape to describe the data it is looking for. Web application developers would then be able to provide that information in the web pages of their commerce site and thereby have their sites included in the index.
This section describes some high-level requirements for an RDF resource shape language. For simplicity, members of a shape language are referred to as shape resources. It is to be understood that the validity and meaning of a shape resource depends on any other shape resources it is linked to.
This section describes the Resource Shape specification (aka the Shapes specification) which is part of OSLC Core 3.0. The Shapes specification defines:
A resource shape is a resource that describes the contents of, and constraints on, the RDF representation of other resources. These constraints are intended to be applicable to OSLC core and domain vocabularies in order to express the domains sufficiently to support interoperability between clients and servers that use and support them. Resource shapes specify the minimum an implementation needs to do to be considered compliant. Specifically, shape constraints say how OSLC clients and servers must behave if the resources satisfy the applicable shape constraints, but they do not say what clients and servers may do if the applicable constraints are not satisfied.
A shape resource itself has an RDF representation which uses the terms defined by the oslc: vocabulary. The term "shape resource" or simply "shape" is sometimes used as shorthand for the more verbose phrase "the RDF representation of a shape resource" where this can lead to no confusion. The following sections describe all of the RDF vocabulary terms used in shape resources.
The Shapes specification is based on a simple conceptual model of resources that works well in practice, but is somewhat biased towards the view that the RDF representation of a resource looks like a set of property-value pairs on that resource. The Shapes specification works well when the resource being described appears as a subject node in its RDF graph and all other nodes are connected to the resource node by a path consisting of one or more properties. Each property-value pair is represented by a triple in which the subject is the resource, the predicate is the property, and the object is the value. The value may be either a literal or a resource. When the object is a resource, that resource may itself be described by another shape. Thus the Shapes specification is powerful enough to describe complex graphs. Although the Shapes specification works well in practice, it cannot describe arbitrary RDF graphs.
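The property-value view described above can be sketched in a few lines of Python. This is an illustration only: triples are modelled as plain tuples of strings rather than with an RDF library, and the `ex:` and `foaf:` resources are hypothetical.

```python
from collections import defaultdict


def property_values(triples, resource):
    """Group the triples whose subject is `resource` into {predicate: [objects]}."""
    view = defaultdict(list)
    for subject, predicate, obj in triples:
        if subject == resource:
            view[predicate].append(obj)
    return dict(view)


# Hypothetical graph: a bug that links to a second resource (its creator).
triples = [
    ("ex:bug1", "rdf:type", "oslc_cm:ChangeRequest"),
    ("ex:bug1", "dcterms:title", "Null pointer exception in web ui"),
    ("ex:bug1", "dcterms:creator", "ex:alice"),
    ("ex:alice", "foaf:name", "Alice"),
]

bug_view = property_values(triples, "ex:bug1")
# When a value is itself a resource, the same view can be taken of that
# resource, which is the situation where another shape may describe it.
creator_view = property_values(triples, bug_view["dcterms:creator"][0])
```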
The following diagram summarizes the main concepts and relations used in this specification:
In this diagram the boxes represent resource types and the arrows represent the relations between them that are defined by the Shapes specification. The two boxes on the left represent external types of resources that use shapes. The other three boxes represent the resource types that are defined by the Shapes specification.
The box labeled "Resource Shape" represents a shape. A shape is a resource of rdf:type oslc:ResourceShape. A shape describes a set of resources. A shape is basically a set of one or more defined properties that any resource described by that shape is expected to contain.
The box labeled "Property" represents a defined property. A defined property is a resource of rdf:type oslc:Property. The arrow labeled "property" represents the aggregation relation between a shape and its defined properties. The predicate of this relation is oslc:property.
Each oslc:Property resource has a set of properties that describe the defined property and the constraints on its use within any resource that the given shape applies to. These include a description of the values of the defined property and the number of its occurrences, or cardinality, within the resource. The value of a defined property may be a literal, a resource, or either. If the value of a defined property is a resource, then the defined property may refer to another oslc:ResourceShape resource that describes the value resource. This relation is depicted by the arrow labeled "valueShape". The predicate of this relation is oslc:valueShape.
The value of a defined property may be constrained to take one of an allowed set of values. In some cases, the allowed set of values may be large and be used in many shapes. In this case it is useful to put the allowed values in a separate resource so they can be easily reused. The box labeled "Allowed Values" represents a resource of rdf:type oslc:AllowedValues. The arrow labeled "allowedValues" represents the relation between a defined property and its set of allowed values. The predicate of this relation is oslc:allowedValues.
A [[REST]] service may describe aspects of its interface contract using shapes. For example, a REST service may provide a URI where new resources can be created via HTTP POST. This service could describe the expected contents of POST request bodies using shapes. Similarly, a REST service may provide a URI that represents a container of resources and could describe those resources using shapes. The box labeled "any service" represents any REST service description. The arrow labeled "resourceShape" represents the predicate oslc:resourceShape, which is a property of the service description resource.
Similarly, any resource can describe its own contents by linking to a shape resource. The box labeled "any resource" represents any resource. The arrow labeled "instanceShape" represents the predicate oslc:instanceShape, which is a property of the resource.
In general, the relation between shapes and resources is many-to-many. Given a resource R there MAY be zero or more shapes S associated with it. This specification defines three ways to associate shapes with a resource, namely using oslc:instanceShape, oslc:resourceShape, and oslc:valueShape. Other specifications MAY define additional mechanisms.
oslc:instanceShape ResourceShape - directly associates a constraining shape with a resource.
oslc:resourceShape ResourceShape - associates a constraining shape with the entity request or response resource of a service (e.g., a creation or query service).
oslc:valueShape ResourceShape - associates a constraining shape with the resource that is the object value of a property of a resource.
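Not all shapes associated with a resource are necessarily applicable to it. Let S be associated with R. S is said to apply to R in the following two cases: S has no oslc:describes property, or the value of an oslc:describes property of S matches an rdf:type of R.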
If no shapes are associated with a resource then there are no implied constraints on that resource.
If one or more shapes are associated with a resource then at least one of those SHOULD be applicable to that resource. If no associated shape applies to a resource then this SHOULD be interpreted as an error condition.
If exactly one shape applies to a resource then that resource SHOULD satisfy all the constraints defined by that shape.
If more than one shape applies to a resource then that resource SHOULD satisfy all the constraints defined by all the shapes (AND semantics). However, the specification for a service description MAY define alternate semantics. For example, a service MAY require that the resource satisfy the constraints defined by at least one of the shapes (OR semantics).
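The applicability and combination rules can be sketched as follows. This is a non-normative illustration: shapes are modelled as plain dictionaries, the `check` callback stands in for whatever constraint checking an implementation performs, and the function names and the `oslc_rm:Requirement` shape are hypothetical.

```python
def applies(shape, resource_types):
    """A shape applies if it has no oslc:describes value or one matches a resource type."""
    describes = shape.get("describes", [])
    return not describes or any(t in resource_types for t in describes)


def satisfies(resource_types, associated_shapes, check, semantics="AND"):
    """Combine the results of all applicable shapes using AND (default) or OR semantics."""
    if not associated_shapes:
        return True  # no associated shapes, so no implied constraints
    applicable = [s for s in associated_shapes if applies(s, resource_types)]
    if not applicable:
        # Shapes are associated but none applies: treated here as an error condition.
        raise ValueError("shapes are associated but none applies")
    results = [check(s) for s in applicable]
    return all(results) if semantics == "AND" else any(results)


cr_shape = {"id": "cr", "describes": ["oslc_cm:ChangeRequest"]}
generic_shape = {"id": "generic"}  # no oslc:describes, so it applies to any associated resource
req_shape = {"id": "req", "describes": ["oslc_rm:Requirement"]}

shapes = [cr_shape, generic_shape, req_shape]
types = {"oslc_cm:ChangeRequest"}
passing = {"cr"}  # pretend only the constraints of the "cr" shape are met

and_result = satisfies(types, shapes, lambda s: s["id"] in passing, "AND")
or_result = satisfies(types, shapes, lambda s: s["id"] in passing, "OR")
```

Here two of the three associated shapes apply. Under the default AND semantics the resource fails because the generic shape is not satisfied, whereas a service that defines OR semantics would accept it.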
If a resource satisfies its applicable shapes, client and server implementations MUST behave as described in the defining OSLC specifications. If a resource does not satisfy its applicable shapes, implementations SHOULD attempt to complete the operation with the given data if possible. Otherwise implementations MAY reject the operation.
This section presents a simple running example to illustrate the Shapes specification. For more examples, refer to [[LDOW2013]] and [[ShapesRDFVAL]].
Consider a simple bug tracking service in which each bug has just two properties: a summary and status. The summary is required, but the status is optional. The summary is given by the property dcterms:title, and the status by oslc_cm:status. The oslc_cm:status is constrained to take one of the values "Submitted", "InProgress", or "Done". The RDF type of a bug is oslc_cm:ChangeRequest.
The following listing shows the RDF representation of the bug http://example.com/bugs/1 which satisfies the constraints defined by the bug tracking service:

@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix oslc: <http://open-services.net/ns/core#> .
@prefix oslc_cm: <http://open-services.net/ns/cm#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

<http://example.com/bugs/1> a oslc_cm:ChangeRequest ;
    dcterms:title "Null pointer exception in web ui"^^rdf:XMLLiteral ;
    oslc_cm:status "Submitted" ;
    oslc:instanceShape <http://example.com/shape/oslc-change-request> .
The following listing shows the RDF representation of the bug http://example.com/bugs/2 which violates the constraints since its oslc_cm:status property has two values:

@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix oslc: <http://open-services.net/ns/core#> .
@prefix oslc_cm: <http://open-services.net/ns/cm#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

<http://example.com/bugs/2> a oslc_cm:ChangeRequest ;
    dcterms:title "Wrong arguments"^^rdf:XMLLiteral ;
    oslc_cm:status "Submitted", "InProgress" ;
    oslc:instanceShape <http://example.com/shape/oslc-change-request> .
We can represent the constraints defined by the bug tracking service using the shape resource http://example.com/shape/oslc-change-request which has the following RDF representation:

@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix oslc: <http://open-services.net/ns/core#> .
@prefix oslc_cm: <http://open-services.net/ns/cm#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@base <http://example.com/shape/> .

<oslc-change-request> a oslc:ResourceShape ;
    dcterms:title "Creation shape of OSLC Change Request"^^rdf:XMLLiteral ;
    oslc:describes oslc_cm:ChangeRequest ;
    oslc:property <oslc-change-request#dcterms-title> , <oslc-change-request#oslc_cm-status> .

<oslc-change-request#dcterms-title> a oslc:Property ;
    oslc:propertyDefinition dcterms:title ;
    oslc:name "title" ;
    oslc:occurs oslc:Exactly-one .

<oslc-change-request#oslc_cm-status> a oslc:Property ;
    oslc:propertyDefinition oslc_cm:status ;
    oslc:name "status" ;
    oslc:occurs oslc:Zero-or-one ;
    oslc:allowedValues <status-allowed-values> .
The RDF representation contains one oslc:ResourceShape resource and two oslc:Property resources. The oslc:ResourceShape resource contains a short description of the shape using dcterms:title, gives the RDF type of the resource being described using oslc:describes, and lists the two properties contained in the resource being described using oslc:property.
The contained properties, dcterms:title and oslc_cm:status, are described in oslc:Property resources. Each oslc:Property resource specifies the URI of the RDF property it is defining using oslc:propertyDefinition, and the occurrence of that property using oslc:occurs. The occurrence of each property is specified using the terms oslc:Exactly-one and oslc:Zero-or-one, which indicate that the properties are both single-valued. dcterms:title is required and oslc_cm:status is optional.
In this example, only the occurrence and allowed values constraints are used. The Shapes specification provides for properties to be constrained by value type, range, and several other aspects.
The definition of oslc_cm:status refers to http://example.com/shape/status-allowed-values using oslc:allowedValues. That resource has the rdf:type oslc:AllowedValues. It specifies the allowed values of the oslc_cm:status property. It has the following RDF representation:

@prefix oslc: <http://open-services.net/ns/core#> .

<http://example.com/shape/status-allowed-values> a oslc:AllowedValues ;
    oslc:allowedValue "Done" , "InProgress" , "Submitted" .
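To make the running example concrete, the following non-normative Python sketch checks the two bugs against the occurrence and allowed-value constraints of the example shape. The data structures and the `violations` function are assumptions made for illustration: the shape and resources are hand-transcribed into dictionaries rather than parsed from Turtle, and only the two occurrence values used in the example are handled.

```python
# Bounds for the two occurrence values used in the example shape.
OCCURS_BOUNDS = {"oslc:Exactly-one": (1, 1), "oslc:Zero-or-one": (0, 1)}

ALLOWED_STATUS = {"Done", "InProgress", "Submitted"}

# Hand-transcribed form of the example shape's two oslc:Property resources.
SHAPE = [
    {"definition": "dcterms:title", "occurs": "oslc:Exactly-one", "allowed": None},
    {"definition": "oslc_cm:status", "occurs": "oslc:Zero-or-one", "allowed": ALLOWED_STATUS},
]


def violations(resource, shape):
    """Return a list of constraint violations for `resource` ({property: [values]})."""
    problems = []
    for prop in shape:
        name = prop["definition"]
        values = resource.get(name, [])
        low, high = OCCURS_BOUNDS[prop["occurs"]]
        if not low <= len(values) <= high:
            problems.append(f"{name}: {len(values)} value(s), expected {prop['occurs']}")
        if prop["allowed"] is not None:
            for value in values:
                if value not in prop["allowed"]:
                    problems.append(f"{name}: value {value!r} is not allowed")
    return problems


bug1 = {"dcterms:title": ["Null pointer exception in web ui"], "oslc_cm:status": ["Submitted"]}
bug2 = {"dcterms:title": ["Wrong arguments"], "oslc_cm:status": ["Submitted", "InProgress"]}

print(violations(bug1, SHAPE))
print(violations(bug2, SHAPE))
```

Bug 1 yields no violations, while bug 2 is reported because oslc_cm:status occurs twice although the shape declares it as oslc:Zero-or-one.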
Property tables are used below to describe the resources defined by the Shapes specification, namely oslc:ResourceShape, oslc:Property, and oslc:AllowedValues. A property table is a tabular depiction of a subset of the information that can be specified using shapes. Each table describes one type of resource. Each row describes one property of the resource. Each column describes some aspect of the properties. The columns have the following meanings:
true if the property is read-only. If omitted, or set to false, then the property is writable.
The oslc:instanceShape property is used to link any described resource with a shape resource that describes its contents. A resource MAY be associated with zero or more shapes.
The oslc:resourceShape property is used to link an application service description with a shape resource that describes some aspect of the service's API contract. A service description MAY be linked with zero or more shapes.
For example, in OSLC a resource that accepts POST requests to an LDPC in order to create new resources is referred to as a creation factory. The service description for a creation factory may link to one or more shape resources that describe the bodies of POST requests.
dcterms:title is used to provide a summary of oslc:ResourceShape and oslc:Property resources. Its value SHOULD be a literal of type rdf:XMLLiteral that is valid content for an XHTML <span> element. If the value contains no XML markup then it MAY be represented as a plain text literal or xsd:string.
dcterms:description is used to provide a description of oslc:Property resources. Its value SHOULD be a literal of type rdf:XMLLiteral that is valid content for an XHTML <div> element. If the value contains no XML markup then it MAY be represented as a plain text literal or xsd:string.
oslc:describes is used to list the types of the described resources associated with this shape. Suppose that shape S is associated with described resource R, e.g. via an oslc:resourceShape or oslc:instanceShape link. If shape S describes type T and described resource R has type T then S describes the contents of and constraints on R.
For example, a creation factory may be able to create many different types of resources. The constraints on a given type of resource are specified by the associated shape resources that contain an oslc:describes link to that type.
oslc:property is used to list the defined properties that are expected to be contained in described resources associated with this shape. The object of this property MUST be an oslc:Property resource whose representation is contained in the shape document.
If a described resource contains a property described by some oslc:Property resource, then a REST service is expected to process that property in some useful way as defined by the service's API contract. If there is no matching oslc:Property resource then the behavior of the service is undefined.
oslc:allowedValue is used to specify an allowed value of the defined property. The object of this property SHOULD be compatible with the type specified by the oslc:valueType property if present. An oslc:Property resource MAY contain one or more oslc:allowedValue properties and an optional oslc:allowedValues property which links to an oslc:AllowedValues resource. The complete set of allowed values is the union of all the values specified directly in the oslc:Property resource and the linked oslc:AllowedValues resource.
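The union rule can be illustrated with a short non-normative sketch. Resources are modelled as dictionaries keyed by hypothetical identifiers, and the inline value "Rejected" is invented for the example.

```python
def complete_allowed_values(property_resource, allowed_values_resources):
    """Union of inline oslc:allowedValue values and those of the linked oslc:AllowedValues."""
    allowed = set(property_resource.get("allowedValue", []))
    link = property_resource.get("allowedValues")
    if link is not None:
        allowed |= set(allowed_values_resources[link].get("allowedValue", []))
    return allowed


# Hypothetical data: one value given inline plus a link to a shared resource.
resources = {"ex:status-allowed-values": {"allowedValue": ["Done", "InProgress", "Submitted"]}}
status_property = {"allowedValue": ["Rejected"], "allowedValues": "ex:status-allowed-values"}

allowed = complete_allowed_values(status_property, resources)
```

In this sketch an empty result means that the property declares no allowed-value constraint at all, so a caller would normally skip the check rather than reject every value.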
oslc:allowedValues specifies a link to an oslc:AllowedValues resource which defines a set of allowed values for the defined property. See oslc:allowedValue for a description of how the complete set of allowed values is defined.
oslc:defaultValue specifies the default value for the defined property. The object of this property SHOULD be compatible with the type specified by the oslc:valueType property if present.
A default value is normally used when creating resources. A service SHOULD use the default value to provide a value for a property if none is provided in the creation request.
A default value MAY be used by clients of a described resource if the defined property is not present in the representation of the described resource. This mechanism is useful if a service introduces a new defined property but does not update all pre-existing described resources.
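A minimal sketch of how a service might fill in defaults at creation time is shown below. It is not prescribed by this specification: the function name, the dictionary layout, and the choice of "Submitted" as a default for oslc_cm:status are all assumptions.

```python
def apply_defaults(request_properties, shape_properties):
    """Return a copy of the creation request with defaults added for absent properties."""
    result = {name: list(values) for name, values in request_properties.items()}
    for prop in shape_properties:
        if "defaultValue" in prop and not result.get(prop["definition"]):
            result[prop["definition"]] = [prop["defaultValue"]]
    return result


shape_properties = [
    {"definition": "dcterms:title"},
    {"definition": "oslc_cm:status", "defaultValue": "Submitted"},  # hypothetical default
]

created = apply_defaults({"dcterms:title": ["Wrong arguments"]}, shape_properties)
```

A client reading an older resource that lacks the property could apply the same function to the retrieved representation.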
If present and true, oslc:hidden is used to indicate that the defined property is not normally presented to users. A client of the described resource SHOULD NOT display hidden defined properties to normal users. It MAY display hidden defined properties to administrative users.
If the oslc:isMemberProperty is present and true then the defined property is a container membership property similar to rdfs:member. The described resource is the container and the object resources are its members.
For example, OSLC Query Capabilities are query services that behave like containers for other resources. A defined property for which oslc:isMemberProperty is true links the container to its member resources.
The recent Linked Data Platform specification [[LDP]] elaborates the concept of a resource container. It is therefore desirable to evolve the Shapes specification to align with the LDP concept of container membership.
oslc:name is used to specify the local name of the defined property. This is normally the portion of the defined property URI (see oslc:propertyDefinition) that follows the last hash (#) or slash (/).
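The usual derivation of the local name can be sketched as follows. The function name is hypothetical, and a shape remains free to state a different oslc:name explicitly.

```python
import re


def local_name(property_uri):
    """Return the part of the URI that follows the last '#' or '/'."""
    return re.split(r"[#/]", property_uri)[-1]


title_name = local_name("http://purl.org/dc/terms/title")
status_name = local_name("http://open-services.net/ns/cm#status")
```

These two calls produce the names "title" and "status" used by the example shape earlier in this document.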
For datatype properties whose type is xsd:string, oslc:maxSize specifies the maximum number of characters in the defined property value. The absence of oslc:maxSize indicates that either there is no maximum size or that the maximum size is specified some other way.
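Since the limit is stated in characters, a check should count characters rather than encoded bytes. The following sketch, with a hypothetical function name and sample value, assumes that Python's `len` on a `str` (which counts Unicode code points) is an acceptable reading of "characters".

```python
def within_max_size(value, max_size=None):
    """True if `value` has at most `max_size` characters; None means no limit is declared."""
    return max_size is None or len(value) <= max_size


title = "R\u00e9sum\u00e9 upload fails"  # 19 characters, 21 bytes in UTF-8

fits = within_max_size(title, 19)
too_long = not within_max_size(title, 18)
```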
oslc:occurs is used to specify the number of times that the defined property may occur. The value of this property MUST be one of the following individuals:
For strings and language-tagged strings, single-valued means there is at most one value for any given language tag, and at most one untagged value.
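The single-valued rule for language-tagged strings can be sketched as below. Literals are modelled as hypothetical (text, language tag) pairs, with `None` standing for an untagged string. Comparing tags case-insensitively is an assumption of this sketch.

```python
from collections import Counter


def is_single_valued(literals):
    """True if there is at most one value per language tag and at most one untagged value."""
    per_language = Counter(lang.lower() if lang else None for _, lang in literals)
    return all(count <= 1 for count in per_language.values())


multilingual_ok = is_single_valued([("Bug", "en"), ("Bogue", "fr"), ("Bug", None)])
duplicate_english = is_single_valued([("Bug", "en"), ("Defect", "EN")])
```

The first list is single-valued even though it holds three literals, because each language tag occurs once. The second is not, because it holds two English values.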
oslc:propertyDefinition is used to specify the URI of the defined property.
oslc:range MUST NOT be used with datatype properties. It MAY be used with object properties. For object properties, oslc:range is used to specify the allowed rdf:types of the object resource. The target resource SHOULD be of one of the specified oslc:range types, but no inferencing is intended if the actual target resource is or is not one of these types. These semantics are very different from those of rdfs:range, which does have inferencing implications.
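The difference from rdfs:range can be illustrated with a small non-normative sketch: an oslc:range check only compares the types already asserted for the object resource and never adds a type to it. The function name and the `ex:` types are hypothetical.

```python
def check_range(object_types, range_types):
    """Report whether any asserted rdf:type of the object is among the oslc:range types.

    The check only reads `object_types`. Unlike rdfs:range reasoning, it never
    infers a new type for the object resource.
    """
    return bool(set(object_types) & set(range_types))


asserted = ["ex:Person"]
matches = check_range(asserted, ["ex:Person", "ex:Team"])
mismatch = check_range(asserted, ["ex:Team"])
```

After the mismatching check, `asserted` still contains only ex:Person, whereas an rdfs:range reasoner would instead have concluded that the object is also an ex:Team.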
The values of this property MAY be any rdf:type URI or the following individual:
If present and true, oslc:readOnly is used to specify that the value of the defined property is managed by the service, i.e. that it is read-only. It cannot be directly modified by clients of the service. Services SHOULD ignore attempts to modify read-only properties, but MAY fail such requests. If a service ignores an attempt to modify a read-only property then it SHOULD NOT do so silently. A service MAY use the HTTP Warning header or some other means to indicate that the attempt to modify a read-only property has been ignored.
In this context, modification means a change in any object of the triples associated with the defined property. For example, a GET request followed by a PUT request would not modify the triples. A service MUST NOT interpret a PUT request that does not modify the triples associated with the defined property as a violation of the oslc:readOnly constraint.
Examples of read-only properties include creation and modification timestamps, the identity of who created or modified the described resource, and properties computed from the values of other properties.
When modifying a resource, it is natural for a client to first retrieve its current representation using a GET request. This request will return read-only properties along with read-write properties. If a service fails PUT requests that contain read-only properties then clients will have to remove all read-only properties before submitting PUT requests. The behavior of ignoring read-only properties in PUT requests is therefore more convenient for clients. Similarly, when copying a resource, a client would GET it first. It is therefore more convenient for clients if the service ignores read-only properties also in POST requests.
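One possible server-side treatment of read-only properties in a PUT request is sketched below. It is an illustration under stated assumptions rather than required behavior: the function name and dictionary layout are hypothetical, values are compared as sets, and omitting a read-only property from the request is treated as an attempted change that is ignored with a warning.

```python
def apply_put(current, submitted, read_only):
    """Merge a PUT body into `current`, ignoring changes to read-only properties.

    Returns (new_state, warnings). Resubmitting unchanged read-only values is
    not treated as a modification and produces no warning.
    """
    new_state, warnings = {}, []
    for prop in set(current) | set(submitted):
        old, new = current.get(prop, []), submitted.get(prop, [])
        if prop in read_only:
            if set(old) != set(new):
                warnings.append(f"ignored change to read-only property {prop}")
            new = old  # the service keeps its own value
        if new:
            new_state[prop] = list(new)
    return new_state, warnings


current = {"dcterms:title": ["Wrong arguments"], "dcterms:created": ["2013-01-01T00:00:00Z"]}

# The client echoes the read-only timestamp unchanged and edits only the title.
echoed = {**current, "dcterms:title": ["Wrong argument order"]}
state, warnings = apply_put(current, echoed, read_only={"dcterms:created"})

# The client tries to change the read-only timestamp.
tampered = {**current, "dcterms:created": ["2020-01-01T00:00:00Z"]}
state2, warnings2 = apply_put(current, tampered, read_only={"dcterms:created"})
```

The warnings list is where a service could hook in the HTTP Warning header or another reporting mechanism, so that ignored modifications are not silent.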
For object properties, oslc:representation is used to specify how the object resource is represented. The value of oslc:representation MUST be one of the following individuals:
For object properties, oslc:valueShape is used to specify a link to a resource shape that describes the object resource.
For datatype properties, oslc:valueType specifies the literal value type. OSLC datatype properties are a subset of the base or primitive types defined by [[!rdf11-concepts]] and [[!xmlschema11-1]]. A datatype MUST be one of the following individuals:
For object properties, oslc:valueType specifies how the object resource is identified. It MUST be one of the following individuals:
local resource is used because the scope of the identifier is local to the representation.
The following individuals have participated in the creation of this specification and are gratefully acknowledged:
Participants:
John Arwe (IBM)
Nick Crossley (IBM)
Miguel Esteban Gutiérrez (Universidad Politécnica de Madrid)
Arnaud Le Hors (IBM)
Martin Nally (IBM)
Eric Prud'hommeaux (W3C)
Arthur Ryman (IBM)
Steve Speicher (IBM)
Tack Tong (IBM)
In addition, the OSLC Open Project would like to call out special recognition of the significant contribution by the originators of the Resource Shape concepts. The OSLC Resource Shape specification was initially developed by the OSLC Reporting Workgroup under the leadership of Tack Tong (IBM), with major contributions from Arthur Ryman (IBM) and Martin Nally (IBM). Members of OSLC workgroups found shapes to be applicable to other aspects of OSLC and so the specification was subsequently integrated into the OSLC Core specification by Dave Johnson (IBM).
Anamitra Bhattacharyya (IBM) provided valuable feedback based on implementation experience and input on datatype facets. Steve Speicher (IBM) and John Arwe (IBM) contributed proposals for extending shapes which have informed the [[SHACL]] work.