Extending SHACL
While browsing the web, I found this interesting web page titled DASH Constraint Components. It defines some new constraint components for SHACL. As my research at the moment revolves a lot around the expressiveness of SHACL, this really interests me.
The first thing I tried to do was to express these extensions with
existing core SHACL. However, I quickly noticed a small extension is
required to make it all work out: sh:equals needs to support full path
expressions. I've seen this mentioned in the SHACL Discord server.
Let's go over the extensions proposed by dash and try to express them:
dash:rootClass <root>: simply check whether the focus node is a subclass of <root>. This can be done in current SHACL using the property shape:
    [ a sh:PropertyShape ; sh:path [ sh:zeroOrMorePath rdfs:subClassOf ] ; sh:hasValue <root> ]
dash:stem and dash:singleLine are additional checks on strings, which can be accomplished using the pattern constraint component of SHACL.
dash:coExistsWith <prop>: when applied to a property shape with sh:path <path>, it means that if there is a node reachable by <path>, there must also be a node reachable by <prop>. Conversely, when there is no node reachable by <path>, there must be no node reachable by <prop> either. This can be done in SHACL using the node shape:
    [ a sh:NodeShape ;
      sh:and (
        [ sh:or ( [ sh:path <path> ; sh:maxCount 0 ] [ sh:path <prop> ; sh:minCount 1 ] ) ]
        [ sh:or ( [ sh:path <path> ; sh:minCount 1 ] [ sh:path <prop> ; sh:maxCount 0 ] ) ]
      ) ]
dash:subSetOf <prop>: when applied to a property shape with sh:path <path>, it means that all nodes reachable with <path> must also be reachable with <prop>. This can be done in SHACL:
    [ a sh:PropertyShape ; sh:path [ sh:alternativePath ( <path> <prop> ) ] ; sh:equals <prop> ]
dash:nonRecursive <true | false>: when applied to a property shape with sh:path <path>, it means that there must not be a <path> path from the focus node back to the focus node itself. This is an interesting shape that is expressible in SHACL with an sh:equals that supports full paths:
    [ a sh:NodeShape ;
      sh:not [ a sh:PropertyShape ; sh:path <path> ; sh:equals [ sh:zeroOrOnePath <path> ] ] ]
dash:symmetric <true | false>: when applied to a property shape with sh:path <path>, it means that every node reachable with <path> must also be reachable with the inverse of that <path>. This can be done in SHACL with an sh:equals that supports full paths, by requiring that adding the <path> values to the inverse values yields nothing new:
    [ a sh:PropertyShape ;
      sh:path [ sh:inversePath <path> ] ;
      sh:equals [ sh:alternativePath ( <path> [ sh:inversePath <path> ] ) ] ]
dash:closedByTypes <true | false>: this shape depends on the assumption that RDF types are also SHACL shapes. Let (<p1> <p2> <p3> ...) be the properties mentioned by the shapes that are the type, or a superclass of the type, of the focus node. We can then write this in SHACL as:
    [ a sh:NodeShape ; sh:closed true ; sh:property [ sh:path <p1> ] ; sh:property [ sh:path <p2> ] ; sh:property [ sh:path <p3> ] ; ... ]
dash:hasValueIn (<c1> <c2> <c3> ...): when applied to a property shape with sh:path <path>, it means that there must exist a node reachable with <path> such that this node is one of (<c1> <c2> <c3> ...). This can be expressed in SHACL:
    [ a sh:NodeShape ;
      sh:or (
        [ sh:path <path> ; sh:qualifiedValueShape [ sh:in ( <c1> ) ] ; sh:qualifiedMinCount 1 ]
        [ sh:path <path> ; sh:qualifiedValueShape [ sh:in ( <c2> ) ] ; sh:qualifiedMinCount 1 ]
        [ sh:path <path> ; sh:qualifiedValueShape [ sh:in ( <c3> ) ] ; sh:qualifiedMinCount 1 ]
        ...
      ) ]
dash:hasValueWithClass <c>: when applied to a property shape with sh:path <path>, it means that there must exist a node reachable by <path> that has an rdf:type edge to the <c> node. This can be expressed in SHACL:
    [ a sh:PropertyShape ; sh:path ( <path> rdf:type ) ; sh:qualifiedValueShape [ sh:in ( <c> ) ] ; sh:qualifiedMinCount 1 ]
dash:uniqueValueForClass <c>: when applied to a property shape with sh:path <path>, it means the focus node is the only node with rdf:type <c> that can reach the value node (reached by <path>). In other words, every value node of the focus node must have exactly one inverse <path> edge to a node with rdf:type <c>. The count has to be taken over the nodes reached by the inverse path, not over their types, since several such nodes share the single type node <c>:
    [ a sh:PropertyShape ;
      sh:path <path> ;
      sh:property [ a sh:PropertyShape ;
                    sh:path [ sh:inversePath <path> ] ;
                    sh:qualifiedValueShape [ sh:path rdf:type ; sh:hasValue <c> ] ;
                    sh:qualifiedMinCount 1 ;
                    sh:qualifiedMaxCount 1 ] ]
dash:uriStart <s>: this constraint requires a string test on IRIs, which is unsupported in current SHACL.
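The encodings above that rely on sh:equals all come down to comparing two sets of value nodes. As a rough illustration, here is a minimal sketch in plain Python (not SHACL, and not a validator). It assumes a graph is just a set of (subject, predicate, object) triples and that <path> and <prop> are single predicates; the helper names and the toy data ("knows", "friend", nodes "a" to "d") are made up for this example.

```python
# Toy model (an assumption for illustration): a graph is a set of
# (subject, predicate, object) triples, and paths are single predicates.

def values(graph, focus, prop):
    """Nodes reachable from `focus` by following one `prop` edge."""
    return {o for (s, p, o) in graph if s == focus and p == prop}


def inverse_values(graph, focus, prop):
    """Nodes that reach `focus` by one `prop` edge (the inverse path)."""
    return {s for (s, p, o) in graph if o == focus and p == prop}


def subset_of(graph, focus, path, prop):
    # dash:subSetOf as an equality: (path ∪ prop) == prop
    a, b = values(graph, focus, path), values(graph, focus, prop)
    return (a | b) == b


def non_recursive(graph, focus, path):
    # Negation of: path == zeroOrOnePath(path), where the zero-length
    # step adds the focus node itself to the value set.
    a = values(graph, focus, path)
    return a != (a | {focus})


def symmetric(graph, focus, path):
    # Every forward value is also an inverse value: inv == (path ∪ inv)
    fwd = values(graph, focus, path)
    inv = inverse_values(graph, focus, path)
    return inv == (fwd | inv)


g = {
    ("a", "knows", "b"), ("b", "knows", "a"),
    ("a", "knows", "c"),
    ("c", "knows", "c"),
    ("a", "friend", "b"), ("a", "friend", "c"), ("a", "friend", "d"),
}

print(subset_of(g, "a", "knows", "friend"))  # knows(a) is within friend(a)
print(non_recursive(g, "c", "knows"))        # c knows itself
print(symmetric(g, "a", "knows"))            # c does not know a back
```

Each function is the set identity behind the corresponding shape: a subset test becomes "the union adds nothing new", and the self-loop test becomes "adding the focus node changes nothing".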
So, to recap, the dash constraints that are expressible in the current
SHACL are: dash:rootClass, dash:stem, dash:singleLine,
dash:coExistsWith, dash:subSetOf, dash:closedByTypes, dash:hasValueIn,
dash:hasValueWithClass and dash:uniqueValueForClass. The constraints
that are not expressible in the current SHACL but are expressible when
we allow for equality between paths are dash:nonRecursive and
dash:symmetric. The only constraint that is not expressible is
dash:uriStart.
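As a quick sanity check on one of the core-SHACL encodings in this recap, the sketch below compares the sh:and / sh:or encoding of dash:coExistsWith against the intended "both or neither" reading. It is plain Python working only on the number of values each path has; the function names are made up for this example, and the check covers small counts only.

```python
from itertools import product


def co_exists_encoding(n_path, n_prop):
    """The sh:and of two sh:or shapes, phrased over value counts."""
    left = (n_path <= 0) or (n_prop >= 1)   # maxCount 0 on path OR minCount 1 on prop
    right = (n_path >= 1) or (n_prop <= 0)  # minCount 1 on path OR maxCount 0 on prop
    return left and right


def co_exists_intended(n_path, n_prop):
    """Intended reading: either both paths have values, or neither does."""
    return (n_path > 0) == (n_prop > 0)


# Compare both readings for every pair of counts from 0 to 3.
agree = all(
    co_exists_encoding(a, b) == co_exists_intended(a, b)
    for a, b in product(range(4), repeat=2)
)
print(agree)
```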
It is obvious that some of these translations are not practical as
they require a lot of duplication. However, features are added to a
language because there is some need for them. A feature could be an
abbreviation of a common pattern (and adding explicit keywords can
help optimize these shapes), or it could be something that seems
not to be expressible but is nevertheless desired by practitioners. I'm
interested in answering the question whether these newly proposed
features are already expressible. If they are, then that is useful to
know, for example, for people who write SHACL validators, as they can
make use of existing code to implement new features. If the proposed
features are not already expressible, it is also interesting to know
that their addition really adds expressive power. In our own work, for
example, we found that sh:equals and sh:disjoint really add
expressive power to SHACL.
In my view, this hints at the extension of SHACL with equality (and
disjointness) constraints that support full paths. This is a natural
extension and will give more expressive power, as is desired by dash.
Another idea is to have some kind of macro system for SHACL. This way,
we can add constraints like the ones expressible in the current SHACL
and proposed by dash. SHACL can be complex to write and users may want
to write macros for constraints they find interesting.
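To make the macro idea a bit more concrete, here is a minimal sketch of what expanding one such macro could look like. It is plain Python that produces the core-SHACL encoding of dash:coExistsWith as Turtle text; nothing like this exists in SHACL itself, and the MACROS table, the function names and the ex: properties are all hypothetical.

```python
def expand_co_exists_with(path: str, prop: str) -> str:
    """Expand a hypothetical coExistsWith macro into a core-SHACL node shape."""
    return (
        "[ a sh:NodeShape ;\n"
        "  sh:and (\n"
        f"    [ sh:or ( [ sh:path {path} ; sh:maxCount 0 ] [ sh:path {prop} ; sh:minCount 1 ] ) ]\n"
        f"    [ sh:or ( [ sh:path {path} ; sh:minCount 1 ] [ sh:path {prop} ; sh:maxCount 0 ] ) ]\n"
        "  ) ]"
    )


# Hypothetical registry: macro name -> function that builds the expansion.
MACROS = {"dash:coExistsWith": expand_co_exists_with}


def expand(macro: str, path: str, argument: str) -> str:
    """Look up a macro by name and expand it for the given path and argument."""
    return MACROS[macro](path, argument)


shape = expand("dash:coExistsWith", "ex:firstName", "ex:lastName")
print(shape)
```

A validator that already supports sh:and, sh:or and the count constraints could then handle the macro without any new validation code, which is the kind of reuse mentioned earlier.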
Hopefully, this was an interesting read. As I'm only human, there may
be some mistakes in my interpretation of the dash constraints. Any
feedback is welcome.