Expressiveness of SHACL Features
Note: This post is an adaptation of Section 1.2 of my PhD thesis, and serves as a general intro to our paper on the subject. The purpose of publishing it as a standalone blogpost is to further disseminate my writings. It is part of a short series of posts representing the Introduction of my thesis:
- Section 1.1: SHACL in a nutshell
- Section 1.2: Expressiveness (this post)
- Section 1.3: Recursion
- Section 1.4: Provenance
Given the definition of SHACL as a constraint language, users are
often interested in what you can actually do with it; what shapes can
you actually write? When a language is defined, many criteria are
postulated on what the language should be able to describe. Often,
these are then directly included in the language. A simple example is
the class constraint component in SHACL. This allows for writing
shapes stating that the focus node must be of a certain RDF class (or
subclass thereof). It is easy to observe that this feature is technically
redundant: if we removed this constraint component from SHACL,
the expressive power would remain the same, because we can already
express it using other constraint components. For example, given an
IRI :c, every shape of the form:
[ a sh:NodeShape ; sh:class :c ]
can be written as the shape:
[ a sh:PropertyShape ; sh:path ( rdf:type [ sh:zeroOrMorePath rdfs:subClassOf ] ) ; sh:hasValue :c ]
Clearly, constraints about classes are useful, so the class constraint component is included in the language. However, our example shows that complex path expressions are more fundamental to SHACL: because we have these, we can already express class constraints.
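To see the equivalence on concrete data, here is a minimal Python sketch that uses plain tuples rather than an RDF library. The toy graph, the node names, and the helper functions are all hypothetical; the sketch only illustrates the intended semantics (first follow rdf:type, then zero or more rdfs:subClassOf edges) and is not how a SHACL validator is implemented.

```python
# Toy data graph as a set of (subject, predicate, object) triples.
# All node and class names are made up for illustration.
GRAPH = {
    ("alice", "rdf:type", "Student"),
    ("Student", "rdfs:subClassOf", "Person"),
    ("Person", "rdfs:subClassOf", "Agent"),
    ("rex", "rdf:type", "Dog"),
}


def step(nodes, predicate):
    """Nodes reachable from any node in `nodes` by one `predicate` edge."""
    return {o for (s, p, o) in GRAPH if p == predicate and s in nodes}


def zero_or_more(nodes, predicate):
    """Follow `predicate` zero or more times (reflexive-transitive closure)."""
    reached = set(nodes)
    frontier = set(nodes)
    while frontier:
        frontier = step(frontier, predicate) - reached
        reached |= frontier
    return reached


def conforms_class(focus, cls):
    """Class constraint: some type of the focus node is `cls` or a subclass of it."""
    for t in step({focus}, "rdf:type"):
        if cls in zero_or_more({t}, "rdfs:subClassOf"):
            return True
    return False


def conforms_path_has_value(focus, cls):
    """hasValue over the path: rdf:type, then zero or more rdfs:subClassOf."""
    values = zero_or_more(step({focus}, "rdf:type"), "rdfs:subClassOf")
    return cls in values
```

On this toy graph the two functions agree for every combination of node and class, which is what the rewriting of the class constraint into a path expression predicts.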
This discussion of the class constraint component is quite clear-cut; it is easy to see that it is expressible by other features. Now, a natural question arises: does this also happen for other constraint components? Can we, for example, express the equals constraint component using other SHACL features? Here it is not obvious whether this is possible, and if it is not, how can you be certain that it is not expressible? These are the expressiveness questions we investigate in our paper (and to actually answer the question: no, you cannot express equality with the other features).
To add to the relevance of this kind of investigation, the SHACL
community is interested in extending SHACL with other constraint
components called the DASH Constraint Components. It turns out that
most of these constraint components are not easily (or at all)
expressible with SHACL as it is, but if we extend SHACL with a more
powerful version of the equality constraint component, the DASH
extensions become expressible! As a concrete example, DASH proposes a
constraint dash:nonRecursive on property shapes that states that there
must not be a path (following sh:path) from the focus node back to the
focus node. In other words, there may not be a self-loop (following
the path) in the data graph for the focus node. This can be expressed
with the shape:
:noLoop a sh:NodeShape ; sh:not [ a sh:PropertyShape ; sh:path <path> ; sh:equals [ sh:zeroOrOnePath <path> ] ] .
However, this shape is technically not a correct SHACL shape because
the sh:equals keyword may only take IRIs as values, not blank nodes
(and thus not paths). Nevertheless, the meaning is clear.
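To check that the shape above really captures "no self-loop", it helps to spell out the sets involved. The following is a minimal Python sketch under simplifying assumptions: the path is a single hypothetical predicate "p", the graph is a toy set of tuples, and all function names are invented for illustration.

```python
# Toy data graph; node names and the predicate "p" are made up.
GRAPH = {
    ("a", "p", "b"),
    ("b", "p", "b"),  # b has a self-loop via p
    ("c", "p", "d"),
}


def path_values(focus, predicate):
    """Value nodes reached from `focus` by following the path once."""
    return {o for (s, p, o) in GRAPH if s == focus and p == predicate}


def zero_or_one_values(focus, predicate):
    """Value nodes of the zero-or-one path: the focus node itself plus path_values."""
    return {focus} | path_values(focus, predicate)


def conforms_no_loop(focus, predicate):
    """Negation of the extended equality between the path and its zero-or-one variant."""
    equals_holds = path_values(focus, predicate) == zero_or_one_values(focus, predicate)
    return not equals_holds
```

The two sets differ only in that the second always contains the focus node, so they are equal exactly when the focus node can reach itself via the path. Negating that equality therefore yields the non-recursiveness condition.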
The point here is not to say that DASH is superfluous; rather, it is to give insight into what is fundamentally happening when you add features to SHACL. It turns out that adding full-path support to the equality feature lets us also express some other DASH constraints, indicating that adding full equality to SHACL is useful. Furthermore, it may give insights to developers implementing SHACL validators: not every added feature requires implementing new algorithms to support that feature.