Expressiveness of SHACL Features

Note: This post is an adaptation of Section 1.2 of my PhD thesis and serves as a general introduction to our paper on the subject. I am publishing it as a standalone blog post to further disseminate my writing. It is part of a short series of posts covering the Introduction of my thesis.

Given the definition of SHACL as a constraint language, users often want to know what they can actually do with it: what shapes can you actually write? When a language is designed, many requirements are put forward for what it should be able to describe, and these are often included directly as language features. A simple example is the class constraint component in SHACL, which lets you write shapes stating that the focus node must be an instance of a certain RDF class (or of a subclass thereof). It is easy to see that this feature is technically redundant: if we removed this constraint component from SHACL, the expressive power would remain the same, because we can already express it using other constraint components. For example, given an IRI :c, every shape of the form:

[ a sh:NodeShape ;
  sh:class :c
]

can be written as the shape:

[ a sh:PropertyShape ;
  sh:path ( rdf:type [ sh:zeroOrMorePath rdfs:subClassOf ] ) ;
  sh:hasValue :c
]

Clearly, constraints about classes are useful, so the class constraint component is included in the language. However, our example shows that complex path expressions are the more fundamental feature of SHACL: since we have them, we can already express class constraints.
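To make the path reading of class membership concrete, here is a minimal Python sketch. It is not a SHACL validator: it only evaluates one rdf:type step followed by any number of rdfs:subClassOf steps over a toy set of triples and then checks whether the class is among the resulting value nodes, which is what sh:hasValue asks for. The graph, the ex: names, and all function names are made up for illustration.

```python
# Toy data graph as a set of (subject, predicate, object) triples.
# All ex: names are hypothetical and exist only for this illustration.
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

GRAPH = {
    ("ex:rex", TYPE, "ex:Dog"),
    ("ex:Dog", SUBCLASS, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS, "ex:Animal"),
    ("ex:tweety", TYPE, "ex:Bird"),
}


def step(graph, nodes, predicate):
    """Nodes reachable from `nodes` by following `predicate` exactly once."""
    return {o for (s, p, o) in graph if p == predicate and s in nodes}


def zero_or_more(graph, nodes, predicate):
    """Nodes reachable from `nodes` by following `predicate` zero or more times."""
    reached = set(nodes)
    frontier = set(nodes)
    while frontier:
        # Only keep nodes we have not seen yet, so cycles terminate.
        frontier = step(graph, frontier, predicate) - reached
        reached |= frontier
    return reached


def class_path_values(graph, focus):
    """Value nodes of the sequence path: rdf:type, then rdfs:subClassOf*."""
    return zero_or_more(graph, step(graph, {focus}, TYPE), SUBCLASS)


def conforms_has_value(graph, focus, cls):
    """The sh:hasValue check on that path: `cls` must be among the value nodes."""
    return cls in class_path_values(graph, focus)
```

In this toy graph, ex:rex satisfies the check for ex:Animal because ex:Dog is a subclass of ex:Mammal, which is a subclass of ex:Animal, while ex:tweety does not.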

The case of the class constraint component is clear-cut: it is easy to see that it is expressible by other features. Now, a natural question arises: does this also happen for other constraint components? Can we, for example, express the equals constraint component using other SHACL features? Here it is not obvious that this is possible, but how can you be certain that it is not? These are the expressiveness questions we investigate in our paper (and to actually answer the question: no, you cannot express equality with the other features).

To add to the relevance of this kind of investigation, the SHACL community is interested in extending SHACL with additional constraint components, known as the DASH Constraint Components. It turns out that most of these constraint components are not easily (or at all) expressible in SHACL as it is, but if we extend SHACL with a more powerful version of the equality constraint component, the DASH extensions become expressible! As a concrete example, DASH proposes a constraint dash:nonRecursive on property shapes, stating that there must be no path (as given by sh:path) from the focus node back to the focus node. In other words, the focus node may not have a self-loop (following the path) in the data graph. This can be expressed with the shape:

:noLoop a sh:NodeShape ;
  sh:not [ a sh:PropertyShape ;
           sh:path <path> ;
           sh:equals [ sh:zeroOrOnePath <path> ]
         ] .

However, this shape is technically not a correct SHACL shape, because sh:equals may only have an IRI as its value, not a blank node (and thus not a path expression). Nevertheless, the meaning is clear.
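Why the equality trick works may be easier to see with plain sets. The following Python sketch assumes, for simplicity, that the path is a single predicate, and all names in it are hypothetical. The zero-or-one version of the path always contains the focus node itself, so the two sets of value nodes are equal exactly when the focus node reaches itself via the path; negating the equality therefore yields the non-recursive condition.

```python
def path_values(graph, focus, predicate):
    """Value nodes reached from `focus` by following `predicate` once."""
    return {o for (s, p, o) in graph if s == focus and p == predicate}


def zero_or_one_values(graph, focus, predicate):
    """Value nodes of the zero-or-one path: the focus node plus one-step values."""
    return {focus} | path_values(graph, focus, predicate)


def equals_holds(graph, focus, predicate):
    """The hypothetical 'full' equality: both paths yield the same value nodes."""
    return path_values(graph, focus, predicate) == zero_or_one_values(
        graph, focus, predicate
    )


def non_recursive(graph, focus, predicate):
    """Negated equality: true exactly when `focus` has no self-loop."""
    return not equals_holds(graph, focus, predicate)


# Hypothetical example data: ex:a points to itself, ex:b does not.
EXAMPLE = {
    ("ex:a", "ex:knows", "ex:a"),
    ("ex:a", "ex:knows", "ex:b"),
    ("ex:b", "ex:knows", "ex:a"),
}
```

Here ex:a fails the non-recursive check because it points to itself, while ex:b passes. A node without any outgoing edges also passes, since its set of path values is empty while the zero-or-one set still contains the node itself.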

The point here is not to say that DASH is superfluous. Rather, it is to give insight into what is fundamentally happening when you add features to SHACL. It turns out that adding full-path support to the equality feature also lets us express some other DASH constraints, which indicates that adding full equality to SHACL is useful. Furthermore, it may be informative for developers implementing SHACL validators: not every added feature requires implementing new algorithms to support it.