triple terms in subject position - issues with RDF/XML? #138
Comments
IIRC the last discussion we had about this issue was in a WG meeting about a month ago, and syntax wise the result was that
The latter problem was considered crucial enough to resolve the question of whether triple terms should be allowed in subject position with a no, although there was no consensus on whether it would be desirable to allow them in subject position (at the time my position on that question was no as well). On the latter question I have changed my opinion to yes, triple terms should be allowed in subject position, and I never considered the question of whether JSON-LD can properly represent them in subject position to be crucially important. To me N-Triples and its derivatives (Turtle, N-Quads, TriG, SPARQL query syntax) are the essential syntaxes, whereas RDF/XML, JSON-LD and RDFa aren't. |
Semantics TF: https://www.w3.org/2024/11/22-rdf-star-minutes.html#724b
WG: https://www.w3.org/2024/11/14-rdf-star-minutes.html
I suggested using a
None has been mentioned. See links. |
In fact, the current thinking is that JSON-LD doesn't have a problem with triple terms in the subject position. An example of this might be the following:

{
  "@triple": {
    "@id": "s",
    "p": "o"
  },
  "rdf:type": "rdf:TripleTerm"
}

This would be equivalent to the (hypothetical) |
Yes, the JSON-LD structure does support that. Just as JSON-LD literal objects (using The risks of conflating resources with fixed meanings with various more granular concretizations thereof are very similar between literals and triple terms. We should only extend syntaxes with expressive power that must be allowed. We must also consider the risks of such conflations becoming common because of non-expert use. This is of course why we added reifiers, so hopefully this risk is properly managed now. I am still very wary about adding more expressive power in concrete syntaxes. |
There are several threads to be pulled together -
In semantics, yes, I think that was the state at WG 2024-11-14, but that was not about syntax. As I read it, it is not a matter of technical barriers. A WG discussion should be on the wider issue first (maybe rename this issue to remove the tie to RDF/XML?). There are a number of intertwined points to consider:
1. Syntax
On pure syntax grounds, there is no evidence that something for RDF/XML can't be done; the format can support compound structures. But fitting it with striped RDF/XML syntax needs considering. To have a design, we should have use cases.
2. Use Cases
IIRC, 2 use cases have been presented:
3. Concerns
Several people have concerns about conceptually allowing triple terms as subjects. Other people argue for unrestricted form as a principle.
4. Risk
A continuation maintenance and new features mode allows "class 4" new features but does not allow changing the existing spec to remove or incompatibly alter a feature. In other words, the syntax design approach must be right first time if it is the RDF 1.2 published document of the current charter.
5. Resources
There needs to be a commitment to produce PRs, put time into review and come to consensus, and produce tests. Given time constraints, this needs to be balanced against other work. Considering the current rate of progress of RDF documents, the limiting factor for the WG completing the work is not just WG discussion, although discussion time itself is a constraint. |
I wrote down these examples before reading your answer - I think we separately arrived at a similar notion :-) Regarding the syntax issue, an early attempt for RDF/XML:
The starting
Verbosely (same way as above):
Or, using a shorthand (by analogy with nodeID for reified statements)
|
It's not the same 😄

<<( ex:john ex:married ex:paul )>> ex:place ex:CanterburyCathedral .

is not the same as

_:r1 rdf:reifies <<( ex:john ex:married ex:paul )>> .
_:r1 ex:place ex:CanterburyCathedral .

The latter is the expansion of occurrence syntax:

<< ex:john ex:married ex:paul >> ex:place ex:CanterburyCathedral .

The subject-triple-term form (case 1 in this comment) is a strange thing to say. The triple term is the abstract fact, not its use in a graph to give the RDF Model. Facts don't have a "place" - the event being modelled (reified) has a place. This is the problem with the RDF-star CG report and why several people are arguing against triple-terms as subject. |
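The expansion of occurrence syntax described above can be sketched in a few lines of Python. This is only an illustration, not how any parser implements it: triples are plain tuples of strings, and the helper name `expand_occurrence` and the `_:rN` blank node labels are assumptions made for this example.

```python
from itertools import count

RDF_REIFIES = "rdf:reifies"


def expand_occurrence(triple, annotations, fresh=None):
    """Expand `<< s p o >> p2 o2` into explicit reifier triples.

    `triple` is the (s, p, o) inside `<< >>`; `annotations` is a list of
    (predicate, object) pairs stated about the occurrence.
    """
    if fresh is None:
        fresh = count(1)  # hypothetical blank node label generator
    s, p, o = triple
    reifier = f"_:r{next(fresh)}"
    triple_term = f"<<( {s} {p} {o} )>>"
    # The reifier, not the triple term, becomes the subject of the annotations.
    expanded = [(reifier, RDF_REIFIES, triple_term)]
    expanded.extend((reifier, pred, obj) for pred, obj in annotations)
    return expanded


expanded = expand_occurrence(
    ("ex:john", "ex:married", "ex:paul"),
    [("ex:place", "ex:CanterburyCathedral")],
)
for s, p, o in expanded:
    print(s, p, o, ".")
```

The triple term only ever appears in object position here, which is why the occurrence form does not depend on triple terms being allowed as subjects.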
Reusing Doable but looks like more significant impact to existing implementations which can be avoided. A specific tag would be simpler e.g. Maybe more important to put time into occurrence syntax for RDF/XML. |
I was referring to the XML syntax structure (-:
I've heard this argument in different forms and using different terms (abstract fact / type, propositions, conceptualizations), but I haven't heard why it wouldn't be possible, or why it would even be strange. RDF is in the business of describing, well, resources, which a triple term is. Please read before answering - in many cases, there is a good reason why it is not advisable from a modelling perspective, as the seminal example shows (and other cases). The same problem exists in databases with the choice between surrogate and composite primary keys.
|
Good point.
Yes, that could be an alternative. I was trying to retain the standard practice (as I saw it) of having a subject represented by an
Also an option.
Fair enough - the goal of this issue is to check the possibility of triple terms in subject position. |
Probably a discussion for the RDF Primer work going on. w3c/rdf-primer#16, section 3.5. |
To repeat the argument I made in last Friday's Semantics TF call: Most use cases for RDF-star are concerned with the occurrence of a statement, i.e. with some specific instance (the reification represented by the reifier) of the statement (the type represented by the abstract triple term), e.g. recording its provenance or adding qualifying detail. In that respect I still think that it is the right decision to let the syntactic shortcuts in Turtle-star refer to such instances, identified by reifiers, and not to all statements of that type. However, it recently occurred (no pun intended) to me that this design went a step too far in that it also aims to resolve a more general problem, namely that of multi-part statings. We had examples like the following to illustrate how multi-part annotations on triple terms would immediately lead into a messy mix-up of annotations, e.g.:
Letting Bob's and Carol's provenance annotations refer to different instances, represented by different reifiers, reliably avoids that problem. So far, so good! However, triple terms are terms in the general sense of RDF - just like IRIs and blank nodes, and those other RDF terms in general suffer from exactly the same problem, as James recently pointed out to me in private conversation, e.g.:
I think that everybody will agree that mimicking our approach to triple term reifications and suggesting to replace all the reference to In conclusion it seems clear to me that if we forbid triple terms in subject position we are merely trying to cure the symptoms of a much more general problem. That can't work because our approach to triple term reification doesn't scale to the whole of RDF, and therefore we should drop the attempt. The syntactic sugar that Turtle-star provides should be enough to save statement annotation from the "seminal mistake" and facilitate predictable and dependable modelling of annotations in practice. However, we should refrain from trying to design an overly tight straitjacket Use cases: the ones brought forward - Risk: whatever we decide, we most probably won't be able to change it later:
All things considered, it seems prudent to stay on the permissive side. |
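The multi-part annotation mix-up discussed in this comment can be illustrated with a small Python sketch. The data is entirely hypothetical (the predicates `:statedBy` and `:statedOn` and the values are made up for the example, not taken from the thread); the point is only to contrast annotating one shared node with annotating one reifier per stating.

```python
from collections import defaultdict


def annotate_directly(annotation_sets):
    """Attach every annotation to the same node (e.g. the triple term itself)."""
    merged = defaultdict(set)
    for props in annotation_sets:
        for pred, value in props.items():
            merged[pred].add(value)
    return dict(merged)


def annotate_via_reifiers(annotation_sets):
    """Give each set of annotations its own (hypothetical) reifier node."""
    return {f"_:r{i}": dict(props) for i, props in enumerate(annotation_sets, start=1)}


# Hypothetical provenance annotations from two people about the same statement.
bob = {":statedBy": ":Bob", ":statedOn": "2020"}
carol = {":statedBy": ":Carol", ":statedOn": "2021"}

direct = annotate_directly([bob, carol])
reified = annotate_via_reifiers([bob, carol])

print(direct)   # who stated it when can no longer be told apart
print(reified)  # each reifier keeps its own group of annotations
```

The same merging happens for any shared subject node, whether it is a triple term, an IRI or a blank node, which is the general problem the comment points at.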
I agree.
I draw the opposite conclusion: while it may well be that implementations are affected either way, not having triple terms as subjects and adding the possibility later does not invalidate earlier data, whereas removing a feature does. |
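The compatibility argument above can be made concrete with a minimal sketch, assuming a toy data model in which a triple term is a nested 3-tuple and every other term is a string. The two validator functions are hypothetical and only model the subject-position rule, nothing else from the RDF 1.2 drafts.

```python
def is_triple_term(term):
    """In this toy model a triple term is a nested 3-tuple."""
    return isinstance(term, tuple) and len(term) == 3


def valid_strict(graph):
    """Restrictive rule: triple terms are not allowed in subject position."""
    return all(len(t) == 3 and not is_triple_term(t[0]) for t in graph)


def valid_permissive(graph):
    """Permissive rule: any term, including a triple term, may be a subject."""
    return all(len(t) == 3 for t in graph)


term = (":s", ":p", ":o")
# Data written under the restrictive rule: annotation goes through a reifier.
old_data = [("_:r1", "rdf:reifies", term), ("_:r1", ":source", ":doc")]
# Data that needs the permissive rule: the triple term itself is the subject.
new_data = [(term, ":source", ":doc")]

print(valid_strict(old_data), valid_permissive(old_data))  # restrictive data stays valid
print(valid_strict(new_data), valid_permissive(new_data))  # permissive data does not
```

Relaxing the rule keeps all existing data valid, while tightening it later would reject data such as `new_data`. The sketch says nothing about the effect on code, which is the point raised in the reply that follows.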
Not data, but code, and that will generate push back against changing the definition of what is allowed in the data. So it will become impossible. |
Well, I presume the The simple propositions, when stated, are simply true. Not contextually (conditionally/temporally/situationally). That is the directness, and limitation, of RDF statements. But the simple

:Berlin :likedBy :John
        {| a :Occasion ; :in :Spring ; :of 2015 |}
        {| a :Occasion ; :in :Autumn ; :of 2017 |} ;
    :likedBy :Mary
        {| a :GeneralSentiment ; :in :Autumn |} .

No need to remodel using e.g. prov:specializationOf and break all simple "find all who liked Berlin" queries with intermediary bnodes ("fluent stand-ins", if you will). Are there scenarios where one would need to make conflations between propositions and detailed circumstances? Or at least where a reifier would significantly complicate the use case, rather than aid in avoiding conflations emerging due to an oversight? |
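A rough Python sketch of why the annotated form keeps the simple query simple: the asserted triples stay flat, and the annotation blocks hang off separate reifiers. The in-memory layout, the `_:rN` labels and the helper names are assumptions for illustration only; no RDF library is used.

```python
# Asserted triples stay flat, exactly as without annotations.
asserted = {
    (":Berlin", ":likedBy", ":John"),
    (":Berlin", ":likedBy", ":Mary"),
}

# Each annotation block gets its own (hypothetical) reifier node.
reifiers = {
    "_:r1": {"reifies": (":Berlin", ":likedBy", ":John"),
             "a": ":Occasion", ":in": ":Spring", ":of": 2015},
    "_:r2": {"reifies": (":Berlin", ":likedBy", ":John"),
             "a": ":Occasion", ":in": ":Autumn", ":of": 2017},
    "_:r3": {"reifies": (":Berlin", ":likedBy", ":Mary"),
             "a": ":GeneralSentiment", ":in": ":Autumn"},
}


def who_liked(place):
    """The simple query: no intermediary nodes to traverse."""
    return sorted(o for s, p, o in asserted if s == place and p == ":likedBy")


def details_for(triple):
    """Look up the qualifying detail only when it is wanted."""
    return [props for props in reifiers.values() if props["reifies"] == triple]


print(who_liked(":Berlin"))
print(details_for((":Berlin", ":likedBy", ":John")))
```

The two occasions for :John remain distinct because each lives on its own reifier, while `who_liked` never has to know that reifiers exist.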
My question (which, I have to say, I have been trying to get a straight answer to) - (please read until the end) Is there a fundamental/semantic/philosophical/... problem with describing triple terms directly? E.g., in my local KG, I would like to say
Is this semantically/philosophically/... speaking bonkers/strange/..., or not? Note - I'm not talking about what is the most robust/sensible modeling solution. There seem to be mixed signals about this. |
@william-vw Your example uses shorthand reifier/occurrence syntax and has a triple term in the object position. Is that intended? |
@afs thanks! it was not the intention... |
Of course.
I say yes, and refer to the notion(s) of truth-bearers for details (as all things philosophical, indefinitely debated). I still accept to introduce a "simple meaning as truth" version of propositions as resources which we can directly refer to in RDF. Because we've always had those in RDF: "An RDF triple is said to be asserted in an RDF graph if it is an element of the RDF graph" (emphasis mine). This may in the process even make more sense of

<< :Bob :interestedIn :TheMonaLisa >> a :Statement ;
    :establishedAt 2004 ; :establishedBy :Alice .
<< :Bob :interestedIn :TheMonaLisa >> a :Statement ;
    :establishedAt 2007 ; :establishedBy :Bob .

But I see many problems (many of them more subtle at first, hiding the very problems that using reifiers can instead solve) if concrete syntaxes were to support triple terms as subject of expressions (as we've already seen in RDF-star examples over the years). Just as I see little to no practical value for using literals as subjects in data aspiring to be interoperable in an open world, and lots of potential conflation and misunderstanding. Conceptually getting to grips with shared identities is tricky enough as it is. But this is not to say that I don't see that, at least under "laboratory conditions", you may want to reason in all directions:

_:reifiedBy owl:inverseOf rdf:reifies .
:propositionEstablishedAt owl:propertyChainAxiom (_:reifiedBy :establishedAt) .
:propositionEstablishedBy owl:propertyChainAxiom (_:reifiedBy :establishedBy) .
<<( :bob :interestedIn :monaLisa )>> a rdf:Proposition ;
    :propositionEstablishedAt 2004, 2007 ;
    :propositionEstablishedBy :Alice, :Bob .
2004 :dateOfPropositionEstablished <<( :bob :interestedIn :monaLisa )>> .

This is where Symmetric and Generalized RDF has its place. And I expect any RDF library supporting OWL implementations on top to allow that form internally. Given the lack of real use cases for it, old examples showing the problems, and following the rule of least power, I am thus much more concerned about supporting triple terms as subjects, in concrete syntaxes, at this point. I'm far less worried about RDF implementations lacking support for what is needed for reasoning to happen; I think most already do. As does SPARQL. |
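The property chain reasoning above can be approximated with a small hand-rolled Python sketch. It is not an OWL reasoner: it applies just the one rule "if r rdf:reifies t and r P v, then t P' v", with triples as plain tuples and a nested tuple standing in for the triple term. The function name `apply_chain` is an assumption for this example.

```python
RDF_REIFIES = "rdf:reifies"


def apply_chain(triples, reifier_prop, derived_prop):
    """Derive (t, derived_prop, v) from (r, rdf:reifies, t) and (r, reifier_prop, v).

    This mirrors a chain of the inverse of rdf:reifies followed by reifier_prop.
    """
    derived = set()
    for r, p, term in triples:
        if p != RDF_REIFIES:
            continue
        for s, p2, value in triples:
            if s == r and p2 == reifier_prop:
                derived.add((term, derived_prop, value))
    return derived


term = (":bob", ":interestedIn", ":monaLisa")
triples = [
    ("_:r1", RDF_REIFIES, term),
    ("_:r1", ":establishedAt", 2004),
    ("_:r1", ":establishedBy", ":Alice"),
    ("_:r2", RDF_REIFIES, term),
    ("_:r2", ":establishedAt", 2007),
    ("_:r2", ":establishedBy", ":Bob"),
]

at = apply_chain(triples, ":establishedAt", ":propositionEstablishedAt")
by = apply_chain(triples, ":establishedBy", ":propositionEstablishedBy")
print(sorted(v for _, _, v in at))
print(sorted(v for _, _, v in by))
```

The derived triples have the triple term as subject, and they no longer record which date belongs to which person. That loss of pairing is the conflation the reifiers were keeping apart.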
@niklasl thanks for weighing in. Trying to understand your reply -
Are you referring to the fact that triple terms represent statements that may or may not have been asserted? Else I am unsure what you are saying. |
No, IMO it is not bonkers/strange. For more discussion, especially w.r.t. robustness, see my comment above. |
It's commonplace, and a result of the limitations of a strictly binary formalism, so nothing to be astonished about.
IMO your argument falls prey to the classic "If you have a hammer, every problem looks like a nail". Would you suggest that all the examples above would better be modeled with the shiny new triple term mechanism? That will only make me come up with more complex examples ;-) My point is: the problem that the focus on occurrences and its supporting syntactic sugar solves is actually a deeper problem in RDF. That doesn't, IMO, make the approach to statement annotation via occurrences wrong. But it does speak against making the approach a general rule. @william-vw 's example above is neither wrong nor bonkers. It is problematic, yes, but not more problematic than the examples about :John's activities above. So it's a general problem of RDF, not of triple terms specifically. So we shouldn't try to address it in piecemeal fashion only for triple terms, and against the way RDF in general works. |
I am looking to what triple terms should denote: formal, atomic propositions. The "meaning itself", as a logical atom/unit/element of a model (an interpretation, |
That example does appear very odd. Can you point me to a dataset with such data? (Does the tired John who (exists?) on Sunday and buys bread like swimming at Woolworth during breaks while reading when overworked?)
Without knowing the application, I'd assume that model is too simple for any interoperable use. If you still need (some) of those simple predicates ( Note that I do not think reifiers should be front and center of any reasonable modelling; they work for marginalia about the occasional detail in an otherwise preferably flat model of direct predicates. (It's a hammer I'd use for banging in nails that stick out from the walls; and only if walls are what the model needs.)
But why, instead of addressing those, add a new dimension of potential error, without motivating use cases? I don't see reifiers as any more against how RDF works than to use date literals as objects but not to denote I'd rather avoid opening up such a new category of conflation and, after the fact, try to invent new kinds of "punning" heuristics, if even possible (which may lead to attempts to solve it by nesting more triples in triples; more likely to just aggravate the problem). |
The example illustrates how easy it is to mix up multi-part statements, not just multi-part annotations. Not every issue is best represented by a concrete use case, some problems are easier to understand with a more abstract example.
I'm aware that that is your perspective, and I guess you are aware that mine is quite different. I'm not so much interested in the "marginalia" aspect (and I'm also not working as a librarian) but in qualification of statements as a way to make complex relations more accessible, easier to understand. IMO this approach is the core of LPG's success and we should take it very seriously. But I'll now try to re-start this discussion, without getting sucked into a spiral of comments on comments on comments... ;-) Over the holidays I thought more about this issue. As you know, just like you I was long strongly opposed to triple terms in subject position, because most use cases deal with occurrences, i.e. specific instances of a statement, and triple terms are types and therefore need the introduction of an indirection in modelling that creates an identifier for that occurrence. That is easy to overlook or forget, and that may create problems down the road and at least require re-modelling, hence your reservations (and mine, earlier). There are however valid use cases for speaking about a statement itself, as a type, beyond RDF meta modelling, e.g.:
I was long thinking that such use cases can be covered by strictly understanding them as instances, e.g. the instance of asserting the source or credibility of a statement. However, that’s not doing them justice: they are entities of interest in their own right, just as well as an IRI referring to such an entity (e.g. the IRI of a webpage explaining Einstein's theory, etc). The occurrence would then be the act of stating them, but that's not the statement itself, and therefore only a hackish resolution to the problem of annotating the statement itself.
That at its heart is exactly the same problem as with multipart annotations on triple terms. To illustrate we could, to mimic the occurrences approach to triple terms, introduce an occurrence syntax for terms, like
Beware, I’m not suggesting this! I just want to illustrate the principle of what we did with the occurrence syntax for triple terms, and how it deviates from how the problem is handled in RDF in general. In general we get by with simply :John and then all sorts of branching out via blank nodes etc. That is just how RDF is designed. We shouldn’t aim to treat triple terms completely differently. Certainly the majority of use cases is occurrence oriented, and that majority is well covered by the easy to use and succinct annotation syntax, avoiding the pitfalls described above. IMO that should be enough of a safeguard and should nudge modelling in the right direction. In fact I think it has to be enough because to do more, by preventing valid modelling of propositions as entities in their own right, is blocking equally valid use cases. They may be rarer, but they are not as rare and outlier-ish as I thought. Therefore I became convinced that it is overanxious and overreaching to force them into a corner where they can only be realized as hacks. |
I believe the current consensus of the Semantics TF is that triple terms should be allowed in subject position. (E.g., see the current proposal for the RDF 1.2 semantics)
It was raised before that one of the barriers for this is the RDF/XML syntax.
This issue is meant to gather comments on these syntax issues, and allow for discussion in a working group meeting.
PS: I am unsure how triple terms would currently be represented as objects in RDF/XML. I believe the current RDF 1.2 document only refers to reification which is the same as the RDF 1.1 section. I have some ideas for representing triple terms but I would first like to know if there is existing work.