The specification is intended for discussion within the RDF Stream Processing Community Group. Its content does not yet represent the consensus of the Community Group.
This specification is incomplete.
The metamodel for the abstract syntax of RDF streams is defined in the namespace http://www.w3.org/ns/rsp#
and we use the prefix rsp
for this namespace. Additional prefixes used in this document are the following:
prefix | namespace IRI | definition |
rsp | http://www.w3.org/ns/rsp# | The RSP rnamespace |
rdf | http://www.w3.org/1999/02/22-rdf-syntax-ns# | The RDF namespace [[!RDF-SCHEMA]] |
rdfs | http://www.w3.org/2000/01/rdf-schema# | The RDFS namespace [[!RDF-SCHEMA]] |
xsd | http://www.w3.org/2000/10/XMLSchema# | XML Schema Namespace [[!XMLSCHEMA11-2]] |
owl | http://www.w3.org/2002/07/owl# | The OWL namespace [[!OWL2-OVERVIEW]] |
owl-time | http://www.w3.org/2006/time# | The OWL-TIME namespace [[OWL-TIME]] |
prov | http://www.w3.org/ns/prov# |
The PROV namespace [[!PROV-DM]] |
dul | http://www.ontologydesignpatterns.org/ont/dul/DUL.owl# |
The DOLCE+DnS Ultralet namespace (http://lov.okfn.org/dataset/lov/vocabs/dul) [[DUL]] |
(others) | (various) | All other namespace prefixes are used in examples only. In particular, IRIs starting with "http://example.com" represent some application-dependent IRI [[!IRI]] |
Let T^{all} denote the class of all temporal entities. This specification is neutral regarding the formal specification of temporal entities.
We provide an identifier, rsp:TemporalEntity
, for the rdfs:Class
of all
temporal entities.
The following material needs attention from an individual who has reviewed the literature regarding temporal ontologies, including but not limited to http://www.omg.org/spec/DTV/1.0/ http://www.ihmc.us/users/phayes/docs/timeCatalog.pdf http://protege.cim3.net/cgi-bin/wiki.pl?SWRLTemporalOntology http://dl.acm.org/citation.cfm?id=1741341 http://stl.mie.utoronto.ca/publications/psl-chapter.pdf There is more detail in https://github.com/streamreasoning/RSP-QL/issues/25
An instant is an instantaneous point of time on underlying axis and is isomorphic to numbers. Henceforth a time interval is the interval between two instants. A semi-infinite time interval is an interval where one instant is set to infinite. Furthermore if both instants are set to infinite, we call the interval bi-infinite. As a consequence a temporal entity is an entity associated with an instant or a time interval[Dyreson94,Allen83].
Add a out of scope section
Add Stream Profiles and Hierarchies, e.g., combined_time, timepoints, periods, approximate_point, interval_meeting, owltime_instants and owltime_interval
An RSP timestamp predicate P
is an rdf:Property
with range rsp:TemporalEntity
that is
associated, through the rdf:Property
rsp:hasTimePlenum
, with a unique partially-ordered set (poset) of
temporal entities T_{P} ⊆ T^{all}
, called the time plenum of the predicate.
The time plenum is associated, through the rdf:Property
rsp:hasPartialOrder
,
with a unique partial order ≤_{P}
that is in the class
rsp:ChronologicalPartialOrder
.
The usual mathematical requirements of a partial order MUST be satisfied by an instance of
rsp:ChronologicalPartialOrder
. In particular, for all instances X
, Y
, Z
of the time plenum, the following properties hold:
X ≤_{P} X
X ≤_{P} Y
and Y ≤_{P} X
implies X =
Y
X ≤_{P} Y
and Y ≤_{P} Z
implies X ≤_{P}
Z
. Further, an instance of rsp:ChronologicalPartialOrder
MUST respect the natural order of time.
In
particular, if every time instant within the closure of temporal entity X
is earlier than every time instant within the closure of temporal entity Y
,
then X ≤_{P} Y
(where closure of a time instant t
is defined as
the degenerate interval [t, t]
, and closure of an interval is defined in
the usual way)
The above definition regarding natural order of time depends on the concept of time instant as the primitive of temporal entities that is universal amoung temporal ontologies and has a particular chronological order associated with them. However, not all temporal ontologies have time instants as their primitives, e.g., OMG. Further, branching temporal ontologies have a different set of time primitives than linear ones. So this needs to be fixed.
A timestamped graph is defined as an RDF Dataset under the RDF Dataset semantics that "each named graph defines its own context" (see ) and where a particular triple in the default graph has been designated as the timestamp triple, with the following constraints:
<n, G>
in the RDF Dataset (where
G
is an RDF graph, and n
is an IRI or blank node).<n, p, t>
, where n
is
defined in the previous item, and p
is a timestamp predicate that captures
the relationship between the temporal entity t
, called the timestamp, and
the graph G
.There may be multiple triples in the default graph, including multiple triples using timestamp predicates, but exactly one triple must be designated as the timestamp triple. The objects of any triples that use a timestamp predicate but are not the timestamp triple of the timestamped graph are not called timestamps.
Due to the assertion of the timestamp triple, the context referred to in "each named graph defines its own context" has a temporal aspect to it. Other aspects of this context may be asserted by additional triples in the default graph of the timestamped graph. Such triples are not required to have timestamp predicates, and thus may be about non-temporal aspects of the context, e.g., the authority or the sensor device. That is, we require the context to have a temporal aspect to it, but the context is not limited to temporal aspects. Thus, it would be misleading to call it a "temporal context".
The part of the definition above that points to RDF Dataset semantics really belongs in the semantics section, not in this definition, which should be purely syntax. We need additional informative text that gives the motivation for these definitions.
This definition does not permit the timestamp to be omitted, which is one of the data structures that is considered to be in-scope by the requirements document.
A sequence of RDF graphs (or named graphs, or RDF datasets) MAY be physically received by an RSP engine, which MAY then create an RDF stream from it by adding timestamps, e.g. indicating the time of arrival. The original sequence is not itself an RDF stream.
This definition allows the timestamp to be an IRI or blanknode. Additional information about the timestamp may be provided in the default graph (e.g., through properties in the OWL Time Ontology), but this is not required by the definition of timestamped graph.
A literal-timestamped graph is a timestamped graph whose timestamp
t
is an rdfs:Literal
.
Merge and union of RDF streams with non-literal-timestamped graphs may not be defined. See .
The timestamp predicate p
may be drawn from a community
agreed vocabulary (Issue
10). The timestamp predicate may also be user-defined.
The format in the examples of this document follows TriG, although does not imply any specific serialization or formatting; it simply
shows the data structured according to the RDF stream model. When the default graph of a
timestamped graph contains only one triple, this must be the timestamp triple, so there is
no need of an additional format to designate it. In examples of timestamped graphs having
more than one triple in the default graph, the first triple of the default graph to occur
in the serialization is assumed to be the timestamp triple. Prefixes (e.g.
prov:
, dul:
) are used for readability; their expansions are defined in
Table 1.
A non-normative subsection should be created to hold all the information about the formatting of examples, and there the expansion of all prefixes that are used in examples can be defined.
According to the semantics defined in , the
assertion of the named graph pair means that the graph denoted by :g
entails
the two triples :axel :isIn :RedRoom
and :darko :isIn :RedRoom
,
under whatever entailment regime is being considered. It does not assert those triples
directly, nor does it assert that these triples are actually in that graph. Further, it
does not rule out additional entailments of :g
. These details are best
explained in the semantics section itself, although it would probably be helpful to have
some informative explanation near the beginning to avoid confusion.
Given any two timestamped graphs, TSG1
and TSG2
, we say that
TSG2
covers
TSG1
(denoted TSG1 ≲ TSG2
) if and only if TSG1
and TSG2
have the same timestamp predicate P
and the timestamps,
t1
and t2
resp., satisfy t1 ≤_{P} t2
,
where ≤_{P}
is the temporal partial order associated with the time plenum
of the timestamp predicate P
.
The relation ≲
between timestamped graphs is a preorder (a
reflexive, transitive binary relation). It is not a partial order because it doesn't have
the antisymmetry property (a≲ b
and b≲ a
implies a
= b
.)
An RDF stream
S
consists of a sequence of timestamped graphs, called its elements, such that
elements sharing the same
timestamp predicate are ordered by the partial order associated with this timestamp predicate.
I.e., if a stream S
contains elements S(i)
and S(j)
with i < j
and S(i)
covers S(j)
, then the
timestamps of S(i)
and S(j)
are equal.
If a timestamp predicate occurs in the timestamp triple of an element of an RDF stream, we say it is a timestamp predicate of the stream.
Time-boundedness properties on RDF streams behave better if it is required that the set of temporal entities for each timestamp predicate is pairwise bounded. I.e., for each pair of temporal entities in the set, there is a temporal entity in the set that is an upper bound of both, as well as a temporal entity in the set that is a lower bound of both. This property is not satisfied by branching temporal structures, but could be a requirement of some profile.
The comparability between any pair of elements of an RDF stream must (or should?) be completely determined from the default graphs of those elements, or elements that precede at least one of them. Otherwise the ordering could be revealed by a subsequent element, inducing retroactively an ordering requirement on a previous pair of stream elements.
On the following we may refer to RDF stream simply as stream.
There can be multiple graphs with the same timestamp in the stream.
It has been pointed out that this statement might be problematic, as graphs could no longer be used for punctuation purposes. Comparatively, we have not found a constraint on this in similar models e.g., CQL: there could be zero, one, or multiple elements with the same timestamp in a stream.
Two RDF timestamped graphs TSG1 and TSG2 are isomorphic if and only if there is a bijection M between the nodes, triples, graphs and named graphs in TSG1 and those in TSG2 such that:
The definition of timestamped graphs allows blank nodes to be used as graph names, as well as within triples.
:_1 {...} {:_1, prov:generatedAtTime, t1}
:_2 {...} {:_2, prov:generatedAtTime, t1}
Two RDF streams are S-isomorphic if and only if they have the same set of elements.
:g1 {...} {:g1, prov:generatedAtTime, t1} :g2 {...} {:g2, prov:generatedAtTime, t2} :g3 {...} {:g3, prov:generatedAtTime, t2} :g4 {...} {:g4, prov:generatedAtTime, t3}
:g1 {...} {:g1, prov:generatedAtTime, t1} :g3 {...} {:g3, prov:generatedAtTime, t2} :g2 {...} {:g2, prov:generatedAtTime, t2} :g4 {...} {:g4, prov:generatedAtTime, t3}
Two RDF streams S1 and S2 are B-isomorphic if and only if there is a bijection M between the nodes, triples, graphs, named graphs, and timestamped graphs that occur in S1 and those that occur in S2 such that:
An RDF stream is viewed as being on a single "RDF surface"(see [[BLOGIC]]), so that blank nodes may be shared between any graphs in the stream. For this reason, B-isomorphism is defined in terms of a single mapping M for the entire RDF stream rather than, say, separate mappings for each timestamped graph.
Two RDF streams are isomorphic if there exists an RDF stream that is both B-isomorphic to one stream and S-isomorphic to the other stream.
RDF streams that are S-isomorphic are isomorphic.
RDF streams that are B-isomorphic are isomorphic.
Isomorphic RDF streams MUST have the same semantics. The semantics of RDF streams is affected by the result of applying window functions () as well as by entailment. Therefore, isomorphic RDF streams SHALL be indistinguishable, up to isomorphism, with respect to entailment (in any entailment regime), as well as with respect to the application of window functions.
A substream
S'
of an RDF stream S
is an RDF stream that is isomorphic to a
subsequence of S
.
There are several specializations of the substream relation which are useful in practice. These include the filter relation, where stream elements are selected based on satisfaction of some criterion, and the window relation, where a temporally contiguous portion of the stream is selected.
A window
S'
of an RDF stream S
is a substream of S
such that
if S'(i)
and S'(j)
are two elements of S'
and
S'(i) ≲ S'(j)
(i.e., S'(j)
covers S'(i)
), and
further if S'(i) ≲ S(k) ≲ S'(j)
for some element S(k)
of
S
, then S(k)
is an element of S'
.
Informally, a window is a temporally-contiguous selection from the original stream, without gaps.
In order to combine two RDF streams into one, without loss or gratuitous introduction of entailments, we define two relations, RDF stream merge and RDF stream union. These relations are inspired by the concepts of similar name for RDF graphs defined in [[RDF11-Concepts]].
RDF stream union is a trinary relation on RDF streams. Let S1
,
S2
, and S3
be RDF streams. We say S3
is an RDF
stream union of S1
and S2
if and only if the set of
elements of S3
is equal to the union of the sets of elements of S1
with S2
.
Union is the appropriate relation to consider when combining streams that are allowed to share blank nodes. Informally, they would be on the same RDF surface.
RDF stream merge is a trinary relation on RDF streams. Let S1
,
S2
, and S3
be RDF streams. We say S3
is an RDF
stream merge of S1
and S2
if and only if there exist
streams S4
, S5
, and S6
, with S4
isomorphic to S1
, S5
isomorphic to S2
,
S6
isomorphic to S3
, such that S4
and
S5
have no blank nodes in common, and S6
is the union of
S4
with S5
.
Merge is the appropriate relation to consider when joining streams that are considered to not share blank nodes. Informally, they would be on different RDF surfaces.
Given any three RDF streams, the determination of their satisfaction of the union or merge relation is a purely syntactic computation. That is, it is not necessary to consider entailments of the RDF streams or to be concerned with the values of literals.
The continuous operation of union or merge of two semi-infinite streams may not be computable, even when the operation is defined. This is because it may not be possible to know when an element of one stream might be received that needs to occur before the latest elements of the other stream. Computability of these continuous operations may be possible when additional information is available about the streams to be combined, e.g., the set of timestamp predicates used by each stream and the maximum latency of transmission. Within some RDF stream profiles, the union and merge operations may be continuously computable due to the profile constraints.
This section may be expanded with more subclasses if a need is identified.
A time-bounded-above RDF stream is an RDF Stream where for every timestamp predicate of the stream, there is a temporal entity in its range that bounds from above (is greater than or equal to) all timestamps of that predicate in the stream.
A time-bounded-below RDF stream is an RDF Stream where for every timestamp predicate of the stream, there is a temporal entity in its range that bounds from below (is less than or equal to) all timestamps of that predicate in the stream.
A time-bounded RDF stream is an RDF Stream that is time-bounded above and time-bounded below.
Consider the class of timestamp predicates whose associated poset of
temporal entities is pairwise bounded, i.e., has the property that every pair of
temporal entities is bounded, i.e., there exists a temporal entity in the poset that is
greater or equal to both of them and another temporal entity that is less or equal to both
of them. This is a fairly natural property to expect for partial orders on time plenums with "linear topology" (e.g.
[[OWL-TIME]]). However it does not hold in the case of branching time plenums (see http://www.ihmc.us/users/phayes/docs/timeCatalog.pdf), because there are incomparable
branches (called paths), which may be considered, e.g., as different worlds (in the sense of
Kripke frames), scenarios (planning usecase), or concurrent processes (process modelling).
It would be a useful property if the class of time-bounded RDF streams were closed under
stream merger and union, and this property would hold, under the current definition, in the
case of pairwise-bounded time plenums, because we could find bounds for the temporal entities
of each timestamp predicate by taking an upper bound of the pair of upper bounds from the
two streams, and similarly for the lower bounds. However, the closure property does not hold
in the case of a branching temporal topology under the above definition of time-bounded. To
achieve the closure property, we would have to allow a number of "upper bounds" for each
timestamp predicate, rather than requiring just one. For example,
Given two
sets, A and B, of entities in a partial order, we say that A bounds B from above iff there
is no element of B that is greater than any element of A, and for every element of B there
is some element of A that is greater than or equal to it.
A
time-bounded-above RDF stream is an RDF Stream where for every timestamp
predicate of the stream, there is a set of temporal entities in its range that
bounds from above all timestamps of that predicate in the stream.
The class of time-bounded RDF streams using timestamp predicates with pairwise-bounded time plenums is closed under stream merger and union.
A temporal-count-bounded RDF stream is an RDF stream where for each timestamp predicate of the stream, there are a finite number of temporal entities in the stream that are timestamps for that predicate.
The qualifier "temporal" has been added to this term to emphasize that it is temporal entities that are being counted, not timestamped graphs.
Given an RDF stream whose set of timestamp predicates with pairwise-bounded temporal entities, then if the stream is temporal-count-bounded it is also time-bounded, since the upper bound for each timestamp predicate can be taken as an upper bound of the (finite) set of its timestamps, and similarly for the lower bound. If the definition of "time-bounded" was changed as in the earlier Editor's Note, then every temporal-count-bounded RDF stream would be time-bounded.
Every temporal-count-bounded RDF stream using only timestamp predicates with pairwise-bounded time plenums is time-bounded.
The class of temporal-count-bounded RDF streams is closed under stream merger and union, since the sets of temporal entities for each timestamp predicate in the resulting stream are the union of the corresponding (finite) sets in the original streams, and so are finite.
A finite RDF stream is an RDF stream of finite length, i.e., with a finite number of elements in it.
Clearly, every finite RDF stream is temporal-count-bounded.
Every finite RDF stream using only pairwise-bounded time plenums is time-bounded. However, this is not the case for branching time plenums.
Clearly, the class of finite RDF streams is closed under stream merger and union.
A window function is a partial function from RDF streams to their windows that
preserves isomorphism. That is, if w
is a window function, with isomorphic
streams S1
and S2
in its domain, then w(S1)
is
isomorphic to w(S2)
.
A general window function is a window function that is a total function on RDF streams.
The term "window operator" is reserved for later use to return a sequence of windows.
The most common types of window functions in practice are time-based and count-based.
Because the time plenum for each timestamp predicate is partially ordered, we may define a temporal interval of a timestamp predicate to be an interval, in the usual sense for partial orders, within its time plenum.
Recall that intervals need not be bounded and need not be closed, and are specified in terms of two, one or zero inequality conditions based on the partial order or its induced strict order.
A general time-based window function
w
is a general window function specified by a finite set
w_{P}
of timestamp predicates together with temporal intervals
w_{J}(P)
of each timestamp predicate P
in
w_{P}
, such that for every stream S
, an element
S(i)
of S
is an element of w(S)
if and only if the
timestamp predicate P
of S(i)
is in P
and the
timestamp t
of S(i)
is contained in
w_{J}(P)
.
A time-based window function is the restriction of a general time-based window function to a subset of RDF streams.
The substream resulting from the application of a time-based window function is time-bounded.
A general temporal-count-based window function
w
is a general window function specified by a finite set
w_{P}
of timestamp predicates together with semi-infinite temporal
intervals w_{J}(P)
of each timestamp predicate P
in
w_{P}
with endpoints w_{T}(P)
, and positive
integers w_{N}(P)
, for each timestamp predicate P
in
w_{P}
such that for every stream S
, an element
S(i)
of S
is an element of w(S)
if and only if the
timestamp predicate P
of S(i)
is in P
and
t
is the timestamp of S(i)
,T(S, P, w)
is the set of temporal entities that are timestamps for
elements of S
with timestamp predicate P
, belong to
w_{J}(P)
, and belong to the closed interval between
t
and w_{T}(P)
, andT(S, P, w)
is less than or equal to
w_{N}(P)
A temporal-count-based window function is the restriction of a general temporal-count-based window function to a subset of RDF streams.
The substream resulting from the application of a temporal-count-based window function is temporal-count-bounded.
Due to the potential for stream elements with incomparable or duplicate timestamps, the number of elements in the substream having a particular timestamp predicate is not guaranteed to be equal to the depth specified for the predicate by the temporal-count-based window.
Temporal-count-based window functions with future-facing orientation on timestamp predicates whose time plenum is not totally-ordered are not computable, because in general it is not possible to know, at any point in the reception of the stream, whether there are further elements of the stream that would be selected by the temporal-count-based window function.
Applications that require a substream with an exact number N
of elements for
a specified timestamp predicate might apply a temporal-count-based window function with
N
for the count of temporal entities, and then randomly discard extra
elements, according to some criterion, e.g., extreme elements (maximal or minimal,
depending on the orientation of the counting). However, this extra step causes the process
to be nondeterministic, and hence does not correspond to a function. If elements are
discarded in a nonrandom fashion, e.g., based on their order in the stream sequence, then
this would be a function, but would not preserve isomorphism, and so would not be a window
function. The issue of obtaining a window function that returns an exact number of
elements (for a particular predicate) is handled in the next section, where we define the
concept of window relation, and use it to define an element-count-based window relation.
When restricted to certain kinds of RDF streams, element-count-based window relations are
functional so element-count-based window functions can be defined on such subsets. This
and similar considerations, of importance to implementations, motivate the RDF Stream profiles as subsets of RDF streams.
A window relation is a binary relation on RDF streams (a relation having an
extension which is a set of pairs of RDF streams) such that the second element in the pair
is a window of the first element and preserves isomorphism. That is, if <S1,
S2>
is a member of the extension of a window relation W
, and
S3
and S4
are isomorphic to S1
and
S2
, resp., then <S3, S4>
is also a member of the
extension of W
.
A length-based window relation
W
is a window relation specified by a set of streams
W_{S}
, a finite set W_{P}
of timestamp
predicates together with the following parameters for each timestamp predicate
p
in W_{P}
:
W_{J}(P)
of timestamp
predicate P
(the interval of P
);W_{T}(P)
which is the finite endpoint of
W_{J}(P)
(the temporal reference of P
);W_{N}(P)
(the length for
P
)S
be an RDF stream in W_{S}
with window
S'
. Define T(S, P, W)
to be the set of temporal entities that
are timestamps for elements of S
with timestamp predicate P
and
belong to W_{J}(P)
, and T(S', P)
to be the set of
temporal entities that are timestamps for elements of S'
with timestamp
predicate P
. The pair <S, S'>
is a member of the extension
of W
if and only if the cardinality of T(S', P)
is equal to the
minimum of W_{N}(P)
and the cardinality of the set T(S, P,
W)
and for each element S'(i)
of S'
P
of S'(i)
is in
W_{P}
, andt
of S'(i)
is in
W_{J}(P)
.A length-based window function
W
on domain W_{S}
is a length-based window relation
where any member <S, S'>
of its extension are such that S
is in W_{S}
and W
defines a total function on
W_{S}
.
The names "element-count-bounded window relation" and "element-count-bounded window function" have been proposed, as a replacement of "length-bounded ...". This nomenclature depends on the adoption of "element" as the term for the individual timestamped graphs in a stream.
this example could be integrated to the main text body
Beyond time instants: intervals & more
Usign the previously described model, intervals can be specified for a graph in the following way: Given p1 and p2 representing start and end time predicates, then
(g,p1,t1)
and(g,p2,t2)
denote that g is defined in an interval [t1,t2]. As an example::g_1, :startsAt, "2015-06-18T12:00:00"^^xsd:dateTime :g_1, :endsAt, "2015-06-18T13:00:00"^^xsd:dateTime
Or even:
:g_2 :validBetween [:startsAt "2015-06-18T12:00:00"^^xsd:dateTime; :endsAt "2015-06-18T13:00:00"^^xsd:dateTime]
The semantics of a timestamped graph is defined in terms of its semantics as an RDF Dataset. In particular, the designation of a particular triple in the default graph as the timestamp triple has no effect on its semantics.
The semantics of timestamped graphs, and consequently of RDF streams, is based on the semantics formalized in [[RDF11-Datasets]] in the case that each named graph defines its own context.
The following terms are used in the sense of [[RDF11-MT]]: entailment regime, E-interpretation, blank node, universe, named graph pair, RDF graph, E-entails, default graph.
An RSP interpretation
I
with respect to an entailment regime E
is an E-interpretation
extended to named graphs, timestamped graphs, RDF datasets, and RDF streams as follows:
A
from blank nodes to the universe IR
and a
named graph pair ng
= <n, G>
, [I+A](ng)
is
true if and only if [I+A](n)
is an RDF graph that E-entails
G
;TSG
= < ng, DG>
, where
DG
is the default graph and ng
is the named graph pair of
TSG
,and a mapping A
from blank nodes to the universe
IR
, [I+A](TSG)
is true if and only if [I+A](DG)
is true and [I+A](ng)
is true;I(TSG)
true if and only if there exisits a mapping A
from
blank nodes to the universe IR
such that [I+A](TSG) is true.D
= < DG, NG>
, where DG
is the default graph and NG
is the set of named graph pairs of
D
, I(D)
is true if there exists a mapping A
from blank nodes to the universe IR
such that [I+A](DG)
is
true and [I+A](ng)
is true for every ng
in
NG
;I(D)
is false otherwise.S
, I(S) is true if and only if if there exists a
mapping A
from blank nodes to the universe IR
such that for
every element S(i)
of S
, i ≥ 1
,
[I+A](S(i))
is true.I
E-satisfies a graph, named graph
pair, timestamped graph, dataset, or stream X
(or satisfies A
w.r.t the E-entailment regime) when I(X)
is true.
Following standard terminology, we say that a graph, named graph, timestamped graph,
dataset, or stream X
RSP-E-entails a graph, named graph, timestamped graph,
dataset, or stream Y
if and only if for every RSP interpretation I
with respect to E-entailment, I
E-satisfies Y
whenever
I
E-satisfies X
.
The "RSP" prefix in "RSP-E-entails" may be dropped when neither antecedent nor consequent is an RDF dataset or named graph, as in that case there is no possibility of alternate dataset semantics.
It should be straightforward to prove that isomorphic RDF streams simply entail each other, and hence are logically equivalent under simple entailment. This should also hold for other standard entailment regimes, and perhaps should be required for all RSP entailment regimes.
The notion of RSP-E-entailment is defined here so that it may be used to
support the definition of query semantics under various entailment regimes in the [[RSP-QL
Queries]] document, including simple entailment. A choice was made regarding the semantics
of named graph pairs (i.e., [I+A](ng)
is true if and only if
[I+A](n)
is an RDF graph that E-entails G
) that affects the
semantics of timestamped graphs, datasets, and streams. It remains to be seen if this choice
supports query semantics in a manner that meets the needs of the RSP-QL community. In
particular, it imposes constraints on the denotation of the name of a named graph pair which
may be inconsistent with the domain of a predicate that might be considered for use as a
timestamp predicate (e.g., SSN predicate ssn:observationSamplingTime
).
It is possible to restrict the abstract syntax for a class of RDF streams. This is often done in order to facilitate the efficient implementation of certain operations and queries or to promote more efficient representation. Each such restriction constitutes an RDF stream profile.
The profiles defined in this document fall into two categories: time-series profiles and linked-list profiles. Within each category there is a least-restrictive profile and a number of subprofiles which apply additional restrictions to it.
This section describes four profiles which are defined through restrictions on the default graphs of the stream's timestamped graph elements or on the relation between timestamp values:
The RDF Stream Time-Series profile is designed to support high-volume, low-latency window operations and queries that depend on full knowledge of timestamps.
This motivational paragraph needs some work - the above is a place holder.
An RDF time series is an RDF stream that satisfies the following properties:
If some property (e.g., in the RSP-QL namespace) is defined as a
subproperty of prov:generatedAtTime
, then this can be substituted in Example 4, which will then serve as an example of a
time series. Alternatively, the definition could be changed so that the only requirement
is that the timestamps that occur in the stream belong to xsd:dateTime.
The set of RDF time series is closed under RDF stream merge and union on time series that use the same timestamp predicate.
An "RDF distinct time series" is an RDF time series such that no two elements in the time series have the same timestamp.
An RDF distinct time series has a unique sequential order - it is S-isomorphic only to itself.
The merge or union of RDF distinct time series is always defined but is not necessarily an RDF distinct time series, even if the operation is restricted to time series that use the same timestamp predicate, due to the possibility of duplication of timestamp. However, the merge or union of RDF distinct time series that use the same timestamp predicate is always an RDF time series.
The restriction of element-count-based window relations to distinct time series results in functional relations. This enables the definition of element-count-based window functions for distinct time series. Further, the element-count-based window functions are always computable. However, the time-based window functions are not necessarily computable unless there is a finite precision to the timestamps, as can be seen from the Zeno's paradox example @@@(this example hasn't been written up yet, but is mentioned elsewhere).
There is going to be a lot more to talk about for RDF time series relative to queries, but in this document we don't have those definitions.
The restriction of timestamps in the regular time series profile allows the timestamp to be represented concisely as an integer, provided the spacing is provided elsewhere, e.g., in metadata.
@@@motivate this profile