annotate patchablegraph.py @ 27:2f74ed860ea2

release 1.1.0
author drewp@bigasterisk.com
date Sat, 23 Apr 2022 23:58:50 -0700
parents e11d407c46f8
children
0
c3f0a692c4cb move repo from homeauto/lib/
drewp@bigasterisk.com
parents:
diff changeset
"""
Design:

1. Services each have (named) graphs, which they patch as things
   change. PatchableGraph is an object for holding this graph.
2. You can http GET that graph, or ...
3. You can http GET/SSE that graph and hear about modifications to it
4. The client that got the graph holds and maintains a copy. The
   client may merge together multiple graphs.
5. Client queries its graph with low-level APIs or client-side sparql.
6. When the graph changes, the client knows and can update itself at
   low or high granularity.


See also:
* http://iswc2007.semanticweb.org/papers/533.pdf RDFSync: efficient remote synchronization of RDF
  models
* https://www.w3.org/2009/12/rdf-ws/papers/ws07 Supporting Change Propagation in RDF
* https://www.w3.org/DesignIssues/lncs04/Diff.pdf Delta: an ontology for the distribution of
  differences between RDF graphs

"""
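Steps 4 to 6 of the design can be pictured from the client's side. The following is only a rough sketch under simplifying assumptions: quads are plain string tuples rather than rdflib terms, and `ClientGraphCopy`, `on_graph`, and `apply_patch` are hypothetical names that do not exist in this module.

```python
from typing import Callable, List, Set, Tuple

# Simplified quad: (subject, predicate, object, context), all plain strings.
Quad = Tuple[str, str, str, str]


class ClientGraphCopy:
    """Hypothetical client-side copy of one or more remote graphs."""

    def __init__(self) -> None:
        self.quads: Set[Quad] = set()
        self.listeners: List[Callable[[Set[Quad], Set[Quad]], None]] = []

    def on_graph(self, quads: Set[Quad]) -> None:
        # A full-graph event is merged into whatever the client already holds.
        self.apply_patch(adds=quads - self.quads, deletes=set())

    def apply_patch(self, adds: Set[Quad], deletes: Set[Quad]) -> None:
        self.quads -= deletes
        self.quads |= adds
        # Listeners get the fine-grained change and may react at any granularity.
        for cb in self.listeners:
            cb(adds, deletes)


copy = ClientGraphCopy()
seen = []
copy.listeners.append(lambda adds, dels: seen.append((len(adds), len(dels))))
copy.on_graph({('ex:lamp', 'ex:state', 'on', 'ex:g1')})
copy.apply_patch(adds={('ex:lamp', 'ex:state', 'off', 'ex:g1')},
                 deletes={('ex:lamp', 'ex:state', 'on', 'ex:g1')})
```

A listener that only cares whether anything changed can ignore its arguments; one that needs low granularity can inspect the added and deleted quads.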
25
e11d407c46f8 rewrite for asyncio and starlette
drewp@bigasterisk.com
parents: 23
diff changeset
import asyncio
4
dc4f852d0d70 reformat and add some types
drewp@bigasterisk.com
parents: 3
diff changeset
import itertools
import json
import logging
import weakref
from typing import Callable, List, NewType, Optional, cast, Set

3
703adc4f78b1 scales -> promethewus
drewp@bigasterisk.com
parents: 0
diff changeset
from prometheus_client import Counter, Gauge, Summary
from rdfdb.grapheditapi import GraphEditApi
from rdfdb.patch import Patch
from rdfdb.rdflibpatch import inGraph, patchQuads
from rdflib import ConjunctiveGraph
from rdflib.parser import StringInputSource
from rdflib.plugins.serializers.jsonld import from_rdf
17
388a5e15d249 fix local import
drewp@bigasterisk.com
parents: 9
diff changeset

JsonSerializedPatch = NewType('JsonSerializedPatch', str)
JsonLdSerializedGraph = NewType('JsonLdSerializedGraph', str)

log = logging.getLogger('patchablegraph')

20
8ec07d997cd5 declare labelnames on metrics
drewp@bigasterisk.com
parents: 17
diff changeset
SERIALIZE_CALLS = Summary('serialize_calls', 'PatchableGraph.serialize calls', labelnames=['graph'])
PATCH_CALLS = Summary('patch_calls', 'PatchableGraph.patch calls', labelnames=['graph'])
STATEMENT_COUNT = Gauge('statement_count', 'current PatchableGraph graph size', labelnames=['graph'])
OBSERVERS_CURRENT = Gauge('observers_current', 'current observer count', labelnames=['graph'])
OBSERVERS_ADDED = Counter('observers_added', 'observers added', labelnames=['graph'])


# forked from /my/proj/light9/light9/rdfdb/rdflibpatch.py
def _graphFromQuads2(q):
    g = ConjunctiveGraph()
    # g.addN(q) # no effect on nquad output
    for s, p, o, c in q:
        g.get_context(c).add((s, p, o))  # kind of works with broken rdflib nquad serializer code
        # g.store.add((s,p,o), c) # no effect on nquad output
    return g


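`_graphFromQuads2` adds each triple to the context named by the quad's fourth element. The grouping idea on its own, without rdflib, might look like the sketch below. `group_by_context` is a hypothetical helper and the terms are plain strings, so this is an illustration rather than the module's behavior.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]
Quad = Tuple[str, str, str, str]


def group_by_context(quads: Iterable[Quad]) -> Dict[str, Set[Triple]]:
    """Bucket (s, p, o, c) quads into one set of triples per context c."""
    contexts: Dict[str, Set[Triple]] = defaultdict(set)
    for s, p, o, c in quads:
        contexts[c].add((s, p, o))
    return dict(contexts)


grouped = group_by_context([
    ('a', 'b', 'c', 'g1'),
    ('a', 'b', 'd', 'g1'),
    ('x', 'y', 'z', 'g2'),
])
```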
def jsonFromPatch(p: Patch) -> JsonSerializedPatch:
    return cast(JsonSerializedPatch, json.dumps({'patch': {
        'adds': from_rdf(_graphFromQuads2(p.addQuads)),
        'deletes': from_rdf(_graphFromQuads2(p.delQuads)),
    }}))


patchAsJson = jsonFromPatch  # deprecated name


def patchFromJson(j: JsonSerializedPatch) -> Patch:
    body = json.loads(j)['patch']
    a = ConjunctiveGraph()
    a.parse(StringInputSource(json.dumps(body['adds']).encode('utf8')), format='json-ld')
    d = ConjunctiveGraph()
    d.parse(StringInputSource(json.dumps(body['deletes']).encode('utf8')), format='json-ld')
    return Patch(addGraph=a, delGraph=d)


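`jsonFromPatch` and `patchFromJson` agree on a JSON envelope of the form `{'patch': {'adds': ..., 'deletes': ...}}`, where each side is a JSON-LD document. A sketch of reading that envelope with only the `json` module follows. The JSON-LD payload is a hand-written placeholder (an assumption about what `from_rdf` might produce, not captured output), and `split_patch_envelope` is a hypothetical helper.

```python
import json
from typing import Any, List, Tuple


def split_patch_envelope(j: str) -> Tuple[List[Any], List[Any]]:
    """Return the raw JSON-LD 'adds' and 'deletes' parts of a serialized patch."""
    body = json.loads(j)['patch']
    return body['adds'], body['deletes']


# Placeholder payload: one named graph holding one statement.
example = json.dumps({'patch': {
    'adds': [{
        '@id': 'http://example.com/g1',
        '@graph': [{
            '@id': 'http://example.com/lamp',
            'http://example.com/state': [{'@value': 'on'}],
        }],
    }],
    'deletes': [],
}})
adds, deletes = split_patch_envelope(example)
```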
def graphAsJson(g: ConjunctiveGraph) -> JsonLdSerializedGraph:
    # This is not the same as g.serialize(format='json-ld')! That
    # version omits literal datatypes.
    return cast(JsonLdSerializedGraph, json.dumps(from_rdf(g)))


_graphsInProcess = itertools.count()


class PatchableGraph(GraphEditApi):
    """
    Master graph that you modify with self.patch, and we send the
    updates to all current listeners.
    """

    def __init__(self, label: Optional[str] = None):
        self._graph = ConjunctiveGraph()
        self._subscriptions: weakref.WeakSet[asyncio.Queue] = weakref.WeakSet()

        if label is None:
            label = f'patchableGraph{next(_graphsInProcess)}'
        self.label = label
        log.info('making %r', label)

    def serialize(self, *arg, **kw) -> bytes:
        with SERIALIZE_CALLS.labels(graph=self.label).time():
            return cast(bytes, self._graph.serialize(*arg, **kw))

    def patch(self, p: Patch):
        with PATCH_CALLS.labels(graph=self.label).time():
            # assuming no stmt is both in p.addQuads and p.delQuads.
            dels = {q for q in p.delQuads if inGraph(q, self._graph)}
            adds = {q for q in p.addQuads if not inGraph(q, self._graph)}
            minimizedP = Patch(addQuads=adds, delQuads=dels)
            if minimizedP.isNoop():
                return
            patchQuads(self._graph, deleteQuads=dels, addQuads=adds, perfect=False)  # true?
            if self._subscriptions:
                log.info('PatchableGraph: patched; telling %s observers', len(self._subscriptions))
                j = jsonFromPatch(p)
                for q in self._subscriptions:
                    q.put_nowait(('patch', j))
            STATEMENT_COUNT.labels(graph=self.label).set(len(self._graph))

    def asJsonLd(self) -> JsonLdSerializedGraph:
        return graphAsJson(self._graph)

    def subscribeToPatches(self) -> asyncio.Queue:
        q = asyncio.Queue()
        qref = weakref.ref(q)
        # weakref.finalize keeps itself alive until q is collected; the callback
        # of a bare weakref.ref would be lost once the local qref goes away.
        weakref.finalize(q, self._onUnsubscribe, qref)
        self._initialSubscribeEvents(qref)
        return q

    def _initialSubscribeEvents(self, qref):
        q = qref()
        log.info('new sub queue %s', q)
        self._subscriptions.add(q)  # when caller forgets about queue, we will too
        OBSERVERS_CURRENT.labels(graph=self.label).set(len(self._subscriptions))
        OBSERVERS_ADDED.labels(graph=self.label).inc()
        q.put_nowait(('graph', self.asJsonLd()))  # this should be chunked, or just done as reset + patches

    def _onUnsubscribe(self, qref):
        OBSERVERS_CURRENT.labels(graph=self.label).set(len(self._subscriptions))  # minus one?

    def setToGraph(self, newGraph: ConjunctiveGraph):
        self.patch(Patch.fromDiff(self._graph, newGraph))
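
The patch-and-notify flow of the class can be tried without rdflib or rdfdb. The sketch below is a simplified stand-in, not the class itself: `TinyPatchableSet` and its methods are hypothetical, quads are string tuples, and events carry frozensets instead of JSON. Unlike `PatchableGraph.patch`, which serializes the caller's patch `p`, this sketch broadcasts the minimized adds and deletes.

```python
import asyncio
import weakref
from typing import Set, Tuple

Quad = Tuple[str, str, str, str]


class TinyPatchableSet:
    """Hypothetical stand-in for PatchableGraph that stores quads in a set."""

    def __init__(self) -> None:
        self.quads: Set[Quad] = set()
        # Queues are held weakly: dropping the queue ends the subscription.
        self.subscriptions: 'weakref.WeakSet[asyncio.Queue]' = weakref.WeakSet()

    def subscribe(self) -> asyncio.Queue:
        q: asyncio.Queue = asyncio.Queue()
        self.subscriptions.add(q)
        q.put_nowait(('graph', frozenset(self.quads)))  # initial full state
        return q

    def patch(self, adds: Set[Quad], deletes: Set[Quad]) -> None:
        # Keep only the deletes that exist and the adds that are new.
        dels = {q for q in deletes if q in self.quads}
        new = {q for q in adds if q not in self.quads}
        if not dels and not new:
            return  # no-op patches are not broadcast
        self.quads = (self.quads - dels) | new
        for sub in self.subscriptions:
            sub.put_nowait(('patch', (frozenset(new), frozenset(dels))))


async def demo() -> list:
    g = TinyPatchableSet()
    g.patch({('s', 'p', 'o1', 'c')}, set())
    q = g.subscribe()
    g.patch({('s', 'p', 'o2', 'c')}, {('s', 'p', 'o1', 'c')})
    g.patch({('s', 'p', 'o2', 'c')}, set())  # already present, so no event
    return [await q.get() for _ in range(q.qsize())]


events = asyncio.run(demo())
```

The subscriber first receives the full state and then one patch event; the second, redundant patch produces nothing.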