annotate service/mqtt_to_rdf/inference/stmt_chunk.py @ 1727:23e6154e6c11

file moves
author drewp@bigasterisk.com
date Tue, 20 Jun 2023 23:26:24 -0700
parents service/mqtt_to_rdf/stmt_chunk.py@88f6e9bf69d1
children
rev   line source
1651
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
1 import itertools
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
2 import logging
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
3 from dataclasses import dataclass
1673
80f4e741ca4f redo RHS bnode processing
drewp@bigasterisk.com
parents: 1669
diff changeset
4 from typing import Iterable, Iterator, List, Optional, Set, Tuple, Type, Union, cast
1651
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
5
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
6 from rdflib.graph import Graph
1727
23e6154e6c11 file moves
drewp@bigasterisk.com
parents: 1697
diff changeset
7 from rdflib import RDF
1673
80f4e741ca4f redo RHS bnode processing
drewp@bigasterisk.com
parents: 1669
diff changeset
8 from rdflib.term import Literal, Node, URIRef, Variable
1651
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
9
1727
23e6154e6c11 file moves
drewp@bigasterisk.com
parents: 1697
diff changeset
10 from inference.candidate_binding import CandidateBinding
23e6154e6c11 file moves
drewp@bigasterisk.com
parents: 1697
diff changeset
11 from inference.inference_types import Inconsistent, RuleUnboundBnode, WorkingSetBnode
1651
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
12
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
13 log = logging.getLogger('infer')
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
14
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
15 INDENT = ' '
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
16
1660
31f7dab6a60b function evaluation uses Chunk lists now and runs fast. Only a few edge cases still broken
drewp@bigasterisk.com
parents: 1659
diff changeset
17 ChunkPrimaryTriple = Tuple[Optional[Node], Node, Optional[Node]]
31f7dab6a60b function evaluation uses Chunk lists now and runs fast. Only a few edge cases still broken
drewp@bigasterisk.com
parents: 1659
diff changeset
18
1651
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
19
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
20 @dataclass
1664
1a7c1261302c logic fix- some bindings were being returned 2+; some 0 times
drewp@bigasterisk.com
parents: 1661
diff changeset
21 class AlignedRuleChunk:
1668
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
22 """a possible association between a rule chunk and a workingSet chunk. You can test
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
23 whether the association would still be possible under various additional bindings."""
1664
1a7c1261302c logic fix- some bindings were being returned 2+; some 0 times
drewp@bigasterisk.com
parents: 1661
diff changeset
24 ruleChunk: 'Chunk'
1a7c1261302c logic fix- some bindings were being returned 2+; some 0 times
drewp@bigasterisk.com
parents: 1661
diff changeset
25 workingSetChunk: 'Chunk'
1a7c1261302c logic fix- some bindings were being returned 2+; some 0 times
drewp@bigasterisk.com
parents: 1661
diff changeset
26
1668
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
27 def __post_init__(self):
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
28 if not self.matches():
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
29 raise Inconsistent()
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
30
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
31 def newBindingIfMatched(self, prevBindings: CandidateBinding) -> CandidateBinding:
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
32 """supposing this rule did match the statement, what new bindings would
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
33 that produce?
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
34
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
35 raises Inconsistent if the existing bindings mean that our aligned
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
36 chunks can no longer match.
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
37 """
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
38 outBinding = CandidateBinding({})
1664
1a7c1261302c logic fix- some bindings were being returned 2+; some 0 times
drewp@bigasterisk.com
parents: 1661
diff changeset
39 for rt, ct in zip(self.ruleChunk._allTerms(), self.workingSetChunk._allTerms()):
1673
80f4e741ca4f redo RHS bnode processing
drewp@bigasterisk.com
parents: 1669
diff changeset
40 if isinstance(rt, (Variable, RuleUnboundBnode)):
1668
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
41 if prevBindings.contains(rt) and prevBindings.applyTerm(rt) != ct:
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
42 msg = f'{rt=} {ct=} {prevBindings=}' if log.isEnabledFor(logging.DEBUG) else ''
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
43 raise Inconsistent(msg)
1664
1a7c1261302c logic fix- some bindings were being returned 2+; some 0 times
drewp@bigasterisk.com
parents: 1661
diff changeset
44 if outBinding.contains(rt) and outBinding.applyTerm(rt) != ct:
1668
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
45 # maybe this can happen, for stmts like ?x :a ?x .
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
46 raise Inconsistent("outBinding inconsistent with itself")
1664
1a7c1261302c logic fix- some bindings were being returned 2+; some 0 times
drewp@bigasterisk.com
parents: 1661
diff changeset
47 outBinding.addNewBindings(CandidateBinding({rt: ct}))
1668
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
48 else:
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
49 if rt != ct:
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
50 # getting here means prevBindings was set to something our
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
51 # rule statement disagrees with.
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
52 raise Inconsistent(f'{rt=} != {ct=}')
1664
1a7c1261302c logic fix- some bindings were being returned 2+; some 0 times
drewp@bigasterisk.com
parents: 1661
diff changeset
53 return outBinding
1a7c1261302c logic fix- some bindings were being returned 2+; some 0 times
drewp@bigasterisk.com
parents: 1661
diff changeset
54
1a7c1261302c logic fix- some bindings were being returned 2+; some 0 times
drewp@bigasterisk.com
parents: 1661
diff changeset
55 def matches(self) -> bool:
1a7c1261302c logic fix- some bindings were being returned 2+; some 0 times
drewp@bigasterisk.com
parents: 1661
diff changeset
56 """could this rule, with its BindableTerm wildcards, match workingSetChunk?"""
1a7c1261302c logic fix- some bindings were being returned 2+; some 0 times
drewp@bigasterisk.com
parents: 1661
diff changeset
57 for selfTerm, otherTerm in zip(self.ruleChunk._allTerms(), self.workingSetChunk._allTerms()):
1673
80f4e741ca4f redo RHS bnode processing
drewp@bigasterisk.com
parents: 1669
diff changeset
58 if not isinstance(selfTerm, (Variable, RuleUnboundBnode)) and selfTerm != otherTerm:
1664
1a7c1261302c logic fix- some bindings were being returned 2+; some 0 times
drewp@bigasterisk.com
parents: 1661
diff changeset
59 return False
1694
73abfd4cf5d0 new html log and other refactoring as i work on the advanceTheStack problems
drewp@bigasterisk.com
parents: 1677
diff changeset
60
1664
1a7c1261302c logic fix- some bindings were being returned 2+; some 0 times
drewp@bigasterisk.com
parents: 1661
diff changeset
61 return True
1a7c1261302c logic fix- some bindings were being returned 2+; some 0 times
drewp@bigasterisk.com
parents: 1661
diff changeset
62
1a7c1261302c logic fix- some bindings were being returned 2+; some 0 times
drewp@bigasterisk.com
parents: 1661
diff changeset
63
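Note: the binding check in newBindingIfMatched above can be tried in isolation. The following is a minimal sketch, not the module's own code. It assumes plain strings stand in for rdflib terms, that a leading '?' marks a bindable term, and that a plain dict stands in for CandidateBinding. The names `newBindings` and `isBindable` are hypothetical.

```python
from typing import Dict, Sequence


class Inconsistent(ValueError):
    """Stand-in for inference.inference_types.Inconsistent (assumed shape)."""


def isBindable(term: str) -> bool:
    # Assumption for this sketch only: a leading '?' marks a variable.
    return term.startswith('?')


def newBindings(ruleTerms: Sequence[str], wsTerms: Sequence[str], prev: Dict[str, str]) -> Dict[str, str]:
    """Bindings produced by aligning ruleTerms with wsTerms, or raise Inconsistent."""
    out: Dict[str, str] = {}
    for rt, ct in zip(ruleTerms, wsTerms):
        if isBindable(rt):
            # An earlier binding for rt must agree with this working-set term.
            if rt in prev and prev[rt] != ct:
                raise Inconsistent(f'{rt} is already bound to {prev[rt]}, not {ct}')
            # A variable repeated inside one statement must bind consistently.
            if rt in out and out[rt] != ct:
                raise Inconsistent('new bindings disagree with each other')
            out[rt] = ct
        elif rt != ct:
            # Static terms have to be equal.
            raise Inconsistent(f'{rt} != {ct}')
    return out


# Terms are listed predicate first, mirroring the order _allTerms uses.
print(newBindings([':sum', '?x', '?z'], [':sum', ':a', ':b'], {}))
```

Aligning `?x :p ?x` with `:a :p :b` raises Inconsistent in this sketch, which is the case the "outBinding inconsistent with itself" branch guards against.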
1a7c1261302c logic fix- some bindings were being returned 2+; some 0 times
drewp@bigasterisk.com
parents: 1661
diff changeset
64 @dataclass
1651
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
65 class Chunk: # rename this
1661
00a5624d1d14 cleanups and optimizations
drewp@bigasterisk.com
parents: 1660
diff changeset
66 """A statement, maybe with variables in it, except *the subject or object
00a5624d1d14 cleanups and optimizations
drewp@bigasterisk.com
parents: 1660
diff changeset
67 can be rdf lists*. This is done to optimize list comparisons (a lot) at the
00a5624d1d14 cleanups and optimizations
drewp@bigasterisk.com
parents: 1660
diff changeset
68 very minor expense of not handling certain exotic cases, such as a branching
00a5624d1d14 cleanups and optimizations
drewp@bigasterisk.com
parents: 1660
diff changeset
69 list.
1651
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
70
1661
00a5624d1d14 cleanups and optimizations
drewp@bigasterisk.com
parents: 1660
diff changeset
71 Example: (?x ?y) math:sum ?z . <-- this becomes one Chunk.
1651
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
72
1661
00a5624d1d14 cleanups and optimizations
drewp@bigasterisk.com
parents: 1660
diff changeset
73 A function call in a rule is always contained in exactly one chunk.
00a5624d1d14 cleanups and optimizations
drewp@bigasterisk.com
parents: 1660
diff changeset
74
00a5624d1d14 cleanups and optimizations
drewp@bigasterisk.com
parents: 1660
diff changeset
75 https://www.w3.org/TeamSubmission/n3/#:~:text=Implementations%20may%20treat%20list%20as%20a%20data%20type
1651
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
76 """
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
77 # all immutable
1660
31f7dab6a60b function evaluation uses Chunk lists now and runs fast. Only a few edge cases still broken
drewp@bigasterisk.com
parents: 1659
diff changeset
78 primary: ChunkPrimaryTriple
1653
e7d594c065d4 minor refactoring
drewp@bigasterisk.com
parents: 1652
diff changeset
79 subjList: Optional[List[Node]] = None
e7d594c065d4 minor refactoring
drewp@bigasterisk.com
parents: 1652
diff changeset
80 objList: Optional[List[Node]] = None
1651
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
81
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
82 def __post_init__(self):
1659
15e84c71beee parse lists from graph into the Chunks
drewp@bigasterisk.com
parents: 1654
diff changeset
83 if not (((self.primary[0] is not None) ^ (self.subjList is not None)) and
15e84c71beee parse lists from graph into the Chunks
drewp@bigasterisk.com
parents: 1654
diff changeset
84 ((self.primary[2] is not None) ^ (self.objList is not None))):
15e84c71beee parse lists from graph into the Chunks
drewp@bigasterisk.com
parents: 1654
diff changeset
85 raise TypeError("invalid chunk init")
1651
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
86 self.predicate = self.primary[1]
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
87 self.sortKey = (self.primary, tuple(self.subjList or []), tuple(self.objList or []))
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
88
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
89 def __hash__(self):
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
90 return hash(self.sortKey)
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
91
1664
1a7c1261302c logic fix- some bindings were being returned 2+; some 0 times
drewp@bigasterisk.com
parents: 1661
diff changeset
92 def __lt__(self, other):
1a7c1261302c logic fix- some bindings were being returned 2+; some 0 times
drewp@bigasterisk.com
parents: 1661
diff changeset
93 return self.sortKey < other.sortKey
1651
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
94
1660
31f7dab6a60b function evaluation uses Chunk lists now and runs fast. Only a few edge cases still broken
drewp@bigasterisk.com
parents: 1659
diff changeset
95 def _allTerms(self) -> Iterator[Node]:
31f7dab6a60b function evaluation uses Chunk lists now and runs fast. Only a few edge cases still broken
drewp@bigasterisk.com
parents: 1659
diff changeset
96 """the terms in `primary` plus the lists. Output order is undefined but stable between same-sized Chunks"""
31f7dab6a60b function evaluation uses Chunk lists now and runs fast. Only a few edge cases still broken
drewp@bigasterisk.com
parents: 1659
diff changeset
97 yield self.primary[1]
31f7dab6a60b function evaluation uses Chunk lists now and runs fast. Only a few edge cases still broken
drewp@bigasterisk.com
parents: 1659
diff changeset
98 if self.primary[0] is not None:
31f7dab6a60b function evaluation uses Chunk lists now and runs fast. Only a few edge cases still broken
drewp@bigasterisk.com
parents: 1659
diff changeset
99 yield self.primary[0]
31f7dab6a60b function evaluation uses Chunk lists now and runs fast. Only a few edge cases still broken
drewp@bigasterisk.com
parents: 1659
diff changeset
100 else:
31f7dab6a60b function evaluation uses Chunk lists now and runs fast. Only a few edge cases still broken
drewp@bigasterisk.com
parents: 1659
diff changeset
101 yield from cast(List[Node], self.subjList)
31f7dab6a60b function evaluation uses Chunk lists now and runs fast. Only a few edge cases still broken
drewp@bigasterisk.com
parents: 1659
diff changeset
102 if self.primary[2] is not None:
31f7dab6a60b function evaluation uses Chunk lists now and runs fast. Only a few edge cases still broken
drewp@bigasterisk.com
parents: 1659
diff changeset
103 yield self.primary[2]
31f7dab6a60b function evaluation uses Chunk lists now and runs fast. Only a few edge cases still broken
drewp@bigasterisk.com
parents: 1659
diff changeset
104 else:
31f7dab6a60b function evaluation uses Chunk lists now and runs fast. Only a few edge cases still broken
drewp@bigasterisk.com
parents: 1659
diff changeset
105 yield from cast(List[Node], self.objList)
1651
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
106
1664
1a7c1261302c logic fix- some bindings were being returned 2+; some 0 times
drewp@bigasterisk.com
parents: 1661
diff changeset
107 def ruleMatchesFrom(self, workingSet: 'ChunkedGraph') -> Iterator[AlignedRuleChunk]:
1a7c1261302c logic fix- some bindings were being returned 2+; some 0 times
drewp@bigasterisk.com
parents: 1661
diff changeset
108 """Chunks from workingSet where self, which may have BindableTerm wildcards, could match that workingSet Chunk."""
1a7c1261302c logic fix- some bindings were being returned 2+; some 0 times
drewp@bigasterisk.com
parents: 1661
diff changeset
109 # if log.isEnabledFor(logging.DEBUG):
1a7c1261302c logic fix- some bindings were being returned 2+; some 0 times
drewp@bigasterisk.com
parents: 1661
diff changeset
110 # log.debug(f'{INDENT*6} computing {self}.ruleMatchesFrom({workingSet})')
1a7c1261302c logic fix- some bindings were being returned 2+; some 0 times
drewp@bigasterisk.com
parents: 1661
diff changeset
111 allChunksIter = workingSet.allChunks()
1697
88f6e9bf69d1 stats and non-debug mode speedups
drewp@bigasterisk.com
parents: 1694
diff changeset
112 if log.isEnabledFor(logging.DEBUG): # makes failures a bit more stable, but shows up in profiling
1664
1a7c1261302c logic fix- some bindings were being returned 2+; some 0 times
drewp@bigasterisk.com
parents: 1661
diff changeset
113 allChunksIter = sorted(allChunksIter)
1a7c1261302c logic fix- some bindings were being returned 2+; some 0 times
drewp@bigasterisk.com
parents: 1661
diff changeset
114 for chunk in allChunksIter:
1668
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
115 try:
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
116 aligned = AlignedRuleChunk(self, chunk)
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
117 except Inconsistent:
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
118 continue
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
119 yield aligned
1651
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
120
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
121 def __repr__(self):
1677
aa35ae7a1acc add new bug test (no fix yet)
drewp@bigasterisk.com
parents: 1673
diff changeset
122 pre = ('+'.join(repr(elem) for elem in self.subjList) + '+' if self.subjList else '')
aa35ae7a1acc add new bug test (no fix yet)
drewp@bigasterisk.com
parents: 1673
diff changeset
123 post = ('+' + '+'.join(repr(elem) for elem in self.objList) if self.objList else '')
1659
15e84c71beee parse lists from graph into the Chunks
drewp@bigasterisk.com
parents: 1654
diff changeset
124 return pre + repr(self.primary) + post
1651
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
125
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
126 def isFunctionCall(self, functionsFor) -> bool:
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
127 return bool(list(functionsFor(cast(URIRef, self.predicate))))
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
128
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
129 def isStatic(self) -> bool:
1660
31f7dab6a60b function evaluation uses Chunk lists now and runs fast. Only a few edge cases still broken
drewp@bigasterisk.com
parents: 1659
diff changeset
130 return all(_termIsStatic(s) for s in self._allTerms())
31f7dab6a60b function evaluation uses Chunk lists now and runs fast. Only a few edge cases still broken
drewp@bigasterisk.com
parents: 1659
diff changeset
131
1669
9d00adef0b22 rm used parameter
drewp@bigasterisk.com
parents: 1668
diff changeset
132 def apply(self, cb: CandidateBinding) -> 'Chunk':
1660
31f7dab6a60b function evaluation uses Chunk lists now and runs fast. Only a few edge cases still broken
drewp@bigasterisk.com
parents: 1659
diff changeset
133 """Chunk like this one but with cb substitutions applied. Terms that cb
31f7dab6a60b function evaluation uses Chunk lists now and runs fast. Only a few edge cases still broken
drewp@bigasterisk.com
parents: 1659
diff changeset
134 does not bind are left as they are (applyTerm runs with failUnbound=False)."""
1669
9d00adef0b22 rm used parameter
drewp@bigasterisk.com
parents: 1668
diff changeset
135 fn = lambda t: cb.applyTerm(t, failUnbound=False)
1660
31f7dab6a60b function evaluation uses Chunk lists now and runs fast. Only a few edge cases still broken
drewp@bigasterisk.com
parents: 1659
diff changeset
136 return Chunk(
31f7dab6a60b function evaluation uses Chunk lists now and runs fast. Only a few edge cases still broken
drewp@bigasterisk.com
parents: 1659
diff changeset
137 (
31f7dab6a60b function evaluation uses Chunk lists now and runs fast. Only a few edge cases still broken
drewp@bigasterisk.com
parents: 1659
diff changeset
138 fn(self.primary[0]) if self.primary[0] is not None else None, #
31f7dab6a60b function evaluation uses Chunk lists now and runs fast. Only a few edge cases still broken
drewp@bigasterisk.com
parents: 1659
diff changeset
139 fn(self.primary[1]), #
31f7dab6a60b function evaluation uses Chunk lists now and runs fast. Only a few edge cases still broken
drewp@bigasterisk.com
parents: 1659
diff changeset
140 fn(self.primary[2]) if self.primary[2] is not None else None),
31f7dab6a60b function evaluation uses Chunk lists now and runs fast. Only a few edge cases still broken
drewp@bigasterisk.com
parents: 1659
diff changeset
141 subjList=[fn(t) for t in self.subjList] if self.subjList is not None else None,
31f7dab6a60b function evaluation uses Chunk lists now and runs fast. Only a few edge cases still broken
drewp@bigasterisk.com
parents: 1659
diff changeset
142 objList=[fn(t) for t in self.objList] if self.objList is not None else None,
31f7dab6a60b function evaluation uses Chunk lists now and runs fast. Only a few edge cases still broken
drewp@bigasterisk.com
parents: 1659
diff changeset
143 )
1651
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
144
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
145
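Note: the Chunk invariant (each end is either a plain term or a list, never both and never neither) and the term order of _allTerms can be illustrated with a small stand-in. This is a sketch under assumptions: `MiniChunk` is a hypothetical class that uses strings instead of rdflib nodes, and it omits sortKey, hashing and rule matching.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional, Tuple


@dataclass
class MiniChunk:
    """Hypothetical, string-based stand-in for Chunk (not the real class)."""
    primary: Tuple[Optional[str], str, Optional[str]]
    subjList: Optional[List[str]] = None
    objList: Optional[List[str]] = None

    def __post_init__(self):
        # Each end must be given exactly once: a plain term XOR a list.
        subjOk = (self.primary[0] is not None) ^ (self.subjList is not None)
        objOk = (self.primary[2] is not None) ^ (self.objList is not None)
        if not (subjOk and objOk):
            raise TypeError('invalid chunk init')

    def allTerms(self) -> Iterator[str]:
        # Predicate first, then the subject side, then the object side.
        yield self.primary[1]
        if self.primary[0] is not None:
            yield self.primary[0]
        else:
            yield from self.subjList or []
        if self.primary[2] is not None:
            yield self.primary[2]
        else:
            yield from self.objList or []


# The docstring example "(?x ?y) math:sum ?z ." as a single chunk.
sumCall = MiniChunk((None, 'math:sum', '?z'), subjList=['?x', '?y'])
print(list(sumCall.allTerms()))
```

For the example chunk the terms come out as math:sum, ?x, ?y, ?z. Passing both a subject term and a subjList raises TypeError, as the XOR test in __post_init__ requires.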
1660
31f7dab6a60b function evaluation uses Chunk lists now and runs fast. Only a few edge cases still broken
drewp@bigasterisk.com
parents: 1659
diff changeset
146 def _termIsStatic(term: Optional[Node]) -> bool:
1659
15e84c71beee parse lists from graph into the Chunks
drewp@bigasterisk.com
parents: 1654
diff changeset
147 return isinstance(term, (URIRef, Literal)) or term is None
1651
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
148
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
149
1694
73abfd4cf5d0 new html log and other refactoring as i work on the advanceTheStack problems
drewp@bigasterisk.com
parents: 1677
diff changeset
150 def applyChunky(cb: CandidateBinding, g: Iterable[AlignedRuleChunk]) -> Iterator[AlignedRuleChunk]:
1664
1a7c1261302c logic fix- some bindings were being returned 2+; some 0 times
drewp@bigasterisk.com
parents: 1661
diff changeset
151 for aligned in g:
1669
9d00adef0b22 rm used parameter
drewp@bigasterisk.com
parents: 1668
diff changeset
152 bound = aligned.ruleChunk.apply(cb)
1651
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
153 try:
1668
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
154 yield AlignedRuleChunk(bound, aligned.workingSetChunk)
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
155 except Inconsistent:
89e53cb8a01c fix some harder tests. Mostly, _advanceTheStack needed to spin the odometer rings starting from the other side, to get all the right combos
drewp@bigasterisk.com
parents: 1664
diff changeset
156 pass
1651
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
157
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
158
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
159 class ChunkedGraph:
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
160 """a Graph converts 1-to-1 with a ChunkedGraph, where the Chunks have
1652
dddfa09ea0b9 debug logging and comments
drewp@bigasterisk.com
parents: 1651
diff changeset
161 combined some statements together. (The only exception is that bnodes for
1651
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
162 rdf lists are lost)"""
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
163
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
164 def __init__(
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
165 self,
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
166 graph: Graph,
1673
80f4e741ca4f redo RHS bnode processing
drewp@bigasterisk.com
parents: 1669
diff changeset
167 bnodeType: Union[Type[RuleUnboundBnode], Type[WorkingSetBnode]],
1651
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
168 functionsFor # get rid of this- i'm just working around a circular import
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
169 ):
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
170 self.chunksUsedByFuncs: Set[Chunk] = set()
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
171 self.staticChunks: Set[Chunk] = set()
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
172 self.patternChunks: Set[Chunk] = set()
1659
15e84c71beee parse lists from graph into the Chunks
drewp@bigasterisk.com
parents: 1654
diff changeset

        # Split the graph: rdf:first/rdf:rest triples describe list structure;
        # every other triple is an ordinary statement.
        firstNodes = {}
        restNodes = {}
        graphStmts = set()
        for s, p, o in graph:
            if p == RDF['first']:
                firstNodes[s] = o
            elif p == RDF['rest']:
                restNodes[s] = o
            else:
                graphStmts.add((s, p, o))

        def gatherList(start):
            # Walk the rdf:rest chain from `start` to rdf:nil, collecting each rdf:first value.
            # Assumes a well-formed list: a node missing rdf:first or rdf:rest raises KeyError.
            lst = []
            cur = start
            while cur != RDF['nil']:
                lst.append(firstNodes[cur])
                cur = restNodes[cur]
            return lst

        for s, p, o in graphStmts:
            subjList = objList = None
            # A list head in subject or object position is replaced by the gathered list.
            if s in firstNodes:
                subjList = gatherList(s)
                s = None
            if o in firstNodes:
                objList = gatherList(o)
                o = None
1673
80f4e741ca4f redo RHS bnode processing
drewp@bigasterisk.com
parents: 1669
diff changeset
            from rdflib import BNode
1694
73abfd4cf5d0 new html log and other refactoring as i work on the advanceTheStack problems
drewp@bigasterisk.com
parents: 1677
diff changeset
            if isinstance(s, BNode):
                s = bnodeType(s)
            if isinstance(p, BNode):
                p = bnodeType(p)
            if isinstance(o, BNode):
                o = bnodeType(o)
1673
80f4e741ca4f redo RHS bnode processing
drewp@bigasterisk.com
parents: 1669
diff changeset
208
1659
15e84c71beee parse lists from graph into the Chunks
drewp@bigasterisk.com
parents: 1654
diff changeset
            c = Chunk((s, p, o), subjList=subjList, objList=objList)

1651
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
            if c.isFunctionCall(functionsFor):
                self.chunksUsedByFuncs.add(c)
            elif c.isStatic():
                self.staticChunks.add(c)
            else:
                self.patternChunks.add(c)

    def allPredicatesExceptFunctions(self) -> Set[Node]:
        return set(ch.predicate for ch in itertools.chain(self.staticChunks, self.patternChunks))

    def noPredicatesAppear(self, preds: Iterable[Node]) -> bool:
        return self.allPredicatesExceptFunctions().isdisjoint(preds)

1654
d47832373b34 __nonzero__ is called __bool__ in py3! thanks for nothing, linters
drewp@bigasterisk.com
parents: 1653
diff changeset
    def __bool__(self):
1651
20474ad4968e WIP - functions are broken as i move most layers to work in Chunks not Triples
drewp@bigasterisk.com
parents:
diff changeset
        return bool(self.chunksUsedByFuncs) or bool(self.staticChunks) or bool(self.patternChunks)

    def __repr__(self):
        return f'ChunkedGraph({self.__dict__})'

    def allChunks(self) -> Iterable[Chunk]:
        yield from itertools.chain(self.staticChunks, self.patternChunks, self.chunksUsedByFuncs)

    def __contains__(self, ch: Chunk) -> bool:
        return ch in self.allChunks()