annotate service/reasoning/inference.py @ 1092:54de5144900d

switch from evtiming to greplin.scales. Optimize rules reader to reuse previous data (400ms -> 0.6ms) Ignore-this: a655f4c56db51b09b3f14d7f09e354cb darcs-hash:4ffd7012f404392375434243104eba065ffb8086
author drewp <drewp@bigasterisk.com>
date Mon, 09 May 2016 00:32:08 -0700
parents cb7fa2f30df9
children e03696277b32
rev   line source
 825     1 """
         2 see ./reasoning for usage
         3 """
         4
1092     5 import sys, os, contextlib
 919     6 try:
         7     from rdflib.Graph import Graph
         8 except ImportError:
         9     from rdflib import Graph
1089    10
        11 from rdflib.parser import StringInputSource
 825    12
        13 sys.path.append("/my/proj/room/fuxi/build/lib.linux-x86_64-2.6")
        14 from FuXi.Rete.Util import generateTokenSet
        15 from FuXi.Rete import ReteNetwork
1089    16 from FuXi.Rete.RuleStore import N3RuleStore
        17
        18 from rdflib import plugin, Namespace
 825    19 from rdflib.store import Store
        20
1092    21 from greplin import scales
        22 STATS = scales.collection('/web',
        23                           scales.PmfStat('readRules'))
1089    24
        25 from escapeoutputstatements import escapeOutputStatements
        26 ROOM = Namespace("http://projects.bigasterisk.com/room/")
        27
1092    28 def _loadAndEscape(ruleStore, n3, outputPatterns):
        29     ruleGraph = Graph(ruleStore)
        30
        31     # Can't escapeOutputStatements in the ruleStore since it
        32     # doesn't support removals. Can't copy plainGraph into
        33     # ruleGraph since something went wrong with traversing the
        34     # triples inside quoted graphs, and I lose all the bodies
        35     # of my rules. This serialize/parse version is very slow (400ms),
        36     # but it only runs when the file changes.
        37     plainGraph = Graph()
        38     plainGraph.parse(StringInputSource(n3), format='n3') # for inference
        39     escapeOutputStatements(plainGraph, outputPatterns=outputPatterns)
        40     expandedN3 = plainGraph.serialize(format='n3')
        41
        42     ruleGraph.parse(StringInputSource(expandedN3), format='n3')
        43
1089    44 _rulesCache = (None, None, None, None)
        45 def readRules(rulesPath, outputPatterns):
        46     """
1092    47     returns (rulesN3, ruleStore)
1089    48
        49     This includes escaping certain statements in the output
        50     (implied) subgraphs so they're not confused with input
        51     statements.
        52     """
        53     global _rulesCache
        54
1092    55     with STATS.readRules.time():
        56         mtime = os.path.getmtime(rulesPath)
        57         key = (rulesPath, mtime)
        58         if _rulesCache[:2] == key:
        59             _, _, rulesN3, ruleStore = _rulesCache
        60         else:
        61             rulesN3 = open(rulesPath).read() # for web display
1089    62
1092    63             ruleStore = N3RuleStore()
        64             _loadAndEscape(ruleStore, rulesN3, outputPatterns)
        65             log.debug('%s rules' % len(ruleStore.rules))
        66
        67             _rulesCache = key + (rulesN3, ruleStore)
        68         return rulesN3, ruleStore
1089    69
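The reuse in readRules above keys the cached result on (path, mtime), so the slow parse only reruns when the rules file changes on disk. A minimal standalone sketch of that idea follows. The names read_cached and parse are hypothetical and stand in for readRules and the FuXi loading step; nothing here comes from the repository itself.

```python
import os

# (path, mtime, parsed); hypothetical stand-in for _rulesCache
_cache = (None, None, None)

def read_cached(path, parse):
    """Return parse(<contents of path>), re-parsing only when mtime changes."""
    global _cache
    key = (path, os.path.getmtime(path))
    if _cache[:2] == key:
        # Same file, same modification time: reuse the previous result.
        return _cache[2]
    with open(path) as f:
        parsed = parse(f.read())
    _cache = key + (parsed,)
    return parsed
```

Like the listing, this keeps a single entry, so alternating between two paths would re-parse every time. Edits that land within the filesystem's mtime resolution can also go unnoticed.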
 825    70 def infer(graph, rules):
        71     """
1092    72     returns new graph of inferred statements. Plain rete api seems to
        73     alter rules.formulae and rules.rules, but this function does not
        74     alter the incoming rules object, so you can cache it.
 825    75     """
        76     # based on fuxi/tools/rdfpipe.py
        77     target = Graph()
        78     tokenSet = generateTokenSet(graph)
1092    79     with _dontChangeRulesStore(rules):
        80         network = ReteNetwork(rules, inferredTarget=target)
        81         network.feedFactsToAdd(tokenSet)
        82
 825    83     return target
        84
1092    85 @contextlib.contextmanager
        86 def _dontChangeRulesStore(rules):
        87     if not hasattr(rules, '_stashOriginalRules'):
        88         rules._stashOriginalRules = rules.rules[:]
        89     yield
        90     for k in rules.formulae.keys():
        91         if not k.startswith('_:Formula'):
        92             del rules.formulae[k]
        93     rules.rules = rules._stashOriginalRules[:]
        94
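_dontChangeRulesStore above snapshots rules.rules once and puts a copy back after the with-body finishes. The sketch below shows the same snapshot-and-restore shape on a plain object. restored_list and Store are hypothetical names for illustration, and the try/finally is an addition not present in the listing, so that the restore also runs when the body raises.

```python
import contextlib

@contextlib.contextmanager
def restored_list(obj, attr):
    """Snapshot the list at obj.<attr> on entry; put a copy back on exit."""
    saved = list(getattr(obj, attr))
    try:
        yield
    finally:
        # Runs on normal exit and when the with-body raises.
        setattr(obj, attr, list(saved))

class Store(object):  # hypothetical stand-in for a rule store
    def __init__(self):
        self.rules = ['r1', 'r2']

store = Store()
with restored_list(store, 'rules'):
    store.rules.append('temporary')
# store.rules is ['r1', 'r2'] again at this point
```

One difference from the listing: this sketch takes a fresh snapshot on every entry, while _dontChangeRulesStore stashes the first state it sees on the rules object and always restores to that.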
 825    95 import time, logging
        96 log = logging.getLogger()
        97 def logTime(func):
        98     def inner(*args, **kw):
        99         t1 = time.time()
       100         try:
       101             ret = func(*args, **kw)
       102         finally:
       103             log.info("Call to %s took %.1f ms" % (
       104                 func.__name__, 1000 * (time.time() - t1)))
       105         return ret
       106     return inner
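
logTime above is a decorator that logs the wall-clock duration of each call, including calls that raise. A usage sketch follows, with the decorator restated as log_time so the block runs on its own. The function total is hypothetical, and functools.wraps is an addition not present in the listing.

```python
import functools
import logging
import time

log = logging.getLogger()

def log_time(func):
    @functools.wraps(func)  # keep func.__name__ and docstring on the wrapper
    def inner(*args, **kw):
        t1 = time.time()
        try:
            return func(*args, **kw)
        finally:
            # Logged whether func returned or raised.
            log.info("Call to %s took %.1f ms",
                     func.__name__, 1000 * (time.time() - t1))
    return inner

@log_time
def total(values):  # hypothetical function to time
    return sum(values)

logging.basicConfig(level=logging.INFO)
total(range(1000))
```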