annotate service/reasoning/inference.py @ 287:3b61c0dfaaef

switch from evtiming to greplin.scales. Optimize rules reader to reuse previous data (400ms -> 0.6ms) Ignore-this: a655f4c56db51b09b3f14d7f09e354cb
author drewp@bigasterisk.com
date Mon, 09 May 2016 00:32:08 -0700
parents 95f72a22965d
children e03696277b32
rev   line source
20:3f0dd03112b5 move reasoning from /my/proj/room, new integration with magma
drewp@bigasterisk.com
parents:
  1 """
  2 see ./reasoning for usage
  3 """
  4
287:3b61c0dfaaef switch from evtiming to greplin.scales. Optimize rules reader to reuse previous data (400ms -> 0.6ms)
drewp@bigasterisk.com
parents: 284
  5 import sys, os, contextlib
114:4cd065b97fa1 bugs in async http client. move trig helpers to rdflibtrig, which can work with rdflib 4
drewp@bigasterisk.com
parents: 45
  6 try:
  7     from rdflib.Graph import Graph
  8 except ImportError:
  9     from rdflib import Graph
284:95f72a22965d rules become simple-looking again; fix the ambiguity in memory after loading them.
drewp@bigasterisk.com
parents: 281
 10
 11 from rdflib.parser import StringInputSource
20:3f0dd03112b5
 12
 13 sys.path.append("/my/proj/room/fuxi/build/lib.linux-x86_64-2.6")
 14 from FuXi.Rete.Util import generateTokenSet
 15 from FuXi.Rete import ReteNetwork
284:95f72a22965d
 16 from FuXi.Rete.RuleStore import N3RuleStore
 17
 18 from rdflib import plugin, Namespace
20:3f0dd03112b5
 19 from rdflib.store import Store
 20
287:3b61c0dfaaef
 21 from greplin import scales
 22 STATS = scales.collection('/web',
 23                           scales.PmfStat('readRules'))
284:95f72a22965d
 24
 25 from escapeoutputstatements import escapeOutputStatements
 26 ROOM = Namespace("http://projects.bigasterisk.com/room/")
 27
287:3b61c0dfaaef
 28 def _loadAndEscape(ruleStore, n3, outputPatterns):
 29     ruleGraph = Graph(ruleStore)
 30
 31     # Can't escapeOutputStatements in the ruleStore since it
 32     # doesn't support removals. Can't copy plainGraph into
 33     # ruleGraph since something went wrong with traversing the
 34     # triples inside quoted graphs, and I lose all the bodies
 35     # of my rules. This serialize/parse version is very slow (400ms),
 36     # but it only runs when the file changes.
 37     plainGraph = Graph()
 38     plainGraph.parse(StringInputSource(n3), format='n3') # for inference
 39     escapeOutputStatements(plainGraph, outputPatterns=outputPatterns)
 40     expandedN3 = plainGraph.serialize(format='n3')
 41
 42     ruleGraph.parse(StringInputSource(expandedN3), format='n3')
 43
284:95f72a22965d
 44 _rulesCache = (None, None, None, None)
 45 def readRules(rulesPath, outputPatterns):
 46     """
287:3b61c0dfaaef
 47     returns (rulesN3, ruleStore)
284:95f72a22965d
 48
 49     This includes escaping certain statements in the output
 50     (implied) subgraphs so they're not confused with input
 51     statements.
 52     """
 53     global _rulesCache
 54
287:3b61c0dfaaef
 55     with STATS.readRules.time():
 56         mtime = os.path.getmtime(rulesPath)
 57         key = (rulesPath, mtime)
 58         if _rulesCache[:2] == key:
 59             _, _, rulesN3, ruleStore = _rulesCache
 60         else:
 61             rulesN3 = open(rulesPath).read() # for web display
284:95f72a22965d
 62
287:3b61c0dfaaef
 63             ruleStore = N3RuleStore()
 64             _loadAndEscape(ruleStore, rulesN3, outputPatterns)
 65             log.debug('%s rules' % len(ruleStore.rules))
 66
 67             _rulesCache = key + (rulesN3, ruleStore)
 68         return rulesN3, ruleStore
284:95f72a22965d
 69
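The caching idea in readRules (re-parse only when the path or its modification time changes) can be sketched in isolation. This is a minimal illustration, not the author's code: `read_cached` and its `parse` callback are hypothetical names, the parser is left to the caller, and the sketch is written for Python 3 while the listing is Python 2 era.

```python
import os

# Single-entry cache: (path, mtime, parsed result). Hypothetical layout,
# modelled loosely on the _rulesCache tuple in the listing.
_cache = (None, None, None)


def read_cached(path, parse):
    """Return parse(file text), re-parsing only if path or mtime changed."""
    global _cache
    key = (path, os.path.getmtime(path))
    if _cache[:2] == key:
        # Same file, same modification time: reuse the earlier result.
        return _cache[2]
    with open(path) as f:
        parsed = parse(f.read())
    _cache = key + (parsed,)
    return parsed
```

One caveat of keying on mtime: an edit that lands within the filesystem's timestamp resolution of the previous write may not be noticed until the file is touched again.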
20:3f0dd03112b5
 70 def infer(graph, rules):
 71     """
287:3b61c0dfaaef
 72     returns new graph of inferred statements. Plain rete api seems to
 73     alter rules.formulae and rules.rules, but this function does not
 74     alter the incoming rules object, so you can cache it.
20:3f0dd03112b5
 75     """
 76     # based on fuxi/tools/rdfpipe.py
 77     target = Graph()
 78     tokenSet = generateTokenSet(graph)
287:3b61c0dfaaef
 79     with _dontChangeRulesStore(rules):
 80         network = ReteNetwork(rules, inferredTarget=target)
 81         network.feedFactsToAdd(tokenSet)
 82
20:3f0dd03112b5
 83     return target
 84
287:3b61c0dfaaef
 85 @contextlib.contextmanager
 86 def _dontChangeRulesStore(rules):
 87     if not hasattr(rules, '_stashOriginalRules'):
 88         rules._stashOriginalRules = rules.rules[:]
 89     yield
 90     for k in rules.formulae.keys():
 91         if not k.startswith('_:Formula'):
 92             del rules.formulae[k]
 93     rules.rules = rules._stashOriginalRules[:]
 94
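The snapshot-and-restore pattern used by _dontChangeRulesStore can be tried without FuXi. The sketch below is an assumption-laden variant, not the listing's code: `FakeRuleStore` is a hypothetical stand-in for N3RuleStore, `preserve_rules` is an illustrative name, and unlike the listing it wraps the `yield` in `try`/`finally` so the store is also restored when the body raises. It targets Python 3, hence `list(store.formulae)` before deleting keys.

```python
import contextlib


class FakeRuleStore(object):
    """Hypothetical stand-in for a rule store with mutable state."""

    def __init__(self, rules, formulae):
        self.rules = list(rules)
        self.formulae = dict(formulae)


@contextlib.contextmanager
def preserve_rules(store):
    # Snapshot the rules once, on first use, so every later call
    # restores to the same baseline.
    if not hasattr(store, '_stash'):
        store._stash = store.rules[:]
    try:
        yield store
    finally:
        # Drop formulae added during the body; keep the '_:Formula...' keys.
        for k in list(store.formulae):
            if not k.startswith('_:Formula'):
                del store.formulae[k]
        store.rules = store._stash[:]
```

Whether restoring on error is wanted here is a design choice; the listing restores only after a normal exit from the `with` block.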
20:3f0dd03112b5
 95 import time, logging
 96 log = logging.getLogger()
 97 def logTime(func):
 98     def inner(*args, **kw):
 99         t1 = time.time()
100         try:
101             ret = func(*args, **kw)
102         finally:
103             log.info("Call to %s took %.1f ms" % (
104                 func.__name__, 1000 * (time.time() - t1)))
105         return ret
106     return inner
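
For reference, a timing decorator like logTime is applied as shown below. This is a hedged Python 3 sketch rather than the file's code: `log_time` and `add` are illustrative names, and `functools.wraps` is an addition (not in the listing) that keeps the wrapped function's name intact.

```python
import functools
import logging
import time

log = logging.getLogger(__name__)


def log_time(func):
    """Log how long each call to func takes, in milliseconds."""
    @functools.wraps(func)
    def inner(*args, **kw):
        t1 = time.time()
        try:
            return func(*args, **kw)
        finally:
            # Runs whether func returned or raised.
            log.info("Call to %s took %.1f ms",
                     func.__name__, 1000 * (time.time() - t1))
    return inner


@log_time
def add(a, b):
    # Illustrative function; any callable can be decorated the same way.
    return a + b
```

The log line is only visible if the logging configuration emits INFO-level records, for example after `logging.basicConfig(level=logging.INFO)`.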