Optimizing python-constraint with forward checking

in #programming6 years ago (edited)

For Soffit I implemented a few Constraint objects with the python-constraint library. These objects represent a particular restriction on the solution, and they can prune the search space either ahead of time or while the search is running.

One of them just checks variables to see if they're in a set of allowed tuples, and I make heavy use of this for edge-based constraints. Originally the constraint checking was very simple:

    def __call__(self, variables, domains, assignments, forwardcheck=False):
        current = tuple( assignments.get( v, Unassigned ) for v in variables )
        if Unassigned in current:
            return True
        else:
            return current in self.allowedSet

That is, if all the variables are assigned, then check for a matching tuple. Otherwise say the constraint is still satisfied. There was a lot more code in the preprocessing stage to eliminate individual values that were impossible.

I noticed that certain rules took a long time to evaluate (even if there were no matches.) A little digging with cProfile showed that both my check above as well as the built-in AllDifferentConstraint were handling a lot of different queries, accounting for the bulk of the time. So it would be advantageous to give the constraint solver more guidance.

In general, the problem is quite hard: given a partially-filled tuple, how do you find all compatible tuples from a set? Fortunately, the way I was using this class, tuples of size two (graph edges) dominated, and I was able to verify that by instrumenting my code. And with just two values, you can build two indexes at a reasonable cost. The first says what "B" values exist for a given value of "A", and the second index the opposite.

    def insertTrees( self, a, b ):
        if a in self.forward:
            self.forward[a].add( b )
        else:
            self.forward[a] = set( [b] )

        if b in self.backward:
            self.backward[b].add( a )
        else:
            self.backward[b] = set( [a] )

Given that index, we can look at which values in the domain of the variable are permissible, and temporarily suppress those that aren't. The hideValue function temporarily removes a value; when the solver backtracks the value will be restored.

    def pairCheck( self, current, domain, whichMap ):
        domainValues = set( domain )
        restrictedValues = whichMap[ current ]
        #print( "domainValues", domainValues )
        #print( "restrictedValues", restrictedValues )
        for val in domainValues.difference( restrictedValues ):
            #print( "Hiding value", val )
            domain.hideValue( val )
            if not domain:
                return False
        return True

So, in the current version, the code that checks the constraints special-cases when there are just two variables. When the solver passes in forwardcheck=true, that means the Constraint is allowed to modify the domain of variables in the way shown above. We can also signal infeasibility for values that don't appear in the index at all, but this is redundant if the preprocessing step got to run. (In some cases, it doesn't because the check is conditional.)

    def __call__(self, variables, domains, assignments, forwardcheck=False):
        current = tuple( assignments.get( v, Unassigned ) for v in variables )
        if Unassigned not in current:
            return current in self.allowedSet
        
        if len( variables ) != 2:
            return True

        if current[0] is not Unassigned:
            if current[0] not in self.forward:
                return False                
            if forwardcheck:
                return self.pairCheck( current[0],
                                       domains[variables[1]],
                                       self.forward )
                    

        if current[1] is not Unassigned:
            if current[1] not in self.backward:
                return False
            
            if forwardcheck:
                return self.pairCheck( current[1],
                                       domains[variables[0]],
                                       self.backward )

        return True

This optimization is pretty effective; the larger graph below took about 95 seconds, which is as long as my previous run took for about 1/4 as many nodes.

1d-cellular-rule30-large.png

The current profile shows that hideValue is actually where a lot of the time is spent, but some of the graph functions appear as well, so there maybe some opportunity to do less copying or renaming and get a further increase in speed.

Two list comprehensions appear on the list, this one for nodes and the corresponding one for tags:

            matchingTag = [ (i,) for i in self.graph.nodes
                            if self.graph.nodes[i].get( 'tag', None ) == tag ]

It would be pretty easy to keep a tag->node index in the graph object so that this search could be done more efficiently.

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  8949895    3.698    0.000    4.232    0.000 /home/mark/.local/share/virtualenvs/soffit-71IOh0Y6/lib/python3.5/site-packages/constraint/__init__.py:798(hideValue)
     7553    2.777    0.000    6.728    0.001 /home/mark/.local/share/virtualenvs/soffit-71IOh0Y6/lib/python3.5/site-packages/networkx/classes/digraph.py:628(add_edges_from)
   143429    2.212    0.000    6.345    0.000 /home/mark/soffit/soffit/constraint.py:63(pairCheck)
     7553    1.357    0.000    1.989    0.000 /home/mark/.local/share/virtualenvs/soffit-71IOh0Y6/lib/python3.5/site-packages/networkx/classes/digraph.py:415(add_nodes_from)
  2836653    1.352    0.000    1.741    0.000 /home/mark/.local/share/virtualenvs/soffit-71IOh0Y6/lib/python3.5/site-packages/networkx/classes/graph.py:646(nodes)
     9611    1.331    0.000    2.822    0.000 /home/mark/soffit/soffit/graph.py:118(<listcomp>)
  1446731    1.297    0.000    1.505    0.000 /home/mark/.local/share/virtualenvs/soffit-71IOh0Y6/lib/python3.5/site-packages/networkx/classes/reportviews.py:666(<genexpr>)
     3129    1.117    0.000    3.519    0.001 /home/mark/soffit/soffit/graph.py:139(<listcomp>)
  1446103    1.042    0.000    3.049    0.000 /home/mark/.local/share/virtualenvs/soffit-71IOh0Y6/lib/python3.5/site-packages/networkx/relabel.py:161(<genexpr>)
 10226094    0.958    0.000    0.963    0.000 {method 'get' of 'dict' objects}
   451272    0.821    0.000    0.821    0.000 /home/mark/soffit/soffit/constraint.py:26(insertTrees)
  3532989    0.783    0.000    0.783    0.000 {method 'copy' of 'dict' objects}
     9616    0.698    0.000    1.805    0.000 /home/mark/soffit/soffit/constraint.py:37(__init__)
 10279728    0.613    0.000    0.613    0.000 {method 'append' of 'list' objects}
  3418673    0.604    0.000    1.714    0.000 {method 'update' of 'dict' objects}
   103074    0.587    0.000    0.850    0.000 /home/mark/.local/share/virtualenvs/soffit-71IOh0Y6/lib/python3.5/site-packages/constraint/__init__.py:979(__call__)
Sort:  

Hello! Your post has been resteemed and upvoted by @ilovecoding because we love coding! Keep up good work! Consider upvoting this comment to support the @ilovecoding and increase your future rewards! ^_^ Steem On!

Reply !stop to disable the comment. Thanks!

Coin Marketplace

STEEM 0.19
TRX 0.15
JST 0.029
BTC 63914.63
ETH 2664.93
USDT 1.00
SBD 2.77