Class Walker<N>

java.lang.Object
com.google.mu.util.graph.Walker<N>
Direct Known Subclasses:
BinaryTreeWalker, GraphWalker

public abstract class Walker<N> extends Object
Implements generic graph and tree traversal algorithms (pre-order, post-order and breadth-first) as lazily evaluated streams, allowing infinite-size graphs.

The following example code explores all islands to find the treasure island:


 Optional<Island> treasureIsland =
     Walker.inGraph(Island::nearbyIslands)
         .preOrderFrom(homeIsland)
         .filter(Island::hasTreasure)
         .findFirst();
 

None of these streams are safe to run in parallel. Although, multiple threads could traverse the same graph collaboratively by sharing a concurrent node tracker. See inGraph(findSuccessors, nodeTracker) for details.

For binary trees, use inBinaryTree(Tree::left, Tree::right).

Since:
4.0
  • Method Details

    • inBinaryTree

      public static <N> BinaryTreeWalker<N> inBinaryTree(UnaryOperator<N> getLeft, UnaryOperator<N> getRight)
      Returns a BinaryTreeWalker for walking in the binary tree topology as observed by getLeft and getRight functions. Both functions return null to indicate that there is no left or right child.

      It's guaranteed that for any given node, getLeft and getRight are called lazily, only when the left or the right child is traversed. They are called at most once for each node.

      Since:
      4.2
    • inTree

      public static <N> Walker<N> inTree(Function<? super N,? extends Stream<? extends N>> findChildren)
      Returns a Walker to walk the tree topology (no cycles) as observed by the findChildren function, which finds children of any given tree node.

      inTree() is more efficient than inGraph() because it doesn't need to remember nodes that are already visited. On the other hand, the returned Walker can walk in cycles if the findChildren function unexpectedly represents a cyclic graph. If you need to guard against cycles just in case, you can use inGraph() with a custom node tracker to check for the critical precondition:

      
       Set<N> visited = new HashSet<>();
       Walker<N> walker = Walker.inGraph(successorFunction, n -> {
         checkArgument(visited.add(n), "Node with multiple parents: %s", n);
         return true;
       });
       

      The returned object is idempotent, stateless and immutable as long as findChildren is idempotent, stateless and immutable.

      Parameters:
      findChildren - Function to get the child nodes for a given node. No children if empty stream or null is returned,
    • inGraph

      public static <N> GraphWalker<N> inGraph(Function<? super N,? extends Stream<? extends N>> findSuccessors)
      Returns a Walker to walk the graph topology (possibly with cycles) as observed by the findSuccessors function, which finds successors of any given graph node.

      Because the traversal needs to remember which node(s) have been traversed, memory usage is linear to the number of traversed nodes.

      Parameters:
      findSuccessors - Function to get the successor nodes for a given node. No successor if empty stream or null is returned,
    • inGraph

      public static <N> GraphWalker<N> inGraph(Function<? super N,? extends Stream<? extends N>> findSuccessors, Predicate<? super N> tracker)
      Similar to inGraph(Function), returns a Walker that can be used to traverse a graph of nodes. tracker is used to track every node being traversed. When Walker is about to traverse a node, tracker.test(node) will be called and the node (together with its edges) will be skipped if false is returned.

      This is useful for custom node tracking. For example, using a ConcurrentHashMap, multiple threads can traverse a large graph concurrently and collaboratively:

      
       Walker<Room> concurrentWalker =
           Walker.inGraph(buildingMap, ConcurrentHashMap.newKeySet()::add);
      
       // thread 1:
       Stream<Room> shield = concurrentWalker.preOrderFrom(roof);
       shield.forEachOrdered(room -> System.out.println("Raided by SHIELD from roof: " + room);
      
       // thread 2:
       Stream<Room> avengers = concurrentWalker.breadthFirstFrom(mainEntrance);
       avengers.forEachOrdered(room -> System.out.println("Raided by Avengers: " + room);
       

      Or, nodes can be tracked by functional equivalence. What gives? Imagine in a pirate treasure hunt, we start from an island and scavenge from island to island. Considering the islands as nodes, we can use any traversal strategy. Say, we picked pre-order:

      
       Optional<Island> treasureIsland =
           Walker.inGraph(Island::nearbyIslands)
               .preOrderFrom(startIsland)
               .filter(Island::hasTreasure)
               .findFirst();
       
      That gives us the treasure island. But what if upon finding the treasure island, we also want to make our own treasure map? It requires not just finding the island, but also recording how we got there. To do this, we can start by defining a class that encodes the route:
      
       class Route {
         private final Island island;
         private final Route predecessor;
      
         // End of the route.
         Island end() {
           return island;
         }
      
         // Returns a new Route with this Route as the predecessor.
         Route extendTo(Island newIsland) {
           return new Route(newIsland, this);
         }
      
         Stream<Route> nearbyRoutes() {
           return island.nearbyIslands().map(this::extendTo);
         }
      
         List<Island> islands() {
           // follow the `predecessor` chain to return all islands along the route.
         }
       }
       
      And then we can modify the treasure hunt code to walk through a stream of Route objects in place of islands. A caveat is that Route doesn't define equals --- even if it did, it'd be recursive and not what we need anyway (we care about unique islands, not unique routes).

      Long story short, the trick is to use functional equivalence so that the Walker still knows which islands have already been searched:

      
       Set<Island> searched = new HashSet<>();
       Optional<Route> treasureIslandRoute =
           Walker.inGraph(Route::nearbyRoutes, route -> searched.add(route.end()))
               .preOrderFrom(new Route(startIsland))
               .filter(route -> route.end().hasTreasure())
               .findFirst();
       

      In the case of walking a very large graph with more nodes than can fit in memory, it's conceivable to use com.google.common.hash.BloomFilter#put BloomFilter to track visited nodes, as long as you are okay with probabilistically missing a fraction of the graph nodes due to Bloom filter's inherent false-positive rates. Because Bloom filters have zero false-negatives, it's guaranteed that the walker will never walk in cycles.

      Parameters:
      findSuccessors - Function to get the successor nodes for a given node. No successor if empty stream or null is returned,
      tracker - Tracks each node being visited during traversal. Returns false if the node and its edges should be skipped for traversal (for example because it has already been traversed). Despite being a Predicate, the tracker usually carries side-effects like storing the tracked nodes in a set (set::add, bloomFilter::put etc. will do).
    • preOrderFrom

      @SafeVarargs public final Stream<N> preOrderFrom(N... startNodes)
      Starts from startNodes and walks depth first in pre-order.

      The returned stream may be infinite if the graph or tree has infinite depth or infinite breadth, or both. The stream can still be short-circuited to consume a limited number of nodes during traversal.

    • preOrderFrom

      public abstract Stream<N> preOrderFrom(Iterable<? extends N> startNodes)
      Starts from startNodes and walks depth first in pre-order.

      The returned stream may be infinite if the graph or tree has infinite depth or infinite breadth, or both. The stream can still be short-circuited to consume a limited number of nodes during traversal.

    • postOrderFrom

      @SafeVarargs public final Stream<N> postOrderFrom(N... startNodes)
      Starts from startNodes and walks depth first in post-order (the reverse of a topological sort).

      The returned stream may be infinite if the graph or tree has infinite breadth. The stream can still be short-circuited to consume a limited number of nodes during traversal.

      The stream may result in infinite loop when traversing through a node with infinite depth.

    • postOrderFrom

      public abstract Stream<N> postOrderFrom(Iterable<? extends N> startNodes)
      Starts from startNodes and walks depth first in post-order (the reverse of a topological sort).

      The returned stream may be infinite if the graph or tree has infinite breadth. The stream can still be short-circuited to consume a limited number of nodes during traversal.

      The stream may result in infinite loop when traversing through a node with infinite depth.

    • breadthFirstFrom

      @SafeVarargs public final Stream<N> breadthFirstFrom(N... startNodes)
      Starts from startNodes and walks in breadth-first order.

      The returned stream may be infinite if the graph or tree has infinite depth or infinite breadth, or both. The stream can still be short-circuited to consume a limited number of nodes during traversal.

    • breadthFirstFrom

      public abstract Stream<N> breadthFirstFrom(Iterable<? extends N> startNodes)
      Starts from startNodes and walks in breadth-first order.

      The returned stream may be infinite if the graph or tree has infinite depth or infinite breadth, or both. The stream can still be short-circuited to consume a limited number of nodes during traversal.