Class BiCollectors

java.lang.Object
com.google.mu.util.stream.BiCollectors

public final class BiCollectors extends Object
Common utilities pertaining to BiCollector.

Don't forget that you can directly "method reference" a Collector-returning factory method as a BiCollector as long as it accepts two Function parameters corresponding to the "key" and the "value" parts respectively. For example: collect(ImmutableMap::toImmutableMap), collect(Collectors::toConcurrentMap).
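
To see why such a method reference type-checks, here is a minimal JDK-only sketch. PairCollector and toCollector are hypothetical stand-ins invented for this illustration and are not the library's actual BiCollector interface; the sketch only assumes an interface whose single generic method takes a key Function and a value Function and returns a Collector.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MethodRefSketch {
  // Hypothetical stand-in with the two-Function shape described above.
  // It is NOT the library's BiCollector; the names are illustrative only.
  interface PairCollector<K, V, R> {
    <E> Collector<E, ?, R> toCollector(Function<E, K> toKey, Function<E, V> toValue);
  }

  static Map<String, Integer> lengthsByWord() {
    // Collectors.toMap(keyFunction, valueFunction) has the matching shape,
    // so a plain method reference satisfies the functional interface.
    PairCollector<String, Integer, Map<String, Integer>> toMap = Collectors::toMap;
    return Stream.of("a", "bb", "ccc")
        .collect(toMap.<String>toCollector(word -> word, String::length));
  }

  public static void main(String[] args) {
    System.out.println(lengthsByWord());
  }
}
```

Because the interface method is itself generic, only a method reference (not a lambda) can implement it, which is why the factory method is passed by reference.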

Most of the factory methods in this class are deliberately named after their Collector counterparts. This is a feature. Static imports can be overloaded by method arity, so if you already static import, for example, Collectors.toMap, simply adding static import com.google.mu.util.stream.BiCollectors.toMap will allow both the BiCollector and the Collector to be used in the same file without ambiguity or confusion.

Since:
3.0
  • Method Details

    • toMap

      public static <K, V> BiCollector<K,V,Map<K,V>> toMap()
      Returns a BiCollector that collects the key-value pairs into an immutable Map.

      Calling biStream.toMap() is normally more convenient. But when, for example, you have a BiStream<K, LinkedList<V>> and need to collect it into a Map<K, List<V>>, you'll need to call collect(toMap()) instead of BiStream.toMap().

    • toMap

      public static <K, V, M extends Map<K, V>> BiCollector<K,V,M> toMap(Supplier<? extends M> mapSupplier)
      Returns a BiCollector that collects the key-value pairs into a mutable Map created by mapSupplier.

      Duplicate keys will cause IllegalArgumentException to be thrown, with the offending key reported in the error message. If, instead of throwing an exception, you need to merge the values mapped to the same key, consider using biStream.collect(new CustomMap<>(), Map::put) for overwriting semantics; biStream.collect(new CustomMap<>(), Map::putIfAbsent) for no overwrites; or biStream.collect(new CustomMap<>(), (m, k, v) -> m.merge(k, v, ...)) for other merge logic.
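
      The three accumulators named above differ only in how they treat a repeated key. The following JDK-only sketch applies each Map method to the same pairs; the class, the sample arrays, and the method names are illustrative, and a plain loop stands in for biStream.collect.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class DuplicateKeySketch {
  // Illustrative input pairs with a repeated key: (a, 1), (b, 2), (a, 3).
  private static final String[] KEYS = {"a", "b", "a"};
  private static final int[] VALUES = {1, 2, 3};

  // Map::put semantics: the last value seen for a key wins.
  static Map<String, Integer> overwrite() {
    Map<String, Integer> map = new LinkedHashMap<>();
    for (int i = 0; i < KEYS.length; i++) {
      map.put(KEYS[i], VALUES[i]);
    }
    return map;
  }

  // Map::putIfAbsent semantics: the first value seen for a key wins.
  static Map<String, Integer> keepFirst() {
    Map<String, Integer> map = new LinkedHashMap<>();
    for (int i = 0; i < KEYS.length; i++) {
      map.putIfAbsent(KEYS[i], VALUES[i]);
    }
    return map;
  }

  // Map.merge semantics: values of the same key are combined, here by addition.
  static Map<String, Integer> sum() {
    Map<String, Integer> map = new LinkedHashMap<>();
    for (int i = 0; i < KEYS.length; i++) {
      map.merge(KEYS[i], VALUES[i], Integer::sum);
    }
    return map;
  }

  public static void main(String[] args) {
    System.out.println(overwrite()); // {a=3, b=2}
    System.out.println(keepFirst()); // {a=1, b=2}
    System.out.println(sum());       // {a=4, b=2}
  }
}
```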

      Note that due to constructor overload ambiguity, toMap(CustomMapType::new) may not compile because many mutable Map types such as LinkedHashMap expose both 0-arg and 1-arg constructors. You may need to use a lambda instead of a constructor reference to work around the compiler ambiguity, such as toMap(() -> new LinkedHashMap<>()).

      Null keys and values are discouraged but supported as long as the result Map supports them. Thus this method can be used as a workaround for the toMap(Supplier) JDK bug that fails to support null values.
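
      The JDK behavior referred to here can be reproduced without this library. The sketch below (class and method names are illustrative) shows Collectors.toMap with a map supplier rejecting a null value, next to a put-based collect that accepts it. It demonstrates the JDK side only and makes no claim about how this class is implemented.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class NullValueSketch {
  // Two illustrative pairs; the second one carries a null value.
  static List<Map.Entry<String, String>> pairs() {
    return List.of(
        new SimpleEntry<String, String>("a", "x"),
        new SimpleEntry<String, String>("b", null));
  }

  // Collectors.toMap accumulates through Map.merge, and HashMap.merge
  // throws NullPointerException when the value is null.
  static boolean jdkToMapRejectsNullValue() {
    try {
      Map<String, String> unused = pairs().stream()
          .collect(Collectors.toMap(
              Map.Entry::getKey, Map.Entry::getValue, (a, b) -> b, HashMap::new));
      return false;
    } catch (NullPointerException expected) {
      return true;
    }
  }

  // Accumulating with Map.put accepts null values if the Map type does.
  static Map<String, String> putBasedCollect() {
    return pairs().stream()
        .collect(HashMap::new, (m, e) -> m.put(e.getKey(), e.getValue()), Map::putAll);
  }

  public static void main(String[] args) {
    System.out.println(jdkToMapRejectsNullValue()); // true
    System.out.println(putBasedCollect().containsKey("b")); // true
  }
}
```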

      Since:
      5.9
    • toMap

      public static <K, V> BiCollector<K,V,Map<K,V>> toMap(BinaryOperator<V> valueMerger)
      Returns a BiCollector that collects the key-value pairs into an immutable Map using valueMerger to merge values of duplicate keys.
    • toMap

      public static <K, V1, V> BiCollector<K,V1,Map<K,V>> toMap(Collector<V1,?,V> valueCollector)
      Returns a BiCollector that collects the key-value pairs into an immutable Map using valueCollector to collect values of identical keys into a final value of type V.

      For example, the following calculates total population per state from city demographic data:

      
        Map<StateId, Integer> statePopulations = BiStream.from(cities, City::getState, c -> c)
           .collect(toMap(summingInt(City::getPopulation)));
       

      Entries are collected in encounter order.
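
      For readers more used to the JDK collectors, a roughly comparable aggregation can be sketched with Collectors.groupingBy and a LinkedHashMap to keep encounter order. City here is a hypothetical stand-in for the type in the example above, state ids are simplified to String, and unlike this method the sketch returns a mutable map.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class StatePopulationSketch {
  // Hypothetical stand-in for the City type used in the example above.
  static final class City {
    private final String state;
    private final int population;

    City(String state, int population) {
      this.state = state;
      this.population = population;
    }

    String getState() {
      return state;
    }

    int getPopulation() {
      return population;
    }
  }

  // Groups cities by state in encounter order and sums populations per state.
  static Map<String, Integer> statePopulations(List<City> cities) {
    return cities.stream()
        .collect(Collectors.groupingBy(
            City::getState, LinkedHashMap::new, Collectors.summingInt(City::getPopulation)));
  }

  static Map<String, Integer> sample() {
    return statePopulations(Arrays.asList(
        new City("CA", 10), new City("NY", 5), new City("CA", 7)));
  }

  public static void main(String[] args) {
    System.out.println(sample()); // {CA=17, NY=5}
  }
}
```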

    • counting

      public static <K, V> BiCollector<K,V,Long> counting()
      Returns a counting BiCollector that counts the number of input entries.
      Since:
      3.2
    • countingDistinct

      public static <K, V> BiCollector<K,V,Integer> countingDistinct()
      Returns a counting BiCollector that counts the number of distinct input entries according to Object.equals(Object) for both keys and values.

      Unlike counting(), this collector should not be used on very large (for example, larger than Integer.MAX_VALUE) streams because it internally needs to keep track of all distinct entries in memory.
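
      One way to picture both the equality rule and the memory cost is a set of pairs: two entries are the same only if both key and value are equal, and every distinct pair must be retained. The JDK-only sketch below uses illustrative names and is not the library's implementation.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DistinctPairCountSketch {
  // Counts distinct pairs; the set holds every distinct pair in memory.
  static <K, V> int countDistinct(List<Map.Entry<K, V>> pairs) {
    // SimpleImmutableEntry.equals compares both the key and the value.
    Set<Map.Entry<K, V>> seen = new HashSet<>(pairs);
    return seen.size();
  }

  static int sample() {
    List<Map.Entry<String, Integer>> pairs = List.of(
        new SimpleImmutableEntry<String, Integer>("a", 1),
        new SimpleImmutableEntry<String, Integer>("a", 1), // duplicate pair
        new SimpleImmutableEntry<String, Integer>("a", 2), // same key, new value
        new SimpleImmutableEntry<String, Integer>("b", 1)); // same value, new key
    return countDistinct(pairs);
  }

  public static void main(String[] args) {
    System.out.println(sample()); // 3
  }
}
```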

      Since:
      3.2
    • summingInt

      public static <K, V> BiCollector<K,V,Integer> summingInt(ToIntBiFunction<? super K,? super V> mapper)
      Returns a BiCollector that produces the sum of an integer-valued function applied to the input pair. If no input entries are present, the result is 0.
      Since:
      3.2
    • summingLong

      public static <K, V> BiCollector<K,V,Long> summingLong(ToLongBiFunction<? super K,? super V> mapper)
      Returns a BiCollector that produces the sum of a long-valued function applied to the input pair. If no input entries are present, the result is 0.
      Since:
      3.2
    • summingDouble

      public static <K, V> BiCollector<K,V,Double> summingDouble(ToDoubleBiFunction<? super K,? super V> mapper)
      Returns a BiCollector that produces the sum of a double-valued function applied to the input pair. If no input entries are present, the result is 0.
      Since:
      3.2
    • averagingInt

      public static <K, V> BiCollector<K,V,Double> averagingInt(ToIntBiFunction<? super K,? super V> mapper)
      Returns a BiCollector that produces the arithmetic mean of an integer-valued function applied to the input pair. If no input entries are present, the result is 0.
      Since:
      3.2
    • averagingLong

      public static <K, V> BiCollector<K,V,Double> averagingLong(ToLongBiFunction<? super K,? super V> mapper)
      Returns a BiCollector that produces the arithmetic mean of a long-valued function applied to the input pair. If no input entries are present, the result is 0.
      Since:
      3.2
    • averagingDouble

      public static <K, V> BiCollector<K,V,Double> averagingDouble(ToDoubleBiFunction<? super K,? super V> mapper)
      Returns a BiCollector that produces the arithmetic mean of a double-valued function applied to the input pair. If no input entries are present, the result is 0.
      Since:
      3.2
    • summarizingInt

      public static <K, V> BiCollector<K,V,IntSummaryStatistics> summarizingInt(ToIntBiFunction<? super K,? super V> mapper)
      Returns a BiCollector which applies an int-producing mapping function to each input pair, and returns summary statistics for the resulting values.
      Since:
      3.2
    • summarizingLong

      public static <K, V> BiCollector<K,V,LongSummaryStatistics> summarizingLong(ToLongBiFunction<? super K,? super V> mapper)
      Returns a BiCollector which applies a long-producing mapping function to each input pair, and returns summary statistics for the resulting values.
      Since:
      3.2
    • summarizingDouble

      public static <K, V> BiCollector<K,V,DoubleSummaryStatistics> summarizingDouble(ToDoubleBiFunction<? super K,? super V> mapper)
      Returns a BiCollector which applies a double-producing mapping function to each input pair, and returns summary statistics for the resulting values.
      Since:
      3.2
    • groupingBy

      public static <R, C, V> BiCollector<C,V,BiStream<R,BiStream<C,V>>> groupingBy(BiFunction<? super C,? super V,? extends R> classifier)
      Groups input pairs by classifier and collects entries belonging to the same group into a nested BiStream. For example, you can break a Map into an ImmutableTable with:
      
       Map<City, Long> cityPopulations = ...;
       ImmutableTable<State, City, Long> stateCityPopulations =
           BiStream.from(cityPopulations)
               .collect(groupingBy((city, population) -> city.getState()))
               .collect(GuavaCollectors.toImmutableTable());
       
      Since:
      6.1
    • groupingBy

      public static <K, V, G, R> BiCollector<K,V,BiStream<G,R>> groupingBy(BiFunction<? super K,? super V,? extends G> classifier, BiCollector<? super K,? super V,R> groupCollector)
      Groups input entries by classifier and collects entries belonging to the same group using groupCollector. For example, the following code splits a phone book by area code:
      
       Multimap<Address, PhoneNumber> phoneBook = ...;
       ImmutableMap<AreaCode, ImmutableSetMultimap<Address, PhoneNumber>> areaPhoneBooks =
           BiStream.from(phoneBook)
               .collect(
                   groupingBy(
                       (addr, phone) -> phone.areaCode(),
                       ImmutableSetMultimap::toImmutableSetMultimap))
               .collect(ImmutableMap::toImmutableMap);
       
      Since:
      3.2
    • groupingBy

      public static <K, V, G, R> BiCollector<K,V,BiStream<G,R>> groupingBy(Function<? super K,? extends G> classifier, Collector<? super V,?,R> groupCollector)
      Groups input entries by classifier and collects values belonging to the same group using groupCollector. For example, the following code collects unique area codes for each state:
      
       Multimap<Address, PhoneNumber> phoneBook = ...;
       ImmutableMap<State, ImmutableSet<AreaCode>> stateAreaCodes =
           BiStream.from(phoneBook)
               .mapValues(PhoneNumber::areaCode)
               .collect(groupingBy(Address::state, toImmutableSet()))
               .collect(ImmutableMap::toImmutableMap);
       
      Since:
      3.2
    • groupingBy

      public static <K, V, G> BiCollector<K,V,BiStream<G,V>> groupingBy(Function<? super K,? extends G> classifier, BinaryOperator<V> groupReducer)
      Groups input pairs by classifier and reduces values belonging to the same group using groupReducer. For example, the following code calculates total household income for each state:
      
       Map<Address, Household> households = ...;
       ImmutableMap<State, Money> stateHouseholdIncomes =
           BiStream.from(households)
               .mapValues(Household::income)
               .collect(groupingBy(Address::state, Money::add))
               .collect(ImmutableMap::toImmutableMap);
       
      Since:
      3.3
    • partitioningBy

      public static <K, V> BiCollector<K,V,Both<BiStream<K,V>,BiStream<K,V>>> partitioningBy(BiPredicate<? super K,? super V> predicate)
      Returns a BiCollector that partitions the incoming pairs into two groups: pairs that match predicate, and those that don't. Each group is stored in its own BiStream.

      For example:

      
       timeSeries
           .collect(partitioningBy((time, event) -> event.isImportant()))
           .andThen((importantEvents, unimportantEvents) -> ...);
       
      Since:
      8.1
    • partitioningBy

      public static <K, V, R> BiCollector<K,V,Both<R,R>> partitioningBy(BiPredicate<? super K,? super V> predicate, BiCollector<? super K,? super V,? extends R> downstream)
      Returns a BiCollector that partitions the incoming pairs into two groups: pairs that match predicate, and those that don't, and uses the downstream collector to collect the pairs in each group.

      For example:

      
       timeSeries
           .collect(partitioningBy((time, event) -> event.isImportant(), toSortedImmutableMap()))
           .andThen((importantEvents, unimportantEvents) -> ...);
       
      Type Parameters:
      K - the input key type
      V - the input value type
      R - the result type of the downstream collector
      Since:
      8.1
    • partitioningBy

      public static <K, V, T, F> BiCollector<K,V,Both<T,F>> partitioningBy(BiPredicate<? super K,? super V> predicate, BiCollector<? super K,? super V,? extends T> ifTrue, BiCollector<? super K,? super V,? extends F> ifFalse)
      Returns a BiCollector that partitions the incoming pairs into two groups: pairs that match predicate, and those that don't, and uses the ifTrue and ifFalse downstream collectors respectively to collect the pairs.

      For example:

      
       timeSeries
           .collect(
               partitioningBy((time, event) -> event.isImportant(), toImmutableMap(), counting()))
           .andThen((importantEvents, unimportantCount) -> ...);
       
      Type Parameters:
      K - the input key type
      V - the input value type
      T - the result type for the pairs that evaluate to true
      F - the result type for the pairs that evaluate to false
      Since:
      8.1
    • collectingAndThen

      public static <K, V, T, R> BiCollector<K,V,R> collectingAndThen(BiCollector<K,V,T> upstream, Function<? super T,? extends R> finisher)
      Returns a BiCollector that maps the result of upstream collector using finisher.
      Since:
      3.2
    • collectingAndThen

      public static <K, V, A, B, R> BiCollector<K,V,R> collectingAndThen(BiCollector<K,V,? extends Both<? extends A,? extends B>> collector, BiFunction<? super A,? super B,? extends R> finisher)
      Returns a BiCollector that maps the result of collector using the finisher BiFunction. Useful when combined with BiCollectors like partitioningBy(BiPredicate).

      For example:

      
       collectingAndThen(
           partitioningBy((time, request) -> isAllowed(time, request), toMap(), counting()),
           (allowedRequests, disallowed) -> ...)
       
      Since:
      8.1
    • collectingAndThen

      public static <K, V, R> BiCollector<K,V,R> collectingAndThen(Function<? super BiStream<K,V>,? extends R> finisher)
      Returns a BiCollector that first collects the input pairs into a BiStream and then applies finisher on the intermediary BiStream.

      This method makes it easier to create a BiCollector using a lambda. For example, you may want to apply some stream operations to every group of pairs when using the groupingBy method:

      
           BiStream.from(phoneBook)
               .collect(
                   groupingBy(
                       (addr, phone) -> phone.areaCode(),
                       collectingAndThen(group -> group.flatMapKeys(...).mapIfPresent(...)...))
               .collect(ImmutableMap::toImmutableMap);
       
      Since:
      5.4
    • mapping

      public static <K, V, T, R> BiCollector<K,V,R> mapping(BiFunction<? super K,? super V,? extends T> mapper, Collector<T,?,R> downstream)
      Returns a BiCollector that first maps the input pair using mapper and then collects the results using downstream collector.
      Since:
      3.2
    • mapping

      public static <K, V, K1, V1, R> BiCollector<K,V,R> mapping(BiFunction<? super K,? super V,? extends K1> keyMapper, BiFunction<? super K,? super V,? extends V1> valueMapper, BiCollector<K1,V1,R> downstream)
      Returns a BiCollector that first maps the input pair using keyMapper and valueMapper respectively, then collects the results using downstream collector.
      Since:
      3.6
    • mapping

      public static <K, V, K1, V1, R> BiCollector<K,V,R> mapping(Function<? super K,? extends K1> keyMapper, Function<? super V,? extends V1> valueMapper, BiCollector<K1,V1,R> downstream)
      Returns a BiCollector that first maps the input pair using keyMapper and valueMapper, then collects the results using the downstream collector.
      Since:
      8.2
    • mapping

      public static <K, V, K1, V1, R> BiCollector<K,V,R> mapping(BiFunction<? super K,? super V,? extends Both<? extends K1,? extends V1>> mapper, BiCollector<K1,V1,R> downstream)
      Returns a BiCollector that first maps the input pair into another pair using mapper, and then collects the results using downstream collector.
      Since:
      5.2
    • flatMapping

      public static <K, V, T, R> BiCollector<K,V,R> flatMapping(BiFunction<? super K,? super V,? extends Stream<? extends T>> flattener, Collector<T,?,R> downstream)
      Returns a BiCollector that first flattens the input pair using flattener and then collects the results using downstream collector.

      For example, you may use several levels of groupingBy() to aggregate metrics along a few dimensions, and then flatten them into a histogram. This could be done using BiStream#flatMapToObj, like:

      
       import static com.google.mu.util.stream.BiStream.groupingBy;
      
         List<HistogramBucket> histogram = events.stream()
             .collect(groupingBy(Event::cell, groupingBy(Event::hour, counting())))
             .flatMapToObj((cell, cellEvents) ->
                 cellEvents.mapToObj((hour, count) ->
                     HistogramBucket.newBuilder()
                         .addDimension(cell)
                         .addDimension(hour)
                         .setCount(count)
                         .build()))
             .collect(toList());
       
      It works. But if you need to do this kind of histogram creation along different dimensions repetitively, the flatMapToObj() + mapToObj() boilerplate becomes tiresome to read and write. Instead, you could use BiCollectors.flatMapping() to encapsulate and reuse the boilerplate:
      
       import static com.google.mu.util.stream.BiStream.groupingBy;
      
         List<HistogramBucket> byCellHourly = events.stream()
             .collect(groupingBy(Event::cell, groupingBy(Event::hour, counting())))
             .collect(toHistogram());
      
         List<HistogramBucket> byUserHourly = events.stream()
             .collect(groupingBy(Event::user, groupingBy(Event::hour, counting())))
             .collect(toHistogram());
      
         private static BiCollector<Object, BiStream<?, Long>, List<HistogramBucket>> toHistogram() {
           return BiCollectors.flatMapping(
               (d1, events) ->
                     events.mapToObj((d2, count) ->
                         HistogramBucket.newBuilder()
                             .addDimension(d1)
                             .addDimension(d2)
                             .setCount(count)
                             .build()),
               toList());
         }
       
      Since:
      3.4
    • flatMapping

      public static <K, V, K1, V1, R> BiCollector<K,V,R> flatMapping(BiFunction<? super K,? super V,? extends BiStream<? extends K1,? extends V1>> flattener, BiCollector<K1,V1,R> downstream)
      Returns a BiCollector that first flattens the input pair using flattener and then collects the result pairs using downstream collector.
      Since:
      3.4
    • inverse

      public static <A, B, R> BiCollector<A,B,R> inverse(BiCollector<B,A,R> downstream)
      Returns a BiCollector that inverts each input pair (a, b) into (b, a) before passing it to the downstream collector.
      Since:
      8.1
    • maxByKey

      public static <K, V> BiCollector<K,V,BiOptional<K,V>> maxByKey(Comparator<? super K> comparator)
      Returns a BiCollector that finds the pair with the maximum key according to comparator.

      Null keys and values are not supported.

      Since:
      6.6
    • minByKey

      public static <K, V> BiCollector<K,V,BiOptional<K,V>> minByKey(Comparator<? super K> comparator)
      Returns a BiCollector that finds the pair with the minimum key according to comparator.

      Null keys and values are not supported.

      Since:
      6.6
    • maxByValue

      public static <K, V> BiCollector<K,V,BiOptional<K,V>> maxByValue(Comparator<? super V> comparator)
      Returns a BiCollector that finds the pair with the maximum value according to comparator.

      Null keys and values are not supported.

      Since:
      6.6
    • minByValue

      public static <K, V> BiCollector<K,V,BiOptional<K,V>> minByValue(Comparator<? super V> comparator)
      Returns a BiCollector that finds the pair with the minimum value according to comparator.

      Null keys and values are not supported.

      Since:
      6.6
    • minBy

      public static <K, V> BiCollector<K,V,BiOptional<K,V>> minBy(Comparator<? super K> keyComparator, Comparator<? super V> valueComparator)
      Returns a BiCollector that finds the minimum pair according to keyComparator and then valueComparator for equal keys.

      Null keys and values are not supported.
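
      The ordering described above (compare keys first, fall back to values only when keys are equal) can be sketched with JDK comparators over Map.Entry. The class and method names are illustrative, and the sketch returns a plain Optional of an entry rather than a BiOptional.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;

class MinPairSketch {
  // Finds the minimum pair by key, breaking ties between equal keys by value.
  static <K, V> Optional<Map.Entry<K, V>> minPair(
      List<Map.Entry<K, V>> pairs,
      Comparator<? super K> keyComparator,
      Comparator<? super V> valueComparator) {
    Comparator<Map.Entry<K, V>> order =
        Map.Entry.<K, V>comparingByKey(keyComparator)
            .thenComparing(Map.Entry.<K, V>comparingByValue(valueComparator));
    return pairs.stream().min(order);
  }

  static Optional<Map.Entry<String, Integer>> sample() {
    List<Map.Entry<String, Integer>> pairs = List.of(
        new SimpleImmutableEntry<String, Integer>("b", 1),
        new SimpleImmutableEntry<String, Integer>("a", 9),
        new SimpleImmutableEntry<String, Integer>("a", 2));
    return minPair(pairs, Comparator.<String>naturalOrder(), Comparator.<Integer>naturalOrder());
  }

  public static void main(String[] args) {
    // Key "a" beats "b"; among the two "a" pairs, value 2 beats 9.
    System.out.println(sample().get()); // a=2
  }
}
```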

      Since:
      6.6
    • maxBy

      public static <K, V> BiCollector<K,V,BiOptional<K,V>> maxBy(Comparator<? super K> keyComparator, Comparator<? super V> valueComparator)
      Returns a BiCollector that finds the maximum pair according to keyComparator and then valueComparator for equal keys.

      Null keys and values are not supported.

      Since:
      6.6