Class Substring.RepeatingPattern

java.lang.Object
com.google.mu.util.Substring.RepeatingPattern
Enclosing class:
Substring

public abstract static class Substring.RepeatingPattern extends Object
A substring pattern to be applied repeatedly on the input string, each time over the remaining substring after the previous match.
Since:
5.2
  • Method Details

    • match

      public abstract Stream<Substring.Match> match(String input, int fromIndex)
      Applies this pattern against input, starting from fromIndex, and returns a stream of the matches, one per iteration.

      Iterations happen in strict character encounter order, from the beginning of the input string to the end, with no overlapping. When a match is found, the next iteration is guaranteed to be in the substring after the current match. For example, between(first('/'), first('/')).repeatedly().match("/foo/bar/baz/") will return ["foo", "bar", "baz"]. On the other hand, after(last('/')).repeatedly().match("/foo/bar") will only return "bar".

      Pattern matching is lazy and doesn't start until the returned stream is consumed.

      An empty stream is returned if this pattern has no matches in the input string.
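
      The non-overlapping, left-to-right iteration described above can be illustrated with a minimal plain-Java sketch. It does not use the library: the class RepeatedBetweenSketch and the method repeatedlyBetween are hypothetical names, the scan is hard-wired to the "between two delimiters" case, and it collects matches eagerly into a list, whereas the stream returned by match is lazy.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of com.google.mu.util.
class RepeatedBetweenSketch {
  /** Collects the text between successive delimiters, scanning left to right from fromIndex. */
  public static List<String> repeatedlyBetween(String input, char delimiter, int fromIndex) {
    if (fromIndex < 0 || fromIndex > input.length()) {
      throw new IndexOutOfBoundsException("fromIndex: " + fromIndex);
    }
    List<String> matches = new ArrayList<>();
    int cursor = fromIndex;
    while (true) {
      int open = input.indexOf(delimiter, cursor);
      if (open < 0) break;
      int close = input.indexOf(delimiter, open + 1);
      if (close < 0) break;
      matches.add(input.substring(open + 1, close));
      // Resume right after the matched text. The closing delimiter is not part
      // of the match, so it can open the next iteration.
      cursor = close;
    }
    return matches;
  }

  public static void main(String[] args) {
    System.out.println(repeatedlyBetween("/foo/bar/baz/", '/', 0)); // prints [foo, bar, baz]
    System.out.println(repeatedlyBetween("/foo/bar/baz/", '/', 5)); // prints [baz]
  }
}
```

      Each iteration only ever looks at the text after the previous match, which is why no two matches in the sketch can overlap.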

      Throws:
      IndexOutOfBoundsException - if fromIndex is negative or greater than input.length()
      Since:
      8.2
    • match

      public final Stream<Substring.Match> match(String input)
      Applies this pattern against input and returns a stream of the matches, one per iteration.

      Iterations happen in strict character encounter order, from the beginning of the input string to the end, with no overlapping. When a match is found, the next iteration is guaranteed to be in the substring after the current match. For example, between(first('/'), first('/')).repeatedly().match("/foo/bar/baz/") will return ["foo", "bar", "baz"]. On the other hand, after(last('/')).repeatedly().match("/foo/bar") will only return "bar".

      Pattern matching is lazy and doesn't start until the returned stream is consumed.

      An empty stream is returned if this pattern has no matches in the input string.

    • from

      public Stream<String> from(CharSequence input)
      Applies this pattern against input and returns a stream of the matched substrings, one per iteration.

      Iterations happen in strict character encounter order, from the beginning of the input string to the end, with no overlapping. When a match is found, the next iteration is guaranteed to be in the substring after the current match. For example, between(first('/'), first('/')).repeatedly().from("/foo/bar/baz/") will return ["foo", "bar", "baz"]. On the other hand, after(last('/')).repeatedly().from("/foo/bar") will only return "bar".

      Pattern matching is lazy and doesn't start until the returned stream is consumed.

      An empty stream is returned if this pattern has no matches in the input string.

    • removeAllFrom

      public String removeAllFrom(String string)
      Returns a new string with all matches of this pattern removed. Returns string as is if no match is found.
    • replaceAllFrom

      public String replaceAllFrom(String string, Function<? super Substring.Match,? extends CharSequence> replacementFunction)
      Returns a new string with all matches of this pattern replaced by applying replacementFunction for each match.

      replacementFunction must not return null. Returns string as-is if no match is found.

    • split

      public Stream<Substring.Match> split(String string)
      Returns a stream of Match objects delimited by every match of this pattern. If this pattern isn't found in string, the full string is matched.

      The returned Match objects are cheap "views" of the matched substring sequences. Because Match implements CharSequence, the returned Match objects can be directly passed to CharSequence-accepting APIs such as com.google.common.base.CharMatcher.trimFrom() and Substring.Pattern.splitThenTrim(java.lang.CharSequence) etc.

    • splitThenTrim

      public Stream<Substring.Match> splitThenTrim(String string)
      Returns a stream of Match objects delimited by every match of this pattern, with whitespaces trimmed.

      The returned Match objects are cheap "views" of the matched substring sequences. Because Match implements CharSequence, the returned Match objects can be directly passed to CharSequence-accepting APIs such as com.google.common.base.CharMatcher.trimFrom() and Substring.Pattern.split(java.lang.CharSequence) etc.

    • cut

      public Stream<Substring.Match> cut(String string)
      Returns a stream of Match objects from the input string as demarcated by this delimiter pattern. It's similar to split(java.lang.String) but includes both the substrings split by the delimiters and the delimiter substrings themselves, interpolated in the order they appear in the input string.

      For example,

      
       spanningInOrder("{", "}").repeatedly().cut("Dear {user}: please {act}.")
       
      will result in the stream of ["Dear ", "{user}", ": please ", "{act}", "."].

      The returned Match objects are cheap "views" of the matched substring sequences. Because Match implements CharSequence, the returned Match objects can be directly passed to CharSequence-accepting APIs such as Guava CharMatcher.trimFrom, Substring.Pattern.splitThenTrim(java.lang.CharSequence), etc.
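
      To make the interleaving concrete, here is a minimal plain-Java sketch of the same idea built on String.indexOf rather than on the library. The class CutSketch and its cut method are hypothetical names, the delimiter is fixed to an "open ... close" span, and emitting an empty piece when the input starts with a delimiter is an assumption of this sketch, not a statement about the library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of com.google.mu.util.
class CutSketch {
  /** Returns the pieces between delimiter spans, interleaved with the spans themselves. */
  public static List<String> cut(String input, String open, String close) {
    if (open.isEmpty() || close.isEmpty()) {
      throw new IllegalArgumentException("open and close must not be empty");
    }
    List<String> pieces = new ArrayList<>();
    int cursor = 0;
    while (true) {
      int start = input.indexOf(open, cursor);
      if (start < 0) break;
      int closeAt = input.indexOf(close, start + open.length());
      if (closeAt < 0) break;
      int end = closeAt + close.length();
      pieces.add(input.substring(cursor, start)); // text before the delimiter span
      pieces.add(input.substring(start, end));    // the delimiter span itself
      cursor = end;
    }
    pieces.add(input.substring(cursor)); // text after the last delimiter span
    return pieces;
  }

  public static void main(String[] args) {
    // prints [Dear , {user}, : please , {act}, .]
    System.out.println(cut("Dear {user}: please {act}.", "{", "}"));
  }
}
```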

      Since:
      7.1
    • splitKeyValuesAround

      public final BiStream<String,String> splitKeyValuesAround(Substring.Pattern keyValueSeparator, String input)
      Returns a BiStream of key value pairs from input.

      The key-value pairs are delimited by this repeating pattern, with the key and value separated by keyValueSeparator.

      Empty parts (including leading and trailing separator) are ignored, although whitespaces are not trimmed. For example:

      
       first(',')
           .repeatedly()
           .splitKeyValuesAround(first('='), "k1=v1,,k2=v2,")
       
      will result in a BiStream equivalent to [(k1, v1), (k2, v2)], but "k1=v1, ,k2=v2" will fail to be split due to the whitespace after the first ','.

      Non-empty parts where keyValueSeparator is absent will result in IllegalArgumentException.
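
      The rules above (skip empty parts, do not trim, reject a non-empty part that lacks the separator) can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the library's implementation: KeyValueSplitSketch and splitKeyValues are hypothetical names, both delimiters are single characters, and the pairs are collected eagerly into a list instead of a BiStream.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of com.google.mu.util.
class KeyValueSplitSketch {
  public static List<Map.Entry<String, String>> splitKeyValues(
      String input, char pairDelimiter, char keyValueSeparator) {
    List<Map.Entry<String, String>> pairs = new ArrayList<>();
    int start = 0;
    while (start <= input.length()) {
      int end = input.indexOf(pairDelimiter, start);
      if (end < 0) end = input.length();
      String part = input.substring(start, end);
      if (!part.isEmpty()) { // empty parts are ignored; whitespace-only parts are not
        int sep = part.indexOf(keyValueSeparator);
        if (sep < 0) {
          throw new IllegalArgumentException(
              "Cannot find '" + keyValueSeparator + "' in '" + part + "'");
        }
        pairs.add(new SimpleImmutableEntry<>(part.substring(0, sep), part.substring(sep + 1)));
      }
      start = end + 1;
    }
    return pairs;
  }

  public static void main(String[] args) {
    System.out.println(splitKeyValues("k1=v1,,k2=v2,", ',', '=')); // prints [k1=v1, k2=v2]
  }
}
```

      In this sketch, passing "k1=v1, ,k2=v2" throws IllegalArgumentException because the part " " is non-empty yet contains no '='.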

      For alternative splitting strategies, for example if you want to reject empty parts instead of ignoring them, consider using split(java.lang.String) and Substring.Pattern.split(java.lang.CharSequence) directly, such as:

      
       first(',')
           .repeatedly()
           .split("k1=v1,,k2=v2,")  // the redundant ',' will throw IAE
           .collect(
               GuavaCollectors.toImmutableMap(
                   m -> first('=').split(m).orElseThrow(...)));
       
      Or, if you want to ignore unparsable parts:
      
       first(',')
           .repeatedly()
           .split("k1=v1,k2>v2")  // Ignore the unknown "k2>v2"
           .map(first('=')::split)
           .collect(
               MoreCollectors.flatMapping(
                   BiOptional::stream,
                   toImmutableMap()));
       
      Since:
      5.9
    • splitThenTrimKeyValuesAround

      public final BiStream<String,String> splitThenTrimKeyValuesAround(Substring.Pattern keyValueSeparator, String input)
      Returns a BiStream of key value pairs from input.

      The key-value pairs are delimited by this repeating pattern, with the key and value separated by keyValueSeparator.

      All keys and values are trimmed, with empty parts (including leading and trailing separator) ignored. For example:

      
       first(',')
           .repeatedly()
           .splitThenTrimKeyValuesAround(first('='), "k1 = v1, , k2=v2,")
       
      will result in a BiStream equivalent to [(k1, v1), (k2, v2)].

      Non-empty parts where keyValueSeparator is absent will result in IllegalArgumentException.

      For alternative splitting strategies, for example if you want to reject empty parts instead of ignoring them, consider using split(java.lang.String) and Substring.Pattern.splitThenTrim(java.lang.CharSequence) directly, such as:

      
       first(',')
           .repeatedly()
           .split("k1 = v1, , k2=v2,")  // the redundant ',' will throw IAE
           .collect(
               GuavaCollectors.toImmutableMap(
                   m -> first('=').splitThenTrim(m).orElseThrow(...)));
       
      Or, if you want to ignore unparsable parts:
      
       first(',')
           .repeatedly()
           .split("k1 = v1, k2 > v2")  // Ignore the unknown "k2 > v2"
           .map(first('=')::splitThenTrim)
           .collect(
               MoreCollectors.flatMapping(
                   BiOptional::stream,
                   toImmutableMap()));
       
      Since:
      5.9
    • alternationFrom

      public final BiStream<String,String> alternationFrom(String input)
      Returns the alternation of this pattern from the input string, with the matched substring alternated with the trailing substring before the next match.

      For example, to find bulleted items (strings prefixed by 1:, 2:, 456:, etc.), you can write:

      
       Substring.Pattern bulletNumber = consecutive(CharPredicate.range('0', '9'))
           .separatedBy(CharPredicate.WORD.not(), CharPredicate.is(':'));
       Map<Integer, String> bulleted = bulletNumber.repeatedly()
           .alternationFrom("1: go home;2: feed 2 cats 3: sleep tight.")
           .mapKeys(n -> Integer.parseInt(n))
           .mapValues(withColon -> prefix(":").removeFrom(withColon.toString()).trim())
           .toMap();
           // => [{1, "go home;"}, {2, "feed 2 cats"}, {3, "sleep tight."}]
       
      Since:
      6.1
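
      The alternation in the bulleted-items example can be approximated with java.util.regex, which may help in seeing how each match is paired with the text that trails it. This is a sketch, not the library's implementation: AlternationSketch and bulleted are hypothetical names, and the lookbehind [A-Za-z0-9_] is an assumed stand-in for the word-character boundary used in that example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of com.google.mu.util.
class AlternationSketch {
  // Digits not preceded by a word character and immediately followed by ':'.
  private static final Pattern BULLET_NUMBER =
      Pattern.compile("(?<![A-Za-z0-9_])[0-9]+(?=:)");

  /** Pairs each bullet number with the text between it and the next bullet number. */
  public static Map<Integer, String> bulleted(String input) {
    Map<Integer, String> items = new LinkedHashMap<>();
    Matcher matcher = BULLET_NUMBER.matcher(input);
    Integer number = null;
    int trailingStart = 0;
    while (matcher.find()) {
      if (number != null) {
        // The previous match's trailing text ends where this match begins.
        items.put(number, clean(input.substring(trailingStart, matcher.start())));
      }
      number = Integer.parseInt(matcher.group());
      trailingStart = matcher.end();
    }
    if (number != null) {
      items.put(number, clean(input.substring(trailingStart)));
    }
    return items;
  }

  // Drops the leading ':' and surrounding whitespace from the trailing text.
  private static String clean(String withColon) {
    String text = withColon.startsWith(":") ? withColon.substring(1) : withColon;
    return text.trim();
  }

  public static void main(String[] args) {
    // prints {1=go home;, 2=feed 2 cats, 3=sleep tight.}
    System.out.println(bulleted("1: go home;2: feed 2 cats 3: sleep tight."));
  }
}
```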