Class Substring.Pattern

java.lang.Object
com.google.mu.util.Substring.Pattern
Direct Known Subclasses:
Substring.Prefix, Substring.Suffix
Enclosing class:
Substring

public abstract static class Substring.Pattern extends Object
A pattern that can be matched against a string, finding a single substring from it.
  • Constructor Details

    • Pattern

      public Pattern()
  • Method Details

    • in

      public final Optional<Substring.Match> in(String string)
      Matches this pattern against string, returning a Match if successful, or empty() otherwise.

      This is useful if you need to call Substring.Match methods, like Substring.Match.remove() or Substring.Match.before(). If you just need the matched substring itself, prefer to use from(java.lang.CharSequence) instead.

    • in

      public final Optional<Substring.Match> in(String string, int fromIndex)
      Matches this pattern against string starting from the character at fromIndex, returning a Match if successful, or empty() otherwise.

      Note that it treats fromIndex as the beginning of the string, so patterns like Substring.prefix(java.lang.String), Substring.BEGINNING will attempt to match from this index.

      Throws:
      IndexOutOfBoundsException - if fromIndex is negative or greater than string.length()
      Since:
      7.0
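
      As a rough plain-JDK analogue of the fromIndex behavior (not this library's implementation), String.startsWith(prefix, offset) likewise treats the offset as the start of the string. The class and method names in this sketch are hypothetical.

```java
final class FromIndexSketch {
  /**
   * Returns true if prefix occurs exactly at fromIndex, that is, when
   * fromIndex is treated as the beginning of the string.
   */
  static boolean prefixMatchesFrom(String string, String prefix, int fromIndex) {
    // Mirror the documented contract: reject indexes outside [0, length].
    if (fromIndex < 0 || fromIndex > string.length()) {
      throw new IndexOutOfBoundsException("fromIndex: " + fromIndex);
    }
    return string.startsWith(prefix, fromIndex);
  }
}
```

      Under this sketch, prefixMatchesFrom("name=joe", "joe", 5) is true, while the same call with index 0 is false.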
    • from

      public final Optional<String> from(CharSequence string)
      Matches this pattern against string, returning the matched substring if successful, or empty() otherwise. pattern.from(str) is equivalent to pattern.in(str).map(Object::toString).

      This is useful if you only need the matched substring itself. Use in(java.lang.String) if you need to call Substring.Match methods, like Substring.Match.remove() or Substring.Match.before().

    • removeFrom

      public final String removeFrom(String string)
      Matches this pattern against string, and returns a copy with the matched substring removed if successful. Otherwise, returns string unchanged.
    • replaceFrom

      public final String replaceFrom(String string, CharSequence replacement)
      Returns a new string with the substring matched by this replaced by replacement. Returns string as-is if a substring is not found.
    • replaceFrom

      public final String replaceFrom(String string, Function<? super Substring.Match,? extends CharSequence> replacementFunction)
      Returns a new string with the substring matched by this replaced by the return value of replacementFunction.

      For example, you can replace a single template placeholder using:

      
       Substring.spanningInOrder("{", "}")
           .replaceFrom(s, placeholder -> replacements.get(placeholder.skip(1, 1).toString()));
       

      Returns string as-is if a substring is not found.

      Since:
      5.6
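
      To make the behavior of the placeholder example concrete, here is a hedged plain-JDK sketch of the same idea using String.indexOf. It is an illustration only, not how the library is implemented; PlaceholderSketch and replaceFirstPlaceholder are hypothetical names, and failing on a missing replacement is an assumption of the sketch.

```java
import java.util.Map;
import java.util.Objects;

final class PlaceholderSketch {
  /**
   * Replaces the first "{name}" span in s with replacements.get("name").
   * Returns s unchanged if no "{" followed by a later "}" is found.
   */
  static String replaceFirstPlaceholder(String s, Map<String, String> replacements) {
    int open = s.indexOf('{');
    if (open < 0) {
      return s;
    }
    int close = s.indexOf('}', open + 1);
    if (close < 0) {
      return s;
    }
    // Drop one character from each end of the span, like skip(1, 1).
    String name = s.substring(open + 1, close);
    String replacement =
        Objects.requireNonNull(replacements.get(name), "no replacement for " + name);
    return s.substring(0, open) + replacement + s.substring(close + 1);
  }
}
```

      Only the first placeholder is replaced; later ones are left untouched.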
    • toEnd

      public final Substring.Pattern toEnd()
      Returns a Pattern that will match from the substring matched by this to the end of the input string. For example:
         String line = "return foo; // some comment...";
         String commentRemoved = first("//").toEnd().removeFrom(line).trim();
         assertThat(commentRemoved).isEqualTo("return foo;");
       

      To match from the beginning of the input string to the end of a pattern, use Substring.upToIncluding(com.google.mu.util.Substring.Pattern) instead.

    • or

      public final Substring.Pattern or(Substring.Pattern that)
      Returns a Pattern that falls back to using that if this fails to match.
    • limit

      public Substring.Pattern limit(int maxChars)
      Returns a Pattern that's equivalent to this pattern except that it matches at most maxChars characters.
      Since:
      6.1
    • skip

      public Substring.Pattern skip(int fromBeginning, int fromEnd)
      Returns a Pattern that's equivalent to this pattern except it will skip up to fromBeginning characters from the beginning of the match and up to fromEnd characters from the end of the match.

      If the match includes fewer characters, an empty match is returned.

      Since:
      6.1
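
      The clamping described above can be sketched on a plain String. This is an assumption-laden illustration of the documented behavior, not the library's code; SkipSketch and its skip method are hypothetical and operate on the matched text only.

```java
final class SkipSketch {
  /**
   * Drops up to fromBeginning leading and up to fromEnd trailing characters
   * from match. Yields an empty string if match is too short.
   */
  static String skip(String match, int fromBeginning, int fromEnd) {
    // Never move the start past the end of the match.
    int start = Math.min(fromBeginning, match.length());
    // Never move the end before the (already advanced) start.
    int end = Math.max(start, match.length() - fromEnd);
    return match.substring(start, end);
  }
}
```

      For instance, skip("{name}", 1, 1) strips the surrounding braces, and skip("ab", 1, 5) produces an empty string.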
    • then

      public final Substring.Pattern then(Substring.Pattern following)
      Similar to regex lookahead, returns a pattern that matches the following pattern after it has matched this pattern. For example first('/').then(first('/')) finds the second '/' character.
      Since:
      5.7
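
      The "second '/'" example can be pictured with two chained indexOf calls. This plain-JDK sketch only mirrors the described behavior; ThenSketch and indexOfThen are hypothetical names.

```java
final class ThenSketch {
  /**
   * Returns the index of the first occurrence of second that comes after
   * the first occurrence of first, or -1 if either is missing.
   */
  static int indexOfThen(String s, char first, char second) {
    int firstIndex = s.indexOf(first);
    if (firstIndex < 0) {
      return -1;
    }
    // Continue the search right after the first match.
    return s.indexOf(second, firstIndex + 1);
  }
}
```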
    • peek

      public Substring.Pattern peek(Substring.Pattern following)
      Returns a Pattern equivalent to this Pattern, except it will fail to match if the following pattern can't find a match in the substring after the current match.

      Useful in asserting that the current match is followed by the expected pattern. For example: SCHEME_NAME.peek(prefix(':')) returns the URI scheme name.

      Note that unlike regex lookahead, no backtracking is attempted. So first("foo").peek(prefix("bar")) will match "bafoobar" but won't match "foofoobar" because the first "foo" isn't followed by "bar".

      If backtracking lookahead is needed, you can use followedBy(java.lang.String) as in first("foo").followedBy("bar").

      If you are trying to define a boundary around or after your pattern similar to regex anchor '\b', consider using separatedBy(com.google.mu.util.CharPredicate) if the boundary can be detected by a character.

      Since:
      6.0
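
      The difference between checking only the first match and a backtracking lookahead can be sketched with the JDK alone. The code below uses String.indexOf and java.util.regex as stand-ins; it is not the library's implementation, and all names in it are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PeekSketch {
  /** No backtracking: only the first occurrence of target is considered. */
  static int firstThenCheck(String s, String target, String next) {
    int index = s.indexOf(target);
    if (index < 0) {
      return -1;
    }
    return s.startsWith(next, index + target.length()) ? index : -1;
  }

  /** Backtracking: keeps scanning until a target followed by next is found. */
  static int findFollowedBy(String s, String target, String next) {
    Pattern regex =
        Pattern.compile(Pattern.quote(target) + "(?=" + Pattern.quote(next) + ")");
    Matcher matcher = regex.matcher(s);
    return matcher.find() ? matcher.start() : -1;
  }
}
```

      On "foofoobar", firstThenCheck gives up after the first "foo", whereas findFollowedBy moves on and finds the second one.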
    • separatedBy

      public final Substring.Pattern separatedBy(CharPredicate separator)
      Returns an otherwise equivalent Pattern, except it only matches if it's next to the beginning of the string, the end of the string, or the separator character(s).

      Useful if you are trying to find a word with custom boundaries. To search for words composed of regex \w character class, consider using Substring.word() instead.

      For lookahead and lookbehind assertions, consider using immediatelyBetween(java.lang.String, java.lang.String) or followedBy(java.lang.String) instead.

      Since:
      6.2
    • separatedBy

      public Substring.Pattern separatedBy(CharPredicate separatorBefore, CharPredicate separatorAfter)
      Returns an otherwise equivalent Pattern, except it requires that the beginning of the match either be the beginning of the string or be immediately preceded by a separatorBefore character, and that the end of the match either be the end of the string or be immediately followed by a separatorAfter character.

      Useful if you are trying to find a word with custom boundaries. To search for words composed of regex \w character class, consider using Substring.word() instead.

      For lookahead and lookbehind assertions, consider using immediatelyBetween(java.lang.String, java.lang.String) or followedBy(java.lang.String) instead.

      Since:
      6.2
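
      The boundary condition itself can be written as a small predicate check. This sketch only tests one candidate span and says nothing about how the library searches for candidates; SeparatedBySketch and isSeparated are hypothetical, with java.util.function.IntPredicate standing in for CharPredicate.

```java
import java.util.function.IntPredicate;

final class SeparatedBySketch {
  /**
   * Returns true if the span [start, end) of s is bounded on the left by the
   * start of the string or a character accepted by before, and on the right
   * by the end of the string or a character accepted by after.
   */
  static boolean isSeparated(
      String s, int start, int end, IntPredicate before, IntPredicate after) {
    boolean leftOk = start == 0 || before.test(s.charAt(start - 1));
    boolean rightOk = end == s.length() || after.test(s.charAt(end));
    return leftOk && rightOk;
  }
}
```

      With before rejecting digits and after accepting anything, the span "911" passes in "call 911 now" but fails in "1911".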
    • immediatelyBetween

      public final Substring.Pattern immediatelyBetween(String lookbehind, String lookahead)
      Returns an otherwise equivalent pattern except it requires the matched substring be immediately preceded by the lookbehind string and immediately followed by the lookahead string.

      Similar to regex lookarounds, the returned pattern will backtrack until the lookaround is satisfied. That is, word().immediatelyBetween("(", ")") will find the "bar" substring inside the parentheses from "foo (bar)".

      If you need lookahead only, use followedBy(java.lang.String) instead; for lookbehind only, pass an empty string as the lookahead string, as in: word().immediatelyBetween(":", "").

      Since:
      6.2
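
      For readers who think in regex terms, the "foo (bar)" example corresponds roughly to a word wrapped in a lookbehind and a lookahead. The following java.util.regex sketch is an analogue under that assumption, not the library's implementation; the names are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ImmediatelyBetweenSketch {
  /** Finds the first whole word directly between lookbehind and lookahead. */
  static Optional<String> wordBetween(String s, String lookbehind, String lookahead) {
    Pattern regex =
        Pattern.compile(
            "(?<=" + Pattern.quote(lookbehind) + ")"
                + "\\b\\w+\\b"
                + "(?=" + Pattern.quote(lookahead) + ")");
    Matcher matcher = regex.matcher(s);
    // The lookarounds are assertions, so group() is the word alone.
    return matcher.find() ? Optional.of(matcher.group()) : Optional.empty();
  }
}
```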
    • immediatelyBetween

      public final Substring.Pattern immediatelyBetween(String lookbehind, Substring.BoundStyle lookbehindBound, String lookahead, Substring.BoundStyle lookaheadBound)
      Similar to immediatelyBetween(String, String), but allows including the lookbehind and/or lookahead in the match.

      For example, to split around all "{placeholder_name}", you can use:

      
       PLACEHOLDER_NAME_PATTERN.immediatelyBetween("{", INCLUSIVE, "}", INCLUSIVE)
           .split(input);
       
      Since:
      7.0
    • notImmediatelyBetween

      public final Substring.Pattern notImmediatelyBetween(String lookbehind, String lookahead)
      Returns an otherwise equivalent pattern except it requires the matched substring not be immediately preceded by the lookbehind string and immediately followed by the lookahead string.

      Similar to regex negative lookarounds, the returned pattern will backtrack until the negative lookaround is satisfied. That is, word().notImmediatelyBetween("(", ")") will find the "bar" substring from "(foo) bar".

      If you need negative lookahead only, use notFollowedBy(java.lang.String) instead; for negative lookbehind only, pass an empty string as the lookahead string, as in: word().notImmediatelyBetween(":", "").

      If the pattern shouldn't be preceded or followed by particular character(s), consider using separatedBy(com.google.mu.util.CharPredicate). The following code finds "911" but only if it's at the beginning of a number:

      
       Substring.Pattern emergency =
           first("911").separatedBy(CharPredicate.range('0', '9').not(), CharPredicate.ANY);
       
      Since:
      6.2
    • followedBy

      public final Substring.Pattern followedBy(String lookahead)
      Returns an otherwise equivalent pattern except it requires the matched substring be immediately followed by the lookahead string.

      Similar to regex lookahead, the returned pattern will backtrack until the lookahead is satisfied. That is, word().followedBy(":") will find the "Joe" substring from "To Joe:".

      If you need lookbehind, or both lookahead and lookbehind, use immediatelyBetween(java.lang.String, java.lang.String) instead.

      Since:
      6.2
    • notFollowedBy

      public final Substring.Pattern notFollowedBy(String lookahead)
      Returns an otherwise equivalent pattern except it requires the matched substring not be immediately followed by the lookahead string.

      Similar to regex negative lookahead, the returned pattern will backtrack until the negative lookahead is satisfied. That is, word().notFollowedBy(" ") will find the "Joe" substring from "To Joe:".

      If you need negative lookbehind, or both negative lookahead and lookbehind, use notImmediatelyBetween(java.lang.String, java.lang.String) instead.

      If the pattern shouldn't be followed by particular character(s), consider using separatedBy(com.google.mu.util.CharPredicate). The following code finds the file extension name ".java" if it's not followed by another letter:

      
       CharPredicate letter = Character::isJavaIdentifierStart;
       Substring.Pattern javaExtension =
           first(".java").separatedBy(CharPredicate.ANY, letter.not());
       
      Since:
      6.2
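
      The "To Joe:" example maps onto a regex negative lookahead. The sketch below is a java.util.regex analogue offered for intuition only; NotFollowedBySketch and wordNotFollowedBy are hypothetical names.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NotFollowedBySketch {
  /** Finds the first whole word that is not directly followed by lookahead. */
  static Optional<String> wordNotFollowedBy(String s, String lookahead) {
    Pattern regex =
        Pattern.compile("\\b\\w+\\b(?!" + Pattern.quote(lookahead) + ")");
    Matcher matcher = regex.matcher(s);
    return matcher.find() ? Optional.of(matcher.group()) : Optional.empty();
  }
}
```

      "To" is skipped because a space follows it, so the search continues to "Joe".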
    • precededBy

      public final Substring.Pattern precededBy(String lookbehind)
      Returns an otherwise equivalent pattern except it requires the matched substring be immediately preceded by the lookbehind string.

      Similar to regex lookbehind, the returned pattern will backtrack until the lookbehind is satisfied. That is, word().precededBy(": ") will find the "Please" substring from "Amy: Please come in".

      Since:
      6.2
    • notPrecededBy

      public final Substring.Pattern notPrecededBy(String lookbehind)
      Returns an otherwise equivalent pattern except it requires the matched substring not be immediately preceded by the lookbehind string.

      Similar to regex negative lookbehind, the returned pattern will backtrack until the negative lookbehind is satisfied. For example, word().notPrecededBy("(") will find the "bar" substring from "(foo+bar)".

      Since:
      6.2
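
      In regex terms the "(foo+bar)" example behaves like a negative lookbehind in front of a whole word. This is a java.util.regex analogue for illustration, not the library's implementation, and the names are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NotPrecededBySketch {
  /** Finds the first whole word that is not directly preceded by lookbehind. */
  static Optional<String> wordNotPrecededBy(String s, String lookbehind) {
    Pattern regex =
        Pattern.compile("(?<!" + Pattern.quote(lookbehind) + ")\\b\\w+\\b");
    Matcher matcher = regex.matcher(s);
    return matcher.find() ? Optional.of(matcher.group()) : Optional.empty();
  }
}
```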
    • split

      public final BiOptional<String,String> split(CharSequence string)
      Splits string into two parts that are separated by this separator pattern. For example:
      
       Optional<KeyValue> keyValue = first('=').split("name=joe").join(KeyValue::new);
       

      If you need to trim the key-value pairs, use splitThenTrim(java.lang.CharSequence).

      To split a string into multiple substrings delimited by a delimiter, use repeatedly().

      Since:
      5.0
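
      Splitting into exactly two parts around the first separator can be sketched with indexOf. The code is a plain-JDK illustration of the described result, with Optional and Map.Entry standing in for BiOptional; SplitSketch and splitAroundFirst are hypothetical names.

```java
import java.util.Map;
import java.util.Optional;

final class SplitSketch {
  /**
   * Splits s around the first occurrence of separator. Returns empty if the
   * separator does not occur.
   */
  static Optional<Map.Entry<String, String>> splitAroundFirst(String s, char separator) {
    int index = s.indexOf(separator);
    if (index < 0) {
      return Optional.empty();
    }
    // Everything before the separator is the key, everything after is the value.
    return Optional.of(Map.entry(s.substring(0, index), s.substring(index + 1)));
  }
}
```

      Later separators stay in the second part, so "a=b=c" splits into "a" and "b=c".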
    • splitThenTrim

      public final BiOptional<String,String> splitThenTrim(CharSequence string)
      Splits string into two parts that are separated by this separator pattern, with leading and trailing whitespace trimmed. For example:
      
       Optional<KeyValue> keyValue = first('=').splitThenTrim("name = joe ").join(KeyValue::new);
       

      If you are trying to parse a string to a key-value data structure (Map, Multimap etc.), you can use com.google.common.base.Splitter.MapSplitter though it's limited to Map and doesn't allow duplicate keys:

      
       String toSplit = " x -> y, z-> a ";
       Map<String, String> result = Splitter.on(',')
           .trimResults()
           .withKeyValueSeparator(Splitter.on("->"))
           .split(toSplit);
       
      Alternatively, you can use Substring to allow duplicate keys and to split into multimaps or other types:
      
       import static com.google.mu.util.stream.MoreCollectors.mapping;
      
       String toSplit = " x -> y, z-> a, x -> t ";
       ImmutableListMultimap<String, String> result = first(',')
           .repeatedly()
           .split(toSplit)
           .map(first("->")::splitThenTrim)
           .collect(
               mapping(
                   kv -> kv.orElseThrow(...),
                   ImmutableListMultimap::toImmutableListMultimap));
       

      To split a string into multiple substrings delimited by a delimiter, use repeatedly().

      Since:
      5.0
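
      For comparison, the duplicate-key parsing above can also be approximated with nothing but the JDK. This sketch collects into a Map of lists rather than a Guava multimap, and throwing on a missing "->" is an assumption of the sketch; KeyValuesSketch and parse are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class KeyValuesSketch {
  /** Parses "k -> v, k2 -> v2" pairs, keeping every value of a repeated key. */
  static Map<String, List<String>> parse(String input) {
    Map<String, List<String>> result = new LinkedHashMap<>();
    for (String pair : input.split(",")) {
      int arrow = pair.indexOf("->");
      if (arrow < 0) {
        throw new IllegalArgumentException("Missing '->' in: " + pair);
      }
      // Trim both sides, as splitThenTrim is documented to do.
      String key = pair.substring(0, arrow).trim();
      String value = pair.substring(arrow + 2).trim();
      result.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }
    return result;
  }
}
```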
    • repeatedly

      public Substring.RepeatingPattern repeatedly()
      Returns a Substring.RepeatingPattern that applies this pattern repeatedly against the input string. That is, after each iteration, the pattern is applied again over the substring after the match, repeatedly until no match is found.
      Since:
      5.2
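
      The iteration described above, where each search resumes after the previous match, can be sketched with an indexOf loop. This is a plain-JDK illustration and not the library's implementation; RepeatedlySketch and allIndexesOf are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class RepeatedlySketch {
  /** Returns the start index of every non-overlapping occurrence of target. */
  static List<Integer> allIndexesOf(String s, String target) {
    if (target.isEmpty()) {
      throw new IllegalArgumentException("target must not be empty");
    }
    List<Integer> indexes = new ArrayList<>();
    // After each match, search again in the substring that follows it.
    for (int i = s.indexOf(target); i >= 0; i = s.indexOf(target, i + target.length())) {
      indexes.add(i);
    }
    return indexes;
  }
}
```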
    • toString

      public String toString()
      Do not depend on the string representation of Substring, except for subtypes Substring.Prefix and Substring.Suffix that have an explicitly defined representation.
      Overrides:
      toString in class Object