Class Substring
A matched substring can be extracted, removed, replaced, or used to divide the input string into parts.
For example, to strip off the "http://" prefix from a uri string if present:
static String stripHttp(String uri) {
return Substring.prefix("http://").removeFrom(uri);
}
To strip off either an "http://" or "https://" prefix if present:
import static com.google.mu.util.Substring.prefix;
static String stripHttpOrHttps(String uri) {
return prefix("http://").or(prefix("https://")).removeFrom(uri);
}
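The fallback behavior of or() in the example above can be pictured with plain JDK calls. The following is a minimal sketch, not the library's implementation; StripPrefixSketch and stripAnyPrefix are hypothetical names introduced only for illustration.

```java
// Plain-JDK sketch of "try each prefix in order, remove the first one that matches".
// StripPrefixSketch and stripAnyPrefix are hypothetical names, not part of Substring.
final class StripPrefixSketch {
  // Removes the first candidate prefix that the input starts with.
  // Returns the input unchanged if none of the candidates match.
  static String stripAnyPrefix(String input, String... prefixes) {
    for (String prefix : prefixes) {
      if (input.startsWith(prefix)) {
        return input.substring(prefix.length());
      }
    }
    return input;
  }

  public static void main(String[] args) {
    System.out.println(stripAnyPrefix("https://example.com", "http://", "https://"));
    System.out.println(stripAnyPrefix("ftp://example.com", "http://", "https://"));
  }
}
```

The first call prints example.com; the second prints the input unchanged because neither prefix is present.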
To strip off a suffix starting with a dash (-) character:
static String stripDashSuffix(String str) {
return last('-').toEnd().removeFrom(str);
}
To replace a trailing "//" with "/":
static String fixTrailingSlash(String str) {
return Substring.suffix("//").replaceFrom(str, "/");
}
To extract the 'name' and 'value' from an input string in the format of "name:value":
Substring.first(':')
.split("name:joe")
.map(NameValue::new)
.orElseThrow(BadFormatException::new);
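To make the splitting step concrete, here is a rough plain-JDK sketch of dividing a string around the first occurrence of a delimiter. It only illustrates the idea under the assumption that the delimiter itself is dropped from both parts; SplitSketch and splitAtFirst are hypothetical names, and the library's own return type is not reproduced here.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;
import java.util.Optional;

// Plain-JDK sketch; SplitSketch and splitAtFirst are hypothetical names.
final class SplitSketch {
  // Splits the input around the first occurrence of the delimiter.
  // Returns empty if the delimiter is absent.
  static Optional<Map.Entry<String, String>> splitAtFirst(char delimiter, String input) {
    int index = input.indexOf(delimiter);
    if (index < 0) {
      return Optional.empty();
    }
    Map.Entry<String, String> parts =
        new SimpleEntry<>(input.substring(0, index), input.substring(index + 1));
    return Optional.of(parts);
  }

  public static void main(String[] args) {
    Map.Entry<String, String> parts =
        splitAtFirst(':', "name:joe").orElseThrow(IllegalArgumentException::new);
    System.out.println(parts.getKey() + " / " + parts.getValue());
  }
}
```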
To parse key-value pairs:
import static com.google.mu.util.stream.GuavaCollectors.toImmutableListMultimap;
ImmutableListMultimap<String, String> tags =
all(',')
.splitThenTrimKeyValuesAround(first('='), "k1=v1, k2=v2") // => [(k1, v1), (k2, v2)]
.collect(toImmutableListMultimap());
To replace the placeholders in a text with values (but consider using a proper templating
framework instead, because substituting values that come from untrusted sources such as user
input is a security vulnerability):
ImmutableMap<String, String> variables =
ImmutableMap.of("who", "Arya Stark", "where", "Braavos");
String rendered =
spanningInOrder("{", "}")
.repeatedly()
.replaceAllFrom(
"{who} went to {where}.",
placeholder -> variables.get(placeholder.skip(1, 1).toString()));
assertThat(rendered).isEqualTo("Arya Stark went to Braavos.");
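For comparison, the same substitution can be sketched with java.util.regex alone. This is an illustrative equivalent under stated assumptions, not how the library works internally: PlaceholderSketch and render are hypothetical names, and throwing on an unknown placeholder is a choice made by this sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Plain-JDK sketch; PlaceholderSketch and render are hypothetical names.
final class PlaceholderSketch {
  // Matches "{name}" and captures the name without the braces.
  private static final Pattern PLACEHOLDER = Pattern.compile("\\{([^{}]*)\\}");

  static String render(String template, Map<String, String> variables) {
    Matcher matcher = PLACEHOLDER.matcher(template);
    StringBuffer out = new StringBuffer();
    while (matcher.find()) {
      String value = variables.get(matcher.group(1));
      if (value == null) {
        // Assumption of this sketch: unknown placeholders are an error.
        throw new IllegalArgumentException("No value for " + matcher.group());
      }
      // quoteReplacement keeps '$' and '\' in the value literal.
      matcher.appendReplacement(out, Matcher.quoteReplacement(value));
    }
    matcher.appendTail(out);
    return out.toString();
  }

  public static void main(String[] args) {
    Map<String, String> variables = new HashMap<>();
    variables.put("who", "Arya Stark");
    variables.put("where", "Braavos");
    System.out.println(render("{who} went to {where}.", variables));
  }
}
```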
- Since:
- 2.0
-
Nested Class Summary
| Modifier and Type | Class | Description |
|---|---|---|
| static enum | Substring.BoundStyle | The style of the bounds of a match. |
| static final class | Substring.Match | The result of successfully matching a Substring.Pattern against a string, providing access to the matched substring, to the parts of the string before and after it, and to copies with the matched substring removed or replaced. |
| static class | Substring.Pattern | A pattern that can be matched against a string, finding a single substring from it. |
| static final class | Substring.Prefix | An immutable string prefix Pattern with extra utilities such as Substring.Prefix.addToIfAbsent(String), Substring.Prefix.removeFrom(StringBuilder), Substring.Prefix.isIn(CharSequence) etc. |
| static class | Substring.RepeatingPattern | A substring pattern to be applied repeatedly on the input string, each time over the remaining substring after the previous match. |
| static final class | Substring.Suffix | An immutable string suffix Pattern with extra utilities such as Substring.Suffix.addToIfAbsent(String), Substring.Suffix.removeFrom(StringBuilder), Substring.Suffix.isIn(CharSequence) etc. |
Field Summary
| Modifier and Type | Field | Description |
|---|---|---|
| static final Substring.Pattern | BEGINNING | Pattern that matches the empty substring at the beginning of the input string. |
| static final Substring.Pattern | END | Pattern that matches the empty substring at the end of the input string. |
| static final Substring.Pattern | NONE | Pattern that never matches any substring. |
Method Summary
| Modifier and Type | Method | Description |
|---|---|---|
| static Substring.Pattern | after(Substring.Pattern delimiter) | Returns a Pattern that covers the substring after delimiter. |
| static Substring.RepeatingPattern | all(char ch) | Returns a Substring.RepeatingPattern that matches all occurrences of ch in the input string. |
| static Substring.RepeatingPattern | all(CharPredicate matcher) | Returns a Substring.RepeatingPattern that matches all characters that match matcher in the input string. |
| static Substring.RepeatingPattern | all(String substr) | Returns a Substring.RepeatingPattern that matches all occurrences of substr in the input string. |
| static Substring.Pattern | before(Substring.Pattern delimiter) | Returns a Pattern that covers the substring before delimiter. |
| static Substring.Pattern | between(char open, char close) | Returns a Pattern that will match the substring between the first open and the first close after it. |
| static Substring.Pattern | between(char open, Substring.BoundStyle openBound, char close, Substring.BoundStyle closeBound) | Similar to between(char, char) but allows using alternative bound styles to include or exclude the delimiters at both ends. |
| static Substring.Pattern | between(Substring.Pattern open, Substring.BoundStyle openBound, Substring.Pattern close, Substring.BoundStyle closeBound) | Similar to between(Pattern, Pattern) but allows using alternative bound styles to include or exclude the delimiters at both ends. |
| static Substring.Pattern | between(Substring.Pattern open, Substring.Pattern close) | Returns a Pattern that will match the substring between open and close. |
| static Substring.Pattern | between(String open, Substring.BoundStyle openBound, String close, Substring.BoundStyle closeBound) | Similar to between(String, String) but allows using alternative bound styles to include or exclude the delimiters at both ends. |
| static Substring.Pattern | between(String open, String close) | Returns a Pattern that will match the substring between the first open and the first close after it. |
| static Substring.Pattern | consecutive(char ch) | Returns a Pattern that matches the first non-empty sequence of consecutive ch characters. |
| static Substring.Pattern | consecutive(CharPredicate matcher) | Returns a Pattern that matches the first non-empty sequence of consecutive characters identified by matcher. |
| static Substring.Pattern | first(char character) | Returns a Pattern that matches the first occurrence of character. |
| static Substring.Pattern | first(CharPredicate charMatcher) | Returns a Pattern that matches the first character found by charMatcher. |
| static Substring.Pattern | first(String str) | Returns a Pattern that matches the first occurrence of str. |
| static Substring.Pattern | first(Pattern regexPattern) | Returns a Pattern that matches the first occurrence of regexPattern. |
| static Substring.Pattern | first(Pattern regexPattern, int group) | Returns a Pattern that matches the first occurrence of regexPattern and then selects the capturing group identified by group. |
| static Collector<Substring.Pattern, ?, Substring.Pattern> | firstOccurrence() | Returns a Collector that combines the candidate Substring.Pattern inputs into a pattern that matches whichever candidate occurs first in the input string. |
| static Substring.Pattern | last(char character) | Returns a Pattern that matches the last occurrence of character. |
| static Substring.Pattern | last(CharPredicate charMatcher) | Returns a Pattern that matches the last character found by charMatcher. |
| static Substring.Pattern | last(String str) | Returns a Pattern that matches the last occurrence of str. |
| static Substring.Pattern | leading(CharPredicate matcher) | Returns a Pattern that matches, from the beginning of the input string, a non-empty sequence of leading characters identified by matcher. |
| static Substring.Prefix | prefix(char prefix) | Returns a Prefix pattern that matches strings starting with prefix. |
| static Substring.Prefix | prefix(String prefix) | Returns a Prefix pattern that matches strings starting with prefix. |
| static Substring.Pattern | spanningInOrder(String stop1, String stop2, String... moreStops) | Returns a Pattern that matches the first occurrence of stop1, followed by an occurrence of stop2, followed sequentially by occurrences of moreStops in order, including any characters between consecutive stops. |
| static Substring.Suffix | suffix(char suffix) | Returns a Suffix pattern that matches strings ending with suffix. |
| static Substring.Suffix | suffix(String suffix) | Returns a Suffix pattern that matches strings ending with suffix. |
| static Substring.RepeatingPattern | topLevelGroups(Pattern regexPattern) | Returns a repeating pattern representing all the top-level groups from regexPattern. |
| static Substring.Pattern | trailing(CharPredicate matcher) | Returns a Pattern that matches, from the end of the input string, a non-empty sequence of trailing characters identified by matcher. |
| static Substring.Pattern | upToIncluding(Substring.Pattern pattern) | Returns a Pattern that will match from the beginning of the original string up to the substring matched by pattern, inclusive. |
| static Substring.Pattern | word() | Returns a Pattern that matches the first occurrence of a word composed of [a-zA-Z0-9_] characters. |
| static Substring.Pattern | word(String word) | Returns a Pattern that matches the first occurrence of word that isn't immediately preceded or followed by another "word" ([a-zA-Z0-9_]) character. |
-
Field Details
-
NONE
Pattern that never matches any substring.
BEGINNING
Pattern that matches the empty substring at the beginning of the input string. Typically used to represent an optional delimiter. For example, the following pattern matches the substring after an optional "header_name=":
static final Substring.Pattern VALUE = Substring.after(first('=').or(BEGINNING));
END
Pattern that matches the empty substring at the end of the input string. Typically used to represent an optional delimiter. For example, the following pattern matches the text between the first occurrence of the string "id=" and the end of that line, or the end of the string:
static final Substring.Pattern ID = Substring.between(first("id="), first("\n").or(END));
-
-
Method Details
-
prefix
Returns a Prefix pattern that matches strings starting with prefix.
If you have a String constant representing a prefix, consider declaring a Substring.Prefix constant instead. The type is more explicit, and utility methods like Substring.Pattern.removeFrom(java.lang.String) and Substring.Pattern.from(java.lang.CharSequence) are easier to discover and use.
prefix
Returns a Prefix pattern that matches strings starting with prefix.
If you have a char constant representing a prefix, consider declaring a Substring.Prefix constant instead. The type is more explicit, and utility methods like Substring.Pattern.removeFrom(java.lang.String) and Substring.Pattern.from(java.lang.CharSequence) are easier to discover and use.
suffix
Returns a Suffix pattern that matches strings ending with suffix.
If you have a String constant representing a suffix, consider declaring a Substring.Suffix constant instead. The type is more explicit, and utility methods like Substring.Pattern.removeFrom(java.lang.String) and Substring.Pattern.from(java.lang.CharSequence) are easier to discover and use.
suffix
Returns a Suffix pattern that matches strings ending with suffix.
If you have a char constant representing a suffix, consider declaring a Substring.Suffix constant instead. The type is more explicit, and utility methods like Substring.Pattern.removeFrom(java.lang.String) and Substring.Pattern.from(java.lang.CharSequence) are easier to discover and use.
first
Returns a Pattern that matches the first occurrence of str.
first
Returns a Pattern that matches the first occurrence of character.
first
Returns a Pattern that matches the first character found by charMatcher.
- Since:
- 6.0
-
last
Returns a Pattern that matches the last character found by charMatcher.
- Since:
- 6.0
-
first
Returns a Pattern that matches the first occurrence of regexPattern.
Unlike str.replaceFirst(regexPattern, replacement), first(regexPattern).replaceFrom(str, replacement) treats the replacement as a literal string, with no special handling of backslash (\) and dollar sign ($) characters.
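The difference matters whenever the replacement text may contain $ or \. The JDK-only sketch below shows String.replaceFirst interpreting $1 as a group reference, and the usual workaround of wrapping the replacement in Matcher.quoteReplacement to get literal behavior. LiteralReplacementSketch is a hypothetical name; the sketch illustrates the JDK side of the comparison only.

```java
import java.util.regex.Matcher;

// JDK-only sketch; LiteralReplacementSketch is a hypothetical name.
final class LiteralReplacementSketch {
  // String.replaceFirst interprets "$1" in the replacement as "capture group 1".
  static String replaceWithGroupReferences(String input, String regex, String replacement) {
    return input.replaceFirst(regex, replacement);
  }

  // Quoting the replacement makes '$' and '\' literal characters.
  static String replaceLiterally(String input, String regex, String replacement) {
    return input.replaceFirst(regex, Matcher.quoteReplacement(replacement));
  }

  public static void main(String[] args) {
    System.out.println(replaceWithGroupReferences("cost 10", "(\\d+)", "$1$1")); // cost 1010
    System.out.println(replaceLiterally("cost 10", "(\\d+)", "$1$1"));           // cost $1$1
  }
}
```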
word
Returns a Pattern that matches the first occurrence of a word composed of [a-zA-Z0-9_] characters.
- Since:
- 6.0
-
word
Returns a Pattern that matches the first occurrence of word that isn't immediately preceded or followed by another "word" ([a-zA-Z0-9_]) character.
For example, if you are looking for the English word "cat" in the string "cathie has a cat", first("cat") won't work because it'll match the first three letters of "cathie". Instead, use word("cat") to skip over "cathie".
If your word boundary isn't equivalent to the regex \W character class, you can define your own word boundary CharMatcher and then use Substring.Pattern.separatedBy(com.google.mu.util.CharPredicate) instead. Say, if your word is lower-case alpha with dash ('-'), then:
CharMatcher boundary = CharMatcher.inRange('a', 'z').or(CharMatcher.is('-')).negate();
Substring.Pattern petFriendly = first("pet-friendly").separatedBy(boundary::matches);
- Since:
- 6.0
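The "cathie has a cat" case can be traced with a small plain-JDK sketch of a word-boundary search. It is only an approximation of the behavior described above, not the library's code; WordSketch, indexOfWord and isWordChar are hypothetical names, and rejecting an empty word is a choice of this sketch.

```java
// Plain-JDK sketch; WordSketch, indexOfWord and isWordChar are hypothetical names.
final class WordSketch {
  // Returns the index of the first occurrence of word that is not immediately
  // preceded or followed by a [a-zA-Z0-9_] character, or -1 if there is none.
  static int indexOfWord(String input, String word) {
    if (word.isEmpty()) {
      throw new IllegalArgumentException("word must not be empty");
    }
    for (int i = input.indexOf(word); i >= 0; i = input.indexOf(word, i + 1)) {
      int end = i + word.length();
      boolean boundedLeft = i == 0 || !isWordChar(input.charAt(i - 1));
      boolean boundedRight = end == input.length() || !isWordChar(input.charAt(end));
      if (boundedLeft && boundedRight) {
        return i;
      }
    }
    return -1;
  }

  static boolean isWordChar(char c) {
    return c == '_' || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
  }

  public static void main(String[] args) {
    System.out.println("cathie has a cat".indexOf("cat"));       // 0, inside "cathie"
    System.out.println(indexOfWord("cathie has a cat", "cat"));  // 13, the standalone word
  }
}
```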
-
leading
Returns a Pattern that matches, from the beginning of the input string, a non-empty sequence of leading characters identified by matcher.
For example: leading(javaLetter()).from("System.err") will result in "System".
- Since:
- 6.0
-
trailing
Returns a Pattern that matches, from the end of the input string, a non-empty sequence of trailing characters identified by matcher.
For example: trailing(digit()).from("60612-3588") will result in "3588".
- Since:
- 6.0
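The two examples for leading and trailing can be reproduced with a short plain-JDK sketch. It uses java.util.function.IntPredicate in place of the library's matcher type, and Character::isLetter and Character::isDigit as stand-ins for javaLetter() and digit(); all of these substitutions, and the name LeadingTrailingSketch, are assumptions made for illustration.

```java
import java.util.Optional;
import java.util.function.IntPredicate;

// Plain-JDK sketch; LeadingTrailingSketch is a hypothetical name.
final class LeadingTrailingSketch {
  // Returns the longest non-empty run of matching characters at the start, if any.
  static Optional<String> leading(IntPredicate matcher, String input) {
    int end = 0;
    while (end < input.length() && matcher.test(input.charAt(end))) {
      end++;
    }
    return end == 0 ? Optional.empty() : Optional.of(input.substring(0, end));
  }

  // Returns the longest non-empty run of matching characters at the end, if any.
  static Optional<String> trailing(IntPredicate matcher, String input) {
    int start = input.length();
    while (start > 0 && matcher.test(input.charAt(start - 1))) {
      start--;
    }
    return start == input.length() ? Optional.empty() : Optional.of(input.substring(start));
  }

  public static void main(String[] args) {
    System.out.println(leading(Character::isLetter, "System.err"));  // Optional[System]
    System.out.println(trailing(Character::isDigit, "60612-3588"));  // Optional[3588]
  }
}
```

Both methods return an empty Optional when no character matches, mirroring the "non-empty sequence" requirement in the descriptions above.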
-
consecutive
Returns a Pattern that matches the first non-empty sequence of consecutive ch characters.
- Since:
- 8.7
-
consecutive
Returns a Pattern that matches the first non-empty sequence of consecutive characters identified by matcher.
For example: consecutive(javaLetter()).from("(System.out)") will find "System", and consecutive(javaLetter()).repeatedly().from("(System.out)") will produce ["System", "out"].
As an equivalent of matcher.collapseFrom(string, replacement), you can use consecutive(matcher).repeatedly().replaceAllFrom(string, replacement). But you can also do things other than collapsing these consecutive groups, for example inspecting their values and replacing conditionally with consecutive(matcher).repeatedly().replaceAllFrom(string, group -> ...), or more sophisticated use cases like building index maps of these subsequences.
- Since:
- 6.0
-
all
Returns a Substring.RepeatingPattern that matches all occurrences of substr in the input string. It's equivalent to first(substr).repeatedly().
Note that overlapping occurrences are not matched. For example, if you have "aaa", all("aa").from("aaa") will return only the first "aa".
- Since:
- 8.6
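The non-overlapping rule can be seen in a plain-JDK sketch that resumes the search after the end of each match rather than one character later. This is an illustration of the rule, not the library's implementation; AllOccurrencesSketch and indexesOf are hypothetical names, and rejecting an empty substr is a simplification of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-JDK sketch; AllOccurrencesSketch and indexesOf are hypothetical names.
final class AllOccurrencesSketch {
  // Returns the start indexes of all non-overlapping occurrences of substr.
  static List<Integer> indexesOf(String input, String substr) {
    if (substr.isEmpty()) {
      throw new IllegalArgumentException("substr must not be empty");
    }
    List<Integer> indexes = new ArrayList<>();
    // Resume after the END of the previous match, so matches never overlap.
    for (int i = input.indexOf(substr); i >= 0; i = input.indexOf(substr, i + substr.length())) {
      indexes.add(i);
    }
    return indexes;
  }

  public static void main(String[] args) {
    System.out.println(indexesOf("aaa", "aa"));   // [0]
    System.out.println(indexesOf("aaaa", "aa"));  // [0, 2]
  }
}
```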
-
all
Returns a Substring.RepeatingPattern that matches all occurrences of ch in the input string. It's equivalent to first(ch).repeatedly().
- Since:
- 8.6
-
all
Returns a Substring.RepeatingPattern that matches all characters that match matcher in the input string. It's equivalent to first(matcher).repeatedly().
- Since:
- 8.6
-
topLevelGroups
Returns a repeating pattern representing all the top-level groups from regexPattern. If regexPattern has no capture group, the entire pattern is considered the only group.
For example, topLevelGroups(compile("(g+)(o+)")).from("ggooo") will return ["gg", "ooo"].
Nested capture groups are not taken into account. For example: topLevelGroups(compile("((foo)+(bar)*)(zoo)")).from("foofoobarzoo") will return ["foofoobar", "zoo"].
Note that the top-level groups are statically determined by the regexPattern. In particular, quantifiers on a capture group do not increase or decrease the number of captured groups. That is, when matching "(foo)+" against "foofoofoo", there will only be one top-level group, with "foo" as the value.
- Since:
- 5.3
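The note about quantifiers follows from how java.util.regex itself counts groups, which the sketch below demonstrates using only the JDK. GroupCountSketch and its methods are hypothetical names; the sketch does not compute top-level groups, it only shows the underlying group numbering for the two regexes mentioned above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// JDK-only sketch; GroupCountSketch and its methods are hypothetical names.
final class GroupCountSketch {
  // Number of capture groups declared by the regex, independent of any input.
  static int declaredGroups(String regex) {
    return Pattern.compile(regex).matcher("").groupCount();
  }

  // What the given group holds after the regex matches the whole input.
  static String captured(String regex, String input, int group) {
    Matcher matcher = Pattern.compile(regex).matcher(input);
    if (!matcher.matches()) {
      throw new IllegalArgumentException("no match: " + input);
    }
    return matcher.group(group);
  }

  public static void main(String[] args) {
    // A quantified group is still a single group holding its last iteration.
    System.out.println(declaredGroups("(foo)+"));                // 1
    System.out.println(captured("(foo)+", "foofoofoo", 1));      // foo
    // Four groups are declared; groups 1 and 4 are the non-nested ones.
    String nested = "((foo)+(bar)*)(zoo)";
    System.out.println(declaredGroups(nested));                  // 4
    System.out.println(captured(nested, "foofoobarzoo", 1));     // foofoobar
    System.out.println(captured(nested, "foofoobarzoo", 4));     // zoo
  }
}
```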
-
first
Returns a Pattern that matches the first occurrence of regexPattern and then selects the capturing group identified by group.
For example, the following pattern finds the shard number (12) from a string like 12-of-99:
import java.util.regex.Pattern;
private static final Substring.Pattern SHARD_NUMBER = Substring.first(Pattern.compile("(\\d+)-of-\\d+"), 1);
- Throws:
IndexOutOfBoundsException - if group is negative or exceeds the number of capturing groups in regexPattern.
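As a point of comparison, the shard-number lookup can be written with java.util.regex directly. The sketch below is a rough JDK-only counterpart, with the group-index check performed up front as the Throws clause describes; RegexGroupSketch and firstGroup are hypothetical names and the library's internals may differ.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// JDK-only sketch; RegexGroupSketch and firstGroup are hypothetical names.
final class RegexGroupSketch {
  // Finds the first match of regex in input and returns the given capture group.
  static Optional<String> firstGroup(Pattern regex, int group, String input) {
    if (group < 0 || group > regex.matcher("").groupCount()) {
      throw new IndexOutOfBoundsException("Invalid capturing group: " + group);
    }
    Matcher matcher = regex.matcher(input);
    // group() may return null when an optional group did not participate.
    return matcher.find() ? Optional.ofNullable(matcher.group(group)) : Optional.empty();
  }

  public static void main(String[] args) {
    Pattern shard = Pattern.compile("(\\d+)-of-\\d+");
    System.out.println(firstGroup(shard, 1, "12-of-99"));  // Optional[12]
    System.out.println(firstGroup(shard, 1, "no shard"));  // Optional.empty
  }
}
```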
-
firstOccurrence
Returns a Collector that combines the candidate Substring.Pattern inputs into a pattern that matches whichever candidate occurs first in the input string. For example, you can use it to find the first occurrence of any reserved word in a set:
Substring.Pattern reserved =
Stream.of("if", "else", "for", "public")
.map(Substring::word)
.collect(firstOccurrence());
- Since:
- 6.1
-
spanningInOrder
Returns a Pattern that matches the first occurrence of stop1, followed by an occurrence of stop2, followed sequentially by occurrences of moreStops in order, including any characters between consecutive stops.
Note that if you have more than two stops and all of them are literals, you may want to use StringFormat.span() instead.
For example, to find hyperlinks like <a href="...">...</a>, you can use StringFormat.span("<a href=\"{link}\">{...}</a>"), which is equivalent to spanningInOrder("<a href=\"", "\">", "</a>") but more self-documenting with proper placeholder names.
last
Returns a Pattern that matches the last occurrence of str.
last
Returns a Pattern that matches the last occurrence of character.
before
Returns a Pattern that covers the substring before delimiter. For example:
String file = "/home/path/file.txt";
String path = Substring.before(last('/')).from(file).orElseThrow(...);
assertThat(path).isEqualTo("/home/path");
after
Returns a Pattern that covers the substring after delimiter. For example:
String file = "/home/path/file.txt";
String ext = Substring.after(last('.')).from(file).orElseThrow(...);
assertThat(ext).isEqualTo("txt");
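The two file-path examples above amount to taking everything on one side of a delimiter's position. A minimal plain-JDK sketch of that idea follows; BeforeAfterSketch, beforeLast and afterLast are hypothetical names, and the sketch hard-codes "last occurrence of a char" as the delimiter rather than accepting an arbitrary pattern.

```java
import java.util.Optional;

// Plain-JDK sketch; BeforeAfterSketch, beforeLast and afterLast are hypothetical names.
final class BeforeAfterSketch {
  // Everything before the last occurrence of delimiter, or empty if it is absent.
  static Optional<String> beforeLast(char delimiter, String input) {
    int index = input.lastIndexOf(delimiter);
    return index < 0 ? Optional.empty() : Optional.of(input.substring(0, index));
  }

  // Everything after the last occurrence of delimiter, or empty if it is absent.
  static Optional<String> afterLast(char delimiter, String input) {
    int index = input.lastIndexOf(delimiter);
    return index < 0 ? Optional.empty() : Optional.of(input.substring(index + 1));
  }

  public static void main(String[] args) {
    String file = "/home/path/file.txt";
    System.out.println(beforeLast('/', file));  // Optional[/home/path]
    System.out.println(afterLast('.', file));   // Optional[txt]
  }
}
```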
upToIncluding
Returns a Pattern that will match from the beginning of the original string up to the substring matched by pattern, inclusive. For example:
String uri = "http://google.com";
String schemeStripped = upToIncluding(first("://")).removeFrom(uri);
assertThat(schemeStripped).isEqualTo("google.com");
To match from the start of pattern to the end of the original string, use Substring.Pattern.toEnd() instead.
between
Returns a Pattern that will match the substring between the first open and the first close after it.
If, for example, you need to find the substring between the first "<-" and the last "->", use between(first("<-"), last("->")) instead.
- Since:
- 6.0
-
between
public static Substring.Pattern between(String open, Substring.BoundStyle openBound, String close, Substring.BoundStyle closeBound)
Similar to between(String, String) but allows using alternative bound styles to include or exclude the delimiters at both ends.
- Since:
- 7.2
-
between
Returns a Pattern that will match the substring between the first open and the first close after it.
If, for example, you need to find the substring between the first and the last '/', use between(first('/'), last('/')) instead.
- Since:
- 6.0
-
between
public static Substring.Pattern between(char open, Substring.BoundStyle openBound, char close, Substring.BoundStyle closeBound)
Similar to between(char, char) but allows using alternative bound styles to include or exclude the delimiters at both ends.
- Since:
- 7.2
-
between
Returns a Pattern that will match the substring between open and close. For example, the following pattern finds the directory path between the "//depot/" prefix and the last '/':
private static final Substring.Pattern DEPOT_PATH = Substring.between(first("//depot/"), last('/'));
assertThat(DEPOT_PATH.from("//depot/google3/foo/bar/baz.txt")).hasValue("google3/foo/bar");
between
public static Substring.Pattern between(Substring.Pattern open, Substring.BoundStyle openBound, Substring.Pattern close, Substring.BoundStyle closeBound)
Similar to between(Pattern, Pattern) but allows using alternative bound styles to include or exclude the delimiters at both ends.
- Since:
- 7.2
-