Class CaseBreaker

java.lang.Object
com.google.mu.util.CaseBreaker

public final class CaseBreaker extends Object
Utility to break input strings (normally identifiers) written in camelCase, UpperCamelCase, snake_case, UPPER_SNAKE_CASE, dash-case, etc. into their component words.

Unlike Guava CaseFormat, this class doesn't require you to know the input casing. You can take any string and extract its words or convert it into the target casing.

Warning: This class doesn't recognize supplementary code points.

As of v9.0, CaseBreaker lives in the core mug artifact and no longer requires Guava as a dependency. The toCase() method lives in CaseFormats, in the mug-guava artifact.

Since:
9.0
  • Constructor Details

    • CaseBreaker

      public CaseBreaker()
  • Method Details

    • withPunctuationChars

      public CaseBreaker withPunctuationChars(CharPredicate punctuation)
      Returns a new instance using punctuation to identify punctuation characters (ones that separate words but aren't themselves included in the result), for example if you want to support dash-case using the en dash (–) character.
    • withLowerCaseChars

      public CaseBreaker withLowerCaseChars(CharPredicate camelLower)
      Returns a new instance using camelLower to identify lower case characters (don't forget to include digits if they should also be treated as lower case).
    • breakCase

      public Stream<String> breakCase(CharSequence text)
      Returns a lazy stream of words split out from text, delimited by non-letter-digit ASCII characters, and further split at lowerCamelCase and UpperCamelCase boundaries.

      Examples:

      
       breakCase("userId")            => ["user", "Id"]
       breakCase("field_name")        => ["field", "name"]
       breakCase("CONSTANT_NAME")     => ["CONSTANT", "NAME"]
       breakCase("dash-case")         => ["dash", "case"]
       breakCase("3 separate words")  => ["3", "separate", "words"]
       breakCase("TheURLs")           => ["The", "URLs"]
       breakCase("🅣ⓗⓔ🅤🅡🅛ⓢ")      => ["🅣ⓗⓔ", "🅤🅡🅛ⓢ"]
       breakCase("UpgradeIPv4ToIPv6") => ["Upgrade", "IPv4", "To", "IPv6"]
       

      By default, non-alphanumeric ASCII characters are treated as case delimiter characters, and Java lower case characters and ASCII digits are considered lower case when breaking up camel case.

      Besides being used as case delimiters, non-letter-digit ASCII characters are filtered out of the returned words.

      If the default setting doesn't work for you, it can be customized by using withPunctuationChars(com.google.mu.util.CharPredicate) and/or withLowerCaseChars(com.google.mu.util.CharPredicate).
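
      The splitting rules described above can be illustrated with a small stand-alone sketch. This is not Mug's implementation and does not use the Mug API: the class CaseBreakSketch, the method breakWords, and the two predicate constants are hypothetical names introduced only for illustration, and the camel-case rule (end a word after a "lower" character that is followed by a non-"lower" character) is an assumption inferred from the examples above. It returns an eager List rather than a lazy Stream.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical illustration only; not part of Mug.
final class CaseBreakSketch {
  // Documented defaults: ASCII non-alphanumerics delimit words;
  // Java lower case characters and ASCII digits count as "lower".
  static final IntPredicate DEFAULT_PUNCTUATION =
      c -> c < 128 && !Character.isLetterOrDigit(c);
  static final IntPredicate DEFAULT_LOWER =
      c -> Character.isLowerCase(c) || (c >= '0' && c <= '9');

  static List<String> breakWords(CharSequence text) {
    return breakWords(text, DEFAULT_PUNCTUATION, DEFAULT_LOWER);
  }

  static List<String> breakWords(
      CharSequence text, IntPredicate punctuation, IntPredicate lower) {
    List<String> words = new ArrayList<>();
    StringBuilder word = new StringBuilder();
    for (int i = 0; i < text.length(); i++) {
      char c = text.charAt(i);
      if (punctuation.test(c)) {
        // Punctuation ends the current word and is itself dropped.
        flush(word, words);
        continue;
      }
      word.append(c);
      // Assumed camel-case rule: a "lower" char followed by a
      // non-"lower" char closes the current word.
      boolean hasNext = i + 1 < text.length();
      if (lower.test(c) && hasNext && !lower.test(text.charAt(i + 1))) {
        flush(word, words);
      }
    }
    flush(word, words);
    return words;
  }

  // Moves the accumulated word (if any) into the result list.
  private static void flush(StringBuilder word, List<String> words) {
    if (word.length() > 0) {
      words.add(word.toString());
      word.setLength(0);
    }
  }

  public static void main(String[] args) {
    System.out.println(breakWords("userId"));            // [user, Id]
    System.out.println(breakWords("UpgradeIPv4ToIPv6")); // [Upgrade, IPv4, To, IPv6]
    // Treat the en dash (U+2013) as punctuation in addition to the defaults.
    IntPredicate withEnDash = DEFAULT_PUNCTUATION.or(c -> c == '\u2013');
    System.out.println(breakWords("dash\u2013case", withEnDash, DEFAULT_LOWER)); // [dash, case]
  }
}
```

      In this sketch the two predicate parameters play the roles that withPunctuationChars and withLowerCaseChars play in CaseBreaker: the first decides which characters separate words and are discarded, the second decides which characters count as lower case at camel-case boundaries. Like the class itself, the sketch works on char values and makes no attempt to handle supplementary code points.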