Package student.testingsupport
Class StringNormalizer
- All Implemented Interfaces:
Serializable,Cloneable,Iterable<StringNormalizer.NormalizerRule>,Collection<StringNormalizer.NormalizerRule>,List<StringNormalizer.NormalizerRule>,RandomAccess
This class represents a programmable string "normalizing" engine that
can be used to convert strings into a canonical form, for example
before comparing them for equality. A normalizer is a list of zero
or more rules (transformations). The normalize(String) method
applies the entire list of transformations to a given string.
For example, you can build a string normalizer that replaces all
sequences of one or more whitespace characters by a single space
character, trims any leading or trailing space, and converts a
string to lower case. This class provides a number of predefined
transformations in the StringNormalizer.StandardRule enumeration.
Some examples:
// An "identity" transformation that does nothing:
StringNormalizer norm1 = new StringNormalizer();
// norm1.normalize(...) returns its argument unchanged
// A "lower case" normalizer:
StringNormalizer norm2 = new StringNormalizer(
    StringNormalizer.StandardRule.IGNORE_CAPITALIZATION);
// norm2.normalize(...) returns a lower case version of its argument
// A normalizer that ignores both capitalization and punctuation:
StringNormalizer norm3 = new StringNormalizer(
    StringNormalizer.StandardRule.IGNORE_CAPITALIZATION,
    StringNormalizer.StandardRule.IGNORE_PUNCTUATION);
// A "standard" normalizer:
StringNormalizer norm4 = new StringNormalizer(true);
// norm4.normalize(...) returns its argument with all punctuation
// characters removed, all letters converted to lower case, all
// whitespace sequences replaced by single spaces, all MS-DOS or
// Mac line terminators replaced by "\n"'s, and all leading and
// trailing whitespace removed.
Note that a string normalizer containing multiple rules applies them
in order (i.e., in the order added, which is the List order of this
class). Since each rule sees the output of the rule before it, the
same rules added in a different order can produce different results,
so take care when you add your rules.
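The effect of rule order can be illustrated with a minimal sketch. It does not use StringNormalizer itself: the two rules below are hypothetical stand-ins written as plain java.util.function.UnaryOperator objects, and applyInOrder is an assumed helper that mimics applying a list of rules in order.

```java
import java.util.List;
import java.util.function.UnaryOperator;

public class RuleOrderDemo {
    // Hypothetical stand-ins for normalizer rules (not the library's classes).
    static final UnaryOperator<String> STRIP_PUNCTUATION =
        s -> s.replaceAll("\\p{Punct}", "");
    static final UnaryOperator<String> COLLAPSE_WHITESPACE =
        s -> s.replaceAll("\\s+", " ");

    // Apply each rule in list order, feeding each result to the next rule.
    static String applyInOrder(String input, List<UnaryOperator<String>> rules) {
        String result = input;
        for (UnaryOperator<String> rule : rules) {
            result = rule.apply(result);
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "a , b";
        // Punctuation first, then whitespace: prints [a b]
        System.out.println("["
            + applyInOrder(input, List.of(STRIP_PUNCTUATION, COLLAPSE_WHITESPACE)) + "]");
        // Whitespace first, then punctuation: prints [a  b] (two spaces remain)
        System.out.println("["
            + applyInOrder(input, List.of(COLLAPSE_WHITESPACE, STRIP_PUNCTUATION)) + "]");
    }
}
```

In this sketch, removing the comma after whitespace has already been collapsed leaves two adjacent spaces behind, so whitespace-related rules are usually best placed after rules that delete characters.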
Nested Class Summary
Nested Classes
static class StringNormalizer.NormalizerRule: This interface defines what it means to be a normalizer rule: an object having an appropriate StringNormalizer.NormalizerRule.normalize(String) method.
static class: A highly reusable concrete implementation of StringNormalizer.NormalizerRule that applies a series of regular expression substitutions.
static enum StringNormalizer.StandardRule: This enumeration defines the set of predefined transformation rules.
-
Field Summary
Fields inherited from class java.util.AbstractList
modCount -
Constructor Summary
Constructors
StringNormalizer(): Creates a new StringNormalizer object containing no rules (the "identity" normalizer).
StringNormalizer(boolean useStandardRules): Creates a new StringNormalizer object, optionally containing the standard set of rules.
StringNormalizer(Collection<? extends StringNormalizer.NormalizerRule> rules): Creates a new StringNormalizer object containing the given set of rules.
Creates a new StringNormalizer object containing the given set of rules.
Creates a new StringNormalizer object containing the given set of rules.
-
Method Summary
Methods (Modifier and Type, Method: Description)
boolean add: Add the specified rule.
void add: Add the specified standard rule, as defined in StringNormalizer.StandardRule.
void addStandardRules(): Add the standard set of rules.
normalize: Normalize a string by applying a set of normalization rules (transformations).
String normalize(String content, ArrayList<StringNormalizer.NormalizerRule> outParticipatingRules): Normalize a string by applying a set of normalization rules (transformations).
void remove: Remove the specified standard rule, as defined in StringNormalizer.StandardRule.
standardRule: Retrieve a standard rule by name.
Methods inherited from class java.util.ArrayList
add, addAll, addAll, clear, clone, contains, ensureCapacity, equals, forEach, get, hashCode, indexOf, isEmpty, iterator, lastIndexOf, listIterator, listIterator, remove, remove, removeAll, removeIf, removeRange, replaceAll, retainAll, set, size, sort, spliterator, subList, toArray, toArray, trimToSize
Methods inherited from class java.util.AbstractCollection
containsAll, toString
Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, wait, wait, wait
Methods inherited from interface java.util.Collection
parallelStream, stream, toArray
Methods inherited from interface java.util.List
containsAll
-
Constructor Details
-
StringNormalizer
public StringNormalizer()
Creates a new StringNormalizer object containing no rules (the "identity" normalizer).
-
StringNormalizer
public StringNormalizer(boolean useStandardRules)
Creates a new StringNormalizer object, optionally containing the standard set of rules. The standard set is all those in StringNormalizer.StandardRule except the OPT_* rules.
- Parameters:
useStandardRules - If true, the set of standard (non-OPT_*) rules will be used. If false, an "identity" normalizer will be produced instead.
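As a rough illustration of what "all rules except the OPT_* rules" means, the standard set can be thought of as a filter over the enumeration's constant names. The enum below is a hypothetical stand-in: IGNORE_CAPITALIZATION and IGNORE_PUNCTUATION appear in the examples above, but OPT_EXAMPLE is an invented placeholder, and the real class may select its standard rules differently.

```java
import java.util.ArrayList;
import java.util.List;

public class StandardSetDemo {
    // Hypothetical stand-in for StringNormalizer.StandardRule.
    // OPT_EXAMPLE is an invented name used only to show the filtering.
    enum Rule { IGNORE_CAPITALIZATION, IGNORE_PUNCTUATION, OPT_EXAMPLE }

    // Collect every constant whose name does not start with "OPT_".
    static List<Rule> standardSet() {
        List<Rule> result = new ArrayList<>();
        for (Rule rule : Rule.values()) {
            if (!rule.name().startsWith("OPT_")) {
                result.add(rule);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [IGNORE_CAPITALIZATION, IGNORE_PUNCTUATION]
        System.out.println(standardSet());
    }
}
```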
-
StringNormalizer
Creates a new StringNormalizer object containing the given set of rules.
- Parameters:
rules - a (variable-length) comma-separated sequence of rules to add
-
StringNormalizer
Creates a new StringNormalizer object containing the given set of rules.
- Parameters:
rules - a (variable-length) comma-separated sequence of rules to add
-
StringNormalizer
Creates a new StringNormalizer object containing the given set of rules.
- Parameters:
rules - a collection of rules to add (could be another StringNormalizer, or any other kind of collection)
-
-
Method Details
-
normalize
Normalize a string by applying a set of normalization rules (transformations).
- Parameters:
content - The string to transform
- Returns:
The result after all rules have been applied
-
normalize
public String normalize(String content, ArrayList<StringNormalizer.NormalizerRule> outParticipatingRules)
Normalize a string by applying a set of normalization rules (transformations). When using this version of normalize, all rules must implement equals and hashCode methods.
- Parameters:
content - The string to transform
outParticipatingRules - receives those rules that had an effect on the output. If the list is not empty, new rules are added to it and no existing rule is deleted.
- Returns:
The result after all rules have been applied
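One plausible way to track participating rules is to compare the string before and after each rule and record the rules that changed it. The sketch below is an assumption about the behavior, not the library's implementation: Rule is a hypothetical enum (enums get equals and hashCode for free), and the parameters are typed as List rather than ArrayList for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

public class ParticipatingRulesDemo {
    // Hypothetical rule type for this sketch only.
    enum Rule implements UnaryOperator<String> {
        TRIM {
            @Override public String apply(String s) { return s.trim(); }
        },
        LOWER_CASE {
            @Override public String apply(String s) { return s.toLowerCase(); }
        };
    }

    // Apply the rules in order; record each rule that changed the string.
    // Existing entries in outParticipatingRules are never removed.
    static String normalize(String content, List<Rule> rules,
                            List<Rule> outParticipatingRules) {
        String result = content;
        for (Rule rule : rules) {
            String next = rule.apply(result);
            // contains() relies on equals(), hence the equals/hashCode requirement.
            if (!next.equals(result) && !outParticipatingRules.contains(rule)) {
                outParticipatingRules.add(rule);
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        List<Rule> participating = new ArrayList<>();
        String out = normalize("Hello", List.of(Rule.TRIM, Rule.LOWER_CASE), participating);
        System.out.println(out);            // prints hello
        System.out.println(participating);  // prints [LOWER_CASE]
    }
}
```

Here TRIM leaves "Hello" unchanged, so only LOWER_CASE is reported as having had an effect.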
-
addStandardRules
public void addStandardRules()
Add the standard set of rules. The standard set is all those in StringNormalizer.StandardRule except the OPT_* rules.
-
add
Add the specified standard rule, as defined in StringNormalizer.StandardRule. Note that you can also use the inherited List.add(Object) method to add custom NormalizerRule objects.
- Parameters:
rule - The rule to add
-
add
Add the specified rule. For efficiency, only adds the rule if it is not already present in this normalizer.
- Specified by:
add in interface Collection<StringNormalizer.NormalizerRule>
- Specified by:
add in interface List<StringNormalizer.NormalizerRule>
- Overrides:
add in class ArrayList<StringNormalizer.NormalizerRule>
- Parameters:
rule - The rule to add
- Returns:
True if the rule was added, or false if it is already present
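The add-only-if-absent behavior described here can be sketched as an ArrayList subclass that overrides add. UniqueList is a hypothetical name used for illustration, and this is an assumed shape rather than the actual StringNormalizer source.

```java
import java.util.ArrayList;

public class UniqueAddDemo {
    // Hypothetical list that refuses to append an element it already holds.
    static class UniqueList<E> extends ArrayList<E> {
        private static final long serialVersionUID = 1L;

        @Override
        public boolean add(E element) {
            if (contains(element)) {
                return false;           // already present: list is unchanged
            }
            return super.add(element);  // ArrayList.add always returns true
        }
    }

    public static void main(String[] args) {
        UniqueList<String> rules = new UniqueList<>();
        System.out.println(rules.add("trim"));  // prints true
        System.out.println(rules.add("trim"));  // prints false
        System.out.println(rules.size());       // prints 1
    }
}
```

Note that this sketch only guards add(E). Other mutators such as add(int, E) and addAll are not overridden here, so they can still insert duplicates into the sketch's list.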
-
remove
Remove the specified standard rule, as defined in StringNormalizer.StandardRule. Note that you can also use the inherited List.remove(Object) method to remove other kinds of NormalizerRule objects.
- Parameters:
rule - The rule to remove
-
standardRule
Retrieve a standard rule by name.
- Parameters:
rule - the rule to retrieve
- Returns:
The corresponding StringNormalizer.NormalizerRule
-