org.apache.oro.text.awk
Class AwkMatcher

java.lang.Object
  org.apache.oro.text.awk.AwkMatcher
- All Implemented Interfaces:
- org.apache.oro.text.regex.PatternMatcher
- public final class AwkMatcher
- extends java.lang.Object
- implements org.apache.oro.text.regex.PatternMatcher
The AwkMatcher class is used to match regular expressions (conforming to the Awk regular expression syntax) generated by AwkCompiler. AwkMatcher only supports 8-bit character values (0 through 255). Any attempt to match Unicode values greater than 255 will result in undefined behavior. AwkMatcher finds true leftmost-longest matches, so you must take care with how you formulate your regular expression to avoid matching more than you really want.
It is important for you to remember that AwkMatcher does not save parenthesized sub-group information. Therefore the number of groups saved in a MatchResult produced by AwkMatcher will always be 1.
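To make the leftmost-longest rule concrete, the sketch below contrasts it with the leftmost-first rule used by Perl-style engines. It deliberately does not use AwkMatcher: java.util.regex stands in as the engine, and the class name LeftmostLongestSketch and both helper methods are hypothetical, introduced only for illustration. The brute-force emulation shows which substring a leftmost-longest matcher would be expected to report, not how AwkMatcher computes it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LeftmostLongestSketch {

    /** Leftmost-first (Perl-style) match, which is what java.util.regex finds. */
    public static String leftmostFirst(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    /**
     * Leftmost-longest match, emulated by brute force (an illustration only):
     * at the earliest start position that can match at all, return the
     * longest span that the pattern matches in full.
     */
    public static String leftmostLongest(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        for (int start = 0; start <= input.length(); start++) {
            // Try the longest candidate span first, then shrink it.
            for (int end = input.length(); end >= start; end--) {
                m.region(start, end);
                if (m.matches()) {
                    return input.substring(start, end);
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(leftmostFirst("a|ab", "xabc"));   // prints "a"
        System.out.println(leftmostLongest("a|ab", "xabc")); // prints "ab"
    }
}
```

For the pattern a|ab against the input xabc, the leftmost-first rule stops at a, while the leftmost-longest rule reports ab. This is the sense in which a leftmost-longest matcher can match more than a pattern written for a Perl-style engine was meant to.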
- Since:
- 1.0
- Version:
- @version@
| Field Summary | | |
| private AwkPattern | __awkPattern | |
| private int | __beginOffset | A kluge variable to make PatternMatcherInput matches work when their begin offset is non-zero. |
| private int | __lastMatchedBufferOffset | |
| private AwkMatchResult | __lastMatchResult | |
| private int[] | __offsets | |
| private AwkStreamInput | __scratchBuffer | |
| private AwkStreamInput | __streamSearchBuffer | |
| Constructor Summary |
| AwkMatcher() |
| Method Summary | | |
| private int | __streamMatchPrefix() | |
| (package private) void | _search() | |
| boolean | contains(AwkStreamInput input, org.apache.oro.text.regex.Pattern pattern) | Determines if the contents of an AwkStreamInput, starting from the current offset of the input, contains a pattern. |
| boolean | contains(char[] input, org.apache.oro.text.regex.Pattern pattern) | Determines if a string (represented as a char[]) contains a pattern. |
| boolean | contains(org.apache.oro.text.regex.PatternMatcherInput input, org.apache.oro.text.regex.Pattern pattern) | Determines if the contents of a PatternMatcherInput, starting from the current offset of the input, contains a pattern. |
| boolean | contains(java.lang.String input, org.apache.oro.text.regex.Pattern pattern) | Determines if a string contains a pattern. |
| org.apache.oro.text.regex.MatchResult | getMatch() | Fetches the last match found by a call to a matches() or contains() method. |
| boolean | matches(char[] input, org.apache.oro.text.regex.Pattern pattern) | Determines if a string (represented as a char[]) exactly matches a given pattern. |
| boolean | matches(org.apache.oro.text.regex.PatternMatcherInput input, org.apache.oro.text.regex.Pattern pattern) | Determines if the contents of a PatternMatcherInput instance exactly match a given pattern. |
| boolean | matches(java.lang.String input, org.apache.oro.text.regex.Pattern pattern) | Determines if a string exactly matches a given pattern. |
| boolean | matchesPrefix(char[] input, org.apache.oro.text.regex.Pattern pattern) | Determines if a prefix of a string (represented as a char[]) matches a given pattern. |
| boolean | matchesPrefix(char[] input, org.apache.oro.text.regex.Pattern pattern, int offset) | Determines if a prefix of a string (represented as a char[]) matches a given pattern, starting from a given offset into the string. |
| boolean | matchesPrefix(org.apache.oro.text.regex.PatternMatcherInput input, org.apache.oro.text.regex.Pattern pattern) | Determines if a prefix of a PatternMatcherInput instance matches a given pattern. |
| boolean | matchesPrefix(java.lang.String input, org.apache.oro.text.regex.Pattern pattern) | Determines if a prefix of a string matches a given pattern. |
| Methods inherited from class java.lang.Object |
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
__lastMatchedBufferOffset
private int __lastMatchedBufferOffset
__lastMatchResult
private AwkMatchResult __lastMatchResult
__scratchBuffer
private AwkStreamInput __scratchBuffer
__streamSearchBuffer
private AwkStreamInput __streamSearchBuffer
__awkPattern
private AwkPattern __awkPattern
__offsets
private int[] __offsets
__beginOffset
private int __beginOffset
- A kluge variable to make PatternMatcherInput matches work when
their begin offset is non-zero. This kluge is caused by the
misguided notion that AwkStreamInput could be overloaded to do
both stream and fixed buffer matches. The whole input representation
scheme has to be scrapped and redone. -- dfs 2001/07/10
| Constructor Detail |
AwkMatcher
public AwkMatcher()
| Method Detail |
matchesPrefix
public boolean matchesPrefix(char[] input,
org.apache.oro.text.regex.Pattern pattern,
int offset)
- Determines if a prefix of a string (represented as a char[])
matches a given pattern, starting from a given offset into the string.
If a prefix of the string matches the pattern, a MatchResult instance
representing the match is made accessible via
getMatch(). This method is useful for certain common token identification tasks that are made more difficult without this functionality.
- Specified by:
matchesPrefix in interface org.apache.oro.text.regex.PatternMatcher
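The token identification use case mentioned above can be sketched as a loop that asks, at each offset, whether some token pattern matches a prefix of the remaining input. The sketch below is an assumption-laden illustration and does not call AwkMatcher: Matcher.lookingAt() from java.util.regex, restricted to a region, stands in for a prefix match at an offset, and the class name PrefixTokenizerSketch, the two token patterns, and the token labels are hypothetical. Note also that java.util.regex uses leftmost-first rather than leftmost-longest matching, which makes no difference for these simple patterns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PrefixTokenizerSketch {

    // Hypothetical token patterns, chosen only for the illustration.
    private static final Pattern NUMBER = Pattern.compile("[0-9]+");
    private static final Pattern WORD = Pattern.compile("[A-Za-z]+");

    /** Splits the input into labeled tokens by prefix-matching at each offset. */
    public static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher number = NUMBER.matcher(input);
        Matcher word = WORD.matcher(input);
        int offset = 0;
        while (offset < input.length()) {
            // Restrict both matchers to the not yet consumed part of the input.
            number.region(offset, input.length());
            word.region(offset, input.length());
            if (number.lookingAt()) {
                tokens.add("NUMBER:" + number.group());
                offset = number.end();   // continue after the token
            } else if (word.lookingAt()) {
                tokens.add("WORD:" + word.group());
                offset = word.end();
            } else {
                offset++;                // no token starts here; skip one char
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [WORD:abc, NUMBER:123, WORD:x, NUMBER:9]
        System.out.println(tokenize("abc 123 x9"));
    }
}
```

With AwkMatcher, the same loop shape would presumably call matchesPrefix(char[], Pattern, int) with the current offset and read the token from getMatch(). Without a prefix-match operation, each step would need a general search followed by a check that the match begins exactly at the offset.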
matchesPrefix
public boolean matchesPrefix(char[] input,
org.apache.oro.text.regex.Pattern pattern)
- Determines if a prefix of a string (represented as a char[])
matches a given pattern.
If a prefix of the string matches the pattern, a MatchResult instance
representing the match is made accessible via
getMatch(). This method is useful for certain common token identification tasks that are made more difficult without this functionality.
- Specified by:
matchesPrefix in interface org.apache.oro.text.regex.PatternMatcher
matchesPrefix
public boolean matchesPrefix(java.lang.String input, org.apache.oro.text.regex.Pattern pattern)
- Determines if a prefix of a string matches a given pattern.
If a prefix of the string matches the pattern, a MatchResult instance
representing the match is made accessible via
getMatch(). This method is useful for certain common token identification tasks that are made more difficult without this functionality.
- Specified by:
matchesPrefix in interface org.apache.oro.text.regex.PatternMatcher
matchesPrefix
public boolean matchesPrefix(org.apache.oro.text.regex.PatternMatcherInput input, org.apache.oro.text.regex.Pattern pattern)
- Determines if a prefix of a PatternMatcherInput instance
matches a given pattern. If there is a match, a MatchResult instance
representing the match is made accessible via
getMatch(). Unlike the contains(PatternMatcherInput, Pattern) method, the current offset of the PatternMatcherInput argument is not updated. You should remember that the region starting from the begin offset of the PatternMatcherInput will be tested for a prefix match.
This method is useful for certain common token identification tasks that are made more difficult without this functionality.
- Specified by:
matchesPrefix in interface org.apache.oro.text.regex.PatternMatcher
matches
public boolean matches(char[] input,
org.apache.oro.text.regex.Pattern pattern)
- Determines if a string (represented as a char[]) exactly
matches a given pattern. If
there is an exact match, a MatchResult instance
representing the match is made accessible via
getMatch(). The pattern must be an AwkPattern instance, otherwise a ClassCastException will be thrown. You are not required to, and indeed should NOT try to (for performance reasons), catch a ClassCastException because it will never be thrown as long as you use an AwkPattern as the pattern parameter.
- Specified by:
matches in interface org.apache.oro.text.regex.PatternMatcher
matches
public boolean matches(java.lang.String input, org.apache.oro.text.regex.Pattern pattern)
- Determines if a string exactly matches a given pattern. If
there is an exact match, a MatchResult instance
representing the match is made accessible via
getMatch(). The pattern must be an AwkPattern instance, otherwise a ClassCastException will be thrown. You are not required to, and indeed should NOT try to (for performance reasons), catch a ClassCastException because it will never be thrown as long as you use an AwkPattern as the pattern parameter.
- Specified by:
matches in interface org.apache.oro.text.regex.PatternMatcher
matches
public boolean matches(org.apache.oro.text.regex.PatternMatcherInput input, org.apache.oro.text.regex.Pattern pattern)
- Determines if the contents of a PatternMatcherInput instance
exactly match a given pattern. If
there is an exact match, a MatchResult instance
representing the match is made accessible via
getMatch(). Unlike the contains(PatternMatcherInput, Pattern) method, the current offset of the PatternMatcherInput argument is not updated. You should remember that the region between the begin and end offsets of the PatternMatcherInput will be tested for an exact match.
The pattern must be an AwkPattern instance, otherwise a ClassCastException will be thrown. You are not required to, and indeed should NOT try to (for performance reasons), catch a ClassCastException because it will never be thrown as long as you use an AwkPattern as the pattern parameter.
- Specified by:
matches in interface org.apache.oro.text.regex.PatternMatcher
contains
public boolean contains(char[] input,
org.apache.oro.text.regex.Pattern pattern)
- Determines if a string (represented as a char[]) contains a pattern.
If the pattern is
matched by some substring of the input, a MatchResult instance
representing the first such match is made accessible via
getMatch(). If you want to access subsequent matches you should either use a PatternMatcherInput object or use the offset information in the MatchResult to create a substring representing the remaining input. Using the MatchResult offset information is the recommended method of obtaining the parts of the string preceding the match and following the match.
The pattern must be an AwkPattern instance, otherwise a ClassCastException will be thrown. You are not required to, and indeed should NOT try to (for performance reasons), catch a ClassCastException because it will never be thrown as long as you use an AwkPattern as the pattern parameter.
- Specified by:
contains in interface org.apache.oro.text.regex.PatternMatcher
contains
public boolean contains(java.lang.String input, org.apache.oro.text.regex.Pattern pattern)
- Determines if a string contains a pattern. If the pattern is
matched by some substring of the input, a MatchResult instance
representing the first such match is made accessible via
getMatch(). If you want to access subsequent matches you should either use a PatternMatcherInput object or use the offset information in the MatchResult to create a substring representing the remaining input. Using the MatchResult offset information is the recommended method of obtaining the parts of the string preceding the match and following the match.
The pattern must be an AwkPattern instance, otherwise a ClassCastException will be thrown. You are not required to, and indeed should NOT try to (for performance reasons), catch a ClassCastException because it will never be thrown as long as you use an AwkPattern as the pattern parameter.
- Specified by:
contains in interface org.apache.oro.text.regex.PatternMatcher
contains
public boolean contains(org.apache.oro.text.regex.PatternMatcherInput input, org.apache.oro.text.regex.Pattern pattern)
- Determines if the contents of a PatternMatcherInput, starting from the
current offset of the input, contains a pattern.
If a pattern match is found, a MatchResult
instance representing the first such match is made accessible via
getMatch(). The current offset of the PatternMatcherInput is set to the offset corresponding to the end of the match, so that a subsequent call to this method will continue searching where the last call left off. You should remember that the region between the begin and end offsets of the PatternMatcherInput is considered the input to be searched, and that the current offset of the PatternMatcherInput reflects where a search will start from. Matches extending beyond the end offset of the PatternMatcherInput will not be matched. In other words, a match must occur entirely between the begin and end offsets of the input. See PatternMatcherInput for more details.
As a side effect, if a match is found, the PatternMatcherInput match offset information is updated. See the PatternMatcherInput setMatchOffsets(int, int) method for more details.
The pattern must be an AwkPattern instance, otherwise a ClassCastException will be thrown. You are not required to, and indeed should NOT try to (for performance reasons), catch a ClassCastException because it will never be thrown as long as you use an AwkPattern as the pattern parameter.
This method is usually used in a loop as follows:
    PatternMatcher matcher;
    PatternCompiler compiler;
    Pattern pattern;
    PatternMatcherInput input;
    MatchResult result;

    compiler = new AwkCompiler();
    matcher = new AwkMatcher();

    try {
        pattern = compiler.compile(somePatternString);
    } catch(MalformedPatternException e) {
        System.err.println("Bad pattern.");
        System.err.println(e.getMessage());
        return;
    }

    input = new PatternMatcherInput(someStringInput);

    while(matcher.contains(input, pattern)) {
        result = matcher.getMatch();
        // Perform whatever processing on the result you want.
    }
- Specified by:
contains in interface org.apache.oro.text.regex.PatternMatcher
contains
public boolean contains(AwkStreamInput input, org.apache.oro.text.regex.Pattern pattern) throws java.io.IOException
- Determines if the contents of an AwkStreamInput, starting from the
current offset of the input, contains a pattern.
If a pattern match is found, a MatchResult
instance representing the first such match is made accessible via
getMatch(). The current offset of the input stream is advanced to the end offset corresponding to the end of the match. Consequently a subsequent call to this method will continue searching where the last call left off. See AwkStreamInput for more details.
Note that patterns matching the null string do NOT match at the end of the input stream. This is different from the behavior you get from the other contains() methods.
The pattern must be an AwkPattern instance, otherwise a ClassCastException will be thrown. You are not required to, and indeed should NOT try to (for performance reasons), catch a ClassCastException because it will never be thrown as long as you use an AwkPattern as the pattern parameter.
This method is usually used in a loop as follows:
    PatternMatcher matcher;
    PatternCompiler compiler;
    Pattern pattern;
    AwkStreamInput input;
    MatchResult result;

    compiler = new AwkCompiler();
    matcher = new AwkMatcher();

    try {
        pattern = compiler.compile(somePatternString);
    } catch(MalformedPatternException e) {
        System.err.println("Bad pattern.");
        System.err.println(e.getMessage());
        return;
    }

    input = new AwkStreamInput(
        new BufferedInputStream(new FileInputStream(someFileName)));

    while(matcher.contains(input, pattern)) {
        result = matcher.getMatch();
        // Perform whatever processing on the result you want.
    }
__streamMatchPrefix
private int __streamMatchPrefix()
throws java.io.IOException
_search
void _search()
throws java.io.IOException
getMatch
public org.apache.oro.text.regex.MatchResult getMatch()
- Fetches the last match found by a call to a matches() or contains()
method.
- Specified by:
getMatch in interface org.apache.oro.text.regex.PatternMatcher