Reranker Framework (ReFr)
Reranking framework for structure prediction and discriminative language modeling
A simple class for tokenizing a stream of characters into the tokens of the formally specified language used to construct objects for the Reranker framework.
#include <stream-tokenizer.H>
Classes
struct | Token
Information about a token read from the underlying stream.
Public Types
enum | TokenType { EOF_TYPE, RESERVED_CHAR, RESERVED_WORD, STRING, NUMBER, IDENTIFIER }
The set of types of tokens read by this stream tokenizer.
Public Member Functions
StreamTokenizer (istream &is, const char *reserved_chars="(){},=;/")
Constructs a new instance around the specified byte stream.
StreamTokenizer (const string &s, const char *reserved_chars="(){},=;/")
Constructs a new instance around the specified string.
void | set_reserved_words (set< string > &reserved_words)
Sets the set of “reserved words” used by this stream tokenizer.
virtual | ~StreamTokenizer ()
Destroys this instance.
string | str ()
Returns the entire sequence of characters read so far by this stream tokenizer as a newly constructed string object.
size_t | tellg () const
Returns the number of bytes read from the underlying byte stream just after scanning the most recent token, or 0 if this stream is just about to return the first token.
size_t | line_number () const
Returns the number of lines read from the underlying byte stream, where a line is any number of bytes followed by a newline character (i.e., this is ASCII-centric).
bool | HasNext () const
Returns whether there is another token in the token stream.
bool | HasPrev () const
string | PeekPrev () const
size_t | PeekPrevTokenStart () const
TokenType | PeekPrevTokenType () const
string | Next ()
Returns the next token in the token stream.
void | Rewind ()
Rewinds this token stream to the beginning.
void | Rewind (size_t num_tokens)
Rewinds this token stream by the specified number of tokens.
void | Putback ()
A synonym for Rewind(1).
size_t | PeekTokenStart () const
Returns the next token’s start position, or the byte position of the underlying byte stream if there is no next token.
TokenType | PeekTokenType () const
Returns the type of the next token, or EOF_TYPE if there is no next token.
size_t | PeekTokenLineNumber () const
Returns the line number of the first byte of the next token, or the current line number of the underlying stream if there is no next token.
string | Peek () const
Returns the next token that would be returned by the Next method.
Static Public Member Functions
static const char * | TypeName (TokenType token_type)
Returns a string type name for the specified TokenType constant.
A simple class for tokenizing a stream of characters into the tokens of the formally specified language used to construct objects for the Reranker framework.
Definition at line 87 of file stream-tokenizer.H.
enum TokenType
The set of types of tokens read by this stream tokenizer.
Enumerator |
---|
EOF_TYPE |
RESERVED_CHAR |
RESERVED_WORD |
STRING |
NUMBER |
IDENTIFIER |
Definition at line 92 of file stream-tokenizer.H.
StreamTokenizer (istream &is, const char *reserved_chars="(){},=;/") [inline]
Constructs a new instance around the specified byte stream.
Parameters
is | the input byte stream for this stream tokenizer to use
reserved_chars | the set of single characters serving as “reserved characters”
Definition at line 134 of file stream-tokenizer.H.
StreamTokenizer (const string &s, const char *reserved_chars="(){},=;/") [inline]
Constructs a new instance around the specified string.
Parameters
s | the string providing the stream of characters for this stream tokenizer to use
reserved_chars | the set of single characters serving as “reserved characters”
Definition at line 147 of file stream-tokenizer.H.
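To make the role of the reserved_chars parameter concrete, here is a minimal sketch of how a set of single reserved characters can split input into tokens. SplitOnReservedChars is a hypothetical helper written for illustration; it is not part of ReFr, and it deliberately ignores the string, number and reserved-word handling that the real tokenizer performs. It assumes that whitespace separates tokens and that every reserved character becomes its own one-character token.

```cpp
#include <cctype>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper for illustration only; not part of ReFr.
// Splits `input` on whitespace and emits every character found in
// `reserved_chars` as its own single-character token.
std::vector<std::string> SplitOnReservedChars(
    const std::string &input, const char *reserved_chars = "(){},=;/") {
  std::vector<std::string> tokens;
  std::string current;
  for (char c : input) {
    const bool is_space = std::isspace(static_cast<unsigned char>(c)) != 0;
    // strchr would match the terminating '\0', so exclude it explicitly.
    const bool is_reserved =
        c != '\0' && std::strchr(reserved_chars, c) != nullptr;
    if (is_space || is_reserved) {
      // A separator ends whatever token was being accumulated.
      if (!current.empty()) {
        tokens.push_back(current);
        current.clear();
      }
      if (is_reserved) tokens.push_back(std::string(1, c));
    } else {
      current += c;
    }
  }
  if (!current.empty()) tokens.push_back(current);
  return tokens;
}
```

With the default reserved characters, an input such as `Model m = Foo(x, 1);` is split so that `=`, `(`, `,`, `)` and `;` each stand alone, while `Model`, `m`, `Foo`, `x` and `1` are accumulated from the characters between them.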
virtual ~StreamTokenizer () [inline, virtual]
Destroys this instance.
Definition at line 161 of file stream-tokenizer.H.
bool HasNext () const [inline]
Returns whether there is another token in the token stream.
Definition at line 184 of file stream-tokenizer.H.
bool HasPrev () const [inline]
Definition at line 186 of file stream-tokenizer.H.
size_t line_number () const [inline]
Returns the number of lines read from the underlying byte stream, where a line is any number of bytes followed by a newline character (i.e., this is ASCII-centric).
Definition at line 179 of file stream-tokenizer.H.
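The line-counting rule described here (a line is any run of bytes terminated by a newline character) can be sketched as a simple count over the bytes consumed so far. CountLinesRead is a hypothetical helper, not a ReFr function, and whether the real line_number starts counting from 0 or 1 is not stated in this page, so the sketch only counts newline characters.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper for illustration only; not part of ReFr.
// Counts newline-terminated lines in the bytes read so far. A trailing
// run of bytes without a newline is not counted as a line.
std::size_t CountLinesRead(const std::string &bytes_read) {
  return static_cast<std::size_t>(
      std::count(bytes_read.begin(), bytes_read.end(), '\n'));
}
```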
string Next () [inline]
Returns the next token in the token stream.
Definition at line 201 of file stream-tokenizer.H.
string Peek () const [inline]
Returns the next token that would be returned by the Next method.
The return value of this method is only valid when HasNext returns true.
Definition at line 271 of file stream-tokenizer.H.
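Because the value returned by Peek is only meaningful when HasNext is true, callers may want to guard the call. The sketch below shows one way to do that with a small template that works for any type offering HasNext() and Peek(). Both FakeTokenizer and PeekOr are hypothetical names introduced for this example; FakeTokenizer stands in for the real class so that the snippet compiles on its own.

```cpp
#include <string>

// Hypothetical stand-in exposing only the two methods the guard needs;
// it is not the ReFr StreamTokenizer.
struct FakeTokenizer {
  bool has_next;
  std::string next_token;
  bool HasNext() const { return has_next; }
  std::string Peek() const { return next_token; }
};

// Returns the next token if there is one, and `fallback` otherwise, so
// that Peek() is never consulted when HasNext() is false.
template <typename Tokenizer>
std::string PeekOr(const Tokenizer &tokenizer, const std::string &fallback) {
  return tokenizer.HasNext() ? tokenizer.Peek() : fallback;
}
```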
string PeekPrev () const [inline]
Definition at line 188 of file stream-tokenizer.H.
size_t PeekPrevTokenStart () const [inline]
Definition at line 192 of file stream-tokenizer.H.
TokenType PeekPrevTokenType () const [inline]
Definition at line 196 of file stream-tokenizer.H.
size_t PeekTokenLineNumber () const [inline]
Returns the line number of the first byte of the next token, or the current line number of the underlying stream if there is no next token.
Definition at line 264 of file stream-tokenizer.H.
size_t PeekTokenStart () const [inline]
Returns the next token’s start position, or the byte position of the underlying byte stream if there is no next token.
Definition at line 251 of file stream-tokenizer.H.
TokenType PeekTokenType () const [inline]
Returns the type of the next token, or EOF_TYPE if there is no next token.
Definition at line 257 of file stream-tokenizer.H.
void Putback () [inline]
A synonym for Rewind(1).
Definition at line 245 of file stream-tokenizer.H.
void Rewind () [inline]
Rewinds this token stream to the beginning.
If the underlying stream has no tokens, this is a no-op.
Definition at line 228 of file stream-tokenizer.H.
void Rewind (size_t num_tokens) [inline]
Rewinds this token stream by the specified number of tokens.
If the specified number of tokens is greater than the number of tokens read so far, invoking this method will be functionally equivalent to invoking the no-argument Rewind() method.
Definition at line 236 of file stream-tokenizer.H.
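The rewind semantics described for Rewind(), Rewind(num_tokens) and Putback() can be illustrated with a cursor over an already tokenized sequence. TokenCursor below is a hypothetical stand-in, not the ReFr implementation: it assumes the tokens are held in a vector and that rewinding only moves an index, which is enough to show the clamping rule when num_tokens exceeds the number of tokens read so far.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for illustration only; not the ReFr StreamTokenizer.
class TokenCursor {
 public:
  explicit TokenCursor(std::vector<std::string> tokens)
      : tokens_(std::move(tokens)), next_(0) {}

  bool HasNext() const { return next_ < tokens_.size(); }

  // Precondition: HasNext() is true.
  std::string Next() { return tokens_[next_++]; }

  // Rewinds to the beginning; has no effect if nothing has been read.
  void Rewind() { next_ = 0; }

  // Rewinds by num_tokens. If num_tokens exceeds the number of tokens
  // read so far, this behaves exactly like the no-argument Rewind().
  void Rewind(std::size_t num_tokens) {
    if (num_tokens > next_) {
      Rewind();
    } else {
      next_ -= num_tokens;
    }
  }

  // A synonym for Rewind(1).
  void Putback() { Rewind(1); }

 private:
  std::vector<std::string> tokens_;
  std::size_t next_;  // index of the token that Next() will return
};
```

After reading three tokens, Putback() makes the third token available again, and Rewind(100) returns the cursor to the first token rather than failing.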
void set_reserved_words (set< string > &reserved_words) [inline]
Sets the set of “reserved words” used by this stream tokenizer.
Should be invoked just after construction time.
Definition at line 156 of file stream-tokenizer.H.
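As a rough illustration of why the reserved-word set needs to be in place before tokens are scanned, the sketch below classifies a word by set membership. It assumes that a word found in the set is reported as a reserved word and any other word as an identifier; this page does not spell out that rule, so treat it as an assumption. WordKind and ClassifyWord are hypothetical names, and the words used in any call are examples rather than the framework's actual reserved words.

```cpp
#include <set>
#include <string>

// Hypothetical types and helper for illustration only; not part of ReFr.
enum WordKind { KIND_RESERVED_WORD, KIND_IDENTIFIER };

// Assumed rule: membership in `reserved_words` decides the kind of a word.
WordKind ClassifyWord(const std::string &word,
                      const std::set<std::string> &reserved_words) {
  return reserved_words.count(word) > 0 ? KIND_RESERVED_WORD
                                        : KIND_IDENTIFIER;
}
```

Under this assumption, changing the set after some tokens have already been scanned could leave earlier and later tokens classified inconsistently, which is one plausible reason to call set_reserved_words right after construction.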
string str () [inline]
Returns the entire sequence of characters read so far by this stream tokenizer as a newly constructed string object.
Definition at line 167 of file stream-tokenizer.H.
size_t tellg () const [inline]
Returns the number of bytes read from the underlying byte stream just after scanning the most recent token, or 0 if this stream is just about to return the first token.
Definition at line 172 of file stream-tokenizer.H.
static const char * TypeName (TokenType token_type) [inline, static]
Returns a string type name for the specified TokenType constant.
Definition at line 102 of file stream-tokenizer.H.