Reranker Framework (ReFr)
Reranking framework for structure prediction and discriminative language modeling
reranker::FileBackedNgramFeatureExtractor Class Reference
A class that reads one line at a time from a backing file, tokenizes that line on whitespace, and extracts n-gram features from the resulting token stream.
#include <file-backed-ngram-feature-extractor.H>
Public Member Functions | |
FileBackedNgramFeatureExtractor () | |
Constructs an instance. | |
~FileBackedNgramFeatureExtractor () | |
Destroys this instance. | |
virtual void | RegisterInitializers (Initializers &initializers) |
Registers three variables that may be initialized when this object is constructed via Factory::CreateOrDie. | |
virtual void | Extract (Candidate &candidate, FeatureVector< int, double > &features) |
Does nothing. | |
virtual void | ExtractSymbolic (Candidate &candidate, FeatureVector< string, double > &symbolic_features) |
Extracts symbolic features based on the most recent line read from the file (or stream) that backs this feature extractor. | |
Public Member Functions inherited from reranker::BasicFileBackedFeatureExtractor | |
BasicFileBackedFeatureExtractor () | |
Constructs an instance. | |
virtual | ~BasicFileBackedFeatureExtractor () |
Destroys this instance. | |
Public Member Functions inherited from reranker::AbstractFileBackedFeatureExtractor | |
AbstractFileBackedFeatureExtractor () | |
Constructs an instance. | |
virtual | ~AbstractFileBackedFeatureExtractor () |
Destroys this instance. | |
virtual void | Init (const Environment *env, const string &arg) |
This method is guaranteed to be invoked by a Factory after it has invoked the RegisterInitializers method just after construction. | |
virtual void | Reset () |
Indicates to this instance that iteration over candidate sets on which features are being extracted has been reset. | |
virtual void | Extract (CandidateSet &candidate_set) |
Extracts features for all the candidates in the specified CandidateSet. | |
virtual long | line_number () const |
Returns the number of lines read so far by this feature extractor from the underlying stream. | |
Public Member Functions inherited from reranker::FeatureExtractor | |
FeatureExtractor () | |
Constructs an instance. | |
virtual | ~FeatureExtractor () |
Destroys this instance. | |
Public Member Functions inherited from reranker::FactoryConstructible | |
virtual | ~FactoryConstructible () |
Protected Attributes | |
int | n_ |
string | prefix_ |
NgramExtractor | ngram_extractor_ |
Protected Attributes inherited from reranker::AbstractFileBackedFeatureExtractor | |
string | filename_ |
The name of the file backing this feature extractor. | |
istream * | is_ |
The stream created from the file backing this feature extractor. | |
string | line_ |
The last line read by this feature extractor. | |
long | line_number_ |
The number of lines read so far by this feature extractor. | |
Tokenizer | tokenizer_ |
A simple whitespace tokenizer for use by concrete subclasses. | |
Additional Inherited Members | |
Protected Member Functions inherited from reranker::BasicFileBackedFeatureExtractor | |
virtual void | ExtractFeatureValuePair (const string &s, string &feature, double &value) |
Extracts a symbolic feature and value from the specified string. | |
Protected Member Functions inherited from reranker::AbstractFileBackedFeatureExtractor | |
virtual void | ReadFromStream () |
Reads from the stream. | |
A class that reads one line at a time from a backing file, tokenizes that line on whitespace, and extracts n-gram features from the resulting token stream.
Definition at line 50 of file file-backed-ngram-feature-extractor.H.
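The tokenize-then-extract idea described above can be illustrated with a minimal, self-contained sketch. It does not use the ReFr Tokenizer or NgramExtractor classes. The function names `TokenizeOnWhitespace` and `ExtractNgrams` are hypothetical, and joining tokens with `_` is an assumption; the library may format n-gram feature names differently.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the library's whitespace tokenizer:
// splits a line into tokens, skipping runs of whitespace.
std::vector<std::string> TokenizeOnWhitespace(const std::string &line) {
  std::istringstream in(line);
  std::vector<std::string> tokens;
  std::string token;
  while (in >> token) tokens.push_back(token);
  return tokens;
}

// Hypothetical helper: returns every contiguous run of n tokens.
// Tokens are joined with '_' (an assumed separator, not taken from ReFr).
std::vector<std::string> ExtractNgrams(const std::vector<std::string> &tokens,
                                       int n) {
  std::vector<std::string> ngrams;
  if (n <= 0 || tokens.size() < static_cast<std::size_t>(n)) return ngrams;
  const std::size_t order = static_cast<std::size_t>(n);
  for (std::size_t i = 0; i + order <= tokens.size(); ++i) {
    std::string ngram = tokens[i];
    for (std::size_t j = 1; j < order; ++j) ngram += "_" + tokens[i + j];
    ngrams.push_back(ngram);
  }
  return ngrams;
}
```

With n set to 2, a line of four tokens yields three bigrams; a line with fewer than n tokens yields none.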
FileBackedNgramFeatureExtractor () | inline |
Constructs an instance.
Definition at line 54 of file file-backed-ngram-feature-extractor.H.
~FileBackedNgramFeatureExtractor () | inline |
Destroys this instance.
Definition at line 57 of file file-backed-ngram-feature-extractor.H.
virtual void Extract (Candidate &candidate, FeatureVector< int, double > &features) | inlinevirtual |
Does nothing.
Reimplemented from reranker::BasicFileBackedFeatureExtractor.
Definition at line 102 of file file-backed-ngram-feature-extractor.H.
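virtual void ExtractSymbolic (Candidate &candidate, FeatureVector< string, double > &symbolic_features) | virtual |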
Extracts symbolic features based on the most recent line read from the file (or stream) that backs this feature extractor.
The line is tokenized based on whitespace, and each token is transformed into a feature-value pair via the ExtractFeatureValuePair method.
Reimplemented from reranker::BasicFileBackedFeatureExtractor.
Definition at line 43 of file file-backed-ngram-feature-extractor.C.
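The notion of "the most recent line read" relies on state inherited from AbstractFileBackedFeatureExtractor (is_, line_, line_number_). The sketch below is one way to picture that state, not the library's implementation. The class `LineReaderSketch` and its `ReadLine` method are hypothetical, and the choice to leave the stored line unchanged at end of stream is an assumption.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the inherited stream state; not ReFr code.
class LineReaderSketch {
 public:
  explicit LineReaderSketch(std::istream &is) : is_(is) {}

  // Reads the next line from the stream. Returns false at end of stream,
  // in which case the stored line and line count are left unchanged
  // (an assumed policy).
  bool ReadLine() {
    std::string next;
    if (!std::getline(is_, next)) return false;
    line_ = next;
    ++line_number_;
    return true;
  }

  // The last line read, analogous to line_.
  const std::string &line() const { return line_; }

  // The number of lines read so far, analogous to line_number().
  long line_number() const { return line_number_; }

 private:
  std::istream &is_;
  std::string line_;
  long line_number_ = 0;
};
```

A caller would invoke `ReadLine()` once and then derive features from `line()`, so that extraction always sees the line most recently consumed.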
virtual void RegisterInitializers (Initializers &initializers) | inlinevirtual |
Registers three variables that may be initialized when this object is constructed via Factory::CreateOrDie.
Variable name | Type | Required | Description | Default value |
---|---|---|---|---|
filename | string | Yes | The name of the file backing this feature extractor. | n/a |
n | int | Yes | The n-gram order for this feature extractor to extract. | n/a |
prefix | string | No | A prefix string to be prepended to each feature name produced. | n + "g_ng" where n is the string representation of the n-gram order |
Reimplemented from reranker::AbstractFileBackedFeatureExtractor.
Definition at line 94 of file file-backed-ngram-feature-extractor.H.
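The default value of prefix in the table above can be made concrete with a short sketch. The helpers `DefaultPrefix` and `PrefixedFeatureName` are hypothetical and not part of ReFr. Direct concatenation of the prefix and the feature name, with no separator in between, is an assumption.

```cpp
#include <string>

// Hypothetical helper: builds the documented default prefix, which is the
// string form of the n-gram order followed by "g_ng".
std::string DefaultPrefix(int n) { return std::to_string(n) + "g_ng"; }

// Hypothetical helper: prepends a prefix to a feature name. Plain
// concatenation with no separator is an assumption.
std::string PrefixedFeatureName(const std::string &prefix,
                                const std::string &name) {
  return prefix + name;
}
```

For a bigram extractor (n equal to 2), `DefaultPrefix(2)` yields `2g_ng`, which would then be prepended to every feature name unless a prefix is supplied explicitly.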
int n_ | protected |
Definition at line 114 of file file-backed-ngram-feature-extractor.H.
NgramExtractor ngram_extractor_ | protected |
Definition at line 116 of file file-backed-ngram-feature-extractor.H.
string prefix_ | protected |
Definition at line 115 of file file-backed-ngram-feature-extractor.H.