Reranker Framework (ReFr)
Reranking framework for structure prediction and discriminative language modeling
reranker::FileBackedNgramFeatureExtractor Class Reference

A class that reads one line at a time from a backing file, tokenizes that line on whitespace, and extracts n-gram features from the resulting token stream.

#include <file-backed-ngram-feature-extractor.H>

Inheritance diagram for reranker::FileBackedNgramFeatureExtractor:
reranker::BasicFileBackedFeatureExtractor reranker::AbstractFileBackedFeatureExtractor reranker::FeatureExtractor reranker::FactoryConstructible

Public Member Functions

 FileBackedNgramFeatureExtractor ()
 Constructs an instance.
 
 ~FileBackedNgramFeatureExtractor ()
 Destroys this instance.
 
virtual void RegisterInitializers (Initializers &initializers)
 Registers three variables that may be initialized when this object is constructed via Factory::CreateOrDie.
 
virtual void Extract (Candidate &candidate, FeatureVector< int, double > &features)
 Does nothing.
 
virtual void ExtractSymbolic (Candidate &candidate, FeatureVector< string, double > &symbolic_features)
 Extracts symbolic features based on the most recent line read from the file (or stream) that backs this feature extractor.
 
- Public Member Functions inherited from reranker::BasicFileBackedFeatureExtractor
 BasicFileBackedFeatureExtractor ()
 Constructs an instance.
 
virtual ~BasicFileBackedFeatureExtractor ()
 Destroys this instance.
 
- Public Member Functions inherited from reranker::AbstractFileBackedFeatureExtractor
 AbstractFileBackedFeatureExtractor ()
 Constructs an instance.
 
virtual ~AbstractFileBackedFeatureExtractor ()
 Destroys this instance.
 
virtual void Init (const Environment *env, const string &arg)
 This method is guaranteed to be invoked by a Factory after it has invoked the RegisterInitializers method, just after construction.
 
virtual void Reset ()
 Indicates to this instance that iteration over candidate sets on which features are being extracted has been reset.
 
virtual void Extract (CandidateSet &candidate_set)
 Extracts features for all the candidates in the specified CandidateSet.
 
virtual long line_number () const
 Returns the current number of lines read by this feature extractor from the underlying stream.
 
- Public Member Functions inherited from reranker::FeatureExtractor
 FeatureExtractor ()
 Constructs an instance.
 
virtual ~FeatureExtractor ()
 Destroys this instance.
 
- Public Member Functions inherited from reranker::FactoryConstructible
virtual ~FactoryConstructible ()
 

Protected Attributes

int n_
 
string prefix_
 
NgramExtractor ngram_extractor_
 
- Protected Attributes inherited from reranker::AbstractFileBackedFeatureExtractor
string filename_
 The name of the file backing this feature extractor.
 
istream * is_
 The stream created from the file backing this feature extractor.
 
string line_
 The last line read by this feature extractor.
 
long line_number_
 The number of lines read so far by this feature extractor.
 
Tokenizer tokenizer_
 A simple whitespace tokenizer for use by concrete subclasses.
 

Additional Inherited Members

- Protected Member Functions inherited from reranker::BasicFileBackedFeatureExtractor
virtual void ExtractFeatureValuePair (const string &s, string &feature, double &value)
 Extracts a symbolic feature and value from the specified string.
 
- Protected Member Functions inherited from reranker::AbstractFileBackedFeatureExtractor
virtual void ReadFromStream ()
 Reads from the stream.
 

Detailed Description

A class that reads one line at a time from a backing file, tokenizes that line on whitespace, and extracts n-gram features from the resulting token stream.

Definition at line 50 of file file-backed-ngram-feature-extractor.H.
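
The following is a minimal sketch of the pipeline described above (whitespace tokenization followed by n-gram extraction), written with the standard library only. It is not the ReFr implementation: the function names TokenizeOnWhitespace and NgramFeatureNames are hypothetical, and the feature-name layout (prefix, a colon, then the tokens joined by underscores, with no sentence-boundary padding) is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Splits a line on runs of whitespace, the way operator>> on a stream does.
// Hypothetical helper, not part of ReFr.
std::vector<std::string> TokenizeOnWhitespace(const std::string &line) {
  std::istringstream stream(line);
  std::vector<std::string> tokens;
  std::string token;
  while (stream >> token) {
    tokens.push_back(token);
  }
  return tokens;
}

// Builds one feature name per window of n consecutive tokens.
// Assumed naming scheme: prefix + ":" + tokens joined by '_'.
// Hypothetical helper, not part of ReFr.
std::vector<std::string> NgramFeatureNames(
    const std::vector<std::string> &tokens, int n, const std::string &prefix) {
  assert(n >= 1);
  const std::size_t order = static_cast<std::size_t>(n);
  std::vector<std::string> names;
  // A line with fewer than n tokens yields no n-grams.
  for (std::size_t start = 0; start + order <= tokens.size(); ++start) {
    std::string name = prefix + ":";
    for (std::size_t i = 0; i < order; ++i) {
      if (i > 0) name += '_';
      name += tokens[start + i];
    }
    names.push_back(name);
  }
  return names;
}
```

Under these assumptions, the line "the cat sat" with n = 2 and prefix "2g_ng" yields two feature names, one for each pair of adjacent tokens.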

Constructor & Destructor Documentation

reranker::FileBackedNgramFeatureExtractor::FileBackedNgramFeatureExtractor ( )
inline

Constructs an instance.

Definition at line 54 of file file-backed-ngram-feature-extractor.H.

reranker::FileBackedNgramFeatureExtractor::~FileBackedNgramFeatureExtractor ( )
inline

Destroys this instance.

Definition at line 57 of file file-backed-ngram-feature-extractor.H.

Member Function Documentation

virtual void reranker::FileBackedNgramFeatureExtractor::Extract ( Candidate &  candidate,
FeatureVector< int, double > &  features 
)
inline virtual

Does nothing.

Reimplemented from reranker::BasicFileBackedFeatureExtractor.

Definition at line 102 of file file-backed-ngram-feature-extractor.H.

void reranker::FileBackedNgramFeatureExtractor::ExtractSymbolic ( Candidate &  candidate,
FeatureVector< string, double > &  symbolic_features 
)
virtual

Extracts symbolic features based on the most recent line read from the file (or stream) that backs this feature extractor.

The line is tokenized based on whitespace, and each token is transformed into a feature-value pair via the ExtractFeatureValuePair method.

Reimplemented from reranker::BasicFileBackedFeatureExtractor.

Definition at line 43 of file file-backed-ngram-feature-extractor.C.
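
As a rough illustration of a stream-backed extractor that works from the most recently read line, the sketch below keeps a current line and a running line count, then adds one count per whitespace-delimited token to a string-keyed feature map. The class LineBackedCounter is hypothetical and is not a ReFr class; std::map<std::string, double> stands in for FeatureVector< string, double >, and the explicit ReadLine call is an assumption about when the backing stream is advanced.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical stand-in for a stream-backed extractor; not a ReFr class.
class LineBackedCounter {
 public:
  explicit LineBackedCounter(std::istream &is) : is_(is), line_number_(0) {}

  // Reads the next line from the backing stream.
  // Returns false once the stream is exhausted.
  bool ReadLine() {
    if (!std::getline(is_, line_)) return false;
    ++line_number_;
    return true;
  }

  // Adds 1.0 to the feature named by each whitespace-delimited token
  // of the most recently read line.
  void ExtractSymbolic(std::map<std::string, double> &features) const {
    std::istringstream tokens(line_);
    std::string token;
    while (tokens >> token) {
      features[token] += 1.0;
    }
  }

  // Number of lines successfully read so far.
  long line_number() const { return line_number_; }

 private:
  std::istream &is_;
  std::string line_;
  long line_number_;
};
```

A caller would construct the object over any std::istream (a std::ifstream for a file, or a std::istringstream in tests), call ReadLine once per candidate, and then call ExtractSymbolic to fill that candidate's features.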

virtual void reranker::FileBackedNgramFeatureExtractor::RegisterInitializers ( Initializers &  initializers)
inline virtual

Registers three variables that may be initialized when this object is constructed via Factory::CreateOrDie.

Variable name | Type   | Required | Description                                                    | Default value
filename      | string | Yes      | The name of the file backing this feature extractor.           | n/a
n             | int    | Yes      | The n-gram order for this feature extractor to extract.        | n/a
prefix        | string | No       | A prefix string to be prepended to each feature name produced. | n + "g_ng", where n is the string representation of the n-gram order

Reimplemented from reranker::AbstractFileBackedFeatureExtractor.

Definition at line 94 of file file-backed-ngram-feature-extractor.H.
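
The documented default for prefix can be sketched as follows. ResolvePrefix is a hypothetical helper, not a ReFr function, and treating an empty string as "no prefix supplied" is an assumption made for this example; the source only states that the default is n followed by "g_ng".

```cpp
#include <cassert>
#include <string>

// Returns the prefix to use for feature names: the caller's value when one
// was supplied, otherwise the documented default of n followed by "g_ng".
// Hypothetical helper; an empty user_prefix is assumed to mean "not supplied".
std::string ResolvePrefix(int n, const std::string &user_prefix) {
  assert(n >= 1);
  if (!user_prefix.empty()) return user_prefix;
  return std::to_string(n) + "g_ng";
}
```

For a bigram extractor (n = 2) with no prefix supplied, this produces "2g_ng".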

Member Data Documentation

int reranker::FileBackedNgramFeatureExtractor::n_
protected

Definition at line 114 of file file-backed-ngram-feature-extractor.H.

NgramExtractor reranker::FileBackedNgramFeatureExtractor::ngram_extractor_
protected

Definition at line 116 of file file-backed-ngram-feature-extractor.H.

string reranker::FileBackedNgramFeatureExtractor::prefix_
protected

Definition at line 115 of file file-backed-ngram-feature-extractor.H.


The documentation for this class was generated from the following files: