Reranker Framework (ReFr)
Reranking framework for structure prediction and discriminative language modeling
hadoop-run | |
hadooputil | |
HadoopInterface | A simple class interface for running hadoop commands |
pyutil | |
CommandIO | |
reranker | Provides reranking models for discriminative modeling, with some special handling for discriminative language models |
AbstractFileBackedFeatureExtractor | This class makes it easy for concrete subclasses to extract features based on input from a file |
BasicFileBackedFeatureExtractorConstructor | |
BasicFileBackedFeatureExtractor | A class that reads one line at a time from a backing file, tokenizes that line on whitespace, and interprets each token as a feature-value pair via the protected method ExtractFeatureValuePair (see that method's documentation for details on how tokens are interpreted) |
CandidateSetIterator | An interface specifying iteration over CandidateSet instances, using Java-style semantics (sorry, die-hard C++ iterator fans) |
CollectionCandidateSetIterator | An implementation of the CandidateSetIterator interface that is backed by an arbitrary C++ collection of pointers to CandidateSet instances, where the collection’s iterators implement the ForwardIterator concept |
MultiFileCandidateSetIterator | An implementation of the CandidateSetIterator interface that iterates over CandidateSet instances that have been serialized to CandidateSetMessage protocol buffer messages in multiple files |
CandidateSetProtoReader | A class to fill in a CandidateSet based on a CandidateSetMessage, crucially constructing new Candidate instances from each CandidateMessage inside the CandidateSetMessage |
CandidateSetProtoWriter | A class to construct a CandidateSetMessage from a CandidateSet instance |
CandidateSetReader | A class for reading streams of training or test instances, where each training or test instance is a reranker::CandidateSet object |
CandidateSetWriter | A class for writing streams of training or test instances, where each training or test instance is a reranker::CandidateSet object |
CandidateSet | A class to hold a set of candidates, either for training or test |
Scorer | An inner interface for a model to score a CandidateSet |
Candidate | A class to represent a candidate in a set of candidates that constitutes a training instance for a reranker |
Comparator | An inner interface specifying comparison between two Candidate instances |
DotProduct | This class defines a dot product kernel function for two vectors |
EnvironmentImpl | Provides a set of named variables and their types, as well as the values for those variables |
VarMapBase | A base class for a mapping from variables of a specific type to their values |
Environment | An interface for an environment in which variables of various types are mapped to their values |
ValueString | A template class that helps print values of any type that supports the ostream& insertion operator, as well as vectors of those values |
ValueString< string > | A specialization of the ValueString class to support printing of string values |
ValueString< bool > | A specialization of the ValueString class to support printing of boolean values |
ValueString< shared_ptr< T > > | A partial specialization of the ValueString class to support printing of shared_ptr instances, where we simply print the typeid name, followed by a colon character, followed by the pointer address |
ValueString< vector< T > > | A partial specialization of the ValueString class to support printing of vectors of values |
VarMap | A container to hold the mapping between named variables of a specific type and their values |
ExampleFeatureExtractor | This class illustrates how to write a FeatureExtractor implementation |
ExampleFeatureExtractorConstructor | |
ExecutiveFeatureExtractorImplConstructor | |
ExecutiveFeatureExtractor | This class is like a regular FeatureExtractor, but has been promoted to the executive level and thus wears fancypants |
ExecutiveFeatureExtractorImpl | |
TypeName | We use the templated class TypeName to be able to take an actual C++ type and get the type name string used by the Interpreter and Environment classes |
TypeName< bool > | A specialization so that an object of type bool converts to "bool" |
TypeName< int > | A specialization so that an object of type int converts to "int" |
TypeName< double > | A specialization so that an object of type double converts to "double" |
TypeName< string > | A specialization so that an object of type string converts to "string" |
TypeName< shared_ptr< T > > | A partial specialization so that an object of type shared_ptr<T>, where T is some Factory-constructible type, converts to the string produced by TypeName<T> |
TypeName< vector< T > > | A partial specialization so that an object of type vector<T> gets converted to the type name of T followed by the string "[]" |
MemberInitializer | An interface for initializers of members of a Factory-constructible object |
TypedMemberInitializer | |
Initializers | A container for all the member initializers for a particular Factory-constructible instance |
FactoryBase | An interface for all Factory instances, specifying a few pure virtual methods |
FactoryContainer | A class to hold all Factory instances that have been created |
Constructor | An interface with a single virtual method that constructs a concrete instance of the abstract type T |
FactoryConstructible | An interface to make it easier to implement Factory-constructible types by implementing both required methods to do nothing |
Factory | Factory for dynamically created instances of the specified type |
FeatureExtractor | An abstract base class/interface for all feature extractors |
FeatureVectorReader | A class to de-serialize FeatureVector instances from FeatureVecMessage instances |
FeatureVectorReader< FeatureVector< string, V > > | Partial specialization of the FeatureVectorReader class for feature vectors whose unique identifiers for features are string objects |
FeatureVectorWriter | A class to serialize FeatureVector instances to FeatureVecMessage instances |
FeatureVectorWriter< FeatureVector< string, V > > | Partial specialization of the FeatureVectorWriter class for feature vectors whose unique identifiers for features are string objects |
DelKey | |
DelKey< int > | |
DelKey< double > | |
DelKey< string > | |
UidGetter | A simple class that provides a layer of abstraction when retrieving objects to represent unique identifiers for features |
UidGetter< string > | A specialization for when feature uids are string objects, where StringCanonicalizer::Get is used to provide a canonical string instance |
FeatureVector | A class to represent a feature vector, where features are represented by unique identifiers, and feature values are represented by the template type |
FileBackedLossSetterConstructor | |
FileBackedLossSetter | A “feature extractor” that reads lines from a backing file, setting each candidate’s loss via its Candidate::set_loss method |
FileBackedNgramFeatureExtractorConstructor | |
FileBackedNgramFeatureExtractor | A class that reads one line at a time from a backing file, tokenizes that line on whitespace, and extracts n-gram features from the resulting token stream |
Interpreter | Provides an interpreter for assigning primitives and Factory-constructible objects to named variables, as well as vectors thereof |
KernelFunction | An interface specifying a kernel function for two FeatureVector instances |
DirectLossScoreComparatorConstructor | |
MiraStyleModelConstructor | |
DirectLossScoreComparator | A class to do “direct loss minimization” by considering the score of a candidate to be its raw score plus its loss insofar as candidate ordering is concerned |
MiraStyleModel | A subclass of PerceptronModel that differs only in the way that the ComputeStepSize method is implemented |
Reducer | Abstract base class for a streaming reducer |
FeatureReducer | A reducer class which processes FeatureMessage protocol buffer messages |
ModelInfoReducer | A reducer class which processes ModelMessage protocol messages and merges them into a single message |
SymbolReducer | A reducer class which processes SymbolMessage messages and merges the unique symbols among them into a single message |
ModelProtoReader | A class to de-serialize a Model instance from a ModelMessage instance |
ModelProtoWriter | A class to construct a ModelMessage from a Model instance |
EndOfEpochModelWriter | An end-of-epoch hook for writing out the best model so far to file after each epoch (if the best model changes from the last time it was written out) |
ModelReader | Knows how to create Model instances that have been serialized to a file |
DefaultScoreComparatorConstructor | |
DefaultGoldComparatorConstructor | |
DefaultCandidateSetScorerConstructor | |
RandomPairCandidateSetScorerConstructor | |
DefaultScoreComparator | The default comparator for comparing two Candidate instances based on their respective scores (i.e., the values returned by invoking their Candidate::score methods) |
DefaultGoldComparator | The default comparator for comparing two Candidate instances for being the “gold” candidate |
DefaultCandidateSetScorer | The default candidate set scorer, which scores each candidate using the Model::ScoreCandidate method, then sets the index of the best-scoring candidate by applying the Model::score_comparator and the index of the gold candidate by applying the Model::gold_comparator |
RandomPairCandidateSetScorer | This candidate set scorer picks two candidates at random from the set, scores them, and then identifies which has the higher score and which has the lower loss, effectively meaning that training proceeds as if those were the only two candidates |
Model | Model is an interface for reranking models |
Hook | An interface for specifying a hook to be run by a Model instance |
UpdatePredicate | An inner interface for a predicate that tests whether a Model needs to be updated based on the current training example |
Updater | An inner interface specifying an update function for a model |
NgramFeatureExtractorConstructor | |
NgramExtractor | Extracts n-gram features from an arbitrary vector of string tokens |
NgramFeatureExtractor | Extracts n-gram features for candidate hypotheses on the fly |
PerceptronModelProtoReaderConstructor | |
PerceptronModelProtoReader | A class to construct a PerceptronModel from a ModelMessage instance |
PerceptronModelProtoWriterConstructor | |
PerceptronModelProtoWriter | A class to construct a ModelMessage from a PerceptronModel instance |
PerceptronModelConstructor | |
PerceptronModelDefaultUpdatePredicateConstructor | |
PerceptronModelDefaultUpdaterConstructor | |
PerceptronModel | This class implements a perceptron model reranker |
DefaultUpdatePredicate | The default update predicate for perceptron and perceptron-style models, which triggers a model update whenever the top-scoring candidate hypothesis under the current model differs from the oracle or “gold” candidate hypothesis |
DefaultUpdater | The default update function for perceptron models |
RankFeatureExtractorConstructor | |
RankFeatureExtractor | Encodes the baseline rank as a boolean feature |
StreamInitializer | An interface that allows for a primitive, Factory-constructible object or vector thereof to be initialized based on the next token or tokens from a token stream |
Initializer | A class to initialize a Factory-constructible object |
Initializer< int > | A specialization to allow Factory-constructible objects to initialize int data members |
Initializer< double > | A specialization to initialize double data members |
Initializer< bool > | A specialization to initialize bool data members |
Initializer< string > | A specialization to initialize string data members |
Initializer< vector< T > > | A partial specialization to allow initialization of a vector of any primitive type or any Factory-constructible type |
StreamTokenizer | A simple class for tokenizing a stream of tokens for the formally specified language used to construct objects for the Reranker framework |
Token | Information about a token read from the underlying stream |
StringCanonicalizer | A class that stores a canonical version of string objects in a static data structure |
Symbols | An interface specifying a converter from symbols (strings) to int indices |
StaticSymbolTable | A converter from symbols (strings) to int indices |
LocalSymbolTable | A symbol table that stores the mapping from symbols to int indices and vice versa in local (non-static) data structures |
Tokenizer | A very simple tokenizer class |
Time | A simple class to hold the three notions of time during training: the current epoch, the current time index within the current epoch, and the absolute time index |
TrainingVectorSet | A class to hold the several feature vectors needed during training (especially for the perceptron family of algorithms), as well as for performing the updates to those feature vectors (again, with the perceptron family of algorithms in mind) |
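
The TypeName entries above describe a mapping from C++ types to type name strings: primitives map to their own names, vector<T> maps to the name of T followed by "[]", and shared_ptr<T> maps to the name of T. The following is a minimal sketch of that pattern only, not the framework's actual code. The names TypeNameSketch, Get, and ToyModel are hypothetical and chosen for illustration; the real TypeName interface may differ.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for reranker::TypeName. The primary template is
// declared but never defined, so an unsupported type fails to compile.
template <typename T>
struct TypeNameSketch;

// Explicit specializations for the primitive types listed in the index.
template <>
struct TypeNameSketch<bool> {
  static std::string Get() { return "bool"; }
};
template <>
struct TypeNameSketch<int> {
  static std::string Get() { return "int"; }
};
template <>
struct TypeNameSketch<double> {
  static std::string Get() { return "double"; }
};
template <>
struct TypeNameSketch<std::string> {
  static std::string Get() { return "string"; }
};

// Hypothetical Factory-constructible type, present only so the
// shared_ptr case below has something to point at.
struct ToyModel {};
template <>
struct TypeNameSketch<ToyModel> {
  static std::string Get() { return "ToyModel"; }
};

// Partial specialization: shared_ptr<T> takes the name of T.
template <typename T>
struct TypeNameSketch<std::shared_ptr<T>> {
  static std::string Get() { return TypeNameSketch<T>::Get(); }
};

// Partial specialization: vector<T> takes the name of T followed by "[]".
template <typename T>
struct TypeNameSketch<std::vector<T>> {
  static std::string Get() { return TypeNameSketch<T>::Get() + "[]"; }
};
```

Under these assumptions, TypeNameSketch<std::vector<double>>::Get() yields "double[]", and the two partial specializations compose, so a vector of shared_ptr<ToyModel> yields "ToyModel[]".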