I was wondering whether there is any documentation available for the package Moose-Algos-InformationRetrieval, or whether anyone has ever used it.
I don’t understand how MalLogLikelihoodRatio works (it’s a kind of similarity metric between two sets of terms).
What I have right now is a corpus of documents with their contents, and I’d like to compare a request against my corpus:
data := FileSystem workingDirectory allFiles
    select: [ :e | e extension = 'ph' ]
    thenCollect: #contents.
corpus := MalCorpus new.
'Initiating corpus...'
    displayProgressFrom: 1 to: data size during: [ :bar |
        data do: [ :e |
            corpus addDocument: e with: (MalTerms fromString: e contents).
            bar increment ].
        corpus removeStopwords ].
stringRequest := 'toto tata'.
request := MalTerms fromString: stringRequest.
likelihood := MalLogLikelihoodRatio new.
likelihood setTerms1: request.
requestAnswer := Dictionary keys: data values: (data collect: [ :e |
    likelihood setTerms2: (corpus atDocument: e); computeAll ]).
I don’t understand what the values that end up in the variable requestAnswer represent.
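For comparison, here is a minimal sketch of the textbook log-likelihood ratio (Dunning's G² statistic) for a single term across two bags of terms. I am assuming, without having confirmed it in the package source, that MalLogLikelihoodRatio computes something along these lines for each term. Nothing below uses the package: the names logL, llr, terms1 and terms2 are purely illustrative, and plain Bag instances stand in for MalTerms.

```smalltalk
| logL llr terms1 terms2 |
"Binomial log-likelihood of k hits in n trials with probability p.
 The 0 * ln(0) cases are treated as 0 to avoid taking the log of zero."
logL := [ :p :k :n |
    | hits misses |
    hits := k = 0 ifTrue: [ 0 ] ifFalse: [ k * p ln ].
    misses := n = k ifTrue: [ 0 ] ifFalse: [ (n - k) * (1 - p) ln ].
    hits + misses ].

"G² for one term: k1 occurrences among n1 terms versus k2 among n2.
 Assumes n1 and n2 are both greater than zero."
llr := [ :k1 :n1 :k2 :n2 |
    | p1 p2 p |
    p1 := k1 / n1.
    p2 := k2 / n2.
    p := (k1 + k2) / (n1 + n2).
    2 * ((logL value: p1 value: k1 value: n1)
        + (logL value: p2 value: k2 value: n2)
        - (logL value: p value: k1 value: n1)
        - (logL value: p value: k2 value: n2)) ].

"Hypothetical data: Bags stand in for the two sets of terms."
terms1 := Bag withAll: #('toto' 'tata' 'toto').
terms2 := Bag withAll: #('tata' 'titi' 'tata' 'tutu').

"Score of the term 'toto' between the two bags."
llr
    value: (terms1 occurrencesOf: 'toto')
    value: terms1 size
    value: (terms2 occurrencesOf: 'toto')
    value: terms2 size.
```

With this definition the score is close to zero when a term has the same relative frequency in both bags and grows as the frequencies diverge. If computeAll returns one such score per term, that would explain why I get a collection of values per document rather than a single similarity number, but that is a guess on my part.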