class documentation

class StandardStemmer(PorterStemmer, BaseStemmer):

All those Porter stemmer implementations look hideous; this class makes at least the stem method nicer.
Method stem No summary

Inherited from PorterStemmer:

Method __init__ No summary
Method cons cons(i) is TRUE <=> b[i] is a consonant.
Method cvc No summary
Method doublec doublec(j) is TRUE <=> j,(j-1) contain a double consonant.
Method ends ends(s) is TRUE <=> k0,...k ends with the string s.
Method m m() measures the number of consonant sequences between k0 and j. If c is a consonant sequence and v a vowel sequence, and <..> indicates arbitrary presence,
Method r r(s) is used further down.
Method setto setto(s) sets (j+1),...k to the characters in the string s, readjusting k.
Method step1ab step1ab() gets rid of plurals and -ed or -ing. e.g.
Method step1c step1c() turns terminal y to i when there is another vowel in the stem.
Method step2 step2() maps double suffixes to single ones, so -ization ( = -ize plus -ation) maps to -ize etc. Note that the string before the suffix must give m() > 0.
Method step3 step3() deals with -ic-, -ful, -ness etc., using a similar strategy to step2.
Method step4 step4() takes off -ant, -ence etc., in context <c>vcvc<v>.
Method step5 step5() removes a final -e if m() > 1, and changes -ll to -l if m() > 1.
Method vowelinstem vowelinstem() is TRUE <=> k0,...j contains a vowel
Instance Variable b Undocumented
Instance Variable j Undocumented
Instance Variable k Undocumented
Instance Variable k0 Undocumented
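
The cons() and m() summaries above can be illustrated with a small standalone sketch. It assumes Porter's usual convention that "y" counts as a consonant at the start of a word or after a vowel, and as a vowel after a consonant. The names is_consonant and measure are hypothetical helpers, not part of this class, and they operate on a whole string rather than on the b, k0 and j instance variables.

```python
VOWELS = "aeiou"


def is_consonant(word: str, i: int) -> bool:
    """Return True if word[i] is a consonant in Porter's sense (hypothetical helper)."""
    ch = word[i]
    if ch in VOWELS:
        return False
    if ch == "y":
        # 'y' is a consonant at the start of the word or after a vowel,
        # and a vowel when it follows a consonant.
        return i == 0 or not is_consonant(word, i - 1)
    return True


def measure(stem: str) -> int:
    """Count vowel-run-then-consonant transitions, i.e. m in [C](VC)^m[V]."""
    count = 0
    prev_was_vowel = False
    for i in range(len(stem)):
        consonant = is_consonant(stem, i)
        if consonant and prev_was_vowel:
            count += 1  # a vowel sequence has just been closed by a consonant
        prev_was_vowel = not consonant
    return count


if __name__ == "__main__":
    for w in ("tree", "trouble", "troubles"):
        print(w, measure(w))
```

Under these assumptions "tree" has measure 0, "trouble" has measure 1 and "troubles" has measure 2, which is the quantity the step2 to step5 conditions such as m() > 0 refer to.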
def stem(self, word):
In stem(p,i,j), p is a char pointer, and the string to be stemmed is from p[i] to p[j] inclusive. Typically i is zero and j is the offset to the last character of a string (p[j+1] == '\0'). The stemmer adjusts the characters p[i] ... p[j] and returns the new end-point of the string, k. Stemming never increases word length, so i <= k <= j. To turn the stemmer into a module, declare 'stem' as extern, and delete the remainder of this file. This description comes from the original C implementation; as the signature here shows, this method takes the whole word as a str and returns a str.
Parameters
word: str Undocumented
Returns
str Undocumented
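
The relationship between the index-based stem(p,i,j) description and the single-argument stem(word) signature can be sketched as a thin wrapper. This is a minimal illustration, not the actual implementation: ToyIndexStemmer and WordStemmer are hypothetical stand-ins, and the toy base class only strips one trailing "s" instead of running the Porter steps.

```python
class ToyIndexStemmer:
    """Hypothetical stand-in for a stemmer exposing the C-style stem(p, i, j) API."""

    def stem(self, p: str, i: int, j: int) -> str:
        # Toy rule only (NOT the Porter algorithm): drop a single trailing "s"
        # from the inclusive range p[i..j], leaving "-ss" endings alone.
        segment = p[i:j + 1]
        if len(segment) > 2 and segment.endswith("s") and not segment.endswith("ss"):
            segment = segment[:-1]
        return p[:i] + segment + p[j + 1:]


class WordStemmer(ToyIndexStemmer):
    """Hypothetical wrapper offering stem(word), in the spirit of StandardStemmer."""

    def stem(self, word: str) -> str:
        # Stem the whole word: i is 0 and j is the offset of the last character.
        return super().stem(word, 0, len(word) - 1)


if __name__ == "__main__":
    stemmer = WordStemmer()
    print(stemmer.stem("cats"))
    print(stemmer.stem("glass"))
```

The wrapper keeps the index arithmetic in one place, so callers pass a plain string and never deal with i and j themselves.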