Term Frequency - Inverse Document Frequency
Weighting scheme for term-document matrices
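- Term Frequency (TF): tf(t,d) = log(1 + count(t,d))
  - take the log in base 10
  - count(t,d): frequency of term t in document d
  - the 1 added inside the log keeps tf defined, and equal to 0, when count(t,d) = 0; it is similar in spirit to add-one (Laplace) smoothing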
 
- Inverse Document Frequency (IDF): idf(t) = log(N/df(t))
  - N: total number of documents, |D|
  - df(t): number of documents in which t occurs
 
- tf-idf(t,d) = tf(t,d) × idf(t)
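
The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation. The function names, the toy corpus, and whitespace tokenization are all made up for the example. Two further assumptions: idf also uses base 10 (the notes only state the base for tf), and idf returns 0 for a term that appears in no document, a case the formula leaves undefined.

```python
import math
from collections import Counter

def tf(term, doc_tokens):
    # tf(t,d) = log10(1 + count(t,d))
    return math.log10(1 + Counter(doc_tokens)[term])

def idf(term, corpus):
    # idf(t) = log10(N / df(t)); base 10 is an assumption here
    n_docs = len(corpus)
    df = sum(1 for doc in corpus if term in doc)
    if df == 0:
        return 0.0  # assumption: undefined case (df = 0) mapped to 0
    return math.log10(n_docs / df)

def tf_idf(term, doc_tokens, corpus):
    # tf-idf(t,d) = tf(t,d) * idf(t)
    return tf(term, doc_tokens) * idf(term, corpus)

# Hypothetical toy corpus, tokenized on whitespace
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat".split(),
    "the cat chased the dog".split(),
]

for term in ("the", "cat", "mat"):
    print(term, round(tf_idf(term, corpus[0], corpus), 4))
```

A term such as "the" that occurs in every document has idf = log(N/N) = 0, so its tf-idf weight is 0 no matter how often it appears in a given document. A term confined to one document, such as "mat", gets the largest idf.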