NLP Lunch Tutorial: Smoothing

Bill MacCartney
21 April 2005

Preface

Everything is from this great paper by Stanley F. Chen and Joshua Goodman (1998), "An Empirical Study of Smoothing Techniques for Language Modeling", which I read yesterday.

Everything is presented in the context of n-gram language models, but smoothing is needed in many problem contexts, and most of the smoothing methods we'll look at generalize without difficulty.

Plan

• Motivation: the problem, an example
• All the smoothing methods: formula after formula, intuitions for each
• So which one is the best? (answer: modified Kneser-Ney)
• Excel demo for absolute discounting and Good-Turing?

Probabilistic modeling

You have some kind of probabilistic model, which is a distribution $p(e)$ over an event space $E$. You want to estimate the parameters of your model distribution $p$ from data. In principle, you might like to use maximum likelihood (ML) estimates, so that your model is

$$p_{ML}(x) = \frac{c(x)}{\sum_{e} c(e)}$$

where $c(x)$ is the number of times event $x$ occurs in the data.

Problem: data sparsity

But you have insufficient data: there are many events $x$ such that $c(x) = 0$, so that the ML estimate is $p_{ML}(x) = 0$.
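To make the sparsity problem concrete, here is a minimal sketch of the ML estimate as relative frequency. The function name `ml_estimate` and the toy sentence are assumptions made for illustration; they do not come from the tutorial.

```python
from collections import Counter


def ml_estimate(events):
    """Relative-frequency (maximum likelihood) estimate: c(x) / sum_e c(e)."""
    counts = Counter(events)
    total = sum(counts.values())
    return {event: count / total for event, count in counts.items()}


# Toy data (hypothetical): unigram events from one short sentence.
tokens = "the cat sat on the mat".split()
p_ml = ml_estimate(tokens)

print(p_ml["the"])            # "the" accounts for 2 of the 6 tokens
print(p_ml.get("dog", 0.0))   # "dog" was never seen, so its ML estimate is zero
```

Any event missing from the training data gets probability zero, which is exactly what smoothing is meant to repair.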

... times in the training data to the n-grams that occur r times.
• In particular, reallocate the probability mass of n-grams that were seen once to the n-grams that were never seen.
• For each count r, we compute an adjusted count r*: $r^{*} = (r + 1)\,\frac{n_{r+1}}{n_r}$, where $n_r$ is the number of n-grams seen exactly r times.
• Then we have: p_GT(x : c(x ...
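The adjusted-count formula can be sketched in a few lines. The function name `good_turing_adjusted_counts`, the toy bigram counts, and the fallback to the raw count when $n_{r+1} = 0$ are all assumptions of this sketch, not something the tutorial specifies.

```python
from collections import Counter


def good_turing_adjusted_counts(ngram_counts):
    """Return (n, adjusted), where n[r] is the number of distinct n-grams seen
    exactly r times and adjusted[r] = (r + 1) * n[r + 1] / n[r]."""
    n = Counter(ngram_counts.values())
    adjusted = {}
    for r in sorted(n):
        if n.get(r + 1, 0) > 0:
            adjusted[r] = (r + 1) * n[r + 1] / n[r]
        else:
            # Assumption of this sketch: keep the raw count when n_{r+1} = 0,
            # because the formula would otherwise give an adjusted count of zero.
            adjusted[r] = r
    return n, adjusted


# Toy bigram counts (hypothetical data).
counts = {"a b": 3, "b c": 1, "c d": 1, "d e": 2, "e f": 1}
n, adjusted = good_turing_adjusted_counts(counts)

total = sum(counts.values())   # N, the total number of observed n-gram tokens
unseen_mass = n[1] / total     # mass moved from once-seen to never-seen n-grams

print(adjusted)
print(unseen_mass)
```

On this toy data the once-seen bigrams have their count discounted from 1 to 2/3, and the mass that frees up, $n_1 / N$, is what becomes available to the bigrams that were never seen.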
