Download as doc, pdf, or txt
Download as doc, pdf, or txt
You are on page 1of 1

N-Grams:

An n-gram is a subsequence of n items from a given sequence. The items in question can be phonemes, syllables, letters, words or base pairs according to the application. An n-gram of size 1 is referred to as a "unigram"; size 2 is a "bigram" (or, less commonly, a "digram"); size 3 is a "trigram"; and size 4 or more is simply called an "n-gram". Some language models built from n-grams are "(n 1)-order Markov models". An n-gram model is a type of probabilistic model for predicting the next item in such a sequence. n-gram models are used in various areas of statistical natural language processing and genetic sequence analysis. Here are examples of word level 3-grams and 4-grams (and counts of the number of times they appeared) from the Google n-gram corpus.[1] ceramics collectables collectibles (55) ceramics collectables fine (130) ceramics collected by (52) ceramics collectible pottery (50) ceramics collectibles cooking (45) 4-grams serve as the incoming (92) serve as the incubator (99) serve as the independent (794) serve as the index (223) serve as the indication (72) serve as the indicator (120)

You might also like