nlp.stanford.edu

Hi there! I'm a computational sociolinguist at UC Davis Linguistics, where I direct the Linguistic Mechanisms Lab. I recently moved from Northwestern ...

Introduction to Information Retrieval - Stanford University

... PDF files. We will not deal further with these issues in this book, and will assume henceforth that our documents are a list of characters ... (569 pages)

Deep Learning for Natural Language Processing

Deep Learning for Natural Language Processing. Christopher Manning, Stanford University. 1980s Natural Language Processing. VP →{ ... (36 pages)

Evaluation in information retrieval

In this chapter we begin with a discussion of measuring the effectiveness of IR systems (Section 8.1) and the test collections that are most often used for this ... (25 pages)

Tokenization

Given a character sequence and a defined document unit, tokenization is the task of chopping it up into pieces, called tokens, perhaps at the same time throwing away certain characters, such as punctuation.
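
As a rough illustration of the definition above, here is a minimal sketch of a tokenizer in Python. The function name `tokenize` and the choice of the regular expression `\w+` are assumptions made for this example, not the method the source prescribes; real tokenizers must also handle apostrophes, hyphens, and language-specific rules.

```python
import re

def tokenize(text: str) -> list[str]:
    """Chop a character sequence into tokens, throwing away punctuation."""
    # \w+ matches runs of letters, digits, and underscores;
    # everything else (spaces, commas, semicolons) is discarded.
    return re.findall(r"\w+", text)

tokens = tokenize("Friends, Romans, Countrymen, lend me your ears;")
print(tokens)
# ['Friends', 'Romans', 'Countrymen', 'lend', 'me', 'your', 'ears']
```

Note that this naive pattern splits a name such as "O'Neill" into two tokens, which is one reason practical tokenization is harder than it looks.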

12 Language models for information

In this chapter, we first introduce the concept of language models (Section 12.1) and then describe the basic and most commonly used language modeling ... (17 pages)

An example information retrieval problem

The list is then called a postings list (or inverted list), and all the postings lists taken together are referred to as the postings. The dictionary in Figure 1.3 has been ...

The term vocabulary and postings lists

Collect the documents to be indexed. · Tokenize the text. · Do linguistic preprocessing of tokens. · Index the documents that each term occurs in.
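
The four steps above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: `build_index` is a hypothetical helper, the two sample documents are made up for the example, the regex tokenizer is deliberately naive, and lowercasing stands in for the much richer linguistic preprocessing a real system would do.

```python
import re
from collections import defaultdict

def build_index(docs: dict[int, str]) -> dict[str, list[int]]:
    """Map each term to a sorted list of the IDs of documents containing it."""
    index: dict[str, set[int]] = defaultdict(set)
    for doc_id, text in docs.items():        # step 1: the collected documents
        tokens = re.findall(r"\w+", text)    # step 2: tokenize the text
        terms = [t.lower() for t in tokens]  # step 3: preprocessing (assumed: lowercasing only)
        for term in terms:                   # step 4: record which documents each term occurs in
            index[term].add(doc_id)
    # Store each term's document IDs as a sorted list.
    return {term: sorted(ids) for term, ids in index.items()}

docs = {
    1: "I did enact Julius Caesar",
    2: "So let it be with Caesar",
}
index = build_index(docs)
print(index["caesar"])  # [1, 2]
print(index["enact"])   # [1]
```

Using a set while building avoids duplicate document IDs when a term occurs several times in one document; sorting at the end gives each term an ordered list that is convenient to merge at query time.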

Boolean retrieval

Information retrieval (IR) is finding material (usually documents) of an unstructured nature (usually text) that satisfies an information need from within large ...
