All About Programming: Lucene 4 Essentials for Text Search and Indexing

Lucene Indexes Fields

Conceptually, Lucene provides indexing and search over documents, but implementation-wise, all indexing and search is carried out over fields. A document is a collection of fields. Each field has three parts: name, type, and value. At search time, the supplied field name restricts the search to particular fields.
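The name/type/value structure described above can be pictured with a small plain-Java model. This is only a sketch under stated assumptions: `FieldKind`, `SketchField`, and `SketchDocument` are hypothetical names invented for illustration, not Lucene's own classes, and the sample values are made up.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these are NOT Lucene classes, just the name/type/value idea.
enum FieldKind { TEXT, STRING, DATE }

// A field bundles the three parts named in the text: name, type, and value.
final class SketchField {
    final String name;
    final FieldKind kind;
    final String value;

    SketchField(String name, FieldKind kind, String value) {
        this.name = name;
        this.kind = kind;
        this.value = value;
    }
}

// A document is nothing more than a collection of fields.
final class SketchDocument {
    private final List<SketchField> fields = new ArrayList<>();

    // Returns this so that calls can be chained.
    SketchDocument add(String name, FieldKind kind, String value) {
        fields.add(new SketchField(name, kind, value));
        return this;
    }

    // This toy model allows a name to repeat, so a lookup returns every
    // matching value in insertion order.
    List<String> values(String name) {
        List<String> out = new ArrayList<>();
        for (SketchField f : fields) {
            if (f.name.equals(name)) {
                out.add(f.value);
            }
        }
        return out;
    }
}
```

With this model, `new SketchDocument().add("title", FieldKind.TEXT, "Gene expression")` builds a one-field document, and `values("title")` retrieves what was stored under that name.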

For example, a MEDLINE citation can be represented as a series of fields: one field for the title of the article, another for the name of the journal in which it was published, another for the authors of the article, a pub-date field for the date of publication, a field for the text of the article’s abstract, and another for the list of topic keywords drawn from Medical Subject Headings (MeSH). Each of these fields is given a different name. At search time, the client can specify that it is searching authors, titles, or both, and can restrict the results to a date range and a set of journals by constructing search terms for the appropriate fields and values.
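To make field-restricted search concrete, here is a minimal in-memory sketch in plain Java. It is not Lucene's API: `FieldedIndexSketch` and its methods are hypothetical, tokenization is a crude lowercase split on non-word characters, and date ranges are left out. The point it illustrates is that index entries are keyed by field name plus term, so the same word in the title and in the author field are different entries.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy fielded inverted index; hypothetical names, not Lucene's API.
final class FieldedIndexSketch {
    // Key is "field:term"; value is the set of ids of documents containing it.
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private int nextId = 0;

    // Indexes every field of one document and returns the id assigned to it.
    int addDocument(Map<String, String> fields) {
        int id = nextId++;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            // Crude tokenizer (an assumption): lowercase, split on non-word chars.
            for (String token : e.getValue().toLowerCase().split("\\W+")) {
                if (token.isEmpty()) {
                    continue;
                }
                postings.computeIfAbsent(e.getKey() + ":" + token,
                        k -> new TreeSet<>()).add(id);
            }
        }
        return id;
    }

    // Documents whose given field contains the term; other fields are ignored.
    Set<Integer> search(String field, String term) {
        Set<Integer> hits = postings.getOrDefault(
                field + ":" + term.toLowerCase(), Collections.<Integer>emptySet());
        return new TreeSet<>(hits);
    }

    // Documents where the term occurs in any of the listed fields,
    // e.g. "authors or titles or both".
    Set<Integer> searchEither(String term, String... fields) {
        Set<Integer> union = new TreeSet<>();
        for (String field : fields) {
            union.addAll(search(field, term));
        }
        return union;
    }
}
```

If one citation has "gene" in its title and another has an author named Gene, `search("title", "gene")` returns only the first, while `searchEither("gene", "title", "author")` returns both.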

Read full article from Lucene 4 Essentials for Text Search and Indexing | LingPipe Blog
