Searching made easy with Apache Lucene 4.3 | Java Code Geeks



  • Indexing – Indexing means adding a document to the Lucene index with the help of a class called “IndexWriter”.
  • Searching – Searching means retrieving documents from the Lucene index with the help of a class called “IndexSearcher”.
  • Document – A Lucene Document is a single unit of search and index, for example an item in a shopping cart. A Lucene index can contain millions of documents.
  • Fields – Fields are the properties of a document. In other words, if the document is an object, fields are its facets, for example the category of an item in a shopping cart. Each document can have multiple fields.
  • Queries – Lucene has its own query language. It allows us to search for documents based on multiple fields. We can assign a weight to a field and also use boolean operators such as AND and OR in the query. For example: return all items in the cart which belong to category garden or home, have color red, and have a price less than Rs.1000.
  • Analyzers – Before the text of a field is indexed, it needs to be converted into its most basic form. The text is first tokenized, and the tokens are then normalized: depending on the analyzer, they are converted to lowercase, stripped of punctuation, and reduced to a singular form. These tasks are performed by Analyzers. Analyzers are complicated and take some study to use well. The built-in analyzers often do not cover a given requirement, in which case we can create a new one. For this tutorial we will be using StandardAnalyzer, as it contains most of the basic features we require.
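
To make the concepts above concrete, here is a minimal sketch in plain Java of a toy in-memory inverted index. It is not Lucene's API and not how Lucene is implemented: the class `TinyIndex` and its methods `doc`, `analyze`, `add`, `search`, `and` and `or` are hypothetical names invented for this illustration. It only shows how documents with fields are analyzed, indexed, and then searched with a boolean combination of terms. The price condition from the query example is left out to keep the sketch short.

```java
import java.util.*;

/** Toy in-memory index. Illustrates the concepts only; this is NOT Lucene's API. */
class TinyIndex {
    // A "document" is a map of field name -> field text.
    private final List<Map<String, String>> docs = new ArrayList<>();
    // Inverted index: "field:term" -> ids of the documents containing that term.
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    /** Builds a document from alternating field names and values. */
    static Map<String, String> doc(String... fieldsAndValues) {
        Map<String, String> d = new LinkedHashMap<>();
        for (int i = 0; i + 1 < fieldsAndValues.length; i += 2) {
            d.put(fieldsAndValues[i], fieldsAndValues[i + 1]);
        }
        return d;
    }

    /** Minimal "analyzer": lowercase, then split on anything that is not a letter or digit. */
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Indexing: store the document and record every analyzed term of every field. */
    int add(Map<String, String> document) {
        int id = docs.size();
        docs.add(document);
        for (Map.Entry<String, String> field : document.entrySet()) {
            for (String term : analyze(field.getValue())) {
                postings.computeIfAbsent(field.getKey() + ":" + term, k -> new TreeSet<>()).add(id);
            }
        }
        return id;
    }

    /** Searching: ids of the documents whose field contains the given term. */
    Set<Integer> search(String field, String term) {
        List<String> tokens = analyze(term); // analyze the query the same way as the text
        if (tokens.isEmpty()) {
            return new TreeSet<>();
        }
        Set<Integer> hits = postings.get(field + ":" + tokens.get(0));
        return hits == null ? new TreeSet<>() : new TreeSet<>(hits);
    }

    /** Boolean AND: documents present in both result sets. */
    static Set<Integer> and(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }

    /** Boolean OR: documents present in either result set. */
    static Set<Integer> or(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.addAll(b);
        return result;
    }
}
```

With this sketch, the first part of the query example reads as `TinyIndex.and(TinyIndex.or(idx.search("category", "garden"), idx.search("category", "home")), idx.search("color", "red"))`. Because the query term goes through the same `analyze` step as the indexed text, a search for "HOME" matches a document indexed with "Home". In Lucene itself the corresponding work is done by IndexWriter, IndexSearcher, and an Analyzer such as StandardAnalyzer.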

  • Read full article from Searching made easy with Apache Lucene 4.3 | Java Code Geeks

