All About Programming: Implementing Grouping in Lucene 4.x

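// Imports required by this method (Lucene 4.x, core and grouping modules)
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.WildcardQuery;
import org.apache.lucene.search.grouping.GroupDocs;
import org.apache.lucene.search.grouping.GroupingSearch;
import org.apache.lucene.search.grouping.TopGroups;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.BytesRef;

public static List<HashMap<String, String>> testGroup(String indexPath, String groupField, String sumField) {
    List<HashMap<String, String>> map = new ArrayList<HashMap<String, String>>();
    Directory d1 = null;
    IndexReader read1 = null;
    try {
        d1 = FSDirectory.open(new File(indexPath)); // on-disk index
        read1 = DirectoryReader.open(d1); // open a reader on the index
        // MultiReader can read several indexes at once,
        // provided they all share the same field structure
        IndexSearcher sear = new IndexSearcher(new MultiReader(read1));
        GroupingSearch gSearch = new GroupingSearch(groupField); // group by groupField (e.g. place)
        Query q = new WildcardQuery(new Term(groupField, "*")); // match every document that has a value in groupField
        // groupOffset 0, no cap on the number of groups returned
        TopGroups<BytesRef> t = gSearch.search(sear, q, 0, Integer.MAX_VALUE);
        GroupDocs<BytesRef>[] g = t.groups; // one entry per distinct value
        System.out.println("Total hits: " + t.totalHitCount);
        System.out.println("Distinct values: " + g.length);
        for (int i = 0; i < g.length; i++) {
            ScoreDoc[] sd = g[i].scoreDocs;
            // read the group's value from its top document
            String str = sear.doc(sd[0].doc).get(groupField);
            // sumcount is a helper that is not shown in this excerpt
            int total = sumcount(str, groupField, sumField, sear);
            System.out.println("place:" + str + "===>" + "count:" + g[i].totalHits);
            HashMap<String, String> m = new HashMap<String, String>();
            m.put("word", str);
            m.put("wx_count", total + "");
            m.put("wx_total", "10000");
            map.add(m);
        }
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        // release resources even if the search fails
        try {
            if (read1 != null) read1.close();
            if (d1 != null) d1.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    return map;
}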
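The method returns every value as a string, so a caller has to parse wx_count before sorting or doing arithmetic on it. The following is a minimal sketch of consuming that result. The class GroupResultDemo, the helpers row and sortByCountDesc, and the sample rows are all made up for illustration and are not part of the original post.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;

class GroupResultDemo {

    // Hypothetical helper: builds one row shaped like the maps testGroup returns.
    static HashMap<String, String> row(String word, int count) {
        HashMap<String, String> m = new HashMap<String, String>();
        m.put("word", word);
        m.put("wx_count", String.valueOf(count));
        m.put("wx_total", "10000");
        return m;
    }

    // Returns a sorted copy of the rows, largest wx_count first.
    // The counts are parsed to int so that "12" sorts above "7".
    static List<HashMap<String, String>> sortByCountDesc(List<HashMap<String, String>> rows) {
        List<HashMap<String, String>> sorted = new ArrayList<HashMap<String, String>>(rows);
        Collections.sort(sorted, new Comparator<HashMap<String, String>>() {
            @Override
            public int compare(HashMap<String, String> a, HashMap<String, String> b) {
                int countA = Integer.parseInt(a.get("wx_count"));
                int countB = Integer.parseInt(b.get("wx_count"));
                return Integer.compare(countB, countA);
            }
        });
        return sorted;
    }

    public static void main(String[] args) {
        // Sample data only; a real caller would pass the list returned by testGroup.
        List<HashMap<String, String>> rows = new ArrayList<HashMap<String, String>>();
        rows.add(row("beijing", 3));
        rows.add(row("shanghai", 12));
        rows.add(row("hangzhou", 7));
        for (HashMap<String, String> r : sortByCountDesc(rows)) {
            System.out.println(r.get("word") + " => " + r.get("wx_count"));
        }
    }
}
```

Comparing the raw strings instead would order the rows lexicographically, which is why the comparator parses them first.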
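At this point we have a simple implementation of grouped, de-duplicated counting. For more complex business requirements, such as report queries or specific sum-style statistics, you will probably need to write the aggregation yourself.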
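To make "writing it yourself" more concrete, here is a plain-Java sketch of the aggregation such a report needs: for each distinct value of a group field, the number of documents and the sum of a numeric field. It runs over in-memory maps and does not use the Lucene API. The class GroupSumSketch, the methods doc and groupAndSum, and the sample field names are assumptions chosen for illustration. In a real application the rows would come from the documents matched by a query.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupSumSketch {

    // Hypothetical helper: builds a "document" from alternating field/value pairs.
    static Map<String, String> doc(String... fieldValuePairs) {
        Map<String, String> d = new HashMap<String, String>();
        for (int i = 0; i + 1 < fieldValuePairs.length; i += 2) {
            d.put(fieldValuePairs[i], fieldValuePairs[i + 1]);
        }
        return d;
    }

    // For each distinct value of groupField, returns a two-element array:
    // index 0 holds the document count, index 1 holds the sum of sumField.
    static Map<String, long[]> groupAndSum(List<Map<String, String>> docs,
                                           String groupField, String sumField) {
        Map<String, long[]> out = new LinkedHashMap<String, long[]>();
        for (Map<String, String> d : docs) {
            String key = d.get(groupField);
            if (key == null) {
                continue; // documents without the group field are skipped
            }
            long[] acc = out.get(key);
            if (acc == null) {
                acc = new long[2];
                out.put(key, acc);
            }
            acc[0]++;
            String value = d.get(sumField);
            if (value != null) {
                acc[1] += Long.parseLong(value);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample data only.
        List<Map<String, String>> docs = new ArrayList<Map<String, String>>();
        docs.add(doc("place", "beijing", "clicks", "5"));
        docs.add(doc("place", "shanghai", "clicks", "2"));
        docs.add(doc("place", "beijing", "clicks", "7"));
        for (Map.Entry<String, long[]> e : groupAndSum(docs, "place", "clicks").entrySet()) {
            System.out.println(e.getKey() + " count=" + e.getValue()[0] + " sum=" + e.getValue()[1]);
        }
    }
}
```

A LinkedHashMap keeps the groups in first-seen order, which makes the output stable for a report. A single pass also avoids issuing one extra query per group.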
