Month: December 2012

Examples of several Lucene Query objects

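package player.kent.chen.temp.lucene.miscquery;

import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class MyLuceneMiscQueryDemo {
    public static void main(String[] args) throws Exception {
        // Create the index writer
        Directory indexDir = new RAMDirectory();
        IndexWriter indexWriter = new IndexWriter(indexDir, new StandardAnalyzer(Version.LUCENE_30),
                IndexWriter.MaxFieldLength.UNLIMITED);

        String text1 = "adam";
        Document doc1 = new Document();
        doc1.add(new Field("content", text1, Field.Store.YES, Field.Index.ANALYZED));
        indexWriter.addDocument(doc1);

        String text2 = "brings";
        Document doc2 = new Document();
        …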


Lucene near-real-time search code example

package player.kent.chen.temp.lucene.nrts;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class MyNearRealTimeSearch {
    public static void main(String[] args) throws Exception {
        // Create the index writer
        Directory indexDir = new RAMDirectory();
        IndexWriter indexWriter = new IndexWriter(indexDir, new StandardAnalyzer(Version.LUCENE_30),
                IndexWriter.MaxFieldLength.UNLIMITED);

        // Index the first document, but do not commit()
        String …


[lucene] What does the default field in QueryParser mean?

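Straight to an example.

Assume an index already exists, built over text files, with two fields: the file name (fileName) and the file content (content).

Using content as the default field:

    QueryParser qp = new QueryParser(Version.LUCENE_30, "content", new StandardAnalyzer(Version.LUCENE_30));
    Query contentQuery = qp.parse("人");            // finds documents whose content contains "人"
    Query fileNameQuery = qp.parse("fileName:人");  // finds documents whose file name contains "人"

From this we can see:

1. A parser built with content as the default field can still search other fields.
2. If a term in the query string does not name a field, the parser uses content as the target field.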
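The two rules above can be pictured with a small stand-alone model. This is only a sketch of the fallback behaviour for a single-term query, not Lucene's actual parser: the class `DefaultFieldSketch` and its `resolve` method are hypothetical names made up for illustration, and the query term `guide` is a placeholder.

```java
// Simplified model of default-field resolution for single-term queries.
// Hypothetical illustration only; this is NOT how Lucene's QueryParser is implemented.
final class DefaultFieldSketch {
    private final String defaultField;

    DefaultFieldSketch(String defaultField) {
        this.defaultField = defaultField;
    }

    /** Returns {field, text} for a query such as "guide" or "fileName:guide". */
    String[] resolve(String queryText) {
        int colon = queryText.indexOf(':');
        if (colon > 0) {
            // An explicit "field:" prefix overrides the default field.
            return new String[] { queryText.substring(0, colon), queryText.substring(colon + 1) };
        }
        // No prefix: fall back to the default field chosen at construction time.
        return new String[] { defaultField, queryText };
    }

    public static void main(String[] args) {
        DefaultFieldSketch parser = new DefaultFieldSketch("content");
        for (String q : new String[] { "guide", "fileName:guide" }) {
            String[] r = parser.resolve(q);
            System.out.println(q + " -> field=" + r[0] + ", text=" + r[1]);
        }
    }
}
```

The real QueryParser handles much more (analysis, boolean operators, phrases, ranges); the sketch only shows where the default field enters the picture.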

nginx in front of Jetty/Tomcat

There is no need to change the Jetty/Tomcat configuration; only nginx.conf needs editing:

    http {
        server {
            listen 80;
            server_name www.xxx.com www2.xxx.com;

            location / {
                proxy_pass              http://localhost:8080;
                proxy_set_header        X-Real-IP $remote_addr;
                proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
                proxy_set_header        Host $http_host;
            }
        }
    }

With these settings, request.getServerName() and request.getServerPort() in a servlet return the same host and port that were typed into the browser.
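The X-Real-IP and X-Forwarded-For headers set above are what the backend can use to recover the client address, since the connection itself now comes from nginx. The following is a minimal sketch under stated assumptions: the class `ForwardedClientIp` and its `clientIp` method are hypothetical helpers, and in a servlet the three arguments would typically come from request.getHeader("X-Real-IP"), request.getHeader("X-Forwarded-For") and request.getRemoteAddr().

```java
// Hypothetical helper: picks the client address from the headers nginx sets in the config above.
final class ForwardedClientIp {
    private ForwardedClientIp() {}

    static String clientIp(String xRealIp, String xForwardedFor, String remoteAddr) {
        // nginx overwrites X-Real-IP with $remote_addr, i.e. the peer it actually saw.
        if (xRealIp != null && !xRealIp.trim().isEmpty()) {
            return xRealIp.trim();
        }
        // $proxy_add_x_forwarded_for appends $remote_addr to any incoming header,
        // so the last entry is the one nginx itself added.
        if (xForwardedFor != null) {
            String[] parts = xForwardedFor.split(",");
            if (parts.length > 0) {
                String last = parts[parts.length - 1].trim();
                if (!last.isEmpty()) {
                    return last;
                }
            }
        }
        // No proxy headers present: fall back to the socket's peer address.
        return remoteAddr;
    }

    public static void main(String[] args) {
        System.out.println(clientIp("203.0.113.7", "198.51.100.1, 203.0.113.7", "127.0.0.1"));
        System.out.println(clientIp(null, "198.51.100.1, 203.0.113.7", "127.0.0.1"));
        System.out.println(clientIp(null, null, "127.0.0.1"));
    }
}
```

Earlier entries in X-Forwarded-For can be supplied by the client, so the sketch deliberately trusts only the values nginx wrote itself.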

Starting and stopping nginx on Ubuntu

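On Ubuntu, nginx is installed as a service, so it is started and stopped like this:

    $ sudo service nginx start
    $ sudo service nginx stop

The configuration file is /etc/nginx/nginx.conf; to edit it:

    $ sudo vi /etc/nginx/nginx.conf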

Picturing the logical structure of a Lucene index

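Imagine:

Suppose a text consists of the following parts:

          title:  "Hadoop: The Definitive Guide"
        content:  "Hadoop got its start in Nutch"
    unbreakable:  "united kingdom"   (ignore what unbreakable means for now)
        ignored:  "Hadoop Nonsense"  (same note as above)

If the index is built with the statements below, roughly what will it look like?

    Document doc = …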


Simple Lucene indexer/searcher code example

For copy-and-paste only.

<!-- pom.xml -->
<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-core</artifactId>
    <version>3.0.0</version>
</dependency>

package player.kent.chen.temp.lucene;

import java.io.File;
import java.io.FileReader;
import java.io.IOException;

import org.apache.commons.io.FileUtils;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class MyLuceneIndexer {
    public static void main(String[] args) throws Exception {
        String rootDir = "/home/kent/diskD/home-kent-dev/workspace/kent-temp/data/lucene";
        File contentDir = new File(rootDir, "content");
        File indexDir …
