Lucene3.0 Study Notes (2)

2011-03-01  Source: original to this site  Category: Internet

Today I did a simple exercise with some of the features of Lucene 3.0. It builds an index in two ways: (1) index and search a single txt document; (2) index and search all the txt files in a given folder.
Two points are worth sharing:
1. When adding a field to the index, a field created with new Field("***", Reader reader) is indexed but not stored, so doc.get("***") cannot return its contents. To get the content back from a search hit, write a method that reads the file into a String and add that String as a stored field instead.
2. When indexing all the txt files in a folder, build a separate Document object for each file and add that file's fields to its own Document. Otherwise the search results are wrong (I do not know the reason; guidance from anyone more experienced is welcome). A likely explanation is that with one shared Document every file's fields belong to the same hit, and doc.get returns only the first value of a field that was added several times. When the index is viewed with the Luke tool, the contents are the same whether or not each file gets its own Document, but the order differs.
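The first point needs nothing from Lucene itself, so it can be sketched with the standard library alone. The class name ReaderText and the method readAll below are hypothetical names chosen for illustration, not part of the original code or of Lucene. The sketch drains any Reader into a String, which could then be passed to a stored Field as described above.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper (not a Lucene class): drains a Reader into a String
// so that the text can be added to the index as a stored field.
public class ReaderText {

    public static String readAll(Reader reader) throws IOException {
        StringBuilder content = new StringBuilder();
        char[] buffer = new char[4096];
        try {
            int n;
            // read() returns -1 once the end of the stream is reached
            while ((n = reader.read(buffer)) != -1) {
                content.append(buffer, 0, n);
            }
        } finally {
            reader.close(); // release the underlying resource in every case
        }
        return content.toString();
    }

    public static void main(String[] args) throws IOException {
        String text = readAll(new StringReader("first line\nsecond line\n"));
        System.out.print(text);
    }
}
```

Reading in character blocks keeps the original line endings, whereas a readLine() loop rewrites each line ending as "\n". Either approach is adequate for indexing plain text.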



package test3;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Field.Index;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriter.MaxFieldLength;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.LockObtainFailedException;
import org.apache.lucene.util.Version;
import org.junit.Test;

public class IndexTxt {

    private IndexWriter write = null;
    private IndexSearcher search = null;

    // Path of the single text file to index and search
    private String dataPath = "E:\\testlucene\\test\\test.txt";
    // Folder whose files are all indexed and searched
    private String dataPath1 = "E:\\testlucene\\test";
    // Directory where the index is stored
    private String indexPath = "E:\\testlucene\\fileIndex";

    private Directory indexDir = null;
    private Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_30);

    public IndexTxt() throws IOException {
        File file = new File(indexPath);
        indexDir = FSDirectory.open(file); // open the index directory
    }
    @Test
    public void createIndex() throws CorruptIndexException, LockObtainFailedException, IOException {
        // First turn the file to be indexed into a Document object
        Document doc = new Document();
        File dataFile = new File(dataPath);

        // Add the fields
        doc.add(new Field("name", dataFile.getName(), Store.YES, Index.ANALYZED));

        // doc.add(new Field("content", reader)); // not stored, so doc.get("content") cannot return the content; use the next line instead
        doc.add(new Field("content", filecontent(dataFile), Store.YES, Index.ANALYZED));

        // Write the index (create = true overwrites any existing index)
        write = new IndexWriter(indexDir, analyzer, true, MaxFieldLength.LIMITED);
        write.addDocument(doc);
        write.close();
    }
    /*
     * Index all the files in a given folder
     */
    @Test
    public void createIndex1() throws IOException {
        File folder = new File(dataPath1);
        if (folder.isDirectory()) {
            // Names of the files and directories inside the folder
            String[] files = folder.list();
            write = new IndexWriter(indexDir, analyzer, true, MaxFieldLength.LIMITED);
            try {
                for (int i = 0; i < files.length; i++) {
                    // Build a File from the parent folder and the child name
                    File file = new File(folder, files[i]);
                    if (!file.isFile()) {
                        continue; // skip subdirectories, which cannot be read as text
                    }
                    // One Document per file
                    Document doc = new Document();
                    doc.add(new Field("name", file.getName(), Store.YES, Index.ANALYZED));
                    doc.add(new Field("content", filecontent(file), Store.YES, Index.ANALYZED));
                    write.addDocument(doc);
                }
            } finally {
                // Write the index and release the write lock, even if a file fails to read
                write.close();
            }
        } else {
            System.out.println("----- folder.isDirectory(): false.");
        }
    }
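    // Read the whole file into a String so that the content can be stored in the index
    private String filecontent(File file) throws IOException {
        StringBuilder content = new StringBuilder();
        // Decodes the bytes with the platform default charset
        BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream(file)));
        try {
            for (String line = null; (line = reader.readLine()) != null;) {
                content.append(line).append("\n");
            }
        } finally {
            reader.close(); // also closes the underlying file stream
        }
        return content.toString();
    }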
    @Test
    public void createSearch() throws CorruptIndexException, IOException, ParseException {
        // Open a searcher on the index directory
        search = new IndexSearcher(indexDir);
        try {
            // Keyword -> Query object
            String key = "game";
            QueryParser parse = new QueryParser(Version.LUCENE_30, "content", analyzer);
            Query query = parse.parse(key); // parse the search keyword into a Query object

            TopDocs hits = search.search(query, 100); // holds the top matching records, at most 100
            int total = hits.totalHits; // number of documents that match the query, not the number of keyword occurrences
            if (total == 0) {
                System.out.println("no such a file");
            } else {
                for (int i = 0; i < hits.scoreDocs.length; i++) { // hits.scoreDocs holds the top hits for the query
                    ScoreDoc scoreDoc = hits.scoreDocs[i]; // one matching record
                    Document doc = search.doc(scoreDoc.doc); // the stored fields of that document
                    System.out.println(doc.get("name"));
                    System.out.println(doc.get("content"));
                    System.out.println("----------------");
                }
            }
        } finally {
            search.close(); // release the index files held by the searcher
        }
    }
}
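
The notes talk about indexing txt files, while folder.list() returns every entry in the folder, subdirectories included, in no guaranteed order. The following is a minimal sketch, using only the standard library, of one way to pick out just the regular .txt files in a stable order before building one Document per file. The class name TxtFiles and the method listTxtFiles are hypothetical, introduced here for illustration only.

```java
import java.io.File;
import java.io.FileFilter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not from the original code): lists the regular
// files ending in ".txt" directly inside a folder, sorted by pathname.
public class TxtFiles {

    public static List<File> listTxtFiles(File folder) {
        File[] found = folder.listFiles(new FileFilter() {
            public boolean accept(File f) {
                // isFile() rejects subdirectories, even ones named like "x.txt"
                return f.isFile() && f.getName().toLowerCase().endsWith(".txt");
            }
        });
        if (found == null) {
            // listFiles returns null when the path is not a directory or cannot be read
            return new ArrayList<File>();
        }
        Arrays.sort(found); // File is Comparable, so this sorts by pathname
        return new ArrayList<File>(Arrays.asList(found));
    }

    public static void main(String[] args) {
        File folder = new File(args.length > 0 ? args[0] : ".");
        for (File f : listTxtFiles(folder)) {
            System.out.println(f.getName());
        }
    }
}
```

Each File in the returned list would then get its own Document, as in createIndex1. Sorting only makes the indexing order repeatable; it is not claimed to explain the ordering difference seen in Luke.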
