[Quote] Oracle full-text search research (part 2)

2011-06-23  Source: original article on this site  Category: Database  Views: 81

3.2 Filter properties

The filter converts data in various file formats into plain text. The other components in the indexing pipeline can only process plain text and do not recognize formats such as Microsoft Word or Excel. The filter types include CHARSET_FILTER, INSO_FILTER (named AUTO_FILTER from Oracle 10g onward), NULL_FILTER, USER_FILTER, and PROCEDURE_FILTER. (A filter can also convert a text document into the format used by the database.)

3.2.1 CHARSET_FILTER

Converts documents from a non-database character set to the database character set. (Original text: Use the CHARSET_FILTER to convert documents from a non-database character set to the character set used by the database.)

Example:

-- fmt holds the document format, cset the character set of each file
create table hdocs (
  id   number primary key,
  fmt  varchar2(10),
  cset varchar2(20),
  text varchar2(80)
);

-- Create a CHARSET_FILTER preference with UTF8 as the default source character set
begin
  ctx_ddl.create_preference('cs_filter', 'CHARSET_FILTER');
  ctx_ddl.set_attribute('cs_filter', 'charset', 'UTF8');
end;
/

insert into hdocs values (1, 'text', 'WE8ISO8859P1', '/docs/iso.txt');
insert into hdocs values (2, 'text', 'UTF8', '/docs/utf8.txt');
commit;

-- The charset column overrides the preference's charset for each row
create index hdocsx on hdocs (text) indextype is ctxsys.context
  parameters ('datastore ctxsys.file_datastore
               filter cs_filter
               format column fmt
               charset column cset');

3.2.2 NULL_FILTER

The default filter for text columns; it performs no filtering.

Oracle does not recommend using AUTO_FILTER for HTML, XML, and plain-text documents. For these, Oracle recommends NULL_FILTER together with an appropriate section group type.

-- Create an index that uses the null filter
create index myindex on docs (htmlfile) indextype is ctxsys.context
  parameters ('filter ctxsys.null_filter section group ctxsys.html_section_group');

The default filter depends on the type of the indexed column and on the datastore. For data stored in the database in VARCHAR2, CHAR, and CLOB columns, Oracle automatically selects NULL_FILTER. If the datastore is set to FILE_DATASTORE, Oracle chooses AUTO_FILTER as the default.
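To see which filter an existing index actually ended up with, the data dictionary view CTX_USER_INDEX_OBJECTS can be queried. The following is a minimal sketch, assuming the index is named MYINDEX as in the example above and that it belongs to the current user:

```sql
-- List the objects (datastore, filter, lexer, ...) used by one index.
-- 'MYINDEX' is the index name from the example above; adjust as needed.
select ixo_class, ixo_object
  from ctx_user_index_objects
 where ixo_index_name = 'MYINDEX';
```

The row whose IXO_CLASS is FILTER shows the filter in effect, whether it was chosen explicitly or by default.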

3.2.3 AUTO_FILTER

A universal filter suitable for most document formats, including PDF and MS Word. The filter also automatically recognizes plain-text, HTML, XHTML, SGML, and XML documents.

-- The docs column holds file names relative to the datastore path
create table my_filter (id number, docs varchar2(1000));

insert into my_filter values (1, 'Expert Oracle Database Architecture.pdf');
insert into my_filter values (2, '1.txt');
insert into my_filter values (3, '2.doc');
commit;

-- Create the file datastore preference (named test_filter here) and set its path
begin
  ctx_ddl.create_preference('test_filter', 'file_datastore');
  ctx_ddl.set_attribute('test_filter', 'path', '/opt/tmp');
end;
/

-- Indexing errors are logged in this view; query it after the index is created
select * from ctx_user_index_errors;

-- Create the index with AUTO_FILTER
create index idx_m_filter on my_filter (docs) indextype is ctxsys.context
  parameters ('datastore test_filter filter ctxsys.auto_filter');

select * from my_filter where contains (docs, 'oracle') > 0;

AUTO_FILTER identifies most formats automatically, but the document type can also be specified explicitly through a format column whose value is TEXT, BINARY, or IGNORE. Documents marked BINARY are filtered by AUTO_FILTER, documents marked TEXT are left unfiltered (as with NULL_FILTER), and documents marked IGNORE are not indexed.

-- Drop the hdocs table from section 3.2.1 first if it still exists
create table hdocs (id number primary key, fmt varchar2(10), text varchar2(80));

insert into hdocs values (1, 'binary', '/docs/myword.doc');
insert into hdocs values (2, 'text', '/docs/index.html');
insert into hdocs values (3, 'ignore', '/docs/1.txt');
commit;

-- The fmt column tells AUTO_FILTER how to treat each document
create index hdocsx on hdocs (text) indextype is ctxsys.context
  parameters ('datastore ctxsys.file_datastore filter ctxsys.auto_filter format column fmt');

3.2.4 MAIL_FILTER

MAIL_FILTER converts RFC-822 and RFC-2045 messages into indexable text.

Restrictions:

Documents must be US-ASCII.

Lines must not be longer than 1024 bytes.

Documents must be syntactically valid with regard to RFC-822.
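A MAIL_FILTER preference is created like the other filter preferences. The following is a minimal sketch, not taken from the original article: the preference name my_mail_filter, the table mail_docs, and its column msg are hypothetical, and the INDEX_FIELDS value is only one possible choice of header fields to keep.

```sql
-- Hypothetical table holding one raw RFC-822 message per row
create table mail_docs (id number primary key, msg clob);

begin
  -- my_mail_filter is an illustrative preference name
  ctx_ddl.create_preference('my_mail_filter', 'MAIL_FILTER');
  -- Colon-separated list of header fields to preserve in the filtered output
  ctx_ddl.set_attribute('my_mail_filter', 'INDEX_FIELDS', 'FROM:TO:SUBJECT');
end;
/

create index mail_docs_idx on mail_docs (msg) indextype is ctxsys.context
  parameters ('filter my_mail_filter');
```

Header fields not listed in INDEX_FIELDS are expected to be dropped from the indexed text; check the Oracle Text Reference for the behavior in your version.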

3.2.5 USER_FILTER

Use the USER_FILTER type to specify an external filter for filtering documents in a column
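As a rough sketch of how such an external filter is wired in, assuming a hypothetical executable named upcase.pl has been placed in the directory Oracle Text expects for filter programs (typically $ORACLE_HOME/ctx/bin) and accepts an input file name and an output file name as its two arguments:

```sql
begin
  -- my_user_filter and upcase.pl are illustrative names, not from the article
  ctx_ddl.create_preference('my_user_filter', 'USER_FILTER');
  -- COMMAND names the executable only; no path and no arguments
  ctx_ddl.set_attribute('my_user_filter', 'COMMAND', 'upcase.pl');
end;
/

-- Reuses the docs table and htmlfile column from the NULL_FILTER example
create index user_filter_idx on docs (htmlfile) indextype is ctxsys.context
  parameters ('filter my_user_filter');
```

Oracle Text runs the executable once per document, so the program should be fast and must write plain text (or HTML/XML) to the output file.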

3.2.6 PROCEDURE_FILTER

Use the PROCEDURE_FILTER type to filter your documents with a stored procedure. The stored procedure is called each time a document needs to be filtered.
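A minimal sketch of a procedure filter follows. The procedure my_upper_filter and the preference my_proc_filter are hypothetical, and the transformation (upper-casing) is only a placeholder for real format conversion. VARCHAR2 is used for both input and output to keep the example short, which limits documents to the PL/SQL VARCHAR2 size. Depending on the Oracle version, the procedure may need to live in a particular schema and be executable by the index owner, so treat the ownership here as an assumption.

```sql
-- Hypothetical filter procedure: receives the document text, returns filtered text
create or replace procedure my_upper_filter (
  input  in            varchar2,
  output in out nocopy varchar2
) is
begin
  -- Placeholder transformation; a real filter would convert the document to plain text
  output := upper(input);
end;
/

begin
  ctx_ddl.create_preference('my_proc_filter', 'PROCEDURE_FILTER');
  ctx_ddl.set_attribute('my_proc_filter', 'PROCEDURE', 'my_upper_filter');
  -- Tell Oracle Text which parameter types the procedure expects
  ctx_ddl.set_attribute('my_proc_filter', 'INPUT_TYPE', 'VARCHAR2');
  ctx_ddl.set_attribute('my_proc_filter', 'OUTPUT_TYPE', 'VARCHAR2');
end;
/
```

The preference is then referenced in the index parameters in the same way as the other filters, for example parameters ('filter my_proc_filter').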

3.2.7 Reference Script

-- Create an index that uses the null filter
create index myindex on docs (htmlfile) indextype is ctxsys.context
  parameters ('filter ctxsys.null_filter section group ctxsys.html_section_group');

-- Create an index that uses AUTO_FILTER
create index idx_m_filter on my_filter (docs) indextype is ctxsys.context
  parameters ('datastore test_filter filter ctxsys.auto_filter');

-- Filter error log view: CTX_USER_INDEX_ERRORS
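When a document cannot be filtered, the failing row and the error message can be read from the error view. A small sketch, assuming the index idx_m_filter from the AUTO_FILTER example:

```sql
-- Most recent filter/indexing errors for one index.
-- ERR_TEXTKEY holds the ROWID of the document that failed.
select err_timestamp, err_textkey, err_text
  from ctx_user_index_errors
 where err_index_name = 'IDX_M_FILTER'
 order by err_timestamp desc;
```

Joining ERR_TEXTKEY back to the base table's ROWID identifies which file or row caused the error.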
