SQL Server index structure and its use

2011-06-01  Source: original to this site  Category: Database  Popularity: 85

I. Understanding index structure in plain terms

In fact, you can think of an index as a special kind of directory. Microsoft SQL Server provides two kinds of index: the clustered index (also called the clustering index) and the non-clustered index (also called the non-clustering index). The following example illustrates the difference between the two:
The body of a Chinese dictionary is itself a clustered index. Suppose we want to look up the character for "safety". We naturally turn to the first few pages of the dictionary, because its pinyin is "an" and a dictionary sorted by pinyin begins with the letter "a" and ends with "z", so this character sits near the front. If you have gone through every entry beginning with "a" and still have not found it, the dictionary does not contain the character. Similarly, to look up the character "Zhang", you turn to the last part of the dictionary, because its pinyin is "zhang". In other words, the body of the dictionary is itself a directory, and you do not need to consult any other directory to find what you are looking for. Body content that is itself arranged as a directory according to a fixed rule is what we call a "clustered index".
If you know a character, you can find it quickly by its pronunciation. But you may also meet a character you do not recognize and cannot pronounce. Then the method above does not work, and you have to find the character under its radical and turn directly to the page number listed beside it. The order of characters in the radical catalog and the character lookup table is not the order of the dictionary body. Take "Zhang" again: in the lookup table under its radical, "Zhang" is on page 672; the character listed above it is "Chi", on page 63, and the character listed below it is "crossbow", on page 390. Clearly these characters do not really sit directly above and below "Zhang" in the body. The consecutive "Chi, Zhang, crossbow" you see is their order in the non-clustered index, which maps index entries to the characters in the dictionary body. You can find the character you need this way, but it takes two steps: first find the entry in the directory, then turn to the page it gives. A structure in which the directory is purely a directory and the body is purely the body is what we call a "non-clustered index".
From the example above we can understand what a "clustered index" and a "non-clustered index" are. Taking this one step further, it is easy to see why each table can have only one clustered index: the body can be sorted in only one way.
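As a minimal sketch, the dictionary analogy might map onto T-SQL roughly as follows. The table dbo.DictionaryDemo, its columns, and the index names are hypothetical, invented here for illustration; they are not part of the original article.

```sql
-- Hypothetical demo table: one row per dictionary entry.
CREATE TABLE dbo.DictionaryDemo (
    pinyin     varchar(20)   NOT NULL,  -- pronunciation, e.g. 'an', 'zhang'
    radical    nvarchar(4)   NOT NULL,  -- radical used by the lookup table
    definition nvarchar(400) NULL
);

-- The clustered index decides the order in which the rows themselves are kept,
-- like the pinyin-sorted body of the dictionary.
CREATE CLUSTERED INDEX IX_DictionaryDemo_pinyin
    ON dbo.DictionaryDemo (pinyin);

-- A non-clustered index is a separate structure that points back to the rows,
-- like the radical lookup table at the front of the dictionary.
CREATE NONCLUSTERED INDEX IX_DictionaryDemo_radical
    ON dbo.DictionaryDemo (radical);

-- A second clustered index on the same table is rejected by SQL Server,
-- because the rows can be kept in only one order:
-- CREATE CLUSTERED INDEX IX_DictionaryDemo_radical2 ON dbo.DictionaryDemo (radical);
```

A lookup by pinyin can go straight to the rows, while a lookup by radical first finds the index entry and then follows it to the row, which is the two-step process described above.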

II. When to use a clustered index or a non-clustered index

The following table summarizes when to use a clustered index and when to use a non-clustered index (very important):

Action description                     Clustered index    Non-clustered index
Columns often grouped or sorted        Should             Should
Returns a range of data                Should             Should not
One or very few different values       Should not         Should not
Small number of different values       Should             Should not
Large number of different values       Should not         Should
Frequently updated columns             Should not         Should
Foreign key columns                    Should             Should
Primary key columns                    Should             Should
Frequently modified index columns      Should not         Should

We can understand this table through the earlier definitions of the clustered and non-clustered index. Take "returns a range of data" as an example. Suppose one of your tables has a date column and you have built the clustered index on it. A query for all data between January 1, 2004 and October 1, 2004 will then be very fast, because the body of your "dictionary" is sorted by date: the clustered index only needs to locate the first and last rows of the range. With a non-clustered index, by contrast, you must first look up the page of each row in the directory and then fetch the content from that page.
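A range query of this kind might look like the sketch below. It borrows the table Tgongwen and the date column fariqi used in the article's later tests; the half-open upper bound is a choice made here so that all of October 1 is included, not something the original specifies.

```sql
-- With the clustered index on fariqi, the rows of this date range are stored
-- together, so SQL Server can seek to the first row and read forward to the last.
SELECT gid, fariqi, title
FROM Tgongwen
WHERE fariqi >= '20040101'
  AND fariqi <  '20041002';  -- half-open bound: includes every time on October 1
```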

III. Misconceptions about index use, drawn from practice

The purpose of theory is application. Although we have just listed when to use a clustered index and when to use a non-clustered index, in practice these rules are easily overlooked, or cannot be weighed properly against the actual situation. Below we discuss misconceptions about index use, based on problems met in practice, to help you master how to build indexes.

1. The primary key is the clustered index
I think this idea is badly mistaken and wastes the clustered index, even though SQL Server by default creates the clustered index on the primary key.
Typically, we give every table an ID column to tell rows apart; the column increments automatically, usually in steps of 1. The Gid column in our office automation example is such a column. If we make this column the primary key, SQL Server makes it the clustered index by default. The advantage is that your data is physically sorted by ID in the database, but I think this is of little use.
The advantages of the clustered index are obvious, and the rule that each table can have only one makes the clustered index all the more precious.
As the earlier definition of the clustered index shows, its greatest advantage is that it quickly narrows the query range to match the query conditions and avoids a full table scan. In practice, because the ID is generated automatically, we do not know the ID of each record, so we rarely query by ID. Making the ID primary key the clustered index therefore wastes a resource. Moreover, building the clustered index on a field whose values are all different does not follow the rule that "a clustered index should not be built on a column with a large number of different values". Of course, this only has a negative effect when users frequently modify record contents, especially the indexed items; it does not affect query speed.
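One way to keep the ID as the primary key without spending the clustered index on it is sketched below. Only the table and column names come from the article; the data types, lengths, and constraint and index names are assumptions made for illustration.

```sql
-- Assumed shape of the article's table; only the names are taken from the text.
CREATE TABLE dbo.Tgongwen (
    gid         int IDENTITY(1,1) NOT NULL,  -- auto-incrementing ID
    fariqi      datetime          NOT NULL,  -- document date
    neibuyonghu varchar(70)       NOT NULL,  -- internal user name
    reader      varchar(900)      NULL,
    title       varchar(200)      NULL,
    -- Keep gid as the primary key, but do not let it claim the clustered index.
    CONSTRAINT PK_Tgongwen PRIMARY KEY NONCLUSTERED (gid)
);

-- Spend the table's single clustered index on the column used for range queries.
CREATE CLUSTERED INDEX IX_Tgongwen_fariqi
    ON dbo.Tgongwen (fariqi);
```

Without the NONCLUSTERED keyword, the primary key constraint would take the clustered index by default, which is the behavior the author is arguing against.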
In an office automation system, whether the home page is showing the documents and meetings a user needs to sign for, or the user is searching for documents, the query can never do without two fields: the "date" and the user's own "user name".
Typically, the home page of an office automation system shows each user the documents or meetings they have not yet signed for. We could restrict the where clause to items the current user has not signed for, but if your system has been running for a long time and holds a large amount of data, scanning the whole table every time a user opens the home page makes little sense: the vast majority of users have already read the documents from more than a month ago, and the scan only adds overhead to the database. Instead, when a user opens the home page, we can query only the documents from the last 3 months that the user has not read, using the "date" field to limit the table scan and speed up the query. If your office automation system has been running for 2 years, then in theory your home page will display 8 times faster than before, or even more.
I say "in theory" because if your clustered index is blindly built on the ID primary key, your query will not be that fast, even if you build an index (a non-clustered one) on the "date" field. Let us look at the speed of each case with 10 million rows of data (250,000 of them within the last 3 months):

(1) Clustered index on the primary key only, with no time restriction:

select gid,fariqi,neibuyonghu,title from Tgongwen

Time: 128,470 ms (about 128 seconds)

(2) Clustered index on the primary key, non-clustered index on fariqi:

select gid,fariqi,neibuyonghu,title from Tgongwen
where fariqi> dateadd(day,-90,getdate())

Time: 53,763 ms (about 54 seconds)

(3) Clustered index on the date column (fariqi):

select gid,fariqi,neibuyonghu,title from Tgongwen
where fariqi> dateadd(day,-90,getdate())

Time: 2,423 ms (about 2 seconds)

Although each statement extracts 250,000 rows, the differences between the cases are huge, especially when the clustered index is built on the date column. In fact, if your database really holds 10 million rows and the clustered index is on the ID primary key, as in cases 1 and 2 above, the result on a web page is a timeout: the page simply cannot be displayed. This is one of the most important reasons I gave up using the ID column as the clustered index. The timings above were obtained by placing the following before each select statement:

declare @d datetime
set @d=getdate()

and adding the following after the select statement:

select [Statement execution time (ms)]=datediff(ms,@d,getdate())
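Put together, the timing harness for test (3) might look like the sketch below. The second half uses SET STATISTICS TIME, a built-in alternative that the original author does not use; it is shown here only for comparison.

```sql
-- Record the start time, run the query, then report the elapsed milliseconds.
DECLARE @d datetime;
SET @d = GETDATE();

SELECT gid, fariqi, neibuyonghu, title
FROM Tgongwen
WHERE fariqi > DATEADD(day, -90, GETDATE());

SELECT [Statement execution time (ms)] = DATEDIFF(ms, @d, GETDATE());

-- Alternative: let SQL Server report parse, compile and execution times itself.
SET STATISTICS TIME ON;

SELECT gid, fariqi, neibuyonghu, title
FROM Tgongwen
WHERE fariqi > DATEADD(day, -90, GETDATE());

SET STATISTICS TIME OFF;
```

Both approaches measure elapsed time as seen by the server, so the figures include the cost of returning 250,000 rows to the client and will vary with hardware and caching.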

2. Building any index significantly speeds up queries
In fact, in the example above, statements 2 and 3 are identical and the indexed field is the same. The only difference is that the former has a non-clustered index on fariqi while the latter has a clustered index on it, yet the query speeds are worlds apart. So simply building an index on any field does not necessarily speed up queries.
From the table creation statement we can see that the fariqi field of this 10-million-row table holds 5,003 different values. A clustered index on this field is a very good fit. In reality we issue several documents every day, and these documents share the same issue date, which matches the rule for building a clustered index exactly: "the values must not be mostly identical, but they must not be almost all different either". Seen this way, building an "appropriate" clustered index is very important for improving query speed.
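A quick way to check how many different values a candidate column holds is sketched below. The query is an illustration added here, not the author's method; it assumes the table Tgongwen and column fariqi from the tests above.

```sql
-- Compare the number of rows with the number of different dates.
SELECT COUNT(*)               AS total_rows,
       COUNT(DISTINCT fariqi) AS distinct_dates,
       -- Average rows per date; NULLIF avoids dividing by zero on an empty table.
       COUNT(*) * 1.0 / NULLIF(COUNT(DISTINCT fariqi), 0) AS avg_rows_per_date
FROM Tgongwen;
```

By the author's rule, a column whose distinct count is close to 1 or close to the total row count is a poor candidate for the clustered index, while a column in between, such as fariqi here, is a good one.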

3. Adding every field that needs faster queries to the clustered index speeds up queries
As mentioned above, the fields a query can never do without are the "date" and the user's own "user name". Since both fields are so important, we can combine them into a composite index (compound index).
Many people think that adding any field to the clustered index speeds up queries. Others are puzzled: if the fields of a composite clustered index are queried separately, will the query slow down? With this question in mind, let us look at the following query speeds (each result set is 250,000 rows; the date column fariqi is the leading column of the composite clustered index, and the user name neibuyonghu comes after it):

(1) select gid,fariqi,neibuyonghu,title from Tgongwen where fariqi>'2004-5-5'

Query speed: 2,513 ms

(2) select gid,fariqi,neibuyonghu,title from Tgongwen
            where fariqi>'2004-5-5' and neibuyonghu='The Office of'

Query speed: 2,516 ms

(3) select gid,fariqi,neibuyonghu,title from Tgongwen where neibuyonghu='The Office of'

Query speed: 60,280 ms

From the tests above we can see that a query using only the leading column of the clustered index runs at almost the same speed as a query using all columns of the composite clustered index, and is even slightly faster (when the result sets are the same size). If the query uses only a non-leading column of the composite clustered index, the index is of no use at all. Of course, statements 1 and 2 run at the same speed because they return the same number of rows. If a query uses all columns of the composite index and returns few rows, the result is an "index covering", and performance can be optimal. Also remember: whether or not you often use the other columns of the clustered index, its leading column must be the most frequently used column.
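The composite index from these tests, and one possible remedy for the slow query in test (3), can be sketched as follows. The index names are hypothetical, and the second index is a suggestion added here rather than something the author tested; INCLUDE requires SQL Server 2005 or later.

```sql
-- Assumes Tgongwen has no clustered index yet (a table can have only one).
-- Composite clustered index with the date as the leading column, as in the tests.
CREATE CLUSTERED INDEX IX_Tgongwen_fariqi_neibuyonghu
    ON Tgongwen (fariqi, neibuyonghu);

-- A query that filters only on neibuyonghu cannot seek on the index above,
-- because neibuyonghu is not its leading column. A separate non-clustered index
-- led by neibuyonghu can serve it; the INCLUDE list carries the other selected
-- columns so that the index covers the query in test (3).
CREATE NONCLUSTERED INDEX IX_Tgongwen_neibuyonghu
    ON Tgongwen (neibuyonghu)
    INCLUDE (gid, fariqi, title);
```

Each extra index adds work on every insert and update, so whether the second index pays for itself depends on how often the table is queried by user name alone.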

IV. Lessons on index use that other books do not cover

1. A clustered index is faster than a primary key that is not the clustered index
Example statements (each extracts 250,000 rows):

select gid,fariqi,neibuyonghu,reader,title from Tgongwen where fariqi='2004-9-16'

Time: 3,326 ms

select gid,fariqi,neibuyonghu,reader,title from Tgongwen where gid<=250000

Time: 4,470 ms

Here, the clustered index is nearly 1/4 faster than the primary key that is not the clustered index.

2. With order by, the clustered index is faster than an ordinary primary key, especially on small amounts of data

select gid,fariqi,neibuyonghu,reader,title from Tgongwen order by fariqi

Time: 12,936 ms

select gid,fariqi,neibuyonghu,reader,title from Tgongwen order by gid

Time: 18,843 ms

Here, order by on the clustered index is about 3/10 faster than on the ordinary primary key. In fact, with a small amount of data, sorting on the clustered index column is very noticeably faster than sorting on a non-clustered index column; with a very large amount of data, such as 10 million rows or more, the speed difference between the two is not obvious.

3. When a time range on the clustered index is queried, search time falls in proportion to the share of the table retrieved, no matter how many conditions use the clustered index:

select gid,fariqi,neibuyonghu,reader,title from Tgongwen where fariqi>'2004-1-1'

Time: 6,343 ms (1 million rows extracted)

select gid,fariqi,neibuyonghu,reader,title from Tgongwen where fariqi>'2004-6-6'

Time: 3,170 ms (500,000 rows extracted)

select gid,fariqi,neibuyonghu,reader,title from Tgongwen where fariqi='2004-9-16'

Time: 3,326 ms (in line with the previous statement: when the number of rows retrieved is the same, "greater than" and "equals" take the same time)

select gid,fariqi,neibuyonghu,reader,title from Tgongwen
            where fariqi>'2004-1-1' and fariqi<'2004-6-6'

Time: 3,280 ms

4. A date column does not slow queries down because it records minutes and seconds
In the following example there are 1 million rows in total. The 500,000 rows after January 1, 2004 contain only two different dates, accurate to the day; the 500,000 rows before it contain 5,000 different dates, accurate to the second.

select gid,fariqi,neibuyonghu,reader,title from Tgongwen
          where fariqi>'2004-1-1' order by fariqi

Time: 6,390 ms

select gid,fariqi,neibuyonghu,reader,title from Tgongwen
            where fariqi<'2004-1-1' order by fariqi

Time: 6,453 ms

V. Other Considerations

"Water can carry a boat, and it can also capsize it", and the same is true of indexes. Indexes help improve retrieval performance, but too many indexes, or improper ones, make the system inefficient, because every index a user adds to a table means more work for the database. Too many indexes can even lead to index fragmentation.
So we should build an "appropriate" index system, and take particular care to get the clustered index right, so that the database can deliver high performance.
Of course, in practice, as a conscientious database administrator you still need to test several schemes and find out which is the most efficient and effective.
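To see whether the indexes on a table have become fragmented, a check along the following lines can help. This is a sketch for SQL Server 2005 and later, which may be newer than the version the article was written for; the schema name dbo and the index name in the last statement are assumptions.

```sql
-- List the indexes on the table with their fragmentation and size in pages.
SELECT i.name AS index_name,
       s.index_type_desc,
       s.avg_fragmentation_in_percent,
       s.page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.Tgongwen'), NULL, NULL, 'LIMITED') AS s
JOIN sys.indexes AS i
  ON i.object_id = s.object_id
 AND i.index_id  = s.index_id
ORDER BY s.avg_fragmentation_in_percent DESC;

-- Defragment one index in place; the index name here is hypothetical.
ALTER INDEX IX_Tgongwen_fariqi ON dbo.Tgongwen REORGANIZE;
```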
