Indexes: advantages and disadvantages, how to create them, and their characteristics

2010-04-16  Source: original to this site  Category: Database  Views: 319

Why create an index? Because an index can greatly improve system performance. First, a unique index guarantees that every row in a database table is unique. Second, an index can greatly speed up data retrieval, which is the main reason to create one. Third, it can speed up joins between tables, which is especially meaningful for enforcing referential integrity. Fourth, when queries use grouping and sorting clauses, an index can likewise significantly reduce the time spent grouping and sorting. Fifth, with indexes in place, optimizer hints can be used during query processing to improve system performance.
Someone may ask: if indexes have so many advantages, why not create one on every column of a table? The idea is reasonable but one-sided. Indexes have many advantages, yet indexing every column is unwise, because indexes also have costs. First, creating and maintaining indexes takes time, and that time grows with the amount of data. Second, indexes occupy physical space: besides the space used by the data table itself, each index needs its own space, and a clustered index needs even more. Third, whenever rows in the table are inserted, deleted, or modified, the indexes must be maintained dynamically, which slows down data modification.
An index is built on one or more columns of a database table. So when creating indexes, consider carefully which columns should be indexed and which should not. In general, create indexes on the following columns: columns that are searched frequently, to speed up searches; primary key columns, to enforce uniqueness and organize the arrangement of data in the table; columns often used in joins, which are mostly foreign keys, to speed up joins; columns often searched by range, because the index is sorted and the specified range is contiguous; columns that often need sorting, because the index is already sorted and queries can use that order to cut sorting time; and columns often used in WHERE clauses, to speed up evaluation of the conditions.
Likewise, some columns should not be indexed. In general, columns with the following characteristics should not be indexed. First, columns that are rarely used or referenced in queries. Because they are rarely used, an index on them does not speed up queries; on the contrary, it slows down maintenance and increases space requirements. Second, columns with only a few distinct values. Such a column, for example the sex column of a personnel table, has so few values that the rows in a query's result set make up a large proportion of the rows in the table, so a large proportion of the table has to be read anyway. An index does not noticeably speed up retrieval. Third, columns defined with the text, image, or bit data types, because these columns either hold very large amounts of data or have very few values. Fourth, when modification performance matters far more than retrieval performance. The two pull in opposite directions: adding indexes improves retrieval performance but reduces modification performance, and removing indexes improves modification performance but reduces retrieval performance. So when modification performance matters far more than retrieval performance, do not create the index.
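The point about columns with only a few values can be made concrete with a rough selectivity figure. The sketch below is only an illustration in Python; the `selectivity` helper and the sample columns are hypothetical and not part of SQL Server.

```python
def selectivity(values):
    """Fraction of distinct values in a column: 1.0 means every row is unique."""
    if not values:
        return 0.0
    return len(set(values)) / len(values)

# Hypothetical personnel table with 1,000 rows.
sex_column = ["F", "M"] * 500
employee_id_column = list(range(1000))

sex_selectivity = selectivity(sex_column)          # 2 distinct values / 1000 rows
id_selectivity = selectivity(employee_id_column)   # 1000 distinct values / 1000 rows

# An equality search on the sex column matches half of the table, so going
# through an index saves little compared with reading every row.
rows_matched_by_sex = sex_column.count("F")
print(sex_selectivity, id_selectivity, rows_matched_by_sex)
```

A value near 1.0 suggests a column where an index can narrow a search sharply; a value near 0 suggests the opposite.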
Methods of creating an index and characteristics of indexes

Methods of creating an index
There are several ways to create an index, both direct and indirect. Creating an index directly means using the CREATE INDEX statement or the index creation wizard. Creating one indirectly means defining a primary key constraint or unique key constraint on a table, which also creates an index. Both methods create an index, but what they create differs in detail.
Using the CREATE INDEX statement or the wizard is the most basic way to create an index, and also the most flexible: you can tailor the index to your needs. This method offers many options, such as the fill level of data pages, sorting, and statistics collection, which help optimize the index. You can also specify the index type, uniqueness, and whether it is composite; that is, you can create either a clustered or a non-clustered index, on a single column or on two or more columns.
Defining a primary key constraint or unique key constraint also creates an index indirectly. A primary key constraint is a logical mechanism for maintaining data integrity: it prevents two records in the table from having the same primary key. When a primary key constraint is created, the system automatically creates a unique clustered index by default. Logically, the primary key constraint is an important structure, but physically, what corresponds to it is a unique clustered index. In other words, in the physical implementation there is no primary key constraint, only a unique clustered index. Similarly, creating a unique key constraint also creates an index, in this case a unique non-clustered index. So when indexes are created through constraints, their type and characteristics are largely predetermined, leaving little room for customization.
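Constraints creating indexes as a side effect is easy to observe in any engine that behaves this way. The sketch below uses SQLite through Python's `sqlite3` module purely as a stand-in for SQL Server. The `personnel` table is hypothetical, and SQLite has no clustered or non-clustered distinction, so only the indirect creation itself is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: no CREATE INDEX statement is issued anywhere.
conn.execute(
    "CREATE TABLE personnel ("
    " badge TEXT PRIMARY KEY,"
    " email TEXT UNIQUE,"
    " name TEXT)"
)

# PRAGMA index_list reports (seq, name, unique, origin, partial) per index.
# origin is 'pk' for a primary key, 'u' for a UNIQUE constraint,
# and 'c' for an index made with CREATE INDEX.
indexes = conn.execute("PRAGMA index_list('personnel')").fetchall()
for _seq, name, unique, origin, _partial in indexes:
    print(name, "unique" if unique else "non-unique", "origin:", origin)
```

Both constraints show up as unique indexes even though the table definition never mentions an index.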
If a table already has a standard index created with the CREATE INDEX statement when a primary key or unique key constraint is defined on it, the index created by the constraint overrides the previously created standard index. In other words, indexes created by primary key or unique key constraints take priority over indexes created with the CREATE INDEX statement.
Characteristics of indexes

Indexes have two characteristics: an index can be unique, and it can be composite.
A unique index guarantees that all data in the indexed column is unique and contains no duplicates. If a table has a primary key constraint or unique key constraint, SQL Server automatically creates a unique index when the table is created or altered. However, if uniqueness must be guaranteed, create a primary key constraint or unique key constraint rather than a unique index. When creating a unique index, keep these rules in mind: when a primary key or unique key constraint is created on a table, SQL Server automatically creates a unique index; if the table already contains data, SQL Server checks the existing data for duplicates when the index is created; whenever data is inserted with an INSERT statement or modified with an UPDATE statement, SQL Server checks for duplicates, and if a duplicate value is found, it cancels the statement and returns an error message; make sure every row in the table has a unique value, so that every entity can be uniquely identified; create unique indexes only on columns that can guarantee entity integrity, for example, do not create a unique index on the name column of a personnel table, because people can share a name.
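The duplicate checks described above can be tried out in miniature. This sketch again uses SQLite via Python's `sqlite3` as a stand-in for SQL Server; the table, index names, and sample rows are hypothetical, and the error text is SQLite's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE personnel (badge TEXT, name TEXT)")
conn.execute("CREATE UNIQUE INDEX ux_personnel_badge ON personnel (badge)")

conn.execute("INSERT INTO personnel VALUES ('A001', 'Li Wei')")
conn.execute("INSERT INTO personnel VALUES ('A002', 'Li Wei')")  # same name is allowed

# A duplicate key value: the statement is cancelled and an error is raised.
duplicate_rejected = False
try:
    conn.execute("INSERT INTO personnel VALUES ('A001', 'Zhang Min')")
except sqlite3.IntegrityError as exc:
    duplicate_rejected = True
    print("insert cancelled:", exc)

# Existing data is checked too: names already repeat, so a unique index
# on the name column cannot be created.
index_refused = False
try:
    conn.execute("CREATE UNIQUE INDEX ux_personnel_name ON personnel (name)")
except sqlite3.DatabaseError as exc:
    index_refused = True
    print("index not created:", exc)

row_count = conn.execute("SELECT COUNT(*) FROM personnel").fetchone()[0]
```

The table still holds only the two original rows afterwards, which is the behavior the rules above describe.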

A composite index is an index created on two or more columns. When searches use two or more columns together as a key, it is best to create a composite index on those columns. When creating a composite index, keep these rules in mind: at most 16 columns can be combined into one composite index, and the total length of those columns cannot exceed 900 bytes, so the combined columns cannot be too long; all columns in a composite index must come from the same table, and a composite index cannot span tables; column order matters a great deal, so arrange the columns carefully, and in principle define the most unique column first, for example, an index on (COL1, COL2) is not the same as an index on (COL2, COL1), because the column order differs; for the query optimizer to use a composite index, the WHERE clause of the query must reference the first column of the index; composite indexes are very useful when a table has multiple key columns; composite indexes can improve query performance and reduce the number of indexes created on a table.
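The rule about the first column of a composite index can be checked against a real query planner. The sketch below uses SQLite via Python's `sqlite3` as a stand-in; the `orders` table and index name are hypothetical, and the plan text is SQLite's, not SQL Server's SHOWPLAN output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (col1 INTEGER, col2 INTEGER, payload TEXT)")
conn.execute("CREATE INDEX ix_orders_col1_col2 ON orders (col1, col2)")

def plan(sql):
    """Return the plan text (last column of each EXPLAIN QUERY PLAN row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

# The WHERE clause references the leading column: the index can be searched.
leading = plan("SELECT payload FROM orders WHERE col1 = 1")
both = plan("SELECT payload FROM orders WHERE col1 = 1 AND col2 = 2")
# Only the second column is referenced: the planner falls back to a scan.
trailing_only = plan("SELECT payload FROM orders WHERE col2 = 2")

for label, text in [("col1", leading), ("col1+col2", both), ("col2", trailing_only)]:
    print(label, "->", text)
```

An index on (col2, col1) would reverse which of the single-column queries benefits, which is why the two column orders are not interchangeable.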
Types of indexes
Indexes fall into two types, depending on whether the order of the index matches the physical order of the data table. In a clustered index, the physical order of the data table is the same as the index order; in a non-clustered index, it is not.

Clustered index architecture
An index is structured like a tree. The bottom level of the tree is called the leaf level, the other levels are called non-leaf levels, and the root is in the non-leaf levels. A clustered index likewise forms a tree from its leaf and non-leaf levels, with the leaf level as the lowest level of the index. In a clustered index, the data pages that hold the table's data are the leaf level, the index pages above the leaf level are the non-leaf levels, and the index data is stored on those non-leaf index pages. In a clustered index, data values are always arranged in ascending order.
Create the clustered index on columns that are searched frequently or accessed in order. When creating a clustered index, keep these factors in mind: each table can have only one clustered index, because the data in a table can have only one physical order; the physical order of the rows in the table is the same as the physical order of the rows in the index; create the clustered index before any non-clustered indexes, because the clustered index changes the physical order of the rows in the table, arranging the data rows in a particular order and maintaining that order automatically; the uniqueness of key values is maintained either explicitly with the UNIQUE keyword or implicitly with an internal unique identifier, which is used by the system itself and cannot be accessed by users; the average size of a clustered index is about 5 percent of the data table, although the actual size varies with the size of the indexed column; during index creation, SQL Server temporarily uses disk space in the current database, and creating a clustered index needs 1.2 times the size of the table, so make sure there is enough space.
When data in a table is accessed, the system first determines whether an index exists on the relevant column and whether that index is useful for the data being retrieved. If so, the system uses the index to access the records in the table. The system navigates to the data through the index, starting from the root of the index tree. Starting at the root, it compares the search value with each key value to determine whether the search value is greater than or equal to that key value. This step is repeated until a key value greater than the search value is encountered, or until the search value is greater than or equal to all key values on the index page.
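The descent from the root described above can be sketched as a small in-memory tree. This is a toy model in Python, not SQL Server's page format: the `Page` class, the two-level tree, and the convention that each root key is the lowest key on its child page are all assumptions made for illustration.

```python
from bisect import bisect_right

class Page:
    def __init__(self, keys, children=None, rows=None):
        self.keys = keys          # sorted key values on this page
        self.children = children  # child pages (non-leaf level only)
        self.rows = rows          # row data (leaf level only)

def search(root, value):
    """Walk from the root to a leaf; return (row, pages_visited)."""
    page, visited = root, 1
    while page.children is not None:
        # Pick the last key that is <= value, i.e. stop at the first larger key.
        slot = max(bisect_right(page.keys, value) - 1, 0)
        page = page.children[slot]
        visited += 1
    for key, row in zip(page.keys, page.rows):
        if key == value:
            return row, visited
    return None, visited

leaves = [
    Page([10, 20, 30], rows=["row10", "row20", "row30"]),
    Page([40, 50, 60], rows=["row40", "row50", "row60"]),
    Page([70, 80, 90], rows=["row70", "row80", "row90"]),
]
# Each root key is the lowest key stored on the matching child page.
root = Page([10, 40, 70], children=leaves)

found, pages_read = search(root, 50)
missing, _ = search(root, 55)
print(found, pages_read, missing)
```

Finding one row touches one page per level of the tree, here two, instead of every page in the table.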

Non-clustered index architecture
A non-clustered index is also a tree structure, very similar to a clustered index, but with significant differences.
In a non-clustered index, the leaf level contains only key values, not data rows. A non-clustered index represents the logical order of the rows. There are two non-clustered index architectures: one for a non-clustered index created on a table without a clustered index, and one for a non-clustered index created on a table with a clustered index.
A data table without a clustered index is also called a heap. When a non-clustered index is created on a heap, the system uses row identifiers in the index pages to point to records in the data pages. A row identifier stores the location of the data. The heap is maintained through Index Allocation Map (IAM) pages, which contain information about the extents where the heap's data is stored. The system table sysindexes holds a pointer to the first IAM page associated with the heap. The system uses the IAM pages to navigate the heap and to find space where new rows can be inserted. The data pages, and the records within them, are in no particular order and are not linked together; the only connection between the pages is the order recorded in the IAM. When a non-clustered index is created on a heap, the leaf level contains row identifiers that point to the data pages. A row identifier consists of the file ID, the page number, and the row ID, and it remains unique. The order of the non-clustered index's leaf pages differs from the physical order of the data in the table; the key values at the leaf level are kept in ascending order.
When a non-clustered index is created on a table that has a clustered index, the index pages point to the clustering key of the clustered index. The clustering key stores the location of the data. If a table has a clustered index, the leaf level of a non-clustered index contains clustering key values that map to the row, rather than physical row identifiers. When the system accesses data through a non-clustered index on such a table, it first goes from the non-clustered index to the clustered index, and then uses the clustered index to find the data.
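The two lookup paths can be contrasted with plain dictionaries standing in for index trees and data pages. Everything in this Python sketch (the page numbers, the `emp_id` clustering key, the sample rows) is hypothetical; it only mirrors the shape of the two architectures described above.

```python
# Heap: data pages are addressed by (file_id, page_no), rows by slot number.
heap_pages = {
    (1, 100): ["Chen|Sales", "Wang|IT"],
    (1, 101): ["Zhao|HR"],
}
# Non-clustered index on name over the heap: leaf entries hold a row
# identifier made of file ID, page number, and row ID.
nc_on_heap = {"Chen": (1, 100, 0), "Wang": (1, 100, 1), "Zhao": (1, 101, 0)}

def lookup_heap(name):
    file_id, page_no, slot = nc_on_heap[name]
    return heap_pages[(file_id, page_no)][slot]

# Clustered table: rows are reached through the clustering key (emp_id).
clustered = {7: "Chen|Sales", 3: "Wang|IT", 9: "Zhao|HR"}
# Non-clustered index on name: leaf entries hold the clustering key.
nc_on_clustered = {"Chen": 7, "Wang": 3, "Zhao": 9}

def lookup_clustered(name):
    clustering_key = nc_on_clustered[name]   # step 1: non-clustered index
    return clustered[clustering_key]         # step 2: clustered index

print(lookup_heap("Wang"), lookup_clustered("Wang"))
```

Both paths return the same row; the difference is whether the non-clustered entry records where the row physically sits or which clustering key to look up next.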
Non-clustered indexes are very useful when data needs to be retrieved in several different ways. When creating non-clustered indexes, keep these points in mind: by default, a newly created index is non-clustered; each table can have at most 249 non-clustered indexes, but only one clustered index.

How the system accesses data in a table
In general, the system can access data in a database in two ways: a table scan or an index lookup. In a table scan, the system places a pointer on the data page that holds the first row of the table and then reads through all the data pages occupied by the table, page by page and front to back, until every record in the table has been scanned. During the scan, each record that matches the query criteria is selected, and at the end all selected records are returned. In an index lookup, the system uses the index, a tree structure that stores key values together with pointers to the data pages containing the matching records. It walks the index tree, following keys and pointers to find the records that satisfy the query, and at the end returns all the records found.
In SQL Server, when data in the database is accessed, SQL Server first determines whether the table has an index. If there is no index, SQL Server uses a table scan to access the data. Otherwise, the query processor uses distribution statistics to generate an optimized execution plan for the query, deciding whether a table scan or an index gives the most efficient access to the data.
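The switch from a table scan to an index lookup can be watched directly in a query plan. As a stand-in for SQL Server, this sketch uses SQLite through Python's `sqlite3`; the `personnel` table, its data, and the index name are hypothetical, and the plan wording is SQLite's own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE personnel (emp_id INTEGER, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO personnel VALUES (?, ?, ?)",
    [(i, f"name{i}", f"dept{i % 10}") for i in range(1000)],
)

query = "SELECT dept FROM personnel WHERE name = 'name500'"

def plan(sql):
    """Return the plan text (last column of each EXPLAIN QUERY PLAN row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

plan_without_index = plan(query)   # no index exists yet: every row is scanned
conn.execute("CREATE INDEX ix_personnel_name ON personnel (name)")
plan_with_index = plan(query)      # the planner now searches the index

result = conn.execute(query).fetchall()
print(plan_without_index)
print(plan_with_index)
```

The query text and its result are identical in both cases; only the access method chosen by the engine changes.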

Index options
When creating an index, you can specify several options that optimize its performance. These options include FILLFACTOR, PAD_INDEX, and SORTED_DATA_REORG.
The FILLFACTOR option can optimize the performance of INSERT and UPDATE statements. When an index page becomes full, SQL Server must spend time splitting the page to make room for new rows. FILLFACTOR reserves a percentage of free space on each leaf-level index page to reduce page splits. When creating an index on a table, use FILLFACTOR to specify how full each leaf node should be, as a percentage. The default value is 0, which is equivalent to 100. When an index is created, interior index nodes always keep some free space, enough for one or two records. Do not use this option when creating an index on a table with no data, because it has no effect there. In addition, the specified fill level is not maintained dynamically after the index is created, so the option should be used only when creating an index on a table that already contains data.
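The trade-off behind FILLFACTOR is simple arithmetic: free space per page against the number of pages. The Python sketch below works it through under assumed numbers (100,000 index entries, 200 entries per completely full page). The helper name is hypothetical, and real page capacity depends on key size.

```python
import math

def leaf_pages_needed(row_count, rows_per_full_page, fillfactor):
    """Leaf pages used when each page is filled to `fillfactor` percent."""
    if fillfactor == 0:          # per the text, 0 behaves like 100
        fillfactor = 100
    rows_per_page = max(1, rows_per_full_page * fillfactor // 100)
    return math.ceil(row_count / rows_per_page), rows_per_page

# Hypothetical index: 100,000 entries, 200 entries fit on a full page.
full_pages, full_rows = leaf_pages_needed(100_000, 200, 0)
padded_pages, padded_rows = leaf_pages_needed(100_000, 200, 80)

# Room left on each page for new rows before a page split is needed.
free_slots_per_page = 200 - padded_rows
print(full_pages, padded_pages, free_slots_per_page)
```

With these numbers, a fill factor of 80 costs 125 extra leaf pages and buys room for 40 new entries per page before the first split.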
The PAD_INDEX option applies the FILLFACTOR value to the interior nodes of the index as well, so that interior nodes are filled to the same degree as leaf-level nodes. Specifying PAD_INDEX without FILLFACTOR has no practical effect, because the PAD_INDEX value is determined by the FILLFACTOR value.
The SORTED_DATA_REORG option skips the sort when a clustered index is created, which reduces the time needed to build it. When creating or rebuilding a clustered index on a table that has become fragmented, SORTED_DATA_REORG can compact the data pages. Also use this option when the fill factor needs to be reapplied to an index. When using SORTED_DATA_REORG, keep these points in mind: SQL Server verifies that each key value is higher than the previous one, and if any is not, the index cannot be created; SQL Server requires 1.2 times the table's space to physically reorganize the data; SORTED_DATA_REORG speeds up index creation by eliminating the sort; data is physically copied from the table; when a row is deleted, the space it occupied can be reused; all non-clustered indexes are re-created; if you want the leaf pages filled to a certain percentage, you can use the FILLFACTOR and SORTED_DATA_REORG options together.

Index maintenance
To maintain system performance, indexes must be maintained after they are created, because frequent inserts, deletes, and updates fragment the index pages.
The DBCC SHOWCONTIG statement displays fragmentation information for a table's data and indexes. When DBCC SHOWCONTIG runs, SQL Server walks the entire leaf level of the index to determine whether the specified table or index is heavily fragmented. DBCC SHOWCONTIG can also determine whether data pages and index pages are full. Run DBCC SHOWCONTIG on a table when it has been heavily modified, has had a lot of data added, or has become slow to query. When running DBCC SHOWCONTIG, keep these points in mind: SQL Server requires the ID of the table or index, which can be obtained from the system table sysindexes; decide how often to run DBCC SHOWCONTIG based on the activity on the table, which may be daily, weekly, or monthly.
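What fragmentation means for leaf pages can be illustrated with a toy measure: the share of pages that sit physically before their logical predecessor. The Python sketch below is a deliberate simplification with made-up page numbers. It is not the formula DBCC SHOWCONTIG uses, only an illustration of the idea of out-of-order pages.

```python
def out_of_order_fraction(physical_page_numbers):
    """Share of leaf pages whose physical number is lower than the previous
    page's when the pages are visited in logical (key) order."""
    if len(physical_page_numbers) < 2:
        return 0.0
    out_of_order = sum(
        1
        for prev, cur in zip(physical_page_numbers, physical_page_numbers[1:])
        if cur < prev
    )
    return out_of_order / len(physical_page_numbers)

# Hypothetical leaf pages listed in key order, identified by physical page number.
freshly_built = [100, 101, 102, 103, 104, 105, 106, 107]
after_page_splits = [100, 230, 101, 231, 102, 103, 232, 104]

fresh_fragmentation = out_of_order_fraction(freshly_built)
split_fragmentation = out_of_order_fraction(after_page_splits)
print(fresh_fragmentation, split_fragmentation)
```

In the second list, pages created by splits live far from their logical neighbors, so reading the index in key order jumps back and forth across the file.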
The DBCC DBREINDEX statement rebuilds one or more indexes on a table. Use DBCC DBREINDEX when you want to rebuild indexes and the table has a primary key constraint or unique key constraint. DBCC DBREINDEX can also reorganize the storage of leaf-level index pages, remove fragmentation, and recalculate index statistics. When using DBCC DBREINDEX, keep these points in mind: the system refills each leaf page according to the specified fill factor; DBCC DBREINDEX can rebuild the indexes of primary key or unique key constraints; the SORTED_DATA_REORG option creates a clustered index more quickly, but if the key values are not in order, DBCC DBREINDEX cannot be used; DBCC DBREINDEX does not support system tables. You can also use the Database Maintenance Plan Wizard to rebuild indexes automatically.
Statistics are samples of column data stored in SQL Server. They are generally kept for indexed columns, but statistics can also be created for non-indexed columns. SQL Server maintains distribution statistics for the key values of an index and uses them to decide which indexes are useful during query processing. Query optimization depends on the accuracy of these distribution statistics: the query optimizer uses the data sample to decide between a table scan and an index. When table data changes, SQL Server periodically updates the statistics automatically. Index statistics are updated automatically when the index key values change significantly. How often statistics are updated depends on the amount of data in the index and the amount of data that has changed. For example, if a table has 10,000 rows and 1,000 of them change, the statistics may need updating; if only 50 rows change, the current statistics are kept. Besides automatic updates, users can update statistics manually with the UPDATE STATISTICS statement or the sp_updatestats system stored procedure. UPDATE STATISTICS can update the statistics for all indexes on a table or for a specified index.
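The 10,000-row example above amounts to comparing the changed share of a table with some threshold. The Python sketch below makes that comparison explicit. The function name and the 5% threshold are invented for illustration only and are not SQL Server's actual rule, which the text does not specify.

```python
def statistics_look_stale(total_rows, changed_rows, threshold=0.05):
    """True when the changed share of the table reaches `threshold`.

    The 5% default is a made-up illustration value, not SQL Server's rule.
    """
    if total_rows == 0:
        return changed_rows > 0
    return changed_rows / total_rows >= threshold

many_changes = statistics_look_stale(10_000, 1_000)   # 10% of the rows changed
few_changes = statistics_look_stale(10_000, 50)       # 0.5% of the rows changed
print(many_changes, few_changes)
```

Whatever the real threshold is, the shape of the decision is the same: a large changed share triggers a refresh, and a small one leaves the current statistics in place.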
The SHOWPLAN and STATISTICS IO statements can be used to analyze index and query performance and to tune queries and indexes. The SHOWPLAN statement shows each step the query optimizer uses when joining tables and which index is used to access the data at each step. Use it to view the query plan for a specified query. When using SHOWPLAN, keep these points in mind: the output of SET SHOWPLAN_ALL is more detailed than the output of SET SHOWPLAN_TEXT, but the application must be able to handle the output of SET SHOWPLAN_ALL; the setting applies to the current session only, so if you reconnect to SQL Server you must run the SHOWPLAN statement again. The STATISTICS IO statement reports the amount of input and output needed to return the result set of a specified query, showing logical and physical I/O information. Use this information to decide whether to rewrite the query or redesign the indexes.
Like the SHOWPLAN statement, optimizer hints can also be used to tune query performance. Optimizer hints can yield small performance gains, but if the indexing strategy changes, the hints become useless. Limit the use of optimizer hints, because the optimizer itself is more efficient and more flexible. When using optimizer hints, keep these rules in mind: specify the index name, or use an index_id of 0 to force a table scan and an index_id of 1 to use the clustered index; optimizer hints override the query optimizer, so if the data or the environment changes, the hints must be revised.

Index Tuning Wizard
The Index Tuning Wizard is a tool that analyzes a series of database queries and recommends indexes that optimize their performance. For the analysis, you need to specify the following:
The queries, that is, the workload to be optimized
The database containing the tables on which indexes can be created to improve query performance
The tables to use in the analysis
The constraints to consider in the analysis, for example, the maximum disk space the indexes may use
The workload can come from two sources: a trace captured with SQL Server, or a file containing SQL statements. The Index Tuning Wizard always works from a defined workload. If the workload does not reflect normal operations, the recommended indexes will not be the best-performing indexes for the actual workload. The Index Tuning Wizard calls the query analyzer to evaluate the performance of each query in the workload against all possible combinations of indexes, and then recommends the indexes that improve the performance of the queries across the whole workload. If there is no workload for the Index Tuning Wizard to analyze, you can create one right away with the profiler. Once you have a trace that describes a sample of normal database activity, the wizard can analyze the workload and recommend an index configuration that improves database performance.
After the Index Tuning Wizard has analyzed the workload, you can view a range of reports. You can also have the wizard create the recommended indexes immediately, turn the work into a job that can be scheduled, or generate a file of SQL statements that create the indexes.
The Index Tuning Wizard lets you select and create an ideal combination of indexes and statistics for a SQL Server database without requiring an expert understanding of the database structure, the workload, or SQL Server internals. In short, the Index Tuning Wizard can do the following:
Recommend the best mix of indexes for a database under a given workload, by using the query optimizer to analyze the queries in the workload
Analyze the effects of the proposed changes, including index usage, the distribution of queries among tables, and the performance of the queries in the workload
Recommend ways to tune the database for a small set of problem queries
Allow the recommendation to be customized through advanced options such as disk space constraints, the maximum number of indexes per query, and the maximum number of columns
Profiler
The profiler captures, in real time, a continuous picture of what is running on the server. You can select the items and events you want to monitor, including Transact-SQL statements and batches, object usage, locking, security events, and errors. The profiler can filter these events and show only those the user cares about. You can replay a recorded trace on the same server or on another server, re-running the commands that were recorded. By focusing on specific events, monitoring and debugging problems in SQL Server becomes much simpler.
Query processor
The query processor is a multi-purpose tool that can handle many tasks. In the query processor, you can interactively enter and execute Transact-SQL statements and view the statements and their result sets in the same window; you can execute several Transact-SQL statements at once, or execute only part of a script file; it provides a graphical way to analyze query execution plans, which reports the data retrieval methods the query processor has chosen and lets you tune the query based on the plan; it can also suggest indexes that would improve performance, although such a suggestion is made for a single query and improves only that query's performance.
The system creates a distribution page for each index. The statistics are the distribution information, stored on the distribution page, about the key values of one or more indexes on a table. When a query is executed, the system can use this distribution information to decide which of the table's indexes to use, improving query speed and performance. The query processor relies on these distribution statistics to generate query execution plans, and how well optimized a plan is depends on how accurate the distribution steps in the statistics are. If the distribution statistics closely match the physical information in the index, the query processor can generate a highly optimized execution plan. Conversely, if the statistics differ considerably from the information actually stored in the index, the execution plan generated by the query processor is less well optimized.
The query processor extracts statistics from the distribution information of the index keys. Besides users running UPDATE STATISTICS manually, the query processor can also collect these distribution statistics automatically. This ensures that the query processor works from up-to-date statistics, so that execution plans are highly optimized and less maintenance is needed. Execution plans generated by the query processor do have limitations. For example, an execution plan improves the performance of a single query only, while its effect on the system as a whole may be positive or negative. To improve query performance across the whole system, use a tool such as the Index Tuning Wizard.
Conclusion
In earlier versions of SQL Server, a query could use at most one index per table. In SQL Server 7.0, index operations have been enhanced. SQL Server uses index intersection and index union algorithms so that a single query can use multiple indexes; shared row identifiers are used to join two indexes on the same table. If a table has a clustered index, and therefore a clustering key, the leaf nodes of all non-clustered indexes on that table use the clustering key as the row locator rather than a physical record identifier. If the table has no clustered index, non-clustered indexes continue to use physical record identifiers that point to the data pages. In both cases, the row locator is very stable. When a leaf node of a clustered index splits, the non-clustered indexes do not need to be modified, because the row locators remain valid. If the table has no clustered index, page splits do not occur. In earlier versions, non-clustered indexes used physical record identifiers, such as page number and row number, as row locators. For example, when a clustered index (data page) split, many rows moved to a new data page and so received new physical record identifiers. All non-clustered indexes then had to be updated with these new physical record identifiers, which took a lot of time and resources.
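The stability of the row locator during a page split can be shown with a small model. In this Python sketch, the page numbers, keys, and names are hypothetical, and dictionaries stand in for index structures. It counts how many non-clustered entries become stale under each locator design when half of a data page moves to a new page.

```python
def physical_locations(pages):
    """Map clustering key -> (page_no, slot) for the current page layout."""
    return {
        key: (page_no, slot)
        for page_no, keys in pages.items()
        for slot, key in enumerate(keys)
    }

pages = {100: [1, 2, 3, 4, 5, 6]}    # one full data page holding keys 1..6
names = {1: "Chen", 2: "Wang", 3: "Zhao", 4: "Liu", 5: "Sun", 6: "Zhou"}

before = physical_locations(pages)
# Older design: non-clustered entries store the physical record identifier.
nc_by_rid = {names[k]: before[k] for k in names}
# SQL Server 7.0 design: non-clustered entries store the clustering key.
nc_by_key = {names[k]: k for k in names}

# Page split: the upper half of the rows moves to new page 101.
pages = {100: [1, 2, 3], 101: [4, 5, 6]}
after = physical_locations(pages)

# Entries whose stored locator no longer points at the row.
stale_rid_entries = sum(
    1 for name, rid in nc_by_rid.items() if after[nc_by_key[name]] != rid
)
stale_key_entries = sum(1 for key in nc_by_key.values() if key not in after)
print(stale_rid_entries, stale_key_entries)
```

Three of the six physical identifiers go stale after the split, while every clustering key still resolves, which is the saving the paragraph above describes.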
The Index Tuning Wizard is a very good tool for experienced users and new users alike. Experienced users can use the wizard to create a baseline index configuration and then adjust and customize it. New users can use the wizard to create optimized indexes quickly.

This article comes from the CSDN blog. If you reproduce it, please credit the source: http://blog.csdn.net/Damon_King/archive/2007/10/24/1841819.aspx
