Database tuning method

2010-12-17  Source: original to this site  Category: Database  Views: 112

1. Introduction

Database tuning makes database applications run faster, and it has to take a variety of complex factors into account. Distributing data evenly across disks improves I/O utilization and read/write performance; an appropriate degree of denormalization improves query performance; building indexes and writing efficient SQL statements avoids slow operations; lock tuning addresses the performance side of concurrency control. Tuning techniques apply across different database systems. They do not depend on complicated formulas and rules, but they do require a broad and deep understanding of the application, the database management system, query processing, concurrency control, the operating system and the hardware.

2. Computer hardware tuning

2.1 Placement of database objects

Use the database's partitioning techniques to distribute data evenly across the disks in the system, balancing I/O access and avoiding I/O bottlenecks:
(1) Spread access across different disks. Stripe user data over as many devices as possible so that I/O operations run in parallel, which avoids I/O contention and removes access bottlenecks; place randomly accessed data and sequentially accessed data on separate devices.
(2) Separate system database I/O from application database I/O, and put system audit tables and temporary database tables on less busy disks.
(3) Put the transaction log on its own disk. This reduces disk I/O overhead, helps recovery after a failure, and improves system safety.
(4) Put frequently accessed "active" tables on different disks. Tables that are used often or joined often should each go on a separate disk; even individual columns of a heavily accessed table can be placed on different disks, so that access is spread out and I/O contention is avoided.

2.2 Optimizing the database with disk hardware

RAID (Redundant Array of Independent Disks) is a disk system made up of multiple disk drives (an array).
Because the array is treated as a single disk, hardware-based RAID lets users manage multiple disks as one. Compared with operating-system-based RAID, hardware-based RAID performs better: OS-based RAID consumes CPU cycles that other work on the system needs, and hardware-based RAID lets a failed drive be replaced without shutting the system down.

SQL Server installations generally use RAID levels 0, 1 and 5. RAID 0 is plain striping with no redundancy. RAID 1 is disk mirroring (or disk duplexing): every disk in the array has one or more copies. It mainly provides the highest level of reliability. Each write has to go to every copy, so write work is multiplied, but several reads can be served in parallel, which improves read performance; RAID 1 is a good way to give the transaction log redundancy. RAID 5 is disk striping with parity: data and parity information are distributed across all disks in the array, which removes the bottleneck and single point of failure of a dedicated parity disk. RAID 5 also adds write work, but reads can be processed in parallel and read performance improves substantially. RAID 5 stores far less redundant data than mirroring (one parity block per stripe rather than a full copy), although each small write must also read and update parity. In practice, reads far outnumber writes and disk writes are fast, so users hardly notice the added time and the extra write load is rarely a problem. Higher-end servers usually implement RAID 5 with a hardware array controller; lower-end servers can implement RAID 5 purely in software.

3. Relational system and application tuning

3.1 Application optimization

From the database designer's point of view, an application does nothing more than insert, modify, delete and query data, and reflect the structure of the data and the relationships within it.
When designers think about performance, the overall starting point is to treat database resources as a luxury: deliver the required functionality while using as few database resources as possible. This leads to the following principles:
(1) Do not access the database, or access it less.
(2) Simplify access to the database.
(3) Make each access as efficient as possible.
(4) Set requirements for earlier and later development, deployment and tuning work, so that they help reach the performance goals.
In addition, avoid sending complete ad hoc SQL statements directly; call stored procedures on SQL Server where possible. Use a connection pool between client and server so that connections are reused as much as possible and the time and resources spent opening them are not wasted. Do not use cursors unless there is no alternative; when one really is needed, pay attention to the characteristics of each cursor type.

3.2 Base table design optimization

In a table-driven information management system, the standard for base table design is third normal form. Its basic feature is that every non-key attribute depends only on the primary key. Designing tables in third normal form has many advantages:
(1) It eliminates redundant data and saves disk space.
(2) It gives good integrity constraints (referential integrity based on primary and foreign keys, and entity integrity based on the primary key), which makes the data easier to maintain, migrate and update.
(3) The data is reversible: joins and merges neither lose rows nor duplicate them.
(4) Removing redundant data (mainly redundant columns) lets each data page hold more rows, which reduces logical I/O and also physical I/O.
(5) Most transactions perform well.
(6) The physical design is more flexible and can keep up with growing user requirements.
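The integrity advantage (primary and foreign key constraints) can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's standard sqlite3 module as a stand-in for a server database such as SQL Server, and the department/employee tables and their columns are hypothetical names that do not come from the article.

```python
import sqlite3

def build_schema(conn: sqlite3.Connection) -> None:
    """Create two hypothetical third-normal-form tables linked by a foreign key."""
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE department ("
        " dept_id INTEGER PRIMARY KEY,"
        " dept_name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE employee ("
        " emp_id INTEGER PRIMARY KEY,"
        " emp_name TEXT NOT NULL,"
        " dept_id INTEGER NOT NULL REFERENCES department(dept_id))"
    )

def try_insert_employee(conn: sqlite3.Connection, emp_id: int, name: str, dept_id: int) -> bool:
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", (emp_id, name, dept_id))
        return True
    except sqlite3.IntegrityError:
        return False

conn = sqlite3.connect(":memory:")
build_schema(conn)
conn.execute("INSERT INTO department VALUES (1, 'Sales')")

accepted = try_insert_employee(conn, 1, "Ann", 1)    # parent department exists
rejected = try_insert_employee(conn, 2, "Bob", 99)   # no department 99: referential integrity
duplicate = try_insert_employee(conn, 1, "Eve", 1)   # emp_id 1 already used: entity integrity
employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(accepted, rejected, duplicate, employee_count)
```

Because the department name lives only in the department table, renaming a department is a single-row update, which is the maintenance benefit the list describes.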
Despite these advantages, a third-normal-form design sometimes works against performance in practice. Fetching part of the data may require scanning a whole table; many processes may compete for the same data; the same rows may be recomputed again and again to produce the same result; and a process that needs data from several tables triggers a large number of join operations. All of these consume disk I/O and CPU time.

In particular, the base table design has to be extended in the following situations: many processes frequently access one table; processes access only a subset of the data; the same calculations are repeated or redundant data is needed; or some user processes need priority or a short response time. To avoid these disadvantages, we usually split tables according to access frequency, store redundant data, store derived columns, or merge related tables. These are all effective ways to overcome the disadvantages and optimize the running system.

(1) Splitting tables. A table can be split horizontally or vertically. Horizontal splitting divides a table by rows into several tables, which speeds up queries on each of them; but a query or update that spans them has to touch several tables, so it suits cases where queries and updates rarely need more than one partition at a time. Vertical splitting applies to a table with many columns. If some columns are accessed far more often than the rest, then, without breaking third normal form, the primary key plus those columns can form one table and the primary key plus the remaining columns another. Likewise, when several processes frequently access different columns of one table, the table can be split vertically into several tables to reduce disk I/O.
Narrower rows mean more rows per data page, so a single I/O scans more rows and access to each table gets faster. Vertical splitting also makes the best use of the cache. The drawback of splitting tables is that data integrity has to be considered whenever rows are inserted or deleted; use stored procedures to maintain it.

(2) Storing derived data. Some processes repeat the same calculation many times. If the repeated calculation always yields the same result, or it involves multiple rows and so needs extra disk I/O, or it is complex enough to need significant CPU time, consider storing the result. When the calculation is repeated over one or more rows, add a column to the table to hold the result; if any column used in the calculation is updated, a trigger or stored procedure must update the new column as well. In short, storing redundant data speeds up access, but it violates third normal form and raises the cost of maintaining integrity: the stored value must be updated immediately by a trigger or stored procedure.

3.3 Changing the application's technical model

Introduce the concept of an "intermediate table". Before a business document enters the core business process, the user works on a temporary table rather than the formal one. Once the data has been reviewed at some stage (the next step in the process), it is written from the temporary table to the formal table and deleted from the temporary table. The table that users work on therefore stays at a fairly fixed size, and the growth of the data is under control and can be cleared out periodically. First of all, the temporary table technique has to load the data set to be worked on into a temporary table, and that adds some cost to the system.
The premise is that the data set in the temporary table is much smaller than the data set in the source tables, so that joining or repeatedly reading it improves performance severalfold or more. Not every case suits the temporary table technique. In general, two situations do:
(1) Joins over large tables where the result of the join is a small result set.
(2) Frequent access to a large table where the range accessed is relatively fixed and concentrated.
Used sensibly, temporary tables help applications process large tables in real time.

4. Database index optimization

An index is a way of organizing the data of a table; it makes queries that access one or more specific records in the table more efficient. The performance benefit of indexes is clear. Creating indexes on the columns commonly used in WHERE clauses and on the columns used for sorting avoids full table scans and gives direct access to the data of specific columns without changing the physical structure of the table, which reduces data access time. Indexes can speed up or eliminate time-consuming sort operations. They can distribute data across different pages so that inserted data is spread out. A primary key automatically gets a unique index, and unique indexes also guarantee uniqueness of the data (that is, entity integrity). In short, indexes speed up queries, reduce I/O and eliminate disk sorts; tuning indexes avoids full table scans and reduces the overhead a query causes.
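The change from a full table scan to an index lookup can be observed directly from the query planner. The sketch below is a minimal illustration, assuming SQLite (through Python's standard sqlite3 module) as a stand-in for the server databases the article has in mind; the employee table, its columns and the index name are hypothetical. The exact wording of the plan text varies between SQLite versions, so it is printed rather than quoted here.

```python
import sqlite3

def plan_for(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> str:
    """Return the planner's description of how SQLite would run the statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The fourth field of each plan row is the human-readable access path.
    return "; ".join(row[3] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO employee (name, dept) VALUES (?, ?)",
    [(f"emp{i}", f"dept{i % 50}") for i in range(1000)],
)

query = "SELECT name FROM employee WHERE dept = ?"

# No index on dept yet, so the only access path is a scan of the whole table.
plan_before = plan_for(conn, query, ("dept7",))

# Index the column used in the WHERE clause, then ask for the plan again.
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept)")
plan_after = plan_for(conn, query, ("dept7",))

print("before:", plan_before)
print("after: ", plan_after)
```

Checking the plan before and after adding an index is a cheap way to confirm that a new index is actually used by the query it was created for.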
When working with indexes, note the following:
(1) Check that the indexed column, or the first column of a composite index, appears in the WHERE clause of the PL/SQL statement. This is a necessary condition for the execution plan to use the index. Comparing the number of distinct keys with the number of rows in the table gives the selectivity of a column: the closer the ratio "distinct keys / rows in the table" is to 1, the higher the selectivity. A query on a highly selective column returns little data and is well suited to an index. A column such as gender, with only two values, has very low selectivity and is not suited to index lookups. So create indexes on columns that often appear in query conditions and have many distinct values, and do not index columns with few distinct values.
(2) Indexes have a price. Every delete, update and insert must make the corresponding change in each index, which slows those operations down. Tables with frequent deletes and inserts should not have too many indexes.
(3) Create nonclustered indexes on columns that are queried often; create the clustered index on columns that are often used for range queries, sorting or grouping.
(4) For a column with no duplicate values, a unique index is better than a non-unique one.
(5) After updating a large amount of data in a table, drop and rebuild the indexes to improve query speed.
(6) When updates to a table far outnumber selects, do not create an index.
(7) If the indexed column appears in a query as the argument of a function, the index is not used; such a column is not a good candidate for indexing.
(8) With a hash join (HJ), a hash has to be computed anyway, so the presence of an index has little effect on query speed.
(9) Index the primary key, especially when it is often used as a join column; also index columns that are often used in joins but are not declared as foreign keys.
(10) For columns that are often accessed together and each contain duplicate values, consider a composite index that covers one query or a group of queries, with the most frequently referenced column as the leading column.
(11) Keep indexes narrow so that each page holds more index rows and fewer I/O operations are needed.
(12) Parallel queries will not use the index.
(13) The values stored in an index cannot all be NULL.
(14) Do not index columns that are rarely used in queries or columns that hold large values.

5. SQL statement optimization

Once system design, index design and so on are complete, the design of the statements themselves has to be considered. SQL statements are a major factor in the performance of a database application. Ordered by the severity of their impact, problem statements fall into three groups: unnecessary SQL, poor SQL and complex SQL.
Unnecessary SQL: the database access raises no technical or skill issue; it is simply not needed and goes beyond the actual business requirement. It wastes host resources, takes up network traffic and lowers system performance.
Poor SQL: the database access is not redundant, and the business logic and the result are correct, but the statement is not written well enough for the database optimizer to handle it well.
Complex SQL: the statement joins many tables (or views), has very complex join conditions, is long, contains complex calculations, or uses unfamiliar SQL features.
Unnecessary SQL and poor SQL are problems of development skill; complex SQL is mostly a problem of design skill, rooted in the design of the database structure.
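To make the "poor SQL" category concrete, here is a small sketch of two statements that return the same rows but are handled very differently by the planner. It assumes SQLite (via Python's standard sqlite3 module) in place of a server database, and the employee table, salary column and index name are hypothetical. The first statement applies arithmetic to the indexed column; the second compares the bare column with a constant.

```python
import sqlite3

def access_path(conn: sqlite3.Connection, sql: str) -> str:
    """Return the first line of SQLite's query plan for the statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return rows[0][3]  # fourth field is the human-readable access path

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee (name, salary) VALUES (?, ?)",
    [(f"emp{i}", 3000 + (i % 40) * 100) for i in range(1000)],
)
conn.execute("CREATE INDEX idx_employee_salary ON employee (salary)")

# Correct result, but the expression on the column hides it from the index.
poor_sql = "SELECT name FROM employee WHERE salary * 12 = 60000"
# Same condition with the arithmetic moved to the constant side.
better_sql = "SELECT name FROM employee WHERE salary = 5000"

poor_rows = sorted(conn.execute(poor_sql).fetchall())
better_rows = sorted(conn.execute(better_sql).fetchall())
same_rows = poor_rows == better_rows

poor_plan = access_path(conn, poor_sql)
better_plan = access_path(conn, better_sql)
print("same result:", same_rows)
print("poor:  ", poor_plan)
print("better:", better_plan)
```

Both statements are "correct" in the business sense, which is why this kind of problem tends to surface only when the plans are compared.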
When writing queries in SQL, follow these guidelines:
(1) Do selections first. When several selection conditions apply to the same table, put the most selective condition first and the weaker ones after it, so that the stricter condition produces a small set of data and the weaker conditions are then applied to that set.
(2) Avoid correlated subqueries; turn the subquery into a join. The subquery is executed once for every record of the main query, and the deeper the nesting, the lower the efficiency. Conceptually, a correlated subquery in a WHERE clause is like a function that takes parameters and returns a single value or a set of values. Because the subquery is evaluated separately for each tuple of the outer query, it causes a large amount of random disk I/O. So in practice, if a join can replace the subquery, use the join. Also avoid arithmetic operators in the clause, that is, do not apply operations to table columns. For example, take this correlated subquery:
SELECT ProductName FROM Products WHERE EXISTS (SELECT * FROM OrderDetails WHERE Discount >= 25 AND Products.ProductID = OrderDetails.ProductID);
It can be written with a join as follows (add DISTINCT if a product can match more than one order line, because the join returns one row per match):
SELECT ProductName FROM Products, OrderDetails WHERE Discount >= 25 AND Products.ProductID = OrderDetails.ProductID;
(3) Select fields on the principle "take only as much as you need" and avoid "SELECT *". "SELECT *" makes the database return every column of the table, which is undoubtedly slow for a table with many columns.
(4) Avoid operators such as != (or <>), IS NULL, IS NOT NULL, IN and NOT IN, and avoid non-aggregate expressions in the WHERE clause.
With these operators the system cannot use the index and can only search the table data directly. For example, for SELECT id, name FROM employee WHERE id != 'B%' the optimizer cannot use the index to determine how many rows will match, so it has to search every row of the table.
(5) Avoid OR; use UNION instead. An OR is not executed by using the index on each column to find the rows for each clause and then taking the union of the results. Instead, the rows that satisfy each OR clause are fetched first and stored in a worktable in the temporary database, a unique index is built to remove duplicate rows, and the final result is computed from that temporary table. This can make the indexes ineffective and lead to a sequential scan of the whole table, which greatly reduces query efficiency.
(6) Preprocess the relations before joining them. There are two kinds of preprocessing: building an index on the join attribute, and sorting.
(7) Split a large query into several steps.
(8) If the application runs queries in a loop, consider moving the loop into the query.

6. Transaction processing tuning

In daily operation a database may be used by many users at the same time, and concurrent operations can leave data inconsistent: lost updates, dirty reads, non-repeatable reads and so on. The main method of concurrency control is locking, which prevents users from performing certain operations for a period of time in order to avoid inconsistency. A database application divides its work into a number of transactions. When a transaction executes, it accesses the database and performs some local computation. Developers can assume that each transaction runs in isolation, without any concurrent activity. Because the notion of isolation provides this transparency, the guarantee is sometimes called the atomicity guarantee.
However, if an application treats a sequence of transactions as a unit, no such guarantee holds. Between two transactions of one application, a transaction from another application may run, and it may change data items that one or both of the first application's transactions need. The length of a transaction therefore has an important effect on correctness. Cutting transactions into smaller pieces can improve performance, but it can also undermine correctness. This tension between performance and correctness runs through the whole of concurrency control tuning.

For the performance of a transaction, consider: the number of locks it uses (all else being equal, fewer locks means better performance); the type of lock (read locks perform better); and how long it holds its locks (the shorter, the better).

Recommendations for lock tuning:
(1) Use special system facilities for long reads. For a read-only transaction R, the database state that R "sees" is always the state at the moment R began. Read-only queries then need no locking overhead and can run in parallel with smaller update transactions on the same data without causing blocking or deadlocks.
(2) Eliminate unnecessary locking. When only one transaction is running, or all transactions are read-only, use the configuration options to reduce the number of locks, which cuts both the memory overhead of the lock manager and the processing time spent on lock operations.
(3) Chop transactions into smaller transactions according to what they do. The more locks a transaction needs, the more likely it is to have to wait for another transaction to release one.
The longer a transaction T runs, the longer the transactions blocked by T may have to wait. So where blocking is likely, short transactions are better.
(4) Lower the isolation level where the application allows it.
(5) Choose an appropriate lock granularity. A page-level lock keeps concurrent transactions from accessing or modifying any record on the page; a table-level lock keeps them from accessing or modifying any page of the table. Record-level (row-level) locks are finer than page-level locks, and page-level locks are finer than table-level locks. A long transaction (simply put, one that accesses almost every page of a table) should use table-level locks to prevent deadlocks, while short transactions should use record-level locks to improve concurrency.
(6) Modify data definitions (the system catalog, or metadata) only when the database is lightly used. Every transaction that compiles a statement, adds or drops a table, adds or drops an index, or changes an attribute definition must access catalog data, so the catalog easily becomes a hot spot and therefore a bottleneck.
(7) Reduce access to hot spots (data that many transactions access and update). Other transactions can obtain a lock on a hot spot only after the transaction updating it has finished, so hot spots can become bottlenecks.
(8) Tune the deadlock detection interval.
Each of these suggestions can be applied independently of the others, but every tuning change must be tested to confirm that it still gives the appropriate isolation guarantees.

7. Summary

The basic principle of database performance optimization is to obtain the required data with as little disk access as possible.
This article has looked at several common aspects of database performance optimization (computer hardware, the relational system and application, database indexes, SQL statements and transaction processing) and has proposed a number of optimization strategies. There are of course many other ways to optimize, and the right ones depend on the specific circumstances. For each application, analyze the specific conditions and combine optimization measures from all of these areas so that database performance improves. Database application performance is a whole-team effort: everyone on the development team is responsible for contributing to it. Build performance awareness and make it a daily working habit rather than a separate phase of work; plan ahead, and do not pin your hopes on any single stage of the work.
