Archive

Posts Tagged ‘Performance Tuning’

MySQL– Performance Features

October 12, 2011 Leave a comment

Some of the features provided by MySQL that help to make it a high performing and fast responsive server 

  • Flexibility to choose the most appropriate Storage Engine as per performance requirements. A table’s Storage Engine can be changed later also, using the ALTER TABLE syntax. 
  • The feature EXPLAIN PLAN can be used to find out the actual path used by a query to fetch data from tables, so that query optimization can be performed. 
    Flexibility to choose the most appropriate data type as per performance and storage requirements. For example, a number type of data can be represented by INT, TINYINT, SMALLINT, MEDIUMINT, BIGINT, FLOAT or DOUBLE, depending on its characteristics and the range of values it can have, and consequently, the storage requirement for each is also different (1byte, 2bytes, 3bytes, 4bytes etc) 
  • Table maintenance utilities like mysqlcheck can be used to perform activities like checking, repairing, analyzing and optimizing tables to help improve performance. There are commands available too, like Analyze table, or Optimize table, that can be used to update key value distribution or reclaim unused space.  
  • For string data types, it is also possible to index on a column prefix rather than the entire width of it, which means that it is possible to create an index on a specified width of a column. For example, if a name column is of 255 characters, and using a query, we find that the first 10 characters of each row are sufficient to obtain distinct values for most of them, then an index can be created using only the first 10 characters of each row. This will not only allow more values to be cached in memory due to its small size, but also can improve index performance dramatically.  
  • The Leftmost Index prefixing feature can be used to avoid unnecessary indexes on tables. For example, if a composite index is created on columns A and B of a table (in the same sequence), then if a query requires an index on column A, it can use the same composite index without the need for a separate index.  However, if the sequence had been different in composite index, or had the other query needed an index on B, then we would require a separate index.  
  • MySQL also provides Engine specific optimizations. For example, the following optimizations are possible on MyISAM tables
    • Tables can be defined to have fixed-length or dynamic-length row format. Fixed-length allows data to be stored at a multiple of the row size resulting in faster access, whereas dynamic-row columns occupy smaller space, but can cause fragmentation.
    • Read-only tables can be compressed to a very small size using myisampack utility. The compression is done in an optimized form that allows quick retrievals.
    • It is also possible to split a Dynamic-row table into two separate tables (one fixed-length and the other dynamic length) keeping the Primary Key as common to both, so as to gain the advantages of  both types. This is usually considered when most queries access the fixed length columns of a dynamic row table.
    • To distribute disk activity, symlinking can be used to move MyISAM tables to different disks.
  • The MySQL server commands like STATUS can be used to obtain a snapshot report of current server status, showing its complete details, like current load, slow queries, open tables etc.
  • The server parameters that define various cache and buffer sizes can also be tuned as per performance requirements  

SQL Server Performance Tuning for TempDB

December 12, 2010 Leave a comment

Most common issues circumventing tempdb include:
- storage overrun
- IO bottleneck (for data read/write)
- excessive DDLs causing system table contention
- allocation contention

Quick review of what takes up tempDB:
1. User objects (#t1 and index, ##t2 and index, @v1)
2. Internal objects (work file or hash join, sorting, work table i.e. cursor-spool-LOBs)
3. Version Store (MARS, online indexing, triggers and snapshot/row-versioning based isolations)
4. Free space (if there is some, obvioulsy)

Monitoring tempDB space consumption:
Query below monitors for all 4 categories (as above) of tempDB consumers:
Select
    SUM (user_object_reserved_page_count)*8 as user_objects_kb,
    SUM (internal_object_reserved_page_count)*8 as internal_objects_kb,
    SUM (version_store_reserved_page_count)*8  as version_store_kb,
    SUM (unallocated_extent_page_count)*8 as freespace_kb
From sys.dm_db_file_space_usage
Where database_id = 2

Resolving Common tempDB issues:
1. Storage
1.a. User Objects

User objects arent owned by a session, so find specs of applicaiton that created them and adjust tempdb sizing accordingly. For individual objects you can use:
exec sp_spaceused @objname=’<user-object>’
Enumerating all the User Objects using query below:
DECLARE userobj CURSOR FOR
select sys.schemas.name + ‘.’ + sys.objects.name
from sys.objects, sys.schemas
where object_id > 100 and
      type_desc = ‘USER_TABLE’and
      sys.objects.schema_id = sys.schemas.schema_id
go
open userobj
go

declare @name varchar(256)
fetch userobj into @name
while (@@FETCH_STATUS = 0)
begin
    exec sp_spaceused @objname = @name
    fetch userobj into @name   
end
close userobj

1.b. Internal Objects
Its importatnt to find which query is generating too many internal objects and validate their query plan to be sure they arent out of order.
step1: Find top user sessions that are allocating internal objcets, including currently active tasks:
SELECT t1.session_id, (t1.internal_objects_alloc_page_count + task_alloc) as allocated, (t1.internal_objects_dealloc_page_count + task_dealloc) as deallocated from sys.dm_db_session_space_usage as t1,
    (select session_id, sum(internal_objects_alloc_page_count) as task_alloc, sum (internal_objects_dealloc_page_count) as task_dealloc
      from sys.dm_db_task_space_usage group by session_id) as t2
where t1.session_id = t2.session_id and t1.session_id >50
order by allocated DESC

step2: once the task is isolated, find which is this TSQL and its query plan:
select t1.session_id, t1.request_id, t1.task_alloc, t1.task_dealloc, t2.sql_handle, t2.statement_start_offset, t2.statement_end_offset, t2.plan_handle
from (Select session_id, request_id, sum(internal_objects_alloc_page_count) as task_alloc, sum (internal_objects_dealloc_page_count) as task_dealloc
      from sys.dm_db_task_space_usage group by session_id, request_id) as t1,
      sys.dm_exec_requests as t2
where t1.session_id = t2.session_id and (t1.request_id = t2.request_id)
order by t1.task_alloc DESC

step3: use sql_handle and plan_handle to get SQL statements and query plans":
select text from sys.dm_exec_sql_text(@sql_handle)
select * from sys.dm_exec_query_plan(@plan_handle)

1.c. Version Store
Row version are shared across sessions, and one who created it doesnt have control on when row-version can be reclaimed. You have to find and stop longest running transaction stopping row version cleanup.
select top 2
    transaction_id,
    transaction_sequence_num,
    elapsed_time_seconds
from sys.dm_tran_active_snapshot_database_transactions
order by elapsed_time_seconds DESC
There is no way to estimate how much space can be freed up by row version cleanup, so might need to cleanup some more transactions. Also, space requirements can be estimated for row-versioning using:
[Size of version store] = 2 * [version store data generated per minute] * [longest running time (minutes) of the transaction]
Note that version store data generated per minute is generally equivalent to log data generated per minute (unless its new record insert). Also, its important to be aware of Message 3967 in SQL2008. In case of storage constraint, SQLServer first marks the longest running transaction without having row versions generated as victim and rolls it back, and generates message 3967. Else, tempdb runs out of space error.

2. Excessive DDL and Allocation Operations
Use this query:
declare @now datetime
select @now = getdate()
select session_id, wait_duration_ms, resource_description, @now
from sys.dm_os_waiting_tasks where wait_type like ‘PAGE%LATCH_%’ and resource_description like ’2:%’

If contention is on PFS or SGAM pages, its Allocation contention. If contention is on other pages, its contention due to excessive DDLs. Also, such contention can be monitored using:
- SQL Server:Access Methods\Workfiles Created /Sec
- SQL Server:Access Methods\Worktables Created /Sec
- SQL Server:Access Methods\Mixed Page Allocations /Sec
- SQL Server:General Statistics\Temp Tables Created /Sec
- SQL Server:General Statistics\Temp Tables for destruction

In case of excessive DDL contention, only resolution is: find ways to minimize DDLs. Consider moving non-cached temp objects to cached objects (non cachced are the one’s declared inside a loop, or having indexes created seperately, or created using dynamic SQL). Try to eliminate some temp objects if possible.
In case of Allocation contention: increase the tempdb datafiles across the disks & files to match up the processor count. Use TF-1118 to eliminate mixed extent allocations.

Reference: http://msdn.microsoft.com/en-us/library/dd672789%28v=sql.100%29.aspx

 

Vikas Rajput (http://vikasrajput.wordpress.com)

SQL Server Performance Tools – All the options

December 11, 2010 Leave a comment

There are a lot of options that you can bank upon or have to use while dealing with performance problems. A concise listing would include:

- Perfmon (Performance Monitor)

- Profiler (SQL Server profiler utility)

- DBCC Commands

- DMVs (Data Management Views)

- Waits and Queues infrastructure (specifically the waiter lists)

- Data Collector and Master Data Warehouse (MDW) – new in SQL2008

- Extended Events

There are plethora of documentation available for each of them. I too intend to collate and present some information on this blog.

On a quick note, what would you do when someone walks up and say servers arent performing, where would you start? I suggest, start with waiter lists – it generally gives a good idea. Best authoritative paper on this topic can be referenced on MS site here.

 

Vikas Rajput (http://vikasrajput.wordpress.com)

Database Administration Tasks for Good Performance of SQL Server

Its sad that we, DBA, generally remain in reactive mode when it comes to performance tuning. It shall rather be proactive.
Below are some of well known best practices of admin tasks you shall be carrying out.
Go On. Test out your systems to be sure they are getting followed in your farms.

 

Task Recommendation
Avoid automatic database functions AUTO_CLOSE: cleanly shuts down database and frees all resources when the last user connection is closed.
Keep AUTO_CLOSE OFF (Keep it as default)
AUTO_SHRINK: periodically shrinks the size of the database. It can shrink the data files and, when in Simple Recovery mode, the log files. And blocks other processes when doing so.
Keep AUTO_SHRINK OFF (i.e. default)
Cycle the SQL error log file Use a SQL Server job to cycle the SQL Server log on a regular basis (EXEC master.dbo.sp_cycle_errorlog)
Default: Error log files are cycled only at server reboot (which is not ideal).
Keep the statistics up-to-date Enable for automatic statistics maintenance, ie. Using AUTO_CREATE_STATISTICS and AUTO_UPDATE_STATISTICS
Don’t rely on Auto Update completely. Plan for programmatic statistics update for EACH object
Ensure that statistics update is scheduled after completion of the index defragmentation job
Maintain a minimum amount of index defragmentation Defragment a database on a regular basis during nonpeak hours
Determine Index defragmentation regularly. Rebuild or Defrag the Index if needed
Minimize the overhead of SQL tracing Avoid online data column sorting while using Profiler
Capture trace output using a server-side trace (not Profiler)
Discard starting events while analyzing costly and slow queries
To run Profiler for very short tracing, run the tool remotely
Limit the number of events and data columns to be traced
Limit trace output size

SQL Server Database Design Best Practices for Good Performance

September 18, 2010 Leave a comment

Here’s a quick list of some of the Design best practices that shall be considered for good performing SQL Server database. 
- Avoid online data column sorting while using Profiler
- Capture trace output using a server-side trace (not Profiler)
- Discard starting events while analyzing costly and slow queries
- To run Profiler for very short tracing, run the tool remotely
- Limit the number of events and data columns to be traced
- Limit trace output size

 

Details for each is as below.

Practice Description or Reasons
Adopt index-design best practices Some quick and generic guidelines:
- Choose narrow columns for indexes
- Ensure that the selectivity of the data in the candidate column is very high OR that the candidate column has a large number of UNIQUE values
- Prefer columns with the integer data type (or its variants). Avoid VARCHAR columns for Index
- For a multicolumn index, prefer column with higher selectivity toward the leading edge of the index
- Use the INCLUDE list in an index as a way to make an index covering without changing the index key structure
- While deciding the columns to be indexed, pay extra attention to the queries WHERE clauses and JOIN criteria columns
- For queries retrieving a range of rows, clustered indexes are usually better. For point queries, nonclustered indexes are usually better.

Clustered Index require more careful planning as they are referenced by Non-Clustered Indexes as well:
- Keep clustered indexes as narrow as possible
- Create the clustered index first, and then create the nonclustered indexes on the table
- If required, rebuild a clustered index in a single step using the DROPEXISTING keyword in the CREATE INDEX command (rather than dropping and then rebuilding it)
- Do not create a clustered index on a frequently updatable column. Or nonclustered indexes on the table will have difficulty in remaining in sync with clustered index key values

Avoid the use of the sp_prefix for stored procedure names Performance hit is due to the way an SP is looked upon within SQL Server, ie.
1. In the master database
2. In the current database based on any qualifiers provided – DB name or owner
3. In the current database using dbo as the schema, if a schema is not specified
Thus object is first searched for within MASTER DB if its named as sp_prefix. A SP_CacheMiss event is caused, and then searched for within correct DB.
Balance under- and over normalization

Target "Balanced" normalization – extremes impacts the performance.
Under normalization : Causes excessive repetition of data, which can lead to improper results and often hurts performance
Over normalization: causes excessive joins across too many narrow tables. Heuristic I follow is to more closely examine a query when it exceeds 6 to 8 tables in the join criteria.

Use domain and referential integrity constraints

Data Integrity: can be enforced by restricting the data type of the column, defining the format of the input data, and limiting the range of acceptable values for the column
Data Integrity enablers: data types, FOREIGN KEY constraint, CHECK constraints, DEFAULT definitions, and NOT NULL definitions

Referential Integrity: ensures that a record in the child table exists only if the corresponding record in the parent table exists (except where NULL is allowed on child table for identifier or linked column)
Referential Integrity enablers: PRIMARY KEY (on parent), FOREIGN KEY (on child)
With the declarative referential integrity in place, the optimizer is assured that for every record in the child table, the parent table contains a corresponding record. Therefore, the JOIN clause between the parent and child tables is redundant.

Use entity-integrity constraints

Using "primary key" or "natural key" are possible options – with both having pros and cons
Primary/ Identity Key:
- generally an INT or BIGINT, which is excellent for index
- unique constraint has to be used as well to avoid any dupes

Natural Key (e.g. Social Security Number):
- separating the value of the primary key from any business knowledge is considered good design
- can carry business meaning which is bad design practice for identifier
- are clear, and human readable and carry business meaning at same time
- are generally composite, and hence not so good index candidate
Primary Key and UNIQUE constraint provide entity-integrity, and contribute to performance by providing useful information to the optimizer about the expected results, assisting the optimizer in generating efficient execution plans

Minimize the use of triggers Trigger itself is not a problem. Unawareness of its presence is a problem. As a Dev might lead a change and trigger might not be handling it causing performance issues. Avoid Triggers, if not possible, document them to best and max extent.

SQL Server Performance Tuning Approach

So many times a member walks upto you and asks what can be done in general to have a better performing SQL Server system. Arghh, I have to say it all again !

Not anymore. Here’s the list of things I think can benefit any such query. They presents an eagle-eye view only, and each shall be delved in greater details as you go along. But definitely an excellent (read structured) approach to start with.

  1. If you have the luxury of being in planning stage then – please, please – spend some time on articulating a proper physical model for your databases. Use multiple disk devices, use file groups, decide an appropriate RAID for your system, and estimate your tempdb space, decide for a backup strategy (remember, it eases up your transaction log and stops it from becoming a pain point).
  2. Nothing by-passes performance of a well planned architecture with well suited normalization or an efficient schema design. It should be your number One priority.
  3. Partition range based data tables, both vertically and horizontally
  4. Dont forget the fill factor, isolation level and error handling/logging – use them to have SQL Server offer its best
  5. Use indexes, but dont over-use them. They are there for a purpose, but not to replace the planning phase of your project
  6. Hints and Views: they are good and bad. Use them only when you know they will help. Or Test.
  7. Scale up, throw more system resources (memory, processors) so system can breath properly
  8. Scale out, in case you cant afford scale up or if it doesnt resolve your performance issue

Let me know if you think there’s something more that shall be added to the listing. I will try to present a deep dive for each of them as the time elapses.

SQL Server : LogFile getting too big or not getting truncated!

There are so many times/ situations when you can find a SQL Server DBA (like myself) completely lost. A classical is: "My DB is STRANDED for a ridiculous reason – LogFile not truncating". Its simple yet complex, for the fact that it can be avoided most of the times (alright, no design lectures here) – but it gets really tricky at times. And imagine a Prodcution site Manager staring at you while you are glaring your desktop screen with that remote DB Connection and scratching your head carrying those odd expressions… :-)

Well, here’s a step down approach which I generally consider for self (OK; for my troubled DB rather) when am really trying to "solve" the problem rather than taking a random toll to fix it:

A. DB Recovery Model
The issue can be because of transaction logFile (obviously). Things to checkout:
- recovery mode of your DB (select name, recovery_model_desc from sys.databases)
- if its FULL, have you ever taken transactional log backup. If not, better switch the DB to SIMPLE recovery mode (ALTER DATABASE DBName SET RECOVERY SIMPLE)
- if its FULL and you have taken transactional log backup in past, then try looking up what is holding things up (select name, log_reuse_wait_desc from sys.databases;)

B. Long Running transactions
- identify open transactions and respective SPIDs will be returned (DBCC OPENTRAN)
- try to identify who is using those spid’s (exec sp_who2 <spid>) OR (SELECT * FROM sys.dm_exec_sessions WHERE session_id = <spid>;)
- try to find whats running as part of these SPIDs and kill them if needed/ possible. Use "DBCC INPUTBUFFER(spid)" OR try the script below:

SELECT r.session_id, r.blocking_session_id, s.program_name, s.host_name, t.text
FROM sys.dm_exec_requests r INNER JOIN sys.dm_exec_sessions s 
ON r.session_id = s.session_id CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) t
WHERE
s.is_user_process = 1 AND
r.session_id = SPID –FROM DBCC OPENTRAN

C. Backup
LogFile truncation cannot happen during the backup/restore operation. You can create a transaction log backup when full/differential backup is occuring. BUT, you cant truncate the logFile while doing so.

D. Transactional Replication
With transactional replication, the inactive portion of the transaction log is not truncated until transactions have been replicated to the distributor.

E. Database Mirroring
Database mirroring is somewhat similar to transactional replication in that it requires that the transactions remain in the log until the record has been written to disk on the mirror server.

F. Disk Space
Might well be the reason that you are genuinely out of disk space
See if you want to add another logFile using (ALTER DATABASE DBName ADD LOG FILE;)

Once that you have truncated the logFile, you now will have to shrink the file to get it to manageable filesize.
- It can be done using EM (DB -> All Tasks -> Shrink -> LogFile) or commandLine by first identifying logFile name and then shrinking it:

select name from sys.database_files where type_desc =’LOG’; –List of logFile
DBCC SHRINKFILE (\’SalesHistory_Log\’, 1000) –shrinks the logFile to 1GB

NOTE: TRUNCATE_ONLY
One of the disastorous ways of dealing with issue and is not recommended by MS in SQLServer 2008 and onwards.
Doing so breaks the transaction log chain, which makes recovering to a point in time impossible because you have lost transactions that have occurred not only since the last transaction log backup but will not able to recovery any future transactions that occur until a differential or full database backup has been created.
BACKUP LOG SalesHistory WITH TRUNCATE_ONLY;
BACKUP LOG SalesHistory TO DISK = \’C:\Backup\SalesHistory.bak\’;  — to backup log to disk

Follow

Get every new post delivered to your Inbox.