Find if Statistics is missing or is outdated
One way to tell if the statistics for a column are missing or outdated, or if the accuracy of current statistics is not ideal (they may not be ideal if the sample size of the statistics are not high enough), is to run a graphical execution plan of a query as an estimated plan and as a real plan. In other words, you will run both an estimated and an actual graphical execution plan using Query Analyzer, and then compare the results.
When you run an estimated graphical execution plan for a query, you get results, such as the estimated row count returned. When you run an actual graphical execution plan for a query, you get the actual row count returned. If the statistics for the column(s) used by the query are current and accurate, then the estimated row count and the actual row count should be very similar. If they are wildly different, this can mean three things. Either the statistics are missing, out of date, or not accurate enough.
First, check to see if the statistics are missing. If they are, they can easily be added.
Second, if statistics do exist, then update them to see if this resolves the problem.
Third, if the first two suggestions don’t work, then the statistics may not be accurate enough.
By default, when a table is less than 8 MB, the sample used to create the column statistics is the entire table. So for tables of less than 8 MB, in theory the statistics created can’t be any better than they are. So if your table is less than 8 MB, and updating the statistics doesn’t resolve the issue of a large difference between the estimated and actual row count, then nothing can. In a case like this, your only option, assuming there is a performance problem, is to use a hint to override what the Query Optimizer is doing wrong.
But if the table in question is greater than 8 MB, and when SQL Server automatically creates statistics for a table, it does not examine every row. Instead, it takes a sample of the available rows. This sample may or may not be good enough to produce accurate enough statistics. One way to get better statistics is to manually update the statistics for the table using the UPDATE STATISTICs command. This command has two options that allow you to get a better sampling of the rows in your table.
First, you can use the FULLSCAN option. This will force every row in the table or index to be fully examined and used to build the column statistics. As you might guess, using this option can increase the time it takes to update the statistics, which could hurt performance elsewhere on your server, especially if the table is huge.
As an alternative to the FULLSCAN option, you can use the SAMPLE option. This allows you to control how large a sample of rows to sample. For example, you might specify that only 50 percent of the rows are to be sampled. This is less than what a FULLSCAN will produce, but more than what a default sample will produce, offering a compromise in accuracy and resources needed to perform the sample.
If you find that the statistics for one or more of our tables is not accurate enough for your needs, you may need to schedule running the UPDATE STATISTICS command at regular intervals using an appropriate sampling level. Only through testing will you know what sampling level is best for your situation.
