12.04.2014 Views

Click here to download this presentation in PDF format. - Sybase


Assumptions

This is NOT going to be a ‘Basic’ presentation. We will be reviewing and discussing fairly advanced areas of optimizer P&T; some of this you may have seen in the past, but a little review never hurt.

• You’ve worked with optimizer P&T
• You’re running ASE 11.9.2 or above
• You understand the basics of optimization
• You’ve used traceons 302/310 and optdiag
• You’ve used the various update statistics syntax available in ASE 11.9.2 and above
• You really want to know about tuning the statistics


There are Two Kinds of Optimizer Statistics

• Table/Index level - describes a table and its index(es)
  • Page/row counts, cluster ratios, deleted and forwarded rows
  • Some are updated dynamically as DML occurs
    • Page/row counts, deleted rows, forwarded rows, cluster ratios
  • Stored in systabstats
• Column level - describes the data to the optimizer
  • Histogram (distribution), density values, default selectivity values
  • Static, need to be updated or written directly
  • Stored in sysstatistics
• This presentation deals with the column level statistics


Some Quick Definitions

Range cell density: 0.0037264745412389
Total density: 0.3208892191740000
Range selectivity: default used (0.33)
In between selectivity: default used (0.25)

Histogram for column: "A"
Column datatype: integer
Requested step count: 20
Actual step count: 10

Step Weight Value
1 0.00000000


Statistics On Inner Columns of Composite Indexes

• Think of a composite index as a 3D object: columns with statistics are transparent, those without statistics are opaque
• Columns with statistics give the optimizer a clearer picture of an index – sometimes good, sometimes not
• This is a fairly common practice
• Does add maintenance
• update index statistics is most commonly used to do this


Statistics On Inner Columns of Composite Indexes cont.

Index on columns E and B – No statistics on column B

select * from TW4
where E = "yes" and b >= 959789065 and id >= 600000 and
F > "May 14, 2002" and A_A = 959000000

Beginning selection of qualifying indexes for table 'TW4',
varno = 0, objectid 464004684.
The table (Allpages) has 1000000 rows, 24098 pages,
Estimated selectivity for E,
selectivity = 0.527436, upper limit = 0.527436.
No statistics available for B,
using the default range selectivity to estimate selectivity.
Estimated selectivity for B,
selectivity = 0.330000.


Statistics On Inner Columns of Composite Indexes cont.

The best qualifying index is 'E_B' (indid 7)
costing 49264 pages, with an estimate of 191
rows to be returned per scan of the table

FINAL PLAN (total cost = 481960):
varno=0 (TW4) indexid=0 ()
path=0xfbccc120 pathtype=sclause
method=NESTED ITERATION

Table: TW4 scan count 1, logical reads: (regular=24098 apf=0 total=24098),
physical reads: (regular=16468 apf=0 total=16468),
apf IOs used=0


Statistics On Inner Columns of Composite Indexes cont.

Statistics are now on column B

Estimated selectivity for E,
selectivity = 0.527436, upper limit = 0.527436.
Estimated selectivity for B,
selectivity = 0.022199, upper limit = 0.074835.
The best qualifying index is 'E_B' (indid 7)
costing 3317 pages, with an estimate of 13 rows to
be returned per scan of the table

FINAL PLAN (total cost = 55108):
varno=0 (TW4) indexid=7 (E_B)
path=0xfbd1da08 pathtype=sclause
method=NESTED ITERATION

Table: TW4 scan count 1, logical reads: (regular=4070 apf=0 total=4070),
physical reads: (regular=820 apf=0 total=820),
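To put the two traces side by side, the sketch below divides the figures reported without statistics on B by the figures reported with statistics on B. It is only arithmetic on the numbers printed in the slides; the dictionary keys and the function name are made up for illustration.

```python
# Figures copied from the two FINAL PLAN / statistics io outputs in the slides.
# Key names are illustrative only; they are not ASE identifiers.
without_stats_on_b = {
    "total_cost": 481960,
    "logical_reads": 24098,
    "physical_reads": 16468,
}
with_stats_on_b = {
    "total_cost": 55108,
    "logical_reads": 4070,
    "physical_reads": 820,
}


def reduction_factors(before, after):
    """Return before/after ratios for every metric present in `before`."""
    return {name: before[name] / after[name] for name in before}


factors = reduction_factors(without_stats_on_b, with_stats_on_b)
for name, factor in factors.items():
    print(f"{name}: {factor:.1f}x lower with statistics on B")
```

The ratios say nothing about other queries; they only restate how much the estimate and the I/O changed for this one example.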


Statistics On Non-Indexed Columns and Joins

Can’t help with index selection but can affect join ordering

• Columns with statistics give the optimizer a clearer picture of the column – no hard-coded assumptions have to be used
• When costing joins of non-indexed columns, having statistics may result in better plans than using the default values
• Without statistics there will be no Total density or histogram that the optimizer can use to cost the column in the join
• Yes, in some circumstances histograms can be used in costing joins – if there is a SARG on the joining column and that column is also in the join table, then the SARG from the joining table can be used to filter the join table
• If there is no SARG on the join column or on the joining column, the Total density value (with stats) or the default value (w/o stats) will be used


Statistics On Non-Indexed Columns and Joins cont.

“Inherited” SARG example

select ....from TW1, TW4
where TW1.A = TW4.A and TW1.A = 10

Selecting best index for the JOIN CLAUSE:
TW4.A = TW1.A
TW4.A = 10
Estimated selectivity for a,
selectivity = 0.003726, upper limit = 0.049683.
Histogram values used

select ....from TW1, TW4
where TW1.A = TW4.A and TW1.B = 10

Selecting best index for the JOIN CLAUSE:
TW4.A = TW1.A
Estimated selectivity for a,
selectivity = 0.320889.
Total density value used


Statistics On Non-Indexed Columns and Joins - Example

select * from TW1,TW2
where TW1.A=TW2.A and TW1.A =805975090

A simple join with a SARG on the join column of one table
Table TW2 column A has no statistics, TW1 column A does

Selecting best index for the JOIN CLAUSE: (for TW2.A)
TW2.A = TW1.A
TW2.A = 805975090   (inherited from the SARG on TW1, but it can’t help: no stats)
Estimated selectivity for A,
selectivity = 0.100000.
The best qualifying access is a table scan,
costing 13384 pages, with an estimate of 50000
rows to be returned per scan of the table,
using no data prefetch (size 2K I/O),
in data cache 'default data cache' (cacheid 0)
with MRU replacement
Join selectivity is 0.100000.

Inherited SARG from other table doesn’t help in this case


Statistics On Non-Indexed Columns and Joins – Example cont.

Without statistics on TW2.A the plan includes a reformat with TW1 as the outer table

FINAL PLAN (total cost = 2855774):
varno=0 (TW1) indexid=2 (A_E_F)
path=0xfbd46800 pathtype=sclause
method=NESTED ITERATION
varno=1 (TW2) indexid=0 ()
path=0xfbd0bb10 pathtype=join
method=REFORMATTING

• Not the best plan – but the optimizer had little to go on


Statistics On Non-Indexed Columns and Joins – Example cont.

• Table TW2 column A now has statistics
• The inherited SARG on TW1.A can now be used to help filter the join on TW2.A

Selecting best index for the JOIN CLAUSE:
TW2.A = TW1.A
TW2.A = 805975090
Estimated selectivity for A,
selectivity = 0.001447, upper limit = 0.052948.
The best qualifying access is a table scan,
costing 13384 pages, with an estimate of 724 rows to be
returned per scan of the table, using no data prefetch
(size 2K I/O), in data cache 'default data cache' (cacheid 0)
with MRU replacement
Join selectivity is 0.001447.
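The row estimates in the two TW2 traces are consistent with "estimated rows = selectivity × table rows". The sketch below reproduces them under one assumption: the slides never state how many rows TW2 has, so 500,000 is inferred from the trace without statistics (selectivity 0.100000 giving 50000 rows). The function name and the constant are illustrative, and the trace prints selectivity rounded to six decimals, so the match is approximate by nature.

```python
from decimal import Decimal, ROUND_HALF_UP

# ASSUMPTION: TW2 row count is not given in the slides; 500,000 is inferred
# from the no-statistics trace (0.100000 selectivity -> 50000 rows).
TW2_ROWS = 500_000


def estimated_rows(table_rows, selectivity):
    """Rows expected to qualify: table rows times selectivity, rounded half up.

    `selectivity` is passed as a string so the printed trace value is used
    exactly, without binary floating-point noise.
    """
    exact = Decimal(table_rows) * Decimal(selectivity)
    return int(exact.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


default_estimate = estimated_rows(TW2_ROWS, "0.100000")    # no statistics on TW2.A
histogram_estimate = estimated_rows(TW2_ROWS, "0.001447")  # statistics on TW2.A

print(default_estimate, histogram_estimate)
```

With the default selectivity the optimizer expects roughly 69 times more rows from TW2 than with the histogram-based value, which is the kind of difference that can change join order.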


Statistics On Non-Indexed Columns and Joins – Example cont.

• With statistics on TW2.A reformatting is not used and the join order has changed

FINAL PLAN (total cost = 1252148):
varno=1 (TW2) indexid=0 ()
path=0xfbd0b800 pathtype=sclause
method=NESTED ITERATION
varno=0 (TW1) indexid=2 (A_E_F)
path=0xfbd46800 pathtype=sclause
method=NESTED ITERATION


The Effects of Changing the Number of Steps (Cells)

• The number of cells (steps) affects SARG costing – as the number of steps changes, costing does too
• Cell weights and range cell density are used in costing SARGs
• For Equi-SARGs, the cell weight is used as the column’s ‘upper limit’ and the Range cell density is used as the ‘selectivity’ – as seen in 302 output
• For Range SARGs, the result of interpolation is used as the column ‘selectivity’
• Increasing the number of steps narrows the average cell width, thus the weight of Range cells decreases
• Can also result in more Frequency count cells and thus change the Range cell density value
• More cells means more granular cells


The Effects of Changing the Number of Steps (Cells) cont.

Average cell width = # of rows/(# of requested steps – 1)

• Table has 1 million rows
  • 20 requested steps: 1,000,000/19 = 52,632 rows per cell
  • 200 requested steps: 1,000,000/199 = 5,025 rows per cell
• What does this mean?
  • As you increase the number of steps (cells) they become narrower – representing fewer values
  • We’ll see that this has an effect on how the optimizer estimates the cost of a SARG
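The average cell width formula above can be checked with a few lines of arithmetic. This is a minimal sketch; the function name is made up, and the only inputs are the row count and step counts used in the slide.

```python
def average_cell_width(table_rows, requested_steps):
    """Average rows per histogram cell: rows / (requested steps - 1)."""
    if requested_steps < 2:
        raise ValueError("at least 2 steps are needed to form one cell")
    return table_rows / (requested_steps - 1)


TABLE_ROWS = 1_000_000  # row count used in the slide

for steps in (20, 200):
    width = average_cell_width(TABLE_ROWS, steps)
    print(f"{steps} requested steps: about {round(width):,} rows per cell")
```

Going from 20 to 200 requested steps shrinks the average cell by roughly a factor of ten, which is why the range cell weights in the following histograms drop from about 0.05 to about 0.005.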


The Effects of Changing the Number of Steps (Cells) cont.

Changing the number of steps – effects on Equi-SARGs

select A from TW2 where B = 842000000

With 20 cells (steps) in the histogram
Range cell density: 0.0012829768785739
9 0.05263200


The Effects of Changing the Number of Steps (Cells) cont.

With 200 cells (steps) in the histogram
Range cell density: 0.0002303825911991
77 0.00507200

• Range cell density decreased because Frequency count cells appeared in the histogram
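As a rough illustration of how the step count changes Equi-SARG costing, the sketch below applies the rule stated earlier in the deck (Range cell density is the selectivity, cell weight is the upper limit) to the 20-step and 200-step histogram values. It is a simplified model, not ASE code. The row count of TW2 is not stated in the slides; 500,000 is an assumption inferred from the earlier TW2 trace in which a selectivity of 0.100000 produced an estimate of 50000 rows.

```python
def equi_sarg_estimate(range_cell_density, cell_weight, table_rows):
    """Simplified Equi-SARG costing for a value that falls in a range cell.

    selectivity  -> Range cell density
    upper limit  -> weight of the cell the value falls into
    Row figures are just those fractions multiplied by the table row count.
    """
    return {
        "selectivity": range_cell_density,
        "upper_limit": cell_weight,
        "estimated_rows": round(range_cell_density * table_rows),
        "upper_limit_rows": round(cell_weight * table_rows),
    }


# ASSUMPTION: TW2 row count is inferred, not given in the slides.
TW2_ROWS = 500_000

# Values copied from the 20-step and 200-step histograms in the slides.
est_20 = equi_sarg_estimate(0.0012829768785739, 0.05263200, TW2_ROWS)
est_200 = equi_sarg_estimate(0.0002303825911991, 0.00507200, TW2_ROWS)

print("20 steps: ", est_20)
print("200 steps:", est_200)
```

Under these assumptions both the row estimate and its upper limit fall when the histogram has more steps, which matches the direction of change the slides describe.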


The Effects of Changing the Number of Steps (Cells) cont.

Changing the number of steps – effects on Range SARGs

select * from TW2 where B between 825570000 and 830000000

With 20 cells (steps) in the histogram
Range cell density: 0.0012829768785739
9 0.05263200


The Effects of Changing the Number of Steps (Cells) cont.

select * from TW2 where B between 825570000 and 830000000

With 200 cells (steps) in the histogram
Range cell density: 0.0002303825911991
67 0.00505200


Adding Boundary Values To The Histogram

• Changing the boundary values can keep SARG values within the histogram
• Avoids ‘out of bounds’ costing
• Out of bounds costing usually happens on an atomic column whose histogram is out of date in relation to the SARG value(s)
• Optimizer has only two choices for selectivity – 1 or 0 – depending on the SARG operator and which end of the histogram the SARG value falls outside of


Adding Boundary Values To The Histogram cont.

Histogram for column: "F"
Column datatype: datetimn
Requested step count: 20
Actual step count: 20

Step Weight Value
1 0.28396901 < "May 1 2002 12:00:00:000AM"
2 0.04839900 = "May 1 2002 12:00:00:000AM"

20 0.00432500


Adding Boundary Values To The Histogram cont.

Out of bounds costing that uses a 0.00 selectivity

select count(*) from TW1 where F = "April 30, 2002"


Adding Boundary Values To The Histogram cont.

Out of bounds costing that uses a 1.00 selectivity

select count(*) from TW1 where F >= "Apr 30 2002"

> "Apr 30 2002"
"May 16 2002"

Estimated selectivity for F,
selectivity = 1.000000.
Lower bound search value 'Apr 30 2002 12:00:00:000AM' is less
than the smallest value in sysstatistics for this column.
Estimating selectivity of index 'ind_F', indid 6
scan selectivity 1.000000, filter selectivity 1.000000
Search argument selectivity is 1.000000.
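The 0-or-1 behaviour described for out of bounds costing can be modelled in a few lines. This is a simplified sketch of the rule as the slides state it (the result depends on the operator and on which end of the histogram the value falls outside of), not the optimizer's actual logic. The function name is made up, the lower boundary is taken from the histogram for F, and the upper boundary of May 16 2002 is an assumption.

```python
from datetime import date


def out_of_bounds_selectivity(op, value, hist_min, hist_max):
    """Return 0.0 or 1.0 for a SARG value outside the histogram, else None.

    None means the value is inside the histogram, so normal costing
    (cell weights, densities, interpolation) would apply instead.
    """
    if hist_min <= value <= hist_max:
        return None
    below = value < hist_min
    if op == "=":
        return 0.0  # no known value can match
    if op in (">", ">="):
        return 1.0 if below else 0.0  # below the range: every row qualifies
    if op in ("<", "<="):
        return 0.0 if below else 1.0  # above the range: every row qualifies
    raise ValueError(f"unsupported operator: {op}")


HIST_MIN = date(2002, 5, 1)   # first boundary in the histogram for F
HIST_MAX = date(2002, 5, 16)  # ASSUMPTION: upper boundary is not shown in full

equality_case = out_of_bounds_selectivity("=", date(2002, 4, 30), HIST_MIN, HIST_MAX)
range_case = out_of_bounds_selectivity(">=", date(2002, 4, 30), HIST_MIN, HIST_MAX)
print(equality_case, range_case)
```

The two calls mirror the two queries on TW1.F above: the equality SARG gets a selectivity of 0 and the `>=` SARG gets a selectivity of 1.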


Adding Boundary Values To The Histogram cont.

What to do if out of bounds costing is a problem

• Not always a problem, particularly when a selectivity of 0.000000 is used
• There are two ways to deal with it
  • Add a dummy row to the table with a column value that allows the SARG value(s) to fall within the histogram – not always allowed
    • If you do add a dummy row keep in mind that it will affect the histograms of other columns; be careful with the values you use
  • Write a new histogram boundary using optdiag. Edit the file and read it back in. This won’t directly affect the data, but it will extend the histogram to include the SARG value(s)


Removing Statistics Can Affect Query Plans

Sometimes having no statistics is better than having them

This will usually be an issue when very dense columns are involved

Histogram for column: "E"
Step Weight Value
1 0.00000000 < "no"
2 0.47256401 = "no"
3 0.00000000 < "yes"
4 0.52743602 = "yes"

This can also show up when you have ‘spikes’ (Frequency count cells) in the distribution


Removing Statistics Can Affect Query Plans cont.

select count(*) from TW4
where E = "yes" and C = 825765940

The table…has 1000000 rows, 24098 pages,
Estimated selectivity for E,
selectivity = 0.527436, upper limit = 0.527436.
Estimating selectivity of index 'E_AA_B', indid 6
scan selectivity 0.52743602, filter selectivity 0.527436
527436 rows, 174107 pages
The best qualifying index is 'E_AA_B' (indid 6)
costing 174107 pages, with an estimate of 526 rows

FROM TABLE
TW4
Nested iteration.
Table Scan.


Removing Statistics Can Affect Query Plans cont.

delete statistics TW4(E)

Estimated selectivity for E,
selectivity = 0.100000.
Estimating selectivity of index 'E_AA_B', indid 6
scan selectivity 0.100000, filter selectivity 0.100000
100000 rows, 20584 pages
The best qualifying index is 'E_AA_B' (indid 6)
costing 20584 pages, with an estimate of 92 rows

FROM TABLE
TW4
Nested iteration.
Index : E_AA_B
Forward scan.
Positioning by key.


Maintaining Tuned Statistics

Tuned statistics will add to your maintenance

• Any statistical value you write to sysstatistics, either via optdiag or sp_modifystats, will be overwritten by update statistics
• Keep optdiag input files for reuse
  • If needed, get an optdiag output file, edit it and read it in
• Keep scripts that run sp_modifystats
• Rewrite tuned statistics after running any update statistics that affects the column with the modified statistics


Sampling For Update Statistics

New feature in 12.5.0.3

• Can dramatically speed up the running of update statistics
• Reads rows from random pages to build column level statistics (histogram)
• The percentage of pages to sample can be specified
  update statistics table(col) with sampling=10 percent
• Also applies to update index statistics and update all statistics
• Unofficial tests show that a sampling rate of 10% on a 1 million row numeric column reduces the time for update statistics to run from 9 minutes to 30 seconds


Sampling For Update Statistics cont.

• Density values are not updated by sampling
• Sampled statistics will vary from those obtained by a ‘full scan’
• More variations will appear as the sampling rate decreases
• Test queries against sampled statistics. In most cases you won’t see any major changes
• Values may become ‘out of bounds’; this will affect the optimizer – likely to have the greatest effect on atomic columns


Where To Get More Information

• The Sybase Customer newsgroups
  • http://support.sybase.com/newsgroups
• The Sybase list server
  • SYBASE-L@LISTSERV.UCSB.EDU
• The external Sybase FAQ
  • http://www.isug.com/Sybase_FAQ/
• Join the ISUG, ISUG Technical Journal, feature requests
  • http://www.isug.com


Where To Get More Information

• The latest Performance and Tuning Guide
  • Don’t be put off by the ASE 12.0 in the title, it covers the 11.9.2 features/functionality too
  • http://sybooks.sybase.com/onlinebooks/group-as/asg1200e
• Any “What’s New” docs for a new ASE release
• Tech Docs at Sybase Support
  • http://techinfo.sybase.com/css/techinfo.nsf/Home
• Upgrade/Migration help page
  • http://www.sybase.com/support/techdocs/migration


Sybase Developer Network (SDN)

Additional Resources for Developers/DBAs

• Single point of access to developer software, services, and up-to-date technical information:
  • White papers and documentation
  • Collaboration with other developers and Sybase engineers
  • Code samples and beta programs
  • Technical recordings
  • Free software
• Join today: www.sybase.com/developer or visit SDN at TechWave’s Technology Boardwalk
