Introduction to the DirectX 9 Shader Models - Nvidia
How to use fp16
• When using pixel shader 2.0+ assembly, you can ask for 16-bit precision via the _pp modifier
  – Modifier applies to instructions
  – Modifier applies to inputs
dcl[_pp] dest[.mask]
dcl_pp t0.xyz      // declare t0 as partial precision
sub_pp r0, r1, t0  // perform math at fp16
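To get a feel for what a partial-precision instruction can do to a result, the following minimal sketch emulates `sub_pp` on the CPU. It assumes, purely for illustration, that the hardware rounds both inputs and the result to IEEE binary16; real hardware may keep more precision, since `_pp` is only a hint. The helper names `to_fp16` and `sub_pp` are hypothetical and not part of any DirectX API.

```python
import struct

def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 binary16 value, returned as a Python float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def sub_pp(a: float, b: float) -> float:
    """Hypothetical emulation of `sub_pp`: inputs and result are rounded to fp16."""
    return to_fp16(to_fp16(a) - to_fp16(b))

# Near 1000 the fp16 values are 0.5 apart, so small differences are distorted.
full = 1000.3 - 1000.1             # full-precision subtraction
partial = sub_pp(1000.3, 1000.1)   # emulated partial-precision subtraction

print("full precision:   ", full)
print("partial precision:", partial)
```

The two printed values differ noticeably, which is the kind of error to watch for when large-magnitude inputs such as world-space positions are marked partial precision.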
Precision Pitfalls
• An IEEE-like fp16 is 1 sign bit, 5 bits of exponent and 10 bits of mantissa
• Think of the mantissa as tick marks on a ruler
• The exponent is the length of the ruler
  – +/-1024 ticks, no matter what, so about 0.1% precision across whatever range you have
[Figure labels: Inches, Feet]
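The ruler analogy can be checked numerically. The sketch below assumes IEEE 754 binary16 (which Python's `struct` format `e` implements) and normal, non-denormal values; the helper names `to_fp16` and `fp16_spacing` are made up for this illustration.

```python
import math
import struct

def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 binary16 value, returned as a Python float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def fp16_spacing(x: float) -> float:
    """Distance between adjacent fp16 values near x (normal range only)."""
    _, exponent = math.frexp(abs(x))   # abs(x) = m * 2**exponent, 0.5 <= m < 1
    return 2.0 ** (exponent - 11)      # 10 mantissa bits -> 1024 ticks per power of two

# The ruler gets longer as the exponent grows, but always carries 1024 ticks.
for x in (1.0, 16.0, 1000.0, 2048.0):
    tick = fp16_spacing(x)
    print(f"near {x:7.1f}: tick = {tick:<12g} relative = {tick / x:.4%}")

# Between 1.0 and 2.0 there are exactly 1024 ticks.
ticks_in_one_to_two = (2.0 - 1.0) / fp16_spacing(1.0)
```

Under these assumptions the relative tick size stays between roughly 0.05% and 0.1% regardless of magnitude, while the absolute tick size doubles with every power of two.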