Performance Analysis and Tuning - Part 1 - Red Hat Summit
Red Hat Enterprise Linux 6 Scheduler Tunables

Implements multilevel run queues for sockets and cores (as opposed to one run queue per processor or per system).

RHEL6 tunables:
● sched_min_granularity_ns
● sched_wakeup_granularity_ns
● sched_migration_cost
● sched_child_runs_first
● sched_latency_ns

[Figure: Scheduler Compute Queues, showing processes queued across Socket 0 (Core 0 and Core 1, each with Thread 0 and Thread 1), Socket 1, and Socket 2.]
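A minimal sketch for inspecting the tunables listed above, assuming they are exposed as files named sched_* under /proc/sys/kernel as on the RHEL 6 kernels the slides describe (other kernels may expose them elsewhere). The function name show_sched_tunables and its optional directory argument are illustrative and not part of the original slides.

```shell
#!/bin/sh
# Print every sched_* tunable found in a directory, one per line.
# The directory defaults to /proc/sys/kernel; pass another path to
# inspect a saved copy. Numeric *_ns values are also shown in ms.
show_sched_tunables() {
    dir="${1:-/proc/sys/kernel}"
    for f in "$dir"/sched_*; do
        [ -f "$f" ] || continue              # glob matched nothing, or not a file
        name=$(basename "$f")
        value=$(cat "$f" 2>/dev/null) || continue
        case "$value" in
            ''|*[!0-9]*)
                echo "$name = $value" ;;     # not a plain number: print as-is
            *)
                case "$name" in
                    *_ns) echo "$name = $value ns ($((value / 1000000)) ms)" ;;
                    *)    echo "$name = $value" ;;
                esac ;;
        esac
    done
}
```

Calling show_sched_tunables with no argument reads the live values; no root access is needed for reading.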
Finer grained scheduler tuning

● /proc/sys/kernel/sched_*
● Red Hat Enterprise Linux 6 tuned-adm will increase the quantum on par with Red Hat Enterprise Linux 5.
● echo 10000000 > /proc/sys/kernel/sched_min_granularity_ns
  Minimal preemption granularity for CPU bound tasks. See sched_latency_ns for details. The default value is 4000000 (ns).
● echo 15000000 > /proc/sys/kernel/sched_wakeup_granularity_ns
  The wake-up preemption granularity. Increasing this variable reduces wake-up preemption, reducing disturbance of compute bound tasks. Lowering it improves wake-up latency and throughput for latency critical tasks, particularly when a short duty cycle load component must compete with CPU bound components. The default value is 5000000 (ns).
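The echo commands above overwrite the running value without recording what it was. The following is a hedged sketch of a wrapper that prints the previous value before writing the new one, so a change can be reverted by hand. The function name set_sched_tunable and its optional third argument (an alternate directory, useful for dry runs) are assumptions introduced for illustration; writing under /proc/sys requires root.

```shell
#!/bin/sh
# Write a new value to one scheduler tunable and print "name: old -> new".
# Returns non-zero if the file is missing, not writable, or the write fails.
set_sched_tunable() {
    name="$1"
    new="$2"
    dir="${3:-/proc/sys/kernel}"
    file="$dir/$name"
    [ -w "$file" ] || { echo "cannot write $file" >&2; return 1; }
    old=$(cat "$file") || return 1           # remember the current value
    echo "$new" > "$file" || return 1        # apply the new value
    echo "$name: $old -> $new"
}
```

With the values from the slide, usage would be `set_sched_tunable sched_min_granularity_ns 10000000` followed by `set_sched_tunable sched_wakeup_granularity_ns 15000000`. Changes made this way do not persist across a reboot.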