Application Performance Analysis - Sharkfest - Wireshark

Application Performance Analysis - Sharkfest - Wireshark Application Performance Analysis - Sharkfest - Wireshark

sharkfest.wireshark.org
from sharkfest.wireshark.org More from this publisher
13.07.2015 Views

1Mike CanneyPrincipal Network Analystgetpackets.com

1Mike CanneyPrincipal Network Analystgetpackets.com


My contact infocontactMike Canney,Principal Network Analyst, getpackets.comcanney@getpackets.com319.389.11372


Capture StrategiescaptureCapture to Disk Appliance (CDA)6


Commercial vs. Free Capturecapture• Define your capture strategy– Data Rates– What are my goals? Troubleshooting vs.Statistical information.– Do I need to capture every packet?7


SPAN vs. TAPSPAN8


Capture to Disk Appliance (on a budget)budget• What is needed?– dumpcap is a command line utility includedwith the <strong>Wireshark</strong> download to enable ringbuffer captures.– Use an inexpensive PC or laptop (best tohave 2 NICs or more).– Basic batch file to initiate capture.– Cascade Pilot9


Dumpcap Exampleexamplecd \program files (x86)\wiresharkdumpcap -i 1 -s 128 -b files:100 -b filesize:2000000 –w c:\traces\internet\headersonly1.pcapThis is a basic batch file that will capture offof interface 1, slice the packets to 128bytes, write 100 trace files of ~2 Gigabytes,and write the trace file out to a pcap file.10


So why did I write multiple 2 Gig trace files?trace file• Pilot!• Pilot can easily read HUGE trace files.• This allows us to utilize our CDA in waysno other analyzer can.• I personally have sliced and diced 100GB trace files in Pilot in a matter ofseconds.11


So how does this all work together?practice• Directory full of 2GB trace files, all timestamped based on when they were writtento disk.• User calls in and complains that “thenetwork” is slow.• Locate that trace file based on time anddate and launch Pilot.12


Instructor DemodemoTroubleshooting user“Network Issue”13


Think about what you just sawhmmm• From a 2 GB trace file we were able to:– Look at the total Network throughput.– See what applications were consuming thebandwidth.– Identify the user that was responsible for consumingthe bandwidth.– Identify the URI’s the user was hitting and what theresponse times were.– Drill down to the packets involved in the slow webresponse time in <strong>Wireshark</strong>.• All in a matter of a few seconds.14


Much better and Still on a Budget…• vShark– Cascade Virtual Shark– Low cost, up to 2 TB of capture capability– Can analyze within a ESX server as well as build astand alone CDA– Great for 1 GB SPANs and below– My new, CDA of choice15


<strong>Application</strong> <strong>Analysis</strong>AppsUsing Pilot and <strong>Wireshark</strong> totroubleshoot <strong>Application</strong> issues16


So why focus on the <strong>Application</strong>?focus• In many cases it is the Network Engineers thathave the tool set to help pinpoint where theproblem exists.• “It’s not the Network!” - The Network is guiltyuntil proven innocent.• <strong>Application</strong> performance issues can impactyour business/customers ability to makemoney.• User Response time is “Relative”.• Intermittent performance issues (movingtarget).17


The “moving target”target• Analyzer placement - Two options– Move the analyzers as needed– Capture anywhere and everywhere• To defend the Network multiple capturepoints of the problem is often the bestsolution.18


Scenario XYour company has a remote site that is connected back tothe Data Center via a T1. There are 25-30 users at thissite and they are complaining that the Network is slow.Not only is is slow but it is down. When you check yourbandwidth usage you see that it is only ~300Kbps. Youdecide to take a trace file at the Data Center and at theRemote site to see what is going on.19


Why are there so many application issues?help• <strong>Application</strong>s are typically developed in a“golden” environment– Fastest PCs– High Bandwidth/low latency• When applications move from test (LAN)to production (WAN) the phone startsringing with complaints coming in.20


The <strong>Application</strong> QA Lifecycleqa cycle• In most organizations, applications go through aQA process• Typical QA/App developers test the following:– Functional tests– Regression tests– Stress tests (server)– Rinse and Repeat• What is often missing is “Networkability” testing• All QA Lifecycles should include Networkabilitytesting21


<strong>Application</strong> Networkablility Testingtesting• Identify key business transactions, number ofusers and network conditions the applicationwill be deployed in.• Simulation vs. Emulation– Simulation is very quick, often gives you roughnumbers of how an application will perform overdifferent network conditions.– Emulation is the only way to determine when anapplication will “fail” under those conditions.• A Combination of both is recommended.22


Top Causes for Poor <strong>Application</strong> <strong>Performance</strong>Top 6• TCP• <strong>Application</strong> Turns• Misconfigured Devices• Layer 7 Bottlenecks• Congestion (network)• Processing Delay23


Causes for Slow <strong>Application</strong> <strong>Performance</strong>TCPtcp24


25When are TCP ACKs sent?


Server Client TCP Hands 5,840 Bytes Up to the applica@on 5,840 Byte Block 5,840 Bytes ACKed 5,840 Byte Block 26


TCP Window Sizesize• The TCP Window Size defines the host’sreceive buffer.• Large Window Sizes can sometimeshelp overcome the impact of latency.• Depending on how the application waswritten, advertised TCP Window Sizemay not have an impact at all (more onthis later).29


TCP Inflight Datainflight• The amount of unacknowledged TCP datathat is on the wire at any given time.• TCP inflight data in limited by thefollowing:– TCP Retransmissions– TCP Window Size– <strong>Application</strong> block size• The amount of TCP inflight data will neverexceed the receiving devices advertisedTCP Window Size.30


Scenario VII – The Slow File Transfer• A customer is having an issue with their FTP to one oftheir customers. They had recently installed adedicated FTP server for this specific customer. Theyhad a 10Mbps MPLS connection and as expected, theNetwork was being blamed for the poor file transfers.When we checked the Bandwidth usage on the MPLSconnection all looked well. In fact, they were hardlyusing any of their bandwidth. The customer wasplanning to upgrade the MPLS connection from10Mbps to 20Mbps in hopes to speed up the filetransfers but decided to let us help them take a look atthe issue before making the purchase of the additionalbandwidth.31


Causes for Slow <strong>Application</strong> <strong>Performance</strong>turns<strong>Application</strong> Turns32


<strong>Application</strong> Turnsturns• An <strong>Application</strong> Turn is a request/responsepair• For each “turn” the application must waitthe full round trip delay.• The greater the number of turns, theworse the application will perform over aWAN (latency bound).33


App TurnBegin End 34


Example in <strong>Wireshark</strong>Display Filter:882 <strong>Application</strong> Turns in this trace35


App Turns and Latencylatency• It is fairly easy to determine App Turnsimpact on end user response time– Multiply the number of App Turns by theround trip delay:• 10,000 turns * .050 ms delay = 500 seconds dueto latency• Note, this has nothing to do withBandwidth or the Size of the WAN Circuit36


So what causes all these App Turns?cause• Size of a fetch in a Data Base call• Number of files that are being accessed• Loading single images in a Web Pageinstead of using an image map• Number of bytes being retrieved and howthey are being retrieved (block size)37


Pop Quiz!• We have an application that uses a 16,384 KB blocksize• This application is going to be rolled out over a DS3with 40 ms of round trip latency.• Users are complaining that the network is slow andthey need to upgrade the DS3 to a 1 Gbpsdedicated link.• What kind of throughput should I expect?39


Hint…• Forget about the bandwidth, it has nothing to dowith the throughput40


Answer• Very simple calculation• Throughput = Blocksize / Round Trip Delay• (16,384 / .04) * 8 = 3,278,800 bits per second• Again, note, this has NOTHING to do withbandwidth it’s all on the application.41


Remember to Calculate BDP• Bandwidth Delay Product• BW * RT Latency /8 = offered load to fill the pipe• (44,000,000 * .04) /8 = 220,000 bits per second tofill the DS342


43TCP Inflight Data in <strong>Wireshark</strong>


TCP Inflight Data in <strong>Wireshark</strong>The Bytes in Flight Column shows us how much payload is in each packet.44


47Easier way for SMB/CIFS


TCP Inflight Data in <strong>Wireshark</strong>Graphed in Excel48


TCP Retransmissionstcp• Every time a TCP segment is sent, aretransmission timer is started.• When the Acknowledgement for thatsegment is received the timer is stopped.• If the retransmission timer expires beforethe Acknowledgement is received, theTCP segment is retransmitted.49


TCP Retransmissionstcp flow• Excessive TCP Retransmissions can havea huge impact on application performance.• Not only does the data have to get resent,but TCP flow control (Slow Start) kicks intoaction.50


ULPs (upper layer protocols)ulp• TCP often gets blamed for the ULPs problem.– The application hands down to TCP amount of data to go retrieve(application block size)– TCP then is responsible for reliably getting that data back to theapplication layer• TCP has certain parameters in which to work with and can usuallybe tuned based on bandwidth and latency• Many times too much focus is put on “tuning” TCP as the fix forpoor performance in the network• If the TCP advertised receive window is set to 64K and the application isonly handing down to TCP requests for 16K, where is the bottleneck?51


ULPs (upper layer protocols)Case in point: CIFS/SMBulp52


Troubleshooting CIFS/SMBCifs/smb• Arguably the most common File Transfermethod used in businesses today.• SMB was NOT developed with the WAN inmind.• One of the most “chatty” protocols/applications I run into (with the exception ofpoorly written SQL).53


CIFS/SMB Quiz• What is faster using MS File Sharing?– Pushing a file to a file server?– Pulling a file from a file server?quiz54


55ULPs (upper layer protocols)


56ULPs (upper layer protocols)


CIFS/SMBcifs/smb• What is faster using MS File Sharing?– Pushing a file to a file server?– Pulling a file from a file server?• SMB Write (Pushing the file) can almost be 2X asfast as pulling (SMB Read)• Depends on the Latency57


CIFS/SMB Tuningtuning• SMB Maximum Transmit Buffer Size– Negotiated MaxBufferSize in the NegotiateProtocol response– Default for Windows servers is typically16644 (dependent upon physical memory)– Client default typically 435658


59CIF/SMB Tuning


CIFS/SMB Tuningtuning• Caveat:– SMB is extremely dependent upon the API• Even though you set the max buffer size to 64K,windows “share” data will always get truncated to60K (61440) even though the server can support64K60


CIFS/SMB Tuning• Custom SMB APIstuning– The Windows limitation can be exceeded byprograms written to use SMB as they filetransfer protocol61


CIFS/SMB TuningNote the SMB writes of 65,536 This is a file transfer using a custom API on a Windows XP machine 62


Instructor Demo of SMB ProfilesdemoDemo of SMB Tracefiles64


65My personal SMB Profile


Take Away Pointspoints• Building your own CDA is easy to do andmay fit in a majority of the areas you needto capture from• Pilot, Pilot, Pilot, it’s not just a fancyreporting engine for <strong>Wireshark</strong>!• Test your applications “Networkability”before they hit production.• Use the <strong>Wireshark</strong> Profiles, they will saveyou a ton of time.67


68Mike CanneyPrincipal Network Analystcanney@getpackets.com

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!