May 19, 2008

Use the Shell, Luke

My Bitcricket cohort Robert Bullen couldn't resist the urge to write a column for the Network Guy.  I said "Sure, why not?" and so here it is!  Take it away Robert...

Most of the time, a GUI-based protocol analyzer is the primary weapon in the fight against network issues. It is to the analyst as the Light Saber is to the Jedi. But there is also a lesser known dark side (so called for its typical black background). It goes by many names: the Shell, Terminal, Console, DOS Prompt, or Command Line. This article will introduce you to the power of the dark side, and describe how I was seduced in the heat of one analysis battle in particular.

One of Bitcricket’s recent Enterprise clients was suffering from intermittent printer problems, which were described as everything from printers being slow to printing garbage to widespread outages. Together we identified several potentially interesting printers to be monitored. The client had little in the way of protocol analyzers, so using the free Wireshark to capture these printers’ traffic was a no-brainer. Several Wireshark-enabled laptops were attached to switch SPAN ports and configured to capture to disk in ring buffer mode, saving files in 32 MB segments.
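For readers who want to set up a similar round-the-clock capture, here is a minimal Python sketch that only builds a dumpcap command line for ring buffer mode. The interface name, output path, and file count are placeholders of my own, not the values used at the client site. dumpcap takes the segment size in kB, so 32 MB is written here as 32000; check the man page of your version for how it defines a kB.

```python
def dumpcap_ring_buffer_args(interface, out_path, segment_kb=32000, max_files=100):
    """Build (but do not run) a dumpcap command for ring buffer capture.

    -b filesize:N  starts a new file after roughly N kB
    -b files:N     keeps at most N files, overwriting the oldest
    All default values here are illustrative assumptions.
    """
    return [
        "dumpcap",
        "-i", interface,
        "-b", f"filesize:{segment_kb}",
        "-b", f"files:{max_files}",
        "-w", out_path,
    ]


if __name__ == "__main__":
    # Hypothetical interface and file name, shown only to print the command.
    print(" ".join(dumpcap_ring_buffer_args("eth0", "printers.pcap")))
```

The list can be handed to subprocess.run() on a machine where dumpcap is installed.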

What’s that you say? You aren’t a Wireshark user? This article doesn’t apply to you? I disagree. Read on.

The first printer that went down under my watch was one of nine attached to the same switch. This means that nine printer ports were SPANed to a single capture, and therefore nine copies of broadcast and multicast packets were sent to the SPAN port. Due to buffering limitations in this particular switch model, a maximum of five copies made it out of the SPAN port and into the trace files. In addition, the client’s network was suffering from a separate unicast flooding issue, so not only were broadcasts and multicasts duplicated, or more accurately quintupled, but so were some unicasts. Either way, that was four more copies than I needed.
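To make the duplicate problem concrete, here is a small Python sketch of one way to collapse back-to-back copies: hash each packet and drop it if the same hash was seen among the last few packets kept. This is only an illustration of the idea. The window size of 5 and the choice of SHA-256 are my assumptions, and this is not a description of how editcap works internally.

```python
import hashlib
from collections import deque


def drop_duplicates(packets, window=5):
    """Return packets with near-adjacent byte-for-byte copies removed.

    packets: iterable of bytes objects (raw frames).
    window:  how many recently kept packets to compare against (assumed value).
    """
    recent = deque(maxlen=window)  # hashes of the most recently kept packets
    unique = []
    for raw in packets:
        digest = hashlib.sha256(raw).digest()
        if digest in recent:
            continue  # a copy of something we kept very recently
        recent.append(digest)
        unique.append(raw)
    return unique


if __name__ == "__main__":
    quintupled = [b"broadcast-1"] * 5 + [b"broadcast-2"] * 5
    print(len(drop_duplicates(quintupled)))  # 2
```

A small window matters: an identical packet that legitimately shows up much later in the trace is kept, because its earlier hash has already aged out.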

So in my post-capture analysis I had the displeasure of looking at dozens of 32 MB files containing packets from several switch ports, many of them quintupled. For each of these files I needed to filter out unrelated traffic, remove duplicate packets (my own personal Clone Wars), and concatenate the results back into a single file for easier viewing in the analyzer GUI. After going through the process manually in Wireshark, I felt that I should automate it. Fortunately, Wireshark’s command line tools are plentiful and capable. Four of them proved useful in this scenario:

  • dumpcap – This utility does one thing and does it well; it captures to disk. While I didn’t use dumpcap in my post-capture analysis, it was used to initiate the round-the-clock capturing and saving of printer traffic.
  • editcap – Manipulates capture files, mostly by removing packets but has other uses as well, such as converting between trace file formats. I used editcap’s remove duplicate packets option to reduce quintuplets to a single packet. It also came in handy for slicing packets to a shorter length to reduce file sizes when transferring files via email.
  • tshark – Stands for Terminal-based Wireshark. It is the spunky R2D2 of the command line tools and does everything short of projecting a holographic distress call from the Princess. The feature I found most useful in this case was the ability to read an input trace file, apply a Wireshark display filter, and write an output file containing only the packets that passed the filter.
  • mergecap – Merges or appends multiple input trace files into a single output trace file. After reducing the original 32 MB-segmented trace files using editcap and tshark, I used mergecap to recombine the results into a single file.

To learn more about these command line tools or for some usage examples, check out these external resources:

  • Laura Chappell, the Obi Wan Kenobi of Wireshark, recently started the BitSpitters series, which are short, instructional videos posted on YouTube. (Think of scaled down versions of her Animated Articles.) “Edit Trace File Timestamps” is one that demonstrates the use of editcap to adjust timestamps between a client-side capture and a server-side capture.
  • Sake Blok is the Yoda of the Wireshark command line, although his name already sounds like that of a Star Wars character. He led a presentation at Sharkfest 2008 titled “Advanced Scripting and Command Line Usage with tshark and Related Utilities”. In his slide deck, he discusses a similar client/server synchronization situation and provides a shell script that automatically synchronizes the timestamps between two trace files based on PING packets contained therein.

I assembled these utilities in a script to process dozens of the aforementioned 32 MB files. It saved me a great deal of time, resulted in a single 3 MB file that was a breeze to view in the analyzer, and allowed me to more easily uncover some interesting clues to the problem. I was happy enough to raise my arms over my head and utter Chewbacca-like sounds of joy. For those who are curious about the script, or would like to use it as a jumping-off point for their own dark side analysis, look at it here.
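The script itself is linked above. What follows is not that script, just a rough Python sketch of the same three steps (remove duplicates, filter, merge) that builds the command lines without running them. The file names, working directory, and display filter are hypothetical. The flags are editcap -d, tshark -r/-Y/-w, and mergecap -w as found in current Wireshark releases; older tshark versions used -R where -Y appears here. The order of the dedupe and filter steps is my own choice.

```python
from pathlib import Path


def build_pipeline(segments, display_filter, work_dir, merged_path):
    """Return the list of commands needed to reduce and recombine trace segments.

    segments:       paths of the captured 32 MB files
    display_filter: a Wireshark display filter selecting the traffic of interest
    work_dir:       directory for intermediate files (assumed to exist)
    merged_path:    final single output file
    """
    work = Path(work_dir)
    commands = []
    reduced = []
    for seg in segments:
        stem = Path(seg).stem
        deduped = str(work / f"{stem}.dedup.pcap")
        kept = str(work / f"{stem}.filtered.pcap")
        # Step 1: collapse duplicate packets.
        commands.append(["editcap", "-d", str(seg), deduped])
        # Step 2: keep only packets that pass the display filter.
        commands.append(["tshark", "-r", deduped, "-Y", display_filter, "-w", kept])
        reduced.append(kept)
    # Step 3: recombine everything into one file for the GUI.
    commands.append(["mergecap", "-w", str(merged_path), *reduced])
    return commands


if __name__ == "__main__":
    # Hypothetical printer address and file names.
    for cmd in build_pipeline(["seg1.pcap", "seg2.pcap"],
                              "ip.addr == 10.0.0.5", "work", "merged.pcap"):
        print(cmd)
```

Each command list can be passed to subprocess.run(cmd, check=True) once the tools are on the PATH.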

What does this have to do with non-Wireshark users? Well, much like C-3PO, the, ahem, protocol droid who boasts his fluency in over 6 million forms of communication, these utilities support the same extensive set of input file formats and output file formats as Wireshark. Most likely they can be used with your analyzer’s native file format too, or at least read and write a common denominator file format.

The next time you find yourself wading through a heap of data, remember the power of the dark side. Like a Light Saber, it may be a life saver. Or at least a time saver.

May the shell be with you.

April 08, 2008

Pilot Sneak Preview: A New Direction in Network Analysis?

Build a better analysis front-end and they will come. That’s what CACE Technologies hopes to achieve with its Pilot visualization and reporting tool (expected to be announced on or prior to 4/18). Pilot (named after the fish that "congregate around sharks, rays, and sea turtles, where it eats parasites on and leftovers around the host species" according to Wikipedia) was previewed at the Wireshark developer’s conference last week. I was fortunate enough to get my hands on a beta.

In my opinion, established vendors had nothing to fear from Wireshark. Build a superior expert system, high performance capture and aggregation hardware, easy to use distributed tools and data mining and you have a winner. That is, until now.

CACE is the first commercial vendor to truly embrace Wireshark as a platform while other vendors stood back in fear. Why are they afraid? Why not embrace open source rather than try to hide it, as others have done? An example of such hiding is incorporating so-called third-party decodes from Wireshark’s predecessor, Ethereal.

Pilot is different the moment you fire it up.  Notice in the screen shot below, the modern GUI and the ability to learn several aspects of the product via a series of short videos.  I'd love to see other vendors follow this refreshing approach.

[Screenshot: Pilot overview]

Pilot is more than just a pretty face. It also serves as a data mining tool to cull data from a large number of Wireshark files. In a recent situation, I had an analyzer-less customer deploy a number of Wiresharks at several suspected problem areas in their network for long-term capture to disk. We were then able to go back and manually mine data from several capture points when a particular event occurred and zero in on the problem. With Pilot, we can now bring those long-term capture files together to assist in the mining and analysis process.

At the heart of the product is a Google Finance-like chart that slides across statistics collected from one or more packet traces, shown in the screen shot below. The highlighted part is a section I selected by hand to "send to Wireshark" for deep packet inspection. Pilot leaves not only the packet decodes but all packet display functions to Wireshark, a departure from other vendors that merely grabbed the Wireshark decoders. Pilot can also take advantage of WinPcap and AirPcap to grab real-time wired and wireless packet-derived data.

[Screenshot: Pilot graph]

There are other goodies in the interface, like dragging and dropping a view (such as top MAC or IP sources, conversations, bandwidth by bytes or packets, and so on) on top of your selected file(s) or a section of a graph.  For instance, perhaps you only want the IP Conversations view for the highlighted portion in the bytes per second graph in the above screenshot. Merely drag the view from the selection tree on the left-hand side over the highlighted part in the graph and instantly see the conversations only for that time span. Way cool.
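To show what "conversations only for that time span" amounts to, here is a toy Python sketch that totals bytes per IP pair for packets inside a selected window. The record layout (timestamp, source, destination, size) and the sample addresses are made up for illustration. This is not how Pilot is implemented.

```python
from collections import defaultdict


def conversations_in_window(packets, start, end):
    """Total bytes per IP conversation for packets with start <= ts < end.

    packets: iterable of (timestamp, src_ip, dst_ip, size_in_bytes) tuples.
    Returns a list of ((ip_a, ip_b), total_bytes), busiest conversation first.
    """
    totals = defaultdict(int)
    for ts, src, dst, size in packets:
        if start <= ts < end:
            # A conversation is direction-independent, so sort the pair.
            key = tuple(sorted((src, dst)))
            totals[key] += size
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    sample = [
        (0.5, "10.0.0.1", "10.0.0.2", 100),
        (1.2, "10.0.0.2", "10.0.0.1", 300),
        (1.5, "10.0.0.3", "10.0.0.1", 50),
        (3.0, "10.0.0.1", "10.0.0.2", 999),
    ]
    for pair, total in conversations_in_window(sample, 1.0, 2.0):
        print(pair, total)
```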

Linux users are out of luck though – this is a Windows only product built using Microsoft Visual Studio tools, as clearly evidenced by the Office 2007 ribbon interface. Frankly, when I first used Office 2007, I didn’t like the new interface as I was used to using previous versions. Once I forced myself to learn it however, I felt that it was superior (who says you can’t teach an old dog new tricks?). As such, I felt right at home with Pilot.

There are a couple of improvements I'd like to see, however. For instance, you can "send" a statistic or part of a graph, such as one or more parts of a histogram (using multi-select) for top talkers (sources) to Wireshark for deep packet inspection. Unfortunately, you only see one-way packet streams from those source addresses. I’d love to see a feature pioneered by WildPackets with its Select Related feature and imitated by others as a "quick filter", to select a choice of source and/or source and peers, so I can follow the flows. Analyzing one-way top talkers at the packet level makes sense for broadcasts, but less so for unicast traffic.
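The difference between the one-way view and the two-way view I'm asking for can be expressed as Wireshark display filters. This short Python sketch builds both from a list of top-talker addresses. The addresses are invented, and the helper names are mine. In Wireshark's filter language, ip.src matches only the source field while ip.addr matches either source or destination.

```python
def one_way_filter(sources):
    """Display filter matching only packets sent BY the given addresses."""
    return " || ".join(f"ip.src == {ip}" for ip in sources)


def two_way_filter(sources):
    """Display filter matching packets sent by OR to the given addresses."""
    return " || ".join(f"ip.addr == {ip}" for ip in sources)


if __name__ == "__main__":
    talkers = ["10.0.0.1", "10.0.0.2"]  # hypothetical top talkers
    print(one_way_filter(talkers))
    print(two_way_filter(talkers))
```

The second filter is the one that lets you follow a flow in both directions.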

There's more to the product, including a number of output options for reporting in a variety of formats from PDF to Excel.  Watch for the announcement and check out a demo.

I was thinking it would be interesting if CACE supported more than just the Wireshark analyzer. Despite claiming to be integrated with Wireshark, it really boils down to passing a portion of one or more trace files as a capture source along with a filter to Wireshark. Why not support the same for other analyzers? On second thought, that could cause some serious heartburn for competing vendors.

With over 300,000 Wireshark downloads per month, users will finally have a real tool to go hand-in-hand with it and help ease some of their analysis pains. One question that comes to mind is how many users of a free open source tool will be willing to pay real money for Pilot at $1,295 a pop including maintenance (the projected introductory pricing)? Only time will tell.

Meanwhile, by feasting on those morsels surrounding the Wireshark community, Pilot could prove to be an industry disruptor, even more so when the distributed version becomes available.

March 11, 2008

A Tale of Five Analyzers

How do packet analyzers stack up in detecting and reporting the simplest and most fundamental indication of an anomaly, the venerable TCP retransmission?  I recently looked at five tools ranging from free to $80,000.  I did this because I saw something suspicious as reported by the big gun.

In my book I recommend that anytime you see suspect reporting or diagnostics, you verify them at least once the hard way – by hand – so that from then on you know whether they are accurate under those circumstances.

In this particular case, the "AppDoctor" of the high-end tool was informing me that “There are many packet retransmissions.  The network may be heavily congested, or there may be an error-prone link.”  The value it gave me was over 5% which is certainly of concern.  That’s 50+ packets out of every 1,000.

I didn't suspect at the time that the network path was congested and didn't want to chase down duplex mismatches and the like, so I wanted a second opinion and ran the trace through the free tool.   It came up with all of 3 retransmissions per 1,000 packets, or about 1/3 of a percent.

Why the large difference of opinion?  Shouldn’t packets be cut and dried, i.e. factual?  Before answering the question, I’d like to point out some of the various idiosyncrasies in how a number of analyzers report TCP retransmissions.

As you may have guessed, the free tool is Wireshark.  The $80,000 tool, Opnet’s ACE.  I also ran the trace through the latest Network Instruments Observer, WildPackets OmniPeek, and Network General Sniffer analyzers and focused on one section where the Opnet ACE diagnosed 72 retransmissions out of some 1,100 packets.

Observer reported the same high retransmission count.  It tried to be extra helpful in noting that they were also of the “too fast retransmission” variety (at or below 180 ms by default) and that they were “excessive”  (2% or more of the total packets by default for the critical level).  That would have been a great diagnosis if it had been correct.  More on this later.

The Sniffer values were a little strange, depending on whether you are looking at the number of Sniffer symptom objects or the packet summary decodes.   The summary decodes contain three “Expert: Retransmission” notifications, yet the tally in the expert summary lists only two, possibly due to a grouping by TCP flow/conversation (i.e. two of the three retransmissions were in the same TCP flow).

So the individual packet retransmission counts for each analyzer were:

  • Opnet: 72
  • Wireshark: 3
  • Observer: 72
  • OmniPeek: 3
  • Sniffer: 3
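To put those counts in perspective, this small Python sketch converts each one to a percentage of the roughly 1,100 packets in the section and checks it against the 2% "excessive" level that Observer uses by default, as mentioned earlier. The exact packet total is an approximation taken from "some 1,100 packets", so treat the percentages as ballpark figures.

```python
COUNTS = {"Opnet": 72, "Wireshark": 3, "Observer": 72, "OmniPeek": 3, "Sniffer": 3}
TOTAL_PACKETS = 1100          # approximate, from "some 1,100 packets"
EXCESSIVE_PERCENT = 2.0       # Observer's default critical level per the text


def retransmission_rate(retransmissions, total):
    """Retransmissions as a percentage of all packets."""
    return 100.0 * retransmissions / total


def report(counts, total, threshold=EXCESSIVE_PERCENT):
    """Return (tool, percent, is_excessive) for each tool, in input order."""
    rows = []
    for tool, count in counts.items():
        rate = retransmission_rate(count, total)
        rows.append((tool, rate, rate >= threshold))
    return rows


if __name__ == "__main__":
    for tool, rate, excessive in report(COUNTS, TOTAL_PACKETS):
        verdict = "excessive" if excessive else "ok"
        print(f"{tool}: {rate:.1f}% ({verdict})")
```

With these inputs, 72 retransmissions works out to about 6.5% and 3 to about 0.3%, which is the difference between a red alert and a non-event.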

As I used to say in my classes, shall we just average the results and call it a day?  Not!

The right answer, verified manually, is three, meaning Wireshark, OmniPeek, and Sniffer were correct in this particular scenario.  That’s not to say that these tools are always correct in every situation - they aren't.  Again, the purpose of this exercise is to verify your data.  I'm not picking on any particular tool.

The large number of false positives in Opnet and Observer was due to their misinterpretation of the TCP close connection sequence.

A graceful TCP close (i.e. not a RST or reset) is a four-packet TCP FIN sequence consisting of a FIN followed by an ACK to close one half of the connection, and then another FIN-ACK pair closes the second half of the connection to bring it to a full close (or in short FIN-ACK, FIN-ACK) as shown in the following figure (from Network General Sniffer).

Textbook TCP Close

In the trace in question, I noticed a different close sequence:  FIN one direction, FIN the other direction, ACK the FIN, ACK the FIN (or in short, FIN-FIN-ACK-ACK). Also unusual was the fact that the FIN bit was set in the ACK from the server.  The following shows the alternate TCP close.

Non-Textbook Close

Observer and Opnet are apparently fooled into thinking that the final TCP ACK packet is a retransmission since the FIN bit is set again (which is irrelevant as the connection is already closed) and the TCP sequence number matches the previous FIN packet.  Sniffer et al. do not report them as retransmissions because they are simply the last of the four packets in a TCP FIN close sequence.
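Here is a Python sketch of how such a false positive could arise. It models the FIN-FIN-ACK-ACK close with made-up sequence numbers and runs two hypothetical detectors over it: a naive one that flags any repeated sequence number, and a lenient one that lets a repeated zero-length FIN through when it carries a new acknowledgment. Neither is the actual logic of any product named here. They are guesses meant to show why the tools can disagree, and sequence number wraparound is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    sender: str        # "C" for client, "S" for server
    seq: int
    ack: int
    length: int = 0    # payload bytes
    fin: bool = False


def naive_retransmissions(segments):
    """Flag any data- or FIN-bearing segment whose (sender, seq) was seen before."""
    seen, flagged = set(), []
    for s in segments:
        if s.length == 0 and not s.fin:
            continue  # a pure ACK consumes no sequence space
        key = (s.sender, s.seq)
        if key in seen:
            flagged.append(s)
        seen.add(key)
    return flagged


def lenient_retransmissions(segments):
    """Same as above, but forgive a repeated empty FIN that acknowledges new data."""
    seen, flagged, last_ack = set(), [], {}
    for s in segments:
        ack_advanced = s.ack > last_ack.get(s.sender, -1)
        last_ack[s.sender] = max(s.ack, last_ack.get(s.sender, -1))
        if s.length == 0 and not s.fin:
            continue
        key = (s.sender, s.seq)
        if key in seen:
            closing_ack = s.fin and s.length == 0 and ack_advanced
            if not closing_ack:
                flagged.append(s)
        seen.add(key)
    return flagged


# The non-textbook close: FIN, FIN, ACK, ACK (with the FIN bit set again).
NON_TEXTBOOK_CLOSE = [
    Segment("C", seq=1000, ack=5000, fin=True),
    Segment("S", seq=5000, ack=1000, fin=True),
    Segment("C", seq=1001, ack=5001),
    Segment("S", seq=5000, ack=1001, fin=True),
]

if __name__ == "__main__":
    print(len(naive_retransmissions(NON_TEXTBOOK_CLOSE)))    # 1
    print(len(lenient_retransmissions(NON_TEXTBOOK_CLOSE)))  # 0
```

Multiply that one spurious hit by hundreds of short-lived sessions and a count like 72 stops looking mysterious.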

This particular application used hundreds of such TCP sessions and subsequent closes to transfer a relatively small amount of data, another problem in itself.

The lesson is that when in doubt, seek a second opinion from another tool or roll up your sleeves and perform manual analysis on a small test section of packet data to confirm your suspicions.

February 01, 2008

Live Webinar and Survey Reveal Wireless Secrets

I recently attended a live Cisco Mobility TV webinar co-sponsored by AirMagnet entitled "Designing and Deploying 802.11n Next-Generation Wireless." Apparently it was a big hit; according to the moderator, a record "thousands" of viewers logged in to watch it. Here’s what I thought were a couple of interesting takeaways.

Drop in Replacement or New Site Survey Required?

A Cisco representative started out by recommending a 1-for-1 access point replacement of legacy APs giving priority to performance over coverage. In other words, swap in a Cisco dual band 1250 AP to handle both legacy 802.11bg devices with the same coverage pattern as before while providing 802.11n access (in the 5 GHz band) for new 802.11n clients.

I thought this was a bit strange since 802.11bg does not like reflections whereas 802.11n using MIMO thrives on them. APs must be relocated accordingly. Later in the Webinar, the AirMagnet guy noted that an active site survey using a laptop with a live 802.11n client adapter is required to figure out the optimal 802.11n AP placement to take advantage of multipath. This seemed to contradict the Cisco 1-for-1 forklift strategy.

AirMagnet also recommended surveying with more than one 802.11n client adapter type if you are supporting more than one brand. I think this is a good idea since, unlike with 802.11bg, Cisco does not provide a client-side adapter for 802.11n.

Upping the Power Requirements

Perhaps the most controversial aspect of an 802.11n wireless LAN upgrade is the additional power requirements for dual band APs. Cisco claims that their enhanced PoE is the only viable single port solution for dual radio operation. They stated that dual PoE is twice the cable cost, 4x the cost to pull the cable, and uses more switch ports. They also noted that competitors supporting dual band operation over standard 802.3af PoE do so at reduced transmission power.

Luckily, CPUs are not the only chips going green. Witness the recent announcement from Siemens that generated a flurry of online articles discussing the 802.11n power controversy. Siemens managed to cut 3W off a full 802.11n MIMO (roughly 600 Mbps using transmission over 3 antennas with 3 streams each) AP running at maximum radio output in both the 2.4 GHz and 5 GHz bands simultaneously and yet operate over standard PoE.

Audience Survey

Having a captive audience of thousands, Cisco conducted three polls during the Webinar. Here are the questions and results.

"How do you expect to benefit from the use of 5 GHz with 11n?"

[Chart: poll results, benefits of 5 GHz with 11n]

No surprises here. Only 5% claim that they will not use 5 GHz for 11n, vindicating the use of 5 GHz in the enterprise.

"What do you see as the biggest inhibitor to 11n adoption?"

[Chart: poll results, inhibitors to 11n adoption]

Does the undetermined business need come from those not deploying wireless at all, or from those whose needs are currently met by 802.11bg? Also note the lack of a warm fuzzy for the current 802.11n draft 2.0 standard.

"How do you plan to power your 802.11n access points?"

[Chart: poll results, powering 802.11n access points]

Not much to add here. Looks like the largely Cisco audience prefers enhanced PoE.

So there you go. Some inside info and survey results from a wildly popular wireless webinar.