May 19, 2008

Use the Shell, Luke

My Bitcricket cohort Robert Bullen couldn't resist the urge to write a column for the Network Guy.  I said "Sure, why not?" and so here it is!  Take it away Robert...

Most of the time, a GUI-based protocol analyzer is the primary weapon in the fight against network issues. It is to the analyst as the Light Saber is to the Jedi. But there is also a lesser known dark side (so called for its typical black background). It goes by many names: the Shell, Terminal, Console, DOS Prompt, or Command Line. This article will introduce you to the power of the dark side, and describe how I was seduced in the heat of one analysis battle in particular.

One of Bitcricket’s recent Enterprise clients was suffering from intermittent printer problems, which were described as everything from printers being slow to printing garbage to widespread outages. Together we identified several potentially interesting printers to be monitored. The client had little in the way of protocol analyzers so using the free Wireshark to capture these printers’ traffic was a no-brainer. Several Wireshark-enabled laptops were attached to switch SPAN ports and configured to capture to disk in ring buffer mode, saving files in 32 MB segments.

What’s that you say? You aren’t a Wireshark user? This article doesn’t apply to you? I disagree. Read on.

The first printer that went down under my watch was one of nine attached to the same switch. This means that nine printer ports were SPANed to a single capture, and therefore nine copies of broadcast and multicast packets were sent to the SPAN port. Due to buffering limitations in this particular switch model, a maximum of five copies made it out of the SPAN port and into the trace files. In addition, the client’s network was suffering from a separate unicast flooding issue, so not only were broadcasts and multicasts duplicated, or more accurately quintupled, but so were some unicasts. Nonetheless, it was four more copies than I needed.

So in my post-capture analysis I had the displeasure of looking at dozens of 32 MB files containing packets from several switch ports, many of them quintupled. For each of these files I needed to filter out unrelated traffic, remove duplicate packets (my own personal Clone Wars), and concatenate the results back into a single file for easier viewing in the analyzer GUI. After going through the process manually in Wireshark, I felt that I should automate it. Fortunately, Wireshark’s command line tools are plentiful and capable. Four of them proved useful in this scenario:

  • dumpcap – This utility does one thing and does it well; it captures to disk. While I didn’t use dumpcap in my post-capture analysis, it was used to initiate the round-the-clock capturing and saving of printer traffic.
  • editcap – Manipulates capture files, mostly by removing packets but has other uses as well, such as converting between trace file formats. I used editcap’s remove duplicate packets option to reduce quintuplets to a single packet. It also came in handy for slicing packets to a shorter length to reduce file sizes when transferring files via email.
  • tshark – Stands for Terminal-based Wireshark. It is the spunky R2D2 of the command line tools and does everything short of projecting a holographic distress call from the Princess. The feature I found most useful in this case was the ability to read an input trace file, apply a Wireshark display filter, and write an output file containing only the packets that passed the filter.
  • mergecap – Merges or appends multiple input trace files into a single output trace file. After reducing the original 32 MB-segmented trace files using editcap and tshark, I used mergecap to recombine the results into a single file.
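To make the duplicate-removal step concrete, here is a minimal sketch of the idea in Python: compare each packet against a short window of recently seen packets and drop exact repeats. This is an illustration of the concept only, not editcap's actual implementation; the `remove_duplicates` name, the SHA-256 hashing, and the window of five (chosen to match the quintupled packets) are all assumptions.

```python
import hashlib
from collections import deque


def remove_duplicates(packets, window=5):
    """Drop any packet whose bytes match one of the last `window` packets seen."""
    recent = deque(maxlen=window)  # digests of the most recently seen packets
    unique = []
    for raw in packets:
        digest = hashlib.sha256(raw).digest()
        if digest not in recent:
            unique.append(raw)     # first copy within the window: keep it
        recent.append(digest)      # remember it either way
    return unique
```

A small window matters here: the same bytes appearing much later in a trace may be a genuine retransmission rather than a SPAN artifact, so only near neighbors are treated as clones.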

To learn more about these command line tools or for some usage examples, check out these external resources:

  • Laura Chappell, the Obi Wan Kenobi of Wireshark, recently started the BitSpitters series, which are short, instructional videos posted on YouTube. (Think of scaled down versions of her Animated Articles.) “Edit Trace File Timestamps” is one that demonstrates the use of editcap to adjust timestamps between a client-side capture and a server-side capture.
  • Sake Blok is the Yoda of the Wireshark command line, although his name already sounds like that of a Star Wars character. He led a presentation at Sharkfest 2008 titled “Advanced Scripting and Command Line Usage with tshark and Related Utilities”. In his slide deck, he discusses a similar client/server synchronization situation and provides a shell script that automatically synchronizes the timestamps between two trace files based on PING packets contained therein.

I assembled these utilities in a script to process dozens of the aforementioned 32 MB files. It saved me a great deal of time, resulted in a single 3 MB file that was a breeze to view in the analyzer, and allowed me to more easily uncover some interesting clues to the problem. I was happy enough to raise my arms over my head and utter Chewbacca-like sounds of joy. For those who are curious about the script, or would like to use it as a jumping-off point for their own dark side analysis, look at it here.
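For readers who want a feel for the filter, de-duplicate, merge sequence, the following Python sketch builds the three command lines for each segment. It is not the script linked above; the function names, the temporary file suffixes, the output name, and the `ip.addr` filter are assumptions for illustration. The flags are the ones I believe current releases accept (`tshark -r/-Y/-w`, `editcap -d`, `mergecap -w`); tshark releases from the era of this article used `-R` rather than `-Y` for the filter.

```python
def filter_cmd(src, dst, display_filter):
    # tshark: read a trace (-r), keep packets passing a display filter (-Y), write them (-w)
    return ["tshark", "-r", src, "-Y", display_filter, "-w", dst]


def dedupe_cmd(src, dst):
    # editcap -d removes duplicate packets found close together in the trace
    return ["editcap", "-d", src, dst]


def merge_cmd(sources, dst):
    # mergecap -w writes all input traces into a single output file
    return ["mergecap", "-w", dst, *sources]


def build_pipeline(segments, printer_ip, merged="printer.pcap"):
    """Return the commands to filter and de-duplicate each segment, then merge them."""
    commands, cleaned = [], []
    for seg in segments:
        filtered = seg + ".filtered"   # hypothetical temporary file names
        deduped = seg + ".deduped"
        commands.append(filter_cmd(seg, filtered, f"ip.addr == {printer_ip}"))
        commands.append(dedupe_cmd(filtered, deduped))
        cleaned.append(deduped)
    commands.append(merge_cmd(cleaned, merged))
    return commands
```

Each returned list can be handed to `subprocess.run(cmd, check=True)` on a machine with the Wireshark command line tools installed. Building the commands separately from running them makes it easy to print and review them first.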

What does this have to do with non-Wireshark users? Well, much like C-3PO, the, ahem, protocol droid who boasts his fluency in over 6 million forms of communication, these utilities support the same extensive set of input file formats and output file formats as Wireshark. Most likely they can be used with your analyzer’s native file format too, or at least read and write a common denominator file format.

The next time you find yourself wading through a heap of data, remember the power of the dark side. Like a Light Saber, it may be a life saver. Or at least a time saver.

May the shell be with you.

March 11, 2008

A Tale of Five Analyzers

How do packet analyzers stack up in detecting and reporting the simplest and most fundamental indication of an anomaly, the venerable TCP retransmission?  I recently looked at five tools ranging from free to $80,000.  I did this because I saw something suspicious as reported by the big gun.

In my book I recommend that anytime you see suspect reporting or diagnostics, you verify them at least once the hard way – by hand – so that from then on you know whether they are accurate under those circumstances.

In this particular case, the "AppDoctor" of the high-end tool was informing me that “There are many packet retransmissions.  The network may be heavily congested, or there may be an error-prone link.”  The value it gave me was over 5% which is certainly of concern.  That’s 50+ packets out of every 1,000.

I didn't suspect at the time that the network path was congested and didn't want to chase down duplex mismatches and the like, so I wanted a second opinion and ran the trace through the free tool.   It came up with all of 3 retransmissions per 1000 packets or 1/3 of a percent.
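The gap between the two opinions is easier to appreciate when both are expressed in the same unit. A trivial sketch, with a function name of my own choosing:

```python
def retransmission_rate(retransmissions, total_packets):
    """Return retransmissions as a percentage of all packets."""
    if total_packets <= 0:
        raise ValueError("total_packets must be positive")
    return 100.0 * retransmissions / total_packets
```

Fed the figures above, 50 per 1,000 works out to 5% while 3 per 1,000 is 0.3%, more than an order of magnitude apart.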

Why the large difference of opinion?  Shouldn’t packets be cut and dried, i.e. factual?  Before answering the question, I’d like to point out some of the various idiosyncrasies in how a number of analyzers report TCP retransmissions.

As you may have guessed, the free tool is Wireshark.  The $80,000 tool, Opnet’s ACE.  I also ran the trace through the latest Network Instruments Observer, WildPackets OmniPeek, and Network General Sniffer analyzers and focused on one section where the Opnet ACE diagnosed 72 retransmissions out of some 1,100 packets.

Observer reported the same high retransmission count.  It tried to be extra helpful in noting that they were also of the “too fast retransmission” variety (at or below 180 ms by default) and that they were “excessive”  (2% or more of the total packets by default for the critical level).  That would have been a great diagnosis if it had been correct.  More on this later.

The Sniffer values were a little strange, depending on if you are looking at the number of Sniffer symptom objects or the packet summary decodes.   The summary decodes contain three “Expert: Retransmission” notifications yet the tally in the expert summary lists only two possibly due to a grouping by TCP flow/conversation (i.e. two of the three retransmissions were in the same TCP flow.)

So the individual packet retransmission counts for each analyzer were:

  • Opnet: 72
  • Wireshark: 3
  • Observer: 72
  • OmniPeek: 3
  • Sniffer: 3

As I used to say in my classes, shall we just average the results and call it a day?  Not!

The right answer, verified manually, is three, which means Wireshark, OmniPeek, and Sniffer were correct in this particular scenario.  That’s not to say that these tools are always correct in every situation - they aren't.  Again, the purpose of this exercise is to verify your data.  I'm not picking on any particular tool.

The large number of false positives in Opnet and Observer was due to their misinterpretation of the TCP close connection sequence.

A graceful TCP close (i.e. not a RST or reset) is a four-packet TCP FIN sequence consisting of a FIN followed by an ACK to close one half of the connection, and then another FIN-ACK pair closes the second half of the connection to bring it to a full close (or in short FIN-ACK, FIN-ACK) as shown in the following figure (from Network General Sniffer).

Textbook TCP Close

In the trace in question, I noticed a different close sequence:  FIN one direction, FIN the other direction, ACK the FIN, ACK the FIN (or in short, FIN-FIN-ACK-ACK). Also unusual was the fact that the FIN bit was set in the ACK from the server.  The following shows the alternate TCP close.

Non-Textbook Close

Observer and Opnet apparently are fooled into thinking that the final TCP ACK packet is a retransmission since the FIN bit is set again (which is irrelevant as the connection is already closed) and the TCP sequence number matches the previous FIN packet.  Sniffer et al. do not report them as retransmissions because they are simply the last of the four packets in a TCP FIN close sequence.
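The difference can be reproduced with two toy detectors. This is purely a sketch of one plausible explanation, not how any of these products actually work: the `Segment` type, both function names, and the sequence numbers are invented for illustration. The first detector flags any repeated sequence number on a segment carrying data or a FIN; the second only counts segments that carry payload.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    src: str           # sender label, e.g. "client" or "server"
    seq: int           # TCP sequence number
    payload_len: int   # bytes of TCP payload
    fin: bool = False  # FIN flag


def count_naive(segments):
    """Count repeated (sender, seq) pairs on segments carrying data or a FIN."""
    seen, hits = set(), 0
    for s in segments:
        if s.payload_len == 0 and not s.fin:
            continue  # plain ACKs legitimately repeat a sequence number
        key = (s.src, s.seq)
        hits += key in seen
        seen.add(key)
    return hits


def count_payload_only(segments):
    """Count repeated (sender, seq) pairs only on segments that carry payload."""
    seen, hits = set(), 0
    for s in segments:
        if s.payload_len == 0:
            continue
        key = (s.src, s.seq)
        hits += key in seen
        seen.add(key)
    return hits


# The FIN-FIN-ACK-ACK close described above (sequence numbers are made up).
close = [
    Segment("client", 500, 0, fin=True),  # FIN one direction
    Segment("server", 900, 0, fin=True),  # FIN the other direction
    Segment("client", 501, 0),            # ACK the FIN
    Segment("server", 900, 0, fin=True),  # final ACK with the FIN bit set again
]
```

On this close sequence the first detector reports one retransmission and the second reports none, while both agree on a genuinely repeated data segment. The payload-only rule has its own blind spot, since it would miss a FIN that really was retransmitted, which is one more reason to verify by hand.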

This particular application used hundreds of such TCP sessions and subsequent closes to transfer a relatively small amount of data, another problem in itself.

The lesson is that when in doubt, seek a second opinion from another tool or roll up your sleeves and perform manual analysis on a small test section of packet data to confirm your suspicions.

January 16, 2008

Optimize Your Web Site with YSlow

Some of the best free tools in life are the niche ones such as this nifty little utility available from the Yahoo! Developer Network called YSlow (a nice play on Why Slow). You'll need to install Firefox along with the Firebug web development tool to run it, but it's definitely worth it. I'll first describe a controversial aspect of the tool, then get on to the interesting stuff.

The Controversial Part

YSlow will analyze a web page and assign a grade of A through F based on 14 criteria. For a complete list of the criteria, check out "Best Practices for Speeding up Your Web Site." Some of the grading criteria are straightforward, like the number of HTTP requests or turns (the fewer the better). Others are more complex and designed for larger enterprises, such as having a Content Delivery Network (CDN) like the one Akamai provides for the likes of Apple, Dell, and IBM. Thus if you have a small to medium sized business, you'll get dinged for not having a CDN.

You'll also get nicked for not using something known as "far future Expires header", GZIP components (even for small objects), and ETags. ETags allow web servers and browsers to figure out if an object in the user's cache matches the one on the server. This is somewhat controversial as even the YSlow developers admit that they are "unique to a specific server hosting a site." Please check out the aforementioned "Best Practices" for more details.
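The ETag exchange itself is simple. In the sketch below the server returns a tag with the object, the browser sends that tag back in an If-None-Match header on its next request, and the server answers 304 Not Modified if the tag still matches. The function names are hypothetical, and deriving the tag from a hash of the content is my assumption; it is one way to make every server in a farm produce the same tag, which addresses the "unique to a specific server" complaint.

```python
import hashlib


def make_etag(body: bytes) -> str:
    # Assumption: derive the tag from the content so all servers in a farm agree.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match=None):
    """Return (status, body) for a GET, honoring an If-None-Match request header."""
    etag = make_etag(body)
    if if_none_match == etag:
        return 304, b""  # Not Modified: the browser reuses its cached copy
    return 200, body     # first visit, or the object changed on the server
```

Note that even a 304 costs one HTTP turn, so ETags save bytes but not round trips.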

Naturally, the Yahoo! Web site gets an "A" grade which is interesting, considering there's far more junk on the page than a lightweight page like Google. Yahoo! used to be lightweight, but thanks to Expires headers, cache priming (i.e. the biggest hit is when you visit the site for the first time), they manage a good score and load very quickly on subsequent visits. One area that could be improved as YSlow points out is to reduce the number of HTTP requests.

Just for fun, I decided to see what the YSlow score was from the very vendors that sell stuff to help us to troubleshoot our network and application performance. It surprised me that the web home pages from companies like Fluke Networks, OPNET, Network General, Network Instruments, NetQoS, NetScout, Niksun, and WildPackets all received a grade of D or F.

Embarrassing? Perhaps. Even without a CDN, one can get much better than a failing grade by employing some of the other techniques. Keeping your pages fairly lightweight and minimizing the number of HTTP requests can go a long ways towards improving the score. Read on.

The Good Stuff

YSlow will show the total bytes and total time to download a page. For instance, the OPNET website weighed in at a whopping 501k bytes and most surprising, 4.0 seconds to download even over a fast 8 Mbps connection. These timings were with a primed cache (i.e. I merely did a page refresh after already visiting the site). Why so long? YSlow to the rescue.

YSlow will show you the timing for each object downloaded, for each TCP session that the browser supports (both IE and Firefox support two TCP sessions to a Web server by default). The following screenshot of the YSlow "Net" tab gives you an idea of what this looks like. (In the interest of size and readability, I'm only showing you a portion of the web page and timing analysis.)

YSlow Component Load Time Analysis

You can expand the information for each component, including the source code from where it came and even see what the object (such as a gif image) looks like in isolation in another browser window. Each component (HTML, CSS, images, Flash, etc.) can be inspected in detail.

A closer look at the timing analysis revealed that the web page has a large number of HTTP requests including nine external JavaScript files. In all, there were 40 HTTP requests generating 508k bytes worth of traffic that make up this web page, yet roughly 501k bytes are downloaded every time even with a primed cache. Why?

YSlow revealed that only 7k bytes worth of objects are loaded from the user's browser cache. The biggest culprit was the flash animation, weighing in at 477k bytes and downloaded each and every time. This, and the large number of HTTP turns (37 according to a packet capture) coupled with round trip network delay, leads to the longish four second load time for the page even for just a refresh.

How about after clearing the browser cache? Does it get worse? Surprise! There was little real difference. I saw one more turn count and a bit more data. The HTTP turn count is there regardless to check the browser cache against the server and/or download the objects, most of which were tiny gif images or small JavaScript files so it didn't make a whole lot of difference in this case.

Furthermore, the round trip delay (56 ms in this example) starts to become a factor. Fewer turns would help. It also wouldn't hurt to optimize the flash. I'll leave the rest of the web page optimization as an exercise for the student. As a hint, not all web pages that use animation are penalized. The WildPackets home page for instance, went from 471k to 47k when cached (the next figure shows how YSlow gives you these stats). The animation stays cached. The number of objects and subsequent HTTP requests however, are quite high at 62. That's a lot of turns and should be optimized.
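A back-of-the-envelope estimate shows why turns matter. The sketch below multiplies turns by round trip delay and spreads them across the browser's parallel connections; the function name is mine, and the model deliberately ignores transfer time, server think time, and connection setup, so treat the result as a floor rather than a prediction.

```python
import math


def turn_delay_seconds(turns, rtt_ms, parallel_connections=1):
    """Lower bound on the time spent waiting on round trips alone."""
    rounds = math.ceil(turns / parallel_connections)  # turns each connection handles in series
    return rounds * rtt_ms / 1000.0
```

For the OPNET page, 37 turns at 56 ms is roughly 2.1 seconds if fully serialized, and still about 1.1 seconds when split evenly across two connections. Under this simple model the turn count alone accounts for a sizable share of the four second load time.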

YSlow Cache Analysis: Note the drop in bandwidth, but high turn count.

Conclusion

Let's not get complacent just because we can throw more bandwidth at it; i.e. the Internet pipes are getting fatter all the time. Web sites can and should be optimized. If everyone did so we could save, oh, perhaps a few billion bytes worldwide every minute or so. Even corporate intranets need to consider web tuning to best serve up their users.

December 28, 2007

Got Analysis Tools? Put a Shark in Your Tank

As the year winds down this snowy December day at a balmy 18 degrees (I’m writing this from an undisclosed cabin “up north” in Minnesota), my mind is far from the cardinals, nuthatches, and woodpeckers just outside my window.  In fact, I’m thinking about sharks.

For me, no one analysis tool fits all.  I typically use a wide set of tools: OS command line utilities, open source projects, commercial data mining and analysis products, resource monitoring utilities, and applications like Excel.  Being resourceful and combining tools is the key to quick and effective enterprise infrastructure analysis. 

I prefer the term infrastructure analysis because it avoids saying, “Is it the network or server?” When it comes to troubleshooting a problem, I look at everything starting from the way devices talk to each other (device and even user behavioral analysis based on protocol analysis) and digging in or drilling down from there.  Sometimes I get lucky and find a simple physical connection duplex mismatch causing CRC errors leading to TCP retransmissions. Other times it gets far more complex, like when I get into n-tier analysis with a multitude of devices, protocols, server types, and applications. But I digress.

Regarding the shark, as you probably guessed I’m referring to Wireshark (formerly Ethereal), updated this month to version 0.99.7.  I would not recommend depending solely on Wireshark unless you have time to burn doing stuff manually that could be done much faster using other tools. For that matter, some data you just can’t get with any protocol analyzer.

But the shark is in my tank for good reason.  I suppose one reason is that it’s free.  But time is money and free tools can actually be more expensive compared to shelling out a few (thousand) dollars for commercial offerings if the commercial products save you valuable time. But the shark is quite good for certain tasks.

One of the more obvious reasons to use Wireshark is for its breadth and depth of decodes.  Believe it or not, some analyzers actually have deeper decodes for a few protocols, but none have the breadth.  For example, in a recent analysis of a large enterprise, I needed decodes for Distributed Relational Database Architecture (DRDA), a standard driven by the Open Group and IBM for accessing distributed data based on SQL.  While the SQL statements could be read as plain text in the TCP payload, it helped to know in what context these commands were being used.  A couple of other unnamed commercial analyzers that I had access to did not decode it.  The shark to the rescue.

A not-so-obvious use for Wireshark is packet conversion—read in one format and write to another.  Observer native format to OmniPeek native format?  No problem.  Most analyzers support the simple Sniffer .enc format as well as their proprietary format.  The problem is that some do not do a good job of converting to .enc, resulting in packets that are flagged as “sliced” but really aren’t or have the occasional timestamp conversion problem.  Wireshark seems to do a pretty decent conversion to .enc, at least a lot better than one unnamed commercial offering.

The shark is also portable.  I’m not so much referring to its multi-OS portability as I am to its USB flash drive portability.  I can run a version of Wireshark right off a U3-compatible USB flash drive without having to install it on the host system.  A shark in your pocket.  How cool is that?

There are numerous other tricks for the shark that can complement your analysis including some cool command line stuff.  You can learn from the experts that live and breathe the shark at the forthcoming Sharkfest that promises to be more exciting than, well, the original Jaws classic.  The event is March 31st – April 2nd in Los Altos Hills, CA (near Sunnyvale). 

I can envision the feeding frenzy if the highly energetic Laura Chappell shows up dressed in a shark outfit with a big TCP FIN on her back.  Ouch.

October 11, 2007

TCP Selective ACK (SACK) Packet Recovery Analysis: Part 2 - The Analyzer

My previous blog looked at some of the operational details of TCP SACK, both from a performance and analysis perspective.  I noted how packet retransmissions are reduced when the client performs selective acknowledgements but the server performance may still be impacted, lowering the overall throughput.  In this second of two parts I’ll comment on how things look from a protocol analyzer perspective.

One of the first things that should have been obvious from the discussion thus far is that analyzers should also be looking at SACK information from the client to determine whether or not retransmissions exist.  Unfortunately, most analyzers only look for retransmissions the old-fashioned way:  duplicate TCP sequence numbers.  If the packet was dropped prior to your analyzer insertion point into the network, retransmissions go undetected.

One must also be careful when seeing a TCP packet with a lower sequence number than the previous packet.  It is not necessarily an out of order packet.  This can easily be determined by looking for prior SACK information, which virtually all clients and servers in use today support by default.
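The check described above can be sketched in a few lines of Python. The function name and return strings are my own, and the logic is deliberately simple: it ignores sequence number wraparound and cannot tell a lost packet from one that was merely delayed past the SACK.

```python
def classify_lower_seq(seq, sack_blocks_seen):
    """Classify a data segment whose sequence number is below the highest already seen.

    sack_blocks_seen: (left_edge, right_edge) blocks the receiver reported earlier.
    """
    for left, _right in sack_blocks_seen:
        if seq < left:
            # The receiver already reported data beyond this point, so it had a hole here.
            return "retransmission (SACK recovery)"
    return "out of order"
```

For example, a segment with sequence number 2000 that arrives after the client has SACKed the block 3000 to 4000 is filling a hole, not arriving out of order.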

Finally, it is of interest to see how efficiently a server is able to process and send the missing segment or segments.  The case I referred to in the previous blog showed server delay in segment recovery.

Let’s look at three analyzers in order by name: Observer (Network Instruments), OmniPeek (WildPackets), and Wireshark (the Ethereal replacement spearheaded by CACE).

Note:  I tested the latest shipping version of each analyzer as of this blog posting. The three I picked are all representative of today’s protocol analysis tools and all have some type of expert system.  I ran the same exact trace file (TCP packet loss as discussed in Part 1) through all three analyzers, using the .enc format.  The trace was originally captured from the client’s Ethernet, with the remote server located on the other side of a WAN.

Observer (12.1)

Observer reported no TCP events in the expert analysis summary. Yet in looking at the TCP retrans column at the far right, it did show 32 retransmissions.

Observersack_2

Bottom line:  With Observer, be sure to examine the retransmission column in the connection details for problematic application/TCP performance.  Be careful with the Connection Dynamics feature which identifies the packets as out of order, not as retransmissions like the expert.

OmniPeek (5.0)

OmniPeek has a unique expert event called “TCP Slow Segment Recovery” as you can see in the screen shot below.  Thus, OmniPeek does look at both the SACK information from the client followed by the respective recovered segments sent by the server.  The default delay setting is 250 milliseconds.  By using a little trick and lowering this to 0 milliseconds, we can catch all recovered segments.  Per the screenshot, the number of segments recovered is 30.

Omnipeeksack

Bottom line:  Slow segment recovery may or may not be a problem, but is a strong hint.  You need to analyze further to see if it goes with the flow or disrupts the flow.  One hint is to look at the max, min, and average throughput reported by the expert.

Wireshark (0.99.6)

As shown in the next screen shot, the Wireshark expert info correctly ‘Notes' that there are TCP Retransmissions (32 total) and 'Warns' about 'Previous Segment Lost' (30 total).  Thus, it took 32 retransmissions to recover 30 segments.

Wiresharksack_2

Bottom line:  Look prior to the retransmissions to see if SACK is being used.  Also, the duplicate ACKs shown in the screen shot are not really duplicates per se.  As packets continue to stream in following a missing segment, the client continues to SACK each segment, incrementing the received block range.  Once the missing segment or segments are received, the client reverts to normal periodic ACKs.

Conclusion

Hopefully I’ve whetted your appetite for probing further.  TCP Selective ACKs work well in reducing TCP retransmissions but watch out for how efficiently servers handle it.  This is but one example of how we can analyze better, smarter, deeper.

October 04, 2007

TCP Selective ACK (SACK) Packet Recovery Analysis: Part 1 - The Problem

One of the most common symptoms that we look for as evidence of dropped packets in a network is TCP retransmissions.  Virtually every protocol analyzer today will alert you when it detects a retransmission.

The recovery mechanism for dropped or delayed TCP packets has changed over the years.  The question is: Have analysis tools kept up?

A receiver’s TCP stack can address lost (or delayed) packets a number of different ways including:

  • Acknowledging up to and including the last TCP segment that has been contiguously received (i.e. there are no gaps in the received byte stream);
  • Sending “fast” duplicate ACKs immediately upon sensing a gap; or,
  • Using the selective acknowledgement (SACK) feature.

Acknowledging only the last “good” segment received (the first two aforementioned techniques) will cause the sender to back up and resend from that point forward.  Typically more than one packet has already been sent due to the TCP windowing mechanism, which allows multiple packets to be sent before an ACK is required.  This means that not only is the missing segment resent, but all subsequent segments as well, even if they were already received by their destination. The number of packets outstanding worsens as networks get faster and window sizes get larger (i.e. window scaling beyond 65K bytes).
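The cost of backing up can be put in numbers. The sketch below counts how many segments are resent after a single loss under the behavior described above, versus resending only the missing one. The function and parameter names are mine, and the model covers only the simple single-loss case.

```python
def segments_resent(in_flight, lost_index, selective):
    """Count segments the sender resends when one in-flight segment is lost.

    in_flight: segments sent before the loss was noticed
    lost_index: zero-based position of the lost segment within that burst
    selective: True if only the missing segment is resent
    """
    if not 0 <= lost_index < in_flight:
        raise ValueError("lost_index must fall inside the burst")
    if selective:
        return 1                   # only the hole is refilled
    return in_flight - lost_index  # the lost segment plus everything sent after it
```

With ten segments in flight and the third one lost, backing up resends eight segments where one would do. The waste grows with the window, which is why larger windows make the problem worse.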

There is a better way to recover lost TCP segments.  A client can use SACKs to inform the sender of all segments that have arrived via a sequence number range or block.  Up to four blocks can be acknowledged in one SACK packet.  Note that a receiver can only use SACKs if the sender indicates that it is a supported option.  You can check for this in the TCP SYN and TCP SYN-ACK packets with your analyzer, where one side will indicate to the other in the TCP options field that SACK is permitted.
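If you ever need to check this by hand, the relevant TCP options are easy to pick out of the raw option bytes. Per RFC 2018, SACK-Permitted is option kind 4 with length 2, and a SACK option is kind 5 carrying one or more pairs of 32-bit left and right edges. The following is a minimal parser sketch; the function name is my own and it silently stops on malformed input rather than reporting it.

```python
import struct


def parse_sack_options(options: bytes):
    """Return (sack_permitted, blocks) parsed from raw TCP option bytes."""
    sack_permitted, blocks = False, []
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:   # End of Option List
            break
        if kind == 1:   # No-Operation padding, a single byte
            i += 1
            continue
        if i + 1 >= len(options):
            break       # truncated option
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break       # malformed length
        if kind == 4 and length == 2:
            sack_permitted = True
        elif kind == 5 and (length - 2) % 8 == 0:
            # Each block is a left edge and a right edge, 4 bytes each, network order.
            for off in range(i + 2, i + length, 8):
                blocks.append(struct.unpack("!II", options[off:off + 8]))
        i += length
    return sack_permitted, blocks
```

Run against the options of a SYN or SYN-ACK, the first value tells you whether that side permits SACK; run against a later ACK, the second value lists the byte ranges the receiver already holds.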

For more operational details and some complex scenarios, please refer to RFC 2018, “TCP Selective Acknowledgement Options”.

How well does SACK work?

Analysis of SACK in action proves that it definitely does the job in cutting down on the number of packets that are retransmitted.  However, I’ve noticed some caveats in SACK behavior as well as how certain network analysis tools (i.e. expert systems) report this behavior.

In theory, the SACK mechanism should also cut down on delay due to dropped packets.  The retransmitted packets should be streamed right into the regular flow from the sender without hesitation.

In practice, I’ve noticed that this is not always the case. For example, I examined a remote file transfer between a client and server over a WAN experiencing some packet loss. I noticed that whenever the server resent a packet due to a SACK from the client, the sender would often pause for up to several hundred milliseconds between the last good packet sent and the recovered packet (far longer than the round trip delay in the WAN). This was followed by a similar delay before the stream got going again.

How can you analyze these SACK recovery delays?  A good place to start is to employ a display filter to find TCP packets with SACK information in the header.  A quick and dirty way is to check for TCP headers longer than 20 bytes. As mentioned previously, SYN packets will advertise the sender’s capabilities and therefore will have longer headers. Thus, SYN packets can be excluded by the filter.
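In Wireshark, I believe the quick and dirty test translates to a display filter along the lines of `tcp.hdr_len > 20 && tcp.flags.syn == 0`. The same test applied to raw TCP header bytes looks like the sketch below; the function name is hypothetical. Keep in mind that any TCP option lengthens the header, so segments carrying other options, such as timestamps, will match as well.

```python
def is_sack_candidate(tcp_header: bytes) -> bool:
    """True for a non-SYN segment whose TCP header is longer than 20 bytes."""
    if len(tcp_header) < 20:
        raise ValueError("need at least the 20-byte fixed TCP header")
    header_len = (tcp_header[12] >> 4) * 4  # data offset is counted in 32-bit words
    syn = bool(tcp_header[13] & 0x02)       # SYN is bit 1 of the flags byte
    return header_len > 20 and not syn
```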

The reason I suggest using a display filter and not capture filter is because this is a situation where you’ll want to capture all TCP packets between a client and server and then apply some post-capture analysis.  If you only capture packets with SACK information, you’ll see which packets needed to be retransmitted but you won’t be able to deduce how long the sender took to recover and actually resend the packets. You may also wish to trigger a capture on a SACK packet-- chances are if you see one, you'll see more.

If the analyzer is capturing close to the source, you’ll probably also notice TCP retransmissions from your expert system or duplicate TCP sequence numbers if you like doing things by hand (be sure to check that the IP ID is not the same in the repeated packet, which can happen when analyzing VLANs off a SPAN port). One pitfall is that you will not see TCP retransmissions flagged as such if the packets were dropped prior to the segment from which you are capturing.

What then?  Focus on packets that contain SACK information.  You will typically see a few SACKs with a widening range of bytes received until the missing segment or segments are received.  When the SACKs stop, you know that the segment(s) in question have been sent.  Checking the sequence number of a packet following a “SACK burst” will confirm this.  The sequence number will be lower than the previous transmission.
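Once you know what to look for, measuring the recovery delay is mechanical. The sketch below walks the sender's data packets in capture order, picks out any whose sequence number is lower than the highest seen so far, and reports the time gap since the sender's previous data packet. The function name and the input shape are assumptions, and sequence number wraparound is ignored.

```python
def recovery_delays(data_packets):
    """Return (seq, delay) for each packet whose seq is below the highest seen so far.

    data_packets: (timestamp_seconds, seq) tuples from the sender, in capture order.
    delay is the time since the sender's previous data packet.
    """
    results = []
    highest = None
    prev_time = None
    for ts, seq in data_packets:
        if highest is not None and seq < highest:
            results.append((seq, ts - prev_time))  # a recovered segment
        if highest is None or seq > highest:
            highest = seq
        prev_time = ts
    return results
```

A delay far larger than the round trip time is the signature of the slow recovery discussed earlier.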

I will cover how a number of protocol analyzers fared in detecting this problem in my next blog.

September 11, 2007

Pocket Hub to the Rescue

A recent quote I provided for a paper by industry veteran Tim O'Neil published on lovemytool.com reminded me that sometimes the hardest problems can be figured out with the simplest tools.  Take a recent experience I had resolving a performance issue with a particular server in a server farm.  It’s somewhat ironic that I actually solved the problem by inserting a hub as an analyzer tap (SPAN was not an option in this particular situation) in an otherwise 100 Mbps full duplex connection.

The server did about 4 Mbps connected directly to a switch port and nearly 80 Mbps when we inserted the hub.  The problem was recognized immediately - an auto negotiate mismatch between the switch and server - and we didn’t even need to look at the packets.

It's also a good idea to keep a constant check on switch port collision and CRC statistics via your SNMP monitoring tool.  High values are one indication of a duplex mismatch.  Yes, even today, we still see such problems from time to time.  Mike Pennacchi over at Protocol Specialists LLC keeps reminding me of such.

BTW, real hubs are hard to find these days.  Everything has gone to a "switch on a chip" and that's what you find inside today's purported hubs.  My recommendation is to grab a few pocket-sized hubs for your team before it's too late.  The NetGear Dual-Speed (DS) 10/100 Mbps hub models are my favorite and you can still find some on eBay.

May 11, 2007

Practical Forensics - Part 2 of 2

This second of two parts on practical forensics illustrates a means for detecting a type of "technical" activity as mentioned in the CERT study: “Organizations failed to detect or ignored policy rules such as forbidden downloads”.  In the previous blog, I referred to this as red flag #5.  For illustrative purposes, we are going to check for a user searching for and downloading password cracking tools or illegal software key generator utilities.  If detected, there's a high probability that the user will actually use such a tool.

One of the things I love about advanced network analysis and forensics tools is the ability to easily apply custom filters and triggers to find stuff inside packets.  Triggers are a special case of filters that allow us to “trigger” the start of a packet capture and/or send an alert if we are capturing in real time.

In this case, we are going to take advantage of a special capability of the analyzer to search for an arbitrary word or pattern anywhere in a packet.  This “sliding pattern match” does not require any prior knowledge of where the pattern might be.  To optimize performance a bit, we are going to start at offset 54 inside the packet, which tells the analyzer to start at the beginning of the TCP payload for any upper layer protocol.  No sense in wasting CPU cycles looking for application data in the data link, IP, and TCP headers.

The screenshot you see here (from WildPackets OmniPeek) shows such a filter, using a combination of AND along with OR conditions.  We start out by looking for the words ‘password’ or ‘key’.  If ‘password’ is found, the packet must also contain (AND) one of the words or patterns ‘crack’ OR ‘krack’ OR ‘recover’.  Thus phrases like password crack, password krack, password recover, password recovery (or the reverse order) will be found.  Likewise for ‘key serial’ <number>, ‘key generator’, ‘keygen’, and so on.  I’ve also told the analyzer to ignore case.

Filter_2
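If you want to play with the same AND/OR logic outside the analyzer, here is a rough Python sketch.  It is not the OmniPeek filter itself: the `RULES` table and the `matches_filter` name are hypothetical, and the companion patterns for ‘key’ are my guess at what "and so on" covers, so adjust them to taste.

```python
PAYLOAD_OFFSET = 54  # assumes Ethernet II + IPv4 + TCP headers with no options

# Each anchor word must appear together with (AND) at least one (OR) of its companions.
# The 'key' companions are an assumption; 'gen' is meant to cover both
# 'key generator' and 'keygen'.
RULES = {
    b"password": (b"crack", b"krack", b"recover"),
    b"key": (b"serial", b"gen"),
}


def matches_filter(packet: bytes, start: int = PAYLOAD_OFFSET) -> bool:
    """Return True if the payload satisfies any anchor-AND-companion rule."""
    # Lower-casing the payload once gives a case-insensitive match.
    payload = packet[start:].lower()
    return any(
        anchor in payload and any(word in payload for word in companions)
        for anchor, companions in RULES.items()
    )


headers = bytes(PAYLOAD_OFFSET)  # placeholder for the 54 header bytes
hit = matches_filter(headers + b"Free Password Recovery tool")
miss = matches_filter(headers + b"reset your password")
```

Because the words can appear anywhere in the payload and in either order, a check this loose will also flag innocent traffic now and then.  Treat a hit as a reason to look closer, not as a verdict.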

Test it by searching on-line with your favorite search engine (which will trigger a hit right there) or going to any website containing such tools.  The filter and/or trigger will hit immediately, even if the tool is named something else, because purveyors of such tools love to fill up their underlying web site code with key words to gain search engine positioning.

This is the tip of the iceberg when it comes to real time or forensics data mining.  Such tools can be invaluable in assisting you to effectively combat potential sabotage and espionage in your network.  For more ideas, check out the forensics filters available for download at the WildPackets Developer Network (Registration is free but a login is required).

May 09, 2007

Practical Forensics - Part 1 of 2

With the billions of dollars spent protecting our corporate networks from the outside world, when will we begin to pay serious attention to what happens inside our networks?  To continue the theme of Network Forensics from my previous entry, I'll take a closer look at the security aspects of your network from the inside.  We'll take a brief look at a Carnegie Mellon study, followed by a practical tip you can apply using your favorite forensics and analysis tool.

The world famous Carnegie Mellon Computer Emergency Response Team (CERT) published an interesting paper entitled “Comparing Insider IT Sabotage and Espionage:  A Model-Based Analysis.”  This 108-page (gulp) study cannot be summarized in its entirety here, but if network security is of interest to you, I highly encourage downloading it.  The best part is that it was funded by the DoD and as such, is freely available.  Click here for a direct link to the paper.

The study examined “the psychological, technical, organizational, and contextual factors” which contributed to espionage and sabotage against IT.  Their research led to the following red flags:

  1. Saboteurs had personal problems outside the workplace.
  2. Stressful events such as internal reorganizations increase the likelihood of malicious acts.
  3. Poor work ethics including performance or tardiness were often observed before and during sabotage.
  4. Insiders had a tendency to “set-things up” such as creating back door accounts.
  5. Organizations failed to detect or ignored policy rules such as forbidden downloads.
  6. A lack of access control for both physical locations and on-line computing resources.

The report goes on to provide recommendations for further research to mitigate the risk.  One of the repeated themes is to acquire “improved data” on things like interrelationships, stressful events, policy enforcement vs. technical rule violations, research tools for auditing and monitoring, etc.

While personal problems can be tough to deal with, especially with touchy regulations like HIPAA, we can deploy tools to help us with technical matters.  The tool I have in mind provides powerful forensics capabilities as discussed in the previous blog.  Not discussed were specific best practices for using such a tool.

To get your creative juices going, on Friday I’ll share with you one such technique to help detect anomalous activity.  This technique detects unscrupulous activity that you would otherwise not see with simple network statistics.

Stay tuned and check back!

August 21, 2006

10GBASE-T: Four Years in the Making

The IEEE has recently approved the 802.3an-2006 standard for 10GBASE-T, 10 Gigabit Ethernet over twisted pair media.  With Cat6A approval just around the corner, this will be a big boon for 10 Gig.  The previous 10 Gig copper standard, CX4, was limited to very short distances (15 meters vs. the more conventional 100 meters promised for Cat6A), and its weird custom-built and terminated cables are expensive.

To put the complexity of this milestone in perspective, consider this:  “Work on 10GBase-T began in 2002, and the IEEE task force that wrote the specification agreed that 10GBase-T would require roughly 1,000 times better cancellation of internal cable impairments, at more than six times the speed of 1000Base-T.”  Not merely a factor of 10 over Gigabit Ethernet, but 1000x better!

Click here for the full scoop and read related posts at "Tapping 10 Gig" and "10 Gig Big in '06?".