Most of my posts on BreachBytes are about using flow data, primarily NetFlow, for network security, incident response and network forensics on enterprise networks. I also tend to get rather technical most of the time. For this post I want to take a step back and answer the following question: what’s the big deal about network flow data? Let me try to answer this question in a single sentence:

“Network flow data, which can be generated by all enterprise routers, provides security analysts with real-time, long-term network visibility that can be used to prevent data leakage, defend against the insider threat and enhance incident response effectiveness.”

Key Points:

  1. Generated by all enterprise routers: The technology is in place, your network can generate flow data in some form.
  2. Real-time: Flow reporting can be near real-time, depending on configuration.
  3. Network visibility: Most enterprises are essentially blind to their internal network (the Soft Gooey Center — good in candy, bad in networks).
  4. Long-term: Disk is cheap and flows are small, while still providing adequate information for a variety of network security tasks.

Background

Before going into more detail, I want to provide some context. Network flow data is mainly used for network operations (quality of service, usage metering, etc.). It can provide the operations staff with a variety of reports, benchmarks and throughput measurements to ensure that the network is operating optimally. A quick Google search for “netflow analysis” returns a litany of links to products and vendors eager to provide you with their network operations solution.

A network flow can be defined as a statistical summary of a conversation between two hosts on a network. The hosts could both be internal, which helps in detecting insider threat activity and policy violations, or inside-outside (an internal host going out to the Internet or an external host contacting an internal host). The fields in a network flow vary by type but usually include the source and destination addresses, port information, protocol and the number of bytes and packets exchanged in the conversation.
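
To make this concrete, here is a minimal Python sketch of what one flow record might hold and how the internal/external distinction could be applied to it. The field names, the FlowRecord class and the internal address ranges are my own illustrative assumptions, not part of any particular flow format.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class FlowRecord:
    """One summarized conversation. Field names are illustrative only."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int       # IP protocol number, e.g. 6 = TCP, 17 = UDP
    packet_count: int
    byte_count: int


# Assumed internal address space; substitute your own ranges.
INTERNAL_NETS = [ip_network("10.0.0.0/8"), ip_network("192.168.0.0/16")]


def is_internal(addr: str) -> bool:
    """True if the address falls inside one of the assumed internal ranges."""
    ip = ip_address(addr)
    return any(ip in net for net in INTERNAL_NETS)


def classify(flow: FlowRecord) -> str:
    """Label a flow by where its two endpoints sit relative to the enterprise."""
    src_in, dst_in = is_internal(flow.src_ip), is_internal(flow.dst_ip)
    if src_in and dst_in:
        return "internal-internal"
    if src_in:
        return "internal-external"
    if dst_in:
        return "external-internal"
    return "external-external"
```

A flow from 10.1.2.3 to 10.9.9.9 on port 445 would come back as "internal-internal", which is exactly the kind of traffic a perimeter-only sensor never sees.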

All enterprise router vendors support some form of flow data, so in effect your enterprise has already made the investment in this technology, whether it is being used or not. In most routers, turning on flow data is a 2-3 line modification to the router’s configuration file. You have already paid a lot of money for your router and support contract. Make the most of it by installing a flow collector and turning on your flow data.

There are three major technologies today in the flow arena: NetFlow, sFlow and IP Flow Information Export (ipfix). NetFlow is a Cisco technology that has been found in the IOS router operating system for many years. It is supported on Cisco’s enterprise routers and, more recently, switches. Other vendors, such as Enterasys, have adopted the NetFlow protocol in their routers as well. NetFlow data is sent via UDP to a collector, where it is stored for further analysis.
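
For readers who want to see what "sent via UDP to a collector" looks like in practice, below is a rough Python sketch of the receiving side for NetFlow version 5. The byte layout (a 24-byte header followed by 48-byte records) reflects the version 5 export format as I understand it, so check it against your exporter's documentation. The function names are mine, and port 2055 is only a common convention, not a standard.

```python
import socket
import struct
from ipaddress import IPv4Address

# NetFlow v5 layout, network byte order (verify against your exporter's docs).
V5_HEADER = struct.Struct("!HHIIIIBBH")                # 24 bytes
V5_RECORD = struct.Struct("!4s4s4sHHIIIIHHBBBBHHBBH")  # 48 bytes


def parse_v5(datagram: bytes) -> list:
    """Decode one NetFlow v5 export datagram into a list of flow dicts."""
    version, count, *_ = V5_HEADER.unpack_from(datagram, 0)
    if version != 5:
        raise ValueError(f"not a NetFlow v5 datagram (version={version})")
    flows = []
    for i in range(count):
        offset = V5_HEADER.size + i * V5_RECORD.size
        f = V5_RECORD.unpack_from(datagram, offset)
        flows.append({
            "src_ip": str(IPv4Address(f[0])),
            "dst_ip": str(IPv4Address(f[1])),
            "packets": f[5],
            "bytes": f[6],
            "src_port": f[9],
            "dst_port": f[10],
            "tcp_flags": f[12],
            "protocol": f[13],
        })
    return flows


def listen(host: str = "0.0.0.0", port: int = 2055) -> None:
    """Print flows as they arrive; runs until interrupted."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    while True:
        datagram, (exporter, _) = sock.recvfrom(65535)
        for flow in parse_v5(datagram):
            print(exporter, flow)
```

Calling listen() on the collector host and pointing a router's flow export at that address and port should be enough to see records scroll by. A real collector would also write them to disk and track the sequence numbers in the header to detect lost datagrams.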

NetFlow is the dominant flow technology today, but that may be changing in the near future as open standards become more prevalent. sFlow and ipfix are the two likely candidates to steal momentum from NetFlow. sFlow is a packet sampling technology found in many routers and switches. ipfix is an open standard based on NetFlow version 9 that provides better reliability and improved extensibility. Like NetFlow, sFlow and ipfix send flows over UDP to a collector on the network (although the current IETF draft for ipfix also allows for TCP or SCTP transfer).

Security Benefits of Flow Data

Flows are ubiquitous, but how do they benefit network security operations? Let’s start with a comparison to packet capture technologies. Packet capture offers a huge amount of information, but at a heavy price. A medium-sized enterprise network can easily generate a terabyte of packet data in a single day, and that is just from monitoring internal-to-external traffic. Storing this volume of data is expensive, and searching it is slow and painful at best. Flow data, by contrast, is very small because many packets are summarized in a single flow record. A single NetFlow version 5 UDP packet can carry up to 30 flow records. Flow data is also much easier to understand than packet dumps. In many cases a simple text editor is all that is needed to analyze flow data once it has been dumped to text, unlike packet data, which requires tools such as tcpdump and Wireshark (formerly Ethereal) to make sense of a network event.

I am in no way advocating giving up packet data. I am, however, advocating adding flow data as a data source for network forensics and incident response. It is simply not feasible to maintain large packet capture repositories over the long term. Long-term storage of flow data, on the other hand, is cheap and easy, since your network is already capable of generating the data and the footprint is so small. (Small is relative, of course: a medium-sized enterprise is still capable of generating tens or hundreds of millions of flow events per day.)
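
A quick back-of-the-envelope calculation shows the gap. The Python sketch below assumes NetFlow version 5 wire sizes (48 bytes per record, a 24-byte header per datagram, 30 records per full datagram) and compares them with the terabyte-a-day packet capture figure mentioned earlier. What a collector actually writes to disk depends on its storage format, so treat the numbers as an order-of-magnitude estimate only.

```python
V5_HEADER_BYTES = 24
V5_RECORD_BYTES = 48
MAX_RECORDS_PER_DATAGRAM = 30

# The "terabyte a day" packet capture figure used in the text.
PACKET_CAPTURE_BYTES_PER_DAY = 10**12


def flow_bytes_per_day(flows_per_day: int) -> int:
    """Bytes of v5 export data per day, assuming datagrams are sent full."""
    datagrams = -(-flows_per_day // MAX_RECORDS_PER_DATAGRAM)  # ceiling division
    return flows_per_day * V5_RECORD_BYTES + datagrams * V5_HEADER_BYTES


for flows in (10_000_000, 100_000_000):
    size = flow_bytes_per_day(flows)
    ratio = PACKET_CAPTURE_BYTES_PER_DAY / size
    print(f"{flows:>11,} flows/day ~ {size / 1e9:.2f} GB/day, "
          f"roughly {ratio:,.0f}x smaller than 1 TB of packets")
```

Under these assumptions, even 100 million flows a day comes to just under 5 GB, which is why keeping months of flow history is realistic where keeping months of packets is not.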

To my surprise, some network security professionals do not seem to understand why they should store network events long-term. They seem to think a small packet collection solution between the firewall and the internal network is an adequate data source for performing network forensics and incident response. I once had security analysts at a major bank admit there were times when their three-day packet capture window had long since passed by the time a security incident was reported. By their own admission, they did not see this as a problem worth solving, even after they were made aware that flow data can be a great asset in solving such problems.

Flow data is nearly free to most enterprises and provides network visibility from the router level all the way down to the switch. There is simply no reason not to collect this information (the performance impact on routers is a solved problem, and exporting flows adds little overhead to the network). With the enhanced visibility provided by flow data comes the ability to find policy violations, monitor for malicious insiders and respond to network breaches and security incidents. If flow data is not enabled inside the network, you can never know the full extent of a security incident and are forced to rely on hunches or spend countless hours paying consultants to do host-based forensics. In the event of a compromise or breach, flow data is the data source that lets you quickly reconstruct the timeline of the event and definitively determine the scope and magnitude of the incident.
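
As a rough illustration of what reconstructing a timeline means, the Python sketch below takes a handful of stored flows and pulls out everything involving one suspect host, ordered by time, along with every peer it talked to. The sample rows, addresses and function names are all made up for the example; a real flow repository would be queried through whatever tooling your collector provides.

```python
from datetime import datetime

# Hypothetical stored flows: (start_time, src_ip, dst_ip, dst_port, bytes)
FLOWS = [
    ("2024-01-15 09:42:07", "10.1.4.22", "10.1.9.5", 445, 18_400),
    ("2024-01-15 09:12:31", "203.0.113.50", "10.1.4.22", 22, 96_000),
    ("2024-01-15 10:03:55", "10.1.4.22", "198.51.100.9", 443, 52_000_000),
    ("2024-01-15 09:50:00", "10.1.7.80", "10.1.9.5", 80, 4_200),
]

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def timeline(flows, host):
    """All flows that touch `host`, oldest first."""
    hits = [f for f in flows if host in (f[1], f[2])]
    return sorted(hits, key=lambda f: datetime.strptime(f[0], TIME_FORMAT))


def peers(flows, host):
    """Every other address `host` exchanged traffic with (the potential scope)."""
    return {f[2] if f[1] == host else f[1]
            for f in flows if host in (f[1], f[2])}


for when, src, dst, port, nbytes in timeline(FLOWS, "10.1.4.22"):
    print(f"{when}  {src} -> {dst}:{port}  {nbytes:,} bytes")
print("hosts to examine next:", sorted(peers(FLOWS, "10.1.4.22")))
```

In this invented data set the suspect host receives an inbound SSH connection, then touches an internal file server, then pushes a large volume of data to an outside address. Repeating the same query for each peer is how the scope of an incident gets mapped out.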

Beyond network visibility for network forensics and incident response, flow data can be leveraged for a number of other security-related tasks. Two hot areas right now that make excellent use of flow data are Network Behavior Analysis (NBA) systems and compliance reporting solutions. NBA is a form of anomaly detection, so it does not require a set of signatures and can detect zero-day attacks. The best products on the market today use flow events as a primary data source for their alerting capabilities. Compliance reporting, made popular by legislation like Sarbanes-Oxley and HIPAA, can leverage flow data to help ensure that only authorized users are connecting to protected resources within the network.
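
To give a feel for the anomaly detection idea behind NBA, here is a deliberately simple Python sketch: it compares a host's traffic volume today against that host's own history and flags a large deviation. This is only the basic concept under my own assumptions (a per-host daily byte total and a three-standard-deviation threshold); commercial NBA products use far richer models, and the function name is hypothetical.

```python
from statistics import mean, stdev


def is_anomalous(history, today, threshold=3.0):
    """Flag `today` if it is more than `threshold` standard deviations
    above the mean of the host's historical daily totals."""
    if len(history) < 2:
        return False  # not enough data to form a baseline
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return today > mu  # perfectly flat history: any increase stands out
    return (today - mu) / sigma > threshold


# Hypothetical megabytes sent per day by one workstation over the past week.
baseline = [100, 110, 95, 105, 102, 98, 107]
print(is_anomalous(baseline, 104))    # an ordinary day
print(is_anomalous(baseline, 5000))   # a sudden bulk transfer
```

No signature is involved: the second case is flagged purely because it does not look like this host's normal behavior, which is why the approach can catch activity nobody has written a rule for yet.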

Getting Flows

Hopefully this post has convinced you of the benefits of flow data to your enterprise network’s security. The next step is to reach out to those who can help you start leveraging flow data to become a more effective network security professional.

If you are a security professional with an enterprise that is already collecting flow data for network operations purposes, talk to the network operations staff and see what your options are for gaining access to their flow repository. Some routers can be configured to send flows to multiple locations so ask if this is possible with the routers on your network. If not, your other option is to convince the network operations staff to split the flow stream at the collection point. Most flow collection software (commercial or free) has this capability so be leery if they tell you that it can’t be done.
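
Splitting the stream at the collection point amounts to receiving each flow datagram once and re-sending a copy to every interested collector. The Python sketch below shows the idea under my own assumptions; the function names and addresses are hypothetical, and existing collection software will have its own way of doing this.

```python
import socket


def fan_out(datagram: bytes, destinations, sock) -> int:
    """Send one received flow datagram, unmodified, to every destination.

    Returns the number of copies sent.
    """
    for dest in destinations:
        sock.sendto(datagram, dest)
    return len(destinations)


def run_splitter(listen_addr, destinations) -> None:
    """Receive flow datagrams on listen_addr and copy each one to all
    destinations; runs until interrupted."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.bind(listen_addr)
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    while True:
        datagram, _exporter = rx.recvfrom(65535)
        fan_out(datagram, destinations, tx)


# Example (hypothetical addresses): the operations collector and a
# security collector both receive every datagram.
# run_splitter(("0.0.0.0", 2055), [("192.0.2.10", 2055), ("192.0.2.20", 2055)])
```

One caveat with this naive approach: the copies arrive with the splitter's source address rather than the router's, which matters if a downstream collector identifies exporters by source IP.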

If flow data is not currently being collected or you need to collect data specifically for security purposes, take a look at my posts on free NetFlow collector software and using sFlow for network forensics. There are also some links below to sites dedicated to particular flow technologies. You should be able to find plenty of information on how to collect flow data, configure routers and analyze incoming flow data streams.

Links About Flows:

  • NetFlow - Cisco’s page on NetFlow in the IOS router operating system
  • J-Flow - Juniper’s page about configuring J-Flow
  • sFlow - sflow.org has numerous resources on a variety of sFlow technologies
  • ipfix - The IETF ipfix charter
  • Wikipedia entries: NetFlow | sFlow | ipfix