Connecting the Dots

One of the most important lessons from cyber-war fighters is that relying on a single mechanism to defend your enterprise is naive. In fact, the more disparate and heterogeneous the network defense mechanisms, the better. MetaFlows fully embraces this concept by providing several detection mechanisms that work together:

  • IDS behavioral analysis looking for multiple symptoms that indicate a compromised host.
  • Using up to 50 different antivirus solutions at once to find bad content on the network.
  • Honeypots continuously mining for new threats.
  • Flow and log analysis.

These are just a few things that MetaFlows does.

Until now, MetaFlows has used these mechanisms independently to find and defeat threats. Our multifunctional approach has proven to be very effective. Many customers characterize the MetaFlows Security System as “The Last Line of Defense”. But now, we just upped the ante!

Leveraging our multifunctional view, we now also support behavioral correlation to combine disparate intelligence sources. Our Correlation Engine Rule (CER) specification now allows you to connect the dots across the different functional paradigms. But enough smoke and mirrors! Here are some REAL examples.

Data Exfiltration

  1. Detect the external hosts that are scanning your network.
  2. If any of these hosts exchange more than a few thousand packets with an internal host, flag the internal host as compromised.

Notice that (1) is an IDS function while (2) is a flow analysis function.
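
To make the two steps concrete, here is a minimal Python sketch of this kind of correlation. It is not the CER syntax: the record layouts, the 10.0.0.0/8 internal range, and the 3000-packet threshold are all assumptions chosen for illustration.

```python
import ipaddress
from collections import defaultdict

# Hypothetical inputs: these record shapes are assumptions for illustration,
# not the MetaFlows CER format.
ids_alerts = [
    {"src": "203.0.113.7", "dst": "10.0.0.5", "kind": "scan"},
    {"src": "198.51.100.9", "dst": "10.0.0.8", "kind": "scan"},
]
flows = [
    {"src": "10.0.0.5", "dst": "203.0.113.7", "packets": 4200},
    {"src": "203.0.113.7", "dst": "10.0.0.5", "packets": 1800},
    {"src": "10.0.0.8", "dst": "198.51.100.9", "packets": 40},
]

INTERNAL_NET = ipaddress.ip_network("10.0.0.0/8")  # assumed internal range
PACKET_THRESHOLD = 3000  # stand-in for "a few thousand packets"


def is_internal(ip):
    return ipaddress.ip_address(ip) in INTERNAL_NET


def find_compromised(ids_alerts, flows, threshold=PACKET_THRESHOLD):
    # Step 1 (IDS): external hosts seen scanning the network.
    scanners = {a["src"] for a in ids_alerts
                if a["kind"] == "scan" and not is_internal(a["src"])}

    # Step 2 (flow analysis): packets exchanged between each internal host
    # and each scanner, counted in both directions.
    exchanged = defaultdict(int)
    for f in flows:
        for inside, outside in ((f["src"], f["dst"]), (f["dst"], f["src"])):
            if is_internal(inside) and outside in scanners:
                exchanged[(inside, outside)] += f["packets"]

    return sorted({inside for (inside, _), n in exchanged.items()
                   if n > threshold})


print(find_compromised(ids_alerts, flows))
```

With the sample data, only 10.0.0.5 crosses the threshold: it was scanned by 203.0.113.7 and then exchanged 6000 packets with it.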

Zero-day Infection of Something That Cannot Be Executed in a Sandbox

  1. Host A downloads a bad .exe from server B .
  2. Host C (an Apple computer) downloads a JAR file from server B .
  3. Host C is talking to a known Command & Control site.

(1) is detected by a virus scanning application, (2) is detected with L7 analysis, (3) is detected by an IDS rule.
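
The same chain can be sketched in a few lines of Python. The event records and field names below are hypothetical; the point is only that three detectors, each seeing one piece, can be joined on the shared server and host.

```python
# Hypothetical, simplified events from three different detectors.
events = [
    {"source": "antivirus", "type": "bad_download", "host": "A", "server": "B"},
    {"source": "l7", "type": "jar_download", "host": "C", "server": "B"},
    {"source": "ids", "type": "cnc_contact", "host": "C", "server": "Z"},
]


def suspected_infections(events):
    # (1) Servers that the virus scanners caught delivering bad content.
    bad_servers = {e["server"] for e in events if e["type"] == "bad_download"}
    # (2) Hosts that pulled a JAR file from one of those servers.
    jar_hosts = {e["host"] for e in events
                 if e["type"] == "jar_download" and e["server"] in bad_servers}
    # (3) Hosts the IDS saw talking to a known Command & Control site.
    cnc_hosts = {e["host"] for e in events if e["type"] == "cnc_contact"}
    # A host that appears in both (2) and (3) is the likely zero-day victim.
    return sorted(jar_hosts & cnc_hosts)


print(suspected_infections(events))
```

Host C is flagged even though no sandbox or antivirus ever executed the JAR file.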

These examples demonstrate why traditional defenses are inadequate. Correlated together, these rules give you a powerful view of exactly what is happening on your network. You really need a multifunctional system that can connect the dots.

What’s Wrong with Sandboxing?

How Sand-Boxing Works

The latest and hottest trend in cyber-security is sand-boxing. Sand-boxing is virus detection on steroids. Instead of relying on prior knowledge about particular viruses, this technique emulates a user’s workstation with a sandbox and tracks anything that attempts to go out of the box or attempts to infect other machines. The process is straightforward:

  1. Get all potentially infectious content coming into your organization, and
  2. Emulate each piece of content as if it was executing on your hosts.

Limitations of Sand-Boxing

Sand-boxing has low false positive rates, but causes a lot of false negatives. In other words, when it tells you that something is bad, it is almost certainly bad. But it has the potential to miss a lot of bad things.

Architectural Limitations

This limitation has to do with step 1 above (get all dangerous content coming into your organization). Your defense perimeter is dissolving because of new network trends and applications:

  1. Mobile devices continuously come into and go out from your network.
  2. Peer-to-peer protocols (which go right through sand-boxing and firewall appliances) are becoming mainstream (Skype, BitTorrent, B2B applications).
  3. Services are being pushed to the cloud, out of the grasp of your sandbox.
  4. Virtual machines move around at the speed of light from one host to another.
  5. IPv6 and other emerging trends are facilitating end-to-end encrypted tunneling right through your perimeter.

So, if you do not have a perimeter, how do you know what is coming in? Well, you don’t! That is why sand-boxing (or pure virus detection) is limited in scope and cannot survive the evolution of malware.

Another architectural limitation has to do with cost. If you run a large network, executing and/or opening every piece of content before it is delivered requires a lot of CPU and will slow down your network. Sand-boxing can only scale to a certain size; beyond that it becomes unrealistic and expensive.

Algorithmic Limitations

This limitation has to do with step 2 above (emulate each piece of content as if it was executing on your hosts). Evasion is an information security term that refers to the ability of the bad guys to:

  1. Know how you are detecting them and
  2. Add subterfuges to defeat your specific security measures.

A sandbox can be detected. Once malware realizes that it is in a sandbox, the malware will switch to its best behavior so that the sandbox is happy. Only when the malware gets out of the sandbox and onto the actual target device will it do its damage.

A second algorithmic limitation is that not every system is the same. Sandboxing a particular version of Microsoft Windows (which is what commercial sandbox solutions do) leaves all your other devices (Linux, Apple, Android, etc.) completely open to attack.

How is MetaFlows Better?

MetaFlows is not an antivirus. We detect the attempts to introduce a virus in your network AND/OR detect the presence of a virus. Think of it as a network-level sandbox that not only inspects individual pieces of content, but also keeps track of the behavior of all your devices over time. There is one thing a malicious host cannot evade: being malicious!

If it looks like a duck, swims like a duck, and quacks like a duck… it is a duck.

How does it work?

MetaFlows looks for classes of odd behavior from hosts on your network:

  1. Scanning behavior
  2. Being attacked on vulnerable ports
  3. Downloading dangerous content
  4. Communication with questionable sites or sites that are already known to be bad
  5. Scanning outward or doing a lot of DNS lookups

If we detect behavior from multiple event classes over a time period (ranging from minutes to hours), MetaFlows triggers an alert.
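
As a rough illustration of this rule, the Python sketch below flags a host once events from at least two different classes fall inside one time window. The class names, the one-hour window, and the two-class minimum are assumptions for the example, not MetaFlows settings.

```python
from collections import defaultdict

WINDOW_SECONDS = 3600  # assumed window; the text says minutes to hours
MIN_CLASSES = 2        # assumed number of distinct classes needed to alert


def hosts_to_alert(events, window=WINDOW_SECONDS, min_classes=MIN_CLASSES):
    """events: iterable of (timestamp_seconds, host, event_class) tuples."""
    by_host = defaultdict(list)
    for ts, host, cls in sorted(events):
        by_host[host].append((ts, cls))

    flagged = []
    for host, seq in by_host.items():
        start = 0
        for end in range(len(seq)):
            # Slide the window start forward until it spans <= window seconds.
            while seq[end][0] - seq[start][0] > window:
                start += 1
            classes = {cls for _, cls in seq[start:end + 1]}
            if len(classes) >= min_classes:
                flagged.append(host)
                break
    return sorted(flagged)


events = [
    (0, "10.0.0.5", "attacked_on_vulnerable_port"),
    (1800, "10.0.0.5", "dangerous_download"),
    (0, "10.0.0.9", "outbound_scan"),
    (9000, "10.0.0.9", "dangerous_download"),
]
print(hosts_to_alert(events))
```

Here 10.0.0.5 is flagged because two classes occur within 30 minutes, while the two events on 10.0.0.9 are 2.5 hours apart and fall outside the assumed window.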

Here is a simple example:

  1. External host B performs a brute force attack to guess your password on port 22 on server A .
  2. One hour later there is a large transfer of data from server B to another server C (on your network).

Bang! That’s a hit for us. But a sandbox has no clue! By itself, a sandbox would not detect this behavior. The malware could “play nice” once it realizes that it is in a sandbox. The sandbox would then allow the malware to leave and get inside your network, where it could do substantial damage. But MetaFlows can keep an eye on software even after it leaves the sandbox.

The main advantage of a network-level sandbox is that it does NOT solely rely on inspecting content (like an antivirus) but instead detects malware in the act of being bad. So, if someone walks in through your front gate with an infected laptop, as soon as that laptop misbehaves, it will be flagged down.

The best part is that MetaFlows works regardless of what devices are on your network – it solves the algorithmic limitations of sandboxes. Our behavioral event classes do not depend on the type of system: if an internal host is performing outbound scanning, we do not care if it is a Microsoft device or an Apple device. All we need to know is that it has engaged in malicious behavior.

Finally, our approach is much more scalable than a content sandbox. MetaFlows mitigates the architectural limitations of sandboxes by scaling to 10 Gbps links with standard off-the-shelf quad-CPU systems. The cost and power consumption are orders of magnitude lower.

Predictive Correlation — The Future of Cyber Security?

What is Predictive Correlation?

Research funded by the National Science Foundation has led to the development of a proprietary inter-domain correlation algorithm that is mathematically similar to Google’s Page Rank algorithm. Event scores are autonomously obtained from a global network of honeypot sensors monitored by the MetaFlows Security System (MSS). The honeypots are virtual machines that masquerade as victims. They open up dangerous ports/applications and/or browse dangerous websites. As the honeypots are repeatedly infected, the MSS records both successful and unsuccessful hacker URLs, files, bad ports, and bad services. When a honeypot has a security event that triggers a false positive, the alerts for those events are ranked negatively, thus providing insight into events that should be routinely ignored or turned off. Security events that trigger true positives are ranked positively, thus improving their visibility. This information is then propagated in real time to each of our subscribers’ sensors in the system to augment traditional correlation techniques. This additional inter-domain correlation is important because it adds operational awareness based on real-time intelligence.

How does it work?

As shown in the figure below, honeypots work behind the scenes, continuously mining global relevance data and flow intelligence (IP reputation) for threats that penetrate differing degrees of cyber-defenses on different types of systems. After this step, annotated data from all network sensors (whether the sensors are honeypots or not) are compared and events are correlated with an algorithm similar to Google’s PageRank algorithm: X = bs + aW*X.

Diagram of MetaFlows event correlation system
Figure 1: Predictive Global Correlation
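
For readers who want a feel for an equation of the form X = bs + aW*X, here is a small Python sketch that solves it by repeated substitution, the same way PageRank-style rankings are usually computed. The post does not define the symbols, so the reading used here is an assumption: s holds each sensor's own event score, W is a row-normalized matrix of how similar the sensors are, and a and b are scalar weights. This is not the proprietary algorithm.

```python
def rank(W, s, a=0.5, b=0.5, iterations=100, tol=1e-12):
    """Iterate x <- b*s + a*W*x until the values stop changing.

    Assumed meaning: s[i] is sensor i's locally observed score, and
    W[i][j] is how much sensor i trusts sensor j (rows sum to 1).
    Convergence is guaranteed here because a < 1 and rows of W sum to 1.
    """
    n = len(s)
    x = [0.0] * n
    for _ in range(iterations):
        new_x = [b * s[i] + a * sum(W[i][j] * x[j] for j in range(n))
                 for i in range(n)]
        if max(abs(p - q) for p, q in zip(new_x, x)) < tol:
            return new_x
        x = new_x
    return x


# Three hypothetical sensors: the middle one has no local evidence (score 0)
# and inherits its ranking from its two neighbors.
W = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 1.0, 0.0],
]
s = [1.0, 0.0, -1.0]  # positive = true positive, negative = false positive
x = rank(W, s)
print(x)
```

In this toy case the neighbors disagree, so the middle sensor's ranking settles at zero; change one neighbor's score and the middle sensor's ranking moves with it.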

This process is designed to provide subscribers with intelligence data that takes into account the similarities and differences between the sources of the data. Because of space limitations, we cannot explain the math here or why it makes sense; however, our system builds on the work described in “Highly Predictive Blacklisting” by Jian Zhang, Phillip Porras, and Johannes Ullrich (SRI International and the SANS Institute), Usenix Security, August 2008. We highly recommend that you read this article.

So What?

As a result of the algorithm, once a piece of intelligence reaches our system it is not equally distributed to all customers. Instead, it is mathematically weighted and routed to where it is most relevant, just as the first few web pages of a Google search yield the most relevant information for a particular search.

In addition to real-time intelligence on true positive security events (positive ranking), our system also provides information on security alerts that are irrelevant by demoting them and reducing false positive clutter. In other words, this system can propagate known false positives and known true positives among sensors using a mathematical model that maximizes prediction.

Graph of prediction power for MetaFlows ranking algorithm

The graph above quantifies the prediction power of the ranking algorithm. The experiment was carried out on the Snort event relevance data gathered between February 7th, 2010 and February 22nd, 2010. At the start of each day we performed the ranking operation over the previous day’s Snort event data and compared the predicted ranking values with the actual events gathered during that day from the sensors and honeypots. The simple prediction (blue line) is based on predicting that, for each sensor, the same event ranking is carried over from the previous day without running the algorithm (this is what people normally do today).

The Y axis is the hit ratio. The “hit ratio” is defined as the number of times the prediction matches the outcome in terms of the sign (positive or negative), divided by the number of non-zero rankings predicted.

  • We increment the hit counter if the prediction and the outcome both have positive rankings.
  • We increment the hit counter if the prediction and the outcome both have negative rankings.
  • We decrement the hit counter if the prediction and the outcome have opposite signs.
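
The definition above translates almost directly into code. The Python sketch below follows the three rules as written; the sample rankings are made up, and treating an outcome of exactly zero as "no change to the counter" is an assumption, since the post does not say.

```python
def hit_ratio(predicted, actual):
    """Signed hit counter divided by the number of non-zero predictions."""
    hits = 0
    nonzero_predictions = 0
    for p, a in zip(predicted, actual):
        if p == 0:
            continue  # zero predictions are not counted at all
        nonzero_predictions += 1
        if p * a > 0:
            hits += 1   # both positive or both negative
        elif p * a < 0:
            hits -= 1   # opposite signs
        # a == 0: counter unchanged (assumption; not specified in the post)
    return hits / nonzero_predictions if nonzero_predictions else 0.0


# Hypothetical rankings for five events on one sensor.
predicted = [0.8, -0.3, 0.0, 0.5, -0.9]
actual = [0.4, -0.1, 0.7, -0.2, 0.0]
print(hit_ratio(predicted, actual))
```

Two sign matches and one mismatch over four non-zero predictions give a hit ratio of 0.25.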

The figure shows that the ranking prediction (orange line) is strictly superior to simple prediction by 141% to 350% (depending on the day). This might not seem too impressive on the surface but if you dig a little deeper this is what it means:

  • Assuming 5 minutes of human analysis time per incident, a system with no ranking would give you a hit rate that finds 1 actionable item for every 20-30 incident investigations (roughly 0.4 to 0.6 actionable items per analyst hour).
  • A system with predictive ranking would let you find 1 actionable item every 6-7 incident investigations (roughly 1.7 to 2 actionable items per analyst hour).
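
The per-hour figures follow from simple arithmetic, shown here as a short Python sketch. The only inputs are the numbers quoted above; the function name is ours.

```python
MINUTES_PER_INCIDENT = 5  # analysis time per incident, from the text


def actionable_per_hour(investigations_per_actionable,
                        minutes_per_incident=MINUTES_PER_INCIDENT):
    # An analyst gets through 60 / 5 = 12 investigations per hour.
    investigations_per_hour = 60 / minutes_per_incident
    return investigations_per_hour / investigations_per_actionable


for label, low, high in (("no ranking", 20, 30), ("predictive ranking", 6, 7)):
    best = actionable_per_hour(low)
    worst = actionable_per_hour(high)
    print(f"{label}: {worst:.1f} to {best:.1f} actionable items per analyst hour")
```

At the quoted rates, the same analyst hour yields about four times as many actionable items with ranking as without.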

You can do the math in terms of cost savings: it’s huge! Most of the cost of network security systems is not the appliance or the software, but rather wasted analyst time!

You Should Not Just Take our Word for it!

The cyber-security arena is packed with technologies that claim they have the best solutions. That is why we encourage users to take the time to evaluate our predictive correlation and run it side-by-side with existing solutions. The outcome is always surprisingly good.

Collaborate with An Audit Log

Audit Log

The MetaFlows Security System allows organizations to grant access to multiple users for online collaboration in sharing sensor data and intelligence. This is a big advantage because it helps distribute workloads across departments and at different levels of the incident response process. One issue customers brought up was that they could not tell who took which action, or when. This is why we added the Account Audit Log feature. You can find this feature under Account -> Account Audit Log. With this new Audit Log, you can track most account actions, including:

  • Changes to contact information and subscription
  • All account access
  • Sensor restarts
  • Creating, changing, or deleting:
    • Sensors
    • Classifications
    • Snort Rules
    • Report Specifications

For every logged action, we track the user, time, and IP address from which these actions originated. We also provide extra details if available.

New Packet Logging and File Carving

Packet Logging and File Carving

Being able to go back and look at the payloads or files transmitted on a network is extremely useful for several reasons:

  1. If you do not have the payload, you cannot really prove malicious intent, and legally you are on the hook.
  2. Payloads/Files are the ultimate forensic tool to decide if a particular incident is a false positive or a true positive.
  3. In more advanced systems, payloads can also be used to find false negatives (things that should have caused a security event but did not).

Obviously logging all data transferring on a network is challenging because disk space is limited and disks are relatively slow.

The MetaFlows Security System Logging Approach

Our overall approach to overcoming these logging limitations is:

  • We store Payloads/Files that are associated with a specific security alert (using the time and the source/destination addresses and ports for identification).
  • When logging proactively (to also see Payloads/Files that do not involve a security alert), we keep the disk at or below 90% utilization, or below a certain number of Gigabytes, by deleting the oldest logs.

This scheme gives you certainty of access if there is an incident and a time window to go back in time to look for certain things that might have been overlooked.
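
The "delete the oldest logs first" part of this scheme can be sketched in Python as follows. This is an illustration only: it enforces a byte budget on a single directory, and the file names are hypothetical. A percentage-based limit could be checked the same way using the standard `shutil.disk_usage` call.

```python
import os
import tempfile


def prune_oldest(log_dir, max_bytes):
    """Delete the oldest files in log_dir until the total size fits max_bytes."""
    entries = []
    for name in os.listdir(log_dir):
        path = os.path.join(log_dir, name)
        if os.path.isfile(path):
            st = os.stat(path)
            entries.append((st.st_mtime, path, st.st_size))
    entries.sort()  # oldest modification time first

    total = sum(size for _, _, size in entries)
    deleted = []
    for _, path, size in entries:
        if total <= max_bytes:
            break
        os.remove(path)
        total -= size
        deleted.append(path)
    return deleted


def demo():
    # Three 100-byte files with different ages and a 250-byte budget.
    with tempfile.TemporaryDirectory() as d:
        for age, name in ((300, "old.pcap"), (200, "mid.pcap"), (100, "new.pcap")):
            path = os.path.join(d, name)
            with open(path, "wb") as f:
                f.write(b"\0" * 100)
            os.utime(path, (1_000_000 - age, 1_000_000 - age))
        return [os.path.basename(p) for p in prune_oldest(d, max_bytes=250)]


print(demo())
```

Only the oldest file is removed, which is what preserves the most recent time window for later investigation.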

Recent Improvements

The Logging and File Carving system has been vastly improved by the following:

  1. We now index the packets based on IP addresses using a proprietary approach. Instead of looking for particular packets in a big bucket full of files, the files are divided into smaller buckets, each representing a subset of the addresses. This indexing scheme slows down packet logging a bit but makes looking for packets about 200 times faster!
  2. We added the ability to specify user-defined logging policies. Once a policy hits, the logging system prioritizes all packets for the matching policy and stores the Files/Payloads in a separate high-priority repository, which takes precedence over the normal logging. We will make a separate announcement on the policy specification because it is quite powerful and complex, and requires a dedicated post. For now, the only logging policy is to prioritize any packets involved in high-priority events. In the future, users will be able to customize more precise ad-hoc policies based on IP addresses, ports, and type of alerts.

The new carving system is backward compatible and automatically converts the existing packet logs stored on the sensor hard drive to the new indexing scheme. This process can take from a few minutes to days depending on your disk size. While this conversion takes place, queries on older logs may not return any data.
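
To show why bucketing by address speeds up lookups, here is a toy Python sketch. It is not the proprietary index: the modulo-256 bucket function, the in-memory lists, and the packet fields are all assumptions made for the example.

```python
import ipaddress
from collections import defaultdict

NUM_BUCKETS = 256  # assumed bucket count


def bucket_of(ip):
    # Works for IPv4 and IPv6: map the numeric address to a bucket.
    return int(ipaddress.ip_address(ip)) % NUM_BUCKETS


class BucketedLog:
    def __init__(self):
        self.buckets = defaultdict(list)

    def log(self, packet):
        # Store the packet under both endpoints' buckets. This extra work at
        # write time is the price paid for fast lookups later.
        for b in {bucket_of(packet["src"]), bucket_of(packet["dst"])}:
            self.buckets[b].append(packet)

    def find(self, ip):
        # Only one bucket has to be searched instead of the whole log.
        candidates = self.buckets.get(bucket_of(ip), [])
        return [p for p in candidates if ip in (p["src"], p["dst"])]


log = BucketedLog()
log.log({"src": "10.0.0.5", "dst": "203.0.113.7", "payload": b"a"})
log.log({"src": "198.51.100.9", "dst": "10.0.0.5", "payload": b"b"})
log.log({"src": "10.0.0.8", "dst": "198.51.100.9", "payload": b"c"})
print(len(log.find("10.0.0.5")))
```

A query for one address touches a single bucket, so lookup cost shrinks roughly in proportion to the number of buckets, while each write does slightly more work.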

Got IPv6?

IPv6 support

Many organizations are transitioning to IPv6 because it allows the address space to be managed more easily. One thing is for sure: hackers are on top of it; they are already serving malware from IPv6-capable servers! It is therefore imperative that all security software be IPv6-capable in order to avoid glaring security holes.

When it comes to IPv6, most people put the blinders on. Most security policies simply ignore it because it is not mainstream. But you can be sure that whatever is being ignored can be used against you. IPv6 tunnels are proliferating and are usually not monitored at all. They can easily be used as a data exfiltration superhighway out of your network.

The MetaFlows Security System can now work on both IPv4 and IPv6, without gaps in your security.

A Very Expensive Lesson in Malware Protection

The attack on credit card numbers through Target has made many realize that network security, malware, and password protection need to be taken more seriously. According to the article below, the two major factors in this data breach were (1) undetected malware that was able to scan credit card numbers in real time, and (2) simple/default passwords that were never updated (especially not in accordance with PCI regulations). Both of these issues have seemingly easy fixes. For the malware, get something that uses not just signature but also behavioral detection and gives analysts real-time analysis (oh hey, what do you know, the MetaFlows Security System does all that, and more!). The passwords are a bit trickier. This requires staff training and individual memories the size of elephants in order to remember the hundreds of passwords we use nowadays. But with some staff education on the importance of keeping passwords up to code, and perhaps some mnemonic tricks, the world can be a safer place.

IT Weaknesses Paved the Way for Target Hackers