Security / June 7, 2016

Metadata Use Case: Retain Top Talent with URL Metadata

Employee turnover can cost a bundle.

According to a study by the Society for Human Resource Management, the average cost to find and train a replacement employee is six to nine months of the exiting employee’s salary. Some studies put that number even higher and also say that, as related to skill level, the costs can increase exponentially. Replacing an entry-level employee runs about 30 to 50 percent of the annual salary. For mid-level, you’re talking 150 percent. And for specialized and senior-level employees? Forget about it. You might be shelling out four times the annual salary of a departing employee to find a replacement.

So can metadata help with employee retention? Why yes, it can.

The Not So Good Old Way

Imagine that you have an employee base of 10,000 people and are concerned because you’ve noticed attrition in a certain department. Sure, you want to know who’s left, but you also want to know who may be considering leaving, perhaps by looking into who’s visiting LinkedIn and other job sites.

The traditional approach would have you gathering information on who your users are, what computers they’re using, and what departments they belong to. Next, you’d determine what job boards are out there (LinkedIn, Monster, Dice, etc.). With that information, you could create a group and a tag and correlate all the data across a set time (say a month) to find out who’s visiting what sites when.

The problem is that stitching together all those records is hard. If you ask a computer (even a big bunch of servers) to perform even this simple computation, it starts to slow down . . . until, eventually, you get the information.

And what if more complicated computation is necessary? To find out if something is going on in your organization, even standard queries get complex fast, which makes getting actionable answers super difficult.

Turning Goo into Goodness

Let’s take a look at Hadoop. It lets you build a clustered environment so you can have hundreds of machines sharing drives in a single giant open source file system, HDFS. Using various tools, you can take unstructured data “goo” (logs from Web proxies, AD DHCP requests, calendar invites to lunch in .PSTs, badge reader data, DNS; it doesn’t matter what the data is) and pour it into that big ol’ HDFS pot.

Once the goo is in the pot, you can query for things like: Who’s looking at LinkedIn right now? It’ll take the first pass of stitching everything together—from the IP address to the user name to the URL to the time—and then spit out a single specific record per user. In a day, that might turn out to be a million records. You take those and feed them into an analytics tool like Splunk in order to get a more manageable, visual “graph” of what’s going on over a given length of time. Boom! The data renders because all the hard work of stitching everything together has already been done by Hadoop.
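To make that "stitching" step a little more concrete, here is a minimal Python sketch, not Hadoop code. It assumes two hypothetical inputs that do not come from any real product: DHCP-style leases that say which user held an IP address during a time window, and proxy-style hits that record an IP, a URL, and a timestamp. The class and function names are invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Lease:
    """Hypothetical lease record: who held an IP, and when."""
    ip: str
    user: str
    start: datetime
    end: datetime


@dataclass(frozen=True)
class ProxyHit:
    """Hypothetical proxy log entry: an IP requested a URL at a time."""
    ip: str
    url: str
    when: datetime


def user_for(ip, when, leases):
    """Return the user who held `ip` at time `when`, or None if unknown."""
    for lease in leases:
        if lease.ip == ip and lease.start <= when < lease.end:
            return lease.user
    return None


def stitch(hits, leases):
    """Join hits to users, producing (user, url, when) records."""
    records = []
    for hit in hits:
        user = user_for(hit.ip, hit.when, leases)
        if user is not None:
            records.append((user, hit.url, hit.when))
    return records


# The same IP belongs to two different people on the same day.
leases = [
    Lease("10.0.0.5", "alice", datetime(2016, 6, 1, 8), datetime(2016, 6, 1, 12)),
    Lease("10.0.0.5", "bob", datetime(2016, 6, 1, 12), datetime(2016, 6, 1, 18)),
]
hits = [
    ProxyHit("10.0.0.5", "https://www.linkedin.com/jobs", datetime(2016, 6, 1, 9)),
    ProxyHit("10.0.0.5", "https://www.linkedin.com/jobs", datetime(2016, 6, 1, 13)),
]
records = stitch(hits, leases)
```

The point of the sketch is that the IP address alone is not enough: the timestamp decides which user a hit belongs to. The linear scan over leases is fine for a toy, but it is exactly the kind of work that gets slow at scale, which is why it gets pushed onto a cluster.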

Gigamon is doing the same thing with metadata—only at the network level and in real time! We’re saying, instead of dumping all the data, all the packets, everything about everything into the pot, ask the questions you want of the network first. Then take that very specific high-value, low-volume metadata, put it into an analysis tool, and make your decisions—on good, bad, whatever—based on the results.

The Metadata Way

Most NetFlow-generation engines are line cards in routers or control planes in switches. Unfortunately, routers and switches are not built to produce metadata for every packet and, therefore, can’t give you unsampled NetFlow without crushing the control plane. Instead, you get samples (e.g., for every x thousand packets; or at two-minute intervals). By contrast, Gigamon can produce and send unsampled NetFlow to collectors, storage devices, and SIEMs like Splunk, ArcSight, or QRadar. It’s easy for us if you’re hungry enough for it all.
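Why does sampling matter? A toy Python sketch shows the effect. It assumes simple deterministic 1-in-n packet sampling and made-up traffic in which each list entry stands for the source IP of one packet; real devices sample at far larger ratios and may use other schemes.

```python
from collections import Counter


def sample_every_nth(packets, n):
    """Deterministic 1-in-n sampling: keep the nth, 2nth, 3nth... packet."""
    return packets[n - 1::n]


# Hypothetical traffic: one chatty host and one host that sends a single packet.
packets = ["10.0.0.5"] * 6 + ["10.0.0.9"] + ["10.0.0.5"] * 5

unsampled = Counter(packets)                      # every packet is counted
sampled = Counter(sample_every_nth(packets, 4))   # only 1 in 4 is counted

# The quiet host shows up in the unsampled view but not in the sampled one.
quiet_host_seen_unsampled = "10.0.0.9" in unsampled
quiet_host_seen_sampled = "10.0.0.9" in sampled
```

In this made-up run the quiet host disappears from the sampled view entirely. A single visit to a job site is that kind of low-volume event.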

If you’re running the latest release of the GigaSECURE Security Delivery Platform and consuming IPFIX/NetFlow v10, your SIEM and security analysts can leverage our unique metadata elements. They can run queries not only for standard information about traffic, such as source and destination IP addresses and ports, but also for DNS query information, URLs, and HTTP/HTTPS return codes.

But that’s not all. At Gigamon, we understand it might not make sense to send every NetFlow packet to your SIEM. That’s why we created adjustable NetFlow templates. You can configure the Gigamon appliance to send only NetFlow with URLs, DNS, SSL certificate info, or HTTP response codes to the SIEM; the rest of the NetFlow can go to a collector. That way, you have both real-time situational awareness and forensic NetFlow in case you need after-the-fact analysis.
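The split described above can be pictured with a short Python sketch. This is not Gigamon configuration syntax and it is not how the appliance is implemented; the record layout and the field names (`url`, `dns_query`, `ssl_cert`, `http_status`) are hypothetical stand-ins for the metadata types mentioned in the text.

```python
# Hypothetical field names for the enriched metadata types named in the text.
ENRICHED_FIELDS = ("url", "dns_query", "ssl_cert", "http_status")


def route(record):
    """Return 'siem' if the record carries any enriched field, else 'collector'."""
    if any(record.get(field) is not None for field in ENRICHED_FIELDS):
        return "siem"
    return "collector"


def split(records):
    """Partition flow records into the two destinations."""
    out = {"siem": [], "collector": []}
    for record in records:
        out[route(record)].append(record)
    return out


flows = [
    {"src": "10.0.0.5", "dst": "203.0.113.10", "url": "https://www.linkedin.com/jobs"},
    {"src": "10.0.0.5", "dst": "198.51.100.7", "dns_query": "dice.com"},
    {"src": "10.0.0.9", "dst": "192.0.2.44"},  # plain flow, no enriched metadata
]
destinations = split(flows)
```

The SIEM sees only the low-volume, high-value records, while the plain flow lands on the collector for later forensic work.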

URL metadata off the wire gets interesting when traffic going to the Internet is not behind a proxy. Much legacy software is not proxy-aware. In those cases, you can still snarf URL data and forward it to the SIEM, where it can live alongside your proxy logs.

Or, how about if you have machines at a remote site and you’re tapped, but don’t have the bandwidth to ship logs? You can do what you need with Gigamon—out of band. In the past, I’ve also experienced a number of proxy issues when running antivirus, finding that my proxy was underpowered or would add latency. Snarfing wire URL data is perfect here, too.

In the case of employee retention, metadata can be used to discover “intent to leave.” You can configure your Gigamon appliance to send IPFIX templates that contain URL and source IP information in order to trace what sites employees are visiting. Even machines or applications that cannot be proxied still reveal what they are doing on the Internet, and that information can be analyzed by a SIEM like Splunk, ArcSight, or QRadar. The only difference is that the data is in IPFIX format and not syslog or JSON (for now).
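Once URL and source IP pairs reach the analysis side, the query itself can be simple. Here is a rough Python sketch, assuming the records have already been decoded from IPFIX into plain (source IP, URL) tuples. The decoding step is left out, the domain list is only an illustration built from the job boards named earlier, and the function names are invented.

```python
from collections import Counter
from urllib.parse import urlsplit

# Illustrative list only; a real deployment would maintain its own.
JOB_SITES = {"linkedin.com", "monster.com", "dice.com"}


def is_job_site(url):
    """True if the URL's host is a listed domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in JOB_SITES)


def job_site_hits(records):
    """Count job-site visits per source IP from (src_ip, url) pairs."""
    counts = Counter()
    for src_ip, url in records:
        if is_job_site(url):
            counts[src_ip] += 1
    return counts


records = [
    ("10.0.0.5", "https://www.linkedin.com/jobs/view/1"),
    ("10.0.0.5", "http://dice.com/"),
    ("10.0.0.9", "https://example.com/linkedin.com"),  # path mention, not the host
    ("10.0.0.9", "https://notdice.com/"),              # look-alike domain
]
hits = job_site_hits(records)
```

Matching on the parsed hostname rather than on a substring keeps look-alike domains and stray path text from being counted. Mapping a source IP back to a person is a separate step that depends on your own DHCP or directory data.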

Transform Intent to Leave into Reasons to Stay

Ninety percent of a CISO’s job isn’t about stopping advanced persistent threats (APTs) and defending against nation states. It’s about keeping good talent on board, as well as preventing people from doing dumb stuff that puts a company at risk, like plugging random infected thumb drives into domain controllers or sending confidential files because no one told them otherwise.

The network, again, is key here—as is URL metadata. It’s a way of catching dissatisfaction up front and creating an opportunity to make a change. If you can uncover intent to leave, you can intervene sooner and, as fitting, pull out all the stops to keep top talent.

Think about a new manager joining a company. Suddenly, his group has become much less productive. The department head comes to the executive committee and wants information about what is going wrong. Conversations have taken place, but people are still gloomy. How else can you begin to assess what’s happening? How about delving into whether or not there have been changes in workflow? Did the manager implement a new tool? If you can see these things on the network, you can start to uncover patterns and deduce what’s going on faster and more effectively.

As a manager, I’ve had really good people leave. In one case, when I was feeling disappointed and sad to lose a great guy, on the way to HR, I asked him why he decided to quit. He told me he was leaving because I didn’t give him a particular “Pentaho” project. I looked at him and said, “You never asked me for it.” He looked back at me like I should have been able to read his mind.

If he had told me before—or if I’d had data points alerting me to the fact that something was amiss—things might have turned out differently. In his case, I would have happily given him that project. Sadly, it was too late at that point; he had another gig and I was left to find a (costly) qualified replacement.
