Beware the Invisible Man Using SSH

Permanent link to this article:

A theory on the 1 billion account hack – and what you should do to avoid being Yahoo’d

Yahoo has been making a lot of news lately, and not for good reason. Marissa Mayer's failed attempt to turn the company around, which resulted in the sale of the company to Verizon for $4.8 billion, has been placed in jeopardy by Yahoo's inability to protect and secure its users' data.

In September of this year, Yahoo announced that information pertaining to 500 million user e-mail accounts had been stolen, dating back to 2014. It took the company two years to discover and report this loss.

In the immediate aftermath of that announcement, some US senators demanded to know what was learned from the first breach and called for hearings on the matter. But as the news faded from the media spotlight, so too did the pressure to understand just what happened, and who knew what, when. Fast forward to December, with Yahoo announcing a second data breach that eclipses the first in both sheer scale (over 1 billion user accounts) and elapsed time: the breach occurred in 2013.

Horrifically, it's been reported that the stolen information is being sold to hacker groups on the Dark Web, fetching upwards of $300,000. The information was turned over to US authorities by an unnamed third party; upon further investigation it was deemed credible, and Yahoo was formally notified. Unfortunately, the downstream effects of this stolen information will be felt for years as hackers seek to exploit the data for financial gain. Therefore, I would urge every Yahoo account holder reading this article to do two things right now: 1. Change your password and security questions. 2. Use multi-factor authentication on every web-based account you have – bank, credit card, Amazon, etc. Do yourself a huge favor, and do that right now.

Back to the important question: just how was someone able to steal user account information for over 1 billion accounts, completely undetected? My guess, without any inside knowledge, is that it has all the markings of exploited SSH key access, which is precisely how Snowden, and later Martin, stole troves of data from the NSA, and how the government of North Korea stole every bit of sensitive information it could get its hands on from Sony.

SSH creates an encrypted channel that can't be monitored. That's what makes SSH so powerful and effective when used for good. However, when used for ill, those same capabilities render all your security tools – SIEM, DLP, etc. – ineffective. That is why we (as the inventors of SSH) strongly advocate that customers implement stronger controls over the entire SSH lifecycle, and recommend the immediate remediation of any SSH key access issues within your own company.

If you’d like to learn more about what can and should be done about SSH in your company, please feel free to contact me – I’m happy to help, or at least point you in the right direction.


May the Brute Force NOT Be With You

I recently met with a customer that is using usernames and passwords instead of keys to control SSH access. For the past several months I've been so engrossed in solving SSH key management issues that I was somewhat taken aback by the approach. Upon further discussion with some experts on the subject, I've come to understand just how dangerous it is. Here is what I've discovered:

SSH keys are the gold standard for SSH access. SSH keys are long and complex, far more so than any username and password could be. Keys can be created for different sets of users and different levels of access, and because no secret value is ever sent to the server, SSH keys are not prone to man-in-the-middle attacks. In fact, modern SSH keys use extremely strong cryptography, making brute-force attacks computationally infeasible. SSH keys definitely have their own security challenges, but there are solutions to mitigate those risks.

Passwords, on the other hand, are subject to the human element – forgotten passwords, password reuse, simple passwords that are easily guessed – and they are susceptible to brute-force attacks. Passwords are also transmitted to the server, leaving them open to man-in-the-middle attacks.
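To get a rough sense of the scale difference, here's a back-of-the-envelope sketch in Python. The numbers are illustrative assumptions on my part: a 95-character printable-ASCII alphabet for passwords, and roughly 128 bits of security for a modern key such as Ed25519.

```python
import math

def search_space_bits(alphabet_size: int, length: int) -> float:
    """Bits an attacker must brute-force for a uniformly random string."""
    return length * math.log2(alphabet_size)

# Even a "strong" 8-character password drawn from all 95 printable
# ASCII characters offers only about 53 bits of search space.
password_bits = search_space_bits(95, 8)

# A modern SSH key (e.g. Ed25519) offers roughly 128 bits of security.
key_bits = 128

print(f"8-char password: ~{password_bits:.0f} bits")
print(f"modern SSH key:  ~{key_bits} bits")

# Each extra bit doubles the attacker's work, so the gap is astronomical:
gap = key_bits - round(password_bits)
print(f"the key is about 2^{gap} times harder to brute-force")
```

And that comparison is generous to the password: real users don't choose uniformly at random, so the effective search space is usually far smaller than the math above suggests.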

Okay, after hearing all of this I asked what turned out to be a very naive question: how likely is it that someone could crack an SSH password? The response was: it can be child's play, just Google it. I did, and I had the same reaction the sheriff in the movie Jaws did when looking at clippings of previous shark attacks – quick, everyone get out of the water!

If your company is taking this approach to managing SSH access, then I strongly advise you to make changing it a top priority. I didn't come to this conclusion without first doing the research. Having read the articles, watched the videos, investigated the software, and spoken to experts on the topic, I can confidently conclude that relying on usernames and passwords alone to control SSH access is extremely risky and a very dangerous proposition. At a minimum, it would be wise to assume that all usernames and passwords are compromised and to further restrict access through multi-factor authentication (MFA). I'm a fan of SecureAuth, but many identity and access management vendors provide that capability. One thing to remember is that enforcement is key: you can't simply use MFA on the jumphost (that won't solve the problem); MFA has to be applied to all servers. If this is going on unabated in your company, we should talk about devising a more secure and comprehensive approach to SSH security.

May the brute force NOT be with you…


This could be heaven, or this could be hell.



In late February, California's Attorney General Kamala Harris released a data breach report. The report holds that companies conducting business in the state of California must use "reasonable security procedures and practices…to protect personal information from unauthorized access, destruction, use, modification, or disclosure."

The reasonable security procedures she's referring to are essentially the SANS Top 20 security controls. However, Ms. Harris expanded on the 20 controls, emphasizing that consumers should have the option to use multi-factor authentication for system access, and that businesses should strongly encrypt all customer data. In the event of a data breach, she went on to say, the breached business should provide fraud-alert services to affected parties.

Even if you don't conduct any business in the state of California, I believe it's just a matter of time before other states follow California's example and establish similar requirements – so your business should start planning accordingly. I know, I know – add it to the list. If you were looking for an irrefutable reason to justify additional security-related project funding, this report is heaven sent. However, if your budget can't grow, re-prioritizing security initiatives will likely be hell.

What a nice surprise (what a nice surprise).  Bring your alibis…


Outside-In To Win

With very few exceptions, most established businesses operate from the inside out. The business becomes focused on what's core (manufacturing, processing, etc.) and projects that focus out to the world. What I find fascinating is that most businesses don't start out that way; complacency and a "we've always done it this way" mindset drive such behavior. The further one moves from the core of the business, the closer one gets to what really drives it, namely customers. However, it's been my experience that most established businesses inevitably fall victim to this inside-out, rather than outside-in, perspective. Of course, behaving this way leaves these "established businesses" vulnerable to competition, and newer, disruptive businesses that are hyper-focused on improving the user experience along this outer edge are forever changing customer perspectives and expectations.

Netflix is a great example of this "outside in" behavior. While established television, cable, and movie rental providers were focused "inside out", Netflix was hyper-focused on servicing the outer edge of the user experience. For Netflix, this mindset drove the company to give its customers what they wanted – on-demand entertainment, wherever and whenever they wanted it. This strategy helped propel Netflix from a nascent media provider to a fearsome entertainment powerhouse that the established providers are still struggling to contend with.

However, I would submit that the "outside in" approach doesn't require a groundbreaking idea; it could be something as simple as removing customer friction from a process. For example, I recently signed up with a financial service provider over the telephone. Overall, the process was fairly painless and positive until the representative told me that he would have to mail me several forms to sign. Although the same representative helped me pre-fill the paperwork, mailing me documents to review and sign introduced what I felt was an unnecessary speed bump in an otherwise smooth process. To add insult to injury, the paperwork took four days to show up, and it sat on my desk unopened for about a week and a half. It wasn't until I got a reminder phone call from the company that I finally got around to completing the documents and sending them back. It wasn't that I had changed my mind about using the service; I got busy with work and the service became a lower priority. The truth is, if I hadn't received that reminder call, there's a good chance the envelope would still be sitting on my desk unopened.

If the financial service provider had a greater "outside in" perspective, they likely would have used DocuSign to let me complete the entire sign-up process during the initial call with the representative. This would have saved weeks of time, removed frustration, and given me a very favorable view of the business. Instead, my perception of this business is that they are a great company, with great people, but difficult to do business with. That perception may not be a fair assessment, but since I had been exposed to DocuSign when I purchased my home, that excellent customer-focused experience, rightly or wrongly, set a new standard for me in modern-day document management.

I submit that established businesses should pay a lot more attention to their customer experience.  Examine all the clicks, telephone punches, phone transfers, and paperwork involved in the customer process and see what can be eliminated either through process re-engineering, or applied technology.  If one takes a Kaizen (continuous improvement) approach, customer satisfaction will soar, and as a result, the business will grow.  Isn’t that worth the effort?


What do the Microwave, 3D Printing and Splunk all have in common?


Q. What do the Microwave, 3D Printing and Splunk all have in common?

A. They are all technologies that introduced a fundamental paradigm shift from the conventional ways of doing things. Before the invention of the microwave oven, if I had told you that I could place food in a box, turn it on, and cook the food in a fraction of the time it normally takes, without making the box hot, you would have looked at me like I was crazy. Prior to the microwave, your reference point would have been a gas or electric range, which incorporated a heating element (fire or heated coils). You would place your food into a pre-heated oven, and the ambient heat would in turn heat your food.

If you happen to be old enough to remember the mind-blowing experience of witnessing a microwave in use for the first time, you may have thought it was some magic trick. Personally, it took me a little while to wrap my head around the technology. No pan, no pre-heating, no metal – just place the food on a plate, place the plate in the microwave, set the time, hit run, and pow – out comes hot food. Most shockingly, when you opened the door to retrieve your food, the inside of the microwave (minus the plate with your food) was cool to the touch – remarkable! The need to heat food has existed forever, but this new approach to heating and defrosting food was fast, efficient, and radically different.

3D printing is equally remarkable. Until the invention of the 3D printer, creating a product prototype or a one-off mechanical part was a manual, time-consuming, and often expensive process. If you needed to create or recreate something, you would have to hand-craft a model or prototype, create a mold, and have the part fabricated in a special facility. The introduction of the 3D printer changed all that. Now anyone outfitted with a CAD/CAM program and a 3D printer can spin up mechanical designs in just a few hours, resulting in enormous savings of manpower and time.

Splunk is like the microwave and the 3D printer in that it introduces a whole new way of accessing and leveraging machine- and human-generated data in ways that were heretofore impossible. Human-generated data is simple enough to understand: that's all the data we create by filling in information manually. But what is machine data? Here's the short answer: any electronic device with some intelligence built into it generates machine data. That includes everything from the automobile you drive and the elevators you ride, to the laptop you type on and the cell phone you talk on – in short, almost everything creates machine data.

For the most part, this machine data goes unused and is often discarded, which is a huge mistake. Here's why. Imagine your automobile engine is running erratically. You bring it to the mechanic; he or she takes it for a drive, listens to the sound, simply guesses what the problem might be, replaces several parts, and sends you on your way. On the drive back from the mechanic, you discover that the engine is still running erratically. You are out hundreds of dollars, have spent hours of your time bringing the car in, and the problem still isn't fixed.

Now, using the same example: you bring your car to the mechanic, the technician connects a handheld computer to the automobile's on-board interface, and reads the machine data (error codes) from the engine. The handheld computer interprets these codes and alerts the technician to the malfunctioning part – the repair is made, and you drive home with the problem resolved. No guesswork, no unnecessary repairs, and the problem is properly fixed the first time. That's the power of leveraging machine data.

Just imagine that you could collect this machine information from all over your company, from every device (servers, applications, databases, hardware devices, etc.), and, as in the automobile example, quickly make sense of it in real-time. How many hours of wasted activity chasing down technical problems could be eliminated? Better yet, imagine you could break down informational silos and ask questions of this amassed data to help you manage your business more efficiently. Imagine that you could easily organize and visualize all of this information in meaningful ways, with the visualizations updating in real-time. Imagine that you could set alerts, so that when the conditions you've outlined are met, notifications are sent or programs are triggered to take immediate corrective action. Now imagine a world where finding information no longer took days, weeks, or months, but only seconds or minutes. Actually, you can stop imagining: just like the microwave oven and the 3D printer, Splunk actually exists.


The inherent risk of a fixed focal point security posture


There are inherent limitations to relying upon traditional Security Information & Event Management systems, or SIEMs, that are often overlooked and that every organization must be made aware of. These limitations are: 1) a SIEM's fixed focal point, and 2) its dependency upon structured data sources.

Maintaining a fixed focal point (monitoring just a subset of data) only encourages nefarious opportunists to find vulnerabilities outside of this narrow field of vision. Any experienced security professional will tell you that all data is security relevant. However, traditional SIEMs limit their field of vision to a fixed focal point of data. To understand why this matters, let's look at an example outside of information technology that's perhaps easier to follow. Imagine for a moment that there is a string of home break-ins happening in your neighborhood. To safeguard your property, you decide to take precautionary measures. You consult a security professional, and they make several recommendations: place deadbolts on the front and back doors, reinforce the locks on the first-floor windows, and set cameras and alarm systems above the front and back doors and windows. With all this complete, you rest easier, feeling far more secure. This is what a traditional SIEM does. It takes known vulnerability points and monitors them.

Building upon this example, let's imagine that a bad person comes along, intent on breaking into your home. She cases the house, spots the cameras, and decides that the windows and doors on the first floor pose too great a risk of detection. After studying the house for a while, she finds and exploits a blind spot in your defenses. Using a coat hanger, she gains access to the home in just six seconds, through the garage, without any alarm being tripped. How can this be? Well, your security professional didn't view a closed garage door as one of your vulnerability points, so no cameras or security measures were installed there. As a result, your home has been breached, and no alarms have been triggered, since the breach occurred outside your monitored field of vision. This scenario illustrates the inherent limitation of defining the problem based upon anticipated vulnerabilities. Determined, inventive criminals will figure out attack paths that haven't been considered. That is the inherent problem with traditional SIEMs: they are designed to look only at known threats and vulnerabilities, and as a result they do little to no good alerting you to unanticipated ones.

The dependence upon structured data sources creates another serious security limitation. Traditional SIEMs store information in a relational database. The limitation of this approach is that, in order to get information from different sources into the database, users first need to define a structure for that information and then forcibly make the data adhere to it. Oftentimes, imposing this structure leads to relevant security information being left out of the process.

To illustrate why this is an issue, let's imagine that detectives are trained to look only for fingerprints when analyzing a crime scene. Their investigations totally ignore any information that isn't a fingerprint – they search for fingerprints, partial prints, and, if they are really advanced, maybe hand and foot prints. In the course of their investigation, however, they completely ignore collecting blood, hair, saliva, or other DNA-related evidence. Now, just how effective would a detective be in solving a case where the criminal wore gloves and shoes? I think everyone would agree: not very effective at all. Well, that's exactly what happens when you limit the types of data captured by force-fitting different data types into a standard database schema: running data through a schema-formatting process effectively removes lots of relevant information that could be of great help in an investigation.
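The same loss is easy to sketch in code. Here's a minimal Python illustration (the field names and values are invented for this example, not taken from any real SIEM): a fixed schema keeps only its predefined columns, while full-fidelity storage keeps the raw event intact.

```python
# The SIEM's fixed, predefined schema: only these columns survive ingestion.
SCHEMA = ("timestamp", "source_ip", "event_type")

def normalize(event: dict) -> dict:
    """Force an event into the fixed schema, silently dropping extra fields."""
    return {key: event.get(key) for key in SCHEMA}

raw_event = {
    "timestamp": "2016-12-15T08:01:22Z",
    "source_ip": "203.0.113.7",
    "event_type": "login",
    "user_agent": "curl/7.35.0",   # unusual for a login page - a real clue
    "geo": "unexpected-country",   # also security relevant, also dropped
}

structured = normalize(raw_event)
assert "user_agent" not in structured  # the investigator's clue is gone

# Full-fidelity storage simply keeps the raw event as-is, so nothing is
# discarded before anyone knows which fields will matter.
full_fidelity_store = [raw_event]
assert "user_agent" in full_fidelity_store[0]
```

The `normalize` step is the detective who only dusts for fingerprints: everything outside the schema is evidence that was never collected.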

Since all data is security relevant, to be truly effective security professionals must have the ability to collect information from all data sources in its full fidelity. Since traditional SIEMs strip out this ability, it follows that no business should rely solely upon a traditional SIEM for security – make sense?

What is needed instead is a more fluid approach to security: one that captures information from multiple sources, evaluates all known exploits, and allows you to correlate different information to uncover new potential exploits before a reportable data breach occurs. Splunk's real-time machine data platform is extremely well suited to that task.



What is the difference between Business Intelligence and Operational Intelligence?


The differences between Operational Intelligence (OI) and Business Intelligence (BI) can be confusing. Just the name Business Intelligence sounds like Nirvana. Show of hands: who doesn't want their business to be intelligent? The names are fairly ambiguous, so let's turn to Google's definitions to shed some light on their meaning:

Business intelligence, or BI, is an umbrella term that refers to a variety of software applications used to analyze an organization’s raw data. BI as a discipline is made up of several related activities, including data mining, online analytical processing, querying and reporting.

Operational intelligence (OI) is a category of real-time dynamic, business analytics that delivers visibility and insight into data, streaming events and business operations.

These definitions are helpful, but I think the picture above illustrates the differences quite clearly. Business Intelligence comes after the fact, which is illustrated by looking in the rear-view mirror of a car. Therefore, it's helpful to think of BI as a reference to where you've been, or what happened in the past. Yes, you can store information in a data mart or data warehouse, and you can "mine" that data, but that doesn't fundamentally change the fact that the information you are looking at or analyzing occurred sometime in the past.

Operational Intelligence, on the other hand, is represented in the photograph as the front windshield of the car, depicting what's happening right now, in real-time. If you spot a large pothole in the distance, OI will alert you to that fact and enable you to make a course correction to avoid ruining your alignment; BI will only let you know that you drove through a pothole, as your car wobbles down the road from all the damage.
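The contrast can be sketched in a few lines of Python. This is a toy example with invented latency numbers, not a real BI or OI product: the BI function summarizes stored history after the fact, while the OI function evaluates each event as it arrives.

```python
# A stream of per-minute transaction latencies, in milliseconds (invented data).
stream = [120, 130, 125, 900, 135]
THRESHOLD_MS = 500

def bi_report(history):
    """Business Intelligence: analyze the stored history after the fact."""
    return {"avg_ms": sum(history) / len(history), "max_ms": max(history)}

def oi_monitor(events):
    """Operational Intelligence: check each event as it arrives; alert now."""
    alerts = []
    for minute, latency in enumerate(events):
        if latency > THRESHOLD_MS:
            alerts.append(f"minute {minute}: {latency}ms exceeds {THRESHOLD_MS}ms")
    return alerts

print(bi_report(stream))   # tells you later that the average was dragged up
print(oi_monitor(stream))  # would have fired the moment the pothole appeared
```

Same data, two very different moments of insight: the report is the rear-view mirror, the monitor is the windshield.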

Most businesses have the potential to leverage Operational Intelligence for competitive gain, but many are still stuck in the past with traditional BI tools. If you want to really crank up your business, I say it’s time to get real-time and discover what a paradigm shift of moving to OI can do for your business.


What is machine data and how can you harness it?

Let me begin this post by describing what machine data is. Machine data is essentially log file information. When engineers build systems (hardware or software), they usually incorporate some element of log file capture into the design, for two main reasons: first, for troubleshooting purposes, and second, as a backup in case something unintended happens with the primary system.

As a result, almost every electronic device and software program generates this "machine data". It's fairly safe to say that most things we interact with on a day-to-day basis capture it. For example, our automobiles, cell phones, ATMs, EZ-Pass transponders, electric meters, laptops, TVs, online activity, servers, storage devices, pacemakers, elevators, etc. all generate and locally store machine data in one form or another. When we call the mechanic about a "check engine" warning light and they ask us to bring the car into the shop so they can hook it up to the computer and diagnose the problem, we are leveraging machine data. What the mechanic is really doing is accessing the machine data stored in our automobile to identify error codes or anomalies that help pinpoint a mechanical problem. And the proverbial "black box" that is so crucial to explaining why an airplane crashed also leverages machine data.

So, if machine data is everywhere, how come we never heard much about it?

In a word, it's difficult. Since machine data comes in lots of different shapes and sizes, collecting and analyzing this information across lots of different sources is a difficult proposition. Going back to the car example, information collected from the different sensors is all fed into one collection point. The engineers building the automobile are able to dictate requirements to component manufacturers about the shape, format, and frequency of collection of all this machine data. Since they design and build the entire process, they are able to correlate and present this information in a way that is useful to mechanics troubleshooting a car problem.

However, if we look at an enterprise IT infrastructure, this same collaboration and integration doesn't exist. A typical enterprise will have lots of unrelated components: load balancers, web servers, application servers, operating systems, PCs, storage devices, multiple sites (on premise and in the cloud), virtual environments, mobile devices, card readers, etc. So, depending upon the size and scale of the business, there could be lots and lots of machines generating this data. I personally work with some customers whose server counts are measured in the tens of thousands.

Within the enterprise, no universal format for machine data exists. This fact creates an enormous challenge for any enterprise looking to unlock the value of machine data. That, combined with the variety, volume, and variability of the data, can be downright overwhelming. As a result, enterprises collect the information in silos and resort to an old-school, brute-force approach to analyzing it only when necessary. If a system or process is failing, a team is assembled from the various IT departments to engage in what can best be compared to an IT scavenger hunt: manually poring through log files, comparing cause and effect across other log files throughout the network. This whole process is so labor-intensive and time-consuming that if the problem is only intermittent, a decision may be made to abandon identifying the root cause altogether.

Let's go back to the car example. Imagine that we bring our car to the mechanic, but instead of simply hooking a computer up to a command-and-control sensor, the mechanic has to connect to and analyze hundreds of different data points on the automobile, comparing all the available data against other data in the hope of finding the problem. To further build on this point, let's suppose that our automobile emits an annoying screech at 35 mph. We've had the car in the shop three times already for the same problem and have spent hundreds of dollars, all to no avail. Eventually, we come to accept the screech as the new normal and turn the radio up when approaching 35 mph.

There has to be a better way!

Let's think about this for a minute: what would be needed to get the most value out of this machine data? Well, if we tried to structure the information by storing it in a database using a schema, we wouldn't be able to account for the variety of the data. No, we'll need a way to store information in an unstructured format. Next, we'll need a way to get all the different devices to send their data to our unstructured storage in real-time. Building connectors would be too expensive and difficult to maintain, so what we'll need is a way to simply forward this machine data, in any format, to our unstructured storage. Next, we'll need to be able to search the data, but how can we do that if it's totally unstructured? To do that, we'll need some way to catalog all the data. Since the value of data rises exponentially in relation to corresponding information, we'll also need some way to correlate information across different data types, but how? So we start to think: what's common across all these different data types? Eureka! We discover that the date and time that something happened are present within all this data. We'll also need a way to extract information from all this data; otherwise, what's the point of doing all this in the first place? Hmm, since the data has no structure, creating reports with a traditional BI tool won't work; besides, reports are too rigid for the complex questions we will likely be asking of our data. Lastly, we'll need to address the issue of scale and performance. Whatever we design has to be able to bring in massive amounts of data, in real-time, because knowing what's happening right now across everything we run in our enterprise is way more interesting and valuable than what happened last week.
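The ingredients described above (unstructured storage, forward-anything ingestion, time as the universal correlation key, and free-text search) can be sketched in miniature. This is a toy Python illustration of the idea only, not how Splunk is actually implemented, and the log formats are invented:

```python
import re
from datetime import datetime

# Two wildly different "machine data" formats, arriving as raw text:
raw_lines = [
    "2016-11-02 14:03:55 web01 GET /login 200",        # a web server log
    'sensor=42 temp=98F time="2016-11-02 14:04:10"',   # device telemetry
]

# Time is the one field common to nearly all machine data.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")

def index(line: str) -> dict:
    """Keep the event unstructured; extract only its timestamp for cataloging."""
    match = TIMESTAMP.search(line)
    when = datetime.strptime(match.group(), "%Y-%m-%d %H:%M:%S") if match else None
    return {"_time": when, "_raw": line}

# "Forward" every line as-is, then order events on the common time axis.
events = sorted((index(line) for line in raw_lines), key=lambda e: e["_time"])

def search(term: str) -> list:
    """Free-text search across all events, regardless of original format."""
    return [e for e in events if term in e["_raw"]]

# Both sources, correlated on time, with no schema ever defined:
print([e["_raw"] for e in search("2016-11-02")])
```

No schema was defined up front, yet the web log and the telemetry line end up searchable together on a shared timeline, which is the essence of the design sketched in the paragraph above.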

Well, we can continue to ponder ways to solve all these technical challenges, or we can just opt to use Splunk, whose brilliant engineers seemed to have totally nailed it.


End-to-End Operational Control & Application Management – Do you have it?



So much has been written about the US Government’s Health Insurance Exchange that I’m almost afraid to mention it.  For this posting, I’m going to stay out of the political fray and avoid rendering any opinion about whether we should, or should not have the Affordable Care Act, aka Obamacare.  Instead, I would like to discuss the Challenge of the Health Insurance Exchange strictly from an IT perspective.

The US Government has spent approximately $400 million, and counting, on the current system. So far, the system has been down more often than it's been operational. Secretary Kathleen Sebelius is on the defensive, and she's been called before Congress to testify about how she spent the money, what went wrong, and how she plans to fix it. On top of that, her boss, the President of the United States, has been forced to acknowledge the problems to the American public. You get the idea – the site is a train wreck. What we've discovered is that the project was rushed, the supporting technology was dated, the systems are vastly more complex than originally thought, and nothing works as advertised.

Hypothetically speaking, how would you solve these technical problems if you were Sebelius? Bear in mind, you have to change the proverbial tires on the bus while it's driving down the road. Well, I've actually given this some thought. Throwing the whole thing out and starting from scratch isn't an option – it would take too long, and you have the President of the United States, Congress, and the US public breathing down your neck. No, about the only thing you could do in the short term is identify, isolate, and repair the glitches. The trouble is, a single transaction spans multiple systems and technologies. What's needed is the ability to trace a transaction end-to-end in order to ferret out and address the problems. Stabilize and fix what you can, and replace what you must. Once stabilized, you can test and upgrade fragile components. All this sounds great, but without end-to-end visibility and a single pane of glass to identify problems, you wouldn't know where to start.

I'm quite proud of the fact that I work for a software company that has actually solved this problem. In fact, my employer (Splunk) offers the only machine data platform that I'm aware of that can provide this level of visibility and insight across heterogeneous environments in real-time. If you simply Splunk it, find it, and fix it, you'll quickly get a handle on what you need to fix and in what priority.

Are you able to quickly identify and isolate technology problems across all your environments?

