IDS Blocks Cloudflare
More accurately, my intrusion detection systems try to block Cloudflare.
I've been using Fail2ban to monitor my servers' logs and inject IP blocks when it notices bad activity. It scans the configured log files, and when entries from one IP match one or more regular expressions a configured number of times within a configured time window, it creates a "ban" that keeps that IP from connecting to the configured port or ports for a configured amount of time. I have rules to watch SSH logins, IMAP/POP3 logins, mail spamming, and web services. Each server runs its own fail2ban and watches its own log files. It works, but has the limitation that only the server that notices the abuse gets the block.
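To make the match-count-in-a-window idea concrete, here is a minimal Python sketch of that logic. It is not how fail2ban itself is written; the regular expression, the thresholds, and the `BanTracker` class are all made up for illustration, with the constants loosely named after fail2ban's `maxretry`, `findtime`, and `bantime` settings.

```python
import re
from collections import defaultdict, deque

# Hypothetical pattern and thresholds, chosen only for illustration.
FAIL_RE = re.compile(r"Failed password for .* from (?P<ip>\d+\.\d+\.\d+\.\d+)")
MAX_RETRY = 3      # matches needed to trigger a ban
FIND_TIME = 600    # seconds in the counting window
BAN_TIME = 3600    # seconds a ban lasts


class BanTracker:
    """Toy model of a log watcher: count matches per IP in a sliding window."""

    def __init__(self):
        self.hits = defaultdict(deque)  # ip -> timestamps of recent matches
        self.banned_until = {}          # ip -> time at which the ban expires

    def is_banned(self, ip, now):
        return self.banned_until.get(ip, 0) > now

    def feed(self, timestamp, line):
        """Process one log line; return the IP if this line triggers a ban."""
        match = FAIL_RE.search(line)
        if not match:
            return None
        ip = match.group("ip")
        window = self.hits[ip]
        window.append(timestamp)
        # Forget matches that have fallen out of the counting window.
        while window and window[0] <= timestamp - FIND_TIME:
            window.popleft()
        if len(window) >= MAX_RETRY and not self.is_banned(ip, timestamp):
            self.banned_until[ip] = timestamp + BAN_TIME
            window.clear()
            return ip
        return None
```

Feeding it three matching lines inside the window returns the offending IP on the third, and `is_banned` turns false again once the ban time has passed, which mirrors the ban being removed on timeout.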
A couple of years ago, I added Crowdsec as another mechanism. Unlike fail2ban, Crowdsec offers a distributed approach, with a centralized service on my network and an agent on each server that needs protection (not all the servers are Internet accessible). The agents report suspected abuse to the centralized service, which in turn tells the other connected agents so they also block the offender. Additionally, my centralized service occasionally shares its information with Crowdsec, and receives updates from them about the suspected abuse others have seen.
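The report-and-share flow can be pictured with a toy model like the one below. This is only a sketch of the idea, not Crowdsec's API or data model; the `Hub` and `Agent` classes and their methods are hypothetical names.

```python
class Agent:
    """Hypothetical stand-in for the agent running on one protected server."""

    def __init__(self, name):
        self.name = name
        self.blocked = set()

    def apply_block(self, ip):
        self.blocked.add(ip)


class Hub:
    """Hypothetical stand-in for the centralized service on the network."""

    def __init__(self):
        self.agents = []
        self.decisions = {}  # ip -> name of the agent that reported it

    def register(self, agent):
        self.agents.append(agent)
        # A newly connected agent catches up on decisions already made.
        for ip in self.decisions:
            agent.apply_block(ip)

    def report(self, reporter, ip):
        # One agent notices abuse; every connected agent gets the block.
        self.decisions.setdefault(ip, reporter.name)
        for agent in self.agents:
            agent.apply_block(ip)
```

With a `web` and an `ssh` agent registered, a report from `web` leaves the IP in `ssh.blocked` as well, which is the difference from each server only protecting itself.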
For my network, fail2ban suffices for the mail server, as there is just the one exposed server. Crowdsec helps with the SSH and web monitoring, helping protect all the connected servers whenever one of them notices an abuse attempt.
That part is all well and good. But today I noticed entries in the IDS logs showing blocks being added for Cloudflare addresses.
I use Cloudflare to protect my servers from most direct access, leveraging their CDN services, and hiding my servers' IP addresses with their DNS proxies. When you look up my servers (like jekewa.com), you get a bevy of Cloudflare addresses instead of my address. Ultimately I'd like to only allow the CDN addresses (they give a list), and a few others, and block the rest of the world from seeing the servers at all. For now, obfuscation works, even though the servers can be found by IP scanning.
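The "only allow the CDN addresses and a few others" check boils down to a CIDR membership test, which can be sketched in Python as follows. The ranges below are a small sample that I believe appears in Cloudflare's published IPv4 list; they are incomplete and may be out of date, so a real setup should load the current list instead. The `is_allowed` helper and its `extra_allowed` parameter are hypothetical.

```python
import ipaddress

# Sample only: a few ranges believed to be in Cloudflare's published list.
# Incomplete and possibly stale; load the real list in practice.
CLOUDFLARE_SAMPLE = [
    ipaddress.ip_network("173.245.48.0/20"),
    ipaddress.ip_network("104.16.0.0/13"),
    ipaddress.ip_network("172.64.0.0/13"),
    ipaddress.ip_network("162.158.0.0/15"),
]


def is_allowed(ip, extra_allowed=()):
    """Return True if ip is in a CDN range or one of the extra allowed networks."""
    addr = ipaddress.ip_address(ip)
    networks = CLOUDFLARE_SAMPLE + [ipaddress.ip_network(n) for n in extra_allowed]
    return any(addr in net for net in networks)
```

For example, `is_allowed("203.0.113.5")` is false, while `is_allowed("203.0.113.5", ["203.0.113.0/24"])` is true because that network was added as one of the "few others".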
When an abuser reaches the site directly by trying the IP, and trips one or more of the configured rules, they get appropriately banned, and the hits stop in the logs very shortly after (not instantly, I imagine, because the access logs are sometimes written late, or connections are already in progress). That's great, and how it's supposed to work. Which IDS creates the block kind of depends on which sees it first; either one will stop the traffic, so both don't necessarily create blocks.
However, when an abuser reaches the site through Cloudflare, the connection IP is one from Cloudflare. I do capture the original remote IP, as Cloudflare passes that along, but that isn't what the IDS sees. I considered tinkering with the rules to pull the origin IPs from the logs, or to change the way the logs are written so the remote IPs are written in the right place. But as I thought more, I realized it wouldn't work anyway. The log entry for the remote IP comes after the web access is complete. Further, if I did block the remote IP, the connections would still happen because they aren't coming from those IPs!
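Pulling the origin IP out of a log line might look something like this sketch. It assumes a log format I made up for the example: a combined-style access log line with the origin IP (for instance the value Cloudflare sends in its `CF-Connecting-IP` header) appended as a final quoted field, or `-` when there is none. The `effective_ip` function is hypothetical, and as noted above, knowing the origin IP doesn't by itself stop traffic that arrives through Cloudflare.

```python
import ipaddress
import re

# Assumed format: "<peer> ... "<origin or ->"" with the origin as the last field.
LINE_RE = re.compile(r'^(?P<peer>\S+) .* "(?P<origin>[^"]*)"$')


def effective_ip(line):
    """Return the origin IP if the line carries a valid one, else the peer IP."""
    match = LINE_RE.match(line.strip())
    if not match:
        return None
    origin = match.group("origin")
    try:
        # Only trust the field if it parses as an IP address.
        ipaddress.ip_address(origin)
    except ValueError:
        return match.group("peer")
    return origin
```

A line logged for a request through the CDN would yield the visitor's address, while a line for a direct hit (with `-` in the last field) falls back to the connecting address.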
I poked a little bit, and Cloudflare does offer IP-based protection, but it doesn't seem so trivial to attach to the IDS I'm using, and has limits that might be hard to avoid or manage, given how much bad traffic hits the servers.
So, what happens now is that each IDS correctly identifies the abuse attempt but incorrectly identifies the source, and creates a ban for the Cloudflare IP from which the traffic connected; later the ban is removed when the timeout expires. I have the Cloudflare IPs in my firewall whitelist, and those connections are allowed before the IDS ban is checked, so the rule created by the IDS (really an entry in the appropriate ipset list) is never hit.
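The reason the ban is never hit is rule ordering: the first matching rule wins. Here is a small Python model of that, not real firewall configuration; the `evaluate` function, the rule layout, and the addresses are all hypothetical.

```python
import ipaddress


def evaluate(src, rules, default="accept"):
    """Return the action of the first rule whose networks contain src."""
    addr = ipaddress.ip_address(src)
    for action, networks in rules:
        if any(addr in ipaddress.ip_network(net) for net in networks):
            return action
    return default


# Hypothetical data: a CDN range in the whitelist, and two IDS bans,
# one of which is a CDN address that got banned by mistake.
whitelist = ["172.64.0.0/13"]
ids_bans = ["172.64.1.1", "203.0.113.9"]

# Whitelist first, as in my setup: the mistaken ban is shadowed.
rules = [("accept", whitelist), ("drop", ids_bans)]
```

Here `evaluate("172.64.1.1", rules)` gives `accept` even though that address is in the ban set, while the direct abuser at `203.0.113.9` is still dropped. Reversing the rule order would make the mistaken ban take effect.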
I think I might rewrite the log pattern to put the origin IP in the right spot, so at least the block will be there should that IP find my server directly. It would also put the "real" IP in the right spot for the analytics, so I won't have to have the analyzer parse the logs differently. Currently there are "host" and "vhost" logs written, depending on whether that remote IP information is available. That's a different thing, though.