SSH Honeypots & DataViz
I put up a handful of small servers with SSH honeypots running, and have been watching who tries to break in. I didn’t publicize the addresses, or point any DNS at them, but they almost immediately got found by hackers across the globe. Here’s a visualization and analysis of the data so far.
What’s a Honeypot? A honeypot is a fake service that runs on a standard port, waiting for people to try to break in. There are lots of different kinds, covering different protocols and interaction levels. I’m using kippo, which is a medium-interaction (or “research”) honeypot. If someone spends enough time guessing, they’ll eventually get in, and once in they are in a protected area where all their actions are recorded for future analysis. This yields a treasure trove of data, including patterns of who is trying to break in, how they are going about it, and what they do if they are successful. For a production environment, honeypots may still be a good idea, but a low-interaction honeypot would be more appropriate (meaning people can attack it all they want, but will never get into anything).
The Data As of Feb 6, 2013, I have 5 servers running, across two hosting providers (AWS and Digital Ocean). I started them at various times, all around Feb 3, 2013. The data is shown here on a globe, with markers where attacks have originated. Because the boxes were started at various times, and the results are aggregated, the locations of the markers are more relevant than their magnitudes. That said, you can also see the data in table form.
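Since the boxes came online at different times, raw counts from each one are not directly comparable. One possible way to compare them is to divide each count by the hours that server has been up. This is a minimal sketch, not how the globe was built; the function name `attacks_per_hour` and any numbers you feed it are hypothetical.

```python
from datetime import datetime


def attacks_per_hour(attack_count, started_at, observed_at):
    """Return attacks per hour of uptime for one server.

    Hypothetical helper: attack_count is the number of login attempts
    seen between started_at and observed_at (both datetime objects).
    """
    hours_up = (observed_at - started_at).total_seconds() / 3600
    if hours_up <= 0:
        raise ValueError("observed_at must be later than started_at")
    return attack_count / hours_up
```

A server that logged 120 attempts over exactly one day would come out at 5 attempts per hour, which can then be compared against a server that has only been up for a few hours.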
Why use a Honeypot? Honeypots help you gain insight into a wave of attacks that is constantly going on. For instance, there’s a big difference between seeing login attempts for normal system accounts (root, guest, etc.) and seeing login attempts for specific employees (john.smith). In the latter case, it may be an indication that you are being targeted specifically, rather than just randomly attacked, and you may want to react differently. You can also get some of this information without a honeypot (for instance, Ubuntu machines record login failures in /var/log/auth.log).
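As a rough illustration of the system-account versus specific-user distinction, here is a small sketch that tallies failed SSH logins from auth.log-style lines. It assumes the usual OpenSSH "Failed password for ... from ..." message format; the `GENERIC_ACCOUNTS` set and the function name `summarize_failures` are hypothetical, and you would want to tune the account list for your own environment.

```python
import re
from collections import Counter

# Matches OpenSSH failure lines such as:
#   "... sshd[123]: Failed password for root from 203.0.113.5 port 4242 ssh2"
#   "... sshd[123]: Failed password for invalid user bob from 203.0.113.5 port 4242 ssh2"
FAILED_LOGIN = re.compile(r"Failed password for (?:invalid user )?(\S+) from (\S+)")

# Hypothetical list of generic account names that untargeted scans tend to try.
GENERIC_ACCOUNTS = {"root", "guest", "admin", "test", "oracle", "ubuntu"}


def summarize_failures(lines):
    """Count failed logins, split into generic and specific usernames."""
    generic, specific = Counter(), Counter()
    for line in lines:
        match = FAILED_LOGIN.search(line)
        if match is None:
            continue  # not a failed-password line
        user = match.group(1)
        if user in GENERIC_ACCOUNTS:
            generic[user] += 1
        else:
            specific[user] += 1
    return generic, specific
```

To run it against a real log you could pass an open file, for example `summarize_failures(open("/var/log/auth.log"))`. Usernames that land in the second counter and look like real people in your organization are the ones worth a closer look.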
Should I run a honeypot? As with any service you expose, running a honeypot increases your attack surface on the Internet, so it’s not necessarily a good idea on any particular server. The servers I’m running to collect this data are all single-purpose boxes, with no valuable or sensitive information, and if the boxes actually did get compromised, it wouldn’t be a big deal (they are AWS and Digital Ocean instances). For a box containing data or services you care about, while I wouldn’t use a research (e.g. medium- or high-interaction) honeypot, I could see an argument for a low-interaction one. Given that your servers are probably being attacked daily, it might be safer to move your SSH port and put a honeypot on the default port, so it can handle the malicious activity. Only moving your SSH port (and leaving port 22 unused) is another option, but in the case where you’re being specifically targeted, having the default port closed would probably just prompt the attacker to port scan you to find the service.
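To make the low-interaction idea concrete, here is a minimal sketch of a listener that presents an SSH-looking banner, records who connected and what they sent first, and then hangs up without ever granting access. This is not how kippo works and is not a hardened tool; the banner string and the function name `handle_one_connection` are assumptions made for illustration.

```python
import socket

# Hypothetical banner; an SSH identification string ends with CR LF.
BANNER = b"SSH-2.0-OpenSSH_5.9\r\n"


def handle_one_connection(listener, log):
    """Accept one IPv4 connection, record it in log, then close it.

    listener is an already bound and listening TCP socket; log is a list
    that receives one (source_ip, first_bytes) tuple per connection.
    """
    conn, (ip, _port) = listener.accept()
    with conn:
        conn.settimeout(5.0)  # do not let a silent client hold us open
        try:
            conn.sendall(BANNER)
            data = conn.recv(256)  # capture only the client's first message
        except OSError:  # includes timeouts and connection resets
            data = b""
    log.append((ip, data))
```

In practice you would call this in a loop on a socket bound to the default SSH port, which normally requires elevated privileges, and write the log somewhere durable rather than keeping it in a list.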
What Next? I’d like to bring up other honeypots (on the same servers probably), and compare the relative frequency of attacks against different protocols (I suspect HTTP(S) is the most attacked).