In this article I’m going to be talking about the first stage of a penetration test after getting clearance and scope: gathering information and planning an attack. I’ll be starting from the basics so you don’t need to have much knowledge or experience with the process, but you will need to be comfortable with Kali Linux and the command line interface. We’ll cover passive information gathering, port scanning, and service enumeration. Along the way we’ll learn about using tools like Nmap as well as a variety of open-source intelligence gathering techniques. Let’s jump right in!
Passive Information Gathering
When we talk about information gathering, we’re talking about gathering information. Stunning revelation, I know. It might sound boring compared to things like exploitation and data exfiltration, but it is arguably the most important step of a penetration test, aside from getting appropriate permission and scope of course. Personally, I have a lot of fun in this stage because you get to feel like a little cyber detective, and who doesn’t want that? Broadly speaking, this phase is divided into two parts: passive and active information gathering, also known as recon. Passive info gathering means using open-source intelligence (OSINT) such as domain registrations, social media, Google, Shodan, etc. to learn about the target without touching their network or systems directly. In contrast, active info gathering covers things like port or vulnerability scanning, which use tools that make network connections to the target in order to collect specific information about it. Let’s discuss each of these in more depth and talk about some common ways of doing them.
Common Tools and Techniques
You’re probably going to be surprised just how much you can find out about your target using free, public, and legal methods that literally anyone can access. Google is a great place to start. One of the cool things it can do is use certain search parameters (often called search operators) to help define a specific focus and automatically sift through different types of results. One of the simplest ways to begin narrowing your search for information this way is through the “site” parameter.
Let’s just say for the sake of it that you’re doing a pentest for the really small retail chain Walmart. You could simply search in Google for “site:walmart.com” to get an index of the pages in the walmart.com domain that Google’s spiders have access to. Try this out yourself with whatever domain you want to see how it looks in practice.

This may not seem extra exciting, but even a simple search parameter like this can help you find specific information and, in this case, finding out all or most of a company’s subdomains could be a big asset later when trying to find an attack vector. Also, did you know you can chain different parameters together to get as specific as you want? Let’s look at another example to see how this could be helpful. Sticking with Walmart, let’s say we wanted to look for the presence of aspx scripts to do some testing on. By appending the search parameter “filetype:aspx” to the site search, giving “site:walmart.com filetype:aspx”, we can narrow the results down to just the aspx pages on our specific domain.
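To make the chaining idea concrete, here is a minimal sketch of how such a query could be assembled. The helper names build_dork and search_url are made up for this illustration, and the code only builds strings; it does not contact Google.

```python
from urllib.parse import urlencode


def build_dork(site, **operators):
    """Combine a site: restriction with any other operator:value pairs."""
    parts = [f"site:{site}"]
    parts += [f"{name}:{value}" for name, value in operators.items()]
    return " ".join(parts)


def search_url(query):
    """Return a Google search URL for the query (nothing is fetched here)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_dork("walmart.com", filetype="aspx")
print(query)
print(search_url(query))
```

Pasting the printed query into the Google search box does the same thing; the helper just makes it easy to generate many variations.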

If nothing else, doing searches like this for file type extensions can tell you what server-side scripting languages are in use which could be important later on in the test. What would be really cool is if there was some kind of database full of Google search parameters that you could use to help find whatever it is you’re looking for. Fortunately, it does exist and it’s called the Google Hacking Database. This is a database maintained under exploit-db by Offensive Security that includes tons and tons of formulas to use based on a lot of different vulnerabilities and other info gathering needs. You can find it at https://www.exploit-db.com/google-hacking-database and you should probably spend some time checking out what it can do.
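As a rough illustration of the file extension idea, the sketch below tallies which server-side technologies a list of URLs hints at. The URLs are invented for the example, and the extension table is a small, incomplete set of well-known associations rather than anything authoritative.

```python
from collections import Counter
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Well-known extension-to-technology associations; extend as needed.
EXTENSION_HINTS = {
    ".aspx": "ASP.NET",
    ".asp": "Classic ASP",
    ".php": "PHP",
    ".jsp": "Java (JSP)",
}


def technology_hints(urls):
    """Count which server-side technologies the URL extensions point to."""
    hints = Counter()
    for url in urls:
        # urlparse() drops the query string; suffix is the final extension.
        suffix = PurePosixPath(urlparse(url).path).suffix.lower()
        if suffix in EXTENSION_HINTS:
            hints[EXTENSION_HINTS[suffix]] += 1
    return hints


# Hypothetical search results, made up for illustration.
results = [
    "https://www.example.com/account/Login.aspx",
    "https://www.example.com/store/locator.aspx?zip=12345",
    "https://blog.example.com/index.php",
]
print(technology_hints(results))
```

Treat the result as a hint only: extensions can be rewritten or hidden by the web server.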
Another pretty awesome OSINT resource is the website Netcraft. Simply doing a quick search of your target domain through the site report tool will return a LOT of information including IP addresses, DNS information, a WHOIS lookup, and more.

This kind of information is all publicly available and helps you as a pentester get a better understanding of the target’s internet presence and external network. You can also use Netcraft’s searchdns tool to do something similar to what we did before on Google and find a whole lot of subdomains along with useful information about them.

Though there are other ways to obtain this information, Netcraft presents it in a manageable way and really streamlines the process for you. You can find this tool at https://searchdns.netcraft.com and the site report tool at https://toolbar.netcraft.com/site_report if you’d like to try them out.
No article on passive information gathering and OSINT is complete without mentioning one of the most popular tools for the job: Recon-ng. A full discussion on its use is beyond the scope of this article but keep reading.
Recon-ng is “a full-featured web reconnaissance framework written in Python”, according to the tool’s creators at Black Hills Information Security. If you’re familiar with the wildly popular Metasploit Framework, you can think of this tool as sort of the OSINT version of it. It functions as a one-stop resource for all kinds of passive information gathering, including some of what I’ve already discussed. I STRONGLY recommend you check the tool out and get comfortable with it. The website is here at https://bitbucket.org/LaNMaSteR53/recon-ng/src/master/. This tool comes standard with Kali Linux so no extra installation steps are necessary if you’re using that distro for your pentesting needs. I could sit here and ramble on about all the different things this tool can do, but like most of this field, trying it out yourself will do you a lot more good.
With all of this talk about fancy tools and techniques, it’s easy to forget that sometimes the simplest route is the most useful. Looking to do a phishing campaign as part of a social-engineering pentest? Why not check out the company’s LinkedIn to find employees or executives, and then look up those employees on Facebook or Twitter to help you craft a better phishing email by understanding their interests and writing style? Maybe an employee has a username that they also happen to use elsewhere on the web, which may lead you to even more info. The possibilities are pretty much endless with OSINT, and the right combination of creativity and simplicity could lead to rich rewards.
Active Information Gathering
Typically, the active information gathering stage is going to occur after doing passive reconnaissance. This is the stage where you are digging in deeper to identify specific targets and research potential attack vectors. The key difference here is that active gathering involves making connections, scanning, and other techniques that directly touch the network resources of the target. It is important to understand that anything done in this stage (and after) very likely requires explicit permission! Do NOT run port scans or anything else in this section on a real-world target without appropriate scope and permission. If you don’t know what you’re doing, you could even cause a DoS of network resources and that would be considered an attack. An illegal one. Ideally, you’ve already got a home lab set up anyway so now’s a good time to use that.
One more thing before we get going. Keeping track of what you’ve found and taking good notes is essential. There are a ton of options out there for note taking but I’ve come to rely almost exclusively on https://pentest.ws which is a pretty cool browser-based application geared specifically toward pentesting. It also includes a bunch of other really cool features that help cut out some legwork. I would recommend giving it a try.
Port Scanning
This is often the first step of the active information gathering phase and relies on making connections and searching for open ports on the target system/network in order to help determine the attack surface. By far the most popular tool for the job is Nmap so I’ll be discussing it below.
Nmap, and its GUI-based sister Zenmap, come installed by default with Kali Linux so no steps are required to get it going other than opening a terminal window and smashing the right keys on your keyboard. I’m going to show you what those keys are now. The most basic form of the command, and enough to illustrate Nmap’s syntax, is simply “nmap” followed by the IP address or hostname of the target.

When you run this command, Nmap will send TCP SYN probes to its default set of the 1,000 most common ports and report back which ones are open along with their associated services. (The SYN scan is the default when Nmap runs with root privileges; without them it falls back to a full TCP connect scan.) In this basic scan, that’s about all you’re going to get. Fortunately for us, this doesn’t even scratch the surface of what Nmap is capable of. A more typical, albeit still relatively basic, command adds the “-sV” and “-O” switches in front of the target.

The “-sV” and “-O” switches will enumerate the versions of detected services and attempt to discern the target operating system, respectively. The OS detection is usually pretty accurate but there are many ways it can be thrown off, even intentionally, and give you the wrong info. For that reason, it shouldn’t be the only source of that information if you’re able to find another reliable path to the truth elsewhere. Something interesting to note is that if you scan a specific port or range with the “-p” switch, Nmap may tell you that the port is filtered which likely indicates the presence of a firewall. Also keep in mind that just because a port is open it doesn’t mean it’s vulnerable to attack. That, however, is a discussion for another time, aka the next article I’m going to post about initial exploitation…
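The switches discussed above can be sketched as a small command builder. This is only an illustration: the function name nmap_command and the lab address 192.168.56.101 are assumptions made up for the example, and the code builds the argument list without running anything.

```python
import shlex


def nmap_command(target, ports=None, service_versions=False, os_detection=False):
    """Build an Nmap argument list; nothing is executed here."""
    args = ["nmap"]
    if service_versions:
        args.append("-sV")      # probe open ports for service/version info
    if os_detection:
        args.append("-O")       # OS fingerprinting (needs root privileges)
    if ports:
        args += ["-p", ports]   # e.g. "80,443" or "1-1024"
    args.append(target)
    return args


# 192.168.56.101 is a hypothetical lab machine; only scan hosts you are
# authorized to test.
cmd = nmap_command("192.168.56.101", ports="22,80,443",
                   service_versions=True, os_detection=True)
print(shlex.join(cmd))
```

The printed line can be pasted into a terminal, or the list can be handed to subprocess.run once you are pointing it at your own lab.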
Although a straightforward TCP scan of a target server is always good, Nmap can also scan UDP ports with the “-sU” switch. Just like before, you can narrow this down to specific ports or let it run its default list of common ports. Because UDP is connectionless and many services simply don’t reply to an empty probe, these scans take significantly longer and can produce unreliable results, but they still shouldn’t be ignored. TCP and UDP scans can be done in one fell swoop as well, just by combining the appropriate switches (for example, “-sS” together with “-sU”).
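One place the unreliability shows up is in the port states Nmap reports: a UDP port that never answers is listed as “open|filtered” because Nmap can’t tell the two apart. The sketch below groups ports by state from the port table of Nmap’s normal output. The sample text is hand-written for illustration, not captured from a real scan, and ports_by_state is a made-up helper name.

```python
import re
from collections import defaultdict

# Matches rows such as "161/udp open|filtered snmp" in Nmap's port table.
PORT_LINE = re.compile(r"^(\d+)/(tcp|udp)\s+(\S+)\s+(\S+)")


def ports_by_state(nmap_output):
    """Group "port/protocol" entries by the state Nmap reported for them."""
    grouped = defaultdict(list)
    for line in nmap_output.splitlines():
        match = PORT_LINE.match(line.strip())
        if match:
            port, proto, state, _service = match.groups()
            grouped[state].append(f"{port}/{proto}")
    return dict(grouped)


# Illustrative output only; a real scan will look different.
SAMPLE = """
PORT    STATE         SERVICE
22/tcp  open          ssh
80/tcp  open          http
443/tcp filtered      https
53/udp  open          domain
161/udp open|filtered snmp
"""

for state, ports in ports_by_state(SAMPLE).items():
    print(state, ports)
```

Anything that lands in the “open|filtered” bucket deserves a second look with another tool or a service-specific probe before you rule it in or out.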
The Nmap Scripting Engine, or NSE, is a very powerful and extensible feature of Nmap that can do a lot of useful things. With it, you can do things like check for specific known vulnerabilities, enumerate SMB, check for anonymous FTP login, and so much more. Because of the vastness, I’m going to leave an in-depth discussion for a future article but for now, read through the documentation on NSE to get an idea of how to use it and all the things that can be done.
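As a small taste, NSE scripts are selected with Nmap’s “--script” option. The sketch below builds such a command for the anonymous FTP check using the ftp-anon script. The helper name nse_command and the lab address are assumptions for the example, and nothing is executed.

```python
def nse_command(target, scripts, ports=None):
    """Build an Nmap command that runs the named NSE scripts."""
    args = ["nmap", "--script", ",".join(scripts)]
    if ports:
        args += ["-p", ports]
    args.append(target)
    return args


# Hypothetical lab host; ftp-anon checks whether anonymous FTP login works.
cmd = nse_command("192.168.56.101", ["ftp-anon"], ports="21")
print(" ".join(cmd))
```

Several scripts can be passed at once because “--script” accepts a comma-separated list.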
You can and should read a whole lot more about Nmap at the official website https://nmap.org/ and the NSE at https://nmap.org/book/nse.html . In the future I’ll be writing an entire article doing a deep-dive on this tool as well so keep your eye out for that!
Domain Name System (DNS)
Though you should already have a solid understanding of general networking before reading this article, I’ll pause for a very quick refresher. DNS is a system that maps domain names to IP addresses and can translate back and forth between the two. Anytime you browse to a website by name, you’re using DNS to translate that name to the server’s IP address, which is what you, the client, are actually connecting to.
So, how is this useful for information gathering? Well, DNS servers often contain a lot of information that can lead to a better understanding of a network and its domains, including things like email servers. The more we understand a target network’s structure, the better prepared we are to attack it. We’ll be using the “nslookup” command in the Linux terminal to get started. There are basically three main tasks that you will be using to enumerate a DNS server: forward lookup, reverse lookup, and zone transfers. A forward lookup takes a domain name and returns its IP address, while a reverse lookup takes an IP address and returns the name associated with it.

As you can tell, these lookups let you find the IP address behind a domain or the domain behind an IP address, depending on which one you already have. Really pause to think about how you could use this type of information to your advantage. If the front door is locked, maybe there’s more than one door into the building?
DNS zone transfers, if enabled, can be a very useful thing for malicious computer geeks. Essentially, a zone transfer replicates the zone’s records from a master DNS server to its slave servers, and those records are a treasure trove of the hostnames within the given zone. This is useful for administrators, but a poorly configured, exposed server that allows transfers to anyone can leak all of this info to an attacker. If a server permits zone transfers to arbitrary hosts, you will get back a juicy list of other hostnames in the zone.
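For readers who would rather script these lookups than type them into nslookup, here is a minimal sketch using only Python’s standard library. The function names are made up for the example. The first two functions perform live DNS queries when called, so the demo line only calls the third, which works offline and shows the special name that a reverse lookup actually asks for.

```python
import ipaddress
import socket


def forward_lookup(hostname):
    """Name -> list of IPv4 addresses (performs a live DNS query)."""
    _name, _aliases, addresses = socket.gethostbyname_ex(hostname)
    return addresses


def reverse_lookup(ip):
    """IP address -> primary hostname (performs a live PTR query)."""
    hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
    return hostname


def ptr_record_name(ip):
    """The name a reverse lookup queries, computed without any network use."""
    return ipaddress.ip_address(ip).reverse_pointer


# 192.0.2.10 comes from a range reserved for documentation.
print(ptr_record_name("192.0.2.10"))
```

Running reverse_lookup across a small range you are authorized to test is a simple way to surface hostnames that never showed up in your passive recon.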
You can try out zone transfers for yourself using zonetransfer.me which is a project by DigiNinja to help students and administrators understand this potential vulnerability.
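If you want to attempt a transfer by hand, one common route is the dig utility, which this article does not otherwise cover, so treat this as a substitute for the nslookup approach rather than the method used here. The sketch below only builds the two commands involved. The function names and the example.com values are placeholders, and nothing is executed.

```python
def nameserver_query_command(domain):
    """Step 1: ask which name servers are authoritative for the domain."""
    return ["dig", "+short", domain, "NS"]


def zone_transfer_command(domain, nameserver):
    """Step 2: request a full zone transfer (AXFR) from one of them."""
    return ["dig", f"@{nameserver}", domain, "AXFR"]


# Placeholder values; substitute a domain you are allowed to test.
print(" ".join(nameserver_query_command("example.com")))
print(" ".join(zone_transfer_command("example.com", "ns1.example.com")))
```

A well-configured server will refuse the transfer; a misconfigured one will answer with every record in the zone.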
Before we move on, I want to mention a popular tool called “DNSrecon” which comes standard with Kali. This is a fun Python script that helps you with DNS enumeration, including automating zone transfers, and presents information in a very readable format. Check out the tool’s help or man pages for more info on its use.
Conclusion
Although this is a fairly brief introduction to the topic of information gathering, hopefully it’s been enough to get you started and, more importantly, to get your gears turning about ways you might be able to use things like OSINT and port scanning to your advantage during a penetration test. A fun way to practice this type of thing is to do some digging on yourself. What can you learn about yourself with open-source intelligence? Well guess what, so can an attacker. It’s this mindset that will make you a better pentester and keep you safer in our new online world.
Resources
https://www.exploit-db.com/google-hacking-database
https://toolbar.netcraft.com/site_report
https://searchdns.netcraft.com