Ethical Hacking for Programmers: Subdomain Enumeration

What is Subdomain Enumeration?

Before exploring subdomain enumeration, it’s important to understand domains. A domain of the form example.com serves as an organization’s main address on the web. Organizations also use subdomains to organize different services, such as:

  • academy.example.com – Online learning
  • blog.example.com – Blog section
  • product.example.com – Product page
  • admin.example.com – Internal admin panel

These subdomains represent various assets, some of which may become outdated, forgotten, or misconfigured, creating security risks. Subdomain enumeration helps us uncover such assets, revealing hidden services, development environments, or internal tools.
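
To make the relationship between a domain and its subdomains concrete, here is a minimal sketch of a check we might use when filtering results. The helper name is_subdomain_of is hypothetical and not part of any library; it only compares hostnames as text and does no DNS lookup.

```python
def is_subdomain_of(host, domain):
    """Return True if host sits strictly below domain (hypothetical helper)."""
    # Hostnames are case-insensitive and may carry a trailing root dot
    host = host.lower().rstrip(".")
    domain = domain.lower().rstrip(".")
    # Require a leading dot so "notexample.com" does not match "example.com"
    return host != domain and host.endswith("." + domain)


if __name__ == "__main__":
    for host in ["blog.example.com", "example.com", "notexample.com"]:
        print(host, is_subdomain_of(host, "example.com"))
```

Matching on "." + domain rather than the bare domain keeps lookalike names out of scope.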


Objectives of Subdomain Enumeration

Subdomain enumeration helps security professionals identify hidden assets. Many past breaches originated from misconfigured or forgotten assets, subdomains included. For example, in the breach Uber disclosed in 2017, attackers accessed a private GitHub repository and used credentials stored there to reach the company’s AWS environment, leading to compromised user data.

By mapping out subdomains, organizations can:

  • Discover Hidden Content: Identify forgotten web applications or services.
  • Assess Security Risks: Detect vulnerable or misconfigured subdomains.
  • Map Infrastructure: Understand how an organization structures its web services.
  • Ensure Monitoring & Compliance: Track and manage exposed subdomains to prevent security incidents.

Existing Tools and Techniques in Subdomain Enumeration

1. Passive Enumeration

  • Gathers data from public sources like search engines, Certificate Transparency (CT) logs, and WHOIS records.
  • Google Dorking: Uses advanced queries to find indexed subdomains.
    • Example Dorks:
      • site:example.com -www (Finds subdomains of example.com)
      • inurl:admin site:example.com (Finds admin portals)
  • Tools: theHarvester, recon-ng, crt.sh
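
The example dorks above can also be generated in code. The sketch below assumes a hypothetical helper, build_dork, that starts from a site: query and excludes hosts we already know about with -site:, so that repeated searches surface new names. It only builds the query string and does not contact Google.

```python
def build_dork(domain, known_hosts=()):
    """Build a site: query for domain that excludes hosts already discovered."""
    terms = [f"site:{domain}"]
    # Sort the exclusions so the query string is stable between runs
    terms += [f"-site:{host}" for host in sorted(set(known_hosts))]
    return " ".join(terms)


if __name__ == "__main__":
    print(build_dork("example.com", ["www.example.com", "blog.example.com"]))
    # prints: site:example.com -site:blog.example.com -site:www.example.com
```

Feeding each round of results back in as known_hosts is one way to dig past the first few pages of well-known subdomains.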

2. Active Enumeration

Involves directly interacting with the target’s DNS servers, which may log your queries and source IP address.

[DISCLAIMER: DON’T TRY THIS WITHOUT AUTHORIZATION]

  • Queries DNS resolvers and uses brute-force techniques with predefined wordlists.
  • Tools: sublist3r, amass, subfinder

Programmatic Way of Subdomain Enumeration

Crt.sh search
Crt.sh is a public search engine for Certificate Transparency logs. When you query a domain, crt.sh returns the SSL/TLS certificates issued for that domain, including those for associated subdomains. It pulls data from the public Certificate Transparency logs, which record the certificates submitted to them since their inception.

We can use the crt.sh website to search for certificates and get subdomain information from each certificate’s common name.

Below is a Python script that does it programmatically.

#!/usr/bin/python3
import requests


def fetch_certificates(domain):
    """Query crt.sh for certificates matching the domain; return parsed JSON or None."""
    url = "https://crt.sh/"
    params = {"q": domain, "output": "json"}
    headers = {"User-Agent": "Mozilla/5.0"}  # Set a User-Agent to avoid blocking
    try:
        # A timeout keeps the script from hanging if crt.sh is slow to answer
        response = requests.get(url, params=params, headers=headers, timeout=30)
        response.raise_for_status()  # Raise an error for HTTP error codes
        return response.json()
    except (requests.exceptions.RequestException, ValueError) as e:
        # ValueError covers a non-JSON body on older versions of requests
        print(f"Error fetching data: {e}")
        return None


if __name__ == "__main__":
    domain = input("Enter the domain to search certificates for: ").strip()
    certificates = fetch_certificates(domain)
    if certificates:
        # Collect the distinct common names, skipping entries without one
        common_names = {cert["common_name"] for cert in certificates if cert.get("common_name")}
        print(f"Subdomains: {common_names}")
    else:
        print("No certificates found or an error occurred.")
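
The common name is only one place a hostname can appear. The sketch below is a possible post-processing step, under the assumption that each crt.sh JSON entry also carries a name_value field holding newline-separated names, some of them wildcards such as *.example.com. The function name extract_hostnames is hypothetical; it works on already-fetched data and makes no network calls.

```python
def extract_hostnames(certificates, domain):
    """Collect in-scope hostnames from crt.sh-style entries (hypothetical helper)."""
    domain = domain.lower()
    hosts = set()
    for cert in certificates:
        for field in ("common_name", "name_value"):
            value = cert.get(field) or ""  # the field may be missing or null
            for name in value.splitlines():
                name = name.strip().lower()
                if "@" in name:
                    continue  # skip email addresses that can appear among the names
                if name.startswith("*."):
                    name = name[2:]  # a wildcard entry still reveals its parent name
                if name == domain or name.endswith("." + domain):
                    hosts.add(name)
    return hosts


if __name__ == "__main__":
    sample = [
        {"common_name": "example.com", "name_value": "*.example.com\nblog.example.com"},
        {"common_name": None, "name_value": "admin.example.com\nother.org"},
    ]
    print(sorted(extract_hostnames(sample, "example.com")))
```

Filtering on the target domain drops unrelated names that share a certificate with it.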

Google dork
Google Dorking is a technique that leverages advanced Google search operators to discover publicly available information, including subdomains of a target domain. By using Google search queries, we can extract subdomains indexed by Google without directly probing the target server.

When a website has multiple subdomains, Google often indexes them, making them accessible through search queries. We can use Google Dorks like:
site:*.{domain} -site:www.{domain}

Here is a sample Python script that uses the googlesearch module for dorking.

#!/usr/bin/python3
from urllib.parse import urlparse

from googlesearch import search


def google_dork_subdomains(domain):
    """Collect hostnames from Google results for the domain; return a set, or None on error."""
    query = f"site:*.{domain} -site:www.{domain}"
    subdomains = set()

    try:
        results = search(query, num_results=100, unique=True)
        for result in results:
            # Take the host part of each result URL; skip results without one
            host = urlparse(result).hostname
            if not host:
                continue
            if host.startswith("www."):
                host = host[4:]
            subdomains.add(host)

        return subdomains

    except Exception as e:
        print(f"Error during Google Dorking: {e}")
        return None


if __name__ == "__main__":
    domain = input("Enter the domain name: ").strip()
    subdomains = google_dork_subdomains(domain)
    if subdomains:
        print(f"subdomains: {subdomains}")
    else:
        print("No subdomains found or an error occurred.")

Bruteforce DNS

Brute-force DNS enumeration is a technique used to discover subdomains by attempting to resolve a list of potential subdomains against a domain’s DNS records. This method is useful when subdomains are not indexed by search engines or available in public certificate logs. It requires a predefined wordlist; we will use one from SecLists for our example.

The script below uses dnspython to perform brute-force DNS resolution on a target domain:

import dns.resolver

def bruteforce_dns_subdomains(domain, wordlist):
    try:
        with open(wordlist, "r") as file:
            # Skip blank lines so we never build a name like ".example.com"
            subdomains_list = [line.strip() for line in file if line.strip()]
    except FileNotFoundError:
        print(f"Error: {wordlist} not found")
        return None

    subdomains = set()
    resolver = dns.resolver.Resolver()

    for subdomain in subdomains_list:
        full_domain = f"{subdomain}.{domain}"
        try:
            resolver.resolve(full_domain, "A")  # Query DNS A records; raises if none exist
            subdomains.add(full_domain)
        except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer, dns.resolver.Timeout):
            continue
        except Exception as e:
            print(f"Error resolving {full_domain}: {e}")

    return subdomains

if __name__ == "__main__":
    domain = input("Enter the target domain: ")
    wordlist = "subdomains-top1million-5000.txt"  # Using SecLists wordlist
    found_subdomains = bruteforce_dns_subdomains(domain, wordlist)
    
    if found_subdomains:
        print(f"Discovered subdomains for {domain}:")
        for sub in found_subdomains:
            print(sub)
    else:
        print("No subdomains found.")
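
One caveat with brute forcing is wildcard DNS: if the target answers for any label, every word in the list appears to resolve. The sketch below shows one possible way to detect and filter that, assuming hypothetical helpers named detect_wildcard and filter_wildcard_hits that take a resolver callable, so the logic can be tried without touching the network. The dns_resolve_a function is an optional dnspython-backed resolver and imports the library only when called.

```python
import secrets


def detect_wildcard(domain, resolve_a, probes=3):
    """Resolve random labels; any IPs returned suggest a wildcard record."""
    wildcard_ips = set()
    for _ in range(probes):
        label = secrets.token_hex(8)  # a random label that should not really exist
        wildcard_ips |= set(resolve_a(f"{label}.{domain}"))
    return wildcard_ips


def filter_wildcard_hits(hits, wildcard_ips):
    """Keep hosts that resolve to at least one IP outside the wildcard set."""
    return {host: ips for host, ips in hits.items() if set(ips) - wildcard_ips}


def dns_resolve_a(hostname):
    """Return the set of A record addresses for hostname, or an empty set."""
    import dns.resolver  # dnspython; imported lazily so the helpers above work without it
    try:
        return {rr.address for rr in dns.resolver.resolve(hostname, "A")}
    except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer,
            dns.resolver.NoNameservers, dns.resolver.Timeout):
        return set()
```

A typical use would be to call detect_wildcard(domain, dns_resolve_a) once before the brute-force loop, record the addresses for each hit, and pass both to filter_wildcard_hits. This is a heuristic: a real host that shares the wildcard address would be filtered out as well.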

CHALLENGE YOURSELF

The best way to learn is to challenge yourself. I challenge you to build a complete subdomain enumeration tool that integrates the code above, so that you get the best of both hacking and programming at the same time.
Here is a sample implementation from my side for the subdomain enumeration: https://www.github.com/pwn-security/sdenum
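
As a possible starting point for the challenge, the sketch below merges results from several techniques and records which source reported each host. The name combine_sources and the shape of the sources dictionary are assumptions for illustration and are not taken from the repository linked above. Each source is a callable that takes the domain and returns hostnames, or None on failure, as the functions in this post do.

```python
def combine_sources(domain, sources):
    """Run each source and map every in-scope host to the sources that found it."""
    domain = domain.lower()
    found = {}
    for name, fetch in sources.items():
        try:
            hosts = fetch(domain) or ()  # treat None (an error) as no results
        except Exception as exc:
            print(f"[{name}] failed: {exc}")
            continue
        for host in hosts:
            host = host.strip().lower().rstrip(".")
            if host.endswith("." + domain):  # keep only subdomains of the target
                found.setdefault(host, set()).add(name)
    return found


if __name__ == "__main__":
    # Stand-in sources; a real tool would plug in the functions from this post
    demo = {
        "ct": lambda d: ["Blog.example.com", "example.com"],
        "dork": lambda d: {"blog.example.com", "shop.example.com"},
    }
    for host, seen_by in sorted(combine_sources("example.com", demo).items()):
        print(host, sorted(seen_by))
```

Tracking the source for each host makes it easier to judge confidence: a name seen in both certificate logs and DNS is more likely to be live than one seen only once.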


Conclusion

Subdomain enumeration is essential for developers, security researchers, and ethical hackers. Understanding manual and automated techniques enhances security awareness and infrastructure visibility. Future posts will cover additional hacking techniques programmers can leverage for security research.

Stay tuned for the next blog post in this Ethical Hacking for Programmers series!
