Web Enumeration

Overview

Websites are their own beast when it comes to enumeration. There are countless combinations of ports, web server configurations, and applications, and any one of them could be the weakness that leads to compromise of the host.

Firewalls

Public-facing infrastructure should be hardened against attack. As an attacker, checking whether a web application firewall (WAF) sits in front of the target is a good first step.

# detect firewall (stops after a signature match)
wafw00f https://target.xyz

# full signature detection
wafw00f -a https://target.xyz
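
Part of WAF fingerprinting is passive: some products add recognizable headers to every response. The sketch below illustrates that idea in Python. The `HEADER_HINTS` table and the `guess_waf` function are hypothetical names with a small hand-picked set of headers. This is not wafw00f's signature set, which is far broader and also sends probe requests.

```python
# Illustrative header hints only; NOT wafw00f's signature database.
HEADER_HINTS = {
    "cf-ray": "Cloudflare",
    "x-sucuri-id": "Sucuri",
    "x-iinfo": "Imperva Incapsula",
}


def guess_waf(headers):
    """Return the sorted product names hinted at by response header names."""
    names = {name.lower() for name in headers}
    return sorted({product for key, product in HEADER_HINTS.items() if key in names})


# Headers as they might come back from a first request (sample values).
sample = {"Server": "cloudflare", "CF-RAY": "0000000000000000-LHR"}
print(guess_waf(sample))
```

An empty result only means none of these hints matched. It does not prove that no firewall is present.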

VHOST Enumeration

Virtual hosts, or VHOSTs, allow a single server to host multiple websites, and therefore multiple subdomains, from one IP. This is configured in the web server, such as Apache, Nginx or IIS. There are 3 methods for vhosting:

Name-Based
- uses the HTTP Host header to determine where to direct the traffic. This needs only a single IP and port on the server
- tends to be the most common as it is easy to set up and cost-effective
IP-Based
- each website has its own unique IP, even if it's hosted on the same server
- tends to be more complex to set up, but offers better isolation between websites
Port-Based
- each website has its own unique port on the same IP
- again more complex, and not as common

# brute-force name-based vhosts via the Host header; --append-domain appends the base domain to each wordlist entry
gobuster vhost -u http://<url> -w <wordlist> --append-domain -o <output_file>
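
To make the name-based technique concrete, here is a minimal Python sketch of what a vhost brute-force does: build candidate Host values from a wordlist and keep those whose response differs from a known-bad baseline. The function names, the `fetch` callback, and the fake responses are assumptions for illustration. A real run would send each request to the same IP with only the Host header changed.

```python
def vhost_candidates(words, base_domain):
    """Build Host header values as word + '.' + base domain, skipping blanks."""
    return [f"{w.strip()}.{base_domain}" for w in words if w.strip()]


def find_vhosts(candidates, fetch, baseline_host):
    """Return hosts whose response differs from the baseline response.

    fetch(host) must return (status_code, body_length) for a request sent
    to the target IP with the Host header set to `host`.
    """
    baseline = fetch(baseline_host)
    return [host for host in candidates if fetch(host) != baseline]


def fake_fetch(host):
    # Stand-in for a real HTTP client: only dev.target.xyz is a real vhost.
    return (200, 5120) if host == "dev.target.xyz" else (302, 0)


hosts = vhost_candidates(["dev", "www", ""], "target.xyz")
print(find_vhosts(hosts, fake_fetch, baseline_host="doesnotexist.target.xyz"))
```

Comparing against a baseline matters because many servers answer unknown Host values with a default site rather than an error.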

Crawling

Crawling (or spidering) is an automated process that builds a map of a website. It starts at the homepage, collects every link on that page, then follows those links recursively. Search engines crawl the same way to build the indexes behind their search results.

Types

Breadth-First Crawling
- top-down approach that works one level at a time
- if the starting page has 5 links, it crawls those 5 pages first, then goes one layer deeper and crawls every link found on them
Depth-First Crawling
- follows the first link it sees as deep as possible, then backtracks to the most recent page with unvisited links and repeats
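
The difference between the two strategies is easiest to see in the visit order. The sketch below crawls a small hypothetical site map (the `SITE` dictionary is made up for illustration, and no network requests are made) using a queue for breadth-first and a stack for depth-first.

```python
from collections import deque

# Hypothetical site map: page -> links found on that page.
SITE = {
    "/": ["/a", "/b"],
    "/a": ["/a/1", "/a/2"],
    "/b": ["/b/1"],
    "/a/1": [],
    "/a/2": [],
    "/b/1": [],
}


def crawl_bfs(start, links):
    """Visit pages level by level using a FIFO queue."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        page = queue.popleft()
        order.append(page)
        for nxt in links.get(page, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


def crawl_dfs(start, links):
    """Follow the first link as deep as possible using a LIFO stack."""
    seen, order, stack = set(), [], [start]
    while stack:
        page = stack.pop()
        if page in seen:
            continue
        seen.add(page)
        order.append(page)
        # Push in reverse so the first link on the page is followed first.
        stack.extend(reversed(links.get(page, [])))
    return order


print(crawl_bfs("/", SITE))  # / then /a, /b, then /a/1, /a/2, /b/1
print(crawl_dfs("/", SITE))  # / then /a, /a/1, /a/2, then /b, /b/1
```

Both functions track visited pages, which stops a real crawl from looping forever when pages link back to each other.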

Scrapy is a Python package that can be used to build a custom site scraper.

# scrapy runspider quotes_spider.py -o quotes.jsonl
import scrapy


class QuotesSpider(scrapy.Spider):
    name = "quotes"
    start_urls = [
        "https://quotes.toscrape.com/tag/humor/",
    ]

    def parse(self, response):
        for quote in response.css("div.quote"):
            yield {
                "author": quote.xpath("span/small/text()").get(),
                "text": quote.css("span.text::text").get(),
            }

        next_page = response.css('li.next a::attr("href")').get()
        if next_page is not None:
            yield response.follow(next_page, self.parse)

Source and Directory Browsing

Viewing the page source (CTRL+U) or watching requests in the browser's network tab may reveal further information:
- look for notes in comments
- look at other links and assets to see if those directories can be browsed to manually
- record the folders listed in any local assets (images, css, js). These are good for checking whether directory browsing is open, or for feeding to gobuster
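
The same checks can be scripted once the page source is saved. The sketch below uses only Python's standard library to collect HTML comments and the directories of locally hosted assets. The `SourceNotes` class name and the sample HTML are hypothetical.

```python
import posixpath
from html.parser import HTMLParser
from urllib.parse import urlparse


class SourceNotes(HTMLParser):
    """Collect HTML comments and the directories of local src/href assets."""

    def __init__(self):
        super().__init__()
        self.comments = []
        self.directories = set()

    def handle_comment(self, data):
        self.comments.append(data.strip())

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value:
                parsed = urlparse(value)
                if parsed.netloc:  # skip assets hosted on another domain
                    continue
                directory = posixpath.dirname(parsed.path)
                if directory not in ("", "/"):
                    self.directories.add(directory + "/")


# Made-up page source for demonstration.
SAMPLE_HTML = """
<!-- TODO: remove /backup before go-live -->
<link rel="stylesheet" href="/assets/css/site.css">
<img src="/uploads/2024/logo.png">
<script src="https://cdn.example.com/lib.js"></script>
<a href="about.html">About</a>
"""

notes = SourceNotes()
notes.feed(SAMPLE_HTML)
print(notes.comments)
print(sorted(notes.directories))
```

Each collected directory is a candidate to request directly in the browser, or to use as a starting point for a directory scan.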

Directory Enumeration with gobuster

# scan https, skip SSL cert validation with -k, save to output -o
$ gobuster dir -u https://target.pwn -k -w /usr/share/wordlists/seclists/Discovery/Web-Content/big.txt -o 2_gobuster_dirs_https.txt

# scan http, spoof a legitimate browser user-agent with -a, save to output -o
$ gobuster dir -u http://target.pwn -w /usr/share/wordlists/seclists/Discovery/Web-Content/big.txt -o 2_gobuster_dirs_http.txt -a 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/149.0.0.0 Safari/537.36 Edg/149.0.0.0'

# scan http, use a random user-agent with --random-agent, save to output -o
$ gobuster dir -u http://target.pwn -w /usr/share/wordlists/seclists/Discovery/Web-Content/big.txt -o 2_gobuster_dirs_http.txt --random-agent

Fuzzing

Fuzzing means picking an input (form field, header, URL parameter) and sending it a list of payloads while monitoring how the application responds.

# send a directory traversal wordlist to a URL param to test for file inclusion; --exclude-length hides responses of the baseline sizes
$ gobuster fuzz -u 'http://target.xyz?file.php=FUZZ' -c "PHPSESSID=tigdi51ib0jj3u07h4710k8av5" -w /usr/share/seclists/Fuzzing/LFI/LFI-LFISuite-pathtotest.txt --exclude-length 1031,1032
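
The core loop of this kind of fuzzing is small: substitute each payload for the FUZZ marker, send the request, and hide responses whose length matches the known baseline. Below is a minimal Python sketch of that loop. The `fuzz` function, the `fetch` callback, and the fake responses are assumptions for illustration and make no real requests.

```python
def fuzz(url_template, payloads, fetch, exclude_lengths=()):
    """Substitute each payload for FUZZ and keep responses of unusual length.

    fetch(url) must return (status_code, body) for the given URL.
    """
    hits = []
    for payload in payloads:
        url = url_template.replace("FUZZ", payload)
        status, body = fetch(url)
        if len(body) not in exclude_lengths:
            hits.append((payload, status, len(body)))
    return hits


def fake_fetch(url):
    # Stand-in for a real HTTP client: a 1031-byte page unless the payload lands.
    if url.endswith("../../etc/passwd"):
        return 200, "root:x:0:0:root:/root:/bin/bash\n"
    return 200, "x" * 1031


hits = fuzz(
    "http://target.xyz?file.php=FUZZ",
    ["index", "../../etc/passwd"],
    fake_fetch,
    exclude_lengths={1031, 1032},
)
print(hits)
```

Filtering on length works when the application returns the same page for every failed attempt, so it helps to send a few known-bad values first to learn the baseline sizes.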

Manual Fuzzing

If an application returns different errors for different inputs, you can also send hand-picked values and watch how the error changes. For example, from the Nineveh box I completed on HackTheBox:

Example of manual fuzzing of a URL endpoint and monitoring the returned error to determine how it works
| Path                         | Result                                                 |
|------------------------------|--------------------------------------------------------|
|?notes=files/ninevehNotes.txt | ✅ loads the file                                      |
|?notes=files/ninevehNotes     | ⚠️ throws the include() error that file doesn't exist  |
|?notes=test/ninevehNotes      | ⚠️ throws the include() error that file doesn't exist  |
|?notes=secret/ninevehNotes    | ⚠️ throws the include() error that file doesn't exist  |
|?notes=/ninevehNotes          | ⚠️ throws the include() error that file doesn't exist  |
|?notes=files/ninevehNote      | ❌ No Note is selected                                 |
|?notes=ninevehNotes           | ❌ No Note is selected                                 |
Solution: the parameter value must contain `/ninevehNotes` for include() to be called. The rest of the path does not appear to be validated.

robots.txt

robots.txt is a standard file that tells a given user-agent (typically a crawler) which parts of the site it may crawl. Its Disallow lines may point to secrets or other interesting locations.
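
Pulling the Disallow paths out of a downloaded robots.txt is a quick way to build a list of locations to visit. Below is a minimal Python sketch that uses plain string handling. The `disallowed_paths` function name and the sample file contents are hypothetical.

```python
def disallowed_paths(robots_txt):
    """Return the unique Disallow paths in file order, ignoring comments."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        field, sep, value = line.partition(":")
        path = value.strip()
        if sep and field.strip().lower() == "disallow" and path:
            if path not in paths:
                paths.append(path)
    return paths


# Made-up robots.txt for demonstration.
SAMPLE_ROBOTS = """
User-agent: *
Disallow: /admin/
Disallow: /backup/  # old site copy
Disallow:
Allow: /public/

User-agent: Googlebot
Disallow: /admin/
"""

print(disallowed_paths(SAMPLE_ROBOTS))
```

An empty `Disallow:` line means nothing is blocked for that user-agent, so the sketch skips it.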

Well-Known URLs

RFC 8615 defines the /.well-known/ path prefix, a directory on a website that holds site-wide metadata. The full list of registered URIs is maintained in the IANA Well-Known URIs registry. For example, /.well-known/openid-configuration describes how the site's OpenID Connect (OAuth 2.0) provider is configured.
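
Checking these locations is simple to script. The Python sketch below builds candidate URLs for a short, hand-picked list of well-known names, then pulls the endpoint fields out of an openid-configuration document. The function names, the two-entry `WELL_KNOWN` list, and the sample JSON are assumptions for illustration. The sketch makes no requests.

```python
import json
from urllib.parse import urljoin

# Hand-picked examples, not the full IANA registry.
WELL_KNOWN = ["openid-configuration", "security.txt"]


def well_known_urls(base_url, names=WELL_KNOWN):
    """Build /.well-known/ URLs at the site root, whatever path base_url has."""
    return [urljoin(base_url, f"/.well-known/{name}") for name in names]


def oidc_endpoints(document):
    """Extract the issuer, key set, and *_endpoint URLs from discovery JSON."""
    data = json.loads(document)
    return {
        key: value
        for key, value in data.items()
        if key.endswith("_endpoint") or key in ("issuer", "jwks_uri")
    }


# Made-up discovery document for demonstration.
SAMPLE_DOC = json.dumps({
    "issuer": "https://target.xyz",
    "authorization_endpoint": "https://target.xyz/oauth2/authorize",
    "token_endpoint": "https://target.xyz/oauth2/token",
    "scopes_supported": ["openid", "profile"],
})

print(well_known_urls("https://target.xyz/app/login"))
print(oidc_endpoints(SAMPLE_DOC))
```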