What is it?

OSINT, or Open Source Intelligence, is the practice of passively gathering information from publicly available online sources, without directly targeting an organization's infrastructure.

Methodology

  • Visit the organization's main website and note references to technologies, services and structures to compile a list of what the organization may be using
    • The site's SSL/TLS certificate may list other subdomains covered by the same certificate
  • Certificate Transparency logs can identify other subdomains: https://crt.sh/
  • Grayhatwarfare can identify open cloud buckets and perform keyword searches
  • Job postings for an organization can reveal technologies that may be in use
    • Specifically, job postings for technical employees (network team, IT, software dev, cyber) can contain targeted keywords that reveal the organization's architecture (e.g. 5+ years of Splunk)
  • WHOIS Lookups - searching for domain registration information
  • Web archives, like the Wayback Machine, capture periodic snapshots of websites that can reveal information that is no longer present on the live site
  • Social media can be searched to identify employees of a specific organization
  • Source code repositories (GitHub, GitLab, etc.) may expose secrets or misconfigured repositories that are publicly accessible
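
Two of the bullets above (certificate names and web archives) can be sketched as small shell helpers. This is a minimal sketch, not a definitive workflow: the function names san_names and wayback_cdx_url are made up for illustration, target.xyz is a placeholder, and the usage lines assume openssl and curl are installed.

```shell
# Print the DNS names found in a certificate's Subject Alternative Name field.
# Expects the text form of a certificate (openssl x509 -noout -text) on stdin.
san_names() {
  grep -o 'DNS:[^,[:space:]]*' | cut -d: -f2 | sort -u
}

# Build a Wayback Machine CDX API query that lists archived URLs under a domain.
# fl=original keeps only the URL column; collapse=urlkey drops repeat captures.
wayback_cdx_url() {
  printf 'https://web.archive.org/cdx/search/cdx?url=%s/*&fl=original&collapse=urlkey\n' "$1"
}

# Usage (placeholder host, run manually):
# openssl s_client -connect target.xyz:443 -servername target.xyz </dev/null 2>/dev/null \
#   | openssl x509 -noout -text | san_names
# curl -s "$(wayback_cdx_url target.xyz)" | sort -u > archived_urls.txt
```

Wildcard entries such as *.target.xyz are printed as-is, so they still need to be expanded or stripped before being fed to other tools.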

Domain / Subdomain Enumeration

# search registration of domain / IP
whois <domain>
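
WHOIS output is long, and field names vary by registry. As a rough sketch, the helper below (whois_summary is a made-up name) keeps only a few commonly seen fields. The field list is an assumption and may need adjusting for a given TLD.

```shell
# Keep a handful of commonly seen WHOIS fields from whois output on stdin.
# The field names are assumptions; registries label these differently.
whois_summary() {
  grep -iE '^[[:space:]]*(registrar|creation date|registry expiry date|registrant organization|name server):' \
    | sed 's/^[[:space:]]*//'
}

# Usage (run manually):
# whois <domain> | whois_summary
```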

# find subdomains via certificate transparency logs
# replace domain.com
curl -s "https://crt.sh/json?q=domain.com" | jq . | grep name | cut -d":" -f2 | grep -v "CN=" | cut -d'"' -f2 | awk '{gsub(/\\n/,"\n");}1;' | sort -u >> subdomainlist.txt

# resolve each name in subdomainlist.txt and collect the unique IPs
for i in $(cat subdomainlist.txt);do host $i | grep "has address" | grep domain.com | cut -d" " -f4;done | sort -u > ip_addresses.txt

# cross-reference IPs against shodan
for i in $(cat ip_addresses.txt);do shodan host $i;done
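
Certificate Transparency results often contain wildcard entries (*.domain.com), mixed-case names and names outside the target domain, all of which get in the way of the resolution loop. The helper below is one possible cleanup step; clean_subdomains is a hypothetical name, and the output file in the usage line is only an example.

```shell
# Normalize certificate names read from stdin:
#   - lowercase everything and strip a leading "*." wildcard label
#   - keep only names equal to, or ending in ".", the given domain
#   - deduplicate the result
clean_subdomains() {
  # Escape dots so the domain is matched literally in the regex below.
  domain_re=$(printf '%s' "$1" | sed 's/\./\\./g')
  tr '[:upper:]' '[:lower:]' \
    | sed 's/^\*\.//' \
    | grep -E "(^|\.)${domain_re}\$" \
    | sort -u
}

# Usage (run manually):
# clean_subdomains domain.com < subdomainlist.txt > subdomainlist.clean.txt
```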

# brute force subdomains with dnsenum (here against an internal DNS server)
# add -r to recurse into discovered subdomains and look for further subdomains
# --dnsserver sends the queries to a specific DNS server, e.g. an internal one
dnsenum --dnsserver 10.1.1.10 --enum -p 0 -s 0 -o subdomains.txt -f <wordlist> domain.com

# subdomain brute forcing via bash for loop
for sub in $(cat <wordlist>);do dig $sub.domain.com @10.1.1.10 | grep -v ';\|SOA' | sed -r '/^\s*$/d' | grep $sub | tee -a subdomains.txt;done
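
Before trusting brute-force results, it can help to check whether the zone has a wildcard record, since then every guessed name appears to resolve. The following is a minimal sketch of that check. has_wildcard_dns and the probe label are made up for illustration, and 10.1.1.10 is the same example server used in the commands here.

```shell
# Succeed (exit 0) if a random, almost certainly nonexistent label resolves,
# which suggests a wildcard record that would make every guess look valid.
has_wildcard_dns() {
  domain="$1"
  server="$2"
  # Hypothetical probe label; $$ (the shell's PID) only adds some uniqueness.
  probe="zz-nonexistent-$$.${domain}"
  answer=$(dig +short "$probe" A "@${server}")
  [ -n "$answer" ]
}

# Usage (run manually):
# if has_wildcard_dns domain.com 10.1.1.10; then
#   echo "wildcard DNS detected, filter out the wildcard IP from results"
# fi
```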

Google Dorking

Google dorking uses special search operators to find very specific results. You can combine multiple search filters with AND or OR, use * as a wildcard, and put - before a search parameter to negate it.

Here's a full cheatsheet from GitHub. Below are some key commands that can be used for OSINT.

# find sites with directory browsing enabled 
intext:"index of /"

# find web.config files
filetype:config inurl:web.config

# find publicly accessible confidential documents on a specific site
ext:(doc | pdf | xls | txt | ps | rtf | odt | sxw | psw | ppt | pps | xml) (intext:confidential | intext:"account number") site:target.xyz

# find emails for a specific site
site:target.xyz @target.xyz

# operators
site:       # limit results to a certain site
inurl:      # find pages with a specific keyword in the url
filetype:   # find specific file types like pdf, docx, etc.
intitle:    # find pages with a specific keyword in the title
intext:     # find pages with a specific keyword in the body
inbody:     # Bing equivalent of intext: (keyword in the body)
cache:      # return a cached version of the site
link:       # find pages that link to this webpage
related:    # find websites similar to a webpage
info:       # provides a summary of info about a page
define:     # define a word or phrase
numrange:   # search for numbers in a specific range
allintext:  # search for multiple keywords all on same page
allinurl:   # search for multiple keywords all in same url
AND         # require both filters to be true
OR          # require either filter to be true
NOT         # exclude filter
*           # wildcard for any character or word
..          # number range ex 1..50
""          # exact string match
-           # negate ex -inurl:sports
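
Operators can be combined mechanically. As a small sketch, the helper below (site_files_dork is a made-up name) builds a query that limits results to one site and matches any of several file extensions, in the same ext:(a | b) style used in the examples here.

```shell
# Compose a dork limited to one site that matches any of the given extensions.
# Usage: site_files_dork <site> <ext> [<ext>...]
site_files_dork() {
  site="$1"
  shift
  exts=""
  for e in "$@"; do
    # Join extensions with " | ", which acts as OR inside the parentheses.
    exts="${exts:+$exts | }$e"
  done
  printf 'site:%s ext:(%s)\n' "$site" "$exts"
}

# Usage:
# site_files_dork target.xyz pdf xls docx
```

The resulting string can be pasted into the search box, and further filters such as -inurl:sports can be appended by hand.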

Automated Recon

# FinalRecon - Banner Grabbing, WHOIS, SSL Cert, Crawler, DNS, Subdomain, Dir, Archive
# https://github.com/thewhiteh4t/FinalRecon.git
finalrecon --full --url <url>

# Recon-NG - metasploit-esque recon toolkit with expandable modules
# example workflow. Workspaces are meant to containerize each project
recon-ng # start program
workspaces create <domain> # create workspace for target
workspaces load <domain> # enter workspace for target (create already switches to it)
db insert domains <domain> # add domain to search in database
modules load recon/domains-hosts/bing_domain_web # load module
options set SOURCE <domain> # set module option
run # run module
db schema # see result tables
db query select * from hosts # query result data

# theHarvester - subdomain, emails, vhosts, ports/banners, employee names
theHarvester -d <domain> -l <num_of_results> -b google