What is it?
OSINT, or Open Source Intelligence, is the practice of gathering information passively from publicly available online sources rather than directly targeting an organization's infrastructure.
Methodology
- Visit the organization's main website and look for references to technologies, services and structures to compile a list of what the organization may be using
- The site's SSL/TLS certificate may help identify other subdomains covered by the same cert (for example via its Subject Alternative Names)
- Certificate Transparency logs can identify other subdomains: https://crt.sh/
- Grayhatwarfare can identify open cloud buckets and perform keyword searches
- Job postings for an organization can reveal technologies that may be in use
- Specifically, job postings for technical employees (network team, IT, software dev, cyber) can contain targeted keywords that reveal the architecture of the organization (e.g. "5+ years of Splunk")
- WHOIS Lookups - searching for domain registration information
- Web archivers, like the Wayback Machine, capture periodic snapshots of websites that can reveal information that is no longer present on the live site
- Social media can be used to identify employees of a certain organization
- Source code repositories (GitHub, GitLab, etc.) can potentially expose secrets or misconfigured repositories that are publicly accessible
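The job posting idea above can be sketched as a small keyword tally. This is only an illustration: `tech_keywords`, the keyword file and the saved posting text are hypothetical names, and it assumes the posting has already been saved as plain text.

```shell
# tech_keywords KEYWORD_FILE < posting.txt
# Counts how often each keyword from KEYWORD_FILE (one per line) appears
# in the text on stdin. Matching is case-insensitive and whole-word.
# Prints "count keyword", most frequent first.
tech_keywords() {
  local keywords="$1"
  grep -oiwF -f "$keywords" \
    | tr '[:upper:]' '[:lower:]' \
    | sort \
    | uniq -c \
    | sort -rn \
    | awk '{ count = $1; $1 = ""; sub(/^ /, ""); print count, $0 }'
}

# usage (hypothetical file names):
#   tech_keywords keywords.txt < job_posting.txt
```

A keyword file with one product per line (splunk, aws, active directory and so on) is enough to get a rough picture of the stack across several postings.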
Domain / Subdomain Enumeration
# search registration of domain / IP
whois <domain>
# find subdomains via certificate transparency logs
# replace domain.com
curl -s "https://crt.sh/?q=domain.com&output=json" | jq . | grep name | cut -d":" -f2 | grep -v "CN=" | cut -d'"' -f2 | awk '{gsub(/\\n/,"\n");}1;' | sort -u >> subdomainlist.txt
# save above list to subdomainlist and then pull IPs
for i in $(cat subdomainlist.txt);do host $i | grep "has address" | grep domain.com | cut -d" " -f4 | sort -u >> ip_addresses.txt;done
# cross-reference IPs against shodan
for i in $(cat ip_addresses.txt);do shodan host $i;done
# brute-force subdomains with dnsenum, here against an internal-only DNS server
# add -r to recurse into discovered subdomains and look for further subdomains
# --dnsserver sends the queries to a specific (e.g. internal) DNS server
dnsenum --dnsserver 10.1.1.10 --enum -p 0 -s 0 -o subdomains.txt -f <wordlist> domain.com
# subdomain brute forcing via bash for loop
for sub in $(cat <wordlist>);do dig $sub.domain.com @10.1.1.10 | grep -v ';\|SOA' | sed -r '/^\s*$/d' | grep $sub | tee -a subdomains.txt;done
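The certificate transparency one-liner above can also be written as a reusable filter. The following is a rough sketch rather than a drop-in replacement: `crt_names` is a hypothetical helper, it uses grep/sed/awk instead of jq, and it assumes crt.sh keeps returning hostnames in a `name_value` field with multiple names separated by a literal `\n`.

```shell
# crt_names DOMAIN < crtsh.json
# Reads crt.sh JSON on stdin and prints the unique hostnames that are
# DOMAIN itself or a subdomain of it.
# Assumption: name_value strings contain no escaped double quotes.
crt_names() {
  local domain="$1"
  grep -oE '"name_value"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | sed -E 's/^"name_value"[[:space:]]*:[[:space:]]*"//; s/"$//' \
    | awk '{ gsub(/\\n/, "\n") } 1' \
    | sed 's/^\*\.//' \
    | tr '[:upper:]' '[:lower:]' \
    | awk -v d="$domain" '{
        n = length($0); m = length(d)
        # keep the apex domain and anything ending in ".DOMAIN"
        if ($0 == d || (n > m && substr($0, n - m) == "." d)) print
      }' \
    | sort -u
}

# usage (replace domain.com):
#   curl -s "https://crt.sh/?q=domain.com&output=json" | crt_names domain.com > subdomainlist.txt
```

Stripping the leading `*.` from wildcard entries and dropping names outside the target domain keeps the list usable as input for the `host` loop.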
Google Dorking
Google dorking uses advanced search operators to narrow a search engine query down to very specific results. You can combine multiple search filters with & or OR, use * as a wildcard, and put - in front of a search parameter to negate it.
Here's a full cheatsheet from GitHub. Below are some key commands that can be used for OSINT.
# find sites with directory browsing enabled
intext:"index of /"
# find web.config files
filetype:config inurl:web.config
# find publicly accessible confidential documents on a specific site
ext:(doc | pdf | xls | txt | ps | rtf | odt | sxw | psw | ppt | pps | xml) (intext:confidential | intext:"account number") site:target.xyz
# find emails for a specific site
site:target.xyz @target.xyz
# operators
site: # limit results to a certain site
inurl: # find pages with a specific keyword in the url
filetype: # find specific file types like pdf, docx, etc.
intitle: # find pages with a specific keyword in the title
intext: # find pages with a specific keyword in the body
inbody: # find pages with a specific keyword in the body (Bing operator, comparable to intext:)
cache: # return a cached version of the site
link: # find pages that link to this webpage
related: # find websites similar to a webpage
info: # provides a summary of info about a page
define: # define a word or phrase
numrange: # search for numbers in a specific range
allintext: # search for multiple keywords all on same page
allinurl: # search for multiple keywords all in same url
AND # require filters to be true
OR # require either filter to be true
NOT # exclude filter
* # wildcard for any character or word
.. # number range ex 1..50
"" # exact string match
- # negate ex -inurl:sports
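Dorks like the document search above are easy to mistype when assembled by hand, so a small helper can build them from parts. This is a minimal sketch: `dork_docs` is a hypothetical function, and it only covers the `ext:`, `intext:` and `site:` operators from the list above.

```shell
# dork_docs SITE KEYWORD EXT [EXT...]
# Prints a dork that looks for documents of the given extensions on SITE
# whose body contains KEYWORD as an exact string.
dork_docs() {
  local site="$1" keyword="$2"
  shift 2
  local exts="" e
  for e in "$@"; do
    # join extensions with " | " (OR)
    exts="${exts:+$exts | }$e"
  done
  printf 'ext:(%s) intext:"%s" site:%s\n' "$exts" "$keyword" "$site"
}

# usage:
#   dork_docs target.xyz confidential pdf doc xls
```

The printed string can be pasted straight into the search box.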
Automated Recon
# FinalRecon - Banner Grabbing, WHOIS, SSL Cert, Crawler, DNS, Subdomain, Dir, Archive
# https://github.com/thewhiteh4t/FinalRecon.git
finalrecon --full --url <url>
# Recon-NG - Metasploit-style recon toolkit with expandable modules
# example workflow (recon-ng v5 syntax). Workspaces keep each project's data isolated
recon-ng # start program
workspaces create <domain> # create workspace for target
workspaces load <domain> # enter workspace for target
db insert domains <domain> # add domain to search in database
marketplace install recon/domains-hosts/bing_domain_web # install module if not already installed
modules load recon/domains-hosts/bing_domain_web # load module
options set SOURCE <domain> # set module option
run # run module
show hosts # see result table
db query select * from hosts # query result data
# theHarvester - subdomain, emails, vhosts, ports/banners, employee names
theHarvester -d <domain> -l <num_of_results> -b google
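Results from the loops and tools above tend to end up in several files (for example subdomainlist.txt and subdomains.txt), so a small merge step can help before moving on. A rough sketch follows: `merge_hosts` is a hypothetical helper and it assumes the first whitespace-separated field of every non-empty line is a hostname, which holds for plain host lists and for dig answer lines.

```shell
# merge_hosts FILE [FILE...]
# Prints one lowercased, de-duplicated hostname per line.
# Takes the first field of each non-empty line and strips a trailing dot
# (as found in dig output).
merge_hosts() {
  awk 'NF { print tolower($1) }' "$@" \
    | sed 's/\.$//' \
    | sort -u
}

# usage:
#   merge_hosts subdomainlist.txt subdomains.txt > all_hosts.txt
```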