The Wayback Machine is a digital archive of the internet, maintained by the Internet Archive. It allows you to view past versions of websites, which can be a valuable tool for OSINT investigations.

For example, you can use the Wayback Machine to:

  • Find information that has been deleted from a website.
  • See how a website has changed over time.
  • Investigate the history of a website or organization.

Here are some of the ways you can use the Wayback Machine for OSINT investigations:

Finding all saved copies of a page

You can use the Wayback Machine’s CDX API to search for all of the archived copies of a particular web page. For example:

https://web.archive.org/cdx/search/cdx?url=andreafortuna.org

This query returns every archived copy of the site's home page, one line per capture. Without a wildcard, the API matches only the exact URL given.
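By default the CDX API answers with plain text, one capture per line, with seven space-separated fields (urlkey, timestamp, original, mimetype, statuscode, digest, length). The following is a minimal sketch of how such a line could be parsed and turned into a replay link. The names `Capture`, `parse_cdx_line`, and `snapshot_url` are hypothetical helpers, and the sample line is illustrative rather than a real capture.

```python
from typing import NamedTuple


class Capture(NamedTuple):
    urlkey: str
    timestamp: str   # 14-digit capture time, yyyyMMddhhmmss
    original: str    # the URL as it was crawled
    mimetype: str
    statuscode: str
    digest: str
    length: str


def parse_cdx_line(line: str) -> Capture:
    """Split one space-separated CDX line into its seven default fields."""
    return Capture(*line.split())


def snapshot_url(capture: Capture) -> str:
    """Build the Wayback Machine replay URL for a capture."""
    return f"https://web.archive.org/web/{capture.timestamp}/{capture.original}"


# Illustrative line only: the timestamp, digest, and length are made up.
sample = (
    "org,andreafortuna)/ 20200101000000 https://andreafortuna.org/ "
    "text/html 200 AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHH 12345"
)
capture = parse_cdx_line(sample)
print(snapshot_url(capture))
# https://web.archive.org/web/20200101000000/https://andreafortuna.org/
```

Opening the printed link in a browser shows the page as it was archived at that timestamp.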

Finding all saved pages in a particular section of a website

You can search for archived copies within a specific section:

https://web.archive.org/cdx/search/cdx?url=https://andreafortuna.org/2022/*&collapse=urlkey

This query lists the archived pages under the “2022” directory on this website. Because of collapse=urlkey, each unique URL appears only once instead of once per capture.
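Queries like this one can also be assembled in code instead of by hand. A minimal sketch using only the Python standard library is shown here. `cdx_query` is a hypothetical helper name, and the `safe` argument is there only to keep the URL readable.

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query(url: str, **params: str) -> str:
    """Return a CDX API query URL for `url` plus optional parameters."""
    query = {"url": url, **params}
    # Keep '/', ':' and '*' unescaped; other reserved characters are percent-encoded.
    return f"{CDX_ENDPOINT}?{urlencode(query, safe='/:*')}"


print(cdx_query("https://andreafortuna.org/2022/*", collapse="urlkey"))
# https://web.archive.org/cdx/search/cdx?url=https://andreafortuna.org/2022/*&collapse=urlkey
```

The resulting string can be passed to any HTTP client, for example `urllib.request.urlopen`, to download the result list.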

Finding all URLs of a website (with subdomains)

You can search for a list of all URLs on a website:

https://web.archive.org/cdx/search/cdx?url=*.andreafortuna.org&collapse=urlkey

This query lists every archived URL on the domain and its subdomains, one line per unique URL.
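A common follow-up is to extract the distinct hostnames from the result list, since they reveal which subdomains have been archived. The sketch below assumes the default text output, where the third field is the original URL. `hostnames` is a hypothetical helper, and the sample lines (including the `www` subdomain) are made up for illustration.

```python
from urllib.parse import urlsplit


def hostnames(cdx_lines):
    """Return the sorted, de-duplicated hostnames found in CDX output lines."""
    hosts = set()
    for line in cdx_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        # Field 3 is the original URL; .hostname lowercases it and drops the port.
        host = urlsplit(fields[2]).hostname
        if host:
            hosts.add(host)
    return sorted(hosts)


# Illustrative lines only: subdomains, timestamps, and digests are made up.
sample = [
    "org,andreafortuna)/ 20200101000000 https://andreafortuna.org/ text/html 200 AAAA 100",
    "org,andreafortuna,www)/ 20200102000000 http://www.andreafortuna.org/ text/html 301 BBBB 200",
    "org,andreafortuna,www)/about 20200103000000 http://WWW.andreafortuna.org:80/about text/html 200 CCCC 300",
]
print(hostnames(sample))
# ['andreafortuna.org', 'www.andreafortuna.org']
```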

Finding URL copies over a given period of time

You can search for archived copies within a specific date range:

https://web.archive.org/cdx/search/cdx?url=https://andreafortuna.org/*&to=2021&from=2020

This query finds archived copies of this website from 2020 to 2021.
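The from and to parameters accept timestamps of 1 to 14 digits in the form yyyyMMddhhmmss, so a range can be narrower than whole years. The sketch below converts between Python `datetime` objects and that format and builds a query for a single month. The helper names are hypothetical, and the datetimes are treated as UTC, which is the time zone Wayback timestamps use.

```python
from datetime import datetime

CDX_FORMAT = "%Y%m%d%H%M%S"  # yyyyMMddhhmmss, interpreted as UTC


def to_cdx_timestamp(moment: datetime) -> str:
    """Format a datetime as a 14-digit CDX timestamp."""
    return moment.strftime(CDX_FORMAT)


def from_cdx_timestamp(timestamp: str) -> datetime:
    """Parse a full 14-digit CDX timestamp back into a datetime."""
    return datetime.strptime(timestamp, CDX_FORMAT)


def range_query(url: str, start: datetime, end: datetime) -> str:
    """Build a CDX query limited to captures between start and end."""
    return (
        "https://web.archive.org/cdx/search/cdx"
        f"?url={url}&from={to_cdx_timestamp(start)}&to={to_cdx_timestamp(end)}"
    )


print(range_query(
    "https://andreafortuna.org/*",
    datetime(2020, 3, 1),
    datetime(2020, 3, 31, 23, 59, 59),
))
# https://web.archive.org/cdx/search/cdx?url=https://andreafortuna.org/*&from=20200301000000&to=20200331235959
```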

Finding all saved files of a certain type

You can search for archived files of a certain type:

https://web.archive.org/cdx/search/cdx?url=andreafortuna.org/*&filter=mimetype:text/javascript&collapse=urlkey

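This query lists the files on this website that were archived with the MIME type text/javascript, one line per unique URL. Scripts recorded under a different type, such as application/javascript, are not matched.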
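Several filter parameters can be combined in one request, and adding output=json makes the response easier to process: the API then returns a JSON array whose first row holds the field names. The sketch below builds such a query and converts a response into dictionaries. `filtered_query` and `rows_to_dicts` are hypothetical helpers, and the sample response (including the `/assets/app.js` path) is made up for illustration.

```python
import json
from urllib.parse import urlencode


def filtered_query(url, filters):
    """Build a CDX query with any number of field:regex filters, as JSON."""
    params = [("url", url)]
    params += [("filter", f) for f in filters]
    params += [("collapse", "urlkey"), ("output", "json")]
    return "https://web.archive.org/cdx/search/cdx?" + urlencode(params, safe="/:*")


def rows_to_dicts(json_text):
    """Turn the JSON output (header row, then data rows) into dictionaries."""
    rows = json.loads(json_text)
    if not rows:
        return []
    header, *records = rows
    return [dict(zip(header, record)) for record in records]


print(filtered_query(
    "andreafortuna.org/*",
    ["mimetype:text/javascript", "statuscode:200"],
))

# Illustrative response only: the path, timestamp, and digest are made up.
sample = """[
  ["urlkey", "timestamp", "original", "mimetype", "statuscode", "digest", "length"],
  ["org,andreafortuna)/assets/app.js", "20200101000000",
   "https://andreafortuna.org/assets/app.js", "text/javascript", "200", "AAAA", "1234"]
]"""
for row in rows_to_dicts(sample):
    print(row["timestamp"], row["original"])
```

Here the second filter, statuscode:200, restricts the list to captures that were served successfully.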


There are also some useful command-line tools for automating searches:

  • GAU: fetches known URLs from AlienVault’s Open Threat Exchange, the Wayback Machine, and Common Crawl.
  • Waymore: finds even more URLs from the Wayback Machine, Common Crawl, AlienVault OTX, URLScan, and VirusTotal.
  • WaybackUrls: fetches all the URLs that the Wayback Machine knows about for a domain.
  • Katana: a next-generation crawling and spidering framework.
  • Wayback Keyword Search: downloads each page from the Wayback Machine for a specific domain and enables further keyword search on each saved page.
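Whichever tool is used, the result is usually a long list of URLs that needs triage. The sketch below assumes the list is available with one URL per line (an assumption about how the tool's output was saved, not a documented format) and counts URLs by file extension to highlight interesting file types. `extension_counts` is a hypothetical helper and the sample URLs are made up.

```python
from collections import Counter
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def extension_counts(urls):
    """Count URLs by lowercase file extension, ignoring query strings."""
    counts = Counter()
    for url in urls:
        url = url.strip()
        if not url:
            continue  # skip empty lines
        suffix = PurePosixPath(urlsplit(url).path).suffix.lower()
        counts[suffix or "(none)"] += 1
    return counts


# Illustrative URLs only: the paths are made up.
sample = [
    "https://andreafortuna.org/",
    "https://andreafortuna.org/assets/app.js",
    "https://andreafortuna.org/assets/APP.JS?v=2",
    "https://andreafortuna.org/files/report.pdf",
]
for extension, count in sorted(extension_counts(sample).items()):
    print(extension, count)
```

Extensions such as .pdf, .bak, or .sql in the counts are often a good starting point for a closer look at the archived copies.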

Finally, alternatives to the Wayback Machine include:

The Wayback Machine is a powerful tool for gathering information in OSINT investigations. By using the CDX API and the tools mentioned above, you can automate searches and recover content that is no longer available on the live site.