Recon Part 3.5 – HaveIBeenPwned?

I wanted to do a quick write-up on the awesome HaveIBeenPwned Database which is maintained by Troy Hunt – https://haveibeenpwned.com – If you haven’t seen it, check it out!

I recently discovered there was a public API to query the breach databases and ended up thinking it’d be pretty cool to notify employees at my company that they were involved in whatever breaches they showed up in. For any sysadmins or blue teamers out there, I encourage you to use this on your own organization after speaking with anyone who should be informed and use it to train your users and emphasize that password reuse is a huge risk. Below I will present a very simple script (made far more verbose than needed to improve readability for new coders). The walk-through follows.

The Code

#imports
import sys
import requests
import time

#vars
email_file =""
output_file = ""
verbose = ""
bquery = "https://haveibeenpwned.com/api/v2/breachedaccount/"
params = ""
compromised_accounts = ""

#Preparation
try:
    email_file = sys.argv[1]
    output_file = sys.argv[2]
    if len(sys.argv) > 3:
        verbose = sys.argv[3]
except Exception as e:
    sys.exit("Usage: {s} <email_file> <output_file> [-v] (error: {e})".format(s=sys.argv[0], e=e))

if verbose == "-v":
    params = "?truncateResponse=False"
else:
    params = "?truncateResponse=True"

compromised_accounts = open(output_file, "w+")

#RUN
print("email file is {e}".format(e=email_file))
print("output file is {e}".format(e=output_file))
if verbose == "-v":
    print("Running in verbose mode")
else:
    print("Running in truncated mode")

with open(email_file, "r") as f:
    for email in f.readlines():
        email = email.rstrip()
        r = requests.get(bquery+email+params)
        if len(r.text) > 1:
            print("{e} was involved in these breaches: {b}".format(e=email, b=r.text))
            write_data = "{e} has been in the following breaches! {b} \n".format(e=email, b=r.text)
            compromised_accounts.write(write_data)
        else:
            print("{e} has not been involved in any tracked breaches.".format(e=email))
        time.sleep(1.5)
compromised_accounts.close()

So, what does this do? At a high level, these 3 things happen:

  • The script takes two required arguments, a source file and a destination file, plus an optional verbose flag.
  • It opens the source file, reads it line by line, and queries the HaveIBeenPwned API for each email address.
  • It logs the results to the terminal and to the output file.

Great, so this will be very simple to go through. With regard to the inputs, the source file must exist and must contain one email address per line.
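As a rough illustration of that input requirement, the source file could be checked before any requests are sent. This is only a sketch and not part of the script above: `load_emails` is a hypothetical helper, the `@` test is an assumed, deliberately loose sanity check, and the code is written for Python 3.

```python
def load_emails(path):
    """Return the usable email addresses found in a line-separated file.

    Hypothetical helper: the "@" test is a loose sanity check, not validation.
    """
    emails = []
    seen = set()
    with open(path, "r") as f:
        for line in f:
            email = line.strip()
            # Skip blank lines, obvious non-addresses, and repeated addresses.
            if not email or "@" not in email or email.lower() in seen:
                continue
            seen.add(email.lower())
            emails.append(email)
    return emails
```

Dropping duplicates up front also saves requests, which matters when every lookup is followed by a pause.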

Code Breakdown

Lines 1-4: Import modules we use, easy.

Lines 6-12: Initial variables. I’m setting these here so that you can see where everything comes from and goes to; it would be more proper to set some or most of these dynamically. The important variables to note are params, bquery, and verbose. ‘bquery’ is the ‘breach query’ endpoint of the haveibeenpwned API. It returns an object if the email address is in a breach and an empty response if there is no data on that account. ‘params’ determines whether the response is verbose or truncated and is appended to our API query. ‘verbose’ holds the optional script argument (-v), which sets ‘params’ to ?truncateResponse=False.
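To make the relationship between ‘bquery’, ‘params’, and the email address concrete, here is a minimal sketch of how the request URL is assembled. `build_query_url` is a hypothetical name, and percent-encoding the address with `urllib.parse.quote` is my own addition (the script above concatenates the raw address); it is written for Python 3.

```python
from urllib.parse import quote

# Same endpoint the script stores in 'bquery'.
BQUERY = "https://haveibeenpwned.com/api/v2/breachedaccount/"


def build_query_url(email, verbose=False):
    """Assemble bquery + email + params, mirroring the script's concatenation."""
    # Same parameter values the script assigns to 'params'.
    params = "?truncateResponse=False" if verbose else "?truncateResponse=True"
    # Assumption: encode the address so characters such as "+" or "@" are URL-safe.
    return BQUERY + quote(email, safe="") + params
```

For example, `build_query_url("user@example.com")` yields the endpoint followed by `user%40example.com?truncateResponse=True`.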

Lines 14-28: Preparation. Lines 14-21 take the CLI arguments and turn them into variables. You must supply arguments ‘1’ and ‘2’; argument ‘3’ is optional. I put these in a try/except clause in order to troubleshoot any weirdness: not having at least sys.argv[1] and sys.argv[2] will result in index errors, for instance. Lines 23-26 check the ‘verbose’ variable. If it is set to -v, the script sets our params value to the verbose flag for the API. Line 28 opens our output file.
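One of the ways to handle the arguments "more dynamically" is the standard library's argparse module, which reports missing arguments itself instead of raising an index error. The sketch below is an alternative to the manual sys.argv handling, not what the script does; `parse_args` and the help strings are illustrative choices.

```python
import argparse


def parse_args(argv=None):
    """Parse the same three inputs the script reads from sys.argv."""
    parser = argparse.ArgumentParser(
        description="Check a list of email addresses against HaveIBeenPwned.")
    parser.add_argument("email_file", help="file with one email address per line")
    parser.add_argument("output_file", help="file to write compromised accounts to")
    # store_true makes args.verbose False unless -v is given.
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="request full (non-truncated) breach details")
    return parser.parse_args(argv)
```

Calling `parse_args(["emails.txt", "out.txt", "-v"])` gives an object with `email_file`, `output_file`, and `verbose` attributes; passing nothing makes argparse read `sys.argv` as usual.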

Lines 30-36: Print information to CLI. This is just a convenience for the user: if she passed a filename containing a space without quoting it, this will hopefully help troubleshoot that.

Lines 38-49: The stuff. This starts by opening our source file, looping through it as a list of lines, and stripping the trailing newline from each line so that only the email address remains. We then send a request to the haveibeenpwned API (line 41) and store the response in the variable ‘r’. If there is data in r.text, we know the lookup returned breach data, so we print the user and breach information and write the same data to the output file. If there is no data in the response, we print that the user was safe and move on. Line 48 exists because the API rate-limits requests, so the script pauses for 1.5 seconds between them. Line 49 gracefully closes our file.
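The script decides between "breached" and "safe" purely on whether r.text is empty. A slightly more defensive variant could also look at the HTTP status code. The sketch below rests on assumptions: `describe_result` is a hypothetical helper, the meanings of 404 (account not found) and 429 (rate limit exceeded) are my understanding of the API's documented behaviour, and the `Name` key is assumed to be present in each breach object. It takes the status code and body as plain values, so it can be tried without touching the network (pass `r.status_code` and `r.text` from a real response).

```python
import json


def describe_result(email, status_code, body):
    """Return a report line for a breached account, or None if nothing was found."""
    if status_code == 404:
        # Assumed meaning: the account is not in any tracked breach.
        return None
    if status_code == 429:
        # Assumed meaning: requests were sent too quickly.
        raise RuntimeError("rate limited; wait longer between requests")
    if status_code != 200:
        raise RuntimeError("unexpected status code {c}".format(c=status_code))
    if not body.strip():
        return None
    # Assumption: the body is a JSON list of objects that each carry a "Name".
    names = [breach.get("Name", "?") for breach in json.loads(body)]
    return "{e} has been in the following breaches: {b}".format(
        e=email, b=", ".join(names))
```

Returning None for a clean account keeps the caller's logic close to the original if/else: write the line when there is one, otherwise report that the user was not found in any breach.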

So that’s it for this post – I’m in the midst of writing a much larger automated recon tool which incorporates a script similar to this which I will hopefully be presenting in October at Toorcon. More information to come!
