
Facts & Figures: Whois Policy Changes Impair Blocklisting Defenses

In October 2018, APWG and M3AAWG jointly conducted a survey of 300+ cyber first responders, researchers, and law enforcement investigators to learn whether or not ICANN's Temporary Specification for Whois was interfering with efforts to mitigate security threats. The survey respondents overwhelmingly indicated that the Temp Spec, and in particular, the redaction of Whois contact data, was impeding investigations.
 
Having identified Whois redaction as an impediment, the cybersecurity community should try to measure its impact.
 
I invited my colleagues Jeff Chan and Joe Wein of SURBL, and Ivo Bitter of Spamhaus, to work with me to conduct studies to measure the impact of redacted Whois contact data. I chose SURBL and Spamhaus because I've worked with my colleagues there for years and know their dedication to scientific methods. I was also keen to include widely used reputation services. Both of these services qualify: Spamhaus and SURBL blacklists protect billions of mailboxes and hundreds of millions of end users daily. We agreed to compare blocklist counts prior to, and following, the adoption of the Whois changes.
 
We conclude from two independently conducted studies that the onset of masking Whois contact data has had the direct, corresponding, and ongoing effect of reducing the number of blocklisted domains, dramatically undermining the efficiency of this and other security countermeasures.
 
Further, this interference exposes users of government and private networks, internet and hosting providers to various online threats that could have been preemptively stopped had Whois contact data remained available.
 
A briefing we've prepared for legislators and ICANN policy consideration follows. A more comprehensive report is forthcoming.

Facts & Figures: Whois Policy Changes Impair Blacklisting Defenses

Dave Piscitello, Interisle Consulting Group; Joe Wein, SURBL; Jeff Chan, SURBL; Ivo Bitter, Spamhaus

Summary

Let's begin with a summary of the problem space we intended to study:

  • Whois databases hold the records and details of who owns domain names.
  • Cyber attackers register domain names for political influence campaigns, fraud, malware hosting, and spam.
  • Investigators use Whois contact data to identify other domains with some or all of the same contact data that are owned by the same attackers.
  • Changes to Whois that were implemented on May 25, 2018 mask the Whois contact data.

We studied domain name tracking and blacklisting reporting provided by two industry-leading and influential reputation providers, SURBL and Spamhaus, to compare their blocklist counts prior to, and following, the adoption of the Whois changes.

We have determined that changes to Whois impair blocklisting, and expose users of government and private networks, internet and hosting providers to various online threats that could have been preemptively stopped had Whois contact data remained available. 

Background

Criminals and fraudsters use global criminal networks to distribute phishing email, malware, offers of illegal pharmaceuticals (including opioids), web sites carrying inauthentic content meant to incite, radicalize, or recruit terrorists, and more. These threat actors register thousands of domain names to support these activities, cheaply and often in bulk, and use them in massive campaigns to thwart detection and mitigation.

  • Private actors, threat researchers, reputation providers (e.g., SURBL, Spamhaus and APWG) investigate online criminal activities in real time, and use Whois contact data to identify threat actors and blacklist domains they use.
  • They extract domain names from billions of email, text, and messaging app correspondences that they process daily, to detect cyber threats.
  • Having found these domain names, investigators next ask, "Who registered these, and what other domains have they registered?"

Prior to May 25, 2018, contact data from Whois databases around the world answered that question, and the resulting data was confidently used by IT staff to protect the users of government agency networks, ISPs, private networks, and hosting providers.

Whois contact data provides the means to find related domains

Threat detection and blocklisting are by necessity ongoing and iterative processes:

  • Criminals typically register hundreds or thousands of domains at a time, and can readily replace any domains that are identified as malicious and blocklisted and are therefore no longer useful to them.
  • Investigators query Whois databases constantly, in real time, as part of their analysis methodologies.

Whois contact data (owner name, email or postal address, telephone number) that is associated with confirmed malicious domains is essential to finding other domains with some or all of the same contact data. Some of these domains may be registered but not yet used, so this processing adds a valuable preemptive element to blocklisting. When matches are found, and other abuse criteria are satisfied, these domains are added to a blocklist. Investigators use the lists to dismantle the criminals' attack networks. IT administrators use the lists to protect users and networks from threats. Cybersecurity analysts need to find virtually ALL of a malefactor's domain names in order to stop criminal campaigns.
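The matching step described above can be sketched in a few lines of Python. This is only an illustration, not the method SURBL or Spamhaus actually use: the field names, the `MASKED_VALUES` placeholder strings, and the `.example` domains are all assumptions made for the sketch, and the "other abuse criteria" mentioned above are not modeled.

```python
from collections import defaultdict

# Hypothetical field names; real Whois output differs by registry and registrar.
CONTACT_FIELDS = ("name", "email", "phone", "postal_address")

# Assumed placeholder strings that a masked record might carry instead of real data.
MASKED_VALUES = {"redacted for privacy", ""}


def normalize(value):
    """Lower-case and collapse whitespace so trivially different spellings match."""
    return " ".join(value.lower().split())


def build_index(records):
    """Map each usable (field, value) pair to the domains whose record carries it."""
    index = defaultdict(set)
    for domain, contact in records.items():
        for field in CONTACT_FIELDS:
            value = normalize(contact.get(field, ""))
            if value not in MASKED_VALUES:
                index[(field, value)].add(domain)
    return index


def related_domains(seed, records):
    """Return domains that share at least one unmasked contact value with the seed."""
    index = build_index(records)
    related = set()
    for field in CONTACT_FIELDS:
        value = normalize(records[seed].get(field, ""))
        if value not in MASKED_VALUES:
            related |= index[(field, value)]
    related.discard(seed)
    return related


# Made-up records for illustration only.
records = {
    "bad-one.example": {"email": "Actor@Mail.example", "phone": "+1.5550100"},
    "bad-two.example": {"email": "actor@mail.example"},
    "bad-three.example": {"phone": "+1.5550100"},
    "unrelated.example": {"email": "someone@else.example"},
    "masked-one.example": {"email": "REDACTED FOR PRIVACY"},
    "masked-two.example": {"email": "REDACTED FOR PRIVACY"},
}

print(sorted(related_domains("bad-one.example", records)))
print(sorted(related_domains("masked-one.example", records)))
```

Starting from one confirmed malicious domain, the lookup surfaces the two domains that share its email address or telephone number. The two masked records yield nothing to pivot on, which is the situation the rest of this briefing measures.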

Two Independent Studies, Similar Results

We conducted a study of domains that SURBL and Spamhaus identified and blocklisted from January 2018 to January 2019.

From January 2018 through May 24, 2018, SURBL and Spamhaus had access to Whois contact data for nearly 200 million domain names, and were able to identify lists of registrants that engaged in illicit or cyber threat activities.

On May 25, 2018 many Whois service operators began masking point-of-contact data. Only a handful of Top-level domains and registrars continue to provide unmasked Whois contact data.

To illustrate the impact of masking Whois, we graphed SURBL and Spamhaus blocklistings from January 2018 through January 2019. The counts in these graphs represent domains for which SURBL and Spamhaus had access to Whois contact data, and could thus find domains with similar Whois contact data "indicators".
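For readers who want to run the same kind of before-and-after comparison on their own blocklist data, a minimal sketch follows. It assumes counts are aggregated per calendar month and keyed by the first day of the month. The function name and the numbers in the usage example are invented for illustration and are not the study's data.

```python
from datetime import date
from statistics import mean

# Date the Whois changes took effect, as stated in this briefing.
CUTOVER = date(2018, 5, 25)


def post_to_pre_ratio(monthly_counts, cutover=CUTOVER):
    """Ratio of mean monthly blocklistings after the cutover to the mean before it.

    monthly_counts maps the first day of each month to that month's count.
    The month containing the cutover mixes both regimes, so it is left out.
    """
    cutover_month = date(cutover.year, cutover.month, 1)
    pre = [n for month, n in monthly_counts.items() if month < cutover_month]
    post = [n for month, n in monthly_counts.items() if month > cutover]
    if not pre or not post:
        raise ValueError("need at least one full month on each side of the cutover")
    return mean(post) / mean(pre)


# Made-up counts, chosen only to keep the arithmetic easy to follow.
example_counts = {
    date(2018, 3, 1): 200,
    date(2018, 4, 1): 200,
    date(2018, 5, 1): 150,  # cutover month, excluded from both averages
    date(2018, 6, 1): 50,
    date(2018, 7, 1): 50,
}

print(post_to_pre_ratio(example_counts))
```

A ratio well below 1 indicates that fewer domains are being blocklisted per month after the change than before it. A simple ratio of means ignores seasonality and one-off campaigns, so it complements the graphs rather than replacing them.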

Findings

The studies show that Whois contact data availability correlates with blocklisting efficiency.

Figure A illustrates the precipitous drop in the number of criminal or cyber domain names that are identifiable, and thus trackable using Whois contact data, after May 25, 2018.

Neither SURBL nor Spamhaus is able to determine whether registrations created after May 25, 2018 are part of a known criminal actor's arsenal of domains, which also adversely affects the ability to separate good from bad. Knowing who the good actors are is extremely valuable information in conducting threat assessment.

Figure A: Data Available from Whois

Studies illustrate how the masking of Whois contact data impairs blacklisting services.

SURBL provided two sets of blacklist counts for the studies:

  • Set One, depicted in Figure 1, represents the TLDs .us and .gdn. These Top-Level Domains still provide unmasked Whois contact data, so contact data was available for the entire study period.
  • Set Two, depicted in Figure 2, represents counts of all other Top-level Domains. In this set, contact data was available until May 25, 2018 and masked thereafter.

Figure 1: US and GDN Blocklistings

Figure 2: Other TLD Blocklistings

Figure 1 illustrates that SURBL is still able to blacklist ".us" and ".gdn" domains using Whois point-of-contact data to identify a criminal or malicious actor, and then to find other domains registered by that actor. Figure 2 shows a dramatic and continuing decline in our ability to blocklist domains based on Whois point-of-contact indicators where Whois is masked. The number of domains registered by bad actors grows continuously, so the number of blacklisted domains should also grow over time.

Spamhaus did not study US and GDN separately but instead provided a set of blacklist counts for all TLDs.

Figure 3 represents all Top-Level Domains blocklisted by Spamhaus from January 2018 through January 2019. The counts in these graphs represent domains for which Spamhaus had access to Whois contact data and could thus find domains with similar Whois indicators.

Figure 3: Spamhaus All TLDs

Spamhaus experienced similar results despite using data sources and methodologies that differ from SURBL's. As the trend line again illustrates, blocklistings declined dramatically, to roughly 30% of the counts recorded before May 25, 2018, despite two significant campaigns that triggered exceptionally large blocklisting events.

Note that the trend lines from Figures 2 and 3 track closely with the decline of available Whois contact data in Figure A.

Conclusions

Greg Aaron and I reported in an ICANN blog that reputation services (RBLs), "are used ubiquitously and are a proven way to protect Internet users. RBLs have been in use for twenty years. During that time, they have been one of the most widely deployed and effective security solutions on the Internet. It is likely every type of entity relies on RBLs, including companies, governments, nongovernmental organizations (NGOs), mobile networks, Internet service providers, email service providers, and social networking sites."

Policies that impair reputation services have wide reaching consequences, and thus studies to measure any impairment are merited. From the findings of two independently conducted studies, we conclude that:

The onset of masking Whois contact data has had the direct, corresponding, and ongoing effect of reducing the number of blocklisted domains, dramatically undermining the efficiency of this, and other, security countermeasures. Some, but not all, domains that are associated with a known criminal actor may be blocklisted using alternative data, but not in the timely manner that modern organizational or government agency risk mitigation dictates. Incomplete or delayed information increases vulnerability and puts users, organizations, and sensitive data at increased risk.
