All matters security

APWG and M3AAWG Survey Finds ICANN WHOIS Changes Impede Cyber Investigations

The Anti-Phishing Working Group (APWG) and the Messaging, Malware and Mobile Anti-Abuse Working Group (M3AAWG) have collaborated to conduct a survey of cyber investigators and anti-abuse service providers to understand how ICANN’s Temporary Specification for gTLD Registration Data has affected their access to and use of domain name registration information and their ability to mitigate abuse. I served as Principal Investigator for APWG and M3AAWG on this project, and I received strong subject matter expertise support from both working groups.
 
From our analysis of 327 survey responses, we find that the changes to WHOIS access following ICANN’s implementation of the Temp Spec are significantly impeding cyber applications and forensic investigations and allowing more harm to victims.
 
The "Temp Spec" has introduced delays to investigations and the reduced utility of public WHOIS data is a dire problem. The loss of timely and repeatable access to complete WHOIS data is impeding investigations of all kinds, from cybercrime activities such as phishing and ransomware, to the distribution of fake news and subversive political influence campaigns.
 
The report contains a detailed analysis of the sets of questions asked of a targeted audience of cyber security practitioners, anti-abuse service providers and law enforcement officers, who were contacted primarily using APWG and M3AAWG mailing lists, augmented with trust collaboration mailing lists used by operational security and law enforcement to share threat intelligence data. The analyses are complemented with comments submitted by survey respondents. Many of these are quite insightful.
 
From the analyses, the APWG and M3AAWG make the following findings:
  1. Cyber-investigations and mitigations are impeded because investigators are unable to access complete domain name registration data.
  2. The mitigation or triage of cyber incidents cannot be accomplished in a timely manner.
  3. WHOIS has become an unreliable or less meaningful source of threat intelligence.
  4. Requests to access non-public WHOIS by legitimate investigators for legitimate purposes are routinely refused.
  5. Those who protect Internet resources are also making more coarse blocking or mitigation decisions in the absence of what was formerly reliable data. 
  6. The utility of WHOIS has been severely damaged.
  7. The redaction of WHOIS data is excessive.

APWG and M3AAWG make a number of recommendations as well:

  1. There must be an accredited access mechanism, providing tiered or gated access to qualified security actors.
  2. ICANN should not allow redaction of the contact data of legal entities.
  3. ICANN should adopt a contact data access request specification that will ensure consistency across all accredited registrars and gTLD registries.
  4. ICANN should ensure that the accredited access to redacted WHOIS data does not introduce delays in collecting or processing WHOIS data, and further, that the access not be encumbered by per request authorizations.
  5. ICANN should reconsider the current redaction policy.
  6. We ask that ICANN publish point of contact email addresses to provide investigators with an effective means of identifying domains associated with a victim or person of interest in an investigation.

In their final comments, the Working Groups encourage ICANN to improve the current, difficult condition, stating:

"We recognize that ICANN is likely aware of several of these issues. We also realize that ICANN organization and Board of Directors are awaiting the Expedited Policy Development Process for answers to many issues; however, we believe that the ICANN Board of Directors and ICANN organization have the ability to update the Temp Spec to fix the problems that this survey and others have identified as most pressing or egregious while the EPDP work continues."

It's essential that ICANN implement recommendations 2, 4, and 6, and quickly. From a public safety perspective, these are necessary adjustments. They fall within ICANN's remit to ensure the security and stability of the Internet's identifier systems. ICANN organization should further ensure that the parties involved in consensus policy development for the remaining recommendations consider the findings and analyses in this survey. This would be consistent with the organization's expressed desire to apply data to ensure informed policy deliberation.

 

Download FINAL ICANN GDPR and WHOIS Users Survey 20181018.pdf (1683.6K)


Whois studies: it's time to ask the right questions

I remain skeptical of all the Whois studies that I’ve reviewed (FTC, SSAC, ICANN), including studies where I was a party to the research. I’ll apologize for failing to contribute to a satisfactory Whois study. I’ll also admit that my understanding of how to study a problem scientifically has greatly expanded over the past ten years.

A truly scientific Whois study must meet certain common criteria. The purpose should be clearly defined; in particular, the researchers or parties who commission the research should make certain that they are asking the right question.

Before I raise anyone’s brows or blood pressure, I’m not suggesting that these studies lacked rigor.

The researchers were asked to answer a question and they did.

I believe that the wrong question was studied.

The moment you ask, “Does public access to Whois-published data lead to a measurable degree of misuse?” or “Is the WHOIS Service a Source for email Addresses for Spammers?”, you’ve constrained the research scope too narrowly. You’ve also set up a strawman. There’s a difference between asking the wrong question and asking the right question but in the wrong way.

These studies had different findings. They appear to have been conducted scientifically. However, the Whois-specific findings and conclusions do not provide a satisfactory comparative analysis of Whois personal data harvesting versus other methods of collecting personal data; in particular, they do not help us understand:

How much of a threat is personal data harvesting using Whois relative to harvesting using other sources?

Why is this the relevant question? Databases that contain personal data are exposed to multiple threats. To better understand the personal data harvesting issue, we need to consider not one but many of the known personal data harvesting methods that are used by criminals or misused commercially. The study should compare a known set of methods of personal data collection and rank Whois among these. The list of methods that cybercriminals or commercial interests use to collect personal data might include:

  • Database breaches, e.g., an attack resulting in the theft or disclosure of tax records or filings, financial accounts, credit card accounts, merchant accounts, bank logins, corporate (HR, ERM, CRM) or government user accounts.
  • Crawling and Search, e.g., visiting hundreds of thousands of web sites and extracting email addresses or other personal data from the pages visited, and employing sophisticated or advanced searches across billions of publicly accessible pages that search engines index.
  • Traffic or message capture, e.g., capture of personal data from unencrypted or compromised web (http) sessions, email transmissions or file transfers, or the exfiltration of data by malware through covert channels.
  • Email user enumeration attacks, e.g., brute force or “dictionary attacks” that harvest active accounts, and which are succeeded by password cracking attacks against enumerated users.
  • Malware, the collection of personal data from contacts files by malware that has infected a computer or mobile device, and Ransomware, the extortion of financial account information or payments through malware that has infected a computer or mobile device.
  • Whois queries, the collection of point of contact information using the public Whois service.
  • Social Engineering Attacks including Phishing, the luring of victims to pages that impersonate financial or corporate login pages, where the victim submits personal or financial account information; Doxing, the harvesting of personal information through social engineering, especially from social media sites; and Business Email Compromise (CEO Fraud), the use of social engineering and email account impersonation to acquire financial data or to hijack or perform fraudulent wire transfers.
  • Disclosure by third parties that aggregate and sell or share personal data, without notice or consent.

The ICANN study only asked about a single threat: Whois. This gives us the same kind of insight as a study of cancer in which the question posed is “is X a cause of cancer?” The X study may give us an answer, but it will not be a very comprehensive one. Questions such as “what are the possible causes of cancer, which are most dangerous (aggressive), and which are the causes that we are most likely to be exposed to?” provide more and deeper insights. For example, a study may find that eating red meat causes cancer, and another study may find that extended exposure to the sun does, too, but we want to understand which of these and other possible causes of cancer is the greater or greatest risk so that we can weigh the risks and, most importantly, focus our attention on mitigating the worst among the risks.

If a study were to be conducted using the threats I listed above, I’m willing to speculate that the findings would rank Whois fairly low, for the reasons explained in the following table:

| Threat to Personal Data | Risk | Popularity | Data Quality and Value | Reason for ranking |
|---|---|---|---|---|
| Database Breaches | Medium-High | High | High | Databases generally have complete and accurate personal data: contact data as well as account and verification (e.g., CCV) data. Value of ... Difficulty: LOW. A wealth of exploit kits for common (SQL) vulnerabilities can be found in open repositories or on the Dark Web. |
| Malware, Ransomware | High | High | High | Lax security/encryption practices, vulnerable software, and poor configurations expose data to malware attacks and unintentional leaks (e.g., posts to public, not friends). Difficulty: LOW. Malware and ransomware can be downloaded from open repositories, social media pages or the Dark Web. Attack networks can be leased from spam infrastructures (e.g., Avalanche) or botnet herders. |
| Social Engineering Attacks: Doxing, BEC, Phishing, Advance Fee Frauds | High | High | High | Highly sophisticated attackers collect personal data from targeted individuals, businesses or government agencies. Difficulty: HIGH. Doxing and BEC in particular require direct or personal exchanges between attackers and targets, with typically lucrative returns. |
| Disclosure by 3rd parties | Medium-High | Medium-High | Varies | Notice of use or informed consent is poorly practiced; oversight (until GDPR) is poor. Users make poor or uninformed choices based on cost or desirability of service. Difficulty: VARIES. Collection is not difficult, but 3rd parties invest in measures to hide this activity from the public or regulators. |
| Crawling and Search | Medium | High | Varies | Any data that can be indexed can be found using advanced searches and disclosed or misused. Automation (scripts) can collect and extract email addresses or other personal data from millions of web pages with ease. Difficulty: LOW. Automation can be downloaded from open repositories, social media pages or web pages. |
| Traffic or Message Capture | Medium | Medium | Varies | Traffic capture today is sometimes state-actor sponsored and typically a patient and long-term extraction of intelligence. The completeness and accuracy of any personal data collected depends on the targeted environment. Difficulty: MEDIUM. These attacks are malware-driven. Some malware is custom; some is available through open repositories or the Dark Web. |
| Email User Enumeration | Medium | Medium | Medium | These attacks are conducted to generate email address lists, to gather intelligence, or to acquire targets. Financial data and other personal data cannot be collected by this means. The lists are much less valued in underground marketplaces than database breach data. There is a market for such synthesized mailing lists, but the parties who purchase them often end up blocklisted because the sellers often collect spamtrap addresses. Difficulty: LOW. Automation can be downloaded from open repositories, social media pages or web pages. |
| Whois queries | Medium | Low | Poor | Whois point of contact information is often inaccurate. Whois does not contain financial data. The lists are much less valued in underground marketplaces than database breach data. Difficulty: LOW. Automation can be downloaded from open repositories or web pages. Whois rate limiting impedes collection. |
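The speculative ratings in the table can be turned into a rough ordering with a few lines of Python. This is only a sketch under stated assumptions. The numeric scale, the decision to treat "Poor" like "Low", the choice to skip "Varies", and the equal weighting of the three columns are all hypothetical and not part of any study. Only the ratings themselves come from the table.

```python
# Hypothetical ordinal scale. These weights are an assumption for
# illustration, not values taken from any study.
SCALE = {"Low": 1.0, "Poor": 1.0, "Medium": 2.0, "Medium-High": 2.5, "High": 3.0}

# (threat, risk, popularity, data quality and value), as rated in the table.
RATINGS = [
    ("Database Breaches", "Medium-High", "High", "High"),
    ("Malware, Ransomware", "High", "High", "High"),
    ("Social Engineering Attacks", "High", "High", "High"),
    ("Disclosure by 3rd parties", "Medium-High", "Medium-High", "Varies"),
    ("Crawling and Search", "Medium", "High", "Varies"),
    ("Traffic or Message Capture", "Medium", "Medium", "Varies"),
    ("Email User Enumeration", "Medium", "Medium", "Medium"),
    ("Whois queries", "Medium", "Low", "Poor"),
]


def score(labels):
    """Average the numeric values of the labels, skipping 'Varies'."""
    values = [SCALE[label] for label in labels if label in SCALE]
    return sum(values) / len(values)


def rank(ratings):
    """Return (threat, score) pairs ordered from highest to lowest score."""
    scored = ((name, score(labels)) for name, *labels in ratings)
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, value in rank(RATINGS):
        print(f"{value:.2f}  {name}")
```

Under these assumed weights Whois queries land at the bottom of the ordering. A real study would replace the guessed labels with measured data and would need to justify its weighting.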

The FTC study attempted a comparative harvesting analysis in 2002. This was the right context. The FTC asked the right question. Let’s repeat the study, with the caveat that the list of prominent attack vectors should be modernized to the list of today’s popular harvesting methods, and the procedures, measures, and data analysis should be reviewed to assure that the new study is scientific.

A scientific study of this breadth would allow us to assess the composite risk of private data disclosure as well as the individual risk through each of these collection methods. It would give us an opportunity to use data to drive informed consensus policy development for an important issue.