
Why masking exists

VantaPrism’s dataset contains real credentials, session cookies, payment cards, and crypto wallet secrets extracted from compromised machines. So that this data can be used responsibly while still proving that a match is real, every response field is run through a declarative masking engine before it leaves the API. The strategy applied to each field depends on your key’s tier (free / pro / ultra), and the names of the affected fields are reported back to you in meta.masked_fields (see Response Envelope). Masking is applied uniformly: the same rules apply whether you’re calling /v1/search/emails, /v1/data/credentials, or /v1/victims/{victim_id}/credentials.

Masking strategies

| Strategy | What it does | Example |
| --- | --- | --- |
| subnet_mask | Reduces an IPv4 address to its first two octets | 203.0.113.42 → 203.0.x.x |
| partial_login | Partially masks an email/username: either the local part or the domain label, chosen deterministically per value (see below) | jondoe@example.com → jondoe@*******.com, j.doe@acme-corp.com → j.***@acme-corp.com |
| partial_identifier | Shows a short prefix, stars out the rest, preserving length | DESKTOP-3ESCSGH → DE************* |
| bin_last_four | Keeps the card’s 6-digit issuer prefix and last 4 digits, masks the middle | 4111111111111111 → 411111••••••1111 |
| domain_only_url | Strips a URL down to scheme://host/***; path and query are dropped entirely | https://accounts.google.com/signin/v2/identifier → https://accounts.google.com/*** |
| partial_filepath | Reduces a file path to its drive/root + *** | C:\Users\jdoe\Desktop\wallet_seed.txt → C:\*** |
| partial_filename | Masks the filename stem, keeps the extension | wallet_seed.txt → wa*********.txt |
| full_mask | Replaces the value with a fixed placeholder | any → •••• |
| redacted | Replaces the value with "[REDACTED]" | any → [REDACTED] |
| raw | Returned unmodified | |
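
The strategies in the table can be approximated client-side, for example to build test fixtures that look like free-tier output. The following is a minimal sketch, not VantaPrism's implementation: the function names mirror the strategy names, and the two-character visible prefix is an assumption inferred from the examples (DE…, wa…).

```python
import ntpath
from urllib.parse import urlsplit


def subnet_mask(ip: str) -> str:
    """Keep the first two IPv4 octets and replace the last two with 'x'."""
    octets = ip.split(".")
    return ".".join(octets[:2] + ["x", "x"])


def partial_identifier(value: str, keep: int = 2) -> str:
    """Show a short prefix and star out the rest, preserving length.

    keep=2 is an assumption inferred from the documented examples.
    """
    return value[:keep] + "*" * max(len(value) - keep, 0)


def bin_last_four(card_number: str) -> str:
    """Keep the 6-digit issuer prefix and the last 4 digits, mask the middle."""
    digits = "".join(ch for ch in card_number if ch.isdigit())
    if len(digits) < 10:
        # Assumption: too short to show prefix and suffix without overlap.
        return "•" * len(digits)
    return digits[:6] + "•" * (len(digits) - 10) + digits[-4:]


def domain_only_url(url: str) -> str:
    """Reduce a URL to scheme://host/***, dropping path and query."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/***"


def partial_filepath(path: str) -> str:
    """Reduce a path to its drive/root followed by ***."""
    drive, _rest = ntpath.splitdrive(path)
    sep = "\\" if "\\" in path else "/"
    return f"{drive}{sep}***"


def partial_filename(name: str, keep: int = 2) -> str:
    """Mask the filename stem and keep the extension."""
    stem, dot, ext = name.rpartition(".")
    if not dot:
        # No extension: treat the whole name as the stem.
        return partial_identifier(name, keep)
    return f"{partial_identifier(stem, keep)}.{ext}"
```

Applied to the inputs in the Example column, these helpers reproduce the documented outputs. Edge cases such as IPv6 addresses or URLs with ports are not covered by the table and are left unhandled here.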

partial_login variant selection

For emails/usernames, masking either the local part (before @) or the domain label (the part before the TLD) would each leak different information. To prevent a corpus of results from revealing a predictable pattern, the choice is made per-value via an HMAC keyed off a server-side pepper — it cannot be inferred from the output alone, but is stable for the same input across requests:
  • jondoe@example.com → jondoe@*******.com (domain label masked)
  • j.doe@acme-corp.com → j.***@acme-corp.com (local part masked)
  • admin@corp.io → ad***@corp.io (local part masked; the domain label is too short to mask)
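
To illustrate how a keyed, per-value choice can be both stable and unpredictable without the key, here is a minimal sketch using Python's standard hmac module. Everything specific in it is an assumption: the pepper value, the use of SHA-256, deriving the choice from the low bit of the digest, and the minimum domain-label length of 5 (picked only so that "corp" counts as too short). The actual server-side rule is not documented.

```python
import hashlib
import hmac

# Hypothetical pepper; the real one is a server-side secret.
PEPPER = b"example-server-side-pepper"
# Assumed threshold below which the domain label is never masked.
MIN_LABEL_LEN = 5


def pick_variant(value: str, pepper: bytes = PEPPER) -> str:
    """Return 'local' or 'domain' from the low bit of HMAC-SHA256(pepper, value)."""
    digest = hmac.new(pepper, value.encode("utf-8"), hashlib.sha256).digest()
    return "local" if digest[0] & 1 else "domain"


def mask_local(local: str, keep: int = 2) -> str:
    """Keep a short prefix of the local part and star out the rest."""
    return local[:keep] + "*" * max(len(local) - keep, 0)


def partial_login(value: str, pepper: bytes = PEPPER) -> str:
    """Mask either the local part or the domain label of a login."""
    local, at, domain = value.partition("@")
    if not at:
        # Bare username: only the local-part variant is possible.
        return mask_local(local)
    label, _dot, tld = domain.rpartition(".")
    if pick_variant(value, pepper) == "domain" and len(label) >= MIN_LABEL_LEN:
        return f"{local}@{'*' * len(label)}.{tld}"
    return f"{mask_local(local)}@{domain}"
```

Because the choice depends only on the input value and the pepper, repeated calls with the same login always yield the same masked form, while a caller without the pepper cannot predict which variant a new value will receive.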

Per-tier masking matrix

| Field | free | pro | ultra |
| --- | --- | --- | --- |
| ip | subnet_mask | raw | raw |
| password | full_mask | raw | raw |
| value (cookies, autofill) | redacted | raw | raw |
| ftp_password | redacted | raw | raw |
| keychain_value | redacted | redacted | raw |
| card_number | bin_last_four | bin_last_four | bin_last_four |
| login / username | partial_login | raw | raw |
| url | domain_only_url | raw | raw |
| computer_name / user_name / hwid / machine_id | partial_identifier | raw | raw |
| malware_location / original_path | partial_filepath | raw | raw |
| filename | partial_filename | raw | raw |

Notes:
  • card_number is always bin_last_four, on every tier — full card numbers are never returned by the API.
  • keychain_value is only ever raw on ultra — and the view:keychain scope itself is ultra-only (see Authentication).
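
If a client needs to know in advance which strategy to expect for a field, the matrix can be mirrored as a lookup table. This is an illustrative sketch: MASKING_MATRIX and strategy_for are hypothetical names, and the data is copied from the table above rather than fetched from the API.

```python
from typing import Optional

TIERS = ("free", "pro", "ultra")

# Field -> strategy per tier, in the order (free, pro, ultra).
MASKING_MATRIX = {
    "ip": ("subnet_mask", "raw", "raw"),
    "password": ("full_mask", "raw", "raw"),
    "value": ("redacted", "raw", "raw"),
    "ftp_password": ("redacted", "raw", "raw"),
    "keychain_value": ("redacted", "redacted", "raw"),
    "card_number": ("bin_last_four", "bin_last_four", "bin_last_four"),
    "login": ("partial_login", "raw", "raw"),
    "username": ("partial_login", "raw", "raw"),
    "url": ("domain_only_url", "raw", "raw"),
    "computer_name": ("partial_identifier", "raw", "raw"),
    "user_name": ("partial_identifier", "raw", "raw"),
    "hwid": ("partial_identifier", "raw", "raw"),
    "machine_id": ("partial_identifier", "raw", "raw"),
    "malware_location": ("partial_filepath", "raw", "raw"),
    "original_path": ("partial_filepath", "raw", "raw"),
    "filename": ("partial_filename", "raw", "raw"),
}


def strategy_for(field: str, tier: str) -> Optional[str]:
    """Return the documented strategy, or None if the field is not in the matrix."""
    if tier not in TIERS:
        raise ValueError(f"unknown tier: {tier!r}")
    row = MASKING_MATRIX.get(field)
    return row[TIERS.index(tier)] if row else None
```

For example, strategy_for("keychain_value", "pro") gives "redacted", matching the note that keychain values are only raw on ultra. The table describes defaults only; meta.masked_fields remains the authoritative record of what was masked in a given response.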

Boolean-flag fields

Some highly-sensitive fields are never returned as raw strings. Instead, the API returns a boolean has_* flag indicating whether the value is present:

| Raw field | Returned as |
| --- | --- |
| cvv | has_cvv |
| wallet_files | has_wallet_files |
| seed_phrase | has_seed_phrase |
| private_key | has_private_key |
| encrypted_vault | has_encrypted_vault |
| system_password | has_system_password |

These appear in meta.masked_fields using the original field name (e.g. "cvv", not "has_cvv") so you can detect that a flag substitution occurred.
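
A client can use that naming convention to work out which has_* flags stand in for withheld values. The sketch below assumes masked_fields has already been read from the response; FLAG_FIELDS and flag_substitutions are hypothetical names.

```python
from typing import Dict, Iterable

# Raw field name -> boolean flag returned in its place (from the table above).
FLAG_FIELDS = {
    "cvv": "has_cvv",
    "wallet_files": "has_wallet_files",
    "seed_phrase": "has_seed_phrase",
    "private_key": "has_private_key",
    "encrypted_vault": "has_encrypted_vault",
    "system_password": "has_system_password",
}


def flag_substitutions(masked_fields: Iterable[str]) -> Dict[str, str]:
    """Map each flag-substituted raw field in masked_fields to its has_* flag."""
    return {name: FLAG_FIELDS[name] for name in masked_fields if name in FLAG_FIELDS}
```

Given masked_fields of ["login", "cvv"], this returns {"cvv": "has_cvv"}, telling the caller to read has_cvv rather than expect a cvv string.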

The “searched value is never masked” rule

If you search for a specific email, username, IP, or PC name, that exact value is returned unmasked in matching rows, even on the free tier. You already know the value, so masking it back to you adds no protection and only adds noise. Example: calling POST /v1/search/usernames with "usernames": ["DESKTOP-3ESCSGH"] returns rows with computer_name: "DESKTOP-3ESCSGH" shown in full, not "DE*************".

This only applies to an exact match on the value(s) you searched for. A discovered related identifier (e.g. a different login on the same victim, or a subdomain found via include_subdomains) is still masked according to the per-tier masking matrix. For example, searching usernames: ["john"] and getting back a row with login: "john@gmail.com" still masks that login on the free tier, because "john@gmail.com" is discovered information, not the literal value you supplied.
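
The exemption amounts to an exact-membership check before masking. The following sketch shows the idea under two assumptions that the text does not confirm: that the comparison is a case-sensitive string equality, and that a single masking function is applied otherwise. The names mask_unless_searched and star_tail are hypothetical.

```python
from typing import Callable, Iterable


def star_tail(value: str, keep: int = 2) -> str:
    """Stand-in masking function: keep a short prefix, star out the rest."""
    return value[:keep] + "*" * max(len(value) - keep, 0)


def mask_unless_searched(
    value: str,
    searched_values: Iterable[str],
    mask: Callable[[str], str] = star_tail,
) -> str:
    """Return value unmasked only if it exactly equals a searched value."""
    if value in set(searched_values):
        return value
    return mask(value)
```

With searched_values of ["john"], the value "john@gmail.com" is still masked because it merely contains the search term, whereas "DESKTOP-3ESCSGH" comes back in full when it was itself the search term.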

Discovered subdomains

POST /v1/domain/search with include_subdomains: true can surface subdomains you didn’t ask about. On the free tier, a discovered subdomain’s label is fully masked (***.acme-corp.com) — proving a hit exists without handing over the specific subdomain name. The registrable domain you searched for (acme-corp.com) is always shown in full.
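
As a rough illustration of that rule, the sketch below masks everything to the left of the searched registrable domain. It assumes that multi-level subdomains collapse to a single *** label and that hosts outside the searched domain are left untouched; neither case is specified above. The function name is hypothetical.

```python
def mask_discovered_subdomain(host: str, searched_domain: str) -> str:
    """Replace a discovered subdomain's label(s) with ***, keeping the searched domain."""
    if host == searched_domain:
        # The domain you searched for is always shown in full.
        return host
    suffix = "." + searched_domain
    if host.endswith(suffix):
        return "***" + suffix
    # Not under the searched domain: out of scope for this rule.
    return host
```

For a search on acme-corp.com, a discovered host such as vpn.acme-corp.com becomes ***.acme-corp.com, while acme-corp.com itself is returned unchanged.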

Checking what was masked

Every response’s meta.masked_fields array lists exactly which field names in data were masked or flag-substituted for this request:
{
  "data": [ { "login": "jondoe@*******.com", "password": "••••", "ip": "203.0.x.x" } ],
  "meta": {
    "request_id": "req_01HZXK3Q7N8YV6F3M2P9JABCDE",
    "took_ms": 24.0,
    "tier": "free",
    "masked_fields": ["login", "password", "ip"]
  }
}
If masked_fields is empty or absent, nothing in data was masked for this request.
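
On the client side, this check is a small lookup against the response envelope. The sketch below parses the example response shown above and treats a missing or empty masked_fields as "nothing masked"; the helper names are hypothetical, and no HTTP call is made.

```python
import json

SAMPLE_RESPONSE = json.loads("""
{
  "data": [ { "login": "jondoe@*******.com", "password": "••••", "ip": "203.0.x.x" } ],
  "meta": {
    "request_id": "req_01HZXK3Q7N8YV6F3M2P9JABCDE",
    "took_ms": 24.0,
    "tier": "free",
    "masked_fields": ["login", "password", "ip"]
  }
}
""")


def masked_field_names(response: dict) -> set:
    """Return the set of masked field names; empty if the key is absent or empty."""
    meta = response.get("meta") or {}
    return set(meta.get("masked_fields") or [])


def is_masked(response: dict, field: str) -> bool:
    """True if the given field in data was masked or flag-substituted."""
    return field in masked_field_names(response)
```

Checking is_masked(SAMPLE_RESPONSE, "password") before displaying or comparing a value avoids mistaking a placeholder such as •••• for real data.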