Data Masking & Privacy - VantaPrism API

Why masking exists

VantaPrism’s dataset contains real credentials, session cookies, payment cards, and crypto wallet secrets extracted from compromised machines. So that this data can be used responsibly while still proving that a match is real, every response field passes through a declarative masking engine before it leaves the API. The strategy applied to each field depends on your key’s tier (free / pro / ultra), and the names of the affected fields are reported back to you in meta.masked_fields (see Response Envelope). Masking is applied uniformly: the same rules apply whether you’re calling /v1/search/emails, /v1/data/credentials, or /v1/victims/{victim_id}/credentials.

Masking strategies

Strategy	What it does	Example
`subnet_mask`	Reduces an IPv4 address to its first two octets	`203.0.113.42` → `203.0.x.x`
`partial_login`	Partially masks an email/username: either the local part or the domain label, chosen deterministically per value (see below)	`jondoe@example.com` → `jondoe@*******.com`, `j.doe@acme-corp.com` → `j.***@acme-corp.com`
`partial_identifier`	Shows a short prefix, stars out the rest, preserving length	`DESKTOP-3ESCSGH` → `DE*************`
`bin_last_four`	Keeps the card’s 6-digit issuer prefix and last 4 digits, masks the middle	`4111111111111111` → `411111••••••1111`
`domain_only_url`	Strips a URL down to `scheme://host/***` — path and query are dropped entirely	`https://accounts.google.com/signin/v2/identifier` → `https://accounts.google.com/***`
`partial_filepath`	Reduces a file path to its drive/root + `***`	`C:\Users\jdoe\Desktop\wallet_seed.txt` → `C:\***`
`partial_filename`	Masks the filename stem, keeps the extension	`wallet_seed.txt` → `wa*********.txt`
`full_mask`	Replaces the value with a fixed placeholder	any → `••••`
`redacted`	Replaces the value with `"[REDACTED]"`	any → `[REDACTED]`
`raw`	Returned unmodified	—
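
For building test fixtures or recognizing masked values on the client side, several of the strategies can be approximated as below. This is a minimal sketch inferred only from the examples in the table, not the engine's actual implementation: the function names, the two-character prefix, and the edge-case handling are all assumptions.

```python
from urllib.parse import urlsplit


def subnet_mask(ip: str) -> str:
    """Keep the first two octets of an IPv4 address."""
    octets = ip.split(".")
    return ".".join(octets[:2] + ["x", "x"])


def partial_identifier(value: str, keep: int = 2) -> str:
    """Show a short prefix and star out the rest, preserving length.

    keep=2 is an assumption taken from the DESKTOP-3ESCSGH example.
    """
    return value[:keep] + "*" * max(len(value) - keep, 0)


def bin_last_four(card_number: str) -> str:
    """Keep the 6-digit issuer prefix and last 4 digits, mask the middle.

    Assumes the input has at least 10 digits.
    """
    middle = max(len(card_number) - 10, 0)
    return card_number[:6] + "•" * middle + card_number[-4:]


def domain_only_url(url: str) -> str:
    """Reduce a URL to scheme://host/***, dropping path and query."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.hostname}/***"


def partial_filename(filename: str, keep: int = 2) -> str:
    """Mask the filename stem and keep the extension."""
    stem, dot, ext = filename.rpartition(".")
    if not dot:  # no extension: treat the whole name as the stem
        return partial_identifier(filename, keep)
    return partial_identifier(stem, keep) + dot + ext
```

Each function reproduces the corresponding example from the table; behavior on inputs the table does not cover (IPv6 addresses, URLs with ports, dotfiles) is undefined here.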

`partial_login` variant selection

For emails/usernames, masking either the local part (before @) or the domain label (the part before the TLD) would each leak different information. To prevent a corpus of results from revealing a predictable pattern, the choice is made per-value via an HMAC keyed off a server-side pepper — it cannot be inferred from the output alone, but is stable for the same input across requests:

jondoe@example.com → jondoe@*******.com (domain label masked)
j.doe@acme-corp.com → j.***@acme-corp.com (local part masked)
admin@corp.io → ad***@corp.io (local part masked — too short to mask the domain)
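
The selection described above could work roughly as follows. This is an illustrative sketch only: the pepper value, the use of SHA-256, the bit that decides the variant, and the minimum domain-label length are all hypothetical, so the variant chosen here will not necessarily match what the API returns for a given value.

```python
import hashlib
import hmac

# Hypothetical values: the real pepper is a server-side secret, and the
# minimum label length is a guess based on the admin@corp.io example.
PEPPER = b"example-pepper-not-the-real-one"
MIN_DOMAIN_LABEL_LEN = 5


def _mask_local(local: str) -> str:
    """Keep two characters of the local part and star out the rest."""
    return local[:2] + "*" * max(len(local) - 2, 0)


def choose_variant(login: str, pepper: bytes = PEPPER) -> str:
    """Return 'local' or 'domain'; stable for a given login and pepper."""
    _, _, domain = login.partition("@")
    label = domain.split(".")[0]
    if len(label) < MIN_DOMAIN_LABEL_LEN:
        return "local"  # domain label too short to mask
    digest = hmac.new(pepper, login.lower().encode("utf-8"), hashlib.sha256).digest()
    # Assumed rule: the lowest bit of the first digest byte picks the variant.
    return "domain" if digest[0] & 1 else "local"


def partial_login(login: str, pepper: bytes = PEPPER) -> str:
    local, at, domain = login.partition("@")
    if not at:  # bare username: only the local-part style applies
        return _mask_local(local)
    if choose_variant(login, pepper) == "domain":
        label, dot, rest = domain.partition(".")
        return f"{local}@{'*' * len(label)}{dot}{rest}"
    return f"{_mask_local(local)}@{domain}"
```

Because the decision depends on a keyed hash of the full value, two calls with the same input always agree, while an observer without the pepper cannot predict which part will be hidden for a new value.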

Per-tier masking matrix

Field	`free`	`pro`	`ultra`
`ip`	`subnet_mask`	raw	raw
`password`	`full_mask`	raw	raw
`value` (cookies, autofill)	`redacted`	raw	raw
`ftp_password`	`redacted`	raw	raw
`keychain_value`	`redacted`	`redacted`	raw
`card_number`	`bin_last_four`	`bin_last_four`	`bin_last_four`
`login` / `username`	`partial_login`	raw	raw
`url`	`domain_only_url`	raw	raw
`computer_name` / `user_name` / `hwid` / `machine_id`	`partial_identifier`	raw	raw
`malware_location` / `original_path`	`partial_filepath`	raw	raw
`filename`	`partial_filename`	raw	raw

Notes:

card_number is always bin_last_four, on every tier — full card numbers are never returned by the API.
keychain_value is only ever raw on ultra — and the view:keychain scope itself is ultra-only (see Authentication).
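
A client that wants to predict how a field will look on a given tier can encode the matrix as a lookup table. The sketch below copies the rows from the matrix as written; the names MASKING_MATRIX and strategy_for are hypothetical, and returning None for unlisted fields is an assumption, since the matrix does not say how other fields are treated.

```python
from typing import Optional

TIERS = ("free", "pro", "ultra")

# field -> strategy per tier, in (free, pro, ultra) order
MASKING_MATRIX = {
    "ip": ("subnet_mask", "raw", "raw"),
    "password": ("full_mask", "raw", "raw"),
    "value": ("redacted", "raw", "raw"),
    "ftp_password": ("redacted", "raw", "raw"),
    "keychain_value": ("redacted", "redacted", "raw"),
    "card_number": ("bin_last_four", "bin_last_four", "bin_last_four"),
    "login": ("partial_login", "raw", "raw"),
    "username": ("partial_login", "raw", "raw"),
    "url": ("domain_only_url", "raw", "raw"),
    "computer_name": ("partial_identifier", "raw", "raw"),
    "user_name": ("partial_identifier", "raw", "raw"),
    "hwid": ("partial_identifier", "raw", "raw"),
    "machine_id": ("partial_identifier", "raw", "raw"),
    "malware_location": ("partial_filepath", "raw", "raw"),
    "original_path": ("partial_filepath", "raw", "raw"),
    "filename": ("partial_filename", "raw", "raw"),
}


def strategy_for(field: str, tier: str) -> Optional[str]:
    """Look up the masking strategy for a field on a tier.

    Returns None for fields the matrix does not list.
    """
    row = MASKING_MATRIX.get(field)
    if row is None:
        return None
    return row[TIERS.index(tier)]
```

For example, strategy_for("keychain_value", "pro") yields "redacted", and strategy_for("card_number", "ultra") yields "bin_last_four".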

Boolean-flag fields

Some highly-sensitive fields are never returned as raw strings — instead the API returns a boolean has_* flag indicating whether the value is present:

Raw field	Returned as
`cvv`	`has_cvv`
`wallet_files`	`has_wallet_files`
`seed_phrase`	`has_seed_phrase`
`private_key`	`has_private_key`
`encrypted_vault`	`has_encrypted_vault`
`system_password`	`has_system_password`

These appear in meta.masked_fields using the original field name (e.g. "cvv", not "has_cvv") so you can detect that a flag substitution occurred.
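
The substitution can be pictured as a small transformation over each row. The following is a sketch under stated assumptions, not the server's code: substitute_flags is a hypothetical name, "present" is taken to mean a truthy value, and a flag is emitted only when the raw key exists in the row.

```python
# Raw field names that are replaced by has_* booleans, per the table above.
FLAG_FIELDS = (
    "cvv",
    "wallet_files",
    "seed_phrase",
    "private_key",
    "encrypted_vault",
    "system_password",
)


def substitute_flags(row: dict) -> tuple[dict, list[str]]:
    """Replace sensitive raw fields with has_* flags.

    Returns the rewritten row and the list of original field names that
    were substituted, which is what would appear in meta.masked_fields.
    """
    out = {}
    masked = []
    for key, value in row.items():
        if key in FLAG_FIELDS:
            out[f"has_{key}"] = bool(value)  # assumption: truthy means present
            masked.append(key)
        else:
            out[key] = value
    return out, masked
```

Given a row containing "cvv": "123", the result contains "has_cvv": True and the substituted-fields list contains "cvv", matching the naming rule described above.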

The “searched value is never masked” rule

If you search for a specific email, username, IP, or PC name, that exact value is returned unmasked in matching rows, even on the free tier. You already know it, so masking it back to you adds no protection and only adds noise. Example: a POST /v1/search/usernames request with "usernames": ["DESKTOP-3ESCSGH"] returns rows with computer_name: "DESKTOP-3ESCSGH" in full, not "DE*************".

This applies only to an exact match on the value(s) you searched for. A discovered related identifier (e.g. a different login on the same victim, or a subdomain found via include_subdomains) is still masked according to the per-tier masking matrix. For example, searching usernames: ["john"] and getting back a row with login: "john@gmail.com" still masks that login on the free tier, because "john@gmail.com" is discovered information, not the literal value you supplied.
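
The exemption amounts to an exact-match check before masking. In the sketch below, mask_unless_searched and stars are hypothetical names, stars is only a stand-in masker, and the comparison is assumed to be case-sensitive, which the documentation does not specify.

```python
from typing import Callable, Iterable


def stars(value: str) -> str:
    """Stand-in masker: keep two characters, star out the rest."""
    return value[:2] + "*" * max(len(value) - 2, 0)


def mask_unless_searched(
    value: str,
    searched: Iterable[str],
    mask: Callable[[str], str] = stars,
) -> str:
    """Return the value unmasked only if it exactly equals a searched term."""
    if value in set(searched):
        return value
    return mask(value)


# The searched PC name comes back in full ...
print(mask_unless_searched("DESKTOP-3ESCSGH", ["DESKTOP-3ESCSGH"]))
# ... while a discovered login that merely contains the term is still masked.
print(mask_unless_searched("john@gmail.com", ["john"]))
```

The second call is masked because "john@gmail.com" is not equal to "john"; a substring or prefix match does not qualify.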

Discovered subdomains

POST /v1/domain/search with include_subdomains: true can surface subdomains you didn’t ask about. On the free tier, a discovered subdomain’s label is fully masked (***.acme-corp.com) — proving a hit exists without handing over the specific subdomain name. The registrable domain you searched for (acme-corp.com) is always shown in full.
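
As an illustration of the free-tier behavior, the sketch below collapses everything to the left of the searched domain into a single `***` label. The function name is hypothetical, and treating multi-level subdomains (a.b.acme-corp.com) the same way as single-level ones is an assumption.

```python
def mask_discovered_host(host: str, searched_domain: str) -> str:
    """Mask the subdomain label(s) of a discovered host on the free tier.

    The searched registrable domain itself is returned unchanged.
    """
    host = host.lower().rstrip(".")
    searched_domain = searched_domain.lower().rstrip(".")
    if host == searched_domain:
        return host
    suffix = "." + searched_domain
    if host.endswith(suffix):
        return "***" + suffix
    raise ValueError(f"{host!r} is not under {searched_domain!r}")
```

Calling mask_discovered_host("vpn.acme-corp.com", "acme-corp.com") gives "***.acme-corp.com", while the searched domain passes through as "acme-corp.com".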

Checking what was masked

Every response’s meta.masked_fields array lists exactly which field names in data were masked or flag-substituted for this request:

{
  "data": [ { "login": "jondoe@*******.com", "password": "••••", "ip": "203.0.x.x" } ],
  "meta": {
    "request_id": "req_01HZXK3Q7N8YV6F3M2P9JABCDE",
    "took_ms": 24.0,
    "tier": "free",
    "masked_fields": ["login", "password", "ip"]
  }
}

If masked_fields is empty or absent, nothing in data was masked for this request.
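
On the client side, this check reduces to reading the array defensively. The helper below is a minimal sketch that operates on an already-parsed response body; the function names are hypothetical, and the sample payload is the one shown above.

```python
import json


def masked_fields(response: dict) -> set[str]:
    """Return the set of masked field names; empty if the key is missing."""
    meta = response.get("meta") or {}
    return set(meta.get("masked_fields") or [])


def is_masked(response: dict, field: str) -> bool:
    """True if the given field was masked or flag-substituted."""
    return field in masked_fields(response)


sample = json.loads("""
{
  "data": [ { "login": "jondoe@*******.com", "password": "••••", "ip": "203.0.x.x" } ],
  "meta": {
    "request_id": "req_01HZXK3Q7N8YV6F3M2P9JABCDE",
    "took_ms": 24.0,
    "tier": "free",
    "masked_fields": ["login", "password", "ip"]
  }
}
""")

if is_masked(sample, "password"):
    print("password is masked on tier", sample["meta"]["tier"])
```

Treating a missing or empty array the same way keeps the caller's logic identical for fully raw responses.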
