Advanced Data Pilfering (ADP)¶

Release Date: October 10, 2025

Product: NodeZero (Internal Pentesting - Elite SKU)

Overview¶

Advanced Data Pilfering (ADP) extends NodeZero's autonomous attack engine to identify and contextualize sensitive business data encountered during internal operations, including phishing and insider threat scenarios. It combines:

- Advanced Credential Pilfering (ACP)
- GenAI-powered Data Risk Inference

Findings are auto-tagged with Business Risk categories, prioritized with context scoring, and displayed across the UI, reports, and attack graphs.

How To Use¶

Step 1: While creating an internal or phishing pentest, or an insider threat test, scroll to the Advanced Data Pilfering section:

Step 2: Choose whether to enable Advanced Credential Pilfering, which uses advanced semantic reasoning to detect and extract credentials from various data sources. These credentials are then used to move laterally and escalate privileges during the attack.

Note

Scanning extended domain users is turned off by default when this option is enabled.

Step 3: Choose a Data Risk Inference option to automatically identify business risks (such as exposed personal data, financial reports, or confidential strategy documents) and show the potential impact if they were compromised:
- Disabled: Turns off Data Risk Inference entirely.
- Metadata Only: Sends only file metadata (names, sizes, hashes). This is the fastest option, but may result in fewer findings.
- Full Inference Data: Sends small snippets of text from files that appear business-sensitive, enabling deeper and more accurate risk inference.
Step 4: Run the test.
Step 5: Once the test completes, you can view the results in any of these existing NodeZero workflows:
- Impacts Tab: New Business Risk column.
- Weaknesses Tab: Same column linking to weak points.
- Data Tab: Filter by Business Risk tags.
- Sankey View: Shows flow of sensitive data.
- Attack Graph: Includes a Business Risk node.
- Reports: Executive Summary highlights exposures.
- APIs / MCP: Export tags and GenAI Action Logs.
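
The three Data Risk Inference options in Step 3 differ mainly in what is sent for analysis. The following Python sketch illustrates that distinction only. The names `InferenceMode`, `FileRecord`, and `build_payload`, and the 200-character snippet cap, are hypothetical and are not part of the NodeZero product or API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class InferenceMode(Enum):
    """Hypothetical names mirroring the three options in Step 3."""
    DISABLED = "disabled"
    METADATA_ONLY = "metadata_only"
    FULL_INFERENCE_DATA = "full_inference_data"


@dataclass
class FileRecord:
    name: str
    size: int
    sha256: str
    text: str  # file content as encountered during the test


def build_payload(record: FileRecord, mode: InferenceMode,
                  max_snippet_chars: int = 200) -> Optional[dict]:
    """Return what would be sent for inference, or None if nothing is sent."""
    if mode is InferenceMode.DISABLED:
        return None
    # Metadata (name, size, hash) is included in both remaining modes.
    payload = {"name": record.name, "size": record.size, "sha256": record.sha256}
    if mode is InferenceMode.FULL_INFERENCE_DATA:
        # Assumed cap: a short snippet only, never the whole file.
        payload["snippet"] = record.text[:max_snippet_chars]
    return payload
```

Under these assumptions, Metadata Only never includes file text, which is consistent with it being the fastest option and potentially producing fewer findings.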

Why It Matters¶

The Problem¶

Security teams routinely uncover data during pentests (files, credentials, blobs) but struggle to determine what's actually risky. Traditional tools (e.g., DLP, DSPM):

- Generate noise with regex-based detection.
- Lack attacker context.
- Fail to link exposures to business consequences.

The Value of ADP¶

ADP closes this gap by:

- Tagging files with business risk context (e.g., Material Financials, Trade Secrets).
- Prioritizing high-impact findings using GenAI analysis.
- Making every inference explainable through Action Logs.

"We don't just show you what was found - we show you why it matters."

Key Differentiators¶

Feature	Traditional Tools (DLP/DSPM)	ADP (NodeZero)
Detection Context	Regex / Keywords	Attacker-contextual (real operations)
Prioritization	Manual, noise-prone	GenAI-driven, context-aware
Explainability	Limited	Fully logged, transparent
Data Usage	Data stored/indexed	Runtime only, not used for training
Business Risk Mapping	Broad classification	Precise, executive-relevant tagging

How It Works¶

ADP is enabled by default during internal pentests. No configuration or tuning is needed.

Runtime Flow¶

Advanced Credential Pilfering (ACP) → Extracts credentials from reachable systems, files, shares.
Data Risk Inference (GenAI) → Evaluates content using an LLM to determine business significance.
Tagging → Applies Business Risk categories like:
- Operational Disruption
- Critical System Shutdown
- Revenue Interruption
- Software Delivery Disruption
- Supply Chain Breakdown
- Leak of Sensitive Communications
- Executive Fraud & Impersonation
- Regulatory Breach Penalties
- Theft of IP & R&D
- Unauthorized Physical Access
- Brand Impersonation & Takeover
Context Scoring → Adds a +2 score boost to business-risk findings to elevate them in dashboards and reports.
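
The context scoring step can be pictured with a small Python sketch. It assumes the +2 boost is additive, applied once per finding regardless of how many Business Risk tags it carries, and uncapped. None of that is stated in this document. `Finding`, `context_score`, and `prioritize` are hypothetical names used for illustration only.

```python
from dataclasses import dataclass, field

BUSINESS_RISK_BOOST = 2  # the "+2" described in the runtime flow


@dataclass
class Finding:
    title: str
    base_score: float
    business_risks: list[str] = field(default_factory=list)


def context_score(finding: Finding) -> float:
    """Add the boost once if the finding carries any Business Risk tag."""
    boost = BUSINESS_RISK_BOOST if finding.business_risks else 0
    return finding.base_score + boost


def prioritize(findings: list[Finding]) -> list[Finding]:
    """Order findings from highest to lowest context score."""
    return sorted(findings, key=context_score, reverse=True)
```

With these assumptions, a tagged finding with a base score of 5.0 would rank above an untagged finding with a base score of 6.0, which matches the stated intent of elevating business-risk findings.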

Example Use Cases¶

- Hidden VPN Credentials in contracts ➝ Tagged as Credentials + Legal Risk
- Unreleased Earnings Forecasts ➝ Tagged as Material Financials
- API Keys in Git Repo ➝ Tagged as Source Code Exposure
- Defense Industrial Designs ➝ Tagged as Trade Secrets

Business Risk Categories¶

ADP currently supports 12 categories of risk. The summary below combines the two Employee Data / HR mappings into a single row:

Auto-Tag	Business Risk	Real-World Example
Sensitive Communications	Reputational Damage	Sony Pictures 2014
Health Data (PHI)	Regulatory Penalties	Anthem 2015
Personal Data (SSNs, etc.)	Regulatory Penalties	Equifax 2017
Employee Data / HR	Regulatory + Reputational Risk	Sony Pictures HR
Source Code	Software Delivery Disruption	SolarWinds 2020
IP or R&D	Theft of Intellectual Property	PLA Indictment 2014
Manufacturing Data	Theft of Production Techniques	U.S. Steel / PLA
Alarm / Security Systems	Unauthorized Physical Access	Verkada 2021
Customer Data	Regulatory Penalties	Target 2013
Financial Data	Regulatory Penalties	Equifax 2017
Database Schemas / Config Files	Operational Disruption	CloudNordic 2023

GenAI Architecture¶

Model Hosting¶

All LLM calls are containerized within Horizon3.ai's AWS infrastructure.

Data Flow¶

Component	Data Sent to Model
Model Hosting	AWS Bedrock (Llama 4 Maverick)
Credential Extraction	Small text snippets
Repo Risk Detection	Metadata only or up to 10 file snippets
Database Schema Analysis	Table/Column names only (no data rows)

All data is filtered, minimal, and not used for training.
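
The limits in the table above amount to data minimization before any model call. The Python sketch below shows one way such filtering could look. The function names, the input shapes, and the 200-character snippet cap are assumptions made for illustration, not Horizon3.ai's implementation. Only the "table/column names only" and "up to 10 file snippets" limits come from this document.

```python
from typing import Dict, List

MAX_REPO_SNIPPETS = 10   # "up to 10 file snippets" per the table above
MAX_SNIPPET_CHARS = 200  # assumed cap; the source only says "small"


def schema_payload(schema: Dict[str, Dict[str, list]]) -> Dict[str, List[str]]:
    """Keep table and column names only; row values never enter the payload."""
    return {table: list(columns) for table, columns in schema.items()}


def repo_payload(files: Dict[str, str], metadata_only: bool) -> dict:
    """Send file paths alone, or paths plus a capped number of short snippets."""
    payload: dict = {"paths": list(files)}
    if not metadata_only:
        selected = list(files.values())[:MAX_REPO_SNIPPETS]
        payload["snippets"] = [text[:MAX_SNIPPET_CHARS] for text in selected]
    return payload
```

In this sketch, `schema` maps each table name to a mapping of column name to row values, and `files` maps a file path to its text. Both shapes are hypothetical.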

Prompt Management¶

- Custom prompts created by Horizon3.ai.
- Model outputs are advisory only - no autonomous actions.

How GenAI Is Integrated and Secured¶

Advanced Data Pilfering (ADP) leverages AWS Bedrock, a fully managed foundation model platform provided by Amazon. Specifically, NodeZero uses the Llama 4 Maverick model to perform semantic reasoning over structured metadata and sensitive content (depending on the feature).

The architecture is designed for security, explainability, and data isolation:

- A dedicated container runs inside the NodeZero Kubernetes cluster, sitting adjacent to the Core service.
- This container is responsible for communicating with AWS Bedrock and sending data for inference.
- The system analyzes short file content snippets that appear business-sensitive.

Importantly:

- No data is stored or used for training, by either Horizon3.ai or AWS.
- AWS Bedrock provides strong data isolation guarantees.
- The amount and type of data passed to the model is strictly controlled via configuration, ensuring minimal, targeted input for each use case.

For more on AWS foundation models, see: https://aws.amazon.com/what-is/foundation-models/

Key Capabilities¶

What ADP Is Designed For¶

- Highlighting business risk from attacker-discoverable data.
- Providing explainable GenAI tagging with full transparency.
- Prioritizing findings based on real attack paths, not theoretical access.
- Requiring no manual tuning or rule writing: just run the test.
- Making risk obvious across the UI, reports, and API.

What ADP Does Not Do¶

- ADP is not a traditional DLP or DSPM tool. It doesn't crawl all data.
- It doesn't perform full share scanning or file indexing.
- ADP does not replace compliance archiving or governance workflows.
- It focuses strictly on attacker-accessible data during live operations.

Business Risk Taxonomy¶

ADP maps discovered data to a defined set of Business Risk categories using GenAI reasoning. These mappings help security teams communicate clearly with executives, auditors, and remediation teams.

Data Auto-Tag	Business Risk Name	Impact Rationale
Strategic Business Communications	Leak of Sensitive Communications	Leaked strategic communications create immediate reputational damage and stakeholder confidence erosion through exposure of internal debates, personnel issues, and sensitive business discussions. The 2014 Sony Pictures breach is a key example.
Health Data	Regulatory Breach Penalties	Breaches involving PHI trigger HIPAA, HITECH, and GDPR Article 9 violations. The 2015 Anthem breach (78.8M records) led to a $16M HIPAA settlement and regulatory oversight.
Personal Data	Regulatory Breach Penalties	Exposure of SSNs, addresses, or phone numbers triggers GDPR, CCPA, and state notification laws. The 2017 Equifax breach affected 147M people and led to a $575M settlement.
Employee Data / HR	Regulatory Breach Penalties	Exposure of employee data (SSNs, payroll, benefits) results in dual regulatory exposure (GDPR, HIPAA). The Anthem breach again is illustrative.
Source Code	Software Delivery Disruption	Compromised source code or CI/CD pipelines allow attackers to inject malicious code or disrupt delivery. Example: 2020 SolarWinds supply chain attack.
Intellectual Property	Theft of IP & R&D	Theft enables replication of proprietary innovations. 2014 Chinese PLA indictments showed IP theft from Westinghouse and U.S. Steel.
Manufacturing / Production	Theft of IP & R&D	Theft of proprietary formulas or processes enables global competition to replicate and undercut pricing. Again seen in the 2014 PLA indictments.
Employee Data / HR	Leak of Sensitive Communications	HR data (reviews, salaries, layoffs) leaks cause reputational and legal exposure. The Sony breach revealed exec salaries and performance reviews.
Physical Security / Surveillance / Alarm Systems	Unauthorized Physical Access	Compromised surveillance or alarm systems grant unauthorized access. 2021 Verkada breach gave access to 150,000 live camera feeds.
Customer Data	Regulatory Breach Penalties	Regulated customer data triggers GDPR, PCI-DSS, CCPA, leading to fines and audits. Example: 2013 Target breach ($18M in settlements).
Financial Data	Regulatory Breach Penalties	Financial data breaches invoke PCI-DSS, SOX, GLBA rules. The Equifax breach again illustrates regulatory consequences.
Digital Infrastructure	Operational Disruption	Attackers can misconfigure or corrupt infrastructure (configs, routes). The 2023 CloudNordic incident caused massive service outages due to ransomware and config issues.
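
For readers consuming tags through the API or reports, the taxonomy above can be treated as a lookup from auto-tag to business risk. The Python sketch below copies the tag and risk names from the table. The structure and the helper `business_risks` are hypothetical and do not reflect NodeZero's internal representation or API. One auto-tag (Employee Data / HR) maps to two risks, so the lookup holds a list per tag.

```python
from collections import defaultdict

# (auto-tag, business risk) pairs copied from the taxonomy table above.
TAXONOMY = [
    ("Strategic Business Communications", "Leak of Sensitive Communications"),
    ("Health Data", "Regulatory Breach Penalties"),
    ("Personal Data", "Regulatory Breach Penalties"),
    ("Employee Data / HR", "Regulatory Breach Penalties"),
    ("Source Code", "Software Delivery Disruption"),
    ("Intellectual Property", "Theft of IP & R&D"),
    ("Manufacturing / Production", "Theft of IP & R&D"),
    ("Employee Data / HR", "Leak of Sensitive Communications"),
    ("Physical Security / Surveillance / Alarm Systems", "Unauthorized Physical Access"),
    ("Customer Data", "Regulatory Breach Penalties"),
    ("Financial Data", "Regulatory Breach Penalties"),
    ("Digital Infrastructure", "Operational Disruption"),
]


def build_lookup(rows):
    """Group risks by auto-tag, preserving table order and skipping repeats."""
    lookup = defaultdict(list)
    for tag, risk in rows:
        if risk not in lookup[tag]:
            lookup[tag].append(risk)
    return dict(lookup)


RISKS_BY_TAG = build_lookup(TAXONOMY)


def business_risks(tags):
    """Return the de-duplicated business risks for a finding's auto-tags."""
    risks = []
    for tag in tags:
        for risk in RISKS_BY_TAG.get(tag, []):
            if risk not in risks:
                risks.append(risk)
    return risks
```

A finding tagged with both Health Data and Customer Data would resolve to a single risk, Regulatory Breach Penalties, while an unrecognized tag resolves to an empty list.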

FAQs¶

Q: Which operations support ADP? A: Internal only - phishing, insider, internal lateral movement.

Q: Is it included in all SKUs? A: No - ADP is part of the Elite SKU only.

Q: Is it on by default? A: Yes - both ACP and Data Risk Inference are default ON.

Q: Is customer data used for training? A: No - data is used only at runtime. Inferences are logged and explainable.

Q: Does this replace DLP/DSPM? A: No - ADP finds attacker-reachable risk, not compliance-level coverage.