Advanced Data Pilfering (ADP)
Release Date: October 10, 2025
Product: NodeZero (Internal Pentesting - Elite SKU)
Overview
Advanced Data Pilfering (ADP) extends NodeZero's autonomous attack engine to identify and contextualize sensitive business data encountered during internal operations-including phishing and insider threat scenarios. It combines:
- Advanced Credential Pilfering (ACP)
- GenAI-powered Data Risk Inference
Findings are auto-tagged with Business Risk categories, prioritized with context scoring, and displayed across the UI, reports, and attack graphs.
How To Use
- Step 1: While creating an internal or phishing pentest, or an insider threat test, scroll to the Advanced Data Pilfering section:
- Step 2: Choose whether to enable Advanced Credential Pilfering, which uses advanced semantic reasoning to detect and extract credentials from various data sources. These credentials are then used to move laterally and escalate privileges during the attack.
Note
Scanning extended domain users is turned off by default when this option is enabled.
- Step 3: Choose a Data Risk Inference option to automatically identify business risks - such as exposed personal data, financial reports, or confidential strategy documents, and show the potential impact if they were compromised:
- Disabled: Turns off Data Risk Inference entirely.
- Metadata Only: Sends only file metadata (names, sizes, hashes). This is the fastest option, but may result in fewer findings.
- Full Inference Data: Sends small snippets of text from files that appear business‑sensitive, enabling deeper and more accurate risk inference.
- Step 4: Run the test.
- Step 5: Once the test completes, you can view the results in any of these existing NodeZero workflows:
- Impacts Tab: New
Business Risk
column. - Weaknesses Tab: Same column linking to weak points.
- Data Tab: Filter by Business Risk tags.
- Sankey View: Shows flow of sensitive data.
- Attack Graph: Includes a Business Risk node.
- Reports: Executive Summary highlights exposures.
- APIs / MCP: Export tags and GenAI Action Logs.
- Impacts Tab: New
Why It Matters
The Problem
Security teams routinely uncover data during pentests-files, credentials, blobs-but struggle to determine what's actually risky. Traditional tools (e.g., DLP, DSPM):
- Generate noise with regex-based detection.
- Lack attacker context.
- Fail to link exposures to business consequences.
The Value of ADP
ADP closes this gap by:
- Tagging files with business risk context (e.g., Material Financials, Trade Secrets).
- Prioritizing high-impact findings using GenAI analysis.
- Making every inference explainable through Action Logs.
"We don't just show you what was found - we show you why it matters."
Key Differentiators
Feature | Traditional Tools (DLP/DSPM) | ADP (NodeZero) |
---|---|---|
Detection Context | Regex / Keywords | Attacker-contextual (real operations) |
Prioritization | Manual, noise-prone | GenAI-driven, context-aware |
Explainability | Limited | Fully logged, transparent |
Data Usage | Data stored/indexed | Runtime only, not used for training |
Business Risk Mapping | Broad classification | Precise, executive-relevant tagging |
How It Works
ADP is enabled by default during internal pentests. No configuration or tuning needed.
Runtime Flow
-
Advanced Credential Pilfering (ACP) → Extracts credentials from reachable systems, files, shares.
-
Data Risk Inference (GenAI) → Evaluates content using LLM to determine business significance.
-
Tagging → Applies Business Risk categories like:
- Operational Disruption
- Critical System Shutdown
- Revenue Interruption
- Software Delivery Disruption
- Supply Chain Breakdown
- Leak of Sensitive Communications
- Executive Fraud & Impersonation
- Regulatory Breach Penalties
- Theft of IP & R&D
- Unauthorized Physical Access
- Brand Impersonation & Takeover
-
Context Scoring → Adds
+2
score boost to business-risk findings to elevate them in dashboards and reports.
Example Use Cases
- Hidden VPN Credentials in contracts ➝ Tagged as Credentials + Legal Risk
- Unreleased Earnings Forecasts ➝ Tagged as Material Financials
- API Keys in Git Repo ➝ Tagged as Source Code Exposure
- Defense Industrial Designs ➝ Tagged as Trade Secrets
Business Risk Categories
ADP currently supports 12 categories of risk:
Auto-Tag | Business Risk | Real-World Example |
---|---|---|
Sensitive Communications | Reputational Damage | Sony Pictures 2014 |
Health Data (PHI) | Regulatory Penalties | Anthem 2015 |
Personal Data (SSNs, etc.) | Regulatory Penalties | Equifax 2017 |
Employee Data / HR | Regulatory + Reputational Risk | Sony Pictures HR |
Source Code | Software Delivery Disruption | SolarWinds 2020 |
IP or R&D | Theft of Intellectual Property | PLA Indictment 2014 |
Manufacturing Data | Theft of Production Techniques | U.S. Steel / PLA |
Alarm / Security Systems | Unauthorized Physical Access | Verkada 2021 |
Customer Data | Regulatory Penalties | Target 2013 |
Financial Data | Regulatory Penalties | Equifax 2017 |
Database Schemas / Config Files | Operational Disruption | CloudNordic 2023 |
GenAI Architecture
Model Hosting
- All LLM calls are containerized within Horizon3's AWS infra.
Data Flow
Component | Data Sent to Model |
---|---|
Model Hosting | AWS Bedrock (Llama 4 Maverick) |
Credential Extraction | Small text snippets |
Repo Risk Detection | Metadata only or up to 10 file snippets |
Database Schema Analysis | Table/Column names only (no data rows) |
All data is filtered, minimal, and not used for training.
Prompt Management
- Custom prompts created by Horizon3.ai.
- Model outputs are advisory only - no autonomous actions.
How GenAI Is Integrated and Secured
Advanced Data Pilfering (ADP) leverage AWS Bedrock, a fully managed foundation model platform provided by Amazon. Specifically, NodeZero uses the Llama 4 Maverick model to perform semantic reasoning over structured metadata and sensitive content (depending on the feature).
The architecture is designed for security, explainability, and data isolation:
- A dedicated container runs inside the NodeZero Kubernetes cluster, sitting adjacent to the Core service.
- This container is responsible for communicating with AWS Bedrock and sending data for inference.
- The system analyzes short file content snippets that appear business-sensitive.
Importantly:
- No data is stored or used for training, by either Horizon3.ai or AWS.
- AWS Bedrock provides strong data isolation guarantees.
- The amount and type of data passed to the model is strictly controlled via configuration—ensuring minimal, targeted input for each use case.
For more on AWS foundation models, see: https://aws.amazon.com/what-is/foundation-models/
Key Capabilities
What ADP Is Designed For
- Highlighting business risk from attacker-discoverable data.
- Providing explainable GenAI tagging with full transparency.
- Prioritizing findings based on real attack paths—not theoretical access.
- Requiring no manual tuning or rule writing—just run the test.
- Making risk obvious across the UI, reports, and API.
What ADP Does Not Do
- ADP is not a traditional DLP or DSPM tool. It doesn't crawl all data.
- It doesn't perform full share scanning or file indexing.
- ADP does not replace compliance archiving or governance workflows.
- It focuses strictly on attacker-accessible data during live operations.
FAQs
Q: Which operations support ADP? A: Internal only - phishing, insider, internal lateral movement.
Q: Is it included in all SKUs? A: No - ADP is part of the Elite SKU only.
Q: Is it on by default? A: Yes - both ACP and Data Risk Inference are default ON.
Q: Is customer data used for training? A: No - data is used only at runtime. Inferences are logged and explainable.
Q: Does this replace DLP/DSPM? A: No - ADP finds attacker-reachable risk, not compliance-level coverage.