Advanced Data Pilfering (ADP)
Release Date: October 10, 2025
Product: NodeZero (Internal Pentesting - Elite SKU)
Overview
Advanced Data Pilfering (ADP) extends NodeZero's autonomous attack engine to identify and contextualize sensitive business data encountered during internal operations, including phishing and insider threat scenarios. It combines:
- Advanced Credential Pilfering (ACP)
- GenAI-powered Data Risk Inference
Findings are auto-tagged with Business Risk categories, prioritized with context scoring, and displayed across the UI, reports, and attack graphs.
How To Use
- Step 1: While creating an internal or phishing pentest, or an insider threat test, scroll to the Advanced Data Pilfering section.
- Step 2: Choose whether to enable Advanced Credential Pilfering, which uses advanced semantic reasoning to detect and extract credentials from various data sources. These credentials are then used to move laterally and escalate privileges during the attack.
Note
Scanning extended domain users is turned off by default when this option is enabled.
- Step 3: Choose a Data Risk Inference option to automatically identify business risks, such as exposed personal data, financial reports, or confidential strategy documents, and show the potential impact if they were compromised:
  - Disabled: Turns off Data Risk Inference entirely.
  - Metadata Only: Sends only file metadata (names, sizes, hashes). This is the fastest option, but may result in fewer findings.
  - Full Inference Data: Sends small snippets of text from files that appear business-sensitive, enabling deeper and more accurate risk inference.
- Step 4: Run the test.
- Step 5: Once the test completes, you can view the results in any of these existing NodeZero workflows:
  - Impacts Tab: New Business Risk column.
  - Weaknesses Tab: Same column linking to weak points.
  - Data Tab: Filter by Business Risk tags.
  - Sankey View: Shows flow of sensitive data.
  - Attack Graph: Includes a Business Risk node.
  - Reports: Executive Summary highlights exposures.
  - APIs / MCP: Export tags and GenAI Action Logs.
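
The difference between the three Data Risk Inference options in Step 3 can be pictured with a small sketch. This is an illustration only, not NodeZero's implementation: the names `InferenceMode`, `DiscoveredFile`, `build_payload`, and the 200-byte snippet limit are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class InferenceMode(Enum):
    # Hypothetical names mirroring the three UI options in Step 3.
    DISABLED = "disabled"
    METADATA_ONLY = "metadata_only"
    FULL_INFERENCE = "full_inference"


@dataclass
class DiscoveredFile:
    name: str
    content: bytes


SNIPPET_BYTES = 200  # assumed snippet size; the real limit is not documented


def build_payload(file: DiscoveredFile, mode: InferenceMode) -> Optional[dict]:
    """Return what would be sent for inference under each mode (sketch only)."""
    if mode is InferenceMode.DISABLED:
        return None  # nothing leaves the test for inference
    # Metadata Only: names, sizes, hashes.
    payload = {
        "name": file.name,
        "size": len(file.content),
        "sha256": hashlib.sha256(file.content).hexdigest(),
    }
    if mode is InferenceMode.FULL_INFERENCE:
        # Full Inference Data: add a small text snippet. A real system would
        # first decide whether the file appears business-sensitive; that
        # check is omitted here.
        snippet = file.content[:SNIPPET_BYTES]
        payload["snippet"] = snippet.decode("utf-8", errors="replace")
    return payload
```

In this sketch, Metadata Only never reads past the hash computation, which is one way to see why it is the faster option and why it gives the model less to reason about.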
 
Why It Matters
The Problem
Security teams routinely uncover data during pentests (files, credentials, blobs) but struggle to determine what's actually risky. Traditional tools (e.g., DLP, DSPM):
- Generate noise with regex-based detection.
- Lack attacker context.
- Fail to link exposures to business consequences.
The Value of ADP
ADP closes this gap by:
- Tagging files with business risk context (e.g., Material Financials, Trade Secrets).
- Prioritizing high-impact findings using GenAI analysis.
- Making every inference explainable through Action Logs.
"We don't just show you what was found - we show you why it matters."
Key Differentiators
| Feature | Traditional Tools (DLP/DSPM) | ADP (NodeZero) | 
|---|---|---|
| Detection Context | Regex / Keywords | Attacker-contextual (real operations) | 
| Prioritization | Manual, noise-prone | GenAI-driven, context-aware | 
| Explainability | Limited | Fully logged, transparent | 
| Data Usage | Data stored/indexed | Runtime only, not used for training | 
| Business Risk Mapping | Broad classification | Precise, executive-relevant tagging | 
How It Works
ADP is enabled by default during internal pentests. No configuration or tuning needed.
Runtime Flow
- Advanced Credential Pilfering (ACP) → Extracts credentials from reachable systems, files, and shares.
- Data Risk Inference (GenAI) → Evaluates content using an LLM to determine business significance.
- Tagging → Applies Business Risk categories like:
  - Operational Disruption
  - Critical System Shutdown
  - Revenue Interruption
  - Software Delivery Disruption
  - Supply Chain Breakdown
  - Leak of Sensitive Communications
  - Executive Fraud & Impersonation
  - Regulatory Breach Penalties
  - Theft of IP & R&D
  - Unauthorized Physical Access
  - Brand Impersonation & Takeover
- Context Scoring → Adds a +2 score boost to business-risk findings to elevate them in dashboards and reports.
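
To make the Context Scoring step concrete, here is a minimal sketch of how a +2 boost could reorder findings. The `Finding` class, the score scale, and the choice to apply the boost once per finding (rather than once per tag) are assumptions for illustration; the source only states that business-risk findings receive a +2 boost.

```python
from dataclasses import dataclass, field
from typing import List

BUSINESS_RISK_BOOST = 2.0  # the +2 boost described in the runtime flow


@dataclass
class Finding:
    # Hypothetical structure; field names are not from NodeZero.
    title: str
    base_score: float
    business_risks: List[str] = field(default_factory=list)


def context_score(finding: Finding) -> float:
    """Base score plus a single boost if any Business Risk tag is present."""
    boost = BUSINESS_RISK_BOOST if finding.business_risks else 0.0
    return finding.base_score + boost


def prioritize(findings: List[Finding]) -> List[Finding]:
    """Order findings from highest to lowest context score."""
    return sorted(findings, key=context_score, reverse=True)
```

Under these assumptions, a tagged finding with a base score of 5.0 would rank above an untagged finding with a base score of 6.0.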
Example Use Cases
- Hidden VPN Credentials in contracts ➝ Tagged as Credentials + Legal Risk
- Unreleased Earnings Forecasts ➝ Tagged as Material Financials
- API Keys in Git Repo ➝ Tagged as Source Code Exposure
- Defense Industrial Designs ➝ Tagged as Trade Secrets
Business Risk Categories
ADP currently supports 12 categories of risk:
| Auto-Tag | Business Risk | Real-World Example | 
|---|---|---|
| Sensitive Communications | Reputational Damage | Sony Pictures 2014 | 
| Health Data (PHI) | Regulatory Penalties | Anthem 2015 | 
| Personal Data (SSNs, etc.) | Regulatory Penalties | Equifax 2017 | 
| Employee Data / HR | Regulatory + Reputational Risk | Sony Pictures HR | 
| Source Code | Software Delivery Disruption | SolarWinds 2020 | 
| IP or R&D | Theft of Intellectual Property | PLA Indictment 2014 | 
| Manufacturing Data | Theft of Production Techniques | U.S. Steel / PLA | 
| Alarm / Security Systems | Unauthorized Physical Access | Verkada 2021 | 
| Customer Data | Regulatory Penalties | Target 2013 | 
| Financial Data | Regulatory Penalties | Equifax 2017 | 
| Database Schemas / Config Files | Operational Disruption | CloudNordic 2023 | 
GenAI Architecture
Model Hosting
- All LLM calls are containerized within Horizon3.ai's AWS infrastructure.
Data Flow
| Component | Model / Data Sent to Model | 
|---|---|
| Model Hosting | AWS Bedrock (Llama 4 Maverick) | 
| Credential Extraction | Small text snippets | 
| Repo Risk Detection | Metadata only or up to 10 file snippets | 
| Database Schema Analysis | Table/Column names only (no data rows) | 
All data is filtered, minimal, and not used for training.
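
The data-minimization rules in the table above can be sketched as two small filters. This is a hedged illustration, not the product's code: `schema_payload`, `repo_payload`, the input shapes, and the 200-character snippet length are assumptions; only the "table/column names only" rule and the limit of 10 file snippets come from the table.

```python
from typing import Dict, List, Tuple

MAX_REPO_SNIPPETS = 10  # "up to 10 file snippets" per the table above
SNIPPET_CHARS = 200     # assumed snippet length; not documented


def schema_payload(schema: Dict[str, dict]) -> Dict[str, List[str]]:
    """Keep table and column names only; row data is never copied."""
    return {table: list(info["columns"]) for table, info in schema.items()}


def repo_payload(files: List[Tuple[str, str]], metadata_only: bool) -> dict:
    """Send file metadata, plus at most MAX_REPO_SNIPPETS short snippets."""
    payload = {
        "files": [{"path": path, "size": len(text)} for path, text in files]
    }
    if not metadata_only:
        payload["snippets"] = [
            text[:SNIPPET_CHARS] for _, text in files[:MAX_REPO_SNIPPETS]
        ]
    return payload
```

The point of the design is that the filter runs before anything is sent, so the model only ever sees the reduced payload.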
Prompt Management
- Custom prompts created by Horizon3.ai.
- Model outputs are advisory only - no autonomous actions.
How GenAI Is Integrated and Secured
Advanced Data Pilfering (ADP) leverages AWS Bedrock, a fully managed foundation model platform provided by Amazon. Specifically, NodeZero uses the Llama 4 Maverick model to perform semantic reasoning over structured metadata and sensitive content (depending on the feature).
The architecture is designed for security, explainability, and data isolation:
- A dedicated container runs inside the NodeZero Kubernetes cluster, sitting adjacent to the Core service.
- This container is responsible for communicating with AWS Bedrock and sending data for inference.
- The system analyzes short file content snippets that appear business-sensitive.
Importantly:
- No data is stored or used for training, by either Horizon3.ai or AWS.
- AWS Bedrock provides strong data isolation guarantees.
- The amount and type of data passed to the model is strictly controlled via configuration—ensuring minimal, targeted input for each use case.
For more on AWS foundation models, see: https://aws.amazon.com/what-is/foundation-models/
Key Capabilities
What ADP Is Designed For
- Highlighting business risk from attacker-discoverable data.
- Providing explainable GenAI tagging with full transparency.
- Prioritizing findings based on real attack paths—not theoretical access.
- Requiring no manual tuning or rule writing—just run the test.
- Making risk obvious across the UI, reports, and API.
What ADP Does Not Do
- ADP is not a traditional DLP or DSPM tool. It doesn't crawl all data.
- It doesn't perform full share scanning or file indexing.
- ADP does not replace compliance archiving or governance workflows.
- It focuses strictly on attacker-accessible data during live operations.
Business Risk Taxonomy
ADP maps discovered data to a defined set of Business Risk categories using GenAI reasoning. These mappings help security teams communicate clearly with executives, auditors, and remediation teams.
| Data Auto-Tag | Business Risk Name | Impact Rationale | 
|---|---|---|
| Strategic Business Communications | Leak of Sensitive Communications | Leaked strategic communications create immediate reputational damage and stakeholder confidence erosion through exposure of internal debates, personnel issues, and sensitive business discussions. The 2014 Sony Pictures breach is a key example. | 
| Health Data | Regulatory Breach Penalties | Breaches involving PHI trigger HIPAA, HITECH, and GDPR Article 9 violations. The 2015 Anthem breach (78.8M records) led to a $16M HIPAA settlement and regulatory oversight. | 
| Personal Data | Regulatory Breach Penalties | Exposure of SSNs, addresses, or phone numbers triggers GDPR, CCPA, and state notification laws. The 2017 Equifax breach affected 147M people and led to a $575M settlement. | 
| Employee Data / HR | Regulatory Breach Penalties | Exposure of employee data (SSNs, payroll, benefits) results in dual regulatory exposure (GDPR, HIPAA). The Anthem breach is again illustrative. | 
| Source Code | Software Delivery Disruption | Compromised source code or CI/CD pipelines allow attackers to inject malicious code or disrupt delivery. Example: 2020 SolarWinds supply chain attack. | 
| Intellectual Property | Theft of IP & R&D | Theft enables replication of proprietary innovations. 2014 Chinese PLA indictments showed IP theft from Westinghouse and U.S. Steel. | 
| Manufacturing / Production | Theft of IP & R&D | Theft of proprietary formulas or processes enables global competition to replicate and undercut pricing. Again seen in the 2014 PLA indictments. | 
| Employee Data / HR | Leak of Sensitive Communications | HR data (reviews, salaries, layoffs) leaks cause reputational and legal exposure. The Sony breach revealed exec salaries and performance reviews. | 
| Physical Security / Surveillance / Alarm Systems | Unauthorized Physical Access | Compromised surveillance or alarm systems grant unauthorized access. 2021 Verkada breach gave access to 150,000 live camera feeds. | 
| Customer Data | Regulatory Breach Penalties | Regulated customer data triggers GDPR, PCI-DSS, CCPA, leading to fines and audits. Example: 2013 Target breach ($18M in settlements). | 
| Financial Data | Regulatory Breach Penalties | Financial data breaches invoke PCI-DSS, SOX, GLBA rules. The Equifax breach again illustrates regulatory consequences. | 
| Digital Infrastructure | Operational Disruption | Attackers can misconfigure or corrupt infrastructure (configs, routes). The 2023 CloudNordic incident caused massive service outages due to ransomware and config issues. | 
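
For readers consuming tags programmatically, the taxonomy above can be expressed as a simple lookup. The tag and risk names are copied from the table; the dictionary layout and the `business_risks_for` helper are assumptions for illustration and do not describe any NodeZero API. Note that Employee Data / HR maps to two business risks.

```python
from typing import Dict, Iterable, List

# Data Auto-Tag -> Business Risk Name(s), as listed in the taxonomy table.
TAXONOMY: Dict[str, List[str]] = {
    "Strategic Business Communications": ["Leak of Sensitive Communications"],
    "Health Data": ["Regulatory Breach Penalties"],
    "Personal Data": ["Regulatory Breach Penalties"],
    "Employee Data / HR": [
        "Regulatory Breach Penalties",
        "Leak of Sensitive Communications",
    ],
    "Source Code": ["Software Delivery Disruption"],
    "Intellectual Property": ["Theft of IP & R&D"],
    "Manufacturing / Production": ["Theft of IP & R&D"],
    "Physical Security / Surveillance / Alarm Systems": [
        "Unauthorized Physical Access"
    ],
    "Customer Data": ["Regulatory Breach Penalties"],
    "Financial Data": ["Regulatory Breach Penalties"],
    "Digital Infrastructure": ["Operational Disruption"],
}


def business_risks_for(tags: Iterable[str]) -> List[str]:
    """Return the distinct business risks for a set of auto-tags.

    Order follows first appearance; unknown tags are ignored.
    """
    risks: List[str] = []
    for tag in tags:
        for risk in TAXONOMY.get(tag, []):
            if risk not in risks:
                risks.append(risk)
    return risks
```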
FAQs
Q: Which operations support ADP? A: Internal only - phishing, insider, internal lateral movement.
Q: Is it included in all SKUs? A: No - ADP is part of the Elite SKU only.
Q: Is it on by default? A: Yes - both ACP and Data Risk Inference are default ON.
Q: Is customer data used for training? A: No - data is used only at runtime. Inferences are logged and explainable.
Q: Does this replace DLP/DSPM? A: No - ADP finds attacker-reachable risk, not compliance-level coverage.
