Sovereign HPC Cluster for Rare Earth Research
An Air-Gapped, Deterministic Scientific Computing Platform for Strategic Mineral Research
The rare-earth research mission demands a computing environment that is secure, largely offline, deterministic, and auditable. The platform is shaped around the physics and evidence chain of the mission (CPU density, memory capacity, air-gapped ingest) rather than around fashionable GPU-heavy AI cluster designs.
Local Language Model
Helps scientists interrogate results and navigate a large scientific corpus without sending sensitive data to an external cloud.
~5 Million PDF Library
Starting corpus covering magnetic fields, materials science, and related technical literature — fully indexed on-cluster.
Field Telemetry Ingestion
Stream of electromagnetic and multi-modal signals gathered during mining operations — physically transported and air-gapped on import.
Molecular-Level Modeling
Atomic and molecular simulation that moves the problem beyond data analytics into real scientific computing.
Competing Model Portfolio
Parallel evaluation of multiple analytical approaches to assess processing value and determine which methods are economically meaningful.
Controlled Offline Ingest
Field data arrives via removable media through a formal quarantine boundary — chain-of-custody is part of scientific defensibility.
Five Principles. One Sovereign Platform.
Determinism Before Novelty
Analytical outputs must be explainable and reproducible. Default tools are numerical transforms, explicit statistical methods, and solver-based approaches. ML may assist discovery but never becomes the sole source of critical conclusions.
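To make the principle concrete, here is a minimal sketch (the function and record fields are illustrative assumptions, not part of the platform specification) of how an analysis step can be wrapped so any run is reproducible and auditable: the seed, library version, and input/output hashes are captured alongside the result.

```python
# Hypothetical sketch: wrapping an analysis step so every run is reproducible
# and auditable. Function and field names are illustrative, not a real API.
import hashlib
import json
import platform

import numpy as np


def sha256_of(data: bytes) -> str:
    """Content hash used to tie inputs and outputs to the audit record."""
    return hashlib.sha256(data).hexdigest()


def deterministic_run(samples: np.ndarray, seed: int = 0) -> dict:
    """Run a simple spectral transform and emit a reproducibility record."""
    rng = np.random.default_rng(seed)                    # explicit, recorded seed
    noise_floor = rng.normal(0.0, 1e-9, samples.shape)   # example stochastic step
    spectrum = np.abs(np.fft.rfft(samples + noise_floor))
    return {
        "input_sha256": sha256_of(samples.tobytes()),
        "output_sha256": sha256_of(spectrum.tobytes()),
        "seed": seed,
        "numpy_version": np.__version__,
        "platform": platform.platform(),
    }


record = deterministic_run(np.sin(np.linspace(0.0, 8.0, 4096)))
print(json.dumps(record, indent=2))  # identical inputs + seed -> identical hashes
```

Identical inputs and seed yield identical hashes, so a reviewer can re-run the step and confirm the recorded output byte for byte.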
Air-Gapped by Operating Model
Field data is physically transported. The cluster has a formal ingest boundary — quarantine nodes, signed checksums, media handling procedures, and release workflows are architectural components, not operational afterthoughts.
CPU-Dominant Compute
Electromagnetic analysis, waveform processing, inverse modeling, and model portfolio testing scale better with CPU cores and memory than with GPUs. Capital is spent where the real compute pressure lies.
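As a rough illustration of why these workloads reward cores over accelerators, the sketch below fans a portfolio of candidate methods across CPU processes with Python's standard multiprocessing pool; the candidate methods themselves are placeholders, not the mission's actual models.

```python
# Illustrative sketch of CPU-parallel model-portfolio evaluation; the candidate
# "models" here are placeholder scoring functions, not the mission's real methods.
from multiprocessing import Pool

import numpy as np

CANDIDATES = {
    "median_filter": lambda x: float(np.median(x)),
    "robust_mean": lambda x: float(np.mean(np.clip(x, *np.percentile(x, [5, 95])))),
    "rms": lambda x: float(np.sqrt(np.mean(x ** 2))),
}


def evaluate(name_and_data):
    """Score one candidate method on the shared dataset."""
    name, data = name_and_data
    return name, CANDIDATES[name](data)


if __name__ == "__main__":
    data = np.random.default_rng(42).normal(size=1_000_000)
    with Pool() as pool:  # one process per core; scales with CPU count, no GPU
        results = dict(pool.map(evaluate, [(n, data) for n in CANDIDATES]))
    print(results)
```

Throughput scales with the number of worker processes, which is exactly the scaling behavior the CPU-dominant tier is sized for.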
Storage as a First-Class Capability
Storage is part of the scientific method. Researchers need rapid access to active datasets, economical capacity for historical telemetry, and preserved snapshots that allow any run to be reconstructed.
Small but High-Quality AI Tier
A local language model helps scientists navigate the evidence base and query structured repositories — clearly bounded by provenance and human oversight. Not a black box. Not the center of the architecture.
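A minimal sketch of what "bounded by provenance" can mean in practice: the assistant answers only from retrieved passages that carry a signed ingest record, and every answer returns its citations. `vector_index` and `local_llm` are assumed interfaces here, not named products.

```python
# Hypothetical sketch of the provenance boundary around the local LLM:
# the assistant may only answer from retrieved passages that carry signed
# ingest metadata. `vector_index` and `local_llm` are assumed interfaces.
from dataclasses import dataclass


@dataclass
class Passage:
    text: str
    source_doc: str       # document ID in the corpus
    ingest_record: str    # signed ingest record ID from the quarantine workflow


def answer(question: str, vector_index, local_llm, k: int = 5) -> dict:
    passages = vector_index.search(question, k)          # assumed API
    cited = [p for p in passages if p.ingest_record]     # provenance required
    if not cited:
        return {"answer": None, "reason": "no provenanced sources found"}
    context = "\n\n".join(p.text for p in cited)
    prompt = (
        "Answer strictly from the context below; say 'unknown' otherwise.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return {
        "answer": local_llm.generate(prompt),            # assumed API
        "citations": [(p.source_doc, p.ingest_record) for p in cited],
    }
```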
28 Servers · 4 Racks · 7 Functional Tiers
Each tier is purpose-matched to its dominant workload, so every dollar of capex is aligned with the compute it actually performs.
| Tier | Count | CPU | Memory | GPU / Usable Capacity |
|---|---|---|---|---|
| Management / Orchestration: scheduler, RBAC, provenance logging, package control, immutable audit trail | 2 nodes · 1U | 🔒 Confidential | 128–256 GB ECC RAM | None |
| Air-Gap Ingest / Quarantine: offline import, checksum validation, malware scanning, metadata normalization, signed ingest records | 2 nodes · 1U | 🔒 Confidential | Standard | None |
| Deterministic CPU HPC: signal processing, inversion, statistics, EM analysis, model portfolio runs | 12 nodes · 2U | 🔒 Confidential | 1 TB ECC RAM / node | None |
| High-Memory Science: materials and molecular simulation, large in-memory jobs, matrix operations | 4 nodes · 2U | 🔒 Confidential | 2 TB ECC RAM / node | Optional (deferred) |
| GPU Inference: local LLM, embeddings, scientific literature navigation, limited multimodal assistance | 2 nodes · 4U | 🔒 Confidential | 512 GB ECC RAM | 🔒 Confidential |
| Hot NVMe Storage: active datasets, vector index, project scratch, current telemetry | 2 nodes · 2U | — | 256 GB RAM | ~250–280 TB usable |
| Warm Storage: PDF corpus, telemetry history, processed data, reproducibility snapshots | 4 nodes · 4U | — | NVMe cache | ~1.2–1.5 PB usable |
Storage as a Scientific Instrument
Five million documents become raw PDFs, normalized text, vector embeddings, graph relationships, extracted tables, and cross-run artifacts. Field telemetry multiplies similarly. Day 1 targets 1.5 PB usable online, split into fast and economical tiers so that capacity is not exhausted just as the research team comes to rely on the platform.
Hot Tier: active projects, vector index, fast scratch, current telemetry. All NVMe with 256 GB RAM and dual high-speed links.
Warm Tier: PDF corpus, telemetry history, processed datasets, reproducibility snapshots. Dense HDD with an NVMe cache tier.
Archive Tier: long-term retention and campaign archive. Object or tape storage. Critical for multi-year research continuity.
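A simple sketch of the tier-routing policy these three tiers imply; the thresholds and tier names are assumptions for illustration, not committed retention rules.

```python
# Illustrative tier-routing policy; thresholds and tier names are assumptions,
# not measured values from the proposal.
from datetime import datetime, timedelta, timezone


def route_tier(last_access: datetime, active_project: bool) -> str:
    """Decide where a dataset should live among hot NVMe, warm HDD, and archive."""
    age = datetime.now(timezone.utc) - last_access
    if active_project or age < timedelta(days=30):
        return "hot-nvme"        # active datasets, scratch, current telemetry
    if age < timedelta(days=365):
        return "warm-hdd"        # corpus, telemetry history, snapshots
    return "archive"             # multi-year retention, object or tape


print(route_tier(datetime.now(timezone.utc) - timedelta(days=90), False))  # warm-hdd
```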
Chain-of-Custody Is Part of the Science
Without a formal ingest boundary, the organization may later be unable to prove that an analytical result came from a specific field acquisition — or that data were not altered or mixed with another campaign. For strategic mineral research, that is too large a weakness to accept.
1. Physical Media Arrives
Removable media from field operations lands on quarantine nodes, isolated from the research fabric.
2. Checksum Verification
Integrity verified against field-recorded checksums. Operator context and physical media identity are recorded.
3. Payload Scanning
Malicious content scan and metadata normalization performed in the quarantine environment.
4. Signed Ingest Record
An immutable, signed ingest record is created. Only then is the approved dataset released into the research fabric.
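The sketch below illustrates the checksum-verification and signed-record steps, assuming field teams ship a SHA-256 manifest with each piece of media; the HMAC signature is a stand-in for whatever signing mechanism (for example, HSM-backed asymmetric keys) the production workflow would mandate.

```python
# Minimal sketch of the quarantine-side checks, assuming field teams ship a
# manifest of SHA-256 checksums; the HMAC signature stands in for whatever
# signing mechanism (e.g. asymmetric keys) the real workflow would mandate.
import hashlib
import hmac
import json
import time
from pathlib import Path

SIGNING_KEY = b"replace-with-hsm-backed-key"  # placeholder only


def verify_media(media_root: Path, manifest: dict[str, str]) -> list[str]:
    """Compare on-media files against field-recorded SHA-256 checksums."""
    failures = []
    for rel_path, expected in manifest.items():
        digest = hashlib.sha256((media_root / rel_path).read_bytes()).hexdigest()
        if digest != expected:
            failures.append(rel_path)
    return failures


def signed_ingest_record(media_id: str, operator: str, manifest: dict) -> str:
    """Emit an append-only, signed record; release happens only after this."""
    record = {
        "media_id": media_id,
        "operator": operator,
        "files": manifest,
        "timestamp": time.time(),
    }
    body = json.dumps(record, sort_keys=True).encode()
    record["signature"] = hmac.new(SIGNING_KEY, body, hashlib.sha256).hexdigest()
    return json.dumps(record)
```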
Intentionally Modest
Capital Investment
Day 1 IT Stack
Installed into the existing secure room. ROM (rough order of magnitude) estimate.
Fully Deployed
Includes IT stack, facility hardening, power redundancy, digital twin, installation, and 36-month support.
Physical Footprint
Three Paths to Sovereign Compute
Each architecture reflects a different philosophy of performance density, operational control, and capital efficiency. The right choice depends on timeline, budget envelope, and long-term strategic positioning.
Modular Practical
Balanced · scalable · operationally efficient
Dense, modular HPC cluster optimized for CPU-dominant workloads with strong price-to-performance efficiency. Designed for rapid deployment, predictable scaling, and straightforward operations.
Configuration
- Multi-node dense compute chassis
- 12 standard compute nodes
- 4 high-memory science nodes
- 2 GPU-assisted nodes (limited scope)
- NVMe hot storage + high-capacity warm storage
- High-speed fabric (200 Gb class)
Strengths
- ✓ Best balance of cost, performance, and deployability
- ✓ Flexible expansion without architectural lock-in
- ✓ Easier maintenance and component replacement
- ✓ Suitable for field-adjacent deployments
Limitations
- — Slightly lower peak density than Premium HPC
- — Less integrated cooling and fabric optimization
IT Stack Only
$1.7M – $2.9M
Fully Deployed
$2.1M – $3.5M
Best fit: National lab pilots · sovereign edge · scalability-first programs
Premium HPC
Maximum density · integrated system · national infrastructure
Fully integrated supercomputing-class system designed for maximum compute density, tightly coupled workloads, and long-term expansion into large-scale national infrastructure.
Configuration
- Fully integrated HPC cabinets
- Advanced cooling (air or liquid-assisted)
- High-density compute nodes
- High-memory nodes integrated into fabric
- GPU capability (expandable)
- Tightly integrated high-performance interconnect
Strengths
- ✓ Highest performance ceiling
- ✓ Best suited for large-scale scientific modeling
- ✓ Superior thermal efficiency at high densities
- ✓ Strong long-term scalability to multi-megawatt scale
Limitations
- — Higher capital cost
- — More complex deployment and integration
- — Vendor ecosystem dependency
- — Over-provisioned for smaller clusters
IT Stack Only
$2.8M – $4.8M
Fully Deployed
$3.5M – $5.5M
Best fit: National flagship research · long-term sovereign infrastructure
Retail Option
Component-based · lowest upfront cost · highest operational burden
Individually assembled server components sourced from standard enterprise or workstation-grade suppliers. Emphasizes low upfront cost but sacrifices system-level optimization.
Configuration
- Individually racked servers
- Standard enterprise motherboards and chassis
- Mixed storage nodes
- Conventional networking (Ethernet or entry HPC fabric)
- Minimal system-level integration
Strengths
- ✓ Lowest initial capital expenditure
- ✓ Maximum flexibility in component sourcing
- ✓ Rapid procurement in constrained environments
Limitations
- — Higher failure rates over time
- — Increased operational complexity
- — Lower density, higher power per compute unit
- — Difficult to manage at scale
- — Weak deterministic performance consistency
IT Stack Only
$1.2M – $2.0M
Fully Deployed
$1.5M – $2.5M
Best fit: Early-stage experimentation · prototyping phases only
| Attribute | Modular Practical | Premium HPC | Retail Option |
|---|---|---|---|
| Performance Density | High | Very High | Moderate |
| Deterministic Behavior | Strong | Very Strong | Variable |
| Scalability | Excellent | Excellent (large-scale) | Limited |
| Deployment Speed | Fast | Moderate | Fast |
| Operational Complexity | Moderate | High | High |
| Cost Efficiency | Best | Lowest | Lowest upfront cost |
| Long-term Viability | High | Very High | Low |
| IT Stack Price Range | $1.7M – $2.9M | $2.8M – $4.8M | $1.2M – $2.0M |
End-to-End. From Financing to Operations.
Your platform includes C-suite-level consultation and hardware selection. We walk with you through the entire process: financing, commissioning, digital twin deployment, and long-term hardware support.
Access to Financing
- Subject to approval — 75% CAPEX financing available
- Over five years, equivalent cloud capacity typically costs about twice as much
- Full data sovereignty — no ongoing per-core cloud costs
- Structured as infrastructure program financing
Operational Digital Twin
- Real-time monitoring of cluster health and workload
- Power, thermal, and GPU utilization telemetry
- Maintenance alerts and workload management tools
- Centralized dashboard for all rack and node status
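As a flavor of what the monitoring layer evaluates, here is a minimal sketch of a per-node threshold check; the node names, limits, and telemetry shape are illustrative only.

```python
# Sketch of the kind of threshold check the digital twin dashboard would run;
# node names, limits, and the telemetry dict shape are all illustrative.
THERMAL_LIMIT_C = 85.0
POWER_LIMIT_W = 900.0


def check_node(node: dict) -> list[str]:
    """Return maintenance alerts for one node's telemetry sample."""
    alerts = []
    if node["cpu_temp_c"] > THERMAL_LIMIT_C:
        alerts.append(f"{node['name']}: CPU temperature {node['cpu_temp_c']} C")
    if node["power_w"] > POWER_LIMIT_W:
        alerts.append(f"{node['name']}: power draw {node['power_w']} W")
    return alerts


sample = {"name": "hpc-node-07", "cpu_temp_c": 91.2, "power_w": 640.0}
print(check_node(sample))  # -> thermal alert for hpc-node-07
```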
Mobilization & Installation
- Installation services — 5 days on-site
- Technician labor: 2 technicians × 6 days
- Engineering oversight: site scan, remote prep and validation
- Full power-on testing and handover
Digital Twin & Monitoring (36 months)
- Digital twin modeling and deployment
- Integration and telemetry mapping
- Monitoring platform license included (36 months)
- Remote support included (36 months)
Hardware Support & Preventive Maintenance
- Next business day on-site response upon monitoring alert
- Physical hardware diagnostics and fault isolation
- Component removal and installation: GPU, CPU, RAM, SSD, PSU, NIC, cables
- Coordination for replacement parts procurement
- Post-swap hardware verification and power-on testing
- Quarterly preventive maintenance: thermal checks, firmware review
- Spare parts inventory maintained — repairs within 48 hours of issue
- Handoff to remote monitoring team for software validation
Building a Sovereign Research Environment?
We design, finance, and govern sovereign HPC clusters for national-interest scientific missions — air-gapped, auditable, and purpose-built.