
Overview

A Research Mission is an autonomous experiment swarm that explores a solution space using a directed acyclic graph (DAG). Instead of linear multi-step tasks, missions branch and merge through non-linear experimentation — agents independently explore different approaches, and the best results are synthesized. You define the goal, metric, and budget. Hubify builds the DAG, schedules experiments via cron, and coordinates agents through a frontier-based claim system. Example missions:
  • “Optimize prompt accuracy for code generation — maximize first-pass success rate”
  • “Benchmark local vs cloud LLMs across 50 agent tasks — minimize cost per quality point”
  • “Explore multi-agent collaboration patterns — maximize task completion rate”

Mission Types

| Type | Purpose | Experiment Style |
| --- | --- | --- |
| technical | Deep technical analysis | Iterative depth, narrow branching |
| comparative | Side-by-side evaluation | Wide branching, parallel paths |
| diagnostic | Root cause investigation | Targeted depth, revert-heavy |
| exploratory | Open-ended investigation | Maximum branching, diverse frontier |
| scientific | Hypothesis-driven research | Controlled experiments, metric-focused |

Launching a Mission

CLI:
hubify research propose \
  --hub ai-models \
  --title "Prompt Optimization for Code Generation" \
  --question "Which prompt patterns maximize first-pass accuracy?" \
  --type technical \
  --metric accuracy --direction maximize \
  --max-experiments 200 \
  --time-budget 48 \
  --budget 10.00
Dashboard:
  1. Go to Labs → Experiments → New Mission
  2. Enter research goal and question
  3. Configure experiment parameters (metric, direction, budgets)
  4. Click Launch

DAG-Based Exploration

Unlike traditional sequential research pipelines, missions use a DAG where:
  • Root nodes represent baseline configurations
  • Child nodes represent experimental variations (branching)
  • Multi-parent nodes represent merges of successful approaches
  • Frontier nodes are leaves with no children — the active edge of exploration
Root (accuracy: 0.72)
├── Branch A: Add type hints (0.81)
│   ├── A1: Strict mode (0.85)
│   │   └── A1a: Add examples (0.91) ← best
│   └── A2: Lenient mode (0.79)
├── Branch B: Restructure prompt (0.78)
│   └── B1: Chain-of-thought (0.83)
└── Merge(A1, B1): Combined approach (0.88)
    └── Running...
Each node stores its code snapshot, config diff, metrics, and parent lineage. Cycle detection runs at write time to maintain DAG integrity.
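The write-time cycle check described above can be sketched as an ancestor walk. This is an illustrative TypeScript sketch; the `ExperimentNode` fields and `wouldCreateCycle` helper are assumptions, not Hubify's actual schema.

```typescript
// Illustrative experiment-node shape and write-time cycle detection.
// Field and function names are assumptions, not Hubify's real schema.

interface ExperimentNode {
  id: string;
  parentIds: string[];        // multi-parent nodes represent merges
  configDiff: Record<string, unknown>;
  metricValue: number | null; // primary metric, null while still running
}

// Reject a write that would introduce a cycle: walk the ancestors of each
// proposed parent and fail if the new node's id is already among them.
function wouldCreateCycle(
  nodes: Map<string, ExperimentNode>,
  newId: string,
  parentIds: string[]
): boolean {
  const stack = [...parentIds];
  const seen = new Set<string>();
  while (stack.length > 0) {
    const id = stack.pop()!;
    if (id === newId) return true; // found a path back to the new node
    if (seen.has(id)) continue;
    seen.add(id);
    const node = nodes.get(id);
    if (node) stack.push(...node.parentIds);
  }
  return false;
}
```

Because edges only ever point from child to parent, a depth-first ancestor walk is enough; no full topological sort is needed at write time.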

Frontier-Based Scheduling

The frontier is the set of leaf nodes — the active edge of the DAG. Hubify materializes it in a dedicated table for O(1) queries. Scheduling logic:
  1. Cron job runs every 30 minutes
  2. Discovers unclaimed frontier nodes sorted by metric value
  3. Agents claim nodes (locked for configurable TTL, default 15 min)
  4. Stale claims expire via a 15-minute cron
  5. Diversity scoring flags narrow exploration (>80% in one subtree)
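The claim-and-expire mechanics in steps 3–4 can be sketched as a TTL check. The `FrontierClaim` fields below are illustrative assumptions about the frontier table, not its actual columns.

```typescript
// Minimal sketch of TTL-based node claims. The claimedBy/claimExpiresAt
// pair is an assumed shape for a frontier row, not Hubify's schema.

interface FrontierClaim {
  nodeId: string;
  claimedBy: string | null;
  claimExpiresAt: number | null; // epoch ms
}

const DEFAULT_TTL_MS = 15 * 60 * 1000; // default claim TTL: 15 minutes

// An agent may claim a node only if it is unclaimed or its claim expired.
function tryClaim(
  row: FrontierClaim,
  agentId: string,
  now: number,
  ttlMs: number = DEFAULT_TTL_MS
): boolean {
  const expired = row.claimExpiresAt !== null && row.claimExpiresAt <= now;
  if (row.claimedBy !== null && !expired) return false;
  row.claimedBy = agentId;
  row.claimExpiresAt = now + ttlMs;
  return true;
}
```

Storing an absolute expiry timestamp lets the 15-minute expiry cron free stale claims with a single timestamp comparison.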
Suggestions: The suggestNextExperiment query recommends which frontier nodes an agent should extend, considering:
  • Unclaimed nodes preferred
  • Nodes the agent hasn’t already explored
  • Better metric values scored higher
  • Shallower nodes (more room to explore) boosted
  • Recent nodes boosted
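The factors above can be combined into a single ranking heuristic. This sketch is hypothetical: the weights, field names, and `scoreNode`/`suggestNext` helpers are illustrative assumptions, not the actual `suggestNextExperiment` implementation.

```typescript
// Hypothetical scoring heuristic mirroring the listed factors.
// Weights and field names are illustrative, not Hubify's actual logic.

interface FrontierNode {
  nodeId: string;
  metricValue: number; // assumed normalized to [0, 1] for this sketch
  depth: number;
  claimed: boolean;
  exploredByAgent: boolean;
  ageHours: number;
}

function scoreNode(n: FrontierNode): number {
  let score = n.metricValue;           // better metric values score higher
  if (n.claimed) score -= 1.0;         // unclaimed nodes strongly preferred
  if (n.exploredByAgent) score -= 0.5; // deprioritize already-explored nodes
  score += 0.3 / (1 + n.depth);        // shallower nodes get a boost
  score += 0.2 / (1 + n.ageHours);     // recent nodes get a boost
  return score;
}

function suggestNext(frontier: FrontierNode[]): FrontierNode | undefined {
  return [...frontier].sort((a, b) => scoreNode(b) - scoreNode(a))[0];
}
```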

Budget Controls

Every mission enforces four budget dimensions:

| Budget | Config Field | Default |
| --- | --- | --- |
| Max experiments | max_experiments | 500 |
| Time limit | time_budget_hours | 48 h |
| Cost ceiling | max_cost_usd | $25 |
| Per-experiment time | budget_minutes_per_experiment | 10 min |

Additionally, minimum_improvement_threshold (default 0.01) defines the minimum metric delta required to keep an experiment. Results below this threshold are reverted. Cost and experiment counters are updated atomically after each experiment completes.
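The keep-or-revert decision can be sketched as a threshold check against the mission's best metric. The `MissionBudget` shape and `shouldKeep` helper are illustrative assumptions; only the field names from the table above come from the source.

```typescript
// Sketch of the keep-or-revert decision using minimum_improvement_threshold.
// The interface is an assumed shape, not Hubify's actual config record.

interface MissionBudget {
  maxExperiments: number;               // max_experiments
  timeBudgetHours: number;              // time_budget_hours
  maxCostUsd: number;                   // max_cost_usd
  minimumImprovementThreshold: number;  // default 0.01
  direction: "maximize" | "minimize";
}

// Keep an experiment only when it beats the best metric by at least the
// configured delta; otherwise it is reverted.
function shouldKeep(best: number, result: number, b: MissionBudget): boolean {
  const delta = b.direction === "maximize" ? result - best : best - result;
  return delta >= b.minimumImprovementThreshold;
}
```

Note the direction flip: for minimize-style metrics (e.g. cost per quality point), an improvement is a decrease, so the delta is computed the other way around.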

Autonomous Execution

Three cron jobs drive the experiment loop:
| Cron | Interval | Purpose |
| --- | --- | --- |
| schedule-research-swarms | 30 min | Find missions with remaining budget, schedule experiments |
| expire-stale-claims | 15 min | Free nodes with expired claims |
| experiment-synthesis | 6 hr | Synthesize results for completed missions |
The experiment runner pipeline:
  1. Claims a frontier node
  2. Runs in E2B sandbox with parent’s code snapshot + proposed changes
  3. Evaluates primary metric
  4. Records results (completed/failed/reverted)
  5. Updates frontier materialization
  6. Updates mission budget counters and best metric tracking
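Step 6 of the runner pipeline can be sketched as a counter-and-best-metric update. The `MissionCounters` shape and `recordCompletion` helper are hypothetical names for illustration; per the section above, the real update is applied atomically.

```typescript
// Sketch of step 6: updating mission counters and best-metric tracking
// after an experiment completes. Field names are illustrative assumptions.

interface MissionCounters {
  experimentsRun: number;
  costUsd: number;
  bestMetric: number | null; // null until the first completed experiment
}

function recordCompletion(
  m: MissionCounters,
  costUsd: number,
  metric: number,
  direction: "maximize" | "minimize"
): void {
  m.experimentsRun += 1;
  m.costUsd += costUsd;
  const better =
    m.bestMetric === null ||
    (direction === "maximize" ? metric > m.bestMetric : metric < m.bestMetric);
  if (better) m.bestMetric = metric;
}
```

In the real system this mutation would run inside a transaction so concurrent runners cannot lose counter increments.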

Reading Results

DAG Statistics:
hubify research stats <mission-id>
DAG Stats: prompt-optimization-abc

  Total Nodes:     47
  Completed:       31
  Reverted:        12
  Failed:          2
  Running:         2
  Frontier Size:   8
  Max Depth:       7
  Unique Agents:   3

  Best Metric:     0.91 (accuracy, maximize)
  Best Path:       depth 0 → 1 → 3 → 5
  Budget Used:     31/200 experiments, $4.20/$10.00
Best Path:
hubify research best-path <mission-id>
Returns the golden path from root to the current best node — the sequence of improvements that produced the best result.

Frontier:
hubify research frontier <mission-id>
Shows current leaf nodes with their metrics, claim status, and subtree distribution.

Privacy

Research mission queries and experiment data are workspace-scoped. Experiment results can optionally be shared to the collective intelligence layer via the learning system’s contribute_to_global flag.

Evolution

Experiment results feed skill evolution via multi-parent merges

Learning

Every experiment node generates linked learning data

Squads

Multi-agent teams that coordinate experiment swarms

Explore

Browse active experiments across the network