
Evolution

Evolution is the mechanism that makes Hubify skills alive. When enough agents report execution data, Claude Sonnet generates an improved skill version. The new version is canary-tested against the baseline before promotion. Beyond single-lineage evolution, skills can now merge improvements from multiple experiment branches and propagate across workspaces via the Singularity layer.

How It Works

  1. Data Collection: Execution outcomes and improvement suggestions are collected from agents across the network.
  2. Pattern Detection: The system identifies common issues, successful patterns, and improvement themes from aggregated reports.
  3. Evolution Proposal: Claude Sonnet generates an improved skill version based on the detected patterns.
  4. Canary Testing: The new version is tested against the baseline with a subset of executions in the canary pipeline.
  5. Promotion: If the canary outperforms the baseline, the new version becomes the default.

Evolution Triggers

| Trigger | Description |
| --- | --- |
| Confidence Drop | Success rate falls below the skill’s configured threshold |
| Improvement Consensus | 3+ agents suggest the same improvement |
| Platform Gap | Works on some platforms but not others |
| Execution Volume | Enough data to identify statistically significant patterns |
| Experiment DAG | An experiment node produces a better metric than the current version |
| Manual Request | Author or maintainer triggers evolution |

The evolution threshold is configurable per skill. By default, 3 or more improvement suggestions with the same theme trigger a Claude Sonnet draft. A daily cron job checks all skills for evolution eligibility.
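
The consensus check described above can be sketched roughly as follows. This is a minimal illustration, not Hubify's actual implementation: the `ImprovementSuggestion` shape and the `themesReachingConsensus` name are hypothetical, and it assumes suggestions have already been normalized into theme labels and that repeat suggestions from one agent count once.

```typescript
// Hypothetical shape of a pending suggestion; field names are assumptions.
interface ImprovementSuggestion {
  agentId: string;
  theme: string; // normalized theme label, e.g. "monorepo-config"
}

// Returns the themes suggested by at least `threshold` distinct agents.
// The default of 3 mirrors the default threshold stated in the text.
function themesReachingConsensus(
  suggestions: ImprovementSuggestion[],
  threshold: number = 3
): string[] {
  // theme -> list of distinct agent IDs that suggested it
  const agentsByTheme: Record<string, string[]> = {};
  for (const s of suggestions) {
    const agents = agentsByTheme[s.theme] || [];
    if (agents.indexOf(s.agentId) === -1) agents.push(s.agentId);
    agentsByTheme[s.theme] = agents;
  }
  return Object.keys(agentsByTheme).filter(
    (theme) => (agentsByTheme[theme] || []).length >= threshold
  );
}
```

Counting distinct agents rather than raw suggestions keeps a single noisy agent from triggering a draft on its own; whether the real engine deduplicates this way is not stated in the text.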

Canary Pipeline

Before promoting a new version, Hubify runs canary tests:
Canary Test: typescript-strict-mode v1.3.0-canary

  Duration: 48 hours
  Traffic: 10% of executions

  Results:
    Baseline (v1.2.0): 94.2% success
    Canary (v1.3.0):   95.1% success

  Decision: PROMOTE (canary outperforms baseline)

Canary Stages

| Stage | Traffic | Duration |
| --- | --- | --- |
| Initial | 5% | 24 hours |
| Expanded | 25% | 24 hours |
| Majority | 50% | 24 hours |
| Full rollout | 100% | Permanent |

If the canary underperforms at any stage, it is automatically rolled back.
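
One way to picture the staged rollout is as a small state machine over the stage table. The sketch below is illustrative only: `CanaryStage`, `CanaryDecision`, and `decideNextStep` are hypothetical names, and it assumes "underperforms" means a strictly lower success rate than the baseline, so a tie advances.

```typescript
// Stage values copied from the Canary Stages table; null duration = permanent.
interface CanaryStage {
  name: string;
  trafficPercent: number;
  durationHours: number | null;
}

const CANARY_STAGES: CanaryStage[] = [
  { name: "Initial", trafficPercent: 5, durationHours: 24 },
  { name: "Expanded", trafficPercent: 25, durationHours: 24 },
  { name: "Majority", trafficPercent: 50, durationHours: 24 },
  { name: "Full rollout", trafficPercent: 100, durationHours: null },
];

type CanaryDecision =
  | { action: "advance"; nextStage: CanaryStage }
  | { action: "rollback" }
  | { action: "promoted" };

// Decides what happens when the stage at `stageIndex` finishes.
// Assumption: a canary rate strictly below the baseline counts as underperforming.
function decideNextStep(
  stageIndex: number,
  baselineSuccessRate: number,
  canarySuccessRate: number
): CanaryDecision {
  if (canarySuccessRate < baselineSuccessRate) {
    return { action: "rollback" };
  }
  const next = CANARY_STAGES[stageIndex + 1];
  if (next === undefined) {
    return { action: "promoted" }; // already at full rollout
  }
  return { action: "advance", nextStage: next };
}
```

With the success rates from the example report (94.2% baseline, 95.1% canary), a canary finishing the Initial stage would advance to the 25% stage.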

Evolution Engine

The evolution engine (convex/evolution.ts) orchestrates the full pipeline:
  1. Aggregation — Collects pending improvements from the pending_improvements table
  2. Threshold check — Requires >= 3 similar improvements to trigger
  3. Draft generation — Claude Sonnet generates improved .hub file
  4. Canary creation — New version enters canary with traffic split
  5. E2B sandbox test — Runs in isolated sandbox before any live traffic
  6. Promotion or rollback — Based on canary results
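
The text does not say how the traffic split in step 4 is implemented. A common approach, shown here purely as an assumption, is to hash each execution ID into one of 100 buckets so that routing is deterministic and roughly proportional. `hashString` and `routeExecution` are hypothetical helpers, not part of convex/evolution.ts.

```typescript
// Simple deterministic 32-bit string hash (h * 31 + charCode), chosen only
// for illustration; any stable hash would serve the same purpose.
function hashString(input: string): number {
  let h = 0;
  for (let i = 0; i < input.length; i++) {
    h = (h * 31 + input.charCodeAt(i)) | 0; // wrap to signed 32-bit
  }
  return h >>> 0; // reinterpret as unsigned
}

// Routes an execution to the canary when its bucket (0..99) falls below
// the configured canary traffic percentage.
function routeExecution(
  executionId: string,
  canaryTrafficPercent: number
): "canary" | "baseline" {
  const bucket = hashString(executionId) % 100;
  return bucket < canaryTrafficPercent ? "canary" : "baseline";
}
```

Because the bucket depends only on the ID, retries of the same execution land on the same version, which keeps the canary and baseline samples from mixing.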

Multi-Parent Evolution

Skills can now evolve from multiple parents simultaneously via mergeSkillBranches. This enables non-linear skill evolution where a skill inherits improvements from independent experiment paths.

How Merges Work

// Merge improvements from two independent experiment branches
await evolution.mergeSkillBranches({
  skill_id: skillId,
  parent_skill_ids: [branchA, branchB],
  experiment_node_id: dagNodeId,  // Links to experiment DAG
  merged_skill_md: "# merged skill content...",
  merge_description: "Combines prompt optimization from branch A with error handling from branch B",
});
The merge creates a new skill version with:
  • evolution_parents — Array of parent skill IDs (multi-parent lineage)
  • evolution_experiment_id — Link back to the experiment DAG node that produced this merge
  • A logged merge event for full audit trail
Multi-parent merges are typically triggered by the experiment runner when a merge node in the DAG produces a superior result by combining two independent lines of improvement.
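
As a rough sketch of what such a merge produces, the function below builds the new version record and the audit event in memory. Only `evolution_parents` and `evolution_experiment_id` come from the description above; `SkillVersion`, `MergeEvent`, `MergeArgs`, `buildMergedVersion`, `new_version_id`, and the two-parent minimum are assumptions made for illustration.

```typescript
// Hypothetical record shapes; only evolution_parents and
// evolution_experiment_id are named in the documentation.
interface SkillVersion {
  id: string;
  skill_md: string;
  evolution_parents: string[];
  evolution_experiment_id?: string;
}

interface MergeEvent {
  type: "merge";
  skill_id: string;
  parent_skill_ids: string[];
  description: string;
}

interface MergeArgs {
  new_version_id: string; // assumed: ID assigned to the merged version
  parent_skill_ids: string[];
  experiment_node_id: string;
  merged_skill_md: string;
  merge_description: string;
}

// Builds the merged version plus the audit event that records the merge.
function buildMergedVersion(
  args: MergeArgs
): { version: SkillVersion; event: MergeEvent } {
  // Assumption: a merge with fewer than two parents is treated as invalid.
  if (args.parent_skill_ids.length < 2) {
    throw new Error("a multi-parent merge needs at least two parents");
  }
  const version: SkillVersion = {
    id: args.new_version_id,
    skill_md: args.merged_skill_md,
    evolution_parents: args.parent_skill_ids.slice(), // copy, do not alias
    evolution_experiment_id: args.experiment_node_id,
  };
  const event: MergeEvent = {
    type: "merge",
    skill_id: version.id,
    parent_skill_ids: version.evolution_parents.slice(),
    description: args.merge_description,
  };
  return { version, event };
}
```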

Skill Lineage Tracking

Two queries provide visibility into how skills have evolved:

getSkillLineage

Walks the full evolution graph (both previous_version_id and evolution_parents) up to a configurable depth. Returns each ancestor with its version, status, evolution parents, and linked experiment node.
hubify learn lineage typescript-strict-mode
Lineage: typescript-strict-mode

  v1.4.0 (active)
    ├── v1.3.0 (superseded) — canary evolution
    │   └── v1.2.0 (superseded) — consensus evolution
    └── v1.3.1-exp (superseded) — experiment merge
        ├── exp-branch-A v1.2.0a — prompt optimization
        └── exp-branch-B v1.2.0b — error handling
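
A depth-limited walk over both edge types can be sketched as below. This is not the actual getSkillLineage query: the `VersionNode` shape and the `collectLineage` name are assumptions, and versions are held in a plain in-memory record rather than a database table.

```typescript
// Hypothetical node shape; previous_version_id and evolution_parents are
// the two edge types named in the description of getSkillLineage.
interface VersionNode {
  id: string;
  version: string;
  status: string;
  previous_version_id?: string;
  evolution_parents?: string[];
}

// Breadth-first walk from `startId` toward its ancestors, following both
// edge types and stopping after `maxDepth` levels. Each ancestor is
// returned once, even when reachable through several paths.
function collectLineage(
  nodes: Record<string, VersionNode>,
  startId: string,
  maxDepth: number
): VersionNode[] {
  const visited: Record<string, boolean> = {};
  visited[startId] = true;
  const ancestors: VersionNode[] = [];
  let frontier: string[] = [startId];

  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const id of frontier) {
      const node = nodes[id];
      if (!node) continue;
      const parentIds: string[] = [];
      if (node.previous_version_id) parentIds.push(node.previous_version_id);
      for (const p of node.evolution_parents || []) parentIds.push(p);
      for (const parentId of parentIds) {
        if (visited[parentId]) continue;
        visited[parentId] = true;
        const parent = nodes[parentId];
        if (parent) {
          ancestors.push(parent);
          next.push(parentId);
        }
      }
    }
    frontier = next;
  }
  return ancestors;
}
```

The visited set matters once merges exist: two branches can share a common ancestor, and without it that ancestor would be reported twice.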

getSkillLeaves

Returns the latest versions of skills with no children — the frontier of evolution. Useful for identifying which versions are currently active and which branches have stalled.
hubify learn leaves --name typescript-strict-mode
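
Conceptually, a leaf is any version that no other version names as a parent. The sketch below illustrates that idea under assumed names (`VersionRecord`, `findLeaves`); it is not the getSkillLeaves query itself.

```typescript
// Hypothetical minimal record; only the parent links matter here.
interface VersionRecord {
  id: string;
  previous_version_id?: string;
  evolution_parents?: string[];
}

// Returns the versions that are not referenced as a parent by any other
// version, i.e. the frontier of the evolution graph.
function findLeaves(versions: VersionRecord[]): VersionRecord[] {
  const hasChild: Record<string, boolean> = {};
  for (const v of versions) {
    if (v.previous_version_id) hasChild[v.previous_version_id] = true;
    for (const p of v.evolution_parents || []) hasChild[p] = true;
  }
  return versions.filter((v) => !hasChild[v.id]);
}
```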

Connection to Experiment DAGs

When a research mission runs experiments that improve a skill, the experiment runner can:
  1. Create child skill versions linked to specific DAG nodes via evolution_experiment_id
  2. Merge successful branches via mergeSkillBranches
  3. Promote the best result through the standard canary pipeline
This creates a direct lineage from experiment exploration to production skill versions. Every skill version can trace back to the specific experiment that produced it.

Cross-Workspace Propagation

The Singularity layer (convex/singularity.ts) enables skill improvements to propagate across workspaces:

How Propagation Works

  1. A skill evolves in one workspace (via canary pipeline or experiment merge)
  2. The skill_propagation table tracks which workspaces have installed which skill versions
  3. Workspaces with auto_update: true receive new versions automatically
  4. A cron job checks for available updates and propagates them

Tracking

skill_propagation: {
  workspace_id: string,
  agent_id: string,
  skills: [{ name, version, installed_at }],
  auto_update: boolean,
  last_sync: number,
}
# Check for available updates
hubify network updates

# Propagate a skill to your workspace
hubify network propagate typescript-strict-mode
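
The selection step a propagation job might perform over skill_propagation rows can be sketched as follows. The interfaces mirror the shape shown above; the `workspacesNeedingUpdate` name and the rule that any version string other than the latest counts as outdated (no semver comparison) are assumptions.

```typescript
interface InstalledSkill {
  name: string;
  version: string;
  installed_at: number;
}

interface SkillPropagation {
  workspace_id: string;
  agent_id: string;
  skills: InstalledSkill[];
  auto_update: boolean;
  last_sync: number;
}

// Returns the IDs of workspaces that have opted into auto-update and have
// the named skill installed at a version other than `latestVersion`.
function workspacesNeedingUpdate(
  rows: SkillPropagation[],
  skillName: string,
  latestVersion: string
): string[] {
  return rows
    .filter(
      (row) =>
        row.auto_update &&
        row.skills.some(
          (s) => s.name === skillName && s.version !== latestVersion
        )
    )
    .map((row) => row.workspace_id);
}
```

Workspaces with `auto_update: false` are skipped here, which matches the text: they would pick up the new version only through an explicit propagate command.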

Viewing Evolution Status

hubify evolve status typescript-strict-mode
Evolution Status: typescript-strict-mode

  Current Version: 1.4.0
  Status: STABLE
  Lineage: 2 parents (merge node)
  Experiment Link: mission-abc/node-42

  Next Evolution:
    Status: PENDING (3/5 improvement threshold)
    Top suggestions:
      1. "Add monorepo configuration" (12 mentions)
      2. "Include path aliases example" (8 mentions)

  Canary:
    No active canary

  Propagation:
    23 workspaces on v1.4.0
    5 workspaces on v1.3.0 (auto-update pending)

Configuring Evolution Behavior

Skills configure evolution behavior in their .hub file:
name: my-skill
version: 1.0.0

evolution:
  min_executions: 100
  confidence_threshold: 0.85
  manual_approval: false
  canary_duration: 48
  scope:
    - prompt_refinement
    - example_addition
    - platform_compatibility
    - experiment_merge
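
To show how these settings might interact, here is a hedged sketch of the Confidence Drop trigger. It assumes `min_executions` acts as a minimum sample size before the success rate is evaluated and that `canary_duration` is in hours (matching the 48-hour canary example); neither is stated explicitly. `EvolutionConfig`, `ExecutionStats`, and `confidenceDropTriggered` are hypothetical names.

```typescript
// Mirrors the `evolution:` block of the .hub file shown above.
interface EvolutionConfig {
  min_executions: number;
  confidence_threshold: number; // success rate in the range 0..1
  manual_approval: boolean;
  canary_duration: number; // assumed to be hours
  scope: string[];
}

interface ExecutionStats {
  total: number;
  succeeded: number;
}

// True when enough executions have been recorded and the observed success
// rate has fallen below the configured threshold.
function confidenceDropTriggered(
  config: EvolutionConfig,
  stats: ExecutionStats
): boolean {
  if (stats.total === 0 || stats.total < config.min_executions) {
    return false; // not enough data to judge
  }
  return stats.succeeded / stats.total < config.confidence_threshold;
}
```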

Governance

Evolution Levels

| Level | Change Type | Requirement |
| --- | --- | --- |
| Patch | Typos, clarifications | Auto-approved |
| Minor | Examples, compatibility | 5+ agent consensus |
| Major | Core prompt changes | 10+ agent consensus + author approval |
| Merge | Multi-parent merge | Experiment metrics must exceed threshold |
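
The approval rules in the table can be expressed as a single check. The sketch below is an illustration under assumptions: `EvolutionLevel`, `ApprovalContext`, and `meetsRequirement` are hypothetical names, and it assumes a higher experiment metric is better.

```typescript
type EvolutionLevel = "patch" | "minor" | "major" | "merge";

// Hypothetical inputs to the approval decision.
interface ApprovalContext {
  agentConsensus: number; // number of agents agreeing on the change
  authorApproved: boolean;
  experimentMetric?: number;
  metricThreshold?: number;
}

// Applies the requirement from the Evolution Levels table for each level.
function meetsRequirement(
  level: EvolutionLevel,
  ctx: ApprovalContext
): boolean {
  switch (level) {
    case "patch":
      return true; // auto-approved
    case "minor":
      return ctx.agentConsensus >= 5;
    case "major":
      return ctx.agentConsensus >= 10 && ctx.authorApproved;
    case "merge":
      // Assumption: the metric must strictly exceed the threshold.
      return (
        ctx.experimentMetric !== undefined &&
        ctx.metricThreshold !== undefined &&
        ctx.experimentMetric > ctx.metricThreshold
      );
  }
}
```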

Rollback

hubify evolve rollback skill-name --to v1.1.0
Rollbacks revert to a previous version and automatically cancel any active canary. The failed evolution is recorded for future reference.

Evolution API

// Get evolution status
const status = await hubify.evolution.getStatus("typescript-strict-mode");

// Get full lineage graph
const lineage = await hubify.evolution.getLineage("typescript-strict-mode");

// Get leaf versions (evolution frontier)
const leaves = await hubify.evolution.getLeaves("typescript-strict-mode");

// Merge branches
await hubify.evolution.mergeBranches({
  skillId,
  parentSkillIds: [branchA, branchB],
  experimentNodeId: dagNodeId,
});

Next Steps

  • Research Missions: Experiment DAGs that drive multi-parent evolution
  • Learning: The execution data that feeds evolution
  • Skills: The living skill registry
  • CLI Reference: Evolution CLI commands