
NVIDIA BioNeMo: Build an AI Scientist with BioNeMo Agent Toolkit

NVIDIA’s BioNeMo Agent Toolkit packages biomolecular NIM models as agent‑callable Skills.

The Brieftide

TL;DR

  • NVIDIA’s BioNeMo Agent Toolkit packages biomolecular NIM models as agent‑callable Skills.
  • The toolkit exposes optimized NIM models as documented, agent‑callable services for tasks such as structure prediction, docking, molecular generation, sequence analysis, and genomics.
  • An agent runtime can discover the platform via the BioNeMo Agent Toolkit GitHub repository and then use a Skill to call either hosted NIM endpoints or a local NIM deployment.

NVIDIA published the BioNeMo Agent Toolkit on Jun 23, 2026, a collection of BioNeMo Skills and Model Context Protocol (MCP) wrappers that turn its accelerated biomolecular stack into tools an AI "scientist" can discover and call. The toolkit exposes optimized NIM models as documented, agent‑callable services for tasks such as structure prediction, docking, molecular generation, sequence analysis, and genomics.

How does BioNeMo make biomolecular models agent‑callable?

BioNeMo packages NIM models behind Skills that describe purpose, required inputs, optional parameters, expected artifacts, and failure modes so an agent can choose, invoke, and interpret a model correctly. The toolkit layers BioNeMo Skills and MCP server wrappers on top of NVIDIA NIM and open models, and those models are accelerated by libraries such as cuEquivariance (for structure models) and Parabricks (for genomics). An agent runtime can discover the platform via the BioNeMo Agent Toolkit GitHub repository and then use a Skill to call either hosted NIM endpoints or a local NIM deployment.
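
The five things a Skill is said to describe (purpose, required inputs, optional parameters, expected artifacts, failure modes) can be pictured as a small record an agent consults before calling a model. The sketch below is an illustration only: the class name, field names, and the example values are assumptions made for this brief, not the toolkit's actual schema.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SkillDescriptor:
    """Hypothetical in-memory view of one Skill (not the toolkit's real schema)."""

    name: str
    purpose: str
    required_inputs: tuple[str, ...]
    optional_params: tuple[str, ...] = ()
    expected_artifacts: tuple[str, ...] = ()
    failure_modes: tuple[str, ...] = ()

    def missing_inputs(self, provided: dict) -> list[str]:
        """Return the required inputs the agent has not supplied yet."""
        return [key for key in self.required_inputs if key not in provided]


# Illustrative entry: every value here is assumed, not copied from the repository.
docking = SkillDescriptor(
    name="diffdock",
    purpose="Dock a small-molecule ligand against a protein structure",
    required_inputs=("protein_structure", "ligand"),
    optional_params=("num_poses",),
    expected_artifacts=("SDF",),
    failure_modes=("invalid ligand file", "endpoint unavailable"),
)

# The agent can check its inputs before spending a call on the model.
print(docking.missing_inputs({"ligand": "CCO"}))  # prints ['protein_structure']
```

Checking inputs against the descriptor before invoking anything is one plausible way documentation of this kind cuts down on failed calls and retries.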

The primary deployment options are hosted NIM endpoints for quick access and ease of scale, and local NIM deployment when repeated calls, lower warm latency, data locality, or tighter runtime control are required. Skills and MCP wrappers indicate where a model is available, how to call it, and what artifact to expect, for example CIF, SDF, FASTA, A3M, or SMILES files.
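
Because each Skill names the artifact to expect, an agent can run a cheap sanity check on what comes back. The following is a minimal sketch, assuming the artifact is written to a file; the suffix lists and the function name are illustrative assumptions, and a suffix check says nothing about whether the file contents are valid.

```python
from pathlib import Path

# Artifact kinds named in the article, mapped to commonly used file suffixes.
# The suffix sets are an assumption for illustration, not a toolkit definition.
ARTIFACT_SUFFIXES = {
    "CIF": {".cif", ".mmcif"},
    "SDF": {".sdf"},
    "FASTA": {".fasta", ".fa"},
    "A3M": {".a3m"},
    "SMILES": {".smi", ".smiles"},
}


def matches_expected_artifact(path: str, expected: str) -> bool:
    """Return True if the file suffix is one we associate with the expected kind."""
    suffix = Path(path).suffix.lower()
    return suffix in ARTIFACT_SUFFIXES.get(expected.upper(), set())


print(matches_expected_artifact("poses/ligand_rank1.sdf", "SDF"))  # prints True
print(matches_expected_artifact("model.pdb", "CIF"))  # prints False
```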

How much does using Skills change agent performance?

NVIDIA's internal benchmarks show substantial gains: using Codex CLI with GPT-5.5 fast, agents with access to BioNeMo NIM Skills improved task completion from 57.1% to 100% and achieved a 2x improvement in passing assertions per token consumed. NVIDIA compared the same agent running with and without Skills, evaluating both correctness (select the right model, prepare valid inputs, return expected artifacts, explain results) and efficiency (single‑call latency, parameter‑sweep latency, token use).

The company also measured token efficiency across ten NIM Skills and presented a bar chart showing, on average, a 2x improvement in the number of passing assertions per 1k tokens when Skills are available. All metrics cited were measured with Codex CLI and GPT-5.5 fast; the Skills themselves are designed to be agent‑agnostic, so similar improvements can be expected with other agent backends.
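
To make the two reported metrics concrete, here is a minimal sketch of how a completion rate and "passing assertions per 1k tokens" could be computed. The function names are hypothetical and the input numbers are invented placeholders chosen for round arithmetic; they are not NVIDIA's benchmark data.

```python
def completion_rate(completed: int, total: int) -> float:
    """Percentage of tasks the agent finished."""
    return 100.0 * completed / total


def assertions_per_1k_tokens(passing: int, tokens: int) -> float:
    """Passing assertions normalized to 1,000 tokens consumed."""
    return 1000.0 * passing / tokens


# Placeholder inputs for illustration only (not measured values).
print(completion_rate(completed=3, total=5))  # prints 60.0

baseline = assertions_per_1k_tokens(passing=12, tokens=60_000)
with_skills = assertions_per_1k_tokens(passing=20, tokens=50_000)

# A ratio of 2 is what a "2x improvement" means under this definition.
print(round(with_skills / baseline, 2))  # prints 2.0
```

Note that the ratio can improve either because more assertions pass, because fewer tokens are spent, or both, so the headline figure alone does not say which effect dominates.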

What does a typical agent workflow look like?

An AI scientist starts with a scientific goal, selects models, prepares inputs, runs models, inspects outputs, and explains results with caveats; BioNeMo supplies deployable model services for each step. Example steps NVIDIA highlights include an MSA search with MMseqs2, folding a peptide with Boltz‑2 or OpenFold3, generating molecules with GenMol, and docking a ligand with DiffDock. The repository and Skills let an agent enumerate available capabilities before acting and then use a consistent prompt pattern to operate any Skill.
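
The example steps can be read as an ordered plan in which each step hands its artifacts to the next. The sketch below is a toy under stated assumptions: `run_plan`, `fake_call_skill`, the step labels, and the artifact kind paired with each tool are hypothetical, and the stub stands in for whatever a real Skill invocation would do.

```python
from typing import Callable

# (step label, tool named in the article, artifact kind assumed for illustration)
PLAN = [
    ("msa_search", "MMseqs2", "A3M"),
    ("fold", "OpenFold3", "CIF"),
    ("generate", "GenMol", "SMILES"),
    ("dock", "DiffDock", "SDF"),
]


def run_plan(plan, call_skill: Callable[[str, str, dict], str]) -> dict:
    """Run steps in order, passing all earlier artifacts to each later step."""
    artifacts: dict = {}
    for label, tool, kind in plan:
        # Give the step a copy so it cannot mutate the shared record.
        artifacts[label] = call_skill(tool, kind, dict(artifacts))
    return artifacts


def fake_call_skill(tool: str, kind: str, inputs: dict) -> str:
    """Stub that returns a placeholder file name instead of calling a model."""
    return f"{tool.lower()}_result.{kind.lower()}"


result = run_plan(PLAN, fake_call_skill)
print(result["dock"])  # prints diffdock_result.sdf
```

Swapping the stub for a function that actually invokes a Skill would leave the loop unchanged, which is the appeal of a consistent calling pattern.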

Hosted endpoints such as the example OpenFold3 endpoint at https://build.nvidia.com/openfold3 are recommended for development and broad access, while local deployment (for example at http://localhost:8000) is advised when repeated, latency‑sensitive loops justify it. NVIDIA cautions that endpoints at build.nvidia.com are for small‑scale development and testing only, not production‑grade inference.
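
The hosted-versus-local guidance amounts to a simple decision rule. The sketch below encodes it with the two example addresses quoted above; the function name, its parameters, and especially the call-count threshold are assumptions for illustration, and no request is actually sent.

```python
HOSTED_URL = "https://build.nvidia.com/openfold3"  # example hosted endpoint from the article
LOCAL_URL = "http://localhost:8000"  # example local address from the article


def choose_base_url(
    expected_calls: int,
    latency_sensitive: bool = False,
    data_must_stay_local: bool = False,
    repeat_threshold: int = 50,  # arbitrary cutoff, assumed for this sketch
) -> str:
    """Pick local deployment when repetition, latency, or data locality call for it."""
    if data_must_stay_local or latency_sensitive or expected_calls >= repeat_threshold:
        return LOCAL_URL
    return HOSTED_URL


print(choose_base_url(expected_calls=3))  # prints https://build.nvidia.com/openfold3
print(choose_base_url(expected_calls=500))  # prints http://localhost:8000
```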

Why it matters

BioNeMo addresses a concrete operational gap: agents need more than model weights and APIs; they need documented, discoverable interfaces that explain when and how to use a model and what outputs to expect. Packaging these capabilities as Skills reduces setup friction, lowers error and retry rates, and shortens iterative loops in biomolecular research. The reported jump from 57.1% to 100% task completion and the 2x token efficiency gain indicate that packaging and documentation, not just raw model access, materially change agent reliability and the cost of using models in discovery workflows.

What to watch

Watch whether teams follow NVIDIA’s recommended pattern: begin with hosted NIM endpoints for evaluation and move selected models local when latency, throughput, security, or repeated iteration justify it. Also watch adoption of the broader platform components NVIDIA mentions, Nemotron and the NVIDIA NeMo Agent Toolkit, as a signal that teams are attempting full orchestration and memory for multi‑step AI scientists.

For hands‑on developers, the BioNeMo Agent Toolkit repository (https://github.com/NVIDIA-BioNeMo/bionemo-agent-toolkit) is the entry point to enumerate Skills and MCP wrappers and to begin integrating NIM services into agent workflows.

Figure: How an AI scientist uses BioNeMo Skills and NIM. Diagram labels: Agent runtime (Codex, Claude); BioNeMo Agent Toolkit (GitHub); BioNeMo Skills / MCP wrappers; Hosted NIM endpoints (build.nvidia.com); Local NIM deployment (local GPU node); Accelerated libraries (cuEquivariance, Parabricks); Expected artifacts (CIF, SDF, FASTA, A3M, SMILES).

Written by The Brieftide · Source: NVIDIA
