Hugging FaceGitHubgithub.com/evalstate
MCP Dev Summit · New York · 2026
MCP at 18 Months
Protocols, patterns, and the things we did not see coming.
Shaun Smith · @evalstate
April 2026
Hugging Face huggingface.co/evalstate
GitHub github.com/evalstate
X x.com/evalstate

Shaun Smith @evalstate

  • Open Source @ Hugging Face
  • MCP maintainer / transports working group
  • huggingface/hf-mcp-server
  • huggingface/upskill
  • huggingface/skills
  • Maintainer of fast-agent



The debate about MCP is more interesting than MCP itself

...and that is a good thing!

Things we didn't have at launch

Streamable HTTP Transport and OAuth

AGENTS.md and Agent Skills

Internal Tools in Inference APIs

Agent Client Protocol
Responses API

Long Running Tool Loops (and reasoning models)


Reinforcement Learning

Models are placed in an environment, given a task, and scored with a reward function. They learn to:

  • discover
  • self-correct
  • problem solve
  • keep driving the loop without constant human steering

mini-SWE-Agent: A single 100-line Python script with a single freeform (non-JSON) tool can score 76.0% on SWE-Bench!

It's hard to compete against that efficiency.
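
To make the "single freeform tool" idea concrete, here is a minimal sketch of a bash-only tool loop. It is not the mini-SWE-Agent code: `run_bash`, `tool_loop`, the `DONE` sentinel and the `scripted_model` stand-in are all assumptions made for illustration, and a real harness would call an LLM and sandbox the shell.

```python
import subprocess
from typing import Callable, List


def run_bash(command: str, timeout: int = 30) -> str:
    """Run one shell command and return its output plus the exit code."""
    proc = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=timeout
    )
    return f"{proc.stdout}{proc.stderr}(exit code {proc.returncode})"


def tool_loop(
    model: Callable[[List[str]], str], task: str, max_steps: int = 10
) -> List[str]:
    """Drive the loop: the model reads the transcript and replies with either
    a raw shell command (freeform text, no JSON) or the sentinel 'DONE'."""
    transcript = [f"TASK: {task}"]
    for _ in range(max_steps):
        action = model(transcript).strip()
        if action == "DONE":
            break
        transcript.append(f"$ {action}")
        transcript.append(run_bash(action))
    return transcript


def scripted_model(transcript: List[str]) -> str:
    """Hypothetical stand-in for an LLM: issues one command, then stops."""
    return "echo hello" if len(transcript) == 1 else "DONE"


history = tool_loop(scripted_model, "say hello")
```

The whole harness is a transcript, one tool and a stop condition; everything else is left to the model.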

[Figures: reinforcement learning environment diagram; SWE-Bench bash tool benchmark result]

Smarter Tool Loops

Harness Changes

General Purpose Agent Harnesses are given direct Shell access

Fewer pre/post tool and LLM stop checks (or hacks) are needed to keep the model on track.

Snapshot/Checkpointing techniques (AgentFS, Execution Monitoring)

Remote runtime environments (e.g. Codex Web, Claude Code)

Why this enabled Skills

Simple navigable, native hierarchy of content

Reusable procedures become strong scaffolding for capable models

Bash is token-dense and unsurprising compared to custom JSON tools or mid-context tool enablement

A skill sits between a deterministic program and documentation.

Once models can discover, recover, and keep going, a “skill” becomes a practical acceleration layer rather than a brittle scripted hack.
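
The "navigable hierarchy" point can be sketched as a directory walk. The example assumes each skill lives in its own folder with a `SKILL.md` whose front matter carries `name` and `description`; the `list_skills` and `read_frontmatter` helpers, the toy parser and the `pdf-report` skill are illustrative, not part of any published tooling.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def read_frontmatter(path: Path) -> Dict[str, str]:
    """Parse simple 'key: value' lines between the leading '---' markers."""
    lines = path.read_text(encoding="utf-8").splitlines()
    meta: Dict[str, str] = {}
    if not lines or lines[0].strip() != "---":
        return meta
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def list_skills(root: Path) -> List[Dict[str, str]]:
    """Return name, description and path for every SKILL.md under root."""
    skills = []
    for skill_file in sorted(root.rglob("SKILL.md")):
        meta = read_frontmatter(skill_file)
        skills.append({
            "name": meta.get("name", skill_file.parent.name),
            "description": meta.get("description", ""),
            "path": str(skill_file),
        })
    return skills


# Demo with a throwaway directory and a made-up skill.
with tempfile.TemporaryDirectory() as tmp:
    skill_dir = Path(tmp) / "pdf-report"
    skill_dir.mkdir()
    (skill_dir / "SKILL.md").write_text(
        "---\nname: pdf-report\ndescription: Build a PDF report\n---\nSteps...\n",
        encoding="utf-8",
    )
    found = list_skills(Path(tmp))
```

Only the short name and description need to sit in context; the model can open the full file with its shell tool when the skill becomes relevant.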


Dynamic Tool Calling

Dynamic Space Tool: 45 tokens

MCP provides an inference gateway to thousands of specialized and custom models covering Audio, Video, Text, 3D Models, Environments and more.

MCP provides Authentication and Multimodal support.

Qwen 3.5-35B-A3B
Flux.1-Krea-Dev
Qwen-Edit-2509-Multiple-angles-LoRA
Wan2.2 First/Last Frame


Code Execution Tools

A model with access to general-purpose tools has crossed into a very real form of code mode.

Bash provides a general-purpose, token-dense execution language.

Task-specific tools generated on demand. Example: HF Tool Builder navigates OpenAPI spec to build composable CLI tools.
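
As a rough illustration of turning an OpenAPI description into a composable CLI, the sketch below maps each operation to an `argparse` subcommand. This is not how HF Tool Builder is implemented; `build_cli` and the tiny `SPEC` fragment (including the `/models` path and `list-models` operation) are invented for the example.

```python
import argparse
from typing import Any, Dict

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def build_cli(spec: Dict[str, Any]) -> argparse.ArgumentParser:
    """Create one subcommand per OpenAPI operation, one flag per parameter."""
    parser = argparse.ArgumentParser(prog="api")
    subcommands = parser.add_subparsers(dest="operation", required=True)
    for path, methods in spec.get("paths", {}).items():
        for method, op in methods.items():
            if method not in HTTP_METHODS:
                continue  # skip path-level keys such as shared parameters
            sub = subcommands.add_parser(
                op["operationId"], help=op.get("summary", "")
            )
            sub.set_defaults(method=method.upper(), path=path)
            for param in op.get("parameters", []):
                sub.add_argument(
                    f"--{param['name']}", required=param.get("required", False)
                )
    return parser


# Hypothetical spec fragment, not a real API description.
SPEC = {
    "paths": {
        "/models": {
            "get": {
                "operationId": "list-models",
                "summary": "List models",
                "parameters": [
                    {"name": "search", "in": "query", "required": False}
                ],
            }
        }
    }
}

args = build_cli(SPEC).parse_args(["list-models", "--search", "whisper"])
```

The parsed `args` carry the HTTP method, path and parameters, so a thin wrapper could issue the request and print the response for the model to pipe through other shell tools.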

Some models are trained to use code tools natively, and are bundled with interpreters.


Generation and Execution Environments

Style 1 - Main Model owns Code Generation

Main model
Generates Search Function
Execution Tool
Uses Search Function to return API definitions
Main model
Generates code from that API surface
Execution tool
Runs the code and returns output
Main model
Reads result and writes final answer
Code Generation: Main Model
Code Execution: Tool Environment

Style 2 - Delegated Code Generation

Main model
Sends a natural-language task to the tool
Execution tool
System Prompt contains API definitions
Execution tool
Returns the result
Main model
Packages it as the final answer
Code Generation: Tool Model
Code Execution: Tool Environment
API Definitions Cacheable

MCP makes it easy to transfer generation and execution between models and environments!
(and who pays for inference)
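
The two styles can be contrasted in a small sketch. Everything here is a stand-in: `execute` is a toy, unsandboxed execution tool, `API_DOCS` is a made-up API surface, and `scripted_model` replaces real LLM calls so the flow can run end to end.

```python
from typing import Callable, Dict

# Hypothetical API surface exposed inside the execution environment.
API_DOCS = {"add": "add(a: int, b: int) -> int"}

Model = Callable[[str], str]


def execute(code: str) -> str:
    """Toy execution tool: runs code against the API and returns `result`."""
    namespace: Dict[str, object] = {
        "add": lambda a, b: a + b,
        "search_api": lambda query: [
            doc for name, doc in API_DOCS.items() if query in name
        ],
    }
    exec(code, namespace)  # a real environment would sandbox this
    return str(namespace["result"])


def style_one(main_model: Model, task: str) -> str:
    """Style 1: the main model generates code; the tool only executes it."""
    search_code = main_model(f"Write code to find APIs for: {task}")
    api_surface = execute(search_code)
    task_code = main_model(f"APIs: {api_surface}\nWrite code for: {task}")
    return execute(task_code)


def style_two(tool_model: Model, task: str) -> str:
    """Style 2: the tool's own model generates code from a natural-language
    task; the API definitions sit in its system prompt."""
    system_prompt = "APIs: " + "; ".join(API_DOCS.values())  # static, cacheable
    task_code = tool_model(f"{system_prompt}\nWrite code for: {task}")
    return execute(task_code)


def scripted_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call."""
    if prompt.startswith("Write code to find APIs"):
        return "result = search_api('add')"
    return "result = add(2, 3)"


answer_one = style_one(scripted_model, "add 2 and 3")
answer_two = style_two(scripted_model, "add 2 and 3")
```

In Style 1 the main model pays for two generations and sees the API surface in its own context; in Style 2 it sends one sentence and the generation cost moves to whichever model sits behind the tool.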


LLMs for Navigating: GenUI, Apps SDK (Prefect Prefab)

A common pattern:

  1. user asks for navigation or retrieval
  2. tools fetch the answer
  3. the model then spends expensive output tokens reprocessing a result that was already good enough

The MCP Apps pattern fixes this by letting the result become final for the user.

Inference and Environment Boundaries are Blurring

The new abstraction

Runtime Environments

Options range from YOLO, through local/remote containers, to exe.dev-style or lightweight sandboxes (Monty, Just-Bash). Simple persistent storage (e.g. HF Buckets)

Model Selection

Mixed Model workloads handle different modalities, specializations and price points. Token-efficient task agent delegation.

Inference APIs

Increasingly absorb search, tools, code, and state into one bundled execution surface.


Agent Client Protocol

File and Shell Tools

Client-provided tools enable "follow along" in editors

Session Based

Listing, Resumption and Rehydration of Agent sessions

Streaming Results and Observability

Agent results and tool status are streamed, and are cancellable

MCP Native Support

Uses MCP Data Model. Client sends MCP Server Configurations
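
A sketch of what "client sends MCP server configurations" might look like on the wire. The `session/new` method and the `cwd` / `mcpServers` fields follow the ACP schema as understood at the time of writing and should be checked against the current spec; the server name, command and project path are made up.

```python
import json
from typing import Any, Dict, List


def new_session_request(
    request_id: int, cwd: str, mcp_servers: List[Dict[str, Any]]
) -> str:
    """Build a JSON-RPC 2.0 request asking the agent to start a session."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "session/new",  # assumed ACP method name
        "params": {"cwd": cwd, "mcpServers": mcp_servers},  # assumed fields
    }
    return json.dumps(message)


# Hypothetical stdio server entry supplied by the client (editor).
servers = [{"name": "hf", "command": "hf-mcp-server", "args": [], "env": []}]

wire = new_session_request(1, "/home/user/project", servers)
decoded = json.loads(wire)
```

The point is that the editor, not the agent, decides which MCP servers a session gets.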



Open Responses

Open standard extending OpenAI's Responses API. Provides a consistent, provider-neutral way to interact with modern LLMs. Repairs Chat Completion API drift.

It defines a shared schema and tooling layer that enable a unified experience for calling language models, streaming results, and composing agentic workflows, independent of provider.


Usage as a Provider / Router allows creation of rich Agent Environments

Internal Tools - (Model or Provider)
  • shell and local_shell
  • code_interpreter
  • apply_patch
  • web_search
  • etc.
External Tools (Client Supplied)
  • MCP Servers
  • Standard JSON function calls
  • Free-Form Tools
  • Grammar constrained Tools
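
The split between internal and external tools can be sketched as a single request body. The tool shapes below follow OpenAI's Responses API as understood here, and an Open Responses provider may accept a different subset; the model name, server URL and tool names are placeholders, and `build_request` / `split_tools` are illustrative helpers only.

```python
from typing import Any, Dict, List

# Tool types the slide lists as internal (run by the model or provider).
INTERNAL_TOOL_TYPES = {
    "shell", "local_shell", "code_interpreter", "apply_patch", "web_search",
}


def build_request(model: str, prompt: str) -> Dict[str, Any]:
    """Assemble one request mixing an internal tool with client-supplied ones."""
    tools: List[Dict[str, Any]] = [
        {"type": "web_search"},  # internal tool
        {   # MCP server (placeholder URL)
            "type": "mcp",
            "server_label": "example",
            "server_url": "https://example.com/mcp",
        },
        {   # standard JSON function call
            "type": "function",
            "name": "get_weather",
            "description": "Look up the weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
        {   # free-form tool: the model sends raw text, not JSON
            "type": "custom",
            "name": "run_script",
            "description": "Run a short shell script.",
        },
        {   # grammar-constrained tool: output must match the pattern
            "type": "custom",
            "name": "pick_date",
            "description": "Return an ISO date.",
            "format": {
                "type": "grammar",
                "syntax": "regex",
                "definition": r"[0-9]{4}-[0-9]{2}-[0-9]{2}",
            },
        },
    ]
    return {"model": model, "input": prompt, "tools": tools}


def split_tools(request: Dict[str, Any]) -> Dict[str, List[str]]:
    """Group the request's tool types into internal and external."""
    types = [tool["type"] for tool in request["tools"]]
    return {
        "internal": [t for t in types if t in INTERNAL_TOOL_TYPES],
        "external": [t for t in types if t not in INTERNAL_TOOL_TYPES],
    }


request = build_request("example-model", "Plan a trip for next Friday.")
groups = split_tools(request)
```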

It was close....! PMF for MCP

MCP is a Commodity Standard

Supports Consumer, Enterprise and Developer use-cases.

Single URL to install authenticated JSON tools across thousands of clients

MCP's "fit" features weren't present at launch!

URI/Resources based extensions deliver innovation and extensibility...

...Which enabled rapid MCP Apps distribution on a solid support base.

Model/Host Changes and STDIO

Host applications with a Shell tool reduce the need for STDIO Servers.

In many cases, locally run tools such as Apify mcp-cli or Pete Steinberger's MCPorter offer a better experience for MCP usage.

Distribution via MCPB is one potential advantage

Simple one-shot server design meant that distribution of ideas was more important than code.


Thank You!