Mohammad Sayed

Code, life, and everything in between

Category: AI & Emerging Tech

  • The Rise of the Agentic Web


    There is a quiet revolution happening under the hood of AI. We are shifting from asking models questions toward letting agents act on our behalf. In this shift, the web, that familiar space built for humans, must evolve too. This is where the idea of an agentic web comes alive, and where WebMCP becomes a bridge between agentic intelligence and the real-world web.

    What Is the Agentic Web?

    By “agentic web,” I mean an internet where AI agents are autonomous, goal-oriented systems. They do not just read pages but also interact, execute tasks, and collaborate across services. The web becomes a space not only for humans and browsers but also for agents.

    This is more than simply adding AI on top of websites. It is a shift from humans doing everything to delegating some work to intelligent systems. Agents can traverse websites, invoke actions, orchestrate workflows, and maintain shared state with users.

    The paper Agentic Web: Weaving the Next Web with AI Agents describes it across three dimensions: intelligence, interaction, and economics. The future web may not just be a network of documents but a network of tools, protocols, and delegated tasks.

    Why Today’s Web Fails Agents

    When you ask an LLM-based agent to interact with a site today, you hit several problems:

    1. Opaque UI: agents parse raw HTML, guess which buttons matter, and derive semantics from visual markup. It is fragile.
    2. Lost context: state does not travel. Agents re-derive context every time they refresh or navigate.
    3. Inefficiency: sending full-page snapshots or screenshots is expensive and slow.
    4. Security gaps: if an agent can click and submit, how do we stop it from causing harm?

    In short, the web was built for humans. For agents, it is noisy, ambiguous, and costly.

    WebMCP: A Bridge for Developers

    WebMCP solves this by letting developers explicitly expose tools as JavaScript functions with clear schemas that agents can call directly. It makes parts of the UI machine-readable without needing to rebuild backends.

    How it works:

    • The page registers named tools with descriptions and input/output schemas
    • The agent discovers available tools via the Model Context Protocol (MCP)
    • The agent calls a tool with structured arguments
    • The page runs it in context and returns structured results
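    The flow above can be sketched as a small in-memory registry. Everything here is illustrative: `ToolRegistry` and the `searchProducts` tool are made-up names, not the actual WebMCP API, but the register/discover/call shape is the same.

```javascript
// Minimal sketch of the register/discover/call flow.
// ToolRegistry and the tool names are illustrative, not the real WebMCP API.
class ToolRegistry {
  constructor() { this.tools = new Map(); }

  // The page registers a named tool with a description and input schema.
  register(name, description, inputSchema, handler) {
    this.tools.set(name, { name, description, inputSchema, handler });
  }

  // The agent discovers the available tools (schemas only, no handlers).
  listTools() {
    return [...this.tools.values()].map(
      ({ name, description, inputSchema }) => ({ name, description, inputSchema })
    );
  }

  // The agent calls a tool with structured arguments; the page runs it
  // in context and returns a structured result.
  async callTool(name, args) {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`Unknown tool: ${name}`);
    return tool.handler(args);
  }
}

// Example: a page exposing a single hypothetical "searchProducts" tool.
const registry = new ToolRegistry();
registry.register(
  'searchProducts',
  'Search the product catalog by keyword',
  { type: 'object', properties: { query: { type: 'string' } }, required: ['query'] },
  async ({ query }) => ({ results: [`${query}-result-1`] })
);
```

    The point is that the agent never touches the DOM: it only ever sees names, descriptions, and schemas, and calls handlers the page chose to expose.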

    This approach makes agent work cheaper and more reliable. Tests show WebMCP can cut agent processing by about 67 percent while keeping accuracy high. Developers keep control of what actions exist and how they behave.

    MCP: The Standard Underneath

    The Model Context Protocol (MCP) defines how agents talk to external tools and data sources. An MCP server exposes actions; an agent acts as a client that queries and executes them.

    WebMCP extends this idea into the browser. A page can act like an MCP server and offer its actions directly to agents. For developers, this means your web app can be an active participant in the agentic ecosystem without rewriting APIs.

    Challenges and Risks

    The move toward an agentic web brings serious challenges.

    1. Permissions and trust

    You must choose what to expose and under what conditions. A bad setup could let untrusted agents do harm.

    2. Prompt injection and naming attacks

    Agents may be tricked into calling dangerous functions or using fake tools.

    3. State conflicts

    Multiple agents and users can cause inconsistent state if actions are not carefully designed.

    4. Data leaks

    Even simple read tools can reveal sensitive information.

    5. Maintenance burden

    Every tool you expose becomes part of your API surface. You must version and test it.

    Designing for Agents

    Some guiding principles I find useful:

    • Expose as little as possible and start small
    • Use clear schemas for all inputs and outputs
    • Add validation and safety checks around every action
    • Keep a human approval path where it matters
    • Make actions idempotent or easy to roll back
    • Plan for versioning and future compatibility
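    Several of these principles can be applied in one place by wrapping every handler in a validation and idempotency layer. This is only a sketch; `guardedTool`, the argument shapes, and the `createOrder` tool are invented for illustration.

```javascript
// Illustrative sketch: a wrapper that adds validation and idempotency
// around a tool handler. guardedTool is a made-up helper, not a real API.
const seen = new Map(); // idempotency key -> previously returned result

function guardedTool(validate, handler) {
  return async (args) => {
    // Validation and safety checks around every action.
    const errors = validate(args);
    if (errors.length) return { ok: false, errors };

    // Idempotency: the same key returns the cached result instead of
    // running the side effect a second time.
    const key = args.idempotencyKey;
    if (key && seen.has(key)) return seen.get(key);

    const result = { ok: true, value: await handler(args) };
    if (key) seen.set(key, result);
    return result;
  };
}

// Hypothetical "createOrder" tool using the wrapper.
const createOrder = guardedTool(
  (args) => (typeof args.sku === 'string' ? [] : ['sku must be a string']),
  async ({ sku }) => ({ orderId: `order-${sku}` })
);
```

    An agent that retries after a timeout then replays the same idempotency key and gets the original result back instead of creating a duplicate order.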

    What Excites Me

    • WordPress and WebMCP: imagine agents creating, editing, and managing content natively
    • Cross-site agent workflows: agents hopping between your calendar, CRM, and analytics tools
    • Human-in-the-loop UIs: agents propose, humans confirm
    • Security-first tooling: better sandboxes and safe defaults for developers
    • New economic models: agents negotiating purchases or automating transactions

    I am cautious about hype and complexity, but the potential for cleaner, safer agent interactions on the web is huge.

    Closing Thoughts

    The agentic web will not arrive overnight. It is less about replacing humans and more about building systems that let agents assist while keeping humans in control.

    If you are a web developer, now is the time to ask: Which actions on my site could an agent help with? How should I expose them? How do I keep it safe?

    This future belongs to those who design thoughtfully for both humans and agents.

  • Over a Decade of Building Software in the Age of AI


    Every few years, the tech world discovers a new revolution. We’ve seen it with object-oriented programming, web frameworks, mobile apps, cloud computing, and now generative AI. Each wave arrives with excitement, bold promises, and a little fear of missing out. For someone who has spent years writing, shipping, and maintaining software, these cycles feel familiar. Experience doesn’t make you cynical, but it does teach you to see patterns through the noise.

    Patterns Repeat: Just With New Names

    When you’ve lived through several tech transitions, you notice that the story rarely changes. New tools arrive with dazzling demos. Early adopters rush in, convinced the old ways are dead. Then reality shows up: scale, cost, security, integration pain. Over time, we figure out where the technology fits, the hype settles, and the best practices emerge.

    Generative AI follows this arc. It’s impressive, but it’s not magic. It’s another tool. Understanding where it genuinely helps and where it adds risk takes patience and clear thinking.

    Start With the Problem, Not the Hype

    One habit seasoned engineers develop is resisting the pull of “shiny new tech” until they know the customer’s real pain points. Technology is only valuable when it solves a problem that matters.

    Jumping into AI because “everyone is doing it” is risky. You might over-engineer, build features that don’t last, or miss simpler, cheaper solutions. First understand what users struggle with, then decide if AI or anything new actually helps.

    Fundamentals Still Win

    Through all the waves of change, the foundations of good software never stopped mattering:

    • Security and privacy: New tools often open new attack surfaces. AI adds its own: data leaks, prompt injection, unexpected model behavior.
    • Reliability: Fancy models won’t save you if your product fails under load or breaks silently.
    • Maintainability: Code that’s clear, tested, and well-structured always survives longer than trendy hacks.

    When a new technology emerges, these basics become more important, not less.

    Healthy Skepticism Is a Strength

    Skepticism doesn’t mean dismissing innovation. It means asking the right questions before committing:

    • What problem does this solve better than existing tools?
    • How does it affect cost and complexity long term?
    • Can we secure and scale it responsibly?

    That mindset keeps teams focused and protects them from the whiplash of hype cycles.

    The Advantage of Experience

    Generative AI will change how we build, just as previous revolutions did. But experienced developers bring perspective that’s rare in early hype stages. We know the thrill of new tools, but also the pain of maintaining them. We can guide teams to adopt thoughtfully, balancing innovation with stability and excitement with caution.

    In fast-moving tech, it’s tempting to chase trends. The wiser path is slower but stronger: understand deeply, apply carefully, and never forget the lessons learned from decades of building.

  • WebMCP — Giving AI Agents a Real API for the Web



    If you’ve been following the rise of AI copilots and browser agents, you might have noticed a big missing piece: how do these agents actually do things inside a web app without resorting to brittle click simulation or screen scraping?

    Two new ideas aim to fix that: MCP (Model Context Protocol) and its browser-friendly sibling WebMCP.

    In this post I’ll explain what they are, why static metadata isn’t enough, how the protocol works, and show real code so you can see what an agent actually does.


    The Problem Today

    Web apps are interactive and stateful.

    AI agents want to help us: “create an invoice,” “filter results,” “upload a file.”

    But right now, they mostly simulate user clicks or parse the DOM. That’s brittle, breaks when the UI changes, and can’t safely reuse your logged-in state.

    We have metadata standards (SEO tags, JSON-LD, schema.org) and accessibility APIs (ARIA). They’re great for describing pages, but they’re static and read-only.

    We need a way to expose real functions to agents.


    Enter MCP (Model Context Protocol)

    MCP is a formal, open protocol that lets an application expose tools (functions) to an AI agent in a structured way.

    Think of MCP as USB for AI agents:

    • It defines standard messages (requests, responses, events).
    • It defines methods like tools/list and tools/call.
    • It uses JSON Schema to describe each tool’s inputs and outputs.

    The agent doesn’t guess — it asks for a list of tools, sees their schemas, and calls them like an API.


    A Tiny MCP Conversation

    // Agent → App
    {
      "jsonrpc": "2.0",
      "id": 1,
      "method": "tools/list",
      "params": {}
    }
    // App → Agent
    {
      "jsonrpc": "2.0",
      "id": 1,
      "result": {
        "tools": [
          {
            "name": "getPageInfo",
            "description": "Get the current page's title and URL",
            "inputSchema": { "type": "object", "properties": {} }
          }
        ]
      }
    }
    // Agent → App
    {
      "jsonrpc": "2.0",
      "id": 2,
      "method": "tools/call",
      "params": {
        "name": "getPageInfo",
        "arguments": {}
      }
    }
    // App → Agent
    {
      "jsonrpc": "2.0",
      "id": 2,
      "result": {
        "content": [
          { "type": "text", "text": "{\"title\":\"Dashboard\",\"url\":\"https://example.com\"}" }
        ]
      }
    }

    This format is defined by the MCP spec — every MCP-capable agent and app speaks this same language.

    Spec link: https://modelcontextprotocol.io/spec


    Where WebMCP Fits

    MCP itself doesn’t say how messages travel. It’s transport-agnostic — could be WebSocket, stdio, whatever.

    WebMCP (sometimes called MCP-B) is a browser transport binding: it lets your web page run an MCP server in the page itself using postMessage. This way:

    • Your page exposes tools directly.
    • Agents running as browser extensions or overlays can discover and call them.
    • Calls happen with the user’s existing auth/session, securely in their own browser.
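    As a rough mental model of that transport (not the real library code), the page side amounts to answering JSON messages that arrive on a channel. `FakePort` below is a stand-in for a browser MessagePort so the sketch runs anywhere, and the message shapes are simplified.

```javascript
// Rough mental model of a postMessage-style transport. FakePort is a
// stand-in for a real MessagePort; message shapes are simplified.
class FakePort {
  constructor() { this.handlers = []; }
  onMessage(fn) { this.handlers.push(fn); }
  postMessage(msg) { this.handlers.forEach((fn) => fn(msg)); }
}

// Two one-way channels: agent -> page and page -> agent.
const agentToPage = new FakePort();
const pageToAgent = new FakePort();

// Page side: answer tool-listing requests arriving over the channel.
agentToPage.onMessage((req) => {
  if (req.method === 'tools/list') {
    pageToAgent.postMessage({
      id: req.id,
      result: { tools: [{ name: 'getPageInfo', description: 'Get title and URL' }] }
    });
  }
});

// Agent side: collect responses as they arrive.
const responses = [];
pageToAgent.onMessage((res) => responses.push(res));

// The agent asks the page what it can do.
agentToPage.postMessage({ id: '1', method: 'tools/list', params: {} });
```

    The real transport adds origin checks, message framing, and error handling, but the shape is the same: requests in, structured results out, all inside the user's tab.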

    Why Not Just Metadata or ARIA?

    • Static metadata (JSON-LD, meta tags): great for describing content to search engines, but it can’t run live code, use the logged-in user’s session, or return fresh data.
    • ARIA and accessibility attributes: help screen readers, but agents would still have to simulate clicks or typing, and would break when the UI changes.
    • MCP tools are durable APIs: you expose createInvoice() or filterTemplates() directly, and agents call them like RPC, independent of UI markup.
    • Security and auth: the call happens in the user’s authenticated page context, so there is no need to share tokens with an external service.

    Real Code: Exposing a Tool in the Browser

    Install the libraries:

    npm install @mcp-b/transports @modelcontextprotocol/sdk zod

    Drop the server code into your app and you’re ready to expose tools.

    import { TabServerTransport } from '@mcp-b/transports';
    import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
    import { z } from 'zod';
    
    const server = new McpServer({ name: 'demo-app', version: '1.0.0' });
    
    // A read-only tool: no inputs, returns the page's title and URL.
    server.tool(
      'getPageInfo',
      'Get the page title and URL',
      {},
      async () => ({
        content: [{ type: 'text', text: JSON.stringify({ title: document.title, url: location.href }) }]
      })
    );
    
    // A write tool. The SDK takes Zod validators, which it converts to
    // JSON Schema for agents and uses to validate incoming arguments.
    server.tool(
      'createInvoice',
      'Create a new invoice',
      {
        customerEmail: z.string().email(),
        items: z.array(
          z.object({
            description: z.string(),
            amount: z.number()
          })
        )
      },
      async ({ customerEmail, items }) => {
        // Runs in the page, so the user's session cookies apply.
        const resp = await fetch('/api/invoices', {
          method: 'POST',
          headers: { 'Content-Type': 'application/json' },
          body: JSON.stringify({ customerEmail, items })
        });
        const data = await resp.json();
        return { content: [{ type: 'text', text: `Invoice created: ${data.id}` }] };
      }
    );
    
    await server.connect(new TabServerTransport({ allowedOrigins: ['*'] }));

    Once this script runs, an agent in the same tab (via extension) can:

    1. Discover the server.
    2. List the available tools.
    3. Invoke createInvoice with JSON args.
    4. Get a structured result.
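    From the agent’s side, those four steps reduce to a small loop. The sketch below runs against a stubbed server object; `stubServer` is a stand-in for illustration, not the real SDK client API.

```javascript
// Sketch of the agent-side flow against a stubbed server.
// stubServer is a stand-in, not the real SDK client API.
const stubServer = {
  tools: {
    createInvoice: {
      description: 'Create a new invoice',
      handler: async ({ customerEmail }) => `Invoice created for ${customerEmail}`
    }
  },
  listTools() {
    return Object.entries(this.tools).map(([name, t]) => ({ name, description: t.description }));
  },
  async invoke(name, args) { return this.tools[name].handler(args); }
};

async function agentFlow() {
  // 1-2. Discover the server and list its tools.
  const tools = stubServer.listTools();
  // 3. Invoke createInvoice with structured JSON arguments.
  const tool = tools.find((t) => t.name === 'createInvoice');
  const result = await stubServer.invoke(tool.name, {
    customerEmail: 'dana@example.com',
    items: [{ description: 'Consulting', amount: 120 }]
  });
  // 4. Get a structured (here: text) result back.
  return result;
}
```

    No selectors, no screenshots: the agent works entirely from tool names and schemas.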

    Why This Matters

    • Stable for developers — you expose clear functions instead of hoping DOM stays the same.
    • Safer for users — agent runs inside their logged-in session; no need to share tokens.
    • Richer AI experiences — instead of “clicking around,” agents can perform meaningful, complex tasks.

    If we want a future where AI assistants really help us on the web, MCP + WebMCP is the missing API layer.


    Takeaway

    Think of MCP like a universal connector for AI → apps.

    Think of WebMCP as the browser plug that lets your web app join that ecosystem.

    When you’re building a modern web app and want it to be agent-friendly, you’ll soon be able to simply:

    server.tool('whateverYouWant', 'Human friendly description', schema, handler);

    …and every compliant AI agent will know exactly how to talk to it.