How BioAgents Work

BioAgents are autonomous AI agents designed to automate and accelerate the scientific discovery process. They are built on a plugin-based framework that lets them process scientific literature, generate novel hypotheses, and interact with decentralized knowledge graphs.

While the core concept page explains the "why" behind BioAgents, this page offers a closer look at the "how."

The BioAgents Framework

The BioAgents system is built on top of Eliza v2, an open-source agentic framework. Eliza provides a modular, extensible architecture in which each capability is encapsulated in a plugin.
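
To make the plugin idea concrete, here is a minimal sketch of how an agent might collect capabilities from plugins. It is written in plain Python for illustration only and does not reproduce the Eliza v2 API: the names Plugin, Agent, register, and run are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical types for illustration; this is NOT the Eliza v2 API.
Action = Callable[[str], str]


@dataclass
class Plugin:
    """A named bundle of actions that an agent can load."""
    name: str
    actions: Dict[str, Action] = field(default_factory=dict)


class Agent:
    """Holds every action contributed by its registered plugins."""

    def __init__(self) -> None:
        self._actions: Dict[str, Action] = {}

    def register(self, plugin: Plugin) -> None:
        # Namespace each action by plugin name so plugins cannot collide silently.
        for action_name, handler in plugin.actions.items():
            key = f"{plugin.name}.{action_name}"
            if key in self._actions:
                raise ValueError(f"duplicate action: {key}")
            self._actions[key] = handler

    def run(self, key: str, payload: str) -> str:
        return self._actions[key](payload)

    def available_actions(self) -> List[str]:
        return sorted(self._actions)


agent = Agent()
agent.register(Plugin("ingest", {"fetch": lambda path: f"fetched {path}"}))
agent.register(Plugin("hypothesis", {"draft": lambda topic: f"hypothesis about {topic}"}))
```

The point of the design is that the agent core stays small: adding a new data source or model means registering another plugin rather than changing the agent itself.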

This framework allows a BioAgent to:

  • Ingest Data: Connect to various data sources like Google Drive to automatically pull in new scientific papers and documents.

  • Process Information: Use a pipeline of tools like Grobid and custom scripts to parse documents, extract structured data (JSON-LD), and store it in a local knowledge graph (Oxigraph).

  • Generate Insights: Leverage large language models (like Anthropic's Claude) to query the knowledge graph, synthesize information, and generate novel, testable hypotheses.

  • Ensure Provenance: Store generated hypotheses and their supporting evidence in a structured database (PostgreSQL) and publish key assets to a decentralized knowledge graph (OriginTrail DKG) for permanent, verifiable record-keeping.
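
As an illustration of the "Process Information" and "Ensure Provenance" steps, the sketch below builds a small JSON-LD record for a paper and derives a content fingerprint from it. Several things here are assumptions rather than details of the BioAgents implementation: the use of the schema.org ScholarlyArticle vocabulary, the helper names paper_to_jsonld and fingerprint, and the placeholder DOI. Sorted-key JSON is used as a simple stand-in for proper JSON-LD canonicalization, and nothing here calls Grobid, Oxigraph, or the OriginTrail DKG.

```python
import hashlib
import json
from typing import Dict, List


def paper_to_jsonld(title: str, authors: List[str], doi: str) -> Dict[str, object]:
    """Build a minimal JSON-LD description of a paper (assumed schema.org vocabulary)."""
    return {
        "@context": "https://schema.org",
        "@type": "ScholarlyArticle",
        "@id": f"https://doi.org/{doi}",
        "name": title,
        "author": [{"@type": "Person", "name": a} for a in authors],
    }


def fingerprint(doc: Dict[str, object]) -> str:
    """Return a SHA-256 hex digest of the record.

    Keys are sorted so that key order does not change the digest. This is a
    simplification, not RDF dataset canonicalization.
    """
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Placeholder metadata, not a real paper.
record = paper_to_jsonld("Example title", ["A. Author"], "10.1234/example")
digest = fingerprint(record)
```

A fingerprint like this is one way a record could later be checked against a published copy: if the stored record changes, its digest no longer matches.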

Core Workflow: From Data to Discovery

The primary workflow of a BioAgent can be simplified into a three-stage pipeline:

  1. Ingestion & Processing: A new scientific paper (e.g., a PDF) is detected. The agent retrieves the paper and uses specialized services to parse its contents, breaking it down into structured, machine-readable data.

  2. Knowledge Synthesis: The structured data is added to a knowledge graph. The agent then uses AI models to analyze the new information in the context of the existing knowledge, looking for novel connections and insights.

  3. Hypothesis Generation: Based on its analysis, the agent formulates a new, structured hypothesis. This hypothesis, along with its evidence trail and an AI-generated evaluation score, is stored and can be published to the community for further review and validation.
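
The three stages above can be sketched as three small functions. This is a toy model under stated assumptions, not the BioAgents implementation: the pipe-separated input format, the function names, and the entity names are hypothetical, a simple two-hop graph search stands in for the AI model in stage 2, and the scorer is a placeholder for the AI-generated evaluation score.

```python
from dataclasses import dataclass
from typing import Callable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


@dataclass(frozen=True)
class Hypothesis:
    statement: str
    evidence: Tuple[Triple, ...]  # the triples the statement was derived from
    score: float


def ingest_and_process(text: str) -> List[Triple]:
    """Stage 1 stand-in: parse 'subject | predicate | object' lines into triples."""
    triples: List[Triple] = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3 and all(parts):
            triples.append((parts[0], parts[1], parts[2]))
    return triples


def synthesize(graph: Set[Triple], new: List[Triple]) -> List[Tuple[Triple, Triple]]:
    """Stage 2 stand-in: add new triples to the graph, then return two-hop chains
    that use at least one new triple and whose endpoints are not directly linked."""
    new_set = set(new)
    graph.update(new_set)
    linked = {(s, o) for s, _, o in graph}
    chains = []
    for a in sorted(graph):
        for b in sorted(graph):
            joins = a[2] == b[0] and a[0] != b[2]
            if joins and (a[0], b[2]) not in linked and (a in new_set or b in new_set):
                chains.append((a, b))
    return chains


def generate(chains: List[Tuple[Triple, Triple]],
             scorer: Callable[[str], float]) -> List[Hypothesis]:
    """Stage 3 stand-in: turn each chain into a hypothesis with its evidence trail."""
    result = []
    for a, b in chains:
        statement = f"{a[0]} may be linked to {b[2]} via {a[2]}"
        result.append(Hypothesis(statement, (a, b), scorer(statement)))
    return result


# Existing knowledge plus one "new paper" (all entity names are placeholders).
graph: Set[Triple] = {("gene_a", "regulates", "protein_b")}
paper_text = "protein_b | inhibits | pathway_c"

triples = ingest_and_process(paper_text)
chains = synthesize(graph, triples)
hypotheses = generate(chains, scorer=lambda s: 0.5)  # placeholder score
```

Each hypothesis keeps the triples it was derived from, which mirrors the evidence trail described in stage 3.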

Run continuously, this automated pipeline acts as a flywheel for scientific discovery: new ideas are generated rapidly, and each one carries a transparent evidence trail. For a more detailed technical breakdown, see the BioAgents Technical Deep Dive.
