jimmc414/Kosmos

An open-source, Claude-powered autonomous AI scientist capable of automatically executing the full scientific research cycle: literature analysis, hypothesis generation, experimental design, execution, analysis, and iterative improvement.

Python · Kosmos · jimmc414 · 413 · Last Updated: January 26, 2026

Kosmos - Autonomous AI Scientist Platform Detailed Introduction

Project Overview

Kosmos is an open-source implementation of an autonomous AI scientist capable of executing the full scientific research cycle, from literature analysis and hypothesis generation to experimental design, execution, analysis, and iterative improvement. The project is based on the Kosmos AI paper released in November 2025 (https://arxiv.org/abs/2511.02824) and adapted to be driven by either Claude Code or the Anthropic API.

Core Features

🔬 Autonomous Research Cycle

End-to-end Scientific Workflow: Full automation of the research cycle
Multi-discipline Support: Biology, Physics, Chemistry, Neuroscience, Materials Science
Iterative Improvement: Automatic optimization of hypotheses and experimental designs based on results

🤖 AI-Driven Intelligent System

Claude Sonnet 4 Driven: For hypothesis generation and advanced analysis
Multi-model Support: Supports Anthropic Claude, OpenAI GPT, and local models (Ollama, LM Studio)
Intelligent Model Selection: Automatically selects the optimal model based on task complexity

🔧 Flexible Integration Options

Dual Integration Options:
- Option A: Anthropic API (Pay-as-you-go)
- Option B: Claude Code CLI (Requires Max Subscription)
Mature Analysis Patterns: Integrates battle-tested statistical methods from kosmos-figures

📚 Literature Integration

Automated Paper Search: Supports arXiv, Semantic Scholar, PubMed
Literature Summarization: Automatically extracts key information
Novelty Check: Verifies the novelty of research hypotheses
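
The novelty check can be pictured as scoring a hypothesis against text that is already known. The sketch below is only an illustration: it uses token-level Jaccard overlap, which is an assumption and not necessarily the method Kosmos uses, and `novelty_score` is a hypothetical helper name.

```python
import re


def _tokens(text: str) -> set[str]:
    # Lowercase alphanumeric tokens; a deliberately crude text representation.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def novelty_score(hypothesis: str, known_abstracts: list[str]) -> float:
    """Return 1 minus the highest Jaccard overlap with any known abstract."""
    h = _tokens(hypothesis)
    best = 0.0
    for abstract in known_abstracts:
        union = h | _tokens(abstract)
        if union:
            best = max(best, len(h & _tokens(abstract)) / len(union))
    return 1.0 - best


def is_novel(hypothesis: str, known_abstracts: list[str], min_score: float) -> bool:
    # min_score plays the role of a threshold such as MIN_NOVELTY_SCORE.
    return novelty_score(hypothesis, known_abstracts) >= min_score
```

A hypothesis identical to an existing abstract scores 0.0, and one sharing no tokens with any abstract scores 1.0. A real system would more plausibly compare embeddings than raw tokens.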

🏗️ Agent Architecture

Modular Design: Each research task corresponds to an independent agent
Parallel Execution: Simultaneously runs multiple research tasks
Collaborative Work: Agents share information through a structured world model
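
To make the idea of parallel agents sharing a structured world model concrete, here is a minimal sketch using only the standard library. `WorldModel`, `run_task`, and `run_parallel` are hypothetical names chosen for illustration; they are not taken from the Kosmos codebase.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from threading import Lock


@dataclass
class WorldModel:
    """Hypothetical shared store that agents write their findings to."""
    findings: dict[str, list[str]] = field(default_factory=dict)
    _lock: Lock = field(default_factory=Lock, repr=False)

    def record(self, agent: str, finding: str) -> None:
        # Guard writes so parallel agents do not interleave updates.
        with self._lock:
            self.findings.setdefault(agent, []).append(finding)


def run_task(world: WorldModel, agent: str, task: str) -> str:
    # Stand-in for real agent work; it only records a placeholder finding.
    finding = f"{agent} finished: {task}"
    world.record(agent, finding)
    return finding


def run_parallel(world: WorldModel, tasks: dict[str, str]) -> list[str]:
    # One worker per task; results are returned in submission order.
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [pool.submit(run_task, world, a, t) for a, t in tasks.items()]
        return [f.result() for f in futures]


world = WorldModel()
results = run_parallel(world, {"literature": "search arXiv", "analysis": "fit regression"})
```

The lock is the important design choice here: each agent stays independent, while the shared state is only ever modified in one place.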

🛡️ Safety First

Sandboxed Execution: Isolated code execution environment
Verification Mechanisms: Result verification and reproducibility checks
Human Approval Gate: Optional human review step
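
The combination of an execution time limit and an optional approval gate can be sketched as follows. This is an assumption-laden illustration, not the Kosmos implementation: `run_with_timeout` and `execute` are hypothetical names, and a separate Python process gives process isolation and a timeout only, not a real security sandbox.

```python
import subprocess
import sys


def run_with_timeout(code: str, timeout_s: float = 5.0) -> tuple[bool, str]:
    """Run code in a separate Python process with a wall-clock limit."""
    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode, ignores user site and env vars
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False, "timed out"
    return proc.returncode == 0, proc.stdout.strip()


def execute(code: str, approve=None, timeout_s: float = 5.0) -> tuple[bool, str]:
    # Optional human approval gate: `approve` is any callable returning a bool.
    if approve is not None and not approve(code):
        return False, "rejected by reviewer"
    return run_with_timeout(code, timeout_s)
```

Passing `approve=None` skips the gate entirely, which mirrors the "optional" wording above. In an interactive setting `approve` could prompt a person before anything runs.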

💰 Cost Optimization

Multi-layer Caching System: Reduces API costs by 30-40%
Smart Prompt Caching: Significantly reduces costs when using Anthropic
Model Selection Optimization: Intelligently selects models based on task complexity, reducing costs by 15-20%

System Architecture

Core Components

┌─────────────────────────────────────────────────────────────┐
│                    Research Director                         │
│              (Main controller coordinating the               │
│                 autonomous research cycle)                   │
└──────────────┬──────────────────────────────────────────────┘
               │
┌──────────────┴────────┬───────────────┬──────────────┐
│                       │               │              │
┌───▼────┐   ┌─────────▼──────────┐  ┌▼──────────┐ ┌▼─────────────┐
│Literature│  │Hypothesis Generator│  │Experiment │ │Data Analyst  │
│Analyzer  │  │     (Claude)       │  │Designer   │ │   (Claude)   │
└───┬────┘   └─────────┬──────────┘  └┬──────────┘ └┬─────────────┘
    │                  │               │             │
    └──────────────────┴───────────────┴─────────────┘
                       │
            ┌──────────▼──────────┐
            │  Execution Engine   │
            │   (kosmos-figures   │
            │   proven patterns)  │
            └─────────────────────┘

Agent Descriptions

Research Director: Main coordinator managing the research workflow
Literature Analyzer: Searches and analyzes scientific papers (arXiv, Semantic Scholar, PubMed)
Hypothesis Generator: Generates testable hypotheses using Claude
Experiment Designer: Designs computational experiments
Execution Engine: Runs experiments using verified statistical methods
Data Analyst: Interprets results using Claude
Feedback Loop: Iteratively improves hypotheses based on results
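
The way these components hand off to one another can be sketched as a simple loop. Everything below is a stand-in: the functions return placeholder values instead of calling an LLM or running experiments, and the names and the `target` threshold are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CycleState:
    question: str
    hypothesis: str = ""
    support: float = 0.0
    iterations: int = 0


def generate_hypothesis(question: str, prior_support: float) -> str:
    # Placeholder for the Hypothesis Generator; a real agent would call an LLM.
    return f"H(prior support={prior_support:.2f}): {question}"


def run_experiment(hypothesis: str, iteration: int) -> float:
    # Placeholder for Experiment Designer, Execution Engine, and Data Analyst.
    # Returns a made-up support score that rises with each iteration.
    return min(1.0, 0.3 * (iteration + 1))


def research_cycle(question: str, max_iterations: int = 5, target: float = 0.8) -> CycleState:
    state = CycleState(question=question)
    for i in range(max_iterations):
        state.hypothesis = generate_hypothesis(question, state.support)
        state.support = run_experiment(state.hypothesis, i)
        state.iterations = i + 1
        if state.support >= target:  # stop once support is high enough
            break
    return state
```

Each pass feeds the previous result back into hypothesis generation, and the loop ends either on convergence or when `max_iterations` is reached.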

Technical Requirements

Basic Requirements

Python 3.11 or 3.12
One of the following:
- Option A: Anthropic API key (Pay-as-you-go)
- Option B: Claude Code CLI installed (requires Max subscription)

Installation Guide

Basic Installation

# Clone the repository
git clone https://github.com/jimmc414/Kosmos.git
cd Kosmos

# Create a virtual environment
python3.11 -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -e .

# Optional: add Claude Code CLI support (Option B)
pip install -e ".[router]"

Option A: Anthropic API Configuration

# Copy the example configuration
cp .env.example .env

# Edit .env and set your API key
# ANTHROPIC_API_KEY=sk-ant-api03-your-actual-key-here

Get your API key from console.anthropic.com

Pros:

Pay-as-you-go
No CLI installation required
Works anywhere

Cons:

Charged per token
Rate limits apply

Option B: Claude Code CLI Configuration

# 1. Install Claude Code CLI
# Visit https://claude.ai/download and follow the instructions

# 2. Authenticate Claude CLI
claude auth

# 3. Copy the example configuration
cp .env.example .env

# 4. Edit .env and set the API key to all 9s (triggers CLI routing)
# ANTHROPIC_API_KEY=999999999999999999999999999999999999999999999999

This will route all API calls to the local Claude Code CLI, using your Max subscription without per-token charges.
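
The routing decision described above amounts to checking whether the configured key is a sentinel made up entirely of the digit 9. A minimal sketch, assuming that is the only condition (the actual check in Kosmos or its router may differ, and `uses_cli_routing` and `pick_backend` are hypothetical names):

```python
def uses_cli_routing(api_key: str) -> bool:
    """Return True when the key consists only of the digit 9."""
    key = api_key.strip()
    return bool(key) and set(key) == {"9"}


def pick_backend(api_key: str) -> str:
    # Labels are illustrative; they are not identifiers from the project.
    return "claude-code-cli" if uses_cli_routing(api_key) else "anthropic-api"
```

With this check, a real key such as one beginning with `sk-ant-` selects the API backend, while an all-9s value selects the CLI.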

Pros:

No per-token cost
Unlimited usage
Latest Claude models
Local execution

Cons:

Requires Claude CLI installation
Requires Max subscription

Database Initialization

# Run database migrations
alembic upgrade head

# Verify the database has been created
ls -la kosmos.db

Quick Start

Basic Usage Example

from kosmos import ResearchDirector

# Initialize the research director
director = ResearchDirector()

# Propose a research question
question = "What is the relationship between sleep deprivation and memory consolidation?"

# Run the autonomous research
results = director.conduct_research(
    question=question,
    domain="neuroscience",
    max_iterations=5
)

# View the results
print(results.summary)
print(results.key_findings)

Configuration Options

All configuration is done through environment variables (see .env.example):

Core Configuration

ANTHROPIC_API_KEY: API key or 999... for CLI mode
CLAUDE_MODEL: Model to use (API mode only)
DATABASE_URL: Database connection string
LOG_LEVEL: Log verbosity

Research Configuration

MAX_RESEARCH_ITERATIONS: Maximum number of autonomous iterations
ENABLED_DOMAINS: Scientific domains to enable
ENABLED_EXPERIMENT_TYPES: Allowed experiment types
MIN_NOVELTY_SCORE: Minimum novelty threshold

Security Configuration

ENABLE_SAFETY_CHECKS: Code safety verification
MAX_EXPERIMENT_EXECUTION_TIME: Experiment timeout
ENABLE_SANDBOXING: Sandboxed code execution
REQUIRE_HUMAN_APPROVAL: Human approval gate
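
As a rough picture of how environment-driven settings like these are typically read, the sketch below parses a few of the variables listed above into a typed object. The default values, the boolean spellings, and the comma-separated format for `ENABLED_DOMAINS` are all assumptions; check .env.example for the real formats.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(value: str) -> bool:
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    max_research_iterations: int
    min_novelty_score: float
    enable_sandboxing: bool
    require_human_approval: bool
    enabled_domains: tuple[str, ...]


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    env = os.environ if env is None else env
    # Defaults below are illustrative only, not the project's real defaults.
    domains = env.get("ENABLED_DOMAINS", "biology,physics")
    return Settings(
        max_research_iterations=int(env.get("MAX_RESEARCH_ITERATIONS", "10")),
        min_novelty_score=float(env.get("MIN_NOVELTY_SCORE", "0.5")),
        enable_sandboxing=_as_bool(env.get("ENABLE_SANDBOXING", "true")),
        require_human_approval=_as_bool(env.get("REQUIRE_HUMAN_APPROVAL", "false")),
        enabled_domains=tuple(d.strip() for d in domains.split(",") if d.strip()),
    )
```

Accepting an explicit mapping makes the loader easy to exercise without touching the real process environment, for example `load_settings({"MAX_RESEARCH_ITERATIONS": "3"})`.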

Performance Optimization

Caching System

Kosmos includes a multi-layer caching system that can reduce API costs by 30-40%:

# View cache performance
kosmos cache --stats

# Example output:
# Overall cache performance:
# Total requests: 500
# Cache hits: 175 (35%)
# Estimated cost savings: $15.75

Note: Prompt caching, which provides significant cost savings, is currently available only when using Anthropic Claude. OpenAI and local providers use in-memory response caching only.
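
For intuition about in-memory response caching and how a hit rate like the one in the example output is computed (175 hits out of 500 requests is 35%), here is a minimal sketch. `ResponseCache` is a hypothetical class, not the Kosmos cache, and it models a single layer only.

```python
import hashlib


class ResponseCache:
    """Single-layer in-memory cache keyed by model name and prompt text."""

    def __init__(self):
        self._store = {}
        self.requests = 0
        self.hits = 0

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        return hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()

    def get_or_compute(self, model: str, prompt: str, compute):
        # `compute` stands in for the real LLM call and runs only on a miss.
        self.requests += 1
        key = self._key(model, prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        value = compute(prompt)
        self._store[key] = value
        return value

    @property
    def hit_rate(self) -> float:
        return self.hits / self.requests if self.requests else 0.0
```

Every hit avoids one paid call, so estimated savings scale with the number of hits and the average cost of the calls they replace.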

Intelligent Model Selection

When using Anthropic as the LLM provider, Kosmos intelligently selects Claude models based on task complexity:

Claude Sonnet 4.5: Complex reasoning, hypothesis generation, analysis
Claude Haiku 4: Simple tasks, data extraction, formatting

This reduces costs by 15-20% while maintaining quality.

Note: This feature is specific to Anthropic Claude.
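
The selection rule described in this section can be sketched as a lookup from task type to model tier. The task-type strings and the `Tier` enum are assumptions made for illustration, and the sketch uses tier labels in place of real model identifiers.

```python
from enum import Enum


class Tier(Enum):
    FAST = "haiku"     # simple tasks, data extraction, formatting
    STRONG = "sonnet"  # complex reasoning, hypothesis generation, analysis


# Hypothetical task labels; the project's own categories may differ.
SIMPLE_TASKS = {"data_extraction", "formatting", "simple"}


def select_tier(task_type: str) -> Tier:
    if task_type in SIMPLE_TASKS:
        return Tier.FAST
    # Unknown or complex task types fall back to the stronger tier.
    return Tier.STRONG
```

Defaulting unknown tasks to the stronger tier trades some cost for safety: a misclassified task costs more but does not lose quality.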

Project Structure

kosmos/
├── core/              # Core infrastructure (LLM, configuration, logging)
├── agents/            # Agent implementations
├── db/                # Database models and operations
├── execution/         # Experiment execution engine
├── analysis/          # Result analysis and visualization
├── hypothesis/        # Hypothesis generation and management
├── experiments/       # Experiment templates
├── literature/        # Literature search and analysis
├── knowledge/         # Knowledge graph and semantic search
├── domains/           # Domain-specific tools (Biology, Physics, etc.)
├── safety/            # Safety checks and verification
└── cli/               # Command-line interface

tests/
├── unit/              # Unit tests
├── integration/       # Integration tests
└── e2e/               # End-to-end tests

docs/
├── kosmos-figures-analysis.md      # Analysis patterns from kosmos-figures
├── integration-plan.md             # Integration strategy
└── domain-roadmaps/                # Domain-specific guides

Development Testing

# Install development dependencies
pip install -e ".[dev]"

# Run all tests
pytest

# Run tests with a coverage report
pytest --cov=kosmos --cov-report=html

# Run specific test suites
pytest tests/unit/
pytest tests/integration/
pytest tests/e2e/

# Code formatting
black kosmos/ tests/

# Code linting
ruff check kosmos/ tests/

# Type checking
mypy kosmos/

Development Roadmap

✅ Completed (10 phases)

Phase 1: Project Foundation ✅

Project structure
Claude integration (API + CLI)
Configuration system
Agent framework
Database setup

Phase 2: Literature Capabilities ✅

Literature APIs (arXiv, Semantic Scholar, PubMed)
Literature analysis agent
Semantic search vector database
Knowledge graph

Phase 3: Hypothesis Generation ✅

Hypothesis generator agent
Novelty check
Hypothesis prioritization

Phase 4: Experimental Design ✅

Experiment designer agent
Protocol templates
Resource estimation

Phase 5: Experiment Execution ✅

Sandbox execution environment
Integration of kosmos-figures patterns
Statistical analysis

Phase 6: Result Analysis ✅

Data analysis agent
Visualization generation
Result summarization

Phase 7: Research Orchestration ✅

Research director agent
Feedback loop
Convergence detection

Phase 8: Safety and Verification ✅

Safety verification
Domain-specific tools

Phase 9: Production Deployment ✅

Performance optimization (20-40× improvement)
Multi-layer caching system
Comprehensive testing (90%+ coverage)

Phase 10: Documentation and Polishing ✅

Extensive documentation (10,000+ lines)
User guide
API documentation
Example code

Scientific Discovery Cases

Kosmos has generated several validated scientific discoveries across various fields:

1. Metabolomics - Brain Hypothermia Protection

Independently reproduced findings from an unpublished manuscript, identifying nucleotide metabolism as the primary altered pathway in hypothermic mouse brains.

2. Materials Science - Solar Cell Efficiency

Discovered that humidity during thermal treatment is a key determinant of solar cell efficiency and identified critical humidity thresholds.

3. Neuroscience - Neural Network Connectivity

Showed that brain networks across species follow a log-normal pattern rather than a power-law pattern.

4. Heart Disease - SOD2 Protective Factor

Found that the SOD2 protein appears to protect the heart by reducing fibrosis.

5. Diabetes - SSR1 Gene Variants

Discovered that genetic variants near the SSR1 gene may have a protective effect against type 2 diabetes.

6. Alzheimer's Disease - Temporal Analysis Method

Proposed a new analytical technique for tracking protein changes over time in disease-affected brain cells.

7. Neurodegenerative Diseases - Phosphatidylserine Exposure

Identified that neurons expose "eat me" signals due to age-related loss of flippase expression.

Research Efficiency: Independent scientists found 79.4% of statements in Kosmos reports to be accurate, and collaborators reported that a single 20-cycle Kosmos run was equivalent to 6 months of their own research time.

Inspiration

This project is inspired by the following works:

Paper: Kosmos: An AI Scientist for Autonomous Discovery (November 2025)
Analysis Patterns: kosmos-figures repository
Claude Router: claude_n_codex_api_proxy

Contribution Guidelines

Contributions are welcome! See CONTRIBUTING.md for guidelines.

Areas where contributions are especially welcome:

Domain-specific tools and APIs
Experiment templates for different domains
Literature API integrations
Safety verifications
Documentation
Testing

License

MIT License - see LICENSE

Citation

If you use Kosmos in your research, please cite:

@software{kosmos_ai_scientist,
  title={Kosmos AI Scientist: Multi-Provider Autonomous Scientific Discovery},
  author={Kosmos Contributors},
  year={2025},
  url={https://github.com/jimmc414/Kosmos}
}

Acknowledgments

Anthropic for providing Claude and Claude Code
Edison Scientific for providing the kosmos-figures analysis patterns
The open science community for providing literature APIs and tools

Support and Community

Issues: GitHub Issues
Discussions: GitHub Discussions

Project Status

Status: Production-ready (v0.2.0) - All 10 development phases completed

Last Updated: 2025-11-07

Documentation Resources

Architecture Overview - System design and components
Integration Plan - Integration approach for kosmos-figures patterns
Domain Roadmaps - Domain-specific implementation guides
API Reference - API documentation
Contribution Guide - How to contribute

Summary of Core Advantages

Fully Autonomous: End-to-end automation from hypothesis to discovery
Multi-discipline Support: Across Biology, Physics, Chemistry, and more
Validated: Has produced 7 validated scientific discoveries
Cost-Optimized: Multi-layer caching reduces costs by 30-40%
Flexible Integration: Supports both API and CLI options
Safe and Reliable: Sandboxed execution, verification mechanisms, 90%+ test coverage
Production-Ready: v0.2.0 version, all development phases completed