An open-source, Claude-powered autonomous AI scientist capable of automatically executing the full scientific research cycle: literature analysis, hypothesis generation, experimental design, execution, analysis, and iterative improvement.
Kosmos: A Detailed Introduction to the Autonomous AI Scientist Platform
Project Overview
Kosmos is an open-source implementation of an autonomous AI scientist that runs the full scientific research cycle, from literature analysis and hypothesis generation through experimental design, execution, and analysis to iterative improvement. The project is based on the Kosmos paper released in November 2025 (https://arxiv.org/abs/2511.02824) and is adapted to run on either Claude Code or the Anthropic API.
Core Features
🔬 Autonomous Research Cycle
- End-to-end Scientific Workflow: Full automation of the research cycle
- Multi-discipline Support: Biology, Physics, Chemistry, Neuroscience, Materials Science
- Iterative Improvement: Automatic optimization of hypotheses and experimental designs based on results
🤖 AI-Driven Intelligent System
- Claude Sonnet 4 Driven: For hypothesis generation and advanced analysis
- Multi-model Support: Supports Anthropic Claude, OpenAI GPT, and local models (Ollama, LM Studio)
- Intelligent Model Selection: Automatically selects the optimal model based on task complexity
🔧 Flexible Integration Options
- Dual Integration Options:
  - Option A: Anthropic API (pay-as-you-go)
  - Option B: Claude Code CLI (requires Max subscription)
- Mature Analysis Patterns: Integrates battle-tested statistical methods from kosmos-figures
📚 Literature Integration
- Automated Paper Search: Supports arXiv, Semantic Scholar, PubMed
- Literature Summarization: Automatically extracts key information
- Novelty Check: Verifies the novelty of research hypotheses
🏗️ Agent Architecture
- Modular Design: Each research task corresponds to an independent agent
- Parallel Execution: Simultaneously runs multiple research tasks
- Collaborative Work: Agents share information through a structured world model
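The parallel, shared-state arrangement described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `WorldModel`, `literature_task`, and `analysis_task` are hypothetical names, and the real Kosmos world model is likely richer than a lock-protected dictionary.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


class WorldModel:
    """Hypothetical shared store that agents write findings into."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._facts: Dict[str, List[str]] = {}

    def add(self, topic: str, fact: str) -> None:
        # The lock keeps concurrent writers from corrupting the store.
        with self._lock:
            self._facts.setdefault(topic, []).append(fact)

    def snapshot(self) -> Dict[str, List[str]]:
        # Return a copy so readers never see a half-updated structure.
        with self._lock:
            return {topic: list(facts) for topic, facts in self._facts.items()}


def literature_task(world: WorldModel, topic: str) -> None:
    """Stand-in for a literature agent (no real search is performed)."""
    world.add(topic, f"summary of papers on {topic}")


def analysis_task(world: WorldModel, topic: str) -> None:
    """Stand-in for a data analysis agent."""
    world.add(topic, f"statistics for {topic}")


world = WorldModel()
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [
        pool.submit(task, world, topic)
        for topic in ("sleep", "memory")
        for task in (literature_task, analysis_task)
    ]
    for future in futures:
        future.result()  # re-raise any exception from a worker

facts = world.snapshot()
print(sorted(facts))
```

Each task runs independently and only touches shared state through `add`, which is the property that lets several research tasks proceed at once.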
🛡️ Safety First
- Sandboxed Execution: Isolated code execution environment
- Verification Mechanisms: Result verification and reproducibility checks
- Human Approval Gate: Optional human review step
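As a rough illustration of how an approval gate and a time-limited, isolated run might fit together, consider the sketch below. It is an assumption-laden example, not the Kosmos implementation: `run_untrusted` and `RunResult` are hypothetical names, and a separate Python process with a timeout is far weaker than a real sandbox such as a container.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RunResult:
    ok: bool
    stdout: str
    reason: str = ""


def run_untrusted(
    code: str,
    timeout_s: float = 5.0,
    approve: Optional[Callable[[str], bool]] = None,
) -> RunResult:
    """Run code in a child interpreter, optionally after human approval."""
    # Optional human approval gate: the reviewer sees the code first.
    if approve is not None and not approve(code):
        return RunResult(ok=False, stdout="", reason="rejected by reviewer")
    try:
        # -I starts Python in isolated mode (no user site-packages or
        # PYTHON* environment variables); the timeout bounds runtime.
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return RunResult(ok=False, stdout="", reason="timed out")
    if proc.returncode != 0:
        return RunResult(ok=False, stdout=proc.stdout, reason=f"exit code {proc.returncode}")
    return RunResult(ok=True, stdout=proc.stdout)


result = run_untrusted("print(1 + 1)")
print(result.ok, result.stdout.strip())
```

Passing `approve=input_based_reviewer` (any callable that returns a boolean) turns the optional review step on; leaving it as `None` skips it.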
💰 Cost Optimization
- Multi-layer Caching System: Reduces API costs by 30-40%
- Smart Prompt Caching: Significantly saves costs when using Anthropic
- Model Selection Optimization: Intelligently selects models based on task complexity, reducing costs by 15-20%
System Architecture
Core Components
┌────────────────────────────────────────────────────────────────────────┐
│                           Research Director                            │
│      (Main controller coordinating the autonomous research cycle)      │
└───────────────────────────────────┬────────────────────────────────────┘
                                    │
      ┌────────────────────┬────────┴───────────┬────────────────┐
      │                    │                    │                │
┌─────▼──────┐  ┌──────────▼───────────┐  ┌─────▼──────┐  ┌──────▼───────┐
│ Literature │  │ Hypothesis Generator │  │ Experiment │  │ Data Analyst │
│  Analyzer  │  │       (Claude)       │  │  Designer  │  │   (Claude)   │
└─────┬──────┘  └──────────┬───────────┘  └─────┬──────┘  └──────┬───────┘
      │                    │                    │                │
      └────────────────────┴────────┬───────────┴────────────────┘
                                    │
                         ┌──────────▼──────────┐
                         │  Execution Engine   │
                         │   (kosmos-figures   │
                         │  proven patterns)   │
                         └─────────────────────┘
Agent Descriptions
- Research Director: Main coordinator that manages the research workflow
- Literature Analyzer: Searches and analyzes scientific papers (arXiv, Semantic Scholar, PubMed)
- Hypothesis Generator: Generates testable hypotheses using Claude
- Experiment Designer: Designs computational experiments
- Execution Engine: Runs experiments using verified statistical methods
- Data Analyst: Interprets results using Claude
- Feedback Loop: Iteratively improves hypotheses based on results
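The way these roles combine into an iterative cycle can be sketched as a plain loop. This is a minimal illustration, assuming each agent can be reduced to a callable; `CycleState` and `run_research_cycle` are hypothetical names, and the stubs below stand in for the LLM-backed agents.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CycleState:
    """Hypothetical record of what the loop has produced so far."""

    question: str
    hypotheses: List[str] = field(default_factory=list)
    findings: List[str] = field(default_factory=list)


def run_research_cycle(
    question: str,
    generate: Callable[[CycleState], str],
    run_experiment: Callable[[str], str],
    converged: Callable[[CycleState], bool],
    max_iterations: int = 5,
) -> CycleState:
    """Generate a hypothesis, test it, record the finding, and repeat."""
    state = CycleState(question=question)
    for _ in range(max_iterations):
        hypothesis = generate(state)          # Hypothesis Generator role
        finding = run_experiment(hypothesis)  # Designer and Execution Engine roles
        state.hypotheses.append(hypothesis)
        state.findings.append(finding)        # feedback for the next iteration
        if converged(state):                  # stop early once results settle
            break
    return state


# Stub callables replace the real agents for demonstration purposes.
demo = run_research_cycle(
    question="Does X affect Y?",
    generate=lambda s: f"hypothesis {len(s.hypotheses) + 1}",
    run_experiment=lambda h: f"result for {h}",
    converged=lambda s: len(s.findings) >= 3,
)
print(demo.hypotheses)
```

Because earlier findings are stored on the state passed to `generate`, each new hypothesis can depend on previous results, which is the essence of the feedback loop.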
Technical Requirements
Basic Requirements
- Python 3.11 or 3.12
- One of the following:
  - Option A: Anthropic API key (pay-as-you-go)
  - Option B: Claude Code CLI installed (requires Max subscription)
Installation Guide
Basic Installation
# Clone the repository
git clone https://github.com/jimmc414/Kosmos.git
cd Kosmos
# Create a virtual environment
python3.11 -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install dependencies
pip install -e .
# Support for Claude Code CLI (Option B)
pip install -e ".[router]"
Option A: Anthropic API Configuration
# Copy the example configuration
cp .env.example .env
# Edit .env and set your API key
# ANTHROPIC_API_KEY=sk-ant-api03-your-actual-key-here
Get your API key from console.anthropic.com
Pros:
- Pay-as-you-go
- No CLI installation required
- Works anywhere
Cons:
- Charged per token
- Rate limits apply
Option B: Claude Code CLI Configuration
# 1. Install Claude Code CLI
# Visit https://claude.ai/download and follow the instructions
# 2. Authenticate Claude CLI
claude auth
# 3. Copy the example configuration
cp .env.example .env
# 4. Edit .env and set the API key to all 9s (triggers CLI routing)
# ANTHROPIC_API_KEY=999999999999999999999999999999999999999999999999
This will route all API calls to the local Claude Code CLI, using your Max subscription without per-token charges.
Pros:
- No per-token cost
- Unlimited usage
- Latest Claude models
- Local execution
Cons:
- Requires Claude CLI installation
- Requires Max subscription
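The all-9s key acts as a sentinel value that selects CLI routing. How the router actually detects it is not described here, so the check below is only a guess at the idea; `uses_cli_routing` is a hypothetical helper, not part of Kosmos.

```python
def uses_cli_routing(api_key: str) -> bool:
    """Return True when the key is a non-empty string made only of the digit 9."""
    return bool(api_key) and set(api_key) == {"9"}


print(uses_cli_routing("9" * 48))          # sentinel key selects the CLI path
print(uses_cli_routing("sk-ant-api03-x"))  # anything else is treated as a real key
```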
Database Initialization
# Run database migrations
alembic upgrade head
# Verify the database has been created
ls -la kosmos.db
Quick Start
Basic Usage Example
from kosmos import ResearchDirector
# Initialize the research director
director = ResearchDirector()
# Propose a research question
question = "What is the relationship between sleep deprivation and memory consolidation?"
# Run the autonomous research
results = director.conduct_research(
    question=question,
    domain="neuroscience",
    max_iterations=5
)
# View the results
print(results.summary)
print(results.key_findings)
Configuration Options
All configurations are done via environment variables (see .env.example):
Core Configuration
- ANTHROPIC_API_KEY: API key, or 999... for CLI mode
- CLAUDE_MODEL: Model to use (API mode only)
- DATABASE_URL: Database connection string
- LOG_LEVEL: Log verbosity
Research Configuration
- MAX_RESEARCH_ITERATIONS: Maximum number of autonomous iterations
- ENABLED_DOMAINS: Which scientific domains are supported
- ENABLED_EXPERIMENT_TYPES: Allowed experiment types
- MIN_NOVELTY_SCORE: Minimum novelty threshold
Security Configuration
- ENABLE_SAFETY_CHECKS: Code safety verification
- MAX_EXPERIMENT_EXECUTION_TIME: Experiment timeout
- ENABLE_SANDBOXING: Sandboxed code execution
- REQUIRE_HUMAN_APPROVAL: Human approval gate
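For readers who want to see how such environment variables might be turned into typed settings, here is a small sketch. The variable names come from the list above; everything else is an assumption, including the `Settings` class, the `load_settings` helper, the units, and every default value shown.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings; anything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    max_research_iterations: int
    min_novelty_score: float
    max_experiment_execution_time: int
    enable_sandboxing: bool
    require_human_approval: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from a mapping of environment variables."""
    # All defaults below are placeholders, not Kosmos's real defaults.
    return Settings(
        max_research_iterations=int(env.get("MAX_RESEARCH_ITERATIONS", "5")),
        min_novelty_score=float(env.get("MIN_NOVELTY_SCORE", "0.5")),
        max_experiment_execution_time=int(env.get("MAX_EXPERIMENT_EXECUTION_TIME", "300")),
        enable_sandboxing=_as_bool(env.get("ENABLE_SANDBOXING", "true")),
        require_human_approval=_as_bool(env.get("REQUIRE_HUMAN_APPROVAL", "false")),
    )


# Passing a plain dict makes the loader easy to try without touching the shell.
settings = load_settings({"MAX_RESEARCH_ITERATIONS": "10", "REQUIRE_HUMAN_APPROVAL": "yes"})
print(settings)
```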
Performance Optimization
Caching System
Kosmos includes a multi-layer caching system that can reduce API costs by 30-40%:
# View cache performance
kosmos cache --stats
# Example output:
# Overall cache performance:
# Total requests: 500
# Cache hits: 175 (35%)
# Estimated cost savings: $15.75
Note: Significant cost-saving prompt caching is currently available only when using Anthropic Claude. OpenAI and local providers use in-memory response caching only.
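To make the idea of layered response caching concrete, the sketch below checks an in-memory dictionary first and an on-disk store second before calling the model. It is an illustration only: `TwoLayerCache` and `fake_llm` are hypothetical, and the layers Kosmos actually uses may differ.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


class TwoLayerCache:
    """Hypothetical memory-then-disk cache for model responses."""

    def __init__(self, directory: Path) -> None:
        self.memory: Dict[str, str] = {}
        self.directory = directory
        self.directory.mkdir(parents=True, exist_ok=True)
        self.requests = 0
        self.hits = 0

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        # Identical (model, prompt) pairs map to the same cache entry.
        return hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()

    def get_or_call(self, model: str, prompt: str, call: Callable[[str], str]) -> str:
        self.requests += 1
        key = self._key(model, prompt)
        if key in self.memory:  # layer 1: in-process memory
            self.hits += 1
            return self.memory[key]
        path = self.directory / f"{key}.json"
        if path.exists():  # layer 2: disk, survives restarts
            self.hits += 1
            response = json.loads(path.read_text(encoding="utf-8"))["response"]
        else:  # miss: pay for a real call and store the result
            response = call(prompt)
            path.write_text(json.dumps({"response": response}), encoding="utf-8")
        self.memory[key] = response
        return response

    def hit_rate(self) -> float:
        return self.hits / self.requests if self.requests else 0.0


calls: List[str] = []


def fake_llm(prompt: str) -> str:
    """Stand-in for a paid API call; records how often it is invoked."""
    calls.append(prompt)
    return prompt.upper()


with tempfile.TemporaryDirectory() as tmp:
    cache = TwoLayerCache(Path(tmp))
    first = cache.get_or_call("model-a", "hello", fake_llm)
    second = cache.get_or_call("model-a", "hello", fake_llm)
    rate = cache.hit_rate()

print(len(calls), rate)
```

The hit rate is simply hits divided by total requests, which matches the example output above: 175 hits out of 500 requests is 35%.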
Intelligent Model Selection
When using Anthropic as the LLM provider, Kosmos intelligently selects Claude models based on task complexity:
- Claude Sonnet 4.5: Complex reasoning, hypothesis generation, analysis
- Claude Haiku 4: Simple tasks, data extraction, formatting
This reduces costs by 15-20% while maintaining quality.
Note: This feature is specific to Anthropic Claude.
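One plausible way to route by task complexity is a simple rule on task type and prompt length, sketched below. The rule, the `Tier` enum, the task labels, and the length threshold are all assumptions made for illustration; the actual selection logic in Kosmos is not documented here.

```python
from enum import Enum


class Tier(Enum):
    FAST = "fast"      # a smaller, cheaper model (Haiku-class)
    STRONG = "strong"  # a larger model for harder work (Sonnet-class)


# Hypothetical task labels that always deserve the stronger model.
REASONING_TASKS = {"hypothesis_generation", "analysis", "complex_reasoning"}


def select_tier(task_type: str, prompt: str, long_prompt_chars: int = 4000) -> Tier:
    """Pick a model tier from the task type and the prompt length."""
    if task_type in REASONING_TASKS or len(prompt) > long_prompt_chars:
        return Tier.STRONG
    return Tier.FAST


print(select_tier("formatting", "Reformat this table."))
print(select_tier("hypothesis_generation", "Propose a mechanism."))
```

Routing simple extraction and formatting work to the cheaper tier is where the savings would come from, since those calls do not need the stronger model.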
Project Structure
kosmos/
├── core/ # Core infrastructure (LLM, configuration, logging)
├── agents/ # Agent implementations
├── db/ # Database models and operations
├── execution/ # Experiment execution engine
├── analysis/ # Result analysis and visualization
├── hypothesis/ # Hypothesis generation and management
├── experiments/ # Experiment templates
├── literature/ # Literature search and analysis
├── knowledge/ # Knowledge graph and semantic search
├── domains/ # Domain-specific tools (Biology, Physics, etc.)
├── safety/ # Safety checks and verification
└── cli/ # Command-line interface
tests/
├── unit/ # Unit tests
├── integration/ # Integration tests
└── e2e/ # End-to-end tests
docs/
├── kosmos-figures-analysis.md # Analysis patterns from kosmos-figures
├── integration-plan.md # Integration strategy
└── domain-roadmaps/ # Domain-specific guides
Development Testing
# Install development dependencies
pip install -e ".[dev]"
# Run all tests
pytest
# Run tests with an HTML coverage report
pytest --cov=kosmos --cov-report=html
# Run specific test suites
pytest tests/unit/
pytest tests/integration/
pytest tests/e2e/
# Code formatting
black kosmos/ tests/
# Code linting
ruff check kosmos/ tests/
# Type checking
mypy kosmos/
Development Roadmap
✅ Completed (10 phases)
Phase 1: Project Foundation ✅
- Project structure
- Claude integration (API + CLI)
- Configuration system
- Agent framework
- Database setup
Phase 2: Literature Capabilities ✅
- Literature APIs (arXiv, Semantic Scholar, PubMed)
- Literature analysis agent
- Semantic search vector database
- Knowledge graph
Phase 3: Hypothesis Generation ✅
- Hypothesis generator agent
- Novelty check
- Hypothesis prioritization
Phase 4: Experimental Design ✅
- Experiment designer agent
- Protocol templates
- Resource estimation
Phase 5: Experiment Execution ✅
- Sandbox execution environment
- Integration of kosmos-figures patterns
- Statistical analysis
Phase 6: Result Analysis ✅
- Data analysis agent
- Visualization generation
- Result summarization
Phase 7: Research Orchestration ✅
- Research director agent
- Feedback loop
- Convergence detection
Phase 8: Safety and Verification ✅
- Safety verification
- Domain-specific tools
Phase 9: Production Deployment ✅
- Performance optimization (20-40× improvement)
- Multi-layer caching system
- Comprehensive testing (90%+ coverage)
Phase 10: Documentation and Polishing ✅
- Extensive documentation (10,000+ lines)
- User guide
- API documentation
- Example code
Scientific Discovery Cases
Kosmos has generated several validated scientific discoveries across various fields:
1. Metabolomics - Brain Hypothermia Protection
Independently reproduced findings from an unpublished manuscript, identifying nucleotide metabolism as the primary altered pathway in hypothermic mouse brains.
2. Materials Science - Solar Cell Efficiency
Discovered that humidity during thermal treatment is a key determinant of solar cell efficiency and identified critical humidity thresholds.
3. Neuroscience - Neural Network Connectivity
Showed that brain networks across species follow a log-normal pattern rather than a power-law pattern.
4. Heart Disease - SOD2 Protective Factor
Found that the SOD2 protein appears to protect the heart by reducing fibrosis.
5. Diabetes - SSR1 Gene Variants
Discovered that genetic variants near the SSR1 gene may have a protective effect against type 2 diabetes.
6. Alzheimer's Disease - Temporal Analysis Method
Proposed a new analytical technique for tracking protein changes over time in disease-affected brain cells.
7. Neurodegenerative Diseases - Phosphatidylserine Exposure
Identified that neurons expose "eat me" signals due to age-related loss of flippase expression.
Research Efficiency: Independent scientists found 79.4% of statements in Kosmos reports to be accurate, and collaborators reported that a single 20-cycle Kosmos run was equivalent to 6 months of their own research time.
Inspiration
This project is inspired by the following works:
- Paper: Kosmos: An AI Scientist for Autonomous Discovery (November 2025)
- Analysis Patterns: kosmos-figures repository
- Claude Router: claude_n_codex_api_proxy
Contribution Guidelines
Contributions are welcome! See CONTRIBUTING.md for guidelines.
Areas where contributions are especially welcome:
- Domain-specific tools and APIs
- Experiment templates for different domains
- Literature API integrations
- Safety verifications
- Documentation
- Testing
License
MIT License - see LICENSE
Citation
If you use Kosmos in your research, please cite:
@software{kosmos_ai_scientist,
title={Kosmos AI Scientist: Multi-Provider Autonomous Scientific Discovery},
author={Kosmos Contributors},
year={2025},
url={https://github.com/jimmc414/Kosmos}
}
Acknowledgments
- Anthropic for providing Claude and Claude Code
- Edison Scientific for providing the kosmos-figures analysis patterns
- The open science community for providing literature APIs and tools
Support and Community
- Issues: GitHub Issues
- Discussions: GitHub Discussions
Project Status
Status: Production-ready (v0.2.0) - All 10 development phases completed
Last Updated: 2025-11-07
Documentation Resources
- Architecture Overview - System design and components
- Integration Plan - Integration approach for kosmos-figures patterns
- Domain Roadmaps - Domain-specific implementation guides
- API Reference - API documentation
- Contribution Guide - How to contribute
Summary of Core Advantages
- Fully Autonomous: End-to-end automation from hypothesis to discovery
- Multi-discipline Support: Across Biology, Physics, Chemistry, and more
- Validated: Has produced 7 validated scientific discoveries
- Cost-Optimized: Multi-layer caching reduces costs by 30-40%
- Flexible Integration: Supports both API and CLI options
- Safe and Reliable: Sandboxed execution, verification mechanisms, 90%+ test coverage
- Production-Ready: v0.2.0, with all 10 development phases completed