A no-code AI data processing tool to build, enrich, and transform datasets using AI models.

Last Updated: August 08, 2025

AI Sheets - No-code AI Data Processing Tool

Project Overview

AI Sheets is a no-code tool open-sourced by Hugging Face for building, enriching, and transforming datasets with AI models. It can be deployed locally or run on the Hub, and gives access to thousands of open-source models on the Hugging Face Hub.

Project Address: https://github.com/huggingface/aisheets

Online Demo: https://huggingface.co/spaces/aisheets/sheets

Core Features

1. User-Friendly Interface

  • Easy-to-learn spreadsheet-like user interface
  • Supports rapid experimentation, starting with small datasets and then running large-scale data generation pipelines
  • Create new columns by writing prompts, allowing infinite iterations and cell edits

2. Powerful AI Integration

  • Supports thousands of open-source models on the Hugging Face Hub
  • Supports inference via Inference Providers API or local models
  • Supports OpenAI's gpt-oss models
  • Supports custom LLM endpoints (must comply with OpenAI API specifications)
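
A custom endpoint only needs to speak the OpenAI chat-completions wire format. As a rough sketch (this is not AI Sheets' actual client code; the model name is a placeholder), the request body such an endpoint must accept looks like:

```typescript
// Shape of an OpenAI-style chat message.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Build the JSON body for a chat-completions request.
function buildChatRequest(model: string, prompt: string): string {
  const messages: ChatMessage[] = [{ role: "user", content: prompt }];
  return JSON.stringify({ model, messages });
}

// A custom endpoint would receive this via POST at its
// chat-completions route (e.g. <MODEL_ENDPOINT_URL>/v1/chat/completions).
const payload = buildChatRequest("my-local-model", "Summarize this row.");
console.log(payload);
```

Any server that accepts this request shape and returns an OpenAI-style response can be plugged in via `MODEL_ENDPOINT_URL`.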

3. Diverse Data Operations

  • Model Comparison Testing: Test the performance of different models on the same data
  • Prompt Optimization: Improve prompts for specific data and models
  • Data Transformation: Clean and transform dataset columns
  • Data Classification: Automatically categorize content
  • Data Analysis: Extract key information from text
  • Data Enrichment: Supplement missing information (e.g., postal codes for addresses)
  • Synthetic Data Generation: Create realistic but fictitious datasets

Technical Architecture

Frontend Tech Stack

  • Framework: Qwik + QwikCity
  • Build Tool: Vite
  • Package Manager: pnpm

Directory Structure

├── public/              # Static assets
└── src/
    ├── components/      # Stateless components
    ├── features/        # Business logic components
    └── routes/          # Route files

Backend Services

  • Server: Express.js
  • Authentication: Hugging Face OAuth
  • API: OpenAI API specification compatible

Installation and Deployment

Docker Deployment (Recommended)

# Get Hugging Face token
export HF_TOKEN=your_token_here

# Run Docker container
docker run -p 3000:3000 \
  -e HF_TOKEN=$HF_TOKEN \
  aisheets/sheets

# Access http://localhost:3000

Local Development

# Install pnpm (if not already installed)
npm install -g pnpm
# Clone the project
git clone https://github.com/huggingface/aisheets.git
cd aisheets

# Set environment variables
export HF_TOKEN=your_token_here

# Install dependencies
pnpm install

# Start development server
pnpm dev

# Access http://localhost:5173

Production Build

# Build production version
pnpm build

# Start production server
export HF_TOKEN=your_token_here
pnpm serve

Environment Variable Configuration

Core Configuration

  • HF_TOKEN: Hugging Face authentication token
  • OAUTH_CLIENT_ID: Hugging Face OAuth client ID
  • OAUTH_SCOPES: OAuth authentication scopes (default: openid profile inference-api manage-repos)

Model Configuration

  • DEFAULT_MODEL: Default text generation model (default: meta-llama/Llama-3.3-70B-Instruct)
  • DEFAULT_MODEL_PROVIDER: Default model provider (default: nebius)
  • MODEL_ENDPOINT_URL: Custom inference endpoint URL
  • MODEL_ENDPOINT_NAME: Model name corresponding to the custom endpoint

System Configuration

  • DATA_DIR: Data storage directory (default: ./data)
  • NUM_CONCURRENT_REQUESTS: Number of concurrent requests (default: 5, max: 10)
  • SERPER_API_KEY: Serper web search API key
  • TELEMETRY_ENABLED: Telemetry feature switch (default: 1)
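
To make the documented defaults concrete, here is an illustrative config loader applying them, including the `NUM_CONCURRENT_REQUESTS` cap of 10. This is a sketch, not the project's actual configuration code:

```typescript
interface SheetsConfig {
  dataDir: string;
  numConcurrentRequests: number;
  defaultModel: string;
  telemetryEnabled: boolean;
}

// Read documented variables from an env map, applying the stated defaults.
function loadConfig(env: Record<string, string | undefined>): SheetsConfig {
  const requested = Number(env.NUM_CONCURRENT_REQUESTS ?? "5");
  return {
    dataDir: env.DATA_DIR ?? "./data",
    // Documented default is 5; hard cap is 10.
    numConcurrentRequests: Math.min(
      Number.isFinite(requested) ? requested : 5,
      10,
    ),
    defaultModel: env.DEFAULT_MODEL ?? "meta-llama/Llama-3.3-70B-Instruct",
    telemetryEnabled: (env.TELEMETRY_ENABLED ?? "1") === "1",
  };
}

console.log(loadConfig({ NUM_CONCURRENT_REQUESTS: "20" }).numConcurrentRequests); // 10
```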

Usage Methods

1. Data Import Methods

Create Dataset from Scratch

  • Suitable for: Familiarizing with the tool, brainstorming, quick experiments
  • Describe the dataset you want, and AI automatically generates structure and content
  • Example: "Cities around the world, including their countries and landmark images for each city, generated in Ghibli style"

Import Existing Dataset (Recommended)

  • Supported formats: XLS, TSV, CSV, Parquet
  • Up to 1000 rows, unlimited columns
  • Suitable for most real-world data processing scenarios
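
As an illustration of the 1000-row limit, a pre-processing step could truncate an oversized file before upload. The CSV parsing below is deliberately naive (no quoted fields) and not how AI Sheets itself imports data:

```typescript
const MAX_ROWS = 1000; // documented import limit (header row excluded)

// Split CSV text into a header and at most MAX_ROWS data rows.
function parseCsvCapped(text: string): { header: string[]; rows: string[][] } {
  const lines = text.trim().split(/\r?\n/);
  const header = lines[0].split(",");
  const rows = lines.slice(1, 1 + MAX_ROWS).map((line) => line.split(","));
  return { header, rows };
}

const sample = "city,country\nParis,France\nTokyo,Japan";
console.log(parseCsvCapped(sample).rows.length); // 2
```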

2. Data Processing Operations

Add AI Column

Click the "+" button to add a new column. You can then choose to:

  • Extract specific information
  • Summarize long text
  • Translate content
  • Custom prompt: "Perform X operation on {{column}}"
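
A `{{column}}` placeholder is filled in from each row before the prompt is sent to the model. A minimal sketch of that substitution (the real templating in AI Sheets may differ):

```typescript
// Replace each {{name}} placeholder with the matching row value;
// unknown placeholders are left untouched.
function renderPrompt(template: string, row: Record<string, string>): string {
  return template.replace(/\{\{\s*([\w ]+?)\s*\}\}/g, (match: string, name: string) => {
    return row[name.trim()] ?? match;
  });
}

const prompt = renderPrompt("Translate {{text}} into French", { text: "hello" });
console.log(prompt); // "Translate hello into French"
```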

Optimize and Extend

  • Add more cells: Drag down to auto-generate
  • Manual editing: Directly edit cell content as an example
  • Feedback mechanism: Use likes to mark good outputs
  • Configuration adjustment: Modify prompts, switch models or providers

3. Export and Extension

  • Export to Hugging Face Hub
  • Generate reusable configuration files
  • Supports HF Jobs for batch data generation

Integrate Ollama

# Start the Ollama server (runs in the foreground; keep it in its own terminal)
export OLLAMA_NOHISTORY=1
ollama serve

# In another terminal, pull and load the model
ollama run llama3

# Set environment variables
export MODEL_ENDPOINT_URL=http://localhost:11434
export MODEL_ENDPOINT_NAME=llama3

# Start AI Sheets
pnpm serve

Usage Scenarios Examples

Model Comparison Testing

  • Import a dataset containing questions
  • Create different columns for different models
  • Use an LLM as a judge to compare model quality
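
The judge step boils down to one more AI column whose prompt presents both answers. An illustrative judge prompt (the wording is an assumption, not a built-in AI Sheets prompt):

```typescript
// Build a prompt asking a judge model to compare two answers to the
// same question and reply with a single verdict token.
function buildJudgePrompt(question: string, answerA: string, answerB: string): string {
  return [
    "You are an impartial judge. Compare the two answers to the question.",
    `Question: ${question}`,
    `Answer A: ${answerA}`,
    `Answer B: ${answerB}`,
    'Reply with exactly "A", "B", or "tie".',
  ].join("\n");
}

console.log(buildJudgePrompt("What is 2+2?", "4", "5"));
```

In practice the two answer columns would be referenced as `{{column}}` placeholders in the judge column's prompt.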

Dataset Classification

  • Import an existing dataset from the Hub
  • Add a classification column to categorize content
  • Manually verify and edit initial classification results

Image Generation Comparison

  • Create a dataset of object names and descriptions
  • Use different image generation models
  • Compare the effects of different styles and prompts

Project Advantages

  1. No-code Operation: Process complex data without programming knowledge
  2. Open Source & Free: Completely open source, supports local deployment
  3. Rich Model Integration: Access to the Hugging Face ecosystem
  4. User-Friendly Interface: Familiar Excel-like operation experience
  5. Flexible Extension: Supports custom models and API endpoints
  6. Real-time Feedback: Improve AI output through editing and liking
  7. Batch Processing: Supports large-scale data generation pipelines

Summary

AI Sheets provides data scientists, researchers, and developers with a powerful yet easy-to-use tool, making AI data processing simple and efficient. Whether it's model testing, data cleaning, or synthetic data generation, it can be quickly accomplished through an intuitive interface.
