A no-code AI data processing tool to build, enrich, and transform datasets using AI models.

Last Updated: August 08, 2025

AI Sheets - No-code AI Data Processing Tool

Project Overview

AI Sheets is a no-code tool open-sourced by Hugging Face for building, enriching, and transforming datasets with AI models. It can be deployed locally or run on the Hub, and gives access to thousands of open-source models on the Hugging Face Hub.

Project Address: https://github.com/huggingface/aisheets

Online Demo: https://huggingface.co/spaces/aisheets/sheets

Core Features

1. User-Friendly Interface

  • Easy-to-learn spreadsheet-like user interface
  • Supports rapid experimentation, starting with small datasets and then running large-scale data generation pipelines
  • Create new columns by writing prompts, allowing infinite iterations and cell edits

2. Powerful AI Integration

  • Supports thousands of open-source models on the Hugging Face Hub
  • Supports inference via Inference Providers API or local models
  • Supports OpenAI's gpt-oss models
  • Supports custom LLM endpoints (must comply with OpenAI API specifications)
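
A custom endpoint only needs to speak the OpenAI chat-completions wire format. As a rough sketch (this is not AI Sheets' actual client code; the model name is a placeholder), the request body such an endpoint must accept looks like:

```typescript
// Shape of an OpenAI-style chat message.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Build the JSON body for a chat-completions request.
function buildChatRequest(model: string, prompt: string): string {
  const messages: ChatMessage[] = [{ role: "user", content: prompt }];
  return JSON.stringify({ model, messages });
}

// A custom endpoint would receive this via POST at its
// chat-completions route (e.g. <MODEL_ENDPOINT_URL>/v1/chat/completions).
const payload = buildChatRequest("my-local-model", "Summarize this row.");
console.log(payload);
```

Any server that accepts this request shape and returns an OpenAI-style response can be plugged in via `MODEL_ENDPOINT_URL`.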

3. Diverse Data Operations

  • Model Comparison Testing: Test the performance of different models on the same data
  • Prompt Optimization: Improve prompts for specific data and models
  • Data Transformation: Clean and transform dataset columns
  • Data Classification: Automatically categorize content
  • Data Analysis: Extract key information from text
  • Data Enrichment: Supplement missing information (e.g., postal codes for addresses)
  • Synthetic Data Generation: Create realistic but fictitious datasets

Technical Architecture

Frontend Tech Stack

  • Framework: Qwik + QwikCity
  • Build Tool: Vite
  • Package Manager: pnpm

Directory Structure

├── public/              # Static assets
└── src/
    ├── components/      # Stateless components
    ├── features/        # Business logic components
    └── routes/          # Route files

Backend Services

  • Server: Express.js
  • Authentication: Hugging Face OAuth
  • API: OpenAI API specification compatible

Installation and Deployment

Docker Deployment (Recommended)

# Get Hugging Face token
export HF_TOKEN=your_token_here

# Run Docker container
docker run -p 3000:3000 \
  -e HF_TOKEN=$HF_TOKEN \
  aisheets/sheets

# Access http://localhost:3000

Local Development

# Install pnpm (if not already installed)
npm install -g pnpm
# Clone the project
git clone https://github.com/huggingface/aisheets.git
cd aisheets

# Set environment variables
export HF_TOKEN=your_token_here

# Install dependencies
pnpm install

# Start development server
pnpm dev

# Access http://localhost:5173

Production Build

# Build production version
pnpm build

# Start production server
export HF_TOKEN=your_token_here
pnpm serve

Environment Variable Configuration

Core Configuration

  • HF_TOKEN: Hugging Face authentication token
  • OAUTH_CLIENT_ID: Hugging Face OAuth client ID
  • OAUTH_SCOPES: OAuth authentication scopes (default: openid profile inference-api manage-repos)

Model Configuration

  • DEFAULT_MODEL: Default text generation model (default: meta-llama/Llama-3.3-70B-Instruct)
  • DEFAULT_MODEL_PROVIDER: Default model provider (default: nebius)
  • MODEL_ENDPOINT_URL: Custom inference endpoint URL
  • MODEL_ENDPOINT_NAME: Model name corresponding to the custom endpoint

System Configuration

  • DATA_DIR: Data storage directory (default: ./data)
  • NUM_CONCURRENT_REQUESTS: Number of concurrent requests (default: 5, max: 10)
  • SERPER_API_KEY: Serper web search API key
  • TELEMETRY_ENABLED: Telemetry feature switch (default: 1)
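
To make the documented defaults concrete, here is an illustrative config loader applying them, including the `NUM_CONCURRENT_REQUESTS` cap of 10. This is a sketch, not the project's actual configuration code:

```typescript
interface SheetsConfig {
  dataDir: string;
  numConcurrentRequests: number;
  defaultModel: string;
  telemetryEnabled: boolean;
}

// Read documented variables from an env map, applying the stated defaults.
function loadConfig(env: Record<string, string | undefined>): SheetsConfig {
  const requested = Number(env.NUM_CONCURRENT_REQUESTS ?? "5");
  return {
    dataDir: env.DATA_DIR ?? "./data",
    // Documented default is 5; hard cap is 10.
    numConcurrentRequests: Math.min(
      Number.isFinite(requested) ? requested : 5,
      10,
    ),
    defaultModel: env.DEFAULT_MODEL ?? "meta-llama/Llama-3.3-70B-Instruct",
    telemetryEnabled: (env.TELEMETRY_ENABLED ?? "1") === "1",
  };
}

console.log(loadConfig({ NUM_CONCURRENT_REQUESTS: "20" }).numConcurrentRequests); // 10
```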

Usage Methods

1. Data Import Methods

Create Dataset from Scratch

  • Suitable for: Familiarizing with the tool, brainstorming, quick experiments
  • Describe the dataset you want, and AI automatically generates structure and content
  • Example: "Cities around the world, including their countries and landmark images for each city, generated in Ghibli style"

Import Existing Dataset (Recommended)

  • Supported formats: XLS, TSV, CSV, Parquet
  • Up to 1000 rows, unlimited columns
  • Suitable for most real-world data processing scenarios
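
As an illustration of the 1000-row limit, a pre-processing step could truncate an oversized file before upload. The CSV parsing below is deliberately naive (no quoted fields) and not how AI Sheets itself imports data:

```typescript
const MAX_ROWS = 1000; // documented import limit (header row excluded)

// Split CSV text into a header and at most MAX_ROWS data rows.
function parseCsvCapped(text: string): { header: string[]; rows: string[][] } {
  const lines = text.trim().split(/\r?\n/);
  const header = lines[0].split(",");
  const rows = lines.slice(1, 1 + MAX_ROWS).map((line) => line.split(","));
  return { header, rows };
}

const sample = "city,country\nParis,France\nTokyo,Japan";
console.log(parseCsvCapped(sample).rows.length); // 2
```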

2. Data Processing Operations

Add AI Column

Click the "+" button to add a new column. You can then choose to:

  • Extract specific information
  • Summarize long text
  • Translate content
  • Custom prompt: "Perform X operation on {{column}}"
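
A `{{column}}` placeholder is filled in from each row before the prompt is sent to the model. A minimal sketch of that substitution (the real templating in AI Sheets may differ):

```typescript
// Replace each {{name}} placeholder with the matching row value;
// unknown placeholders are left untouched.
function renderPrompt(template: string, row: Record<string, string>): string {
  return template.replace(/\{\{\s*([\w ]+?)\s*\}\}/g, (match: string, name: string) => {
    return row[name.trim()] ?? match;
  });
}

const prompt = renderPrompt("Translate {{text}} into French", { text: "hello" });
console.log(prompt); // "Translate hello into French"
```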

Optimize and Extend

  • Add more cells: Drag down to auto-generate
  • Manual editing: Directly edit cell content as an example
  • Feedback mechanism: Use likes to mark good outputs
  • Configuration adjustment: Modify prompts, switch models or providers

3. Export and Extension

  • Export to Hugging Face Hub
  • Generate reusable configuration files
  • Supports HF Jobs for batch data generation

Integrate Ollama

# Start the Ollama server (runs in the foreground; keep it in its own terminal)
export OLLAMA_NOHISTORY=1
ollama serve

# In another terminal, pull and load the model
ollama run llama3

# Set environment variables
export MODEL_ENDPOINT_URL=http://localhost:11434
export MODEL_ENDPOINT_NAME=llama3

# Start AI Sheets
pnpm serve

Usage Scenarios Examples

Model Comparison Testing

  • Import a dataset containing questions
  • Create different columns for different models
  • Use an LLM as a judge to compare model quality
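
The judge step boils down to one more AI column whose prompt presents both answers. An illustrative judge prompt (the wording is an assumption, not a built-in AI Sheets prompt):

```typescript
// Build a prompt asking a judge model to compare two answers to the
// same question and reply with a single verdict token.
function buildJudgePrompt(question: string, answerA: string, answerB: string): string {
  return [
    "You are an impartial judge. Compare the two answers to the question.",
    `Question: ${question}`,
    `Answer A: ${answerA}`,
    `Answer B: ${answerB}`,
    'Reply with exactly "A", "B", or "tie".',
  ].join("\n");
}

console.log(buildJudgePrompt("What is 2+2?", "4", "5"));
```

In practice the two answer columns would be referenced as `{{column}}` placeholders in the judge column's prompt.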

Dataset Classification

  • Import an existing dataset from the Hub
  • Add a classification column to categorize content
  • Manually verify and edit initial classification results

Image Generation Comparison

  • Create a dataset of object names and descriptions
  • Use different image generation models
  • Compare the effects of different styles and prompts

Project Advantages

  1. No-code Operation: Process complex data without programming knowledge
  2. Open Source & Free: Completely open source, supports local deployment
  3. Rich Model Integration: Access to the Hugging Face ecosystem
  4. User-Friendly Interface: Familiar Excel-like operation experience
  5. Flexible Extension: Supports custom models and API endpoints
  6. Real-time Feedback: Improve AI output through editing and liking
  7. Batch Processing: Supports large-scale data generation pipelines

Summary

AI Sheets provides data scientists, researchers, and developers with a powerful yet easy-to-use tool, making AI data processing simple and efficient. Whether it's model testing, data cleaning, or synthetic data generation, it can be quickly accomplished through an intuitive interface.
