Skip to content

A experimental SQL extension for declarative data visualisation based on the Grammar of Graphics.

License

Notifications You must be signed in to change notification settings

posit-dev/ggsql

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

ggsql - SQL Visualization Grammar

A SQL extension for declarative data visualization based on the Grammar of Graphics.

ggsql allows you to write queries that combine SQL data retrieval with visualization specifications in a single, composable syntax.

Example

SELECT date, revenue, region
FROM sales
WHERE year = 2024
VISUALISE date AS x, revenue AS y, region AS color
DRAW line
LABEL title => 'Sales by Region'
THEME minimal

Project Status

✨ Active Development - Core functionality is working with ongoing feature additions.

Completed:

  • βœ… Complete tree-sitter grammar with SQL + VISUALISE parsing
  • βœ… Full AST type system with validation
  • βœ… DuckDB reader with comprehensive type handling
  • βœ… Vega-Lite writer with multi-layer support
  • βœ… CLI tool (ggsql) with parse, exec, and validate commands
  • βœ… REST API server (ggsql-rest) with CORS support
  • βœ… Jupyter kernel (ggsql-jupyter) with inline Vega-Lite visualizations
  • βœ… VS Code extension (ggsql-vscode) with syntax highlighting and Positron IDE integration

Planned:

  • πŸ“‹ Additional readers
  • πŸ“‹ Additional writers
  • πŸ“‹ More geom types and statistical transformations
  • πŸ“‹ Enhanced theme system

Architecture

ggsql splits queries at the VISUALISE boundary:

  • SQL portion β†’ passed to pluggable readers (DuckDB, PostgreSQL, CSV, etc.)
  • VISUALISE portion β†’ parsed and compiled into visualization specifications
  • Output β†’ rendered via pluggable writers (ggplot2, PNG, Vega-Lite, etc.)

Development Setup

Getting Started

  1. Clone the repository:

    git clone https://github.com/georgestagg/ggsql
    cd ggsql
  2. Install tree-sitter CLI:

    npm install -g tree-sitter-cli
  3. Build the project:

    cargo build
  4. Run tests:

    cargo test

Project Structure

ggsql/
β”œβ”€β”€ Cargo.toml                       # Workspace root configuration
β”œβ”€β”€ README.md                        # This file
β”‚
β”œβ”€β”€ tree-sitter-ggsql/               # Tree-sitter grammar package
β”‚
β”œβ”€β”€ src/                             # Main library
β”‚   β”œβ”€β”€ lib.rs                       # Public API and re-exports
β”‚   β”œβ”€β”€ cli.rs                       # Command-line interface
β”‚   β”œβ”€β”€ rest.rs                      # REST API server
β”‚   β”œβ”€β”€ parser/                      # Parsing subsystem
β”‚   β”œβ”€β”€ reader/                      # Data source readers
β”‚   └── writer/                      # Visualization writers
β”‚
β”œβ”€β”€ ggsql-jupyter/                   # Jupyter kernel
β”‚
└── ggsql-vscode/                    # VS Code extension

Development Workflow

Running Tests

# Run all tests
cargo test

# Run specific test modules
cargo test ast                       # AST type tests
cargo test splitter                  # Query splitter tests
cargo test parser                    # All parser tests

# Run without default features (avoids database dependencies)
cargo test --no-default-features

# Run with specific features
cargo test --features duckdb,sqlite

Working with the Grammar

The tree-sitter grammar is in tree-sitter-ggsql/grammar.js. The grammar is automatically regenerated whenever the tree-sitter-ggsql project is build. After making changes, you can manually test:

  1. Regenerate the parser:

    cd tree-sitter-ggsql
    tree-sitter generate
  2. Test the grammar:

    # Test parsing a specific file
    tree-sitter parse test/corpus/basic.txt
    
    # Test all corpus files
    tree-sitter test
  3. Debug parsing issues:

    # Enable debug mode
    tree-sitter parse --debug test/corpus/basic.txt
    
    # Check for conflicts
    tree-sitter generate --report-states-for-rule=query

Code Organization

  • AST Types (src/parser/ast.rs): Core data structures representing parsed ggsql
  • Query Splitter (src/parser/splitter.rs): Separates SQL from VISUALISE portions
  • AST Builder (src/parser/builder.rs): Converts tree-sitter parse trees to typed AST
  • Error Handling (src/parser/error.rs): Parse-time error types and formatting

Adding New Grammar Features

  1. Update the grammar in tree-sitter-ggsql/grammar.js
  2. Add corresponding AST types in src/parser/ast.rs
  3. Update the AST builder in src/parser/builder.rs
  4. Add test cases for the new feature
  5. Update syntax highlighting in tree-sitter-ggsql/queries/highlights.scm

Testing Strategy

Unit Tests

Located alongside the code they test:

  • src/parser/ast.rs - AST type functionality and validation
  • src/parser/splitter.rs - Query splitting edge cases
  • src/parser/builder.rs - CST to AST conversion

Integration Tests

  • Full parsing pipeline tests in src/parser/mod.rs
  • End-to-end query processing (planned)

Grammar Tests

  • tree-sitter-ggsql/test/corpus/ - Example queries with expected parse trees
  • Run with tree-sitter test

Running Specific Test Categories

# Core AST functionality
cargo test ast::tests

# Query splitting logic
cargo test splitter::tests

# Tree-sitter grammar
cd tree-sitter-ggsql && tree-sitter test

# All parser integration tests
cargo test parser

Grammar Specification

See CLAUDE.md for the in-progress ggsql grammar specification, including:

  • Syntax reference
  • AST structure
  • Implementation phases and architecture
  • Design principles and philosophy

Key grammar elements:

  • VISUALISE [mappings] [FROM source] - Entry point with global aesthetic mappings
  • DRAW <geom> [MAPPING] [SETTING] [FILTER] - Define geometric layers (point, line, bar, etc.)
  • SCALE <aesthetic> SETTING - Configure data-to-visual mappings
  • FACET - Create small multiples (WRAP for flowing layout, BY for grid)
  • COORD - Coordinate transformations (cartesian, flip, polar)
  • LABEL, THEME, GUIDE - Styling and annotation

Jupyter Kernel

The ggsql-jupyter package provides a Jupyter kernel for interactive ggsql queries with inline Vega-Lite visualizations.

Installation

cargo build --release --package ggsql-jupyter
./target/release/ggsql-jupyter --install

Usage

After installation, create a new notebook with the "ggsql" kernel or use %kernel ggsql in an existing notebook.

-- Create data
CREATE TABLE sales AS
SELECT * FROM (VALUES
    ('2024-01-01'::DATE, 100, 'North'),
    ('2024-01-02'::DATE, 120, 'South')
) AS t(date, revenue, region)

-- Visualize with ggsql using global mapping
SELECT * FROM sales
VISUALISE date AS x, revenue AS y, region AS color
DRAW line
SCALE x SETTING type => 'date'
LABEL title => 'Sales Trends'

The kernel maintains a persistent DuckDB session across cells, so you can create tables in one cell and query them in another.

Quarto

A Quarto example can be found in ggsql-jupyter/tests/quarto/doc.qmd.

VS Code Extension

The ggsql-vscode extension provides syntax highlighting for ggsql files in Visual Studio Code and Positron IDE.

Installation

# Install dependencies and package the extension
cd ggsql-vscode
npm install
npm install -g @vscode/vsce
vsce package

# Install the VSIX file
code --install-extension ggsql-0.1.0.vsix

# For Positron integration, also install the kernel
cargo run --package ggsql-jupyter -- --install

Features

  • Syntax highlighting for ggsql keywords, geoms, aesthetics, and SQL
  • File association for .ggsql, .ggsql.sql, and .gsql extensions
  • Bracket matching and auto-closing for parentheses and brackets
  • Comment support for -- single-line and /* */ multi-line comments

The extension uses a TextMate grammar that highlights:

  • SQL keywords (SELECT, FROM, WHERE, JOIN, etc.)
  • ggsql clauses (VISUALISE, DRAW, SCALE, COORD, FACET, etc.)
  • Geometric objects (point, line, bar, area, etc.)
  • Aesthetics (x, y, color, size, shape, etc.)
  • Scale types (linear, log10, date, viridis, etc.)

Positron IDE Integration

When running in Positron IDE, the extension provides additional features:

  • Language runtime registration for executing ggsql queries directly within Positron
  • Plot pane integration - visualizations are automatically routed to Positron's Plots pane

CLI

Installation

cargo install --path src

About

A experimental SQL extension for declarative data visualisation based on the Grammar of Graphics.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 5