GSoC 2025: Tölvera NLI - Project Completed!

Welcome to my project documentation for Google Summer of Code 2025 with Tölvera! This site follows my journey through the Tölvera Natural Language Interface (NLI) project.

🎉 Project Completed!

After 12 weeks of development, the Tölvera NLI is now functional and outperforms (thankfully 😅) frontier models like Gemini 2.5 Pro and Claude Opus on the Tölvera-specific tasks I tested (see the comparative analysis below).

Read the Full Final Report | Watch the Demo Video | Final Overview Video

Final GSoC Overview Video

A summary of the entire 12-week GSoC development journey, showcasing the Natural Language Interface for Tölvera from concept to completion.

Gallery of Generated Sketches

Two species repel each other while moving

Complex self-organizing behavior

Boids with OSC mapping

Particle Life simulation

The Textual User Interface

Textual UI Screenshot

The completed interface allows users to type natural language descriptions and instantly generate working Tölvera simulations - no coding required!

Project Overview & Results

My project enhanced creative workflows within the Tölvera artificial life library. Here is the official abstract from my proposal:

This project proposes to refine and significantly extend a functional proof-of-concept (POC) Natural Language Interface (NLI) for Tölvera, aiming to enhance creative workflows by improving accessibility for artists and researchers. The NLI translates natural language commands into Tölvera sketch generation and modification, acting as an interactive collaborator for users regardless of their coding expertise. Leveraging local Large Language Models (LLMs) via Ollama prioritizes user privacy and control. The existing tv.llm module, demonstrated for Flock and Slime simulations, uses Pydantic for validation and Jinja2 for reliable code generation. Core GSoC work involves expanding this architecture to more Tölvera modules (including tv.vera, tv.osc, tv.cv, tv.mp, tv.iml), refining prompt strategies, enhancing the user interface concept, and ensuring structured outputs. Evaluation will use functional tests, schema adherence metrics, qualitative user feedback, and LLM-as-a-judge assessments. This project aims to elevate the prototype into a core Tölvera feature, lowering technical barriers for the community of users.

Key Achievements

60-85% success rates for various behavior types (gravity, interactions, species detection)
Outperformed frontier models - Our system succeeded where Gemini 2.5 Pro and Claude Opus failed to produce a correct initial sketch
Full Textual UI with syntax highlighting, error recovery, and natural language refinement
12 weeks of documented development with weekly progress reports

Comparative Analysis Results

Our system was tested against frontier models (Gemini 2.5 Pro and Claude Opus) with the following results:

Test Case: Day/Night Cycle Behavior

Our System (Gemini 2.0 Flash): ✅ Worked on first attempt
Gemini 2.5 Pro (Zero-shot): ❌ Multiple failures with Tölvera API errors
Claude Opus (Zero-shot): ❌ Fundamental misunderstanding of module structure
Gemini 2.5 Pro (Full Tölvera Context): ✅ Generated a working sketch after several refinements
Claude Opus (Full Tölvera Context): ✅ Generated a working sketch after several refinements

Test Case: Food Competition

Our System (Gemini 2.0 Flash): ✅ Correct implementation with one refinement (for color)
Gemini 2.5 Pro: ⚠️ Runs, but the mechanics are still incorrect after several refinements
Claude Opus: ❌ Taichi scope violations, never runs

How It Works

The user provides a natural language command (e.g., “Create a flock with 2 species, one red and one blue”).
BehaviorOrchestrator analyzes the request and decomposes it into implementable components.
StateManager dynamically creates the states those behaviors require.
SpeciesManager creates species requirements based on the descriptions.
CodeGenerator synthesizes Taichi expert functions using physics-aware prompts.
TemplateRenderer assembles the experts and kernels into a complete, executable sketch.
The user can execute the script, see the visual output, and iteratively refine their creation.
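
To make the flow above more concrete, here is a rough sketch of how the stages could hand a shared plan object to one another. It is not the project's actual code: only the component names in the comments come from the list above, while `SketchPlan`, every function name, the keyword lookup, and the placeholder output are assumptions made for illustration. The real components call an LLM; this sketch fakes those calls with simple string matching so it runs on its own.

```python
from dataclasses import dataclass, field


@dataclass
class SketchPlan:
    """Hypothetical container passed from stage to stage."""
    request: str
    components: list[str] = field(default_factory=list)
    states: dict[str, str] = field(default_factory=dict)
    species: list[str] = field(default_factory=list)
    experts: dict[str, str] = field(default_factory=dict)


def decompose(plan: SketchPlan) -> SketchPlan:
    # Stand-in for BehaviorOrchestrator: a keyword lookup instead of an LLM call.
    keywords = {"flock": "flocking", "repel": "repulsion", "gravity": "gravity"}
    text = plan.request.lower()
    plan.components = [name for key, name in keywords.items() if key in text]
    return plan


def create_states(plan: SketchPlan) -> SketchPlan:
    # Stand-in for StateManager: one placeholder state per component.
    plan.states = {f"{name}_state": "float" for name in plan.components}
    return plan


def create_species(plan: SketchPlan) -> SketchPlan:
    # Stand-in for SpeciesManager: treat colour words as species labels.
    colours = ("red", "blue", "green")
    words = plan.request.lower().replace(",", " ").split()
    plan.species = [word for word in words if word in colours]
    return plan


def generate_experts(plan: SketchPlan) -> SketchPlan:
    # Stand-in for CodeGenerator: emit empty function stubs, no LLM involved.
    plan.experts = {
        name: f"def {name}_expert():\n    pass  # generated body would go here\n"
        for name in plan.components
    }
    return plan


def render(plan: SketchPlan) -> str:
    # Stand-in for TemplateRenderer: join the pieces into one script.
    header = (
        f"# species: {', '.join(plan.species)}\n"
        f"# states: {', '.join(plan.states)}\n"
    )
    return header + "\n".join(plan.experts.values())


def build_sketch(request: str) -> str:
    plan = SketchPlan(request=request)
    for stage in (decompose, create_states, create_species, generate_experts):
        plan = stage(plan)
    return render(plan)


sketch = build_sketch("Create a flock with 2 species, one red and one blue")
print(sketch)
```

Keeping each stage as a function that takes and returns the same plan object means a stage can be re-run on its own during refinement, which is one plausible way to support the iterative loop in the last step.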

Core Technologies

Python
Tölvera
Ollama (for local LLMs)
Gemini 2.0 Flash (primary synthesis model)
Pydantic (for data validation)
Jinja2 (for template rendering)
Textual (for the terminal UI)
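
The validate-then-render pattern behind the Pydantic and Jinja2 entries can be illustrated with a small, standard-library-only sketch. Note the substitution: it uses a `dataclass` with manual checks in place of Pydantic and `string.Template` in place of Jinja2, purely so the example runs without extra installs. `FlockSpec`, its fields, the limits, and the template text are all hypothetical and are not taken from the `tv.llm` module.

```python
import json
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class FlockSpec:
    """Hypothetical schema for a structured LLM reply (stand-in for a Pydantic model)."""
    species: int
    particles: int

    def __post_init__(self):
        # Reject values before any code is generated from them.
        if not isinstance(self.species, int) or not 1 <= self.species <= 8:
            raise ValueError("species must be an integer between 1 and 8")
        if not isinstance(self.particles, int) or self.particles < 1:
            raise ValueError("particles must be a positive integer")


# Stand-in for a Jinja2 template; the output is a placeholder, not real Tölvera code.
SKETCH_TEMPLATE = Template(
    "# generated sketch (illustrative placeholder)\n"
    "SPECIES = $species\n"
    "PARTICLES = $particles\n"
)


def render_sketch(raw_reply: str) -> str:
    """Parse the model's JSON reply, validate it, then fill the template."""
    spec = FlockSpec(**json.loads(raw_reply))
    return SKETCH_TEMPLATE.substitute(species=spec.species, particles=spec.particles)


good = render_sketch('{"species": 2, "particles": 512}')
print(good)

try:
    render_sketch('{"species": 0, "particles": 512}')
    rejected = False
except ValueError:
    rejected = True  # invalid replies never reach the template
```

The point of the pattern is that the LLM only fills in a small, checkable structure, while the code itself comes from a fixed template, so a malformed reply fails at validation instead of producing a broken sketch.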

Project Resources

Documentation

Final Report - Complete technical documentation with architecture diagrams
Weekly Development Journals - 12 weeks of progress
Midterm Report - 6-week implementation plan

Code & Demos

Demo Video - Full walkthrough of the Natural Language Interface
Final Overview Video - Complete GSoC project summary
Source Code - Complete implementation
Try It Yourself - Demo scripts and UI

Special thanks to the Tölvera community and GSoC mentors (Jack, Victor, and Piotr) for their support throughout this project.

MClem's Journal for GSoC '25
