dolt: git for data

The Data as Code Series

Ernest Guevarra

17 March 2026

Learning outcomes

At the end of this session, participants should be able to:

  • describe the similarities between git and dolt, and between GitHub and DoltHub

  • relate the concepts of dolt-based version control of data to git-based version control of code

  • identify the advantages and limitations of using dolt and DoltHub for version control of data

  • describe some relevant use cases for dolt and DoltHub for version control of data

Session outline

  1. What are dolt and DoltHub

  2. Demonstration: How to initiate and manage a version-controlled database using dolt and DoltHub

  3. Use cases for a version-controlled database using dolt and DoltHub

What are dolt and DoltHub

dolt

  • an SQL database that can be forked, cloned, branched, merged, pushed, and pulled just like a git repository

DoltHub

  • a place to share dolt databases

  • public data are hosted for free

  • adds a modern, secure, always-on web GUI for database management to the dolt ecosystem

dolt is to DoltHub as git is to GitHub
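
To make the analogy concrete, the sketch below pairs a few git subcommands with the dolt subcommands of the same name. The selection of commands and the helper `to_dolt` are assumptions made for illustration, not an exhaustive or official mapping. One difference worth keeping in mind: git stages files, while dolt stages tables.

```python
# Illustrative pairing of git subcommands with their dolt counterparts.
# The selection is an assumption for this sketch, not a complete list.
GIT_TO_DOLT = {
    "git init": "dolt init",
    "git clone": "dolt clone",
    "git add": "dolt add",
    "git commit": "dolt commit",
    "git branch": "dolt branch",
    "git checkout": "dolt checkout",
    "git merge": "dolt merge",
    "git diff": "dolt diff",
    "git log": "dolt log",
    "git push": "dolt push",
    "git pull": "dolt pull",
}


def to_dolt(git_command: str) -> str:
    """Swap the leading 'git <subcommand>' for its dolt counterpart.

    Hypothetical helper: it only rewrites the first two words and leaves
    any remaining arguments untouched.
    """
    parts = git_command.split(maxsplit=2)
    if len(parts) < 2 or parts[0] != "git":
        raise ValueError(f"not a git command: {git_command!r}")
    key = f"{parts[0]} {parts[1]}"
    if key not in GIT_TO_DOLT:
        raise KeyError(f"no dolt counterpart listed for {key!r}")
    return " ".join([GIT_TO_DOLT[key]] + parts[2:])


print(to_dolt("git checkout -b dev"))
print(to_dolt("git commit -m 'Add new row'"))
```

Arguments are passed through unchanged, so the sketch shows the naming parallel only; it does not check that every git option is also accepted by dolt.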

Creating and managing a version-controlled database using DoltHub

Create a DoltHub account and log in

Go to the Databases tab and click the Create Database button

Give the database a name and a description, and then create the database

Click the File upload option

Give the new table a name

Click on Browse files and then select the data to upload

Choose primary keys for the table you uploaded

Add a commit message and specify a branch (optional)

Review pull request and merge

Database is now created in main branch …

… and it is the same as in dev branch
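
The same result can be approximated locally with the dolt command line instead of the DoltHub web interface. The sketch below only builds and prints the commands; it does not run them. The file name `cyclones.csv`, the key column `id`, and the commit message are hypothetical, and the `--pk=` form of the primary key flag should be checked against the installed dolt version.

```python
import shlex


def create_database_commands(table, csv_path, primary_key, message):
    """Return dolt CLI commands that roughly mirror the upload steps above."""
    return [
        # Start a new dolt database in the current directory.
        ["dolt", "init"],
        # Create a table from the CSV file and set its primary key.
        ["dolt", "table", "import", "-c", f"--pk={primary_key}", table, csv_path],
        # Stage the new table, then record it with a commit message.
        ["dolt", "add", table],
        ["dolt", "commit", "-m", message],
    ]


commands = create_database_commands(
    table="cyclones",
    csv_path="cyclones.csv",  # hypothetical file name
    primary_key="id",         # hypothetical key column
    message="Add cyclones table",
)
for command in commands:
    # shlex.join quotes arguments so each line can be pasted into a shell.
    print(shlex.join(command))
```

Each list could also be passed to `subprocess.run` on a machine with dolt installed. Unlike the web flow, this sketch commits straight to the current branch rather than going through a pull request.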

Making versioned changes to the database

On dev branch, add a new row of data to the cyclones table

Create a pull request from dev to main with the new row of data

Select a reviewer of the pull request

Review the changes proposed by the pull request

Merge the pull request into main

New row of data is now on main
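
Reviewing a pull request comes down to comparing the rows on dev with the rows on main. The toy model below shows why the primary key chosen at upload matters: rows are matched by key, so an unmatched key on dev appears as an added row. This is a simplified illustration, not how dolt computes diffs internally, and the column names and values are placeholders.

```python
def diff_rows(base, head, key):
    """Compare two versions of a table, matching rows by primary key."""
    base_by_key = {row[key]: row for row in base}
    head_by_key = {row[key]: row for row in head}
    return {
        # Keys present only in the newer version.
        "added": [r for k, r in head_by_key.items() if k not in base_by_key],
        # Keys present only in the older version.
        "removed": [r for k, r in base_by_key.items() if k not in head_by_key],
        # Keys present in both versions whose other values differ.
        "modified": [
            (base_by_key[k], head_by_key[k])
            for k in base_by_key
            if k in head_by_key and base_by_key[k] != head_by_key[k]
        ],
    }


# Placeholder rows standing in for the cyclones table on each branch.
main_rows = [
    {"id": 1, "name": "Cyclone A"},
    {"id": 2, "name": "Cyclone B"},
]
dev_rows = main_rows + [{"id": 3, "name": "Cyclone C"}]

changes = diff_rows(main_rows, dev_rows, key="id")
print(changes["added"])
```

Once the pull request is merged, main holds the same rows as dev, so running the same comparison again would report no changes.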

Use cases

  • Data collaboration and curation on DoltHub

  • Distributing versioned data through dolt as an alternative to APIs

  • Data model quality control and versioning

What questions do you have?

Thank you!
