Go to site menu

GraniteDB

This page is a bit of a mess at the moment, it's still being written

GraniteDB is a minimalist relational database management system. Currently in-development, being written in Rust and targetting Linux.

Project goals

The aim is to design a new database system from first principles, implementing only the functionality that is strictly required from a database. No user management, no access control, no views. Just tables, records, and relationships.

Ideally it will be possible to write a proof-of-concept Granite client for Uxn, to show that the system is simple.

Points of difference from traditional relational database systems

TODO: 
- Make notes on the short-comings of SQL as a query language (perhaps as a separate page)
- text-vs-byte over-the-wire format
- no need for user management for current use-cases
- relationships between tables being explicit

Query language

Here's a query written in SQL:

SELECT part.name FROM part 
INNER JOIN supplier ON part.supplier_id=supplier.id 
WHERE supplier.city="Auckland";

Here's the same query, but written in Quarry, the GraniteDB query langauge:

part.name, part->supplier.name : part->supplier.city="Auckland"

Both of these queries would return a response like the following:

part.name supplier.name
Hinge C W Manufacturing
Connector Industritech Ltd.
Handle C W Manufacturing

The relationships between tables are explicitly named in Granite. In the query above, the relationship between part.supplier_id and supplier.id is defined inside the database. To get the supplier name for each part one can just use part->supplier.name.

To elaborate further, each row in the part table points to exactly one row in the supplier table (via the part.supplier_id column pointing to the supplier.id column). The -> syntax is used to traverse a one-to-one relationship, where for part->supplier there is guaranteed to be exactly one supplier for each part. This relationship can also be traversed backwards as a many-to-one relationship with the => syntax, where supplier=>part would return zero or more parts for each row of the supplier table.

Relationships and foreign-key constraints are bundled together, so relationships are always safely constrainted and constraints can always be traversed with no extra work.

Syntax

A table is a set of tabular data. A table contains zero or more columns, zero or more rows, and zero or more relationships. Neither columns, rows, or relationships are ordered.

A column comprises a name and a data type. Each row of a table contains exactly one data value for each column of the table.

A row contains as many data values as the table has columns.

A relationship is a link between a column in one table and a column in another table.

As an example, consider the following tables:

CREATE TABLE part (id: u64, name: str);

CREATE TABLE supplier (id: u64, name: str, city: str);

CREATE TABLE supplier_part (part_id: u64, supplier_id: u64)
  RELATIONSHIP part_id->part.id
  RELATIONSHIP supplier_id->supplier.id;

A table expression is an expression that returns a table.

Table expressions have two parts. The left half restricts a table to a subset of columns, the right half restricts the rows.

More ideas

Enum fields. Declare a string field as *str[1..20], and a hidden table is created to handle it. Instead of using up to 20 bytes to store and index on the value, only 1 byte is used, and no joining is needed. This is a neat idea, but we lose one of the main advantages of a foreign table, which is that the foreign-key constraint prevents typos from creating new erroneous entries. How about we have some kind of indicator to show that a table is being used for this purpose? Or can this be automatically detected? Yeah, do that, automatically detect the tiny join. The extra piece of join syntax is no hassle, order->status.name vs order.status.

{order->part.name, order.quantity} will return a row for each order.

{+order.quantity} will return a single scalar.

{/order.quantity} will return a single scalar, returning the column default if empty.

{/order.quantity, +order.quantity} will throw an error, need unique names on columns.

{avg: /order.quantity, sum: +order.quantity} will return a single row.

{order->part.name, +order.quantity} will return a row for each order. quantity is not summed.

{order->part.name, {+order.quantity}} will return a row for each order. quantity is summed.

{order->part.name!, +order.quantity} will return a row for each part. quantity is summed over each part.

{order->part.name, status.name} will return an error, there's so far no use for Cartesian products in Granite.

{order->part.name!, order.status} will throw an error, can't shrink down status.

{order->part.name!, order.status, +order.quantity} will throw an error, can't shrink down status.

{order->part.name!, order.status!, +order.quantity} will return a row for each part and status, summing quantity over them.

{part.name, part->supplier.name | part.name=["Nail","Rivet"]} uses the = symbol to check for membership of a scalar in a column expression.

[] is a column expression, () is a row expression. name: value is an alias.

order->part.(name, supplier.name) as a way of concisely referring to multiple fields, following from the row expression syntax.

DELETE { table | condition }

TABLE table (column_name: type,)

Testing the language

TABLE part (
    id: u32!,
    name: str[1..20],

    orders: id => order.part_id,
);

TABLE supplier (
    id: u32!,
    name: str[1..20],
    
    orders: id => order.supplier_id,
);

TABLE order (
    part_id: u32!,
    supplier_id: u32!,
    status_id: u8,
    project_id: u32?,
    quantity: u32,
    cost: d64.2,

    part: part_id -> part.id,
    supplier: supplier_id -> supplier.id,
    status: status_id -> status.id,
    project: project_id -> project.id,
);

TABLE status (
    id: u8!,
    name: str[1..20],
);

TABLE project (
    id: u32!,
    name: str[1..20],
    manager_id: u64?,
    budget: d64.2,
    
    manager: manager_id -> employee.id,
    orders: id => order.project_id,
);

TABLE department (
    id: u32!,
    name: str[1..20],
    code: str[3],
);

TABLE employee (
    id: u32!,
    name: str[1..],
    department_id: u64?,
    manager_id: u64?,

    department: department_id -> department.id,
    manager: manager_id -> employee.id,
)


// Get the total number of parts in the database
{ #part }

// Get a list of all projects that have exceeded their budgets, with their current cost and budgets
{ ~project!, project.name, project.budget, cost_to_date: +project=>orders.cost | cost_to_date > budget }

// Get the total quantity of parts that have ever been ordered from Fisher Appliances Ltd
{ +order.quantity | order.supplier->name="Fisher Appliances Ltd" }

// Get the IDs and names of all parts where fewer than 1000 have ever been purchased
{ order->part.id!, order->part.name | +order.quantity > 1000 }

// Get the names of all managers
// None values are excluded from optional relationships unless `table->relationship.column?` is specified
{ ~employee->manager.id!, employee->manager.name }