Spaces:
Sleeping
Sleeping
# MiniDataAPI Spec | |
<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! --> | |
The `MiniDataAPI` is a persistence API specification that designed to be | |
small and relatively easy to implement across a wide range of | |
datastores. While early implementations have been SQL-based, the | |
specification can be quickly implemented in key/value stores, document | |
databases, and more. | |
<div> | |
> **Work in Progress** | |
> | |
> The MiniData API spec is a work in progress, subject to change. While | |
> the majority of design is complete, expect there could be breaking | |
> changes. | |
</div> | |
## Why? | |
The MiniDataAPI specification allows us to use the same API for many | |
different database engines. Any application using the MiniDataAPI spec | |
for interacting with its database requires no modification beyond import | |
and configuration changes to switch database engines. For example, to | |
convert an application from Fastlite running SQLite to FastSQL running | |
PostgreSQL, should require only changing these two lines: | |
<div class="columns"> | |
<div class="column" width="19%" | |
style="border-right: 1px solid #ccc; padding-right: 10px;"> | |
FastLite version | |
``` python | |
from fastlite import * | |
db = database('test.db') | |
``` | |
</div> | |
<div class="column" width="79%" style="padding-left: 10px;"> | |
FastSQL version | |
``` python | |
from fastsql import * | |
db = Database('postgres:...') | |
``` | |
</div> | |
</div> | |
As both libraries adhere to the MiniDataAPI specification, the rest of | |
the code in the application should remain the same. The advantage of the | |
MiniDataAPI spec is that it allows people to use whatever datastores | |
they have access to or prefer. | |
<div> | |
> **Note** | |
> | |
> Switching databases won’t migrate any existing data between databases. | |
</div> | |
### Easy to learn, quick to implement | |
The MiniDataAPI specification is designed to be easy-to-learn and quick | |
to implement. It focuses on straightforward Create, Read, Update, and | |
Delete (CRUD) operations. | |
MiniDataAPI databases aren’t limited to just row-based systems. In fact, | |
the specification is closer in design to a key/value store than a set of | |
records. What’s exciting about this is we can write implementations for | |
tools like Python dict stored as JSON, Redis, and even the venerable | |
ZODB. | |
### Limitations of the MiniDataAPI Specification | |
> “Mini refers to the lightweightness of specification, not the data.” | |
> | |
> – Jeremy Howard | |
The advantages of the MiniDataAPI come at a cost. The MiniDataAPI | |
specification focuses a very small set of features compared to what can | |
be found in full-fledged ORMs and query languages. It intentionally | |
avoids nuances or sophisticated features. | |
This means the specification does not include joins or formal foreign | |
keys. Complex data stored over multiple tables that require joins isn’t | |
handled well. For this kind of scenario it’s probably for the best to | |
use more sophisticated ORMs or even direct database queries. | |
### Summary of the MiniDataAPI Design | |
- Easy-to-learn | |
- Relative quick to implement for new database engines | |
- An API for CRUD operations | |
- For many different types of databases including row- and | |
key/value-based designs | |
- Intentionally small in terms of features: no joins, no foreign keys, | |
no database specific features | |
- Best for simpler designs, complex architectures will need more | |
sophisticated tools. | |
## Connect/construct the database | |
We connect or construct the database by passing in a string connecting | |
to the database endpoint or a filepath representing the database’s | |
location. While this example is for SQLite running in memory, other | |
databases such as PostgreSQL, Redis, MongoDB, might instead use a URI | |
pointing at the database’s filepath or endpoint. The method of | |
connecting to a DB is *not* part of this API, but part of the underlying | |
library. For instance, for fastlite: | |
``` python | |
db = database(':memory:') | |
``` | |
Here’s a complete list of the available methods in the API, all | |
documented below (assuming `db` is a database and `t` is a table): | |
- `db.create` | |
- `t.insert` | |
- `t.delete` | |
- `t.update` | |
- `t[key]` | |
- `t(...)` | |
- `t.xtra` | |
## Tables | |
For the sake of expediency, this document uses a SQL example. However, | |
tables can represent anything, not just the fundamental construct of a | |
SQL databases. They might represent keys within a key/value structure or | |
files on a hard-drive. | |
### Creating tables | |
We use a `create()` method attached to `Database` object (`db` in our | |
example) to create the tables. | |
``` python | |
class User: name:str; email: str; year_started:int | |
users = db.create(User, pk='name') | |
users | |
``` | |
<Table user (name, email, year_started)> | |
``` python | |
class User: name:str; email: str; year_started:int | |
users = db.create(User, pk='name') | |
users | |
``` | |
<Table user (name, email, year_started)> | |
If no `pk` is provided, `id` is assumed to be the primary key. | |
Regardless of whether you mark a class as a dataclass or not, it will be | |
turned into one – specifically into a | |
[`flexiclass`](https://fastcore.fast.ai/xtras.html#flexiclass). | |
``` python | |
@dataclass | |
class Todo: id: int; title: str; detail: str; status: str; name: str | |
todos = db.create(Todo) | |
todos | |
``` | |
<Table todo (id, title, detail, status, name)> | |
### Compound primary keys | |
The MiniData API spec supports compound primary keys, where more than | |
one column is used to identify records. We’ll also use this example to | |
demonstrate creating a table using a dict of keyword arguments. | |
``` python | |
class Publication: authors: str; year: int; title: str | |
publications = db.create(Publication, pk=('authors', 'year')) | |
``` | |
### Transforming tables | |
Depending on the database type, this method can include transforms - the | |
ability to modify the tables. Let’s go ahead and add a password field | |
for our table called `pwd`. | |
``` python | |
class User: name:str; email: str; year_started:int; pwd:str | |
users = db.create(User, pk='name', transform=True) | |
users | |
``` | |
<Table user (name, email, year_started, pwd)> | |
## Manipulating data | |
The specification is designed to provide as straightforward CRUD API | |
(Create, Read, Update, and Delete) as possible. Additional features like | |
joins are out of scope. | |
### .insert() | |
Add a new record to the database. We want to support as many types as | |
possible, for now we have tests for Python classes, dataclasses, and | |
dicts. Returns an instance of the new record. | |
Here’s how to add a record using a Python class: | |
``` python | |
users.insert(User(name='Braden', email='[email protected]', year_started=2018)) | |
``` | |
User(name='Braden', email='[email protected]', year_started=2018, pwd=None) | |
We can also use keyword arguments directly: | |
``` python | |
users.insert(name='Alma', email='[email protected]', year_started=2019) | |
``` | |
User(name='Alma', email='[email protected]', year_started=2019, pwd=None) | |
And now Charlie gets added via a Python dict. | |
``` python | |
users.insert({'name': 'Charlie', 'email': '[email protected]', 'year_started': 2018}) | |
``` | |
User(name='Charlie', email='[email protected]', year_started=2018, pwd=None) | |
And now TODOs. Note that the inserted row is returned: | |
``` python | |
todos.insert(Todo(title='Write MiniDataAPI spec', status='open', name='Braden')) | |
todos.insert(title='Implement SSE in FastHTML', status='open', name='Alma') | |
todo = todos.insert(dict(title='Finish development of FastHTML', status='closed', name='Charlie')) | |
todo | |
``` | |
Todo(id=3, title='Finish development of FastHTML', detail=None, status='closed', name='Charlie') | |
Let’s do the same with the `Publications` table. | |
``` python | |
publications.insert(Publication(authors='Alma', year=2019, title='FastHTML')) | |
publications.insert(authors='Alma', year=2030, title='FastHTML and beyond') | |
publication= publications.insert((dict(authors='Alma', year=2035, title='FastHTML, the early years'))) | |
publication | |
``` | |
Publication(authors='Alma', year=2035, title='FastHTML, the early years') | |
### Square bracket search \[\] | |
Get a single record by entering a primary key into a table object within | |
square brackets. Let’s see if we can find Alma. | |
``` python | |
user = users['Alma'] | |
user | |
``` | |
User(name='Alma', email='[email protected]', year_started=2019, pwd=None) | |
If no record is found, a `NotFoundError` error is raised. Here we look | |
for David, who hasn’t yet been added to our users table. | |
``` python | |
try: users['David'] | |
except NotFoundError: print(f'User not found') | |
``` | |
User not found | |
Here’s a demonstration of a ticket search, demonstrating how this works | |
with non-string primary keys. | |
``` python | |
todos[1] | |
``` | |
Todo(id=1, title='Write MiniDataAPI spec', detail=None, status='open', name='Braden') | |
Compound primary keys can be supplied in lists or tuples, in the order | |
they were defined. In this case it is the `authors` and `year` columns. | |
Here’s a query by compound primary key done with a `list`: | |
``` python | |
publications[['Alma', 2019]] | |
``` | |
Publication(authors='Alma', year=2019, title='FastHTML') | |
Here’s the same query done directly with index args. | |
``` python | |
publications['Alma', 2030] | |
``` | |
Publication(authors='Alma', year=2030, title='FastHTML and beyond') | |
### Parentheses search () | |
Get zero to many records by entering values with parentheses searches. | |
If nothing is in the parentheses, then everything is returned. | |
``` python | |
users() | |
``` | |
[User(name='Braden', email='[email protected]', year_started=2018, pwd=None), | |
User(name='Alma', email='[email protected]', year_started=2019, pwd=None), | |
User(name='Charlie', email='[email protected]', year_started=2018, pwd=None)] | |
We can order the results. | |
``` python | |
users(order_by='name') | |
``` | |
[User(name='Alma', email='[email protected]', year_started=2019, pwd=None), | |
User(name='Braden', email='[email protected]', year_started=2018, pwd=None), | |
User(name='Charlie', email='[email protected]', year_started=2018, pwd=None)] | |
We can filter on the results: | |
``` python | |
users(where="name='Alma'") | |
``` | |
[User(name='Alma', email='[email protected]', year_started=2019, pwd=None)] | |
Generally you probably want to use placeholders, to avoid SQL injection | |
attacks: | |
``` python | |
users("name=?", ('Alma',)) | |
``` | |
[User(name='Alma', email='[email protected]', year_started=2019, pwd=None)] | |
We can limit results with the `limit` keyword: | |
``` python | |
users(limit=1) | |
``` | |
[User(name='Braden', email='[email protected]', year_started=2018, pwd=None)] | |
If we’re using the `limit` keyword, we can also use the `offset` keyword | |
to start the query later. | |
``` python | |
users(limit=5, offset=1) | |
``` | |
[User(name='Alma', email='[email protected]', year_started=2019, pwd=None), | |
User(name='Charlie', email='[email protected]', year_started=2018, pwd=None)] | |
### .update() | |
Update an existing record of the database. Must accept Python dict, | |
dataclasses, and standard classes. Uses the primary key for identifying | |
the record to be changed. Returns an instance of the updated record. | |
Here’s with a normal Python class: | |
``` python | |
user | |
``` | |
User(name='Alma', email='[email protected]', year_started=2019, pwd=None) | |
``` python | |
user.year_started = 2099 | |
users.update(user) | |
``` | |
User(name='Alma', email='[email protected]', year_started=2099, pwd=None) | |
Or use a dict: | |
``` python | |
users.update(dict(name='Alma', year_started=2199, email='[email protected]')) | |
``` | |
User(name='Alma', email='[email protected]', year_started=2199, pwd=None) | |
Or use kwargs: | |
``` python | |
users.update(name='Alma', year_started=2149) | |
``` | |
User(name='Alma', email='[email protected]', year_started=2149, pwd=None) | |
If the primary key doesn’t match a record, raise a `NotFoundError`. | |
John hasn’t started with us yet so doesn’t get the chance yet to travel | |
in time. | |
``` python | |
try: users.update(User(name='John', year_started=2024, email='[email protected]')) | |
except NotFoundError: print('User not found') | |
``` | |
User not found | |
### .delete() | |
Delete a record of the database. Uses the primary key for identifying | |
the record to be removed. Returns a table object. | |
Charlie decides to not travel in time. He exits our little group. | |
``` python | |
users.delete('Charlie') | |
``` | |
<Table user (name, email, year_started, pwd)> | |
If the primary key value can’t be found, raises a `NotFoundError`. | |
``` python | |
try: users.delete('Charlies') | |
except NotFoundError: print('User not found') | |
``` | |
User not found | |
In John’s case, he isn’t time travelling with us yet so can’t be | |
removed. | |
``` python | |
try: users.delete('John') | |
except NotFoundError: print('User not found') | |
``` | |
User not found | |
Deleting records with compound primary keys requires providing the | |
entire key. | |
``` python | |
publications.delete(['Alma' , 2035]) | |
``` | |
<Table publication (authors, year, title)> | |
### `in` keyword | |
Are `Alma` and `John` contained `in` the Users table? Or, to be | |
technically precise, is the item with the specified primary key value | |
`in` this table? | |
``` python | |
'Alma' in users, 'John' in users | |
``` | |
(True, False) | |
Also works with compound primary keys, as shown below. You’ll note that | |
the operation can be done with either a `list` or `tuple`. | |
``` python | |
['Alma', 2019] in publications | |
``` | |
True | |
And now for a `False` result, where John has no publications. | |
``` python | |
('John', 1967) in publications | |
``` | |
False | |
### .xtra() | |
If we set fields within the `.xtra` function to a particular value, then | |
indexing is also filtered by those. This applies to every database | |
method except for record creation. This makes it easier to limit users | |
(or other objects) access to only things for which they have permission. | |
This is a one-way operation, once set it can’t be undone for a | |
particular table object. | |
For example, if we query all our records below without setting values | |
via the `.xtra` function, we can see todos for everyone. Pay special | |
attention to the `id` values of all three records, as we are about to | |
filter most of them away. | |
``` python | |
todos() | |
``` | |
[Todo(id=1, title='Write MiniDataAPI spec', detail=None, status='open', name='Braden'), | |
Todo(id=2, title='Implement SSE in FastHTML', detail=None, status='open', name='Alma'), | |
Todo(id=3, title='Finish development of FastHTML', detail=None, status='closed', name='Charlie')] | |
Let’s use `.xtra` to constrain results just to Charlie. We set the | |
`name` field in Todos, but it could be any field defined for this table. | |
``` python | |
todos.xtra(name='Charlie') | |
``` | |
We’ve now set a field to a value with `.xtra`, if we loop over all the | |
records again, only those assigned to records with a `name` of `Charlie` | |
will be displayed. | |
``` python | |
todos() | |
``` | |
[Todo(id=3, title='Finish development of FastHTML', detail=None, status='closed', name='Charlie')] | |
The `in` keyword is also affected. Only records with a `name` of Charlie | |
will evaluate to be `True`. Let’s demonstrate by testing it with a | |
Charlie record: | |
``` python | |
ct = todos[3] | |
ct | |
``` | |
Todo(id=3, title='Finish development of FastHTML', detail=None, status='closed', name='Charlie') | |
Charlie’s record has an ID of 3. Here we demonstrate that Charlie’s TODO | |
can be found in the list of todos: | |
``` python | |
ct.id in todos | |
``` | |
True | |
If we try `in` with the other IDs the query fails because the filtering | |
is now set to just records with a name of Charlie. | |
``` python | |
1 in todos, 2 in todos | |
``` | |
(False, False) | |
``` python | |
try: todos[2] | |
except NotFoundError: print('Record not found') | |
``` | |
Record not found | |
We are also constrained by what records we can update. In the following | |
example we try to update a TODO not named ‘Charlie’. Because the name is | |
wrong, the `.update` function will raise a `NotFoundError`. | |
``` python | |
try: todos.update(Todo(id=1, title='Finish MiniDataAPI Spec', status='closed', name='Braden')) | |
except NotFoundError as e: print('Record not updated') | |
``` | |
Record not updated | |
Unlike poor Braden, Charlie isn’t filtered out. Let’s update his TODO. | |
``` python | |
todos.update(Todo(id=3, title='Finish development of FastHTML', detail=None, status='closed', name='Charlie')) | |
``` | |
Todo(id=3, title='Finish development of FastHTML', detail=None, status='closed', name='Charlie') | |
Finally, once constrained by `.xtra`, only records with Charlie as the | |
name can be deleted. | |
``` python | |
try: todos.delete(1) | |
except NotFoundError as e: print('Record not updated') | |
``` | |
Record not updated | |
Charlie’s TODO was to finish development of FastHTML. While the | |
framework will stabilize, like any good project it will see new features | |
added and the odd bug corrected for many years to come. Therefore, | |
Charlie’s TODO is nonsensical. Let’s delete it. | |
``` python | |
todos.delete(ct.id) | |
``` | |
<Table todo (id, title, detail, status, name)> | |
When a TODO is inserted, the `xtra` fields are automatically set. This | |
ensures that we don’t accidentally, for instance, insert items for | |
others users. Note that here we don’t set the `name` field, but it’s | |
still included in the resultant row: | |
``` python | |
ct = todos.insert(Todo(title='Rewrite personal site in FastHTML', status='open')) | |
ct | |
``` | |
Todo(id=3, title='Rewrite personal site in FastHTML', detail=None, status='open', name='Charlie') | |
If we try to change the username to someone else, the change is ignored, | |
due to `xtra`: | |
``` python | |
ct.name = 'Braden' | |
todos.update(ct) | |
``` | |
Todo(id=3, title='Rewrite personal site in FastHTML', detail=None, status='open', name='Charlie') | |
## SQL-first design | |
``` python | |
users = None | |
User = None | |
``` | |
``` python | |
users = db.t.user | |
users | |
``` | |
<Table user (name, email, year_started, pwd)> | |
(This section needs to be documented properly.) | |
From the table objects we can extract a Dataclass version of our tables. | |
Usually this is given an singular uppercase version of our table name, | |
which in this case is `User`. | |
``` python | |
User = users.dataclass() | |
``` | |
``` python | |
User(name='Braden', email='[email protected]', year_started=2018) | |
``` | |
User(name='Braden', email='[email protected]', year_started=2018, pwd=UNSET) | |
## Implementations | |
### Implementing MiniDataAPI for a new datastore | |
For creating new implementations, the code examples in this | |
specification are the test case for the API. New implementations should | |
pass the tests in order to be compliant with the specification. | |
### Implementations | |
- [fastlite](https://github.com/AnswerDotAI/fastlite) - The original | |
implementation, only for Sqlite | |
- [fastsql](https://github.com/AnswerDotAI/fastsql) - An SQL database | |
agnostic implementation based on the excellent SQLAlchemy library. | |