Spaces:

AItool
/

home-fasthtml

Sleeping

App Files Files Community

home-fasthtml / docs /explains /minidataapi.html.md

AItool

Upload 210 files

036b3a6 verified 10 days ago

preview code

raw

history blame contribute delete

18.3 kB

	# MiniDataAPI Spec


	<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

	The `MiniDataAPI` is a persistence API specification that designed to be
	small and relatively easy to implement across a wide range of
	datastores. While early implementations have been SQL-based, the
	specification can be quickly implemented in key/value stores, document
	databases, and more.

	<div>

	> Work in Progress
	>
	> The MiniData API spec is a work in progress, subject to change. While
	> the majority of design is complete, expect there could be breaking
	> changes.

	</div>

	## Why?

	The MiniDataAPI specification allows us to use the same API for many
	different database engines. Any application using the MiniDataAPI spec
	for interacting with its database requires no modification beyond import
	and configuration changes to switch database engines. For example, to
	convert an application from Fastlite running SQLite to FastSQL running
	PostgreSQL, should require only changing these two lines:

	<div class="columns">

	<div class="column" width="19%"
	style="border-right: 1px solid #ccc; padding-right: 10px;">

	FastLite version

	``` python
	from fastlite import *
	db = database('test.db')
	```

	</div>

	<div class="column" width="79%" style="padding-left: 10px;">

	FastSQL version

	``` python
	from fastsql import *
	db = Database('postgres:...')
	```

	</div>

	</div>

	As both libraries adhere to the MiniDataAPI specification, the rest of
	the code in the application should remain the same. The advantage of the
	MiniDataAPI spec is that it allows people to use whatever datastores
	they have access to or prefer.

	<div>

	> Note
	>
	> Switching databases won’t migrate any existing data between databases.

	</div>

	### Easy to learn, quick to implement

	The MiniDataAPI specification is designed to be easy-to-learn and quick
	to implement. It focuses on straightforward Create, Read, Update, and
	Delete (CRUD) operations.

	MiniDataAPI databases aren’t limited to just row-based systems. In fact,
	the specification is closer in design to a key/value store than a set of
	records. What’s exciting about this is we can write implementations for
	tools like Python dict stored as JSON, Redis, and even the venerable
	ZODB.

	### Limitations of the MiniDataAPI Specification

	> “Mini refers to the lightweightness of specification, not the data.”
	>
	> – Jeremy Howard

	The advantages of the MiniDataAPI come at a cost. The MiniDataAPI
	specification focuses a very small set of features compared to what can
	be found in full-fledged ORMs and query languages. It intentionally
	avoids nuances or sophisticated features.

	This means the specification does not include joins or formal foreign
	keys. Complex data stored over multiple tables that require joins isn’t
	handled well. For this kind of scenario it’s probably for the best to
	use more sophisticated ORMs or even direct database queries.

	### Summary of the MiniDataAPI Design

	- Easy-to-learn
	- Relative quick to implement for new database engines
	- An API for CRUD operations
	- For many different types of databases including row- and
	key/value-based designs
	- Intentionally small in terms of features: no joins, no foreign keys,
	no database specific features
	- Best for simpler designs, complex architectures will need more
	sophisticated tools.

	## Connect/construct the database

	We connect or construct the database by passing in a string connecting
	to the database endpoint or a filepath representing the database’s
	location. While this example is for SQLite running in memory, other
	databases such as PostgreSQL, Redis, MongoDB, might instead use a URI
	pointing at the database’s filepath or endpoint. The method of
	connecting to a DB is not part of this API, but part of the underlying
	library. For instance, for fastlite:

	``` python
	db = database(':memory:')
	```

	Here’s a complete list of the available methods in the API, all
	documented below (assuming `db` is a database and `t` is a table):

	- `db.create`
	- `t.insert`
	- `t.delete`
	- `t.update`
	- `t[key]`
	- `t(...)`
	- `t.xtra`

	## Tables

	For the sake of expediency, this document uses a SQL example. However,
	tables can represent anything, not just the fundamental construct of a
	SQL databases. They might represent keys within a key/value structure or
	files on a hard-drive.

	### Creating tables

	We use a `create()` method attached to `Database` object (`db` in our
	example) to create the tables.

	``` python
	class User: name:str; email: str; year_started:int
	users = db.create(User, pk='name')
	users
	```

	<Table user (name, email, year_started)>

	``` python
	class User: name:str; email: str; year_started:int
	users = db.create(User, pk='name')
	users
	```

	<Table user (name, email, year_started)>

	If no `pk` is provided, `id` is assumed to be the primary key.
	Regardless of whether you mark a class as a dataclass or not, it will be
	turned into one – specifically into a
	[`flexiclass`](https://fastcore.fast.ai/xtras.html#flexiclass).

	``` python
	@dataclass
	class Todo: id: int; title: str; detail: str; status: str; name: str
	todos = db.create(Todo)
	todos
	```

	<Table todo (id, title, detail, status, name)>

	### Compound primary keys

	The MiniData API spec supports compound primary keys, where more than
	one column is used to identify records. We’ll also use this example to
	demonstrate creating a table using a dict of keyword arguments.

	``` python
	class Publication: authors: str; year: int; title: str
	publications = db.create(Publication, pk=('authors', 'year'))
	```

	### Transforming tables

	Depending on the database type, this method can include transforms - the
	ability to modify the tables. Let’s go ahead and add a password field
	for our table called `pwd`.

	``` python
	class User: name:str; email: str; year_started:int; pwd:str
	users = db.create(User, pk='name', transform=True)
	users
	```

	<Table user (name, email, year_started, pwd)>

	## Manipulating data

	The specification is designed to provide as straightforward CRUD API
	(Create, Read, Update, and Delete) as possible. Additional features like
	joins are out of scope.

	### .insert()

	Add a new record to the database. We want to support as many types as
	possible, for now we have tests for Python classes, dataclasses, and
	dicts. Returns an instance of the new record.

	Here’s how to add a record using a Python class:

	``` python
	users.insert(User(name='Braden', email='[email protected]', year_started=2018))
	```

	User(name='Braden', email='[email protected]', year_started=2018, pwd=None)

	We can also use keyword arguments directly:

	``` python
	users.insert(name='Alma', email='[email protected]', year_started=2019)
	```

	User(name='Alma', email='[email protected]', year_started=2019, pwd=None)

	And now Charlie gets added via a Python dict.

	``` python
	users.insert({'name': 'Charlie', 'email': '[email protected]', 'year_started': 2018})
	```

	User(name='Charlie', email='[email protected]', year_started=2018, pwd=None)

	And now TODOs. Note that the inserted row is returned:

	``` python
	todos.insert(Todo(title='Write MiniDataAPI spec', status='open', name='Braden'))
	todos.insert(title='Implement SSE in FastHTML', status='open', name='Alma')
	todo = todos.insert(dict(title='Finish development of FastHTML', status='closed', name='Charlie'))
	todo
	```

	Todo(id=3, title='Finish development of FastHTML', detail=None, status='closed', name='Charlie')

	Let’s do the same with the `Publications` table.

	``` python
	publications.insert(Publication(authors='Alma', year=2019, title='FastHTML'))
	publications.insert(authors='Alma', year=2030, title='FastHTML and beyond')
	publication= publications.insert((dict(authors='Alma', year=2035, title='FastHTML, the early years')))
	publication
	```

	Publication(authors='Alma', year=2035, title='FastHTML, the early years')

	### Square bracket search \[\]

	Get a single record by entering a primary key into a table object within
	square brackets. Let’s see if we can find Alma.

	``` python
	user = users['Alma']
	user
	```

	User(name='Alma', email='[email protected]', year_started=2019, pwd=None)

	If no record is found, a `NotFoundError` error is raised. Here we look
	for David, who hasn’t yet been added to our users table.

	``` python
	try: users['David']
	except NotFoundError: print(f'User not found')
	```

	User not found

	Here’s a demonstration of a ticket search, demonstrating how this works
	with non-string primary keys.

	``` python
	todos[1]
	```

	Todo(id=1, title='Write MiniDataAPI spec', detail=None, status='open', name='Braden')

	Compound primary keys can be supplied in lists or tuples, in the order
	they were defined. In this case it is the `authors` and `year` columns.

	Here’s a query by compound primary key done with a `list`:

	``` python
	publications[['Alma', 2019]]
	```

	Publication(authors='Alma', year=2019, title='FastHTML')

	Here’s the same query done directly with index args.

	``` python
	publications['Alma', 2030]
	```

	Publication(authors='Alma', year=2030, title='FastHTML and beyond')

	### Parentheses search ()

	Get zero to many records by entering values with parentheses searches.
	If nothing is in the parentheses, then everything is returned.

	``` python
	users()
	```

	[User(name='Braden', email='[email protected]', year_started=2018, pwd=None),
	User(name='Alma', email='[email protected]', year_started=2019, pwd=None),
	User(name='Charlie', email='[email protected]', year_started=2018, pwd=None)]

	We can order the results.

	``` python
	users(order_by='name')
	```

	[User(name='Alma', email='[email protected]', year_started=2019, pwd=None),
	User(name='Braden', email='[email protected]', year_started=2018, pwd=None),
	User(name='Charlie', email='[email protected]', year_started=2018, pwd=None)]

	We can filter on the results:

	``` python
	users(where="name='Alma'")
	```

	[User(name='Alma', email='[email protected]', year_started=2019, pwd=None)]

	Generally you probably want to use placeholders, to avoid SQL injection
	attacks:

	``` python
	users("name=?", ('Alma',))
	```

	[User(name='Alma', email='[email protected]', year_started=2019, pwd=None)]

	We can limit results with the `limit` keyword:

	``` python
	users(limit=1)
	```

	[User(name='Braden', email='[email protected]', year_started=2018, pwd=None)]

	If we’re using the `limit` keyword, we can also use the `offset` keyword
	to start the query later.

	``` python
	users(limit=5, offset=1)
	```

	[User(name='Alma', email='[email protected]', year_started=2019, pwd=None),
	User(name='Charlie', email='[email protected]', year_started=2018, pwd=None)]

	### .update()

	Update an existing record of the database. Must accept Python dict,
	dataclasses, and standard classes. Uses the primary key for identifying
	the record to be changed. Returns an instance of the updated record.

	Here’s with a normal Python class:

	``` python
	user
	```

	User(name='Alma', email='[email protected]', year_started=2019, pwd=None)

	``` python
	user.year_started = 2099
	users.update(user)
	```

	User(name='Alma', email='[email protected]', year_started=2099, pwd=None)

	Or use a dict:

	``` python
	users.update(dict(name='Alma', year_started=2199, email='[email protected]'))
	```

	User(name='Alma', email='[email protected]', year_started=2199, pwd=None)

	Or use kwargs:

	``` python
	users.update(name='Alma', year_started=2149)
	```

	User(name='Alma', email='[email protected]', year_started=2149, pwd=None)

	If the primary key doesn’t match a record, raise a `NotFoundError`.

	John hasn’t started with us yet so doesn’t get the chance yet to travel
	in time.

	``` python
	try: users.update(User(name='John', year_started=2024, email='[email protected]'))
	except NotFoundError: print('User not found')
	```

	User not found

	### .delete()

	Delete a record of the database. Uses the primary key for identifying
	the record to be removed. Returns a table object.

	Charlie decides to not travel in time. He exits our little group.

	``` python
	users.delete('Charlie')
	```

	<Table user (name, email, year_started, pwd)>

	If the primary key value can’t be found, raises a `NotFoundError`.

	``` python
	try: users.delete('Charlies')
	except NotFoundError: print('User not found')
	```

	User not found

	In John’s case, he isn’t time travelling with us yet so can’t be
	removed.

	``` python
	try: users.delete('John')
	except NotFoundError: print('User not found')
	```

	User not found

	Deleting records with compound primary keys requires providing the
	entire key.

	``` python
	publications.delete(['Alma' , 2035])
	```

	<Table publication (authors, year, title)>

	### `in` keyword

	Are `Alma` and `John` contained `in` the Users table? Or, to be
	technically precise, is the item with the specified primary key value
	`in` this table?

	``` python
	'Alma' in users, 'John' in users
	```

	(True, False)

	Also works with compound primary keys, as shown below. You’ll note that
	the operation can be done with either a `list` or `tuple`.

	``` python
	['Alma', 2019] in publications
	```

	True

	And now for a `False` result, where John has no publications.

	``` python
	('John', 1967) in publications
	```

	False

	### .xtra()

	If we set fields within the `.xtra` function to a particular value, then
	indexing is also filtered by those. This applies to every database
	method except for record creation. This makes it easier to limit users
	(or other objects) access to only things for which they have permission.
	This is a one-way operation, once set it can’t be undone for a
	particular table object.

	For example, if we query all our records below without setting values
	via the `.xtra` function, we can see todos for everyone. Pay special
	attention to the `id` values of all three records, as we are about to
	filter most of them away.

	``` python
	todos()
	```

	[Todo(id=1, title='Write MiniDataAPI spec', detail=None, status='open', name='Braden'),
	Todo(id=2, title='Implement SSE in FastHTML', detail=None, status='open', name='Alma'),
	Todo(id=3, title='Finish development of FastHTML', detail=None, status='closed', name='Charlie')]

	Let’s use `.xtra` to constrain results just to Charlie. We set the
	`name` field in Todos, but it could be any field defined for this table.

	``` python
	todos.xtra(name='Charlie')
	```

	We’ve now set a field to a value with `.xtra`, if we loop over all the
	records again, only those assigned to records with a `name` of `Charlie`
	will be displayed.

	``` python
	todos()
	```

	[Todo(id=3, title='Finish development of FastHTML', detail=None, status='closed', name='Charlie')]

	The `in` keyword is also affected. Only records with a `name` of Charlie
	will evaluate to be `True`. Let’s demonstrate by testing it with a
	Charlie record:

	``` python
	ct = todos[3]
	ct
	```

	Todo(id=3, title='Finish development of FastHTML', detail=None, status='closed', name='Charlie')

	Charlie’s record has an ID of 3. Here we demonstrate that Charlie’s TODO
	can be found in the list of todos:

	``` python
	ct.id in todos
	```

	True

	If we try `in` with the other IDs the query fails because the filtering
	is now set to just records with a name of Charlie.

	``` python
	1 in todos, 2 in todos
	```

	(False, False)

	``` python
	try: todos[2]
	except NotFoundError: print('Record not found')
	```

	Record not found

	We are also constrained by what records we can update. In the following
	example we try to update a TODO not named ‘Charlie’. Because the name is
	wrong, the `.update` function will raise a `NotFoundError`.

	``` python
	try: todos.update(Todo(id=1, title='Finish MiniDataAPI Spec', status='closed', name='Braden'))
	except NotFoundError as e: print('Record not updated')
	```

	Record not updated

	Unlike poor Braden, Charlie isn’t filtered out. Let’s update his TODO.

	``` python
	todos.update(Todo(id=3, title='Finish development of FastHTML', detail=None, status='closed', name='Charlie'))
	```

	Todo(id=3, title='Finish development of FastHTML', detail=None, status='closed', name='Charlie')

	Finally, once constrained by `.xtra`, only records with Charlie as the
	name can be deleted.

	``` python
	try: todos.delete(1)
	except NotFoundError as e: print('Record not updated')
	```

	Record not updated

	Charlie’s TODO was to finish development of FastHTML. While the
	framework will stabilize, like any good project it will see new features
	added and the odd bug corrected for many years to come. Therefore,
	Charlie’s TODO is nonsensical. Let’s delete it.

	``` python
	todos.delete(ct.id)
	```

	<Table todo (id, title, detail, status, name)>

	When a TODO is inserted, the `xtra` fields are automatically set. This
	ensures that we don’t accidentally, for instance, insert items for
	others users. Note that here we don’t set the `name` field, but it’s
	still included in the resultant row:

	``` python
	ct = todos.insert(Todo(title='Rewrite personal site in FastHTML', status='open'))
	ct
	```

	Todo(id=3, title='Rewrite personal site in FastHTML', detail=None, status='open', name='Charlie')

	If we try to change the username to someone else, the change is ignored,
	due to `xtra`:

	``` python
	ct.name = 'Braden'
	todos.update(ct)
	```

	Todo(id=3, title='Rewrite personal site in FastHTML', detail=None, status='open', name='Charlie')

	## SQL-first design

	``` python
	users = None
	User = None
	```

	``` python
	users = db.t.user
	users
	```

	<Table user (name, email, year_started, pwd)>

	(This section needs to be documented properly.)

	From the table objects we can extract a Dataclass version of our tables.
	Usually this is given an singular uppercase version of our table name,
	which in this case is `User`.

	``` python
	User = users.dataclass()
	```

	``` python
	User(name='Braden', email='[email protected]', year_started=2018)
	```

	User(name='Braden', email='[email protected]', year_started=2018, pwd=UNSET)

	## Implementations

	### Implementing MiniDataAPI for a new datastore

	For creating new implementations, the code examples in this
	specification are the test case for the API. New implementations should
	pass the tests in order to be compliant with the specification.

	### Implementations

	- [fastlite](https://github.com/AnswerDotAI/fastlite) - The original
	implementation, only for Sqlite
	- [fastsql](https://github.com/AnswerDotAI/fastsql) - An SQL database
	agnostic implementation based on the excellent SQLAlchemy library.