Spaces:

reztilop
/

colibri.qdrant

Build error

App Files Files Community

colibri.qdrant / docs /DEVELOPMENT.md

Gouzi Mohaled

Ajout du dossier docs

2fdbd5c 8 months ago

preview code

raw

history blame contribute delete

9.82 kB


	# Developer's guide to Qdrant


	## Build Qdrant

	### Docker 🐳

	Build your own from source

	```bash
	docker build . --tag=qdrant/qdrant
	```

	Or use latest pre-built image from [DockerHub](https://hub.docker.com/r/qdrant/qdrant)

	```bash
	docker pull qdrant/qdrant
	```

	To run the container, use the command:

	```bash
	docker run -p 6333:6333 qdrant/qdrant
	```

	And once you need a fine-grained setup, you can also define a storage path and custom configuration:

	```bash
	docker run -p 6333:6333 \
	-v $(pwd)/path/to/data:/qdrant/storage \
	-v $(pwd)/path/to/snapshots:/qdrant/snapshots \
	-v $(pwd)/path/to/custom_config.yaml:/qdrant/config/production.yaml \
	qdrant/qdrant
	```

	* `/qdrant/storage` - is the place where Qdrant persists all your data.
	Make sure to mount it as a volume, otherwise docker will drop it with the container.
	- `/qdrant/snapshots` - is the place where Qdrant stores [snapshots](https://qdrant.tech/documentation/concepts/snapshots/)
	* `/qdrant/config/production.yaml` - is the file with engine configuration. You can override any value from the [reference config](https://github.com/qdrant/qdrant/blob/master/config/config.yaml)

	Now Qdrant should be accessible at [localhost:6333](http://localhost:6333/).


	### Local development
	#### Linux/Debian/MacOS
	To run Qdrant on local development environment you need to install below:
	- Install Rust, follow: [install rust](https://www.rust-lang.org/tools/install)
	- Install `rustfmt` toolchain for Rust
	```shell
	rustup component add rustfmt
	```
	- Install dependencies:
	```shell
	sudo apt-get update -y
	sudo apt-get upgrade -y
	sudo apt-get install -y curl unzip gcc-multilib \
	clang cmake jq \
	g++-9-aarch64-linux-gnu \
	gcc-9-aarch64-linux-gnu
	```
	- Install `protoc` from source
	```shell
	PROTOC_VERSION=22.2
	PKG_NAME=$(uname -s \| awk '{print ($1 == "Darwin") ? "osx-universal_binary" : (($1 == "Linux") ? "linux-x86_64" : "")}')

	# curl `proto` source file
	curl -LO https://github.com/protocolbuffers/protobuf/releases//download/v$PROTOC_VERSION/protoc-$PROTOC_VERSION-$PKG_NAME.zip

	unzip protoc-$PROTOC_VERSION-$PKG_NAME.zip -d $HOME/.local

	export PATH="$PATH:$HOME/.local/bin"

	# remove source file if not needed
	rm protoc-$PROTOC_VERSION-$PKG_NAME.zip

	# check installed `protoc` version
	protoc --version
	```
	- Build and run the app
	```shell
	cargo build --release --bin qdrant

	./target/release/qdrant
	```
	- Install Python dependencies for testing
	```shell
	poetry -C tests install --sync
	```
	Then you could use `poetry -C run pytest tests/openapi` and `poetry -C run pytest tests/consensus_tests` to run the tests.
	- Use the web UI

	Web UI repo is [in a separate repo](https://github.com/qdrant/qdrant-web-ui), but there's a utility script to sync it to the `static` folder:
	```shell
	./tools/sync-web-ui.sh
	```

	### Nix/NixOS
	If you are using [Nix package manager](https://nixos.org/) (available for Linux and MacOS), you can run `nix-shell` in the project root to get a shell with all dependencies installed.
	It includes dependencies to build Rust code as well as to run Python tests and various tools in the `./tools` directory.

	## Profiling

	There are several benchmarks implemented in Qdrant. Benchmarks are not included in CI/CD and might take some time to execute.
	So the expected approach to benchmarking is to run only ones which might be affected by your changes.

	To run benchmark, use the following command inside a related sub-crate:

	```bash
	cargo bench --bench name_of_benchmark
	```

	In this case you will see the execution timings and, if you launched this bench earlier, the difference in execution time.

	Example output:

	```
	scoring-vector/basic-score-point
	time: [111.81 us 112.07 us 112.31 us]
	change: [+19.567% +20.454% +21.404%] (p = 0.00 < 0.05)
	Performance has regressed.
	Found 9 outliers among 100 measurements (9.00%)
	3 (3.00%) low severe
	3 (3.00%) low mild
	2 (2.00%) high mild
	1 (1.00%) high severe
	scoring-vector/basic-score-point-10x
	time: [111.86 us 112.44 us 113.04 us]
	change: [-1.6120% -0.5554% +0.5103%] (p = 0.32 > 0.05)
	No change in performance detected.
	Found 1 outliers among 100 measurements (1.00%)
	1 (1.00%) high mild
	```


	### FlameGraph and call-graph visualisation
	To run benchmarks with profiler to generate FlameGraph - use the following command:

	```bash
	cargo bench --bench name_of_benchmark -- --profile-time=60
	```

	This command will run each benchmark iterator for `60` seconds and generate FlameGraph svg along with profiling records files.
	These records could later be used to generate visualisation of the call-graph.

	![FlameGraph example](./imgs/flamegraph-profile.png)

	Use [pprof](https://github.com/google/pprof) and the following command to generate `svg` with a call graph:

	```bash
	~/go/bin/pprof -output=profile.svg -svg ${qdrant_root}/target/criterion/${benchmark_name}/${function_name}/profile/profile.pb
	```

	![call-graph example](./imgs/call-graph-profile.png)

	### Real-time profiling

	Qdrant have basic [`tracing`] support with [`Tracy`] profiler and [`tokio-console`] integrations
	that can be enabled with optional features.

	- [`tracing`] is an _optional_ dependency that can be enabled with `tracing` feature
	- `tracy` feature enables [`Tracy`] profiler integration
	- `console` feature enables [`tokio-console`] integration
	- note, that you'll also have to [pass `--cfg tokio_unstable` arguments to `rustc`][tokio-tracing] to enable this feature
	- by default [`tokio-console`] binds to `127.0.0.1:6669`
	- if you want to connect [`tokio-console`] to Qdrant instance running inside a Docker container
	or on remote server, you can define `TOKIO_CONSOLE_BIND` when running Qdrant to override it
	(e.g., `TOKIO_CONSOLE_BIND=0.0.0.0:6669` to listen on all interfaces)
	- `tokio-tracing` feature explicitly enables [`Tokio` crate tracing][tokio-tracing]
	- note, that you'll also have to [pass `--cfg tokio_unstable` arguments to `rustc`][tokio-tracing] to enable this feature
	- this is required (and enabled automatically) by the `console` feature
	- but you can enable it explicitly with the `tracy` feature, to see Tokio traces in [`Tracy`] profiler

	Qdrant code is not instrumented by default, so you'll have to manually add `#[tracing::instrument]` attributes
	on functions and methods that you want to profile.

	Qdrant uses [`tracing-log`] as the [`log`] backend, so `log` and `log-always` features of the [`tracing`] crate
	[should _not_ be enabled][tracing-log-warning]!

	```rust
	// `tracing` crate is an optional dependency in `lib/*` crates, so if you want the code to compile
	// when `tracing` feature is disabled, you have to use `#[cfg_attr(...)]`...
	//
	// See https://doc.rust-lang.org/reference/conditional-compilation.html#the-cfg_attr-attribute
	#[cfg_attr(feature = "tracing", tracing::instrument)]
	fn my_function(some_parameter: String) {
	// ...
	}

	// ...or if you just want to do some quick-and-dirty profiling, you can use `#[tracing::instrument]`
	// directly, just don't forget to add `--features tracing` when running `cargo` (or add `tracing`
	// to default features in `Cargo.toml`)
	#[tracing::instrument]
	fn some_other_function() {
	// ...
	}
	```

	[`tracing`]: https://docs.rs/tracing/latest/tracing/
	[`Tracy`]: https://github.com/wolfpld/tracy
	[`tokio-console`]: https://docs.rs/tokio-console/latest/tokio_console/
	[tokio-tracing]: https://docs.rs/tokio/latest/tokio/#unstable-features
	[`tracing-log`]: https://docs.rs/tracing-log/latest/tracing_log/
	[`log`]: https://docs.rs/log/latest/log/
	[tracing-log-warning]: https://docs.rs/tracing-log/latest/tracing_log/#caution-mixing-both-conversions

	## API changes

	### REST

	Qdrant uses the [openapi](https://spec.openapis.org/oas/latest.html) specification to document its API.

	This means changes to the API must be followed by changes to the specification.
	This is enforced by CI.

	Here is a quick step-by-step guide:

	1. code endpoints and model in Rust
	2. change specs in `/openapi/*ytt.yaml`
	3. add new schema definitions to `src/schema_generator.rs`
	4. run `./tools/generate_openapi_models.sh` to generate specs
	5. update integration tests `tests/openapi` and run them with `pytest tests/openapi` (use poetry or nix to get `pytest`)
	6. expose file by starting an HTTP server, for instance `python -m http.server`, in `/docs/redoc`
	7. validate specs by browsing redoc on `http://localhost:8000/?v=master`
	8. validate `openapi-merged.yaml` using [swagger editor](https://editor.swagger.io/)

	### gRPC

	Qdrant uses [tonic](https://github.com/hyperium/tonic) to serve gRPC traffic.

	Our protocol buffers are defined in `lib/api/src/grpc/proto/*.proto`

	1. define request and response types using protocol buffers (use [oneOf](https://developers.google.com/protocol-buffers/docs/proto3#oneof) for enums payloads)
	2. specify RPC methods inside the service definition using protocol buffers
	3. `cargo build` will generate the struct definitions and a service trait
	4. implement the service trait in Rust
	5. start server `cargo run --bin qdrant`
	6. run integration test `./tests/basic_grpc_test.sh`
	7. generate docs `./tools/generate_grpc_docs.sh`

	Here is a good [tonic tutorial](https://github.com/hyperium/tonic/blob/master/examples/routeguide-tutorial.md#defining-the-service) for reference.

	### System integration

	On top of the API definitions, Qdrant has a few system integrations that need to be considered when making changes:
	1. add new endpoints to the metrics allow lists in `src/common/metrics.rs`
	2. test the JWT integration in `tests/auth_tests`