---
title: README
emoji: 🚀
colorFrom: indigo
colorTo: pink
sdk: static
pinned: false
license: apache-2.0
---

<div align="center">

<img src="https://cdn-uploads.huggingface.co/production/uploads/64ccdc322e592905f922a06e/VhwQtaklohkUXFWkjA-3M.png" width="450"/>

[GitHub](https://github.com/InternLM/lmdeploy)

English | [简体中文](https://github.com/InternLM/lmdeploy/blob/main/README_zh-CN.md)

</div>

<p align="center">
👋 Join us on <a href="https://twitter.com/intern_lm" target="_blank">Twitter</a>, <a href="https://discord.gg/xa29JuW87d" target="_blank">Discord</a>, and <a href="https://r.vansin.top/?r=internwx" target="_blank">WeChat</a>
</p>

______________________________________________________________________

## News 🎉

- \[2023/08\] TurboMind supports 4-bit inference, 2.4x faster than FP16 and the fastest among open-source implementations 🚀.
- \[2023/08\] LMDeploy has launched on the [Hugging Face Hub](https://huggingface.co/lmdeploy), providing ready-to-use 4-bit models; see the sketch after this list.
- \[2023/08\] LMDeploy supports 4-bit quantization using the [AWQ](https://arxiv.org/abs/2306.00978) algorithm.
- \[2023/07\] TurboMind supports Llama-2 70B with GQA.
- \[2023/07\] TurboMind supports Llama-2 7B/13B.
- \[2023/07\] TurboMind supports tensor-parallel inference of InternLM.
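
The 4-bit checkpoints mentioned above can be driven from Python. Below is a minimal sketch, assuming a recent `lmdeploy` release that exposes the high-level `pipeline` API; the model ID is a placeholder, so substitute an actual 4-bit checkpoint from the [lmdeploy Hub page](https://huggingface.co/lmdeploy):

```python
# Minimal sketch: chat with a 4-bit AWQ model through LMDeploy.
# Assumes a recent `lmdeploy` release that ships the `pipeline` API;
# the model ID below is a hypothetical placeholder, not a confirmed
# checkpoint name.
from lmdeploy import pipeline

pipe = pipeline("lmdeploy/llama2-chat-7b-w4")  # hypothetical 4-bit model ID
responses = pipe(["What does 4-bit weight quantization trade off?"])
print(responses[0].text)
```

If those assumptions hold, the same three lines also serve FP16 checkpoints; only the model ID changes.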

______________________________________________________________________