---
license: mit
---

# BLaIR-roberta-base

BLaIR, which is short for "**B**ridging **La**nguage and **I**tems for **R**etrieval and **R**ecommendation", is a series of language models pre-trained on the Amazon Reviews 2023 dataset.

BLaIR is grounded on pairs of *(item metadata, language context)*, enabling the models to:
* derive strong item text representations, for both recommendation and retrieval;
* predict the most relevant item given a simple or complex language context.

[[📑 Paper](https://arxiv.org/abs/2403.03952)] · [[💻 Code](https://github.com/hyp1231/AmazonReviews2023)] · [[🌐 Amazon Reviews 2023 Dataset](https://amazon-reviews-2023.github.io/)] · [[🤗 Hugging Face Datasets](https://huggingface.co/datasets/McAuley-Lab/Amazon-Reviews-2023)] · [[🔬 McAuley Lab](https://cseweb.ucsd.edu/~jmcauley/)]

## Model Details

- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** [roberta-base](https://huggingface.co/FacebookAI/roberta-base)
- **Repository:** [https://github.com/hyp1231/AmazonReviews2023](https://github.com/hyp1231/AmazonReviews2023)
- **Paper:** [https://arxiv.org/abs/2403.03952](https://arxiv.org/abs/2403.03952)

## Citation

If you find the Amazon Reviews 2023 dataset, BLaIR checkpoints, Amazon-C4 dataset, or our scripts/code helpful, please cite the following paper.

```bibtex
@article{hou2024bridging,
  title={Bridging Language and Items for Retrieval and Recommendation},
  author={Hou, Yupeng and Li, Jiacheng and He, Zhankui and Yan, An and Chen, Xiusi and McAuley, Julian},
  journal={arXiv preprint arXiv:2403.03952},
  year={2024}
}
```

## Contact

Please let us know if you encounter a bug or have any suggestions/questions by [filing an issue](https://github.com/hyp1231/AmazonReviews2023/issues/new) or emailing Yupeng Hou ([@hyp1231](https://github.com/hyp1231)) at [yphou@ucsd.edu](mailto:yphou@ucsd.edu).
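
## Example: Ranking Items by Embedding Similarity

The retrieval use case described at the top of this card (predicting the most relevant item for a language context) can be illustrated with a small, model-free sketch. It assumes that an encoder has already mapped the context and each item's metadata to vectors; the toy vectors, the item ids, and the helper names `cosine` and `rank_items` are hypothetical, and cosine similarity is chosen here only for illustration, not taken from the BLaIR code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_items(context_vec, item_vecs):
    """Return item ids sorted from most to least similar to the context."""
    scores = {item_id: cosine(context_vec, vec) for item_id, vec in item_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy embeddings (assumption): a real encoder would produce
# much longer vectors from item metadata and from the language context.
item_vecs = {
    "item_a": [0.9, 0.1, 0.0],
    "item_b": [0.1, 0.8, 0.3],
    "item_c": [0.0, 0.2, 0.9],
}
context_vec = [0.2, 0.9, 0.2]

ranking = rank_items(context_vec, item_vecs)
print(ranking)
```

In practice the vectors would come from the model itself rather than being written by hand; the ranking step stays the same regardless of how the embeddings are produced.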