2025-01-09 15:54:08,215 - hf_validation - WARNING - No .env file found. Fine if you're on Huggingface, but you need one to run locally on your PC.
2025-01-09 15:54:08,215 - hf_validation - ERROR - No HF_TOKEN found in environment variables
2025-01-09 15:54:08,215 - main - INFO - Starting LLM API server
2025-01-09 15:54:08,216 - llm_api - INFO - Initializing LLM API
2025-01-09 15:54:08,216 - llm_api - INFO - LLM API initialized successfully
2025-01-09 15:54:08,216 - api_routes - INFO - Router initialized with LLM API instance
2025-01-09 15:54:08,218 - main - INFO - FastAPI application created successfully
2025-01-09 16:46:10,118 - api_routes - INFO - Received request to download model: microsoft/phi-4
2025-01-09 16:46:10,118 - llm_api - INFO - Starting download of model: microsoft/phi-4
2025-01-09 16:46:10,118 - llm_api - INFO - Enabling stdout logging for download
2025-01-09 17:00:32,400 - llm_api - INFO - Disabling stdout logging
2025-01-09 17:00:32,400 - llm_api - INFO - Saving model to main/models/phi-4
2025-01-09 17:02:39,928 - llm_api - INFO - Successfully downloaded model: microsoft/phi-4
2025-01-09 17:02:41,075 - api_routes - INFO - Successfully downloaded model: microsoft/phi-4
2025-01-09 17:02:41,080 - api_routes - INFO - Received request to initialize model: huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated
2025-01-09 17:02:41,080 - llm_api - INFO - Initializing generation model: huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated
2025-01-09 17:02:41,081 - llm_api - INFO - Loading model from source: huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated
2025-01-09 17:02:41,377 - llm_api - ERROR - Failed to initialize generation model huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated: Using `low_cpu_mem_usage=True` or a `device_map` requires Accelerate: `pip install 'accelerate>=0.26.0'`
2025-01-09 17:02:41,377 - api_routes - ERROR - Error initializing model: Using `low_cpu_mem_usage=True` or a `device_map` requires Accelerate: `pip install 'accelerate>=0.26.0'`
2025-01-09 17:11:25,843 - hf_validation - WARNING - No .env file found. Fine if you're on Huggingface, but you need one to run locally on your PC.
2025-01-09 17:11:25,843 - hf_validation - ERROR - No HF_TOKEN found in environment variables
2025-01-09 17:11:25,843 - main - INFO - Starting LLM API server
2025-01-09 17:11:25,843 - llm_api - INFO - Initializing LLM API
2025-01-09 17:11:25,844 - llm_api - INFO - LLM API initialized successfully
2025-01-09 17:11:25,844 - api_routes - INFO - Router initialized with LLM API instance
2025-01-09 17:11:25,846 - main - INFO - FastAPI application created successfully
2025-01-09 17:11:38,299 - api_routes - INFO - Received request to initialize model: huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated
2025-01-09 17:11:38,299 - llm_api - INFO - Initializing generation model: huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated
2025-01-09 17:11:38,299 - llm_api - INFO - Loading model from source: huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated
2025-01-09 17:11:38,487 - llm_api - ERROR - Failed to initialize generation model huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated: Using `bitsandbytes` 8-bit quantization requires the latest version of bitsandbytes: `pip install -U bitsandbytes`
2025-01-09 17:11:38,487 - api_routes - ERROR - Error initializing model: Using `bitsandbytes` 8-bit quantization requires the latest version of bitsandbytes: `pip install -U bitsandbytes`
2025-01-09 17:12:48,606 - hf_validation - WARNING - No .env file found. Fine if you're on Huggingface, but you need one to run locally on your PC.
2025-01-09 17:12:48,606 - hf_validation - ERROR - No HF_TOKEN found in environment variables
2025-01-09 17:12:48,606 - main - INFO - Starting LLM API server
2025-01-09 17:12:48,606 - llm_api - INFO - Initializing LLM API
2025-01-09 17:12:48,606 - llm_api - INFO - LLM API initialized successfully
2025-01-09 17:12:48,606 - api_routes - INFO - Router initialized with LLM API instance
2025-01-09 17:12:48,608 - main - INFO - FastAPI application created successfully
2025-01-09 17:12:59,453 - api_routes - INFO - Received request to initialize model: huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated
2025-01-09 17:12:59,453 - llm_api - INFO - Initializing generation model: huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated
2025-01-09 17:12:59,453 - llm_api - INFO - Loading model from source: huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated
2025-01-09 17:12:59,628 - llm_api - ERROR - Failed to initialize generation model huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated: CUDA is required but not available for bitsandbytes. Please consider installing the multi-platform enabled version of bitsandbytes, which is currently a work in progress. Please check currently supported platforms and installation instructions at https://huggingface.co/docs/bitsandbytes/main/en/installation#multi-backend
2025-01-09 17:12:59,628 - api_routes - ERROR - Error initializing model: CUDA is required but not available for bitsandbytes. Please consider installing the multi-platform enabled version of bitsandbytes, which is currently a work in progress. Please check currently supported platforms and installation instructions at https://huggingface.co/docs/bitsandbytes/main/en/installation#multi-backend
2025-01-09 17:14:44,390 - api_routes - INFO - Received request to initialize model: huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated
2025-01-09 17:14:44,390 - llm_api - INFO - Initializing generation model: huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated
2025-01-09 17:14:44,390 - llm_api - INFO - Loading model from source: huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated
2025-01-09 17:14:53,032 - llm_api - ERROR - Failed to initialize generation model huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated: CUDA is required but not available for bitsandbytes. Please consider installing the multi-platform enabled version of bitsandbytes, which is currently a work in progress. Please check currently supported platforms and installation instructions at https://huggingface.co/docs/bitsandbytes/main/en/installation#multi-backend
2025-01-09 17:14:53,032 - api_routes - ERROR - Error initializing model: CUDA is required but not available for bitsandbytes. Please consider installing the multi-platform enabled version of bitsandbytes, which is currently a work in progress. Please check currently supported platforms and installation instructions at https://huggingface.co/docs/bitsandbytes/main/en/installation#multi-backend
2025-01-09 17:15:14,956 - api_routes - INFO - Received request to initialize model: microsoft/phi-4
2025-01-09 17:15:14,956 - llm_api - INFO - Initializing generation model: microsoft/phi-4
2025-01-09 17:15:14,956 - llm_api - INFO - Loading model from local path: main/models/phi-4
2025-01-09 17:15:14,965 - llm_api - ERROR - Failed to initialize generation model microsoft/phi-4: CUDA is required but not available for bitsandbytes. Please consider installing the multi-platform enabled version of bitsandbytes, which is currently a work in progress. Please check currently supported platforms and installation instructions at https://huggingface.co/docs/bitsandbytes/main/en/installation#multi-backend
2025-01-09 17:15:14,965 - api_routes - ERROR - Error initializing model: CUDA is required but not available for bitsandbytes. Please consider installing the multi-platform enabled version of bitsandbytes, which is currently a work in progress. Please check currently supported platforms and installation instructions at https://huggingface.co/docs/bitsandbytes/main/en/installation#multi-backend