|
Hugging Face has not published a standalone PyTorch `SmolLM2_360M_model.py` that can load, fine-tune, and run inference with the model weights and config released at https://huggingface.co/HuggingFaceTB/SmolLM2-360M/, so I have attempted to write a PyTorch `model.py` that can load the published weights and config and at least run inference. Once a functioning PyTorch `model.py` is built, it may be possible to export a TorchScript version of the SmolLM2 model that can run on non-Python targets in edge devices, such as MPUs, RISC machines, or smartphones. The `SmolLM2_360M_model.py` script runs but is unable to load the safetensors data. Here is the error I encountered:
|
|
|
C:\Users\User\OneDrive\Desktop\SmolLM2>python SmolLM2_360M_model_debugging.py |
|
Warning: SentencePiece not found, using rudimentary BPE tokenizer. Install SentencePiece for better performance. |
|
|
|
A module that was compiled using NumPy 1.x cannot be run in |
|
NumPy 2.1.3 as it may crash. To support both 1.x and 2.x |
|
versions of NumPy, modules must be compiled with NumPy 2.0. |
|
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'. |
|
|
|
If you are a user of the module, the easiest solution will be to |
|
downgrade to 'numpy<2' or try to upgrade the affected module. |
|
We expect that some modules will need time to support NumPy 2. |
|
|
|
Traceback (most recent call last): File "C:\Users\User\OneDrive\Desktop\SmolLM2\SmolLM2_360M_model_debugging.py", line 470, in <module> |
|
model = SmolLM2_360M(config_path) |
|
File "C:\Users\User\OneDrive\Desktop\SmolLM2\SmolLM2_360M_model_debugging.py", line 243, in __init__ |
|
self.embed_tokens = nn.Embedding(self.vocab_size, self.hidden_size) |
|
File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\sparse.py", line 142, in __init__ |
|
self.weight = Parameter(torch.empty((num_embeddings, embedding_dim), **factory_kwargs), |
|
C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\sparse.py:142: UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at ..\torch\csrc\utils\tensor_numpy.cpp:84.) |
|
self.weight = Parameter(torch.empty((num_embeddings, embedding_dim), **factory_kwargs), |
|
An error occurred while loading weights: File does not contain tensor lm_head.weight |
|
|
|
C:\Users\User\OneDrive\Desktop\SmolLM2> |
|
|
|
So, what is the story with the safetensors error "File does not contain tensor lm_head.weight"? Why does the `model.safetensors` file not contain the tensor `lm_head.weight`?

Is there a Python script for inspecting the safetensors file?
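On the inspection question: a safetensors file begins with an 8-byte little-endian header length followed by a JSON header that lists every tensor's name, dtype, and shape, so the names can be read with the standard library alone. The sketch below is a minimal reader under that assumption; `list_safetensors_tensors` is a name made up for this example, not part of the `safetensors` package. If the listing shows `model.embed_tokens.weight` but no `lm_head.weight`, a likely explanation (an assumption to check against `config.json`) is that `tie_word_embeddings` is `true`, in which case the output head shares the embedding matrix and is not stored as a separate tensor.

```python
import json
import struct
import sys


def list_safetensors_tensors(path):
    """Return {tensor_name: (dtype, shape)} read from a safetensors header.

    Only the JSON header is parsed; no tensor data is loaded into memory.
    """
    with open(path, "rb") as f:
        # First 8 bytes: header size as an unsigned little-endian 64-bit integer.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return {name: (info["dtype"], info["shape"]) for name, info in header.items()}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python inspect_safetensors.py model.safetensors
    tensors = list_safetensors_tensors(sys.argv[1])
    for name, (dtype, shape) in sorted(tensors.items()):
        print(f"{name}\t{dtype}\t{shape}")
    print("lm_head.weight present:", "lm_head.weight" in tensors)
```

Running it against the downloaded `model.safetensors` prints one line per stored tensor, which also makes it easy to compare the checkpoint's names with the parameter names in `SmolLM2_360M_model.py`.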
|
|
|
|
|
# Help Needed: Building a Standalone PyTorch SmolLM2-360M Model |
|
|
|
The Hugging Face Hub hosts the SmolLM2-360M model ([HuggingFaceTB/SmolLM2-360M](https://huggingface.co/HuggingFaceTB/SmolLM2-360M/)), but currently lacks a standalone PyTorch `model.py` file for loading, fine-tuning, and inference. This limits the model's usability outside the Hugging Face ecosystem. |
|
|
|
I've started creating a `SmolLM2_360M_model.py` file to address this gap, aiming for compatibility with all SmolLM2 models. The initial goal is to enable inference using the published weights and config. A successful PyTorch implementation would pave the way for exporting a TorchScript version, broadening accessibility to non-Python environments like microcontrollers, RISC-V machines, smartphones, and other edge devices. |
|
|
|
**The Challenge:** |
|
|
|
While my `SmolLM2_360M_model.py` runs, it encounters problems loading the `safetensors` data. I'm receiving the following error: |
|
|
|
```
C:\Users\User\OneDrive\Desktop\SmolLM2>python SmolLM2_360M_model_debugging.py
[...]
UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at ..\torch\csrc\utils\tensor_numpy.cpp:84.)
[...]
An error occurred while loading weights: File does not contain tensor lm_head.weight
```
|
|
|
**Call to Action:** |
|
|
|
I'm seeking assistance from experienced PyTorch developers to debug the loading issue and complete the `SmolLM2_360M_model.py` implementation. Your contributions will significantly expand the potential applications of SmolLM2. |
|
|
|
**Specific Areas Where Help is Needed:** |
|
|
|
* **Safetensors Loading:** Resolving the error encountered when loading the model weights from the safetensors file. |
|
* **Model Architecture Verification:** Confirming the correctness of the PyTorch model architecture based on the config file. |
|
* **Inference Implementation:** Ensuring the model can perform inference correctly. |
|
* **Fine-tuning Support (Optional):** Adding functionality for fine-tuning the model on downstream tasks. |
|
* **TorchScript Export (Optional):** Enabling export to TorchScript for deployment on resource-constrained devices. |
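For the safetensors loading item, one possible cause worth ruling out is weight tying: when `config.json` sets `tie_word_embeddings` to `true`, checkpoints typically store only the embedding matrix and expect the loader to reuse it for the output head. The sketch below works on a plain name-to-tensor mapping, such as the dict returned by `safetensors.torch.load_file`. The helper name `add_tied_lm_head` and the key `model.embed_tokens.weight` are assumptions for illustration and should be checked against the actual tensor names in the file.

```python
def add_tied_lm_head(state_dict, config,
                     embed_key="model.embed_tokens.weight",
                     head_key="lm_head.weight"):
    """Alias the output head to the embedding matrix when weights are tied.

    state_dict: mapping of tensor name -> tensor (any object works here).
    config:     dict parsed from config.json.
    Returns a new dict; the input mapping is left unmodified.
    """
    patched = dict(state_dict)
    if head_key in patched:
        return patched  # head is stored explicitly, nothing to do
    if not config.get("tie_word_embeddings", False):
        raise KeyError(f"{head_key} is missing and weights are not tied")
    # Same object on purpose: tied weights share one tensor.
    patched[head_key] = patched[embed_key]
    return patched


# Example with stand-in values instead of real tensors
weights = {"model.embed_tokens.weight": [[0.1, 0.2], [0.3, 0.4]]}
patched = add_tied_lm_head(weights, {"tie_word_embeddings": True})
print(sorted(patched))  # ['lm_head.weight', 'model.embed_tokens.weight']
```

With PyTorch, the patched dict could then be passed to `model.load_state_dict(...)`, provided the module's parameter names match the keys; if the custom model names its layers differently, the keys would need to be renamed first.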
|
|
|
**How to Contribute:** |
|
|
|
1. Fork the repository containing the `SmolLM2_360M_model.py` file. |
|
2. Debug the code and implement the missing functionality. |
|
3. Submit a pull request with your changes. |
|
|
|
By working together, we can make SmolLM2 more accessible and empower a wider range of users to leverage its capabilities. Thank you for your time and expertise! |
|
|
|
|
|
P.S. Here's a technical breakdown of the process for creating a TorchScript version of the model and deploying it to various platforms: |
|
|
|
**1. TorchScript Creation:** |
|
|
|
* **Trace or Script:** TorchScript offers two ways to convert your PyTorch model: tracing and scripting. Tracing records the operations performed on example inputs, creating a static graph. Scripting directly parses the model code, supporting control flow. Scripting is preferred if your model uses dynamic control flow. |
|
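    ```python
    import torch

    model.eval()  # trace/script in inference mode

    # Tracing example: a language model takes integer token IDs, not image tensors.
    # Shape (batch=1, seq_len=16) is arbitrary; `model.vocab_size` is the attribute set in `__init__`.
    example_input = torch.randint(0, model.vocab_size, (1, 16), dtype=torch.long)
    traced_model = torch.jit.trace(model, example_input)

    # Scripting example
    scripted_model = torch.jit.script(model)
    ```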
|
|
|
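* **Optimization (Optional):** `torch.jit.optimize_for_inference` applies optimization passes to a scripted module for inference. It freezes the module first if it is not already frozen, so put the module in eval mode beforehand. Per the PyTorch documentation, the result bakes in build-specific settings and serialization after this call is not guaranteed, so treat it as an in-process step applied after loading rather than before saving.

    ```python
    scripted_model.eval()
    optimized_model = torch.jit.optimize_for_inference(scripted_model)
    ```

* **Saving:** Save the scripted (or traced) TorchScript model to a file.

    ```python
    torch.jit.save(scripted_model, "smolLM2_360m.pt")
    ```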
|
|
|
**2. Deployment to Target Environments:** |
|
|
|
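* **C++:** LibTorch, the C++ distribution of PyTorch, can load and execute TorchScript models. Integrate LibTorch into your C++ application by compiling your code and linking against it. LibTorch expects a full operating system and a substantial amount of memory, so this route fits application-class processors (for example, RISC-V or Arm boards running Linux, which may require building LibTorch from source) better than bare-metal microcontrollers.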
|
|
|
* **Android/iOS:** Use the respective PyTorch Mobile libraries for these platforms. These libraries offer optimized runtime environments for executing TorchScript models within mobile applications. |
|
|
|
* **Other Edge Devices:** Depending on the device and its capabilities, explore options like using a custom runtime, or if available, a cross-compilation toolchain to target the device from your development environment. |
|
|
|
**Example C++ Deployment (Simplified):** |
|
|
|
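```c++
#include <torch/script.h>

#include <iostream>
#include <vector>

int main() {
  try {
    // Load the TorchScript model saved from Python
    torch::jit::Module module = torch::jit::load("smolLM2_360m.pt");
    module.eval();

    // Prepare input: a (batch=1, seq_len=16) tensor of token IDs.
    // Placeholder values only; real IDs come from the tokenizer and must be
    // below the model's vocabulary size.
    torch::Tensor input_tensor = torch::randint(0, 1000, {1, 16}, torch::kLong);

    // Run inference without tracking gradients
    torch::NoGradGuard no_grad;
    std::vector<torch::jit::IValue> inputs;
    inputs.push_back(input_tensor);
    // Assumes forward() returns a single logits tensor
    torch::Tensor output = module.forward(inputs).toTensor();

    // Process output
    std::cout << "logits shape: " << output.sizes() << std::endl;
  } catch (const c10::Error& e) {
    std::cerr << "TorchScript error: " << e.what() << std::endl;
    return 1;
  }
  return 0;
}
```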
|
|
|
**Key Considerations:** |
|
|
|
* **Hardware Limitations:** Microcontrollers and other edge devices have limited resources. Model size and complexity may need adjustments (quantization, pruning) for optimal performance. |
|
|
|
* **Platform-Specific Tooling:** Each target platform has its own build system and toolchain. Familiarize yourself with these tools for successful deployment. |
|
|
|
* **Cross-Compilation:** If building directly on the target device isn't feasible, cross-compilation is necessary. This typically involves setting up a cross-compilation toolchain for the target architecture. |
|
|
|
* **Debugging:** Debugging on edge devices can be challenging. Thoroughly testing the TorchScript model within a more accessible environment (e.g., your development machine) before deploying is essential. |
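To put the hardware-limitations point in rough numbers: weight storage scales with parameter count times bytes per parameter. The sketch below assumes about 360 million parameters (taken from the model name, not an exact count) and ignores activations, the KV cache, and runtime overhead, so real usage will be higher.

```python
# Approximate parameter count, inferred from the "360M" in the model name.
NUM_PARAMS = 360_000_000

# Bytes per parameter for common storage formats.
BYTES_PER_PARAM = {"float32": 4, "float16/bfloat16": 2, "int8": 1}


def weight_footprint_mb(num_params, bytes_per_param):
    """Weights-only memory in megabytes (1 MB = 10**6 bytes)."""
    return num_params * bytes_per_param / 1e6


for dtype, nbytes in BYTES_PER_PARAM.items():
    print(f"{dtype:>18}: {weight_footprint_mb(NUM_PARAMS, nbytes):7.0f} MB")
```

By this estimate, even 8-bit weights need several hundred megabytes, which suggests that devices with only kilobytes or a few megabytes of RAM are out of reach without a much smaller model.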
|
|
|
|
|
Consult the official PyTorch and LibTorch documentation for platform-specific instructions and best practices.
|
|
|
|
|
|
|
|
|
|
|
--- |
|
license: apache-2.0 |
|
--- |
|
|