Model Card for richterdc/deepseek-coder-finetuned-tdd
This model is fine-tuned to help developers generate test cases from code or plain language descriptions. It is designed to support Test-Driven Development (TDD) by suggesting tests that can improve code quality.
Model Details
Model Description
This model has been fine-tuned to generate test cases for software code. It takes in code snippets or descriptions of functionality and suggests relevant tests. The model uses the Hugging Face Transformers library and is deployed as a Flask API. It is built for fast inference with GPU support and is intended to help developers by automating part of the TDD process.
- Developed by: Richter Dela Cruz
- Funded by [optional]: Angelo Richter L. Dela Cruz, Alyza Reynado, Gabriel Luis Bacosa, and Joseph Bryan Eusebio
- Shared by [optional]: Angelo Richter L. Dela Cruz
- Model type: Causal language model for code generation and understanding
- Language(s) (NLP): English (for code and comments)
- License: Not specified
- Finetuned from model [optional]: deepseek-ai/deepseek-coder-1.3b-instruct
Model Sources [optional]
- Repository: https://github.com/RichterDelaCruz/tdd-deployment
- Paper [optional]: In progress
- Demo [optional]: See the API demo instructions in the repository
Uses
Direct Use
The model can be used to generate test cases directly from code snippets or textual descriptions. This is useful for developers who want to quickly get ideas for tests to cover their code.
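For direct use outside the Flask API, the model can be loaded with the Transformers library. The following is a minimal sketch; the prompt and generation settings are illustrative and not prescribed by this card:

```python
# Hedged example: load richterdc/deepseek-coder-finetuned-tdd with Transformers
# and ask it for test cases. The prompt and generation settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "richterdc/deepseek-coder-finetuned-tdd"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # half precision for GPU inference
    device_map="auto",           # places the model on GPU if one is available
)

prompt = "Write pytest test cases for a Python function that adds two numbers."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```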
Downstream Use [optional]
The model can also be integrated into larger development pipelines or fine-tuned further for specific applications. For example, it can be used within continuous integration systems to suggest tests for new code changes.
Out-of-Scope Use
This model is not designed for generating security-critical tests or for replacing thorough human testing. It may not capture all edge cases and should not be solely relied upon for complete test coverage.
Bias, Risks, and Limitations
- The model may generate test cases that are too generic or miss specific edge cases.
- It might produce plausible-looking tests that require manual review.
- Its performance may vary depending on the complexity of the input code or description.
Recommendations
Users should always review the generated test cases before using them in production. Fine-tuning on domain-specific data is recommended to improve relevance and accuracy.
How to Get Started with the Model
Clone the Repository:
git clone https://github.com/RichterDelaCruz/tdd-deployment.git
cd tdd-deployment
Install Dependencies:
pip install -r requirements.txt
Run the Flask API:
python generate-test.py
Test the API:
Use curl or any other API client:
curl -X POST "http://localhost:8000/generate" \
  -H "Content-Type: application/json" \
  -d '{"input_text": "Write a Python function to add two numbers"}'
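The same request can also be sent from Python. This is a hedged sketch using the requests library against the endpoint shown above; the exact shape of the JSON response depends on generate-test.py, so inspect it rather than assuming keys:

```python
# Hedged sketch: call the Flask endpoint from Python with requests.
# The URL and payload match the curl example above; the JSON response
# structure depends on generate-test.py.
import requests

response = requests.post(
    "http://localhost:8000/generate",
    json={"input_text": "Write a Python function to add two numbers"},
    timeout=120,
)
response.raise_for_status()
print(response.json())
```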
Training Details
Training Data
The exact details of the training data are not provided. It likely consists of publicly available code repositories and associated test cases.
Training Procedure
The model was fine-tuned using standard practices for causal language models on a dataset of code and test cases.
Preprocessing [optional]
Preprocessing steps were applied to prepare the code and test case data, though specific details are not provided.
Training Hyperparameters
- Training regime: Standard causal-language-model fine-tuning (e.g., PyTorch with mixed precision); a hedged sketch follows below
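Because the exact hyperparameters are not documented, the sketch below only illustrates what standard mixed-precision fine-tuning with the Transformers Trainer could look like; the toy dataset, batch size, learning rate, and epoch count are placeholders, not the values actually used:

```python
# Hedged sketch of standard causal-LM fine-tuning with mixed precision.
# Hyperparameters and the toy dataset are illustrative placeholders; the
# actual data and settings used for this model are not documented.
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_id = "deepseek-ai/deepseek-coder-1.3b-instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# Toy stand-in for a corpus of code snippets paired with test cases.
examples = Dataset.from_dict({
    "text": [
        "def add(a, b):\n    return a + b\n\ndef test_add():\n    assert add(2, 3) == 5",
    ]
})
tokenized = examples.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

training_args = TrainingArguments(
    output_dir="deepseek-coder-finetuned-tdd",
    per_device_train_batch_size=2,   # placeholder
    gradient_accumulation_steps=8,   # placeholder
    learning_rate=2e-5,              # placeholder
    num_train_epochs=3,              # placeholder
    fp16=True,                       # mixed-precision training (requires a GPU)
    logging_steps=50,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```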
Speeds, Sizes, Times [optional]
The model is optimized for GPU inference and has been tested on hardware such as an NVIDIA RTX 3090; detailed throughput and size figures are not documented.
Evaluation
Testing Data, Factors & Metrics
Testing Data
Details about the testing dataset are not provided. Evaluation likely used code examples and corresponding expected test cases.
Factors
Evaluations may consider code complexity, coverage, and the correctness of the generated tests.
Metrics
Metrics might include improvements in test coverage or the accuracy of the suggested test cases, though specific metrics are not documented.
Results
Evaluation results are not comprehensively documented. Users are encouraged to evaluate the model based on their own codebases.
Summary
The model is effective at generating plausible test cases for a variety of code snippets, though manual review is recommended to ensure correctness and completeness.
Model Examination [optional]
No detailed interpretability or analysis work has been provided for this model.
Environmental Impact
Carbon emissions for model training and inference can be estimated using the Machine Learning Impact calculator.
- Hardware Type: RTX 3090 or similar GPU
- Hours used: Varies by deployment
- Cloud Provider: Vast.ai (or any provider with CUDA support)
- Compute Region: Not specified
- Carbon Emitted: Not specified (estimate using the ML Impact calculator)
Technical Specifications [optional]
Model Architecture and Objective
This model is based on a causal language model architecture, fine-tuned specifically for code generation and test case creation. Its objective is to assist developers in following Test-Driven Development practices.
Compute Infrastructure
The model is deployed as a Flask API using gunicorn for scalability, with PyTorch handling model inference.
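For illustration, here is a simplified, hedged sketch of a Flask endpoint in the spirit of the deployment described above; the actual generate-test.py in the repository may differ in route names, model loading, and response format:

```python
# Hedged sketch of a Flask inference endpoint in the spirit of generate-test.py.
# The real script in the repository may load the model and shape responses differently.
import torch
from flask import Flask, jsonify, request
from transformers import AutoModelForCausalLM, AutoTokenizer

app = Flask(__name__)

MODEL_ID = "richterdc/deepseek-coder-finetuned-tdd"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,  # half precision for GPU inference
    device_map="auto",
)

@app.route("/generate", methods=["POST"])
def generate():
    # Matches the request shape used in the curl example: {"input_text": "..."}
    input_text = request.get_json(force=True).get("input_text", "")
    inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256)
    decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return jsonify({"generated_text": decoded})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)
```

In production, the card describes serving such an app with gunicorn rather than Flask's built-in development server.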
Hardware
The model runs on both CPU and GPU, with best performance observed on GPUs.
Software
Built using Python, Flask, PyTorch, and Hugging Face Transformers.
Citation [optional]
BibTeX:
@misc{richterdc2025tdd,
author = {Richter Dela Cruz},
title = {richterdc/deepseek-coder-finetuned-tdd},
year = {2025},
publisher = {GitHub},
howpublished = {\url{https://github.com/RichterDelaCruz/tdd-deployment}}
}
APA:
Richter Dela Cruz. (2025). richterdc/deepseek-coder-finetuned-tdd. GitHub. Retrieved from https://github.com/RichterDelaCruz/tdd-deployment
Glossary [optional]
- Test-Driven Development (TDD): A development approach where tests are written before the code to ensure functionality.
- Flask: A Python web framework used for building APIs.
- Transformers: A library by Hugging Face for working with state-of-the-art language models.
More Information [optional]
For further details, visit the repository.
Model Card Authors [optional]
Angelo Richter L. Dela Cruz, Alyza Reynado, Gabriel Luis Bacosa, and Joseph Bryan Eusebio
Model Card Contact
For inquiries or contributions, please reach out via GitHub.