---
base_model:
- microsoft/Phi-3.5-mini-instruct
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- microsoft
- phi-3.5-mini-instruct
- law
- contracts
- mergers&acquisitions
---
# Phi-3.5-Law
## Model Summary
**Model Name:** Phi-3.5-Law
**Model Type:** Text Generation
**Hugging Face Model ID:** DanielShaw98/phi-3.5-law
**Phi-3.5-Law** is a specialised model for processing merger and acquisition (M&A) contracts. It is fine-tuned from the `microsoft/Phi-3.5-mini-instruct` model and designed to assist lawyers in identifying specific clauses within contracts. The base model belongs to the Phi-3 family, which was trained with a focus on high-quality, reasoning-dense data.
**Developed by:** DanielShaw98
**License:** Apache-2.0
**Finetuned from model:** `microsoft/Phi-3.5-mini-instruct`
## Known Issues
### Model Behavior in Different Environments
**Google Colab:** The model has been tested successfully on Google Colab, where it performs as expected and provides accurate results.
**Local Environment:** When downloaded and run locally, the model may produce gibberish or otherwise unexpected output. The cause has not been confirmed; it may stem from differences in runtime environment or configuration between Google Colab and the local setup.
If you encounter issues with the model's performance in a local environment, please ensure that all dependencies are correctly installed and that the runtime environment is properly configured. If the problem persists, consider reaching out for support or checking for any updates or patches.
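One way to narrow down an environment difference is to record library versions in both Google Colab and the local setup and compare them. The sketch below assumes that `torch`, `transformers`, `accelerate` and `tokenizers` are the packages worth comparing; that list is a guess based on the server script in this card, not a confirmed diagnosis of the issue.

```python
import platform
from importlib.metadata import PackageNotFoundError, version

# Assumed list of packages worth comparing; adjust it to your setup.
PACKAGES = ("torch", "transformers", "accelerate", "tokenizers")


def environment_report(packages=PACKAGES):
    """Return Python and package versions for side-by-side comparison."""
    report = {"python": platform.python_version()}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for name, ver in environment_report().items():
        print(f"{name}: {ver}")
```

Running this in both environments and comparing the two outputs shows quickly whether a version mismatch is a plausible cause.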
## Intended Uses
### Primary Use Cases
The Phi-3.5-Law model is intended for legal and contract analysis applications, where it helps with:
- Identifying specific clauses in M&A contracts
- Providing explanations and details about these clauses
The model is useful for legal professionals who need to quickly locate and understand relevant contract clauses.
### Use Case Considerations
While the model aims to provide accurate results, it may occasionally produce inaccuracies. The AI tool should not be considered legal advice or a substitute for professional judgement. Users should carefully review and verify the results before making any legal decisions or actions.
## Setup and Usage
### Setting Up the Server
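One way to serve the model is behind a FastAPI server. The example script below loads the model into a `transformers` pipeline, exposes a `/generate` endpoint, and reads `HUGGINGFACE_TOKEN` from a `.env` file to use as the bearer token that clients must send: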
```python
from fastapi import FastAPI, HTTPException, Depends
from pydantic import BaseModel
from transformers import pipeline
import torch
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
import os
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

app = FastAPI()
token_auth_scheme = HTTPBearer()

# Retrieve the Hugging Face token from environment variable
HUGGINGFACE_TOKEN = os.getenv("HUGGINGFACE_TOKEN")

# Initialize the pipeline
model_id = "DanielShaw98/phi-3.5-law"
model_pipeline = pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto"  # Requires accelerate to work properly
)

class Message(BaseModel):
    role: str
    content: str

class RequestBody(BaseModel):
    messages: list[Message]
    max_new_tokens: int

# Function to verify token
async def verify_token(credentials: HTTPAuthorizationCredentials = Depends(token_auth_scheme)):
    token = credentials.credentials
    if token != HUGGINGFACE_TOKEN:
        raise HTTPException(status_code=401, detail="Invalid or missing token")
    return token

@app.post("/generate")
async def generate_text(request_body: RequestBody, token: str = Depends(verify_token)):
    try:
        outputs = model_pipeline(
            [{"role": msg.role, "content": msg.content} for msg in request_body.messages],
            max_new_tokens=request_body.max_new_tokens,
        )
        return {"generated_text": outputs[0]["generated_text"]}
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
```
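The token check in `verify_token` compares strings with `!=`. A possible hardening, sketched below, uses the standard library's `secrets.compare_digest` for a constant-time comparison and makes the case of an unset `HUGGINGFACE_TOKEN` explicit. The helper name `token_is_valid` is hypothetical and not part of the original script.

```python
import secrets
from typing import Optional


def token_is_valid(presented: str, expected: Optional[str]) -> bool:
    """Return True only if a token is configured and matches the presented one."""
    if not expected:
        # No token configured on the server: reject rather than accept anything.
        return False
    # compare_digest avoids leaking the match position through timing differences.
    return secrets.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```

Inside `verify_token`, the condition `if token != HUGGINGFACE_TOKEN` could then become `if not token_is_valid(token, HUGGINGFACE_TOKEN)`.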
### Making a Call to the Server
To call the server and test the model, you can use the following JavaScript code, which sends an example chunk from a contract:
```javascript
const axios = require('axios');
require('dotenv').config(); // Make sure to install dotenv

// Retrieve the token from environment variables
const HUGGINGFACE_TOKEN = process.env.HUGGINGFACE_TOKEN;

const requestBody = {
  messages: [
    { role: "system",
      content: `You are Legal AI. Your job is to help lawyers by identifying specific clauses in merger and acquisition contracts.
Please identify the desired clauses and also provide an explanation for this choice based on the prompt.
If the requested clause cannot be found, please respond with 'nothing found.'
Otherwise please provide a response in the following JSON format:
{
  "entries": [
    {
      "page": <page_number>,
      "line_start": <clause_start_line_within_chunk>,
      "line_end": <clause_end_line_within_chunk>,
      "clause": <clause_text>,
      "explanation": <explanation_text>
    }
  ]
}`
    },
    { role: "user",
      content: `Prompt: Review the provided text and identify all clauses related to termination rights and conditions.
Return the exact start and end line numbers of the relevant clause within the chunk. If you cannot find anything relevant, please respond with 'nothing found.'
Please provide details in the following JSON format:
{ "relevant_chunks_found": <number>, "entries": [ { "page": <page_number>,
"line_start": <clause_start_line_within_chunk>, "line_end": <clause_end_line_within_chunk>,
"clause": <clause_text>, "explanation": <explanation_text> } ]}\n\n
Chunk: Transaction but all the conditions therein have been satisfied or complied with, \nor confirmed no such clearance is required in accordance with the applicable \ncompetition legislation, or has not objected to the Transaction within the time \nperiod prescribed by law.
\n227876-4-1460-v9.0 \n- 30 - \n70-40688062 \n \nFor the purposes of clauses 4.1.10 to 4.1.12 (inclusive) only, "Transaction" shall \nbe limited to the part or parts of the Transaction required to be notified to the \nCommission, COFECE or the competent competition authority of Vietnam (as \nappropriate).
\nNo material breach \n4.1.13 no Purchaser Covenant Breach and no Purchaser Material Breach having \noccurred; and \n4.1.14 no Chrysaor Covenant Breach and no Chrysaor Material Breach having \noccurred. \n4.2 \nAny Regulatory Condition or Antitrust Condition may be waived at any time on or \nbefore 17.00 on the Longstop Date by written agreement of the Company and the \nPurchaser.
Any Chrysaor Material Breach may be waived at any time on or before \n17.00 on the Longstop Date by the Purchaser by notice in writing to the Company. Any \nPurchaser Material Breach may be waived at any time on or before 17.00 on the \nLongstop Date by the Company by notice in writing to the Purchaser.
\n4.3 \nIf, at any time, any party becomes aware of a fact, matter or circumstance that could \nreasonably be expected to prevent or delay the satisfaction of a Condition, it shall \ninform the others of the fact, matter or circumstance as soon as reasonably practicable.
\n4.4 \nIf a Condition has not been satisfied or (if capable of waiver) waived by 17.00 on the \nLongstop Date or becomes impossible to satisfy before that time, either the \nHarbour/Chrysaor Parties or the Purchaser may terminate this Agreement by notice in \nwriting to that effect to the other, save that the Harbour/Chrysaor Parties may only \nterminate this Agreement: (i) on the basis of the Whitewash Condition not having been \nsatisfied by 17.00 on the Longstop Date or having become impossible to satisfy before
\nthat time; and (ii) on the basis of the Circular Condition and/or the FCA Admission \nCondition not having been satisfied by 17.00 on the Longstop Date or having become \nimpossible to satisfy before that time, in each case, only if the Harbour/Chrysaor Parties \nhave complied with the relevant provisions of clause 5 and/or the Purchaser has not \ncomplied with the relevant provisions of clause 5. \n4.5\n\n Chunk Meta-Data:\nPage Start: 30\n Page End: 31\nLine Start: 1405\n Line End: 1445`
    }
  ],
  max_new_tokens: 256
};

axios.post('http://localhost:8000/generate', requestBody, {
  headers: {
    'Authorization': `Bearer ${HUGGINGFACE_TOKEN}`,
    'Content-Type': 'application/json'
  }
})
  .then(response => {
    console.log('Response:', response.data);
  })
  .catch(error => {
    console.error('Error:', error.response ? error.response.data : error.message);
  });
```
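The server returns whatever the pipeline puts in `generated_text`. With chat-style input, recent `transformers` versions are expected to return the whole conversation as a list of messages with the assistant reply last, while other versions may return a plain string. The Python sketch below assumes either shape can occur. It extracts the reply, then either parses the JSON described in the system prompt or treats 'nothing found' as an empty result. The function name `parse_reply` and its return convention are illustrative and not part of the model or the server.

```python
import json


def parse_reply(response_data: dict) -> list[dict]:
    """Extract clause entries from a /generate response; [] means nothing found."""
    generated = response_data["generated_text"]
    # Chat input: assume a list of {"role", "content"} dicts, assistant reply last.
    if isinstance(generated, list):
        generated = generated[-1]["content"]
    text = generated.strip()

    # Keep only the outermost JSON object in case the model adds surrounding text.
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        return json.loads(text[start : end + 1]).get("entries", [])

    if "nothing found" in text.lower():
        return []
    raise ValueError("Model reply contained neither JSON nor 'nothing found'")
```

A reply cut off by a small `max_new_tokens` value may not be valid JSON; in that case `json.loads` raises `json.JSONDecodeError`, which is a subclass of `ValueError`, so callers can handle both failure modes with one `except ValueError`.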
## Disclaimer
This AI tool is designed to assist lawyers in identifying specific clauses within contracts. While the AI strives to provide accurate and relevant results, it may occasionally make mistakes or produce inaccuracies. The information provided by this tool should not be considered legal advice or a substitute for professional judgment. Users are strongly encouraged to carefully review and verify the results before relying on them for any legal decisions or actions. The responsibility for the interpretation and application of the contract clauses remains solely with the user.
## Repository
For dataset creation and further details, please visit the [data preparation repository](https://github.com/DanielShaw98/data-prep).