Sengil (Mert)

updated a model 20 days ago

Sengil/ABSA-Turkish-bert-based-small

Text Classification • Updated 20 days ago • 86 • 2

updated a dataset 20 days ago

Sengil/Turkish-ABSA-Wsynthetic

Viewer • Updated 20 days ago • 22.1k • 2

liked a model 23 days ago

turkish-nlp-suite/tr_core_news_trf

Token Classification • Updated Aug 12 • 312 • 14

reacted to gabrielchua's post with 👀 29 days ago

Post

1207

Sharing my first paper!

==
Large Language Models (LLMs) are powerful, but they're prone to off-topic misuse, where users push them beyond their intended scope. Think harmful prompts, jailbreaks, and misuse. So how do we build better guardrails?

Traditional guardrails rely on curated examples or classifiers. The problem?
⚠️ High false-positive rates
⚠️ Poor adaptability to new misuse types
⚠️ Require real-world data, which is often unavailable during pre-production

Our method skips the need for real-world misuse examples. Instead, we:
1️⃣ Define the problem space qualitatively
2️⃣ Use an LLM to generate synthetic misuse prompts
3️⃣ Train and test guardrails on this dataset

We apply this to the off-topic prompt detection problem, and fine-tune simple bi- and cross-encoder classifiers that outperform heuristics based on cosine similarity or prompt engineering.

Additionally, framing the problem as prompt relevance allows these fine-tuned classifiers to generalise to other risk categories (e.g., jailbreak, toxic prompts).
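
For context on the cosine-similarity heuristic mentioned above, here is a minimal sketch of what such a baseline could look like. It is not the paper's implementation: a bag-of-words count vector stands in for a real embedding model, and the function names, the example prompts, and the `0.1` threshold are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count word tokens (a toy stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_off_topic(system_prompt: str, user_prompt: str, threshold: float = 0.1) -> bool:
    """Flag the user prompt when its similarity to the system prompt is below the threshold."""
    score = cosine_similarity(bag_of_words(system_prompt), bag_of_words(user_prompt))
    return score < threshold


# Hypothetical prompts, chosen only to exercise both branches.
system_prompt = "Answer questions on tax filing deadlines."
print(is_off_topic(system_prompt, "When is the tax filing deadline this year?"))
print(is_off_topic(system_prompt, "Write me a poem about dragons."))
```

A heuristic like this depends heavily on the threshold and on surface vocabulary overlap, which is one reason a trained classifier may be preferable.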

Through this work, we also open-source our dataset (2M examples, ~50M+ tokens) and models.

paper: A Flexible Large Language Models Guardrail Development Methodology Applied to Off-Topic Prompt Detection (2411.12946)

artifacts: govtech/off-topic-guardrail-673838a62e4c661f248e81a4

reacted to maxiw's post with 👍 30 days ago

Post

2011

You can now try out computer use models from the hub to automate your local machine with https://github.com/askui/vision-agent. 💻

import time
from askui import VisionAgent

with VisionAgent() as agent:
    agent.tools.webbrowser.open_new("http://www.google.com")
    time.sleep(0.5)
    agent.click("search field in the center of the screen", model_name="Qwen/Qwen2-VL-7B-Instruct")
    agent.type("cats")
    agent.keyboard("enter")
    time.sleep(0.5)
    agent.click("text 'Images'", model_name="AskUI/PTA-1")
    time.sleep(0.5)
    agent.click("second cat image", model_name="OS-Copilot/OS-Atlas-Base-7B")

Currently these models are integrated via the Gradio Spaces API. I'm also planning to add local inference soon!

Currently supported:
- Qwen/Qwen2-VL-7B-Instruct
- Qwen/Qwen2-VL-2B-Instruct
- AskUI/PTA-1
- OS-Copilot/OS-Atlas-Base-7B


liked a Space about 1 month ago

Running on CPU Upgrade

125

🔥

HuggingFace Trending Board

reacted to csabakecskemeti's post with 👍 about 1 month ago

Post

1230

Some time ago, I built a predictive LLM router that routes chat requests between small and large LLMs based on prompt classification. It dynamically selects the most suitable model depending on the complexity of the user input, aiming for good performance while maintaining conversation context. I also fine-tuned a RoBERTa model to use with the package, but you can plug in any classifier of your choice.

Project's homepage:
https://devquasar.com/llm-predictive-router/
Pypi:
https://pypi.org/project/llm-predictive-router/
Model:
DevQuasar/roberta-prompt_classifier-v0.1
Training data:
DevQuasar/llm_router_dataset-synth
Git:
https://github.com/csabakecskemeti/llm_predictive_router_package

Feel free to check it out, and/or contribute.
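
The routing idea described in this post can be sketched roughly as follows. This is not the `llm-predictive-router` package's API: `route_request`, `toy_classifier`, the `"small"`/`"large"` labels, and the word-count rule are hypothetical stand-ins, and conversation context handling is left out.

```python
from typing import Callable, Dict


def route_request(
    prompt: str,
    classify: Callable[[str], str],
    models: Dict[str, Callable[[str], str]],
    default_label: str = "large",
) -> str:
    """Send the prompt to the model registered for its predicted label."""
    label = classify(prompt)
    # Fall back to the default model when the classifier returns an unknown label.
    handler = models.get(label, models[default_label])
    return handler(prompt)


def toy_classifier(prompt: str) -> str:
    """Hypothetical stand-in for a trained classifier: long prompts count as complex."""
    return "large" if len(prompt.split()) > 12 else "small"


# Placeholder model backends; real ones would call an inference endpoint.
models = {
    "small": lambda p: f"[small model] {p}",
    "large": lambda p: f"[large model] {p}",
}

print(route_request("What is 2 + 2?", toy_classifier, models))
```

Because the classifier is passed in as a plain callable, any prompt classifier could be swapped in without touching the routing logic.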

liked a model about 1 month ago

yeniguno/absa-turkish-bert-dbmdz

Text Classification • Updated Sep 22 • 63 • 4

reacted to ImranzamanML's post with 🔥 2 months ago

Post

1378

LoRA with code 🚀 using PEFT (parameter efficient fine-tuning)

LoRA (Low-Rank Adaptation)
LoRA adds low-rank matrices to specific layers and reduces the number of trainable parameters for efficient fine-tuning.

Code:
Please install these libraries first:
pip install peft
pip install datasets
pip install transformers

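from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    DataCollatorWithPadding,
    Trainer,
    TrainingArguments,
)
from peft import LoraConfig, TaskType, get_peft_model
from datasets import load_dataset

# Loading the pre-trained BERT model and its tokenizer
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Configuring the LoRA parameters
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,  # sequence classification; keeps the classifier head trainable
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
)

# Applying LoRA to the model
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# Loading dataset for classification
dataset = load_dataset("glue", "sst2")

# Tokenizing the raw "sentence" column so the Trainer receives model inputs
def tokenize(batch):
    return tokenizer(batch["sentence"], truncation=True, max_length=128)

train_dataset = dataset["train"].map(tokenize, batched=True)

# Setting the training arguments
training_args = TrainingArguments(
    output_dir="./results",
    per_device_train_batch_size=16,
    num_train_epochs=3,
    logging_dir="./logs",
)

# Creating a Trainer instance for fine-tuning
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    data_collator=DataCollatorWithPadding(tokenizer),  # pads each batch dynamically
)

# Finally we can fine-tune the model
trainer.train()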

LoRA adds low-rank matrices so that only a small portion of the model is fine-tuned, which reduces training overhead because fewer parameters are trained.
We can perform efficient fine-tuning with minimal impact on accuracy, and it's suitable for large models where full-precision training is still feasible.
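
To make the parameter savings concrete, here is a small worked example for a single weight matrix. It assumes a 768 x 768 projection (the hidden size of bert-base) and the `r=8`, `lora_alpha=16` settings used in this post; the helper name `lora_param_counts` is made up for illustration. LoRA replaces the update to a `d_out x d_in` matrix with two factors of shape `d_out x r` and `r x d_in`.

```python
from typing import Tuple


def lora_param_counts(d_out: int, d_in: int, r: int) -> Tuple[int, int]:
    """Return (full fine-tuning params, LoRA params) for one weight matrix."""
    full = d_out * d_in          # every entry of W is trainable
    lora = r * (d_out + d_in)    # B is d_out x r, A is r x d_in
    return full, lora


full, lora = lora_param_counts(d_out=768, d_in=768, r=8)
scaling = 16 / 8  # lora_alpha / r, the factor applied to the low-rank update

print(full)                    # 589824
print(lora)                    # 12288
print(f"{lora / full:.2%}")    # 2.08%
print(scaling)                 # 2.0
```

The count covers only one adapted matrix; the overall trainable fraction of a model also depends on which layers receive adapters and on any extra trainable heads.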