TURKCELL/gibberish-sentence-detection-model-tr



	---
	license: mit
	language:
	- tr
	pipeline_tag: text-classification
	tags:
	- text-classification
	---

	## Model Description
	This model was fine-tuned from the [dbmdz/bert-base-turkish-128k-uncased](https://huggingface.co/dbmdz/bert-base-turkish-128k-uncased) model.

	This model was created to detect gibberish sentences such as "adssnfjnfjn".
	It is a simple binary classifier that labels a sentence as either gibberish or real.

	## Usage

	```python
	import numpy as np
	import torch
	from transformers import AutoModelForSequenceClassification, AutoTokenizer

	# Repository ID as shown on this model page.
	model_id = "TURKCELL/gibberish-sentence-detection-model-tr"

	device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
	model = AutoModelForSequenceClassification.from_pretrained(model_id)
	tokenizer = AutoTokenizer.from_pretrained(model_id, do_lower_case=True, use_fast=True)

	model.to(device)
	model.eval()  # disable dropout for inference


	def get_result_for_one_sample(model, tokenizer, device, sample):
	    # Map class indices to human-readable labels.
	    d = {
	        1: 'gibberish',
	        0: 'real'
	    }
	    # Tokenize the single sentence as a batch of one.
	    test_sample = tokenizer([sample], padding=True, truncation=True, max_length=256, return_tensors='pt').to(device)
	    # No gradients are needed at inference time.
	    with torch.no_grad():
	        output = model(**test_sample)
	    # Pick the class with the highest logit.
	    y_pred = np.argmax(output.logits.detach().to('cpu').numpy(), axis=1)
	    return d[int(y_pred[0])]


	sentence = "nabeer rdahdaajdajdnjnjf"
	result = get_result_for_one_sample(model, tokenizer, device, sentence)
	print(result)

	```
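
The usage example returns only the predicted label. If a confidence score is also useful, one option is to apply a softmax to the two logits. The sketch below is not part of the original model card: the helper name `logits_to_prediction` and the sample logits are hypothetical, and only the label mapping (0 is real, 1 is gibberish) comes from the usage example. It uses plain Python so it runs without downloading the model.

```python
import math

# Label mapping taken from the usage example: index 0 is "real", index 1 is "gibberish".
ID2LABEL = {0: "real", 1: "gibberish"}


def logits_to_prediction(logits):
    """Convert a list of raw class logits into (label, probability) via softmax."""
    # Subtract the maximum logit for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Index of the most probable class (the first index wins on ties).
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Hypothetical logits for illustration, not produced by the actual model.
label, prob = logits_to_prediction([-1.2, 2.3])
print(label, round(prob, 3))
```

With the real model, the logits for a single sentence could be obtained as `output.logits[0].tolist()` and passed to this helper. Whether the resulting probabilities are well calibrated is not stated in the model card.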