	---
	base_model: distilbert-base-uncased
	model-index:
	- name: ojobert
	  results: []
	license: mit
	language:
	- en
	widget:
	- text: Would you like to join a major [MASK] company?
	tags:
	- jobs
	---

	_Nesta, the UK's innovation agency, has been scraping online job adverts since 2021 and building algorithms to extract and structure information as part of the [Open Jobs Observatory](https://www.nesta.org.uk/project/open-jobs-observatory/) project._

	_Although we are unable to share the raw data openly, we aim to open source our models, algorithms and tools so that anyone can use them for their own research and analysis._

	## 📟 About

	This model starts from a `distilbert-base-uncased` checkpoint and is further trained on 100k sentences from scraped online job postings, collected as part of the Open Jobs Observatory.

	## 🖨️ Use

	To use the model:

	```
	from transformers import pipeline

	# Load the model and its tokenizer from the Hugging Face Hub as a fill-mask pipeline
	model = pipeline('fill-mask', model='ihk/ojobert', tokenizer='ihk/ojobert')
	```

	An example use is as follows:

	```
	text = "Would you like to join a major [MASK] company?"
	results = model(text, top_k=3)  # keep the three most likely tokens

	results

	>> [{'score': 0.1886572688817978,
	     'token': 13859,
	     'token_str': 'pharmaceutical',
	     'sequence': 'would you like to join a major pharmaceutical company?'},
	    {'score': 0.07436735928058624,
	     'token': 5427,
	     'token_str': 'insurance',
	     'sequence': 'would you like to join a major insurance company?'},
	    {'score': 0.06400047987699509,
	     'token': 2810,
	     'token_str': 'construction',
	     'sequence': 'would you like to join a major construction company?'}]
	```
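
The pipeline returns a list of dictionaries, so the predictions are easy to post-process. The following is a minimal sketch that works on the sample output shown above rather than calling the model; the helper name `top_predictions` and its `min_score` parameter are illustrative assumptions, not part of `transformers`.

```python
# Sample output copied from the example above.
results = [
    {"score": 0.1886572688817978, "token": 13859, "token_str": "pharmaceutical",
     "sequence": "would you like to join a major pharmaceutical company?"},
    {"score": 0.07436735928058624, "token": 5427, "token_str": "insurance",
     "sequence": "would you like to join a major insurance company?"},
    {"score": 0.06400047987699509, "token": 2810, "token_str": "construction",
     "sequence": "would you like to join a major construction company?"},
]


def top_predictions(results, min_score=0.0):
    """Hypothetical helper: return (token_str, score) pairs, best first.

    Predictions scoring below min_score are dropped; scores are rounded
    to three decimal places for display.
    """
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    return [
        (r["token_str"], round(r["score"], 3))
        for r in ranked
        if r["score"] >= min_score
    ]


print(top_predictions(results))
print(top_predictions(results, min_score=0.07))
```

With a live pipeline, the same helper could be applied directly to the list returned by `model(text, top_k=3)`.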

	## ⚖️ Training results

	The evaluation metrics after fine-tuning for 3 epochs are as follows:

	- eval_loss: 2.5871026515960693
	- eval_runtime: 134.4452
	- eval_samples_per_second: 14.281
	- eval_steps_per_second: 0.223
	- epoch: 3.0
	- perplexity: 13.29
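
The reported perplexity is consistent with the usual definition for language models, the exponential of the evaluation loss. The short check below uses the figures listed above. The variable names are illustrative, and the sample and step counts are rough estimates derived from the reported throughput, not values stated in the training logs.

```python
import math

# Metrics copied from the list above.
eval_loss = 2.5871026515960693
eval_runtime = 134.4452            # seconds
eval_samples_per_second = 14.281
eval_steps_per_second = 0.223

# Perplexity is the exponential of the (cross-entropy) evaluation loss.
perplexity = math.exp(eval_loss)
print(f"perplexity: {perplexity:.2f}")

# Rough estimates derived from throughput x runtime (assumption: the
# reported rates are averages over the whole evaluation run).
approx_eval_samples = round(eval_runtime * eval_samples_per_second)
approx_eval_steps = round(eval_runtime * eval_steps_per_second)
print(f"approx. evaluation samples: {approx_eval_samples}")
print(f"approx. evaluation steps: {approx_eval_steps}")
```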