---
language: []
library_name: sentence-transformers
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:900
- loss:GISTEmbedLoss
base_model: BAAI/bge-small-en-v1.5
datasets: []
metrics:
- cosine_accuracy@1
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@5
- cosine_ndcg@10
- cosine_ndcg@100
- cosine_mrr@5
- cosine_mrr@10
- cosine_mrr@100
- cosine_map@100
- dot_accuracy@1
- dot_accuracy@5
- dot_accuracy@10
- dot_precision@1
- dot_precision@5
- dot_precision@10
- dot_recall@1
- dot_recall@5
- dot_recall@10
- dot_ndcg@5
- dot_ndcg@10
- dot_ndcg@100
- dot_mrr@5
- dot_mrr@10
- dot_mrr@100
- dot_map@100
widget:
- source_sentence: What will be used for the identification of beneficiaries?
sentences:
- '''for loanee and within 30 days for non loanee i.e. 15th Aug for Kharif and
15th Jan for Rabi for loanee and 31st Aug for Kharif and 31st Jan for Rabi
for Non Loanee 13 Within 7 days from the date of intimation by ICs CSCs/Banks/
Intermediary Cut-off date for CSCs/Banks/Intermediary to correct/update the paid
application intimated by ICs on Crop Insurance Portal 14 Cut-off date for Insurer
to accept the corrected/updated applications Within 7 days from the date of
submission of correction/updation by the Bank/CSC Insurance Companies 15 Within
7 days from acceptance of proposal by concerned Insurance Company on Portal Cut-off
date for Banks/ICs to hand over insurance acknowledgement receipt along with folio
to the insured farmer Banks/ICs for enrolment through their intermediaries 16 Cut
off date for processing of applications by ICs and auto approval of application
of insured farmers on crop insurance Portal 60 days from the cut off date for
enrolment/debit of premium from farmers i.e. 15th September for Kharif and 15th
February for Rabi seasons 17 Before cut off date of enrolment of farmers Insurance
Companies/GOI /State Cut off date for raising bills/requisitions with supporting
documents for releasing of advance premium subsidy based on 50% of 80% of respective
share of Centre/State in corresponding previous season 18 Release of advance
upfront premium subsidy (First Instalment)i.e. 50% of 80% of respective share
of Centre/State in corresponding previous season Within 15days of cut off date
of enrolment of farmers i.e. 31st July for Kharif Upto 15th August* 19 *state
may fix earlier dates for early Kharif crops Training and registration of field
level workers assigned for conduct of CCEs and reporting of the same on crop
insurance Portal through smart phones/CCE Agri App Upto31st August* *state 20 Registration
of mobile number of representative of ICs for co-witnessing of CCEs may fix earlier
dates for early Kharif crops At least 7 days before tentative date for conducting
CCEs 21 a) Uploading of tentative schedule/date for conducting CCEs (crop-wise/IU
wise) followed by SMS on one day notice through CCEs app.'''
- '''8 Banks/PACS/CSC/ insurance agent/online enrolment by farmers etc. Upto
last date of enrolment of farmers as notified by States for notified crop(s)
or up to 15th July* for Kharif season Upto last date of enrolment of farmers as
notified by States for notified crop(s) or up to 15th December* for Rabi season Cut-off
date for receipt of Applications of farmers/debit of premium from farmers account
(loanee and non- loanee) by all stakeholders including banks/PACS/CSC/ insurance
agent/online enrolment by farmers etc. Note: *This is indicative only and district
wise crop calendar will be the final basis to arrive at cut off date 9 Declaration
of Prevented sowing State Govt./ IC Strictly within 15 days from cut off date
for enrolment of farmers i.e 31st July for Kharif and 31st Dec for Rabi 10 Banks/Portal Within
15 days of cut-off date for enrolment of farmers/debit of premium for both loanee and non-
loanee farmers i.e. 31st July for Kharif and 31st Dec for Rabi Cut-off date
for electronic remittance of premium along with consolidated Declarations to
respective Insurance Company and uploading of details of individual covered farmers
on crop insurance Portal by Bank branches (CBs/ RRBs/DCCBs/PACs), followed by SMS
to all insured farmers from Portal 11 Within 48 Hours of receipt of application
& premium. Insurance companies and their agents Cut-off date for electronic
remittance of farmer premium to Insurance Companies for farmers covered on Voluntary
basis by designated insurance Agent(s) and uploading of details of individual
covered farmers on crop insurance Portal. Insurance Companies 12 Cut-off date
for Insurer to accept or reject the farmer''s data on Portal Within 15 days from
the cut-off date for uploading of data/information by Banks/PACS/CSC/Agent respectively.'''
- ''' This consent of the beneficiary should be \''to agree that the department
responsible for implementation of the Pradhan Mantri Kisan Samman Nidhi Yojana
in Union Government or the State Government / Union Territory Administration can
use the beneficiary Aadhaar number and other information provided in the declaration
to verify the eligibility of the beneficiary for scheme as per extant scheme guidelines
with the concerned agencies\''. 6.3 The existing land-ownership system in the
concerned State / UT will be used for identification of beneficiaries. Accordingly,
it is of utmost importance that the land records are clear and updated. Further,
State / UT Governments would also expedite the progress of digitization of
the land records and linking the same with Aadhaar as well as bank details of
the beneficiaries. 6.4 The lists of eligible beneficiaries would be published
at the village level. Farmers'' families who are eligible but have been excluded
should be provided an opportunity to represent their case.'''
- source_sentence: What are weather parameters related to crop condition?
sentences:
- '''i. \''Credit Facility\'' means any fund based credit facility extended by
an Eligible Lending Institution (ELI) to an Eligible Borrower without any Collateral
Security or Third Party Guarantee ; ii. \''Credit Guarantee Fund\'' means the
Credit Guarantee Fund for FPOs created with NABARD and NCDC respectively under
the Scheme with matching grant from DAC&FW for the purpose of extending guarantee
to the eligible lending institution(s) against their collateral free lending to eligible
FPOs; iii. \''Eligible Lending Institution (ELI)\'' means a Scheduled Commercial
Bank for the time being included in the second Schedule to the Reserve Bank of
India Act, 1934, Regional Rural Banks, Co-operative Banks, Cooperative Credit Society,
NEDFI, or any other institution (s) as may be decided by the NABARD and/or NCDC,
as the case may be, in consultation with Government of India from time to time.
NABARD and NCDC can also finance, if they so desire with the approval of DAC&FW/N-PMFSC.
NBFCs and such other financing institutions with required net worth and track
record may also serve as Eligible Lending Institutions (ELIs), for lending to
FPOs with a moderate spread between their cost of capital and lending rate. However,
Standard Financial Sector Rating Agency should have rated NBFC **to be AAA**
to be considered as ELI; iv. \''Guarantee Cover\'' means maximum cover available
per eligible FPO borrower; v. \''Guarantee Fee\'' means the onetime fee at
a specified rate of the eligible credit facility sanctioned by the ELI, payable
by the ELI to NABARD or NCDC, as the case may be; and vi.'''
- '''| Table No. |
Topic |\n|--------------------------------------------------------------------------------------|-----------------------------------------------------------------------------|\n|
Para | |\n|
Number | |\n|
Table 1 |
Premium rate payable by the farmer |\n|
Table 2 |
Seasonality Discipline |\n|
Table 3 |
Calculation of crop-wise Sum Insured |\n|
Table 4 |
Number of CCEs to be conducted at IU level |\n|
Table 5 |
Yield Calculation for multi-picking crop |\n|
Table 6 | |\n|
Types of resolution to be used for vegetation index derivation for different | |\n|
levels of analysis | |\n|
19.7 | |\n|
Table 7 |
Calculation of Threshold Yield |\n|
Table 8 | |\n|
Detailed Procedure for On Account Payment of Claims due to Mid-Season | |\n|
Adversity | |\n|
21.2.7.2 | |\n|
Table 9 | |\n|
Detailed Procedure and Timelines for payment of Prevented / Failed Sowing | |\n|
and Prevented Planting / Germination Claims | |\n|
21.3.6.1 | |\n|
Table 10 | |\n|
Detailed Procedure and Timelines for payment of | |\n|
Post Harvest Loss | |\n| | |\n|
21.4.8.1 | |\n|
Table 11 | |\n|
Detailed Procedure and Timelines for payment of | |\n|
localized claims | |\n| | |\n|
21.5.9.1 | |\n|
Table 12 |
Criteria for Performance Evaluation and Ranking of Loss Assessment Agencies |\n|
Table 13 | |\n|
Illustration for classification of risks, clustering/clubbing of risks and districts
| |\n|
and determination of L1 bidder | |\n|
Annexure 1 | |\n|
Table 14 | |\n|
Penalty and performance Monitoring/Evaluation of Empanelled Insurance | |\n|
Companies | |\n|
Annexure 2 | |\n|
Table 15 |
Illustrative Evaluation of Efficiency of Nodal Department of State |'''
- '''gridded or satellite based data, Dry-spell Occurrence, Temperature Anomaly,
Soil Moisture Analysis (either from satellite or model data) and any other available
weather parameters (related to crop condition), either from ground or satellite
data. Another rainfall based index, which can also be studied, is Standardized
Precipitation Index (SPI). SPI can be developed using IMD gridded rainfall data
or NOAA CPC rainfall data. However, while using gridded data (either from satellite
or ground stations), appropriate resolution should be used at appropriate level.
For example, 0.25 degree data should be used only at district level and not at
block/village level. High resolution weather data can also be sourced from weather
companies.'''
- source_sentence: How is the amount realized by the ELI from the FPC shared between
SFAC and ELI?
sentences:
- '''To substantiate the fact, the most successful example is of dairy co-operative
in India where professional managers have contributed immensely to make it a success.
There are other so many examples which prove the absolute requirement of professional
managers. The number of professional staff could depend on geographical spread
of business operation, diversity of activities and volume of business. However,
an FPO should have minimum a CEO/Manager and an Accountant. Accountant is required
in FPO to look after its day to day accounting work. Based on requirement, FPO
can engage other staff also. 10.3 The CEO/Manager is to be appointed by the
executive body of the FPO who should be either graduate in agriculture / agriculture
marketing / agri-business management or BBA or equivalent. Locally available professionals
with 10+2 and preferably diploma in agriculture / agriculture marketing / agri-business
management or in such other related areas may be preferable. The accountant should
have educational qualification of 10+2 with Mathematics as a compulsory subject
or alternatively with Commerce or Accountancy background. If any members of the
FPO meet the above criteria, they may be considered preferably in the selection
process. 10.4 Under the scheme, financial support towards salary of CEO/Manager
up to @ Rs. 25,000/- per month and of Accountant up to @ Rs.10,000/- per
month with annual increment up to 5% is to be provided from the earmarked financial
support for first 3 years only. Thereafter, FPOs will manage from their own resources
to pay the salary of CEO/Manager and Accountant. In order to create interest
of good professional activities of CEO/Accountant, the FPO may also offer higher
payment with their own sources of funds on above of Govt. support. One CEO will
provide full time services to one FPO at a time only.'''
- '''7.3.1 Three levels of Indemnity, viz., 70%, 80% and 90%. SLCCCI in consultation
with Insurance Companies shall approve indemnity levels for notified crops at
district level. Threshold Yield (TY) shall be notified in the Tender for the
current season and the same will be used for claim calculation for that season.
The Average Yield of a notified crop in Insurance Unit (IU) will be average yield
of best five years out of last seven years. The Threshold yield of the notified
crop is equal to Average Yield multiplied by Indemnity level. The Threshold Yield
for any crop and IU shall compulsorily be part of the notification for the season
and shall not change at any point during that season. 7.3.2 Calculation and
Notification of Threshold Yield: For calculation of Threshold Yield, historical
average yield of best five out of last seven years shall be considered. Further
Threshold Yield should be defined only at notified area level and once notified
in the Notification issued by the State should not be changed at later stage
under any circumstances. In case of multi-year contract, the Threshold Yield for the
subsequent years shall be revised by adding/considering the yields of immediate
previous corresponding season. The revised TY and Sum Insured (if revised) should
be notified accordingly at the beginning of each crop season in case of multi-year
tender.'''
- '''a. Only such Claim as is submitted by the ELI within a maximum period of one
year from date of NPA or as specified by SFAC from time to time, shall be considered
by SFAC. b. On receiving a claim, the I&CSC shall review in detail the reasons
for the Default. The Committee reserves the right to reject any proposal where
the Guidelines have not been strictly followed or if any misrepresentation or
concealment of facts is found leading to undue favour to the concerned FPC. c.
SFAC shall honour 75 per cent of the Guaranteed Amount in Default subject to a maximum
of 75 per cent of the guaranteed cap amount, on submission of claim by the ELI
where appropriate action for recovery has been initiated. The balance 25 per cent
of the default or guaranteed cap amount, as the case may be, shall be paid on
conclusion of recovery proceedings by ELI. d. SFAC shall pay claims found in
order and complete in all respects, within 90 days. e. The outstanding dues of
the FPC to ELI shall be reduced to the extent of the claim amount settled by
SFAC. f. The ELI shall continue to make efforts to realise the balance amount
due from the g. defaulting FPC even after settlement of the Guarantee. h. SFAC
has the right to claim from the ELI any amount that is realised by the ELI from the
defaulting FPC even after settlement of the guarantee amount. i. Any amount realized
by the ELI from the FPC shall be shared in the ratio of 85%:15% between SFAC
& ELI.'''
- source_sentence: What directions is the lending institution bound to comply with?
sentences:
- ''' The I&CSC shall regularly keep the NABARD and NCDC, as the case may be, informed
about all major decisions and actions taken by it in this regard, and shall work
under its overall control and guidance with regard to the Fund and the Scheme.
Simultaneously, the NABARD and NCDC will keep the N- PMAFSC apprised about such
decisions and actions taken in this regard. v. Decisions regarding Guarantee
Pay-Outs shall be the primary responsibility of I&CSC which shall meet at least
once every quarter or as often as necessary. vi. The ELI may invoke the Guarantee
in respect of Credit Facility within a maximum period of one year from the date
of NPA, if the conditions set out by NABARD or NCDC, as the case may be , are
met out .'''
- ''' The lending institution shall be bound to comply with such directions as NABARD
or NCDC, as the case may be, may deem fit to issue from time to time, for facilitating
recoveries of the guaranteed account, or safeguarding its interest as a guarantor. ix. The
lending institution shall, in particular, refrain from any act either before or
subsequent to invocation of guarantee, which may adversely affect the interest
of NABARD or NCDC, as the case may be, as the guarantor. x. The lending institution
shall be bound under the Scheme to intimate in advance to NABARD or NCDC, as the
case may be, its intention to enter into any compromise or arrangement, which
may have effect of discharge or waiver of primary security. xi. Further, the
lending institution shall secure for NABARD or NCDC, as the case may be, or
its appointed agency, through a stipulation in an Agreement with the Borrower
or otherwise, the right to list the defaulted Borrowers'' names and particulars
on the Website of NABARD or NCDC, as the case may be or Integrated Portal'''
- '''| AIC | Agricultural Insurance Company of India Ltd |\n|--------|----------------------------------------------------------------------------------------------|\n|
ACF | Area Correction Factor |\n|
APR | Actuarial Premium Rate |\n|
ARG | Automatic Rain Gauge |\n|
AWS | Automatic Weather Stations |\n|
AY | Actual Yield |\n|
CB | Commercial Banks |\n|
CBS | Core Banking Solution |\n|
CCAFS | Research program on Climate Change, Agriculture and Food Security |\n|
CCEs | Crop Cutting Experiments |\n|
CPMU | Central Program Management Unit |\n|
CSC | Common Service Center |\n|
CSO | Central Statistical Office |\n|
CV | Co-efficient of Variance |\n|
DAC&FW | Department of Agriculture, Cooperation and Farmers Welfare |\n|
DBT | Direct Benefit Transfer |\n|
DCCBs | District Central Cooperative Banks |\n|
DLMC | District Level Monitoring Committee |\n|
DLTC | District Level Technical Committee |\n|
ESI | Expected Sum Insured |\n|
FASAL | Forecasting Agricultural output using Space, Agro meteorological and
Land based observations |\n| FIs | Financial Institutions |\n|
GIC Re | General Insurance Corporation of India |\n|
GFR | General Financial Rule |\n|
GIS | Geographic Information System |\n|
GPS | Global Positioning System |\n|
IA | Implementing Agency |\n|
IC | Insurance Company |\n|
IASRI | Indian Agricultural Statistical Research Institute |\n|
IFPRI | International Food Policy Research Institute |\n|
IMD | Indian Meteorological Department |\n|
IRRI | International Rice Research Institute |\n|
IRDAI | Insurance Regulatory and Development Authority of India |\n|
ISRO | Indian Space Research Organisation |\n|
ISS | Interest Subvention Scheme |\n|
IT | Information Technology |\n|
IU | Insurance Unit |\n|
KCC | Kisan Credit Cards |\n|
LC | Loss Cost |\n|
LPA | Long period Average |\n|
LPC | Land Possession Certificate |\n|
MIS | Management Information System |\n|
MNCFC | Mahalanobis National Crop Forecast Centre |\n|
MOA&FW | Ministry of Agriculture and Farmers Welfare |'''
- source_sentence: How can we identify outliers in crop yield?
sentences:
- '''(i) It will closely monitor and review the progress of FPO development and functioning
by holding its regular meetings. (ii) It will suggest the potential produce
clusters in the district (where FPOs can be formed & promoted) to N-PMAFSC and
will also assist Implementing Agencies, CBBOs and other stakeholders in identification
of cluster(s) and activity (ies) and also in mobilization of farmers. (iii) It
will resolve the financial constraints of FPOs through District Level Bankers'' Committee
and provide feedback to N-PMAFSC. (iv) It will identify the constraints in implementation
of scheme at the ground level and communicate the same to State Level Consultative
Committee for further taking up the matter with DAC&FW and N-PMAFSC for appropriate
policy decision. (v) Any other matter so decided by the committee in the interest
of the scheme and farmers. 14.4 Close and effective monitoring has been considered
a major trigger for success of this scheme. Therefore, in addition to three tiered
afore-stated structured mechanism for monitoring of the scheme, there shall be
continuous in-house monitoring by DAC&FW and by the Implementing Agencies also.
The DAC&FW may utilize the services of Directorate of Marketing & Inspection (DMI),
which has existence through its Regional & Sub-Offices across the country. For
effective monitoring, DAC&FW may engage consultants also and cost for same will
be borne from the budget of the scheme itself.'''
- '''Under CGF, NABARD and NCDC, as the case may be, shall cover: i. Fund based
Credit facilities already sanctioned / extended within six months from the date
of the application for the Guarantee Cover or intended to be extended singly or
jointly by one or more than one Eligible Lending Institution (ELI) to a single
eligible FPO borrower by way of term loan and/or working capital/composite credit
facilities without any collateral security and/or third party guarantees. ii.
The ELI can extend credit without any limit; however, the Guarantee Cover shall be
limited to the maximum guarantee cover specified under the Scheme. iii. Non-Banking
Financial Companies (NBFCs) and such other Financial Institutions (FIs) with
required net worth, track record and rating of AAA may also be accommodated
as Eligible Lending Institutions (ELIs), such NBFC should on-ward lend to FPOs
with a moderate spread between their cost of capital and lending rate.'''
- '''Identification of Outliers: All these above analyses can be used to check whether
there was any reason for yield deviation as presented in the CCE data. Then a
yield proxy map may be prepared. The Yield proxy map can be derived from remote
sensing vegetation indices (single or combination of indices), crop simulation
model output, or an integration of various parameters, which are related to crop
yield, such as soil, weather (gridded), satellite based products, etc. Whatever,
yield proxies to be used, it is the responsibility of the organization to record documentary
evidence (from their or other''s published work) that the yield proxy is related
to the particular crop''s yield. Then the IU level yields need to be overlaid
on the yield proxy map. Both yield proxy and CCE yield can be divided into 4-5
categories (e.g. Very good, Good, Medium, Poor, Very poor). Wherever there is
large mismatch between yield proxy and the CCE yield (more than 2 levels), the
CCE yield for that IU can be considered, as outliers.'''
pipeline_tag: sentence-similarity
model-index:
- name: SentenceTransformer based on BAAI/bge-small-en-v1.5
results:
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: val evaluator
type: val_evaluator
metrics:
- type: cosine_accuracy@1
value: 0.48
name: Cosine Accuracy@1
- type: cosine_accuracy@5
value: 0.88
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.95
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.48
name: Cosine Precision@1
- type: cosine_precision@5
value: 0.17599999999999993
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.09499999999999999
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.48
name: Cosine Recall@1
- type: cosine_recall@5
value: 0.88
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.95
name: Cosine Recall@10
- type: cosine_ndcg@5
value: 0.7065704999222873
name: Cosine Ndcg@5
- type: cosine_ndcg@10
value: 0.7288298734374183
name: Cosine Ndcg@10
- type: cosine_ndcg@100
value: 0.7407135931762043
name: Cosine Ndcg@100
- type: cosine_mrr@5
value: 0.6475
name: Cosine Mrr@5
- type: cosine_mrr@10
value: 0.6564841269841272
name: Cosine Mrr@10
- type: cosine_mrr@100
value: 0.6595216619129662
name: Cosine Mrr@100
- type: cosine_map@100
value: 0.6595216619129661
name: Cosine Map@100
- type: dot_accuracy@1
value: 0.48
name: Dot Accuracy@1
- type: dot_accuracy@5
value: 0.88
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 0.95
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.48
name: Dot Precision@1
- type: dot_precision@5
value: 0.17599999999999993
name: Dot Precision@5
- type: dot_precision@10
value: 0.09499999999999999
name: Dot Precision@10
- type: dot_recall@1
value: 0.48
name: Dot Recall@1
- type: dot_recall@5
value: 0.88
name: Dot Recall@5
- type: dot_recall@10
value: 0.95
name: Dot Recall@10
- type: dot_ndcg@5
value: 0.7065704999222873
name: Dot Ndcg@5
- type: dot_ndcg@10
value: 0.7288298734374183
name: Dot Ndcg@10
- type: dot_ndcg@100
value: 0.7407135931762043
name: Dot Ndcg@100
- type: dot_mrr@5
value: 0.6475
name: Dot Mrr@5
- type: dot_mrr@10
value: 0.6564841269841272
name: Dot Mrr@10
- type: dot_mrr@100
value: 0.6595216619129662
name: Dot Mrr@100
- type: dot_map@100
value: 0.6595216619129661
name: Dot Map@100
---
# SentenceTransformer based on BAAI/bge-small-en-v1.5
This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5). It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details
### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5)
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 384 dimensions
- **Similarity Function:** Cosine Similarity
### Model Sources
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture
```
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
```
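The architecture above takes the `[CLS]` token vector (`pooling_mode_cls_token: True`) and passes it through `Normalize()`, so every embedding has unit length. A minimal pure-Python sketch of those two steps follows; the toy 4-dimensional token vectors and the helper names (`cls_pool`, `l2_normalize`) are illustrative assumptions, not part of the library.

```python
import math

def cls_pool(token_embeddings):
    """Return the first ([CLS]) token vector, as CLS pooling does."""
    return token_embeddings[0]

def l2_normalize(vec):
    """Scale a vector to unit length, mirroring what a Normalize() step does."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Toy token embeddings (3 tokens x 4 dims); the real model uses 384 dims.
tokens_a = [[3.0, 4.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0], [0.0, 2.0, 0.0, 2.0]]
tokens_b = [[1.0, 0.0, 2.0, 2.0], [5.0, 5.0, 5.0, 5.0], [9.0, 0.0, 0.0, 0.0]]

emb_a = l2_normalize(cls_pool(tokens_a))
emb_b = l2_normalize(cls_pool(tokens_b))

# For unit-length vectors, the dot product and cosine similarity coincide.
print(dot(emb_a, emb_b), cosine(emb_a, emb_b))
```

Because the outputs are normalized, dot-product and cosine scores rank passages identically, which is consistent with the matching `cosine_*` and `dot_*` values reported in the Evaluation section.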
## Usage
### Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("smokxy/embedding_finetuned")
# Run inference
sentences = [
'How can we identify outliers in crop yield?',
"'Identification of Outliers: All these above analyses can be used to check whether there was any reason for yield deviation as presented in the CCE data. Then a yield proxy map may be prepared. The Yield proxy map can be derived from remote sensing vegetation indices (single or combination of indices), crop simulation model output, or an integration of various parameters, which are related to crop yield, such as soil, weather (gridded), satellite based products, etc. Whatever, yield proxies to be used, it is the responsibility of the organization to record documentary evidence (from their or other's published work) that the yield proxy is related to the particular crop's yield. Then the IU level yields need to be overlaid on the yield proxy map. Both yield proxy and CCE yield can be divided into 4-5 categories (e.g. Very good, Good, Medium, Poor, Very poor). Wherever there is large mismatch between yield proxy and the CCE yield (more than 2 levels), the CCE yield for that IU can be considered, as outliers.'",
"'Under CGF, NABARD and NCDC, as the case may be, shall cover: i. Fund based Credit facilities already sanctioned / extended within six months from the date of the application for the Guarantee Cover or intended to be extended singly or jointly by one or more than one Eligible Lending Institution (ELI) to a single eligible FPO borrower by way of term loan and/or working capital/composite credit facilities without any collateral security and/or third party guarantees. ii. The ELI can extend credit without any limit; however, the Guarantee Cover shall be limited to the maximum guarantee cover specified under the Scheme. iii. Non-Banking Financial Companies (NBFCs) and such other Financial Institutions (FIs) with required net worth, track record and rating of AAA may also be accommodated as Eligible Lending Institutions (ELIs), such NBFC should on-ward lend to FPOs with a moderate spread between their cost of capital and lending rate.'",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```
## Evaluation
### Metrics
#### Information Retrieval
* Dataset: `val_evaluator`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.48 |
| cosine_accuracy@5 | 0.88 |
| cosine_accuracy@10 | 0.95 |
| cosine_precision@1 | 0.48 |
| cosine_precision@5 | 0.176 |
| cosine_precision@10 | 0.095 |
| cosine_recall@1 | 0.48 |
| cosine_recall@5 | 0.88 |
| cosine_recall@10 | 0.95 |
| cosine_ndcg@5 | 0.7066 |
| cosine_ndcg@10 | 0.7288 |
| cosine_ndcg@100 | 0.7407 |
| cosine_mrr@5 | 0.6475 |
| cosine_mrr@10 | 0.6565 |
| cosine_mrr@100 | 0.6595 |
| **cosine_map@100** | **0.6595** |
| dot_accuracy@1 | 0.48 |
| dot_accuracy@5 | 0.88 |
| dot_accuracy@10 | 0.95 |
| dot_precision@1 | 0.48 |
| dot_precision@5 | 0.176 |
| dot_precision@10 | 0.095 |
| dot_recall@1 | 0.48 |
| dot_recall@5 | 0.88 |
| dot_recall@10 | 0.95 |
| dot_ndcg@5 | 0.7066 |
| dot_ndcg@10 | 0.7288 |
| dot_ndcg@100 | 0.7407 |
| dot_mrr@5 | 0.6475 |
| dot_mrr@10 | 0.6565 |
| dot_mrr@100 | 0.6595 |
| dot_map@100 | 0.6595 |
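In the table, recall@k equals accuracy@k, precision@k equals accuracy@k divided by k (0.88 / 5 = 0.176, 0.95 / 10 = 0.095), and MRR@100 equals MAP@100. That pattern is what one would expect if each validation query has exactly one relevant passage, though the card does not state this explicitly. Under that assumption, the metrics can be sketched as below; `ir_metrics` and the toy ranks are hypothetical and are not the evaluator's actual code or this model's results.

```python
def ir_metrics(ranks, k):
    """Compute accuracy@k, precision@k, recall@k and MRR@k.

    `ranks` holds, for each query, the 1-based rank of its single relevant
    passage, or None if the passage was not retrieved at all.
    """
    n = len(ranks)
    hits = [r is not None and r <= k for r in ranks]
    accuracy = sum(hits) / n
    precision = sum(hits) / (n * k)  # at most one relevant hit among k results
    recall = accuracy                # one relevant passage per query (assumed)
    mrr = sum(1.0 / r for r, h in zip(ranks, hits) if h) / n
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "mrr": mrr}

# Hypothetical ranks for four queries (illustration only).
toy_ranks = [1, 3, 7, None]
m5 = ir_metrics(toy_ranks, k=5)
print(m5)
```

With these toy ranks, two of four queries place the relevant passage in the top 5, so accuracy@5 and recall@5 are 0.5 and precision@5 is 0.5 / 5 = 0.1.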
## Training Details
### Training Hyperparameters
#### Non-Default Hyperparameters
- `eval_strategy`: steps
- `gradient_accumulation_steps`: 4
- `learning_rate`: 1e-05
- `weight_decay`: 0.01
- `num_train_epochs`: 1.0
- `warmup_ratio`: 0.1
- `load_best_model_at_end`: True
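These settings, together with the per-device batch size of 8 listed under All Hyperparameters and the `dataset_size:900` tag, determine how many optimizer steps one epoch contains. The arithmetic below is a rough sketch of that step count, assuming single-device training and that a trailing partial accumulation group does not trigger an extra update; the variable names are illustrative.

```python
import math

dataset_size = 900               # from the `dataset_size:900` tag
per_device_train_batch_size = 8  # from the full hyperparameter list
gradient_accumulation_steps = 4
num_train_epochs = 1.0
warmup_ratio = 0.1
num_devices = 1                  # assumption: single-device training

# Mini-batches per epoch (the last, smaller batch is kept).
batches_per_epoch = math.ceil(
    dataset_size / (per_device_train_batch_size * num_devices)
)
# Optimizer updates per epoch under the assumption stated above.
updates_per_epoch = max(batches_per_epoch // gradient_accumulation_steps, 1)
total_steps = math.ceil(num_train_epochs * updates_per_epoch)
warmup_steps = math.ceil(total_steps * warmup_ratio)
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)

def epoch_at(step):
    """Fractional epoch reached after `step` optimizer updates."""
    return step * gradient_accumulation_steps / batches_per_epoch

print(batches_per_epoch, total_steps, warmup_steps, effective_batch_size)
print(round(epoch_at(15), 3), round(epoch_at(28), 4))
```

Under these assumptions the run has 28 optimizer steps with an effective batch size of 32, and steps 15 and 28 fall at epochs of roughly 0.531 and 0.9912, which agrees with the Training Logs table.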
#### All Hyperparameters
- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 8
- `per_device_eval_batch_size`: 8
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 4
- `eval_accumulation_steps`: None
- `learning_rate`: 1e-05
- `weight_decay`: 0.01
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 1.0
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: proportional
### Training Logs
| Epoch     | Step   | Training Loss | Validation Loss | val_evaluator_cosine_map@100 |
|:---------:|:------:|:-------------:|:----------:|:----------------------------:|
| **0.531** | **15** | **0.4478** | **0.0912** | **0.6595** |
| 0.9912 | 28 | - | 0.0912 | 0.6595 |
* The bold row denotes the saved checkpoint.
### Framework Versions
- Python: 3.10.14
- Sentence Transformers: 3.0.1
- Transformers: 4.41.1
- PyTorch: 2.3.0+cu121
- Accelerate: 0.27.2
- Datasets: 2.19.1
- Tokenizers: 0.19.1
## Citation
### BibTeX
#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
```
#### GISTEmbedLoss
```bibtex
@misc{solatorio2024gistembed,
title={GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning},
author={Aivin V. Solatorio},
year={2024},
eprint={2402.16829},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
```