metadata
language: []
library_name: sentence-transformers
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:900
- loss:GISTEmbedLoss
base_model: BAAI/bge-small-en-v1.5
datasets: []
metrics:
- cosine_accuracy@1
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@5
- cosine_ndcg@10
- cosine_ndcg@100
- cosine_mrr@5
- cosine_mrr@10
- cosine_mrr@100
- cosine_map@100
- dot_accuracy@1
- dot_accuracy@5
- dot_accuracy@10
- dot_precision@1
- dot_precision@5
- dot_precision@10
- dot_recall@1
- dot_recall@5
- dot_recall@10
- dot_ndcg@5
- dot_ndcg@10
- dot_ndcg@100
- dot_mrr@5
- dot_mrr@10
- dot_mrr@100
- dot_map@100
widget:
- source_sentence: What will be used for the identification of beneficiaries?
sentences:
- >-
'for loanee and within 30 days for non loanee i.e. 15th Aug for Kharif
and 15th Jan for Rabi for loanee and 31st Aug for Kharif and 31st Jan
for Rabi for Non Loanee 13 Within 7 days from the date of
intimation by ICs CSCs/Banks/ Intermediary Cut-off date for
CSCs/Banks/Intermediary to correct/update the paid application
intimated by ICs on Crop Insurance Portal 14 Cut-off date for Insurer
to accept the corrected/updated applications Within 7 days from the
date of submission of correction/updation by the Bank/CSC Insurance
Companies 15 Within 7 days from acceptance of proposal by concerned
Insurance Company on Portal Cut-off date for Banks/ICs to hand over
insurance acknowledgement receipt along with folio to the insured
farmer Banks/ICs for enrolment through their intermediaries 16 Cut
off date for processing of applications by ICs and auto approval of
application of insured farmers on crop insurance Portal 60 days from
the cut off date for enrolment/debit of premium from farmers i.e. 15th
September for Kharif and 15th February for Rabi seasons 17 Before
cut off date of enrolment of farmers Insurance Companies/GOI /State
Cut off date for raising bills/requisitions with supporting documents
for releasing of advance premium subsidy based on 50% of 80% of
respective share of Centre/State in corresponding previous season 18
Release of advance upfront premium subsidy (First Instalment)i.e. 50%
of 80% of respective share of Centre/State in corresponding previous
season Within 15days of cut off date of enrolment of farmers i.e.
31st July for Kharif Upto 15th August* 19 *state may fix earlier
dates for early Kharif crops Training and registration of field level
workers assigned for conduct of CCEs and reporting of the same on crop
insurance Portal through smart phones/CCE Agri App Upto31st August*
*state 20 Registration of mobile number of representative of ICs for
co-witnessing of CCEs may fix earlier dates for early Kharif crops At
least 7 days before tentative date for conducting CCEs 21 a)
Uploading of tentative schedule/date for conducting CCEs (crop-wise/IU
wise) followed by SMS on one day notice through CCEs app.'
- >-
'8 Banks/PACS/CSC/ insurance agent/online enrolment by farmers etc.
Upto last date of enrolment of farmers as notified by States for
notified crop(s) or up to 15th July* for Kharif season Upto last
date of enrolment of farmers as notified by States for notified
crop(s) or up to 15th December* for Rabi season Cut-off date for
receipt of Applications of farmers/debit of premium from farmers account
(loanee and non- loanee) by all stakeholders including banks/PACS/CSC/
insurance agent/online enrolment by farmers etc. Note: *This is
indicative only and district wise crop calendar will be the final basis
to arrive at cut off date 9 Declaration of Prevented sowing State
Govt./ IC Strictly within 15 days from cut off date for enrolment of
farmers i.e 31st July for Kharif and 31st Dec for Rabi 10
Banks/Portal Within 15 days of cut-off date for enrolment of
farmers/debit of premium for both loanee and non- loanee farmers i.e.
31st July for Kharif and 31st Dec for Rabi Cut-off date for
electronic remittance of premium along with consolidated Declarations
to respective Insurance Company and uploading of details of individual
covered farmers on crop insurance Portal by Bank branches (CBs/
RRBs/DCCBs/PACs), followed by SMS to all insured farmers from Portal
11 Within 48 Hours of receipt of application & premium. Insurance
companies and their agents Cut-off date for electronic remittance of
farmer premium to Insurance Companies for farmers covered on Voluntary
basis by designated insurance Agent(s) and uploading of details of
individual covered farmers on crop insurance Portal. Insurance
Companies 12 Cut-off date for Insurer to accept or reject the farmer's
data on Portal Within 15 days from the cut-off date for uploading of
data/information by Banks/PACS/CSC/Agent respectively.'
- >-
' This consent of the beneficiary should be \'to agree that the
department responsible for implementation of the Pradhan Mantri Kisan
Samman Nidhi Yojana in Union Government or the State Government / Union
Territory Administration can use the beneficiary Aadhaar number and
other information provided in the declaration to verify the eligibility
of the beneficiary for scheme as per extant scheme guidelines with the
concerned agencies\'. 6.3 The existing land-ownership system in the
concerned State / UT will be used for identification of beneficiaries.
Accordingly, it is of utmost importance that the land records are clear
and updated. Further, State / UT Governments would also expedite the
progress of digitization of the land records and linking the same with
Aadhaar as well as bank details of the beneficiaries. 6.4 The lists of
eligible beneficiaries would be published at the village level.
Farmers' families who are eligible but have been excluded should be
provided an opportunity to represent their case.'
- source_sentence: What are weather parameters related to crop condition?
sentences:
- >-
'i. \'Credit Facility\' means any fund based credit facility extended
by an Eligible Lending Institution (ELI) to an Eligible Borrower
without any Collateral Security or Third Party Guarantee ; ii.
\'Credit Guarantee Fund\' means the Credit Guarantee Fund for FPOs
created with NABARD and NCDC respectively under the Scheme with
matching grant from DAC&FW for the purpose of extending guarantee to the
eligible lending institution(s) against their collateral free lending
to eligible FPOs; iii. \'Eligible Lending Institution (ELI)\' means a
Scheduled Commercial Bank for the time being included in the second
Schedule to the Reserve Bank of India Act, 1934, Regional Rural Banks,
Co-operative Banks, Cooperative Credit Society, NEDFI, or any other
institution (s) as may be decided by the NABARD and/or NCDC, as the case
may be, in consultation with Government of India from time to time.
NABARD and NCDC can also finance, if they so desire with the approval of
DAC&FW/N-PMFSC. NBFCs and such other financing institutions with
required net worth and track record may also serve as Eligible Lending
Institutions (ELIs), for lending to FPOs with a moderate spread between
their cost of capital and lending rate. However, Standard Financial
Sector Rating Agency should have rated NBFC **to be AAA** to be
considered as ELI; iv. \'Guarantee Cover\' means maximum cover
available per eligible FPO borrower; v. \'Guarantee Fee\' means the
onetime fee at a specified rate of the eligible credit facility
sanctioned by the ELI, payable by the ELI to NABARD or NCDC, as the case
may be; and vi.'
- >-
'| Table
No.
|
Topic
|\n|--------------------------------------------------------------------------------------|-----------------------------------------------------------------------------|\n|
Para
|
|\n|
Number
|
|\n| Table
1
| Premium rate payable by the
farmer |\n| Table
2
| Seasonality
Discipline |\n|
Table
3
| Calculation of crop-wise Sum
Insured |\n| Table
4
| Number of CCEs to be conducted at IU
level |\n| Table
5
| Yield Calculation for multi-picking
crop |\n| Table
6
|
|\n| Types of resolution to be used for vegetation index derivation for
different
|
|\n| levels of
analysis
|
|\n|
19.7
|
|\n| Table
7
| Calculation of Threshold
Yield |\n| Table
8
|
|\n| Detailed Procedure for On Account Payment of Claims due to
Mid-Season
|
|\n|
Adversity
|
|\n|
21.2.7.2
|
|\n| Table
9
|
|\n| Detailed Procedure and Timelines for payment of Prevented / Failed
Sowing
|
|\n| and Prevented Planting / Germination
Claims
|
|\n|
21.3.6.1
|
|\n| Table
10
|
|\n| Detailed Procedure and Timelines for payment
of
|
|\n| Post Harvest
Loss
|
|\n|
|
|\n|
21.4.8.1
|
|\n| Table
11
|
|\n| Detailed Procedure and Timelines for payment
of
|
|\n| localized
claims
|
|\n|
|
|\n|
21.5.9.1
|
|\n| Table
12
| Criteria for Performance Evaluation and Ranking of Loss Assessment
Agencies |\n| Table
13
|
|\n| Illustration for classification of risks, clustering/clubbing of
risks and districts
|
|\n| and determination of L1
bidder
|
|\n| Annexure
1
|
|\n| Table
14
|
|\n| Penalty and performance Monitoring/Evaluation of Empanelled
Insurance
|
|\n|
Companies
|
|\n| Annexure
2
|
|\n| Table
15
| Illustrative Evaluation of Efficiency of Nodal Department of
State |'
- >-
'gridded or satellite based data, Dry-spell Occurrence, Temperature
Anomaly, Soil Moisture Analysis (either from satellite or model data)
and any other available weather parameters (related to crop condition),
either from ground or satellite data. Another rainfall based index,
which can also be studied, is Standardized Precipitation Index (SPI).
SPI can be developed using IMD gridded rainfall data or NOAA CPC
rainfall data. However, while using gridded data (either from satellite
or ground stations), appropriate resolution should be used at
appropriate level. For example, 0.25 degree data should be used only at
district level and not at block/village level. High resolution weather
data can also be sourced from weather companies.'
- source_sentence: >-
How is the amount realized by the ELI from the FPC shared between SFAC and
ELI?
sentences:
- >-
'To substantiate the fact, the most successful example is of dairy
co-operative in India where professional managers have contributed
immensely to make it a success. There are other so many examples which
prove the absolute requirement of professional managers. The number of
professional staff could depend on geographical spread of business
operation, diversity of activities and volume of business. However, an
FPO should have minimum a CEO/Manager and an Accountant. Accountant is
required in FPO to look after its day to day accounting work. Based on
requirement, FPO can engage other staff also. 10.3 The CEO/Manager is
to be appointed by the executive body of the FPO who should be either
graduate in agriculture / agriculture marketing / agri-business
management or BBA or equivalent. Locally available professionals with
10+2 and preferably diploma in agriculture / agriculture marketing /
agri-business management or in such other related areas may be
preferable. The accountant should have educational qualification of 10+2
with Mathematics as a compulsory subject or alternatively with Commerce
or Accountancy background. If any members of the FPO meet the above
criteria, they may be considered preferably in the selection process.
10.4 Under the scheme, financial support towards salary of CEO/Manager
up to @ Rs. 25,000/- per month and of Accountant up to @
Rs.10,000/- per month with annual increment up to 5% is to be provided
from the earmarked financial support for first 3 years only. Thereafter,
FPOs will manage from their own resources to pay the salary of
CEO/Manager and Accountant. In order to create interest of good
professional activities of CEO/Accountant, the FPO may also offer higher
payment with their own sources of funds on above of Govt. support. One
CEO will provide full time services to one FPO at a time only.'
- >-
'7.3.1 Three levels of Indemnity, viz., 70%, 80% and 90%. SLCCCI in
consultation with Insurance Companies shall approve indemnity levels
for notified crops at district level. Threshold Yield (TY) shall be
notified in the Tender for the current season and the same will be used
for claim calculation for that season. The Average Yield of a notified
crop in Insurance Unit (IU) will be average yield of best five years out
of last seven years. The Threshold yield of the notified crop is equal
to Average Yield multiplied by Indemnity level. The Threshold Yield for
any crop and IU shall compulsorily be part of the notification for the
season and shall not change at any point during that season. 7.3.2
Calculation and Notification of Threshold Yield: For calculation of
Threshold Yield, historical average yield of best five out of last
seven years shall be considered. Further Threshold Yield should be
defined only at notified area level and once notified in the
Notification issued by the State should not be changed at later stage
under any circumstances. In case of multi-year contract, the Threshold
Yield for the subsequent years shall be revised by adding/considering
the yields of immediate previous corresponding season. The revised TY
and Sum Insured (if revised) should be notified accordingly at the
beginning of each crop season in case of multi-year tender.'
- >-
'a. Only such Claim as is submitted by the ELI within a maximum period
of one year from date of NPA or as specified by SFAC from time to time,
shall be considered by SFAC. b. On receiving a claim, the I&CSC shall
review in detail the reasons for the Default. The Committee reserves
the right to reject any proposal where the Guidelines have not been
strictly followed or if any misrepresentation or concealment of facts is
found leading to undue favour to the concerned FPC. c. SFAC shall
honour 75 per cent of the Guaranteed Amount in Default subject to a
maximum of 75 per cent of the guaranteed cap amount, on submission of
claim by the ELI where appropriate action for recovery has been
initiated. The balance 25 per cent of the default or guaranteed cap
amount, as the case may be, shall be paid on conclusion of recovery
proceedings by ELI. d. SFAC shall pay claims found in order and
complete in all respects, within 90 days. e. The outstanding dues of
the FPC to ELI shall be reduced to the extent of the claim amount
settled by SFAC. f. The ELI shall continue to make efforts to realise
the balance amount due from the g. defaulting FPC even after settlement
of the Guarantee. h. SFAC has the right to claim from the ELI any amount
that is realised by the ELI from the defaulting FPC even after
settlement of the guarantee amount. i. Any amount realized by the ELI
from the FPC shall be shared in the ratio of 85%:15% between SFAC &
ELI.'
- source_sentence: What directions is the lending institution bound to comply with?
sentences:
- >-
' The I&CSC shall regularly keep the NABARD and NCDC, as the case may
be, informed about all major decisions and actions taken by it in this
regard, and shall work under its overall control and guidance with
regard to the Fund and the Scheme. Simultaneously, the NABARD and NCDC
will keep the N- PMAFSC apprised about such decisions and actions taken
in this regard. v. Decisions regarding Guarantee Pay-Outs shall be the
primary responsibility of I&CSC which shall meet at least once every
quarter or as often as necessary. vi. The ELI may invoke the
Guarantee in respect of Credit Facility within a maximum period of one
year from the date of NPA, if the conditions set out by NABARD or NCDC,
as the case may be , are met out .'
- >-
' The lending institution shall be bound to comply with such directions
as NABARD or NCDC, as the case may be, may deem fit to issue from time
to time, for facilitating recoveries of the guaranteed account, or
safeguarding its interest as a guarantor. ix. The lending institution
shall, in particular, refrain from any act either before or subsequent
to invocation of guarantee, which may adversely affect the interest of
NABARD or NCDC, as the case may be, as the guarantor. x. The lending
institution shall be bound under the Scheme to intimate in advance to
NABARD or NCDC, as the case may be, its intention to enter into any
compromise or arrangement, which may have effect of discharge or waiver
of primary security. xi. Further, the lending institution shall
secure for NABARD or NCDC, as the case may be, or its appointed
agency, through a stipulation in an Agreement with the Borrower or
otherwise, the right to list the defaulted Borrowers' names and
particulars on the Website of NABARD or NCDC, as the case may be or
Integrated Portal'
- >-
'| AIC | Agricultural Insurance Company of India
Ltd
|\n|--------|----------------------------------------------------------------------------------------------|\n|
ACF | Area Correction
Factor
|\n| APR | Actuarial Premium
Rate
|\n| ARG | Automatic Rain
Gauge
|\n| AWS | Automatic Weather
Stations
|\n| AY | Actual
Yield
|\n| CB | Commercial
Banks
|\n| CBS | Core Banking
Solution
|\n| CCAFS | Research program on Climate Change, Agriculture and Food
Security |\n| CCEs | Crop Cutting
Experiments
|\n| CPMU | Central Program Management
Unit |\n|
CSC | Common Service
Center
|\n| CSO | Central Statistical
Office
|\n| CV | Co-efficient of
Variance
|\n| DAC&FW | Department of Agriculture, Cooperation and Farmers
Welfare |\n| DBT | Direct Benefit
Transfer
|\n| DCCBs | District Central Cooperative
Banks |\n|
DLMC | District Level Monitoring
Committee |\n|
DLTC | District Level Technical
Committee |\n|
ESI | Expected Sum
Insured
|\n| FASAL | Forecasting Agricultural output using Space, Agro
meteorological and Land based observations |\n| FIs | Financial
Institutions
|\n| GIC Re | General Insurance Corporation of
India |\n| GFR
| General Financial
Rule
|\n| GIS | Geographic Information
System
|\n| GPS | Global Positioning
System
|\n| IA | Implementing
Agency
|\n| IC | Insurance
Company
|\n| IASRI | Indian Agricultural Statistical Research
Institute |\n| IFPRI |
International Food Policy Research
Institute |\n| IMD |
Indian Meteorological
Department
|\n| IRRI | International Rice Research
Institute |\n|
IRDAI | Insurance Regulatory and Development Authority of
India |\n| ISRO | Indian Space
Research
Organisation
|\n| ISS | Interest Subvention
Scheme
|\n| IT | Information
Technology
|\n| IU | Insurance
Unit
|\n| KCC | Kisan Credit
Cards
|\n| LC | Loss
Cost
|\n| LPA | Long period
Average
|\n| LPC | Land Possession
Certificate
|\n| MIS | Management Information
System
|\n| MNCFC | Mahalanobis National Crop Forecast
Centre |\n| MOA&FW |
Ministry of Agriculture and Farmers
Welfare |'
- source_sentence: How can we identify outliers in crop yield?
sentences:
- >-
'(i) It will closely monitor and review the progress of FPO development
and functioning by holding its regular meetings. (ii) It will suggest
the potential produce clusters in the district (where FPOs can be
formed & promoted) to N-PMAFSC and will also assist Implementing
Agencies, CBBOs and other stakeholders in identification of cluster(s)
and activity (ies) and also in mobilization of farmers. (iii) It
will resolve the financial constraints of FPOs through District Level
Bankers' Committee and provide feedback to N-PMAFSC. (iv) It will
identify the constraints in implementation of scheme at the ground level
and communicate the same to State Level Consultative Committee for
further taking up the matter with DAC&FW and N-PMAFSC for appropriate
policy decision. (v) Any other matter so decided by the committee in
the interest of the scheme and farmers. 14.4 Close and effective
monitoring has been considered a major trigger for success of this
scheme. Therefore, in addition to three tiered afore-stated structured
mechanism for monitoring of the scheme, there shall be continuous
in-house monitoring by DAC&FW and by the Implementing Agencies also. The
DAC&FW may utilize the services of Directorate of Marketing & Inspection
(DMI), which has existence through its Regional & Sub-Offices across the
country. For effective monitoring, DAC&FW may engage consultants also
and cost for same will be borne from the budget of the scheme itself.'
- >-
'Under CGF, NABARD and NCDC, as the case may be, shall cover: i.
Fund based Credit facilities already sanctioned / extended within six
months from the date of the application for the Guarantee Cover or
intended to be extended singly or jointly by one or more than one
Eligible Lending Institution (ELI) to a single eligible FPO borrower by
way of term loan and/or working capital/composite credit facilities
without any collateral security and/or third party guarantees. ii. The
ELI can extend credit without any limit; however, the Guarantee Cover
shall be limited to the maximum guarantee cover specified under the
Scheme. iii. Non-Banking Financial Companies (NBFCs) and such other
Financial Institutions (FIs) with required net worth, track record
and rating of AAA may also be accommodated as Eligible Lending
Institutions (ELIs), such NBFC should on-ward lend to FPOs with a
moderate spread between their cost of capital and lending rate.'
- >-
'Identification of Outliers: All these above analyses can be used to
check whether there was any reason for yield deviation as presented in
the CCE data. Then a yield proxy map may be prepared. The Yield proxy
map can be derived from remote sensing vegetation indices (single or
combination of indices), crop simulation model output, or an integration
of various parameters, which are related to crop yield, such as soil,
weather (gridded), satellite based products, etc. Whatever, yield
proxies to be used, it is the responsibility of the organization to
record documentary evidence (from their or other's published work) that
the yield proxy is related to the particular crop's yield. Then the IU
level yields need to be overlaid on the yield proxy map. Both yield
proxy and CCE yield can be divided into 4-5 categories (e.g. Very good,
Good, Medium, Poor, Very poor). Wherever there is large mismatch between
yield proxy and the CCE yield (more than 2 levels), the CCE yield for
that IU can be considered, as outliers.'
pipeline_tag: sentence-similarity
model-index:
- name: SentenceTransformer based on BAAI/bge-small-en-v1.5
results:
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: val evaluator
type: val_evaluator
metrics:
- type: cosine_accuracy@1
value: 0.48
name: Cosine Accuracy@1
- type: cosine_accuracy@5
value: 0.88
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.95
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.48
name: Cosine Precision@1
- type: cosine_precision@5
value: 0.17599999999999993
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.09499999999999999
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.48
name: Cosine Recall@1
- type: cosine_recall@5
value: 0.88
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.95
name: Cosine Recall@10
- type: cosine_ndcg@5
value: 0.7065704999222873
name: Cosine Ndcg@5
- type: cosine_ndcg@10
value: 0.7288298734374183
name: Cosine Ndcg@10
- type: cosine_ndcg@100
value: 0.7407135931762043
name: Cosine Ndcg@100
- type: cosine_mrr@5
value: 0.6475
name: Cosine Mrr@5
- type: cosine_mrr@10
value: 0.6564841269841272
name: Cosine Mrr@10
- type: cosine_mrr@100
value: 0.6595216619129662
name: Cosine Mrr@100
- type: cosine_map@100
value: 0.6595216619129661
name: Cosine Map@100
- type: dot_accuracy@1
value: 0.48
name: Dot Accuracy@1
- type: dot_accuracy@5
value: 0.88
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 0.95
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.48
name: Dot Precision@1
- type: dot_precision@5
value: 0.17599999999999993
name: Dot Precision@5
- type: dot_precision@10
value: 0.09499999999999999
name: Dot Precision@10
- type: dot_recall@1
value: 0.48
name: Dot Recall@1
- type: dot_recall@5
value: 0.88
name: Dot Recall@5
- type: dot_recall@10
value: 0.95
name: Dot Recall@10
- type: dot_ndcg@5
value: 0.7065704999222873
name: Dot Ndcg@5
- type: dot_ndcg@10
value: 0.7288298734374183
name: Dot Ndcg@10
- type: dot_ndcg@100
value: 0.7407135931762043
name: Dot Ndcg@100
- type: dot_mrr@5
value: 0.6475
name: Dot Mrr@5
- type: dot_mrr@10
value: 0.6564841269841272
name: Dot Mrr@10
- type: dot_mrr@100
value: 0.6595216619129662
name: Dot Mrr@100
- type: dot_map@100
value: 0.6595216619129661
name: Dot Map@100
SentenceTransformer based on BAAI/bge-small-en-v1.5
This is a sentence-transformers model finetuned from BAAI/bge-small-en-v1.5. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: BAAI/bge-small-en-v1.5
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 384 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
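The architecture above uses CLS-token pooling followed by a Normalize() layer. A minimal sketch of what that implies, using toy token vectors rather than real BertModel output (the helper names `cls_pool`, `l2_normalize`, `dot`, and `cosine` are illustrative, not part of the library): once embeddings are unit length, the dot product and cosine similarity coincide, which is consistent with the identical cosine_* and dot_* scores reported in this card.

```python
import math

def cls_pool(token_embeddings):
    """CLS pooling: keep only the first token's vector."""
    return token_embeddings[0]

def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity computed from scratch."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Toy token embeddings (3 tokens x 4 dims) standing in for transformer output.
tokens_a = [[1.0, 2.0, 2.0, 0.0], [9.0, 9.0, 9.0, 9.0], [0.5, 0.1, 0.0, 0.3]]
tokens_b = [[2.0, 0.0, 1.0, 2.0], [7.0, 7.0, 7.0, 7.0], [0.2, 0.4, 0.1, 0.0]]

emb_a = l2_normalize(cls_pool(tokens_a))
emb_b = l2_normalize(cls_pool(tokens_b))

# After normalization the two similarity functions agree.
print(round(dot(emb_a, emb_b), 6), round(cosine(emb_a, emb_b), 6))
```

Only the first token of each toy sequence affects the result; the other tokens are ignored by CLS pooling.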
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("smokxy/embedding_finetuned")
# Run inference
sentences = [
'How can we identify outliers in crop yield?',
"'Identification of Outliers: All these above analyses can be used to check whether there was any reason for yield deviation as presented in the CCE data. Then a yield proxy map may be prepared. The Yield proxy map can be derived from remote sensing vegetation indices (single or combination of indices), crop simulation model output, or an integration of various parameters, which are related to crop yield, such as soil, weather (gridded), satellite based products, etc. Whatever, yield proxies to be used, it is the responsibility of the organization to record documentary evidence (from their or other's published work) that the yield proxy is related to the particular crop's yield. Then the IU level yields need to be overlaid on the yield proxy map. Both yield proxy and CCE yield can be divided into 4-5 categories (e.g. Very good, Good, Medium, Poor, Very poor). Wherever there is large mismatch between yield proxy and the CCE yield (more than 2 levels), the CCE yield for that IU can be considered, as outliers.'",
"'Under CGF, NABARD and NCDC, as the case may be, shall cover: i. Fund based Credit facilities already sanctioned / extended within six months from the date of the application for the Guarantee Cover or intended to be extended singly or jointly by one or more than one Eligible Lending Institution (ELI) to a single eligible FPO borrower by way of term loan and/or working capital/composite credit facilities without any collateral security and/or third party guarantees. ii. The ELI can extend credit without any limit; however, the Guarantee Cover shall be limited to the maximum guarantee cover specified under the Scheme. iii. Non-Banking Financial Companies (NBFCs) and such other Financial Institutions (FIs) with required net worth, track record and rating of AAA may also be accommodated as Eligible Lending Institutions (ELIs), such NBFC should on-ward lend to FPOs with a moderate spread between their cost of capital and lending rate.'",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
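For retrieval, the similarity scores are typically used to rank candidate passages against a query. The sketch below shows that ranking step in isolation. It assumes unit-length embeddings and uses toy 3-dimensional vectors in place of `model.encode(...)` output; `rank_passages` is a hypothetical helper, not a library function.

```python
def rank_passages(query_emb, passage_embs, top_k=3):
    """Return (index, score) pairs for the top_k passages by dot product."""
    scores = [sum(q * p for q, p in zip(query_emb, emb)) for emb in passage_embs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order[:top_k]]

# Toy unit-length vectors standing in for real 384-dimensional embeddings.
query = [1.0, 0.0, 0.0]
passages = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [0.8, 0.6, 0.0],  # closest to the query
    [0.6, 0.0, 0.8],  # somewhat related
]

top = rank_passages(query, passages, top_k=2)
print(top)
```

With the real model, the query and the passages would each be encoded with `model.encode` before being passed to a function like this.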
Evaluation
Metrics
Information Retrieval
- Dataset: val_evaluator
- Evaluated with InformationRetrievalEvaluator
Metric | Value |
---|---|
cosine_accuracy@1 | 0.48 |
cosine_accuracy@5 | 0.88 |
cosine_accuracy@10 | 0.95 |
cosine_precision@1 | 0.48 |
cosine_precision@5 | 0.176 |
cosine_precision@10 | 0.095 |
cosine_recall@1 | 0.48 |
cosine_recall@5 | 0.88 |
cosine_recall@10 | 0.95 |
cosine_ndcg@5 | 0.7066 |
cosine_ndcg@10 | 0.7288 |
cosine_ndcg@100 | 0.7407 |
cosine_mrr@5 | 0.6475 |
cosine_mrr@10 | 0.6565 |
cosine_mrr@100 | 0.6595 |
cosine_map@100 | 0.6595 |
dot_accuracy@1 | 0.48 |
dot_accuracy@5 | 0.88 |
dot_accuracy@10 | 0.95 |
dot_precision@1 | 0.48 |
dot_precision@5 | 0.176 |
dot_precision@10 | 0.095 |
dot_recall@1 | 0.48 |
dot_recall@5 | 0.88 |
dot_recall@10 | 0.95 |
dot_ndcg@5 | 0.7066 |
dot_ndcg@10 | 0.7288 |
dot_ndcg@100 | 0.7407 |
dot_mrr@5 | 0.6475 |
dot_mrr@10 | 0.6565 |
dot_mrr@100 | 0.6595 |
dot_map@100 | 0.6595 |
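The table shows recall@k equal to accuracy@k, and precision@k equal to accuracy@k divided by k (0.88 / 5 = 0.176, 0.95 / 10 = 0.095). That pattern is what one would expect if each query has exactly one relevant passage; this is an inference from the numbers, not something the card states. Under that assumption, the metrics can be sketched from the rank of the relevant passage per query. The function `ir_metrics` and the example ranks are hypothetical.

```python
def ir_metrics(ranks, k):
    """Compute @k metrics.

    ranks: 1-based rank of the single relevant passage for each query,
           or None if it was not retrieved at all.
    """
    n = len(ranks)
    hits = sum(1 for r in ranks if r is not None and r <= k)
    accuracy = hits / n         # relevant passage appears in the top k
    recall = hits / n           # one relevant passage per query, so same value
    precision = hits / (n * k)  # at most one hit among k retrieved slots
    mrr = sum(1.0 / r for r in ranks if r is not None and r <= k) / n
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "mrr": mrr}

# Hypothetical ranks for four queries (not taken from the evaluation set).
ranks = [1, 3, 7, None]
m5 = ir_metrics(ranks, 5)
print(m5)
```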
Training Details
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- gradient_accumulation_steps: 4
- learning_rate: 1e-05
- weight_decay: 0.01
- num_train_epochs: 1.0
- warmup_ratio: 0.1
- load_best_model_at_end: True
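To make the learning_rate and warmup_ratio settings concrete, here is a rough sketch of a linear warmup followed by linear decay, which is what the linear lr_scheduler_type in the full hyperparameter list suggests. The total of 28 optimizer steps is taken from the last row of the training log. The function `linear_warmup_lr` is illustrative, and rounding the warmup length up is an assumption about how the trainer handles a fractional warmup.

```python
import math

def linear_warmup_lr(step, base_lr, total_steps, warmup_ratio):
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)  # assumption: rounded up
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

BASE_LR = 1e-05     # learning_rate
WARMUP_RATIO = 0.1  # warmup_ratio
TOTAL_STEPS = 28    # assumption: last step shown in the training log

for step in (0, 1, 3, 15, 28):
    print(step, linear_warmup_lr(step, BASE_LR, TOTAL_STEPS, WARMUP_RATIO))
```

Under these assumptions the rate peaks at 1e-05 after three steps and falls to zero at step 28.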
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 8
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 4
- eval_accumulation_steps: None
- learning_rate: 1e-05
- weight_decay: 0.01
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1.0
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | loss | val_evaluator_cosine_map@100 |
---|---|---|---|---|
0.531 | 15 | 0.4478 | 0.0912 | 0.6595 |
0.9912 | 28 | - | 0.0912 | 0.6595 |
- The bold row denotes the saved checkpoint.
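The Epoch and Step columns can be cross-checked against the batch settings with a little arithmetic. The sketch below assumes single-device training (the card does not say how many devices were used) and takes 900 training samples from the dataset_size tag in the card metadata.

```python
import math

DATASET_SIZE = 900    # dataset_size tag in the card metadata
PER_DEVICE_BATCH = 8  # per_device_train_batch_size
GRAD_ACCUM = 4        # gradient_accumulation_steps
NUM_DEVICES = 1       # assumption: single-device training

# Mini-batches per epoch; the last batch is partial (dataloader_drop_last is False).
batches_per_epoch = math.ceil(DATASET_SIZE / (PER_DEVICE_BATCH * NUM_DEVICES))
# One optimizer step per GRAD_ACCUM mini-batches.
optimizer_steps = batches_per_epoch // GRAD_ACCUM
effective_batch = PER_DEVICE_BATCH * NUM_DEVICES * GRAD_ACCUM

def epoch_at(step):
    """Fraction of an epoch completed after `step` optimizer steps."""
    return step * GRAD_ACCUM / batches_per_epoch

print(batches_per_epoch, optimizer_steps, effective_batch)
print(round(epoch_at(15), 4), round(epoch_at(28), 4))
```

Under these assumptions the script gives 113 mini-batches, 28 optimizer steps, and an effective batch size of 32, and the epoch fractions at steps 15 and 28 round to the 0.531 and 0.9912 shown in the table.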
Framework Versions
- Python: 3.10.14
- Sentence Transformers: 3.0.1
- Transformers: 4.41.1
- PyTorch: 2.3.0+cu121
- Accelerate: 0.27.2
- Datasets: 2.19.1
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
GISTEmbedLoss
@misc{solatorio2024gistembed,
title={GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning},
author={Aivin V. Solatorio},
year={2024},
eprint={2402.16829},
archivePrefix={arXiv},
primaryClass={cs.LG}
}