embedding_finetuned / README.md
metadata
language: []
library_name: sentence-transformers
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:900
  - loss:GISTEmbedLoss
base_model: BAAI/bge-small-en-v1.5
datasets: []
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@5
  - cosine_ndcg@10
  - cosine_ndcg@100
  - cosine_mrr@5
  - cosine_mrr@10
  - cosine_mrr@100
  - cosine_map@100
  - dot_accuracy@1
  - dot_accuracy@5
  - dot_accuracy@10
  - dot_precision@1
  - dot_precision@5
  - dot_precision@10
  - dot_recall@1
  - dot_recall@5
  - dot_recall@10
  - dot_ndcg@5
  - dot_ndcg@10
  - dot_ndcg@100
  - dot_mrr@5
  - dot_mrr@10
  - dot_mrr@100
  - dot_map@100
widget:
  - source_sentence: What will be used for the identification of beneficiaries?
    sentences:
      - >-
        'for loanee  and within 30 days for non loanee i.e. 15th Aug  for Kharif
        and 15th  Jan for Rabi for loanee and  31st Aug for Kharif and 31st  Jan
        for Rabi for  Non Loanee  13  Within  7 days  from the date of
        intimation by ICs  CSCs/Banks/ Intermediary  Cut-off date for
        CSCs/Banks/Intermediary to correct/update the  paid application
        intimated by ICs on Crop Insurance Portal  14  Cut-off date for Insurer
        to accept the corrected/updated applications  Within  7 days  from the
        date of submission of correction/updation by the Bank/CSC   Insurance
        Companies  15  Within 7 days from acceptance of proposal by concerned
        Insurance Company on Portal  Cut-off date for Banks/ICs to hand over
        insurance acknowledgement receipt along with folio to the insured
        farmer  Banks/ICs for enrolment through their intermediaries  16  Cut
        off date for processing of applications by ICs and auto approval of
        application of insured farmers on crop insurance Portal  60 days from
        the cut off date for enrolment/debit of premium from farmers i.e.  15th
        September  for Kharif and 15th February for  Rabi seasons  17  Before
        cut off date of enrolment of farmers  Insurance Companies/GOI /State  
        Cut off date for raising bills/requisitions with supporting documents
        for releasing of advance premium subsidy based on 50% of  80% of
        respective share of Centre/State in corresponding previous season  18 
        Release of advance upfront premium subsidy  (First Instalment)i.e. 50%
        of 80% of respective share of Centre/State in corresponding  previous
        season  Within 15days of cut off date of enrolment of farmers i.e.
        31st   July for Kharif   Upto  15th August*  19  *state may fix earlier
        dates  for early Kharif crops  Training and registration of field level
        workers assigned for conduct of CCEs and reporting of  the same on crop
        insurance Portal through smart phones/CCE Agri App  Upto31st August*
        *state  20  Registration of mobile number of representative of ICs for
        co-witnessing of CCEs  may fix earlier dates for early  Kharif crops  At
        least 7 days before tentative date  for conducting CCEs  21  a)
        Uploading of tentative schedule/date for conducting CCEs (crop-wise/IU
        wise) followed  by SMS on one day notice through CCEs app.'
      - >-
        '8  Banks/PACS/CSC/ insurance  agent/online enrolment by farmers etc. 
        Upto last date of enrolment of farmers   as notified by States for
        notified crop(s) or   up to  15th July* for  Kharif season  Upto last
        date of enrolment of farmers   as notified by States for notified
        crop(s) or   up to  15th December*  for Rabi season  Cut-off date for
        receipt of Applications of farmers/debit of premium from farmers account
        (loanee and non- loanee) by all  stakeholders including banks/PACS/CSC/ 
        insurance agent/online enrolment by farmers etc. Note: *This is
        indicative only and district wise crop calendar will be the final basis
        to arrive at cut off date  9  Declaration of Prevented sowing  State
        Govt./ IC  Strictly within 15 days from cut off date for  enrolment of
        farmers i.e 31st  July for Kharif and  31st Dec for Rabi  10 
        Banks/Portal  Within 15 days  of cut-off date for enrolment of
        farmers/debit of premium for  both loanee  and  non- loanee farmers i.e.
        31st  July for Kharif and  31st Dec for Rabi  Cut-off date for
        electronic remittance of  premium  along with  consolidated Declarations
        to respective Insurance Company and uploading of details of individual
        covered farmers on crop insurance Portal by Bank branches (CBs/
        RRBs/DCCBs/PACs), followed by  SMS to all insured farmers from Portal 
        11  Within 48 Hours of receipt of application & premium.  Insurance
        companies and their agents  Cut-off date for electronic remittance of
        farmer premium to Insurance Companies for farmers covered on Voluntary
        basis by designated insurance Agent(s) and uploading of details of
        individual covered farmers on crop insurance Portal.  Insurance
        Companies  12  Cut-off date for Insurer to accept or reject the farmer's
        data on Portal  Within 15 days from the cut-off date for uploading of
        data/information by Banks/PACS/CSC/Agent respectively.'
      - >-
        ' This consent of the beneficiary should be \'to agree that the
        department responsible for implementation of the Pradhan Mantri Kisan
        Samman Nidhi Yojana in Union Government or the State Government / Union
        Territory Administration can use the beneficiary Aadhaar number and
        other information provided in the declaration to verify the eligibility
        of the beneficiary for scheme as per extant scheme guidelines with the
        concerned agencies\'. 6.3  The existing land-ownership system in the
        concerned State / UT will be used for identification of beneficiaries.
        Accordingly, it is of utmost importance that the land records are clear
        and updated. Further, State / UT Governments would also expedite    the
        progress of digitization of the land records and linking the same with
        Aadhaar as well as bank details of the beneficiaries. 6.4  The lists of
        eligible beneficiaries would be published at the village level.
        Farmers'  families who are eligible but have been excluded should be
        provided an opportunity to represent their case.'
  - source_sentence: What are weather parameters related to crop condition?
    sentences:
      - >-
        'i.  \'Credit Facility\' means any  fund based credit facility  extended
        by an Eligible  Lending Institution (ELI) to an Eligible Borrower 
        without any Collateral Security or Third Party Guarantee ;  ii. 
        \'Credit Guarantee Fund\' means the Credit Guarantee Fund for FPOs
        created  with NABARD and NCDC  respectively under the Scheme with
        matching grant from DAC&FW for the purpose of extending guarantee to the
        eligible lending institution(s) against their collateral free lending
        to  eligible FPOs;  iii.  \'Eligible Lending Institution (ELI)\' means a
        Scheduled Commercial Bank for  the time being included in the second
        Schedule to the Reserve Bank of India Act, 1934, Regional Rural Banks,
        Co-operative Banks, Cooperative Credit  Society, NEDFI, or any other
        institution (s) as may be decided by the NABARD and/or NCDC, as the case
        may be, in consultation with Government of India from time to time.
        NABARD and NCDC can also finance, if they so desire with the approval of
        DAC&FW/N-PMFSC. NBFCs and such other financing institutions with
        required net worth and track record may also serve as Eligible  Lending
        Institutions (ELIs), for lending to FPOs with a moderate spread between
        their cost of capital and lending rate. However, Standard Financial 
        Sector Rating Agency should have rated NBFC **to be AAA** to be
        considered as  ELI;  iv.  \'Guarantee Cover\' means maximum cover
        available per eligible FPO  borrower;  v.  \'Guarantee Fee\' means the
        onetime fee at a specified rate of the eligible  credit facility
        sanctioned by the ELI, payable by the ELI to NABARD or NCDC, as the case
        may be; and  vi.'
      - >-
        '| Table
        No.                                                                           
        |
        Topic                                                                      
        |\n|--------------------------------------------------------------------------------------|-----------------------------------------------------------------------------|\n|
        Para                                                                                
        |                                                                            
        |\n|
        Number                                                                              
        |                                                                            
        |\n| Table
        1                                                                             
        | Premium rate payable by the
        farmer                                          |\n| Table
        2                                                                             
        | Seasonality
        Discipline                                                      |\n|
        Table
        3                                                                             
        | Calculation of crop-wise Sum
        Insured                                        |\n| Table
        4                                                                             
        | Number of CCEs to be conducted at IU
        level                                  |\n| Table
        5                                                                             
        | Yield Calculation for multi-picking
        crop                                    |\n| Table
        6                                                                             
        |                                                                            
        |\n| Types of resolution to be used for vegetation index derivation for
        different        
        |                                                                            
        |\n| levels of
        analysis                                                                  
        |                                                                            
        |\n|
        19.7                                                                                
        |                                                                            
        |\n| Table
        7                                                                             
        | Calculation of Threshold
        Yield                                              |\n| Table
        8                                                                             
        |                                                                            
        |\n| Detailed Procedure for On Account Payment of Claims due to
        Mid-Season               
        |                                                                            
        |\n|
        Adversity                                                                           
        |                                                                            
        |\n|
        21.2.7.2                                                                            
        |                                                                            
        |\n| Table
        9                                                                             
        |                                                                            
        |\n| Detailed Procedure and Timelines for payment of Prevented / Failed
        Sowing           
        |                                                                            
        |\n| and Prevented Planting / Germination
        Claims                                         
        |                                                                            
        |\n|
        21.3.6.1                                                                            
        |                                                                            
        |\n| Table
        10                                                                            
        |                                                                            
        |\n| Detailed Procedure and Timelines for payment
        of                                     
        |                                                                            
        |\n| Post Harvest
        Loss                                                                   
        |                                                                            
        |\n|                                                                                     
        |                                                                            
        |\n|
        21.4.8.1                                                                            
        |                                                                            
        |\n| Table
        11                                                                            
        |                                                                            
        |\n| Detailed Procedure and Timelines for payment
        of                                     
        |                                                                            
        |\n| localized
        claims                                                                    
        |                                                                            
        |\n|                                                                                     
        |                                                                            
        |\n|
        21.5.9.1                                                                            
        |                                                                            
        |\n| Table
        12                                                                            
        | Criteria for Performance Evaluation and Ranking of Loss Assessment
        Agencies |\n| Table
        13                                                                            
        |                                                                            
        |\n| Illustration for classification of risks, clustering/clubbing of
        risks and districts
        |                                                                            
        |\n| and determination of L1
        bidder                                                      
        |                                                                            
        |\n| Annexure
        1                                                                          
        |                                                                            
        |\n| Table
        14                                                                            
        |                                                                            
        |\n| Penalty and performance Monitoring/Evaluation of Empanelled
        Insurance               
        |                                                                            
        |\n|
        Companies                                                                           
        |                                                                            
        |\n| Annexure
        2                                                                          
        |                                                                            
        |\n| Table
        15                                                                            
        | Illustrative Evaluation of Efficiency of Nodal Department of
        State          |'
      - >-
        'gridded or satellite based data, Dry-spell Occurrence, Temperature
        Anomaly, Soil Moisture Analysis  (either from satellite or model data)
        and any other available weather parameters (related to crop condition),
        either from ground or satellite data. Another rainfall based index,
        which can also be studied, is Standardized Precipitation Index (SPI).
        SPI can be developed using IMD gridded rainfall data or NOAA CPC
        rainfall data. However, while using gridded data (either from satellite
        or ground stations), appropriate resolution should be used at
        appropriate level. For example, 0.25 degree data should be used only at
        district level and not at block/village level. High resolution weather
        data can also be sourced from weather companies.'
  - source_sentence: >-
      How is the amount realized by the ELI from the FPC shared between SFAC and
      ELI?
    sentences:
      - >-
        'To substantiate the fact, the most successful example is of dairy
        co-operative in India where professional managers have contributed
        immensely to make it a success. There are other so many examples which
        prove the absolute requirement of professional managers. The number of
        professional staff could depend on geographical spread of business
        operation, diversity of activities and volume of business. However, an
        FPO should have minimum a CEO/Manager and an Accountant. Accountant is
        required in FPO to look after its day to day accounting work. Based on
        requirement, FPO can engage other staff also.   10.3 The CEO/Manager is
        to be appointed by the executive body of the FPO who  should be either
        graduate in agriculture / agriculture marketing / agri-business
        management or BBA or equivalent. Locally available professionals with
        10+2 and  preferably diploma in agriculture / agriculture marketing /
        agri-business management or in such other related areas may be
        preferable. The accountant should have educational qualification of 10+2
        with Mathematics as a compulsory subject or alternatively with  Commerce
        or Accountancy background. If any members of the FPO meet the above
        criteria, they may be considered preferably in the selection process. 
        10.4 Under the scheme, financial support towards salary of CEO/Manager
        up to         @ Rs. 25,000/- per month and of Accountant up to @
        Rs.10,000/- per month with annual increment up to 5% is to be provided
        from the earmarked financial support for first 3 years only. Thereafter,
        FPOs will manage from their own resources to pay the salary of
        CEO/Manager and Accountant. In order to create  interest of good
        professional activities of CEO/Accountant, the FPO may also offer higher
        payment with their own sources of funds on above of Govt. support. One
        CEO will provide full time services to one FPO at a time only.'
      - >-
        '7.3.1  Three levels of Indemnity, viz., 70%, 80% and 90%. SLCCCI in
        consultation with Insurance Companies  shall approve indemnity levels
        for notified crops at district level. Threshold Yield (TY) shall be
        notified  in the Tender for the current season and the same will be used
        for claim calculation for that season. The Average Yield of a notified
        crop in Insurance Unit (IU) will be average yield of best five years out
        of  last seven years. The Threshold yield of the notified crop is equal
        to Average Yield multiplied by  Indemnity level. The Threshold Yield for
        any crop and IU shall compulsorily be part of the notification for the
        season and shall not change at any point during that season.   7.3.2 
        Calculation and Notification of Threshold Yield: For calculation of
        Threshold Yield, historical average  yield of best five out of last
        seven years shall be considered. Further Threshold Yield should be
        defined  only at notified area level and once notified in the
        Notification issued by the State should not be  changed at later stage
        under any circumstances. In case of multi-year contract, the Threshold
        Yield for  the subsequent years shall be revised by adding/considering
        the yields of immediate previous corresponding season. The revised TY
        and Sum Insured (if revised) should be notified accordingly at the
        beginning of each crop season in case of multi-year tender.'
      - >-
        'a. Only such Claim as is submitted by the ELI within a maximum period
        of one year  from date of NPA or as specified by SFAC from time to time,
        shall be considered by SFAC.  b. On receiving a claim, the I&CSC shall
        review in detail the reasons for the Default. The  Committee reserves
        the right to reject any proposal where the Guidelines have not been
        strictly followed or if any misrepresentation or concealment of facts is
        found leading to undue favour to the concerned FPC.  c. SFAC shall
        honour 75 per cent of the Guaranteed Amount in Default subject to a 
        maximum of 75 per cent of the guaranteed cap amount, on submission of
        claim by the ELI where appropriate action for recovery has been
        initiated. The balance 25 per cent of the default or guaranteed cap
        amount, as the case may be, shall be paid on conclusion of recovery
        proceedings by ELI.  d. SFAC shall pay claims found in order and
        complete in all respects, within 90 days.  e. The outstanding dues of
        the FPC to ELI shall be reduced to the extent of the claim  amount
        settled by SFAC.  f.  The ELI shall continue to make efforts to realise
        the balance amount due from the g. defaulting FPC even after settlement
        of the Guarantee. h. SFAC has the right to claim from the ELI any amount
        that is realised by the ELI from  the defaulting FPC even after
        settlement of the guarantee amount.  i. Any amount realized by the ELI
        from the FPC shall be shared in the ratio of 85%:15%  between SFAC &
        ELI.'
  - source_sentence: What directions is the lending institution bound to comply with?
    sentences:
      - >-
        ' The I&CSC shall regularly keep the NABARD and NCDC, as the case may
        be,   informed about all major decisions and actions taken by it in this
        regard, and shall work under its overall control and guidance with
        regard to the Fund and the Scheme. Simultaneously, the NABARD and NCDC
        will keep the N- PMAFSC apprised about such decisions and actions taken
        in this regard.  v.  Decisions regarding Guarantee Pay-Outs shall be the
        primary responsibility of I&CSC which shall meet at least once every
        quarter or as often as necessary.  vi.   The ELI may invoke the
        Guarantee in respect of Credit Facility within a maximum   period of one
        year from the date of NPA, if the conditions set out by NABARD or NCDC,
        as the case may be , are met out .'
      - >-
        ' The lending institution shall be bound to comply with such directions
        as NABARD or NCDC, as the case may be, may deem fit to issue from time
        to time, for facilitating recoveries of the guaranteed account, or
        safeguarding its interest as a guarantor.   ix.  The lending institution
        shall, in particular, refrain from any act either before or subsequent
        to invocation of guarantee, which may adversely affect the interest of
        NABARD or NCDC, as the case may be, as the guarantor.   x.  The lending
        institution shall be bound under the Scheme to intimate in advance to
        NABARD or NCDC, as the case may be, its intention to enter into any
        compromise or arrangement, which may have effect of discharge or waiver
        of primary security.   xi.  Further, the lending institution shall
        secure for NABARD or NCDC, as the case  may be,  or its appointed
        agency, through a stipulation in an Agreement with the Borrower or
        otherwise, the right to list the defaulted Borrowers' names and
        particulars on the Website of NABARD or NCDC, as the case may be or
        Integrated Portal'
      - >-
        '| AIC    |  Agricultural Insurance Company of India
        Ltd                                                
        |\n|--------|----------------------------------------------------------------------------------------------|\n|
        ACF    | Area Correction
        Factor                                                                      
        |\n| APR    | Actuarial Premium
        Rate                                                                      
        |\n| ARG    | Automatic Rain
        Gauge                                                                        
        |\n| AWS    | Automatic Weather
        Stations                                                                  
        |\n| AY     | Actual
        Yield                                                                                
        |\n| CB     | Commercial
        Banks                                                                            
        |\n| CBS    | Core Banking
        Solution                                                                       
        |\n| CCAFS  | Research program on Climate Change, Agriculture and Food
        Security                            |\n| CCEs   | Crop Cutting
        Experiments                                                                    
        |\n| CPMU   | Central Program Management
        Unit                                                              |\n|
        CSC    | Common Service
        Center                                                                       
        |\n| CSO    | Central Statistical
        Office                                                                  
        |\n| CV     | Co-efficient of
        Variance                                                                    
        |\n| DAC&FW | Department of Agriculture, Cooperation and Farmers
        Welfare                                   |\n| DBT    | Direct Benefit
        Transfer                                                                     
        |\n| DCCBs  | District Central Cooperative
        Banks                                                           |\n|
        DLMC   | District Level Monitoring
        Committee                                                          |\n|
        DLTC   | District Level Technical
        Committee                                                           |\n|
        ESI    | Expected Sum
        Insured                                                                        
        |\n| FASAL  | Forecasting Agricultural output using Space, Agro
        meteorological and Land based observations |\n| FIs    | Financial
        Institutions                                                                      
        |\n| GIC Re | General Insurance Corporation of
        India                                                       |\n| GFR   
        | General Financial
        Rule                                                                      
        |\n| GIS    | Geographic Information
        System                                                               
        |\n| GPS    | Global Positioning
        System                                                                   
        |\n| IA     | Implementing
        Agency                                                                         
        |\n| IC     | Insurance
        Company                                                                           
        |\n| IASRI  | Indian Agricultural Statistical Research
        Institute                                           |\n| IFPRI  |
        International Food Policy Research
        Institute                                                 |\n| IMD    |
        Indian Meteorological
        Department                                                            
        |\n| IRRI   | International Rice Research
        Institute                                                        |\n|
        IRDAI  | Insurance Regulatory and Development Authority of
        India                                      |\n| ISRO   | Indian Space
        Research
        Organisation                                                          
        |\n| ISS    | Interest Subvention
        Scheme                                                                  
        |\n| IT     | Information
        Technology                                                                      
        |\n| IU     | Insurance
        Unit                                                                              
        |\n| KCC    | Kisan Credit
        Cards                                                                          
        |\n| LC     | Loss
        Cost                                                                                   
        |\n| LPA    | Long period
        Average                                                                         
        |\n| LPC    | Land Possession
        Certificate                                                                 
        |\n| MIS    | Management Information
        System                                                               
        |\n| MNCFC  | Mahalanobis National Crop Forecast
        Centre                                                    |\n| MOA&FW |
        Ministry of Agriculture and Farmers
        Welfare                                                  |'
  - source_sentence: How can we identify outliers in crop yield?
    sentences:
      - >-
        '(i) It will closely monitor and review the progress of FPO development
        and  functioning by holding its regular meetings.  (ii)  It will suggest
        the potential produce clusters in the district (where FPOs can be 
        formed & promoted) to N-PMAFSC and will also assist Implementing
        Agencies, CBBOs and other stakeholders in identification of cluster(s) 
        and  activity (ies)   and also in mobilization of farmers.  (iii)  It
        will resolve the financial constraints of FPOs through District Level
        Bankers'  Committee and provide feedback to N-PMAFSC.  (iv) It will
        identify the constraints in implementation of scheme at the ground level
        and  communicate the same to State Level Consultative Committee for
        further taking  up the matter with DAC&FW and N-PMAFSC for appropriate
        policy decision.   (v) Any other matter so decided by the committee in
        the interest of the scheme and  farmers.  14.4 Close and effective
        monitoring has been considered a major trigger for success  of this
        scheme. Therefore, in addition to three tiered afore-stated structured
        mechanism for monitoring of the scheme, there shall be continuous
        in-house monitoring by DAC&FW and by the Implementing Agencies also. The
        DAC&FW may utilize the services of Directorate of Marketing & Inspection
        (DMI), which has existence through its Regional & Sub-Offices across the
        country. For effective monitoring, DAC&FW may engage consultants also
        and cost for same will be borne from the budget of the scheme itself.'
      - >-
        'Under CGF, NABARD and NCDC, as the case may be, shall cover:   i.  
        Fund based Credit facilities already sanctioned / extended within six
        months  from the date of the application for the Guarantee Cover or
        intended to be extended singly or jointly by one or more than one
        Eligible Lending Institution  (ELI) to a single eligible FPO borrower by
        way of term loan and/or working capital/composite credit facilities
        without any collateral security and/or third party guarantees.  ii. The
        ELI can extend credit without any limit; however, the Guarantee Cover
        shall  be limited to the maximum guarantee cover specified under the
        Scheme.    iii. Non-Banking Financial Companies (NBFCs) and such other
        Financial   Institutions (FIs) with required net worth, track record 
        and rating  of AAA  may also be accommodated as Eligible Lending
        Institutions (ELIs), such NBFC should on-ward lend to FPOs with a
        moderate spread between their cost of capital and lending rate.'
      - >-
        'Identification of Outliers: All these above analyses can be used to
        check whether there was any reason for yield deviation as presented in
        the CCE data. Then a yield proxy map may be prepared.  The Yield proxy
        map can be derived from remote sensing vegetation indices (single or 
        combination of indices), crop simulation model output, or an integration
        of various parameters, which are related to crop yield, such as soil,
        weather (gridded), satellite based products, etc. Whatever, yield
        proxies to be used, it is the responsibility of the organization to
        record  documentary evidence (from their or other's published work) that
        the yield proxy is related to the particular crop's yield.  Then the IU
        level yields need to be overlaid on the yield proxy map. Both  yield
        proxy and CCE yield can be divided into 4-5 categories (e.g. Very good,
        Good, Medium, Poor, Very poor). Wherever there is large mismatch between
        yield proxy and the CCE yield (more than 2 levels), the CCE yield for
        that IU can be considered, as outliers.'
pipeline_tag: sentence-similarity
model-index:
  - name: SentenceTransformer based on BAAI/bge-small-en-v1.5
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: val evaluator
          type: val_evaluator
        metrics:
          - type: cosine_accuracy@1
            value: 0.48
            name: Cosine Accuracy@1
          - type: cosine_accuracy@5
            value: 0.88
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.95
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.48
            name: Cosine Precision@1
          - type: cosine_precision@5
            value: 0.17599999999999993
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.09499999999999999
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.48
            name: Cosine Recall@1
          - type: cosine_recall@5
            value: 0.88
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.95
            name: Cosine Recall@10
          - type: cosine_ndcg@5
            value: 0.7065704999222873
            name: Cosine Ndcg@5
          - type: cosine_ndcg@10
            value: 0.7288298734374183
            name: Cosine Ndcg@10
          - type: cosine_ndcg@100
            value: 0.7407135931762043
            name: Cosine Ndcg@100
          - type: cosine_mrr@5
            value: 0.6475
            name: Cosine Mrr@5
          - type: cosine_mrr@10
            value: 0.6564841269841272
            name: Cosine Mrr@10
          - type: cosine_mrr@100
            value: 0.6595216619129662
            name: Cosine Mrr@100
          - type: cosine_map@100
            value: 0.6595216619129661
            name: Cosine Map@100
          - type: dot_accuracy@1
            value: 0.48
            name: Dot Accuracy@1
          - type: dot_accuracy@5
            value: 0.88
            name: Dot Accuracy@5
          - type: dot_accuracy@10
            value: 0.95
            name: Dot Accuracy@10
          - type: dot_precision@1
            value: 0.48
            name: Dot Precision@1
          - type: dot_precision@5
            value: 0.17599999999999993
            name: Dot Precision@5
          - type: dot_precision@10
            value: 0.09499999999999999
            name: Dot Precision@10
          - type: dot_recall@1
            value: 0.48
            name: Dot Recall@1
          - type: dot_recall@5
            value: 0.88
            name: Dot Recall@5
          - type: dot_recall@10
            value: 0.95
            name: Dot Recall@10
          - type: dot_ndcg@5
            value: 0.7065704999222873
            name: Dot Ndcg@5
          - type: dot_ndcg@10
            value: 0.7288298734374183
            name: Dot Ndcg@10
          - type: dot_ndcg@100
            value: 0.7407135931762043
            name: Dot Ndcg@100
          - type: dot_mrr@5
            value: 0.6475
            name: Dot Mrr@5
          - type: dot_mrr@10
            value: 0.6564841269841272
            name: Dot Mrr@10
          - type: dot_mrr@100
            value: 0.6595216619129662
            name: Dot Mrr@100
          - type: dot_map@100
            value: 0.6595216619129661
            name: Dot Map@100

SentenceTransformer based on BAAI/bge-small-en-v1.5

This is a sentence-transformers model finetuned from BAAI/bge-small-en-v1.5. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-small-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
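
The module stack above takes the [CLS] token vector as the sentence embedding (`pooling_mode_cls_token: True`) and then L2-normalizes it. The following is a minimal pure-Python sketch of those two steps on toy 4-dimensional vectors, not the library's implementation. The helper names (`cls_pool`, `l2_normalize`, `dot`, `cosine`) and the numbers are illustrative assumptions; the real model produces 384-dimensional vectors. It shows why dot product and cosine similarity give the same score once embeddings are normalized:

```python
import math

def cls_pool(token_embeddings):
    """Use the first token's vector ([CLS]) as the sentence embedding."""
    return list(token_embeddings[0])

def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Toy token embeddings: 3 tokens x 4 dimensions per sentence (made-up values).
tokens_a = [[3.0, 4.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0], [0.0, 2.0, 0.0, 2.0]]
tokens_b = [[0.0, 5.0, 0.0, 12.0], [2.0, 0.0, 2.0, 0.0], [1.0, 0.0, 0.0, 1.0]]

emb_a = l2_normalize(cls_pool(tokens_a))
emb_b = l2_normalize(cls_pool(tokens_b))

# After normalization the dot product equals the cosine of the raw CLS vectors.
print(dot(emb_a, emb_b), cosine(tokens_a[0], tokens_b[0]))
```

This is consistent with the evaluation results in this card, where every `dot_*` metric matches its `cosine_*` counterpart.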

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("smokxy/embedding_finetuned")
# Run inference
sentences = [
    'How can we identify outliers in crop yield?',
    "'Identification of Outliers: All these above analyses can be used to check whether there was any reason for yield deviation as presented in the CCE data. Then a yield proxy map may be prepared.  The Yield proxy map can be derived from remote sensing vegetation indices (single or  combination of indices), crop simulation model output, or an integration of various parameters, which are related to crop yield, such as soil, weather (gridded), satellite based products, etc. Whatever, yield proxies to be used, it is the responsibility of the organization to record  documentary evidence (from their or other's published work) that the yield proxy is related to the particular crop's yield.  Then the IU level yields need to be overlaid on the yield proxy map. Both  yield proxy and CCE yield can be divided into 4-5 categories (e.g. Very good, Good, Medium, Poor, Very poor). Wherever there is large mismatch between yield proxy and the CCE yield (more than 2 levels), the CCE yield for that IU can be considered, as outliers.'",
    "'Under CGF, NABARD and NCDC, as the case may be, shall cover:   i.   Fund based Credit facilities already sanctioned / extended within six months  from the date of the application for the Guarantee Cover or intended to be extended singly or jointly by one or more than one Eligible Lending Institution  (ELI) to a single eligible FPO borrower by way of term loan and/or working capital/composite credit facilities without any collateral security and/or third party guarantees.  ii. The ELI can extend credit without any limit; however, the Guarantee Cover shall  be limited to the maximum guarantee cover specified under the Scheme.    iii. Non-Banking Financial Companies (NBFCs) and such other Financial   Institutions (FIs) with required net worth, track record  and rating  of AAA  may also be accommodated as Eligible Lending Institutions (ELIs), such NBFC should on-ward lend to FPOs with a moderate spread between their cost of capital and lending rate.'",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
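
For retrieval, the similarity scores between one query and several passages are typically sorted to pick the best matches. The sketch below shows only that ranking step in plain Python. `rank_passages`, the short passage labels, and the scores are hypothetical placeholders; with the real model the scores could be taken from one row of `model.similarity(...)` for a query embedding against passage embeddings.

```python
def rank_passages(scores, passages, top_k=3):
    """Return the top_k (passage, score) pairs, highest score first."""
    order = sorted(range(len(passages)), key=lambda i: scores[i], reverse=True)
    return [(passages[i], scores[i]) for i in order[:top_k]]

# Placeholder passages and made-up similarity scores for a single query.
passages = [
    "outlier identification passage",
    "credit guarantee passage",
    "monitoring committee passage",
]
scores = [0.71, 0.33, 0.42]

ranked = rank_passages(scores, passages, top_k=2)
for passage, score in ranked:
    print(f"{score:.2f}  {passage}")
```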

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@1 0.48
cosine_accuracy@5 0.88
cosine_accuracy@10 0.95
cosine_precision@1 0.48
cosine_precision@5 0.176
cosine_precision@10 0.095
cosine_recall@1 0.48
cosine_recall@5 0.88
cosine_recall@10 0.95
cosine_ndcg@5 0.7066
cosine_ndcg@10 0.7288
cosine_ndcg@100 0.7407
cosine_mrr@5 0.6475
cosine_mrr@10 0.6565
cosine_mrr@100 0.6595
cosine_map@100 0.6595
dot_accuracy@1 0.48
dot_accuracy@5 0.88
dot_accuracy@10 0.95
dot_precision@1 0.48
dot_precision@5 0.176
dot_precision@10 0.095
dot_recall@1 0.48
dot_recall@5 0.88
dot_recall@10 0.95
dot_ndcg@5 0.7066
dot_ndcg@10 0.7288
dot_ndcg@100 0.7407
dot_mrr@5 0.6475
dot_mrr@10 0.6565
dot_mrr@100 0.6595
dot_map@100 0.6595
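
The reported values are consistent with an evaluation set in which each query has exactly one relevant passage, although the card does not state this: recall@k equals accuracy@k, precision@5 is 0.88 / 5 = 0.176, precision@10 is 0.95 / 10 = 0.095, and MAP@100 equals MRR@100. Under that assumption, the metrics reduce to simple functions of the rank at which the relevant passage is retrieved. The sketch below illustrates this with made-up ranks; `ir_metrics` is a hypothetical helper, not the evaluator used to produce the table.

```python
def ir_metrics(ranks, k):
    """Compute accuracy/precision/recall/MRR at k.

    ranks: 1-based rank of the single relevant passage for each query,
           or None if it was not retrieved at all.
    Assumes exactly one relevant passage per query.
    """
    n = len(ranks)
    hit_ranks = [r for r in ranks if r is not None and r <= k]
    hits = len(hit_ranks)
    return {
        "accuracy": hits / n,          # share of queries with a hit in the top k
        "precision": hits / (k * n),   # at most 1 of the k results is relevant
        "recall": hits / n,            # same as accuracy with one relevant passage
        "mrr": sum(1.0 / r for r in hit_ranks) / n,
    }

# Made-up ranks for four queries; the last query never retrieves its passage.
toy_ranks = [1, 3, 7, None]
at5 = ir_metrics(toy_ranks, k=5)
print(at5)
```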

Training Details

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • gradient_accumulation_steps: 4
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • num_train_epochs: 1.0
  • warmup_ratio: 0.1
  • load_best_model_at_end: True
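
To make the effect of `learning_rate: 1e-05` and `warmup_ratio: 0.1` with the linear scheduler concrete, here is a rough sketch of a linear warmup followed by linear decay. The total of 28 optimizer steps is taken from the last step in the training logs; the 3 warmup steps are an assumption, obtained by rounding 0.1 × 28 up. `linear_lr` is an illustrative helper, not the scheduler object used during training.

```python
def linear_lr(step, base_lr=1e-5, warmup_steps=3, total_steps=28):
    """Learning rate at a given optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Learning rate at the start, end of warmup, the logged step 15, and the final step.
for s in (0, 3, 15, 28):
    print(s, linear_lr(s))
```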

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 4
  • eval_accumulation_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1.0
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

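| Epoch  | Step | Training Loss | loss   | val_evaluator_cosine_map@100 |
|:-------|:-----|:--------------|:-------|:-----------------------------|
| 0.531  | 15   | 0.4478        | 0.0912 | 0.6595                       |
| 0.9912 | 28   | -             | 0.0912 | 0.6595                       |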
  • The bold row denotes the saved checkpoint.
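
The logged epoch values can be reproduced from the batch settings. The sketch below assumes a single device, the 900 training samples indicated by the `dataset_size:900` tag, and a trainer that performs one optimizer step per 4 accumulated batches while ignoring a leftover partial group. These are assumptions about how the run was executed, not statements from this card, and `training_steps` and `epoch_at` are hypothetical helpers.

```python
import math

def training_steps(num_samples, per_device_batch, grad_accum, epochs=1.0):
    """Return (batches per epoch, optimizer steps per epoch, total optimizer steps)."""
    # dataloader_drop_last is False, so the last partial batch still counts.
    batches_per_epoch = math.ceil(num_samples / per_device_batch)
    updates_per_epoch = max(1, batches_per_epoch // grad_accum)
    total_steps = math.ceil(epochs * updates_per_epoch)
    return batches_per_epoch, updates_per_epoch, total_steps

def epoch_at(step, batches_per_epoch, grad_accum):
    """Fraction of an epoch completed after `step` optimizer steps."""
    return step * grad_accum / batches_per_epoch

batches, updates, total = training_steps(900, per_device_batch=8, grad_accum=4)
print(batches, updates, total)
print(round(epoch_at(15, batches, 4), 4), round(epoch_at(28, batches, 4), 4))
```

Under these assumptions the effective batch size is 8 × 4 = 32, and the computed epochs at steps 15 and 28 match the logged values of 0.531 and 0.9912.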

Framework Versions

  • Python: 3.10.14
  • Sentence Transformers: 3.0.1
  • Transformers: 4.41.1
  • PyTorch: 2.3.0+cu121
  • Accelerate: 0.27.2
  • Datasets: 2.19.1
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

GISTEmbedLoss

@misc{solatorio2024gistembed,
    title={GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning}, 
    author={Aivin V. Solatorio},
    year={2024},
    eprint={2402.16829},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}