ros-classifiers / README.md
metadata
tags:
  - setfit
  - sentence-transformers
  - text-classification
  - generated_from_setfit_trainer
widget:
  - text: "hasColourDetails: To be printed on external, adhesive-backed vinyl. No installation required. It will be installed from outside the window, so the glue needs to be on the reverse., hasCreatedDate: 2024-06-11, hasCustomerHomeCountry: United Kingdom, hasCustomerID: 12356, hasCustomerName: Post Office(Post Office), hasCutting: Trim to size, hasElementID: 3338891, hasElementTitle: Self-adhesive Window Vinyl (no installation), hasFinishedSizeHeight: 150, hasFinishedSizeWidth: 1080, hasInternalID: a909113d-a92c-48f2-9817-248a984dc5d8, hasMaterialCategory: Plastic, hasMaterialDescription: Suitable self-adhesive Window Vinyl - adhesive on back, hasMaterialType: PVC, hasMaterialUnitOfMeasure: GSM, hasNumberOfVersions: 1, hasPackingRequirements: Installation is NOT required.   Delivery address- FAO Jyoti Rathod, Ayston Road Post Office,10 Ayston Road, Leicester LE3 2GA, hasPrice: 30.0, hasPrintedSides: Single sided, hasProofType: PDF digital proof, hasQuantity: 1, hasQuantityPerVersion: 1, hasSendToDetails: [email protected]., hasSupplierName: Design X-Press Limited - CCS Lot 1 Only\t(Design X-Press Limited - CCS Lot 1 Only\t), hasSustainableOptionBeenOffered: N/A, hasTotalColours: 4, hasUnitOfMeasure: Millimetres (mm), "
  - text: >-
      hasAdditionalInformation: 2,600 cards (85 x 55mm), printed one side (full
      colour), delivered to Granby Marketing Services.  26 x packs of 100.,
      hasArtworkDoubleSidedStatus: Double Sided Different, hasCreatedDate:
      2024-08-22, hasCustomerHomeCountry: United Kingdom, hasCustomerID: 29427,
      hasCustomerName: NHS Blood and Transplant(NHS Blood and Transplant),
      hasCutting: Trim to size, hasElementID: 3466275, hasElementTitle: OLC317B
      Organ Donation Week Cards, hasFinishedSizeHeight: 55,
      hasFinishedSizeWidth: 85, hasFscPaperBeenSpecified: No, hasHandFinishing:
      Yes, hasHandFinishingDetails: Shrinkwrap in 100’s, hasInternalID:
      c171e44e-1c7a-4dba-a6d7-a8c11f317622, hasMaterialCategory: Paper,
      hasMaterialDescription: White Silk Coated Board,
      hasMaterialRecycledPercentage: 0%, hasMaterialThicknessOrWeight: 300,
      hasMaterialType: Paper and board, hasMaterialUnitOfMeasure: GSM,
      hasNumberOfVersions: 1, hasPackingRequirements: DELIVERY, hasPrice: 380.0,
      hasPrintedSides: Double sided, hasProductCategory: Loose Print,
      hasProofType: PDF digital proof, hasQuantity: 2600, hasQuantityPerVersion:
      1, hasRecycledContentBeenOffered: No, hasSendToDetails: Email to,
      hasSupplierName: Dataforms Chartered Press Ltd- CCS Lot 1 only(Dataforms
      Chartered Press Ltd- CCS Lot 1 only), hasTotalColours: 4,
      hasTotalColoursFace: 4, hasUnitOfMeasure: Millimetres (mm), 
  - text: >-
      hasCreatedDate: 2024-01-26, hasCustomerHomeCountry: United States,
      hasCustomerID: 25570, hasCustomerName: Finish Line Group(Finish Line,
      Inc), hasCutting: Trim to size, hasElementID: 3086661, hasElementTitle:
      F019753 20 x 28 Fabric Graphic, hasFinishedSizeHeight: 28,
      hasFinishedSizeWidth: 20, hasFscPaperBeenSpecified: No, hasInternalID:
      ac97e6d8-0358-4a1b-9da3-5871bb2b04d1, hasMachineFinishing: Yes,
      hasMachineFinishingDetails: Trim to size. Silicone bed top, bottom left
      and right edges, standard bed, hasMaterialCategory: Textiles,
      hasMaterialDescription: Moonlight Itex - 260GSM,
      hasMaterialRecycledPercentage: 0%, hasMaterialType: Polyester,
      hasNumberOfVersions: 1, hasPackingRequirements: Trim to size. Silicone bed
      top, bottom left and right edges, standard bed, hasPrice: 1395.56,
      hasPrintedSides: Single sided, hasProductCategory: Displays - Backlit,
      hasProofType: PDF digital proof, hasQuantity: 139,
      hasRecycledContentBeenOffered: N/A, hasSupplierName: GSP Custom Color(GSP
      Custom Color - HHGSP), hasTotalColours: 4, hasUnitOfMeasure: Inches (in), 
  - text: >-
      hasAdditionalInformation: AO, hasCreatedDate: 2024-10-08,
      hasCustomerHomeCountry: United States, hasCustomerID: 30642,
      hasCustomerName: Station Casinos LLC(Station Casinos), hasCutting: Trim to
      size, hasElementID: 3555960, hasElementTitle: 211696 - 381" X 363" SS
      FRONTLIT EXTERIOR VINYL SIGN , hasFinishedSizeHeight: 363,
      hasFinishedSizeWidth: 381, hasFscPaperBeenSpecified: No, hasInternalID:
      04b8890a-dc33-4778-ad73-c1f68f68231c, hasMaterialCategory: Plastic,
      hasMaterialDescription: 13OZ VINYL, hasMaterialRecycledPercentage: 0%,
      hasMaterialThicknessOrWeight: 13, hasMaterialType: PVC,
      hasMaterialUnitOfMeasure: Ounces (oz), hasNumberOfVersions: 1, hasPrice:
      676.0, hasPrintedSides: Not printed, hasProductCategory: Banners
      (synthetic), hasProofType: PDF digital proof, hasQuantity: 1,
      hasRecycledContentBeenOffered: N/A, hasSupplierName: WestRock
      Company(Westrock - 14360 - HHGSP), hasTotalColours: 4, hasUnitOfMeasure:
      Inches (in), 
  - text: >-
      hasAdditionalInformation: 2 - Pages, hasArtworkDoubleSidedStatus: Double
      Sided Different, hasColourDetails: 1/1 (Black Two-sided), hasCreatedDate:
      2024-12-10, hasCustomerHomeCountry: United States, hasCustomerID: 26760,
      hasCustomerName: Elanco Animal Health(Elanco Animal Health), hasCutting:
      Trim to size, hasElementID: 3687940, hasElementTitle: PA103754X -
      Galliprant PI, hasFinishedSizeHeight: 11, hasFinishedSizeWidth: 8.5,
      hasFlatSizeHeight: 11, hasFlatSizeWidth: 8.5, hasFscPaperBeenSpecified:
      No, hasInternalID: c0585f9a-6716-4373-a218-041b92baf4a2,
      hasMachineFinishing: Yes, hasMachineFinishingDetails: bleed,
      hasMaterialCategory: Paper, hasMaterialDescription: 60# White Offset,
      hasMaterialThicknessOrWeight: 60, hasMaterialType: Paper and board,
      hasMaterialUnitOfMeasure: Pounds (lbs), hasMinimumRecycledContent: 0%,
      hasNumberOfVersions: 1, hasPackingRequirements: Hold for Kit Packing in
      Element 5, hasPaperType: Offset, hasPrice: 350.0, hasPrintedSides: Double
      sided, hasProductCategory: Booklets & Brochures, hasProofType: PDF digital
      proof, hasQuantity: 3100, hasRecycledContentBeenRequested: No,
      hasSupplierName: Modern Litho – Kansas City(Modernlitho  -James Printing.
      Inc - HHGSP), hasTotalColoursFace: 1, hasTotalColoursReverse: 1,
      hasUnitOfMeasure: Inches (in), 
metrics:
  - f1_micro
  - f1_macro
  - f1_weighted
  - precision
  - accuracy
  - recall
pipeline_tag: text-classification
library_name: setfit
inference: false
model-index:
  - name: SetFit
    results:
      - task:
          type: text-classification
          name: Text Classification
        dataset:
          name: Unknown
          type: unknown
          split: test
        metrics:
          - type: f1_micro
            value: 0.9322709163346613
            name: F1_Micro
          - type: f1_macro
            value: 0.37012987012987014
            name: F1_Macro
          - type: f1_weighted
            value: 0.8821255080797066
            name: F1_Weighted
          - type: precision
            value: 0.9750000238418579
            name: Precision
          - type: accuracy
            value: 0.9468749761581421
            name: Accuracy
          - type: recall
            value: 0.8931297659873962
            name: Recall

SetFit

This is a SetFit model for text classification. It uses a OneVsRestClassifier instance as its classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
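The first step relies on turning a handful of labeled texts into sentence pairs. The following is a minimal sketch of that idea, not SetFit's actual sampler: it assumes one label per text for simplicity, and the function name `make_contrastive_pairs`, the texts, and the labels are all hypothetical.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Return (text_a, text_b, target) triples for contrastive fine-tuning.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        pairs.append((text_a, text_b, 1.0 if label_a == label_b else 0.0))
    return pairs


# Hypothetical toy examples; the real training data is not part of this card.
texts = ["vinyl banner", "pvc window sticker", "silk coated cards", "offset leaflet"]
labels = ["plastic", "plastic", "paper", "paper"]
pairs = make_contrastive_pairs(texts, labels)
```

Four examples already yield six pairs, which is why a small labeled set can still provide a useful amount of signal for fine-tuning the embedding model. The second step then fits the classification head on embeddings of the original texts.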

Model Details

Model Description

  • Model Type: SetFit
  • Classification head: a OneVsRestClassifier instance
  • Maximum Sequence Length: 512 tokens
  • Number of Classes: 8 classes
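A one-vs-rest head typically produces one 0/1 indicator per class, so a prediction for a single text would be a row of 8 values. The sketch below shows how such a row could be mapped back to class names. It assumes that output shape, and the names in `LABEL_NAMES` are hypothetical placeholders because this card does not list the classes.

```python
def decode_multilabel(indicator_row, label_names):
    """Map a row of 0/1 indicators to the names of the active classes."""
    if len(indicator_row) != len(label_names):
        raise ValueError("expected one indicator per class")
    return [name for flag, name in zip(indicator_row, label_names) if int(flag) == 1]


# Hypothetical placeholder names: the card states 8 classes but does not name them.
LABEL_NAMES = [f"class_{i}" for i in range(8)]

active = decode_multilabel([0, 1, 0, 0, 0, 0, 1, 0], LABEL_NAMES)
print(active)  # ['class_1', 'class_6']
```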

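Model Sources

  • Repository: SetFit on GitHub (https://github.com/huggingface/setfit)
  • Paper: Efficient Few-Shot Learning Without Prompts (https://arxiv.org/abs/2209.11055)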

Evaluation

Metrics

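| Label | F1_Micro | F1_Macro | F1_Weighted | Precision | Accuracy | Recall |
|:------|:---------|:---------|:------------|:----------|:---------|:-------|
| all   | 0.9323   | 0.3701   | 0.8821      | 0.9750    | 0.9469   | 0.8931 |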
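The micro and macro F1 scores above differ widely. Micro averaging pools true and false positives across all classes, while macro averaging gives every class equal weight, so a few rare, poorly predicted classes can pull the macro score down. Whether that is what happens here is an assumption, since per-class scores are not reported. The sketch below reproduces the effect on toy data that is unrelated to this model.

```python
def f1(tp, fp, fn):
    """F1 from counts; defined as 0.0 when there are no positives at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def per_class_counts(y_true, y_pred):
    """Count (tp, fp, fn) per class column of two 0/1 indicator matrices."""
    counts = []
    for c in range(len(y_true[0])):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 1 and p[c] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 0 and p[c] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 1 and p[c] == 0)
        counts.append((tp, fp, fn))
    return counts


def f1_micro(y_true, y_pred):
    """Pool the counts over all classes, then compute a single F1."""
    counts = per_class_counts(y_true, y_pred)
    return f1(sum(c[0] for c in counts), sum(c[1] for c in counts), sum(c[2] for c in counts))


def f1_macro(y_true, y_pred):
    """Compute F1 per class, then take the unweighted mean."""
    counts = per_class_counts(y_true, y_pred)
    return sum(f1(*c) for c in counts) / len(counts)


# Toy data: class 0 is common and predicted well, classes 1 and 2 are rare and missed.
y_true = [[1, 0, 0]] * 8 + [[0, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0]] * 8 + [[0, 0, 0], [0, 0, 0]]

print(round(f1_micro(y_true, y_pred), 4))  # 0.8889
print(round(f1_macro(y_true, y_pred), 4))  # 0.3333
```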

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("setfit_model_id")
# Run inference
preds = model("hasAdditionalInformation: AO, hasCreatedDate: 2024-10-08, hasCustomerHomeCountry: United States, hasCustomerID: 30642, hasCustomerName: Station Casinos LLC(Station Casinos), hasCutting: Trim to size, hasElementID: 3555960, hasElementTitle: 211696 - 381\" X 363\" SS FRONTLIT EXTERIOR VINYL SIGN , hasFinishedSizeHeight: 363, hasFinishedSizeWidth: 381, hasFscPaperBeenSpecified: No, hasInternalID: 04b8890a-dc33-4778-ad73-c1f68f68231c, hasMaterialCategory: Plastic, hasMaterialDescription: 13OZ VINYL, hasMaterialRecycledPercentage: 0%, hasMaterialThicknessOrWeight: 13, hasMaterialType: PVC, hasMaterialUnitOfMeasure: Ounces (oz), hasNumberOfVersions: 1, hasPrice: 676.0, hasPrintedSides: Not printed, hasProductCategory: Banners (synthetic), hasProofType: PDF digital proof, hasQuantity: 1, hasRecycledContentBeenOffered: N/A, hasSupplierName: WestRock Company(Westrock - 14360 - HHGSP), hasTotalColours: 4, hasUnitOfMeasure: Inches (in), ")
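The input strings in this card appear to follow one pattern: `key: value` pairs in alphabetical key order, each followed by a comma and a space. The preprocessing is not documented, so this format is inferred from the examples only. Assuming it holds, a record could be flattened with a helper like the hypothetical `serialize_record` below.

```python
def serialize_record(record):
    """Flatten a dict into 'key: value, key: value, ' with keys in sorted order."""
    return "".join(f"{key}: {value}, " for key, value in sorted(record.items()))


# Field names and values taken from the inference example above.
record = {
    "hasQuantity": 1,
    "hasMaterialType": "PVC",
    "hasPrice": 676.0,
}
text = serialize_record(record)
print(text)  # hasMaterialType: PVC, hasPrice: 676.0, hasQuantity: 1, 

# preds = model(text)  # `model` as loaded in the snippet above
```

Keeping the serialization identical to what the model saw during training matters, because the sentence embedding depends on field order and punctuation.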

Training Details

Training Set Metrics

| Training set | Min | Median   | Max |
|:-------------|:----|:---------|:----|
| Word count   | 67  | 110.9875 | 238 |
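Statistics of this kind can be gathered for new data to check that inputs resemble the training texts in length. The sketch below assumes words are split on whitespace. How the figures on this card were computed is not stated, and the sample texts are hypothetical stand-ins.

```python
from statistics import median


def word_count_summary(texts):
    """Return (min, median, max) of whitespace-delimited word counts."""
    counts = [len(text.split()) for text in texts]
    return min(counts), median(counts), max(counts)


# Hypothetical stand-ins for training texts.
sample_texts = [
    "hasPrice: 30.0, hasQuantity: 1,",
    "hasMaterialType: PVC,",
    "hasCutting: Trim to size, hasPrice: 380.0,",
]
print(word_count_summary(sample_texts))  # (2, 4, 6)
```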

Framework Versions

  • Python: 3.10.16
  • SetFit: 1.1.2
  • Sentence Transformers: 3.4.1
  • Transformers: 4.51.3
  • PyTorch: 2.6.0+cu124
  • Datasets: 3.2.0
  • Tokenizers: 0.21.1
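To compare a local environment against the versions listed above, something like the following sketch could be used. It relies on `importlib.metadata` from the standard library. The helper `find_version_mismatches` is hypothetical, and the mapping from the names on this card to PyPI distribution names (for example PyTorch to `torch`) is an assumption.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed on this card, keyed by assumed PyPI distribution name.
EXPECTED = {
    "setfit": "1.1.2",
    "sentence-transformers": "3.4.1",
    "transformers": "4.51.3",
    "torch": "2.6.0+cu124",
    "datasets": "3.2.0",
    "tokenizers": "0.21.1",
}


def find_version_mismatches(expected, lookup=version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = lookup(package)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, installed) in find_version_mismatches(EXPECTED).items():
        print(f"{package}: expected {wanted}, found {installed}")
```

The comparison is an exact string match, so a CPU-only PyTorch build would be reported as a mismatch against `2.6.0+cu124` even though it may work equally well for inference.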

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}