spaCy text classifier trained for Phoebus-127k to filter out:

  1. product spambots: the website was spammed a few times (usually with HTML tags)
  2. warning-only pages ("Warning" and nothing else)
  3. edits (authors adding editorial-history footnotes)
  4. Patreon and similar callouts (authors asking for donations)
  5. author notes, summaries, tagging

usage example:

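import spacy
# Use the GPU when one is available; otherwise spaCy stays on the CPU.
spacy.prefer_gpu()
# Load the classifier from a local directory.
nlp = spacy.load('./Phoebus-Spam-Classifier-v2')
# Raise spaCy's character limit so very long pages are not rejected.
nlp.max_length = 10000000000
doc = nlp("Help me out by subscribing to my Patreon!")
# doc.cats maps each label to its score.
print(doc.cats)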

output: {'SPAM': 0.9999606609344482}
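
To filter a batch of pages rather than score a single string, the scores in doc.cats can be compared against a cutoff. The following is a minimal sketch, not part of the original card: the helper name keep_non_spam and the 0.5 threshold are assumptions for illustration, and 'SPAM' is the only label seen in the output above. It expects a loaded pipeline (nlp from the usage example) and relies on spaCy's nlp.pipe for batching.

```python
def keep_non_spam(texts, nlp, threshold=0.5):
    """Return the texts whose 'SPAM' score is below `threshold`.

    `nlp` is a loaded spaCy pipeline (as in the usage example).
    The 0.5 default is an assumed cutoff, not a value from the model card.
    """
    kept = []
    # nlp.pipe processes the texts as a stream, which is faster than
    # calling nlp() on each text separately.
    for doc in nlp.pipe(texts):
        # A missing 'SPAM' label is treated as a score of 0.0.
        if doc.cats.get("SPAM", 0.0) < threshold:
            kept.append(doc.text)
    return kept
```

With the model loaded as above, keep_non_spam(pages, nlp) would return only the pages scored below the cutoff. A suitable threshold depends on how aggressively the data should be filtered and is worth checking against a hand-labelled sample.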


  1. The Classifier is provided "AS IS" and "AS AVAILABLE" without warranty of any kind, express or implied, including but not limited to warranties of merchantability, fitness for a particular purpose, title, or non-infringement.
  2. The Provider disclaims all liability for any damages or losses resulting from the use or misuse of the Classifier, including but not limited to any damages or losses arising from the use of the Classifier for purposes other than those intended by the Provider.
  3. The Provider does not endorse or condone the use of the Classifier for any purpose that violates applicable laws, regulations, or ethical standards.
  4. The Provider does not warrant that the Classifier will meet your specific requirements or that it will be error-free or that it will function without interruption.
  5. You assume all risks associated with the use of the Classifier, including but not limited to any loss of data, loss of business, or damage to your reputation.
