Forecast-ing committed on
Commit
cc5dade
·
verified ·
1 Parent(s): 5546d42

Update README.md

Files changed (1)
  1. README.md +9 -14
README.md CHANGED
@@ -13,15 +13,15 @@ model-index:
13
  # ModernBERT Engagement Content Regression
14
  ### What is this?
15
 
16
- This is an exploration of using modernBERT for the text regression task of predicting engagement metrics for text content. In this case, we are predicting the clickthrough rate (CTR) of email text content.
17
 
18
- We will be exploring hyperparameter tuning of modernBert; and how to use it for regression, as well as comparing the results to a benchmark model.
19
 
20
- This type of task if difficult, we can remember the quote
21
  > Half my advertising is wasted; the trouble is, I don't know which half
22
  > -John Wanamaker
23
 
24
- We are also excluding other relevant factors such as the time of day the email is sent, the day of the week, the recipient, etc in this experiment.
25
 
26
  Links for project:
27
  - Model - [ModernBERT-Engagement-Content-Regression](https://huggingface.co/Forecast-ing/modernBERT-content-regression)
@@ -36,13 +36,11 @@ This work is indebted to the work of many community members and blog posts.
36
 
37
 
38
  ### Our dataset
39
- We will be using a dataset of 548 emails where we have the text of the email `text` and the CTR we are trying to predict `labels`.
40
-
41
- We look forward in the improvements of ModernBERT to fine-tune models specifically for each potential users email dataset. The variability of email data, as well as the small size of the dataset pose an interesting regression challenge.
42
 
 
43
  ### Benchmarking
44
- We will start by using the Catboost library as a simple benchmark for text regression. For both the benchmark and the ModernBert run, we are using 'rmse' as the metric.
45
- We recieve the following results:
46
  | Metric | Value |
47
  |--------|------------------|
48
  | MSE | 2.552100633998035 |
@@ -79,11 +77,8 @@ After running hyperparameter tuning for ModernBERT, we get the following results
79
  | SMAPE | 56.61447048187256 |
80
 
81
  We see improvements in all metrics except for SMAPE. We believe that ModernBERT would scale even better with a larger dataset, since 500 examples is very few for fine-tuning, and we are thus happy with the performance of this evaluation.
82
-
83
  ### Who are we?
84
- At [Forecast.ing](https://forecast.ing) we are building a platform to help users create more enriching content by automatically researching trends and generating campaign ideas with AgenticAI.
85
- We generate the content, and then create fine-tuned scores of how likely we think that content will succeed.
86
 
87
  ## Conclusion
88
- We see that ModernBERT is a powerful model for text regression. We believe that with a larger dataset, we would see even better results. We are excited to see the future of ModernBERT and how it will be used for text regression.
89
- If interested, I can be contacted at [email protected]
 
13
  # ModernBERT Engagement Content Regression
14
  ### What is this?
15
 
16
+ This project explores using ModernBERT for the text regression task of predicting engagement metrics for text content. In this case, we predict the clickthrough rate (CTR) of email text content.
17
 
18
+ We will explore hyperparameter tuning for ModernBERT and how to use it for regression. We will also compare the results to a benchmark model.
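As a rough illustration of the regression setup mentioned above, the sketch below loads a ModernBERT checkpoint with a single-output head through the Hugging Face `transformers` library. The checkpoint name `answerdotai/ModernBERT-base`, the helper names `load_regression_model` and `tokenize_batch`, and the `max_length` value are assumptions for illustration; this README does not state which checkpoint or settings were used.

```python
def load_regression_model(checkpoint="answerdotai/ModernBERT-base"):
    """Return (tokenizer, model) with a one-value regression head.

    The checkpoint name is an assumption, not taken from this README.
    """
    # Imported lazily so that defining the helper has no heavy dependencies.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint,
        num_labels=1,               # one continuous output: the predicted CTR
        problem_type="regression",  # selects mean-squared-error loss
    )
    return tokenizer, model


def tokenize_batch(tokenizer, batch, max_length=512):
    """Tokenize the `text` entries of a batch (a dict holding a list of strings)."""
    return tokenizer(batch["text"], truncation=True, max_length=max_length)
```

With a dataset that has `text` and float `labels` columns, `tokenize_batch` could be applied per batch before training; the labels themselves need no transformation for a regression head.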
19
 
20
+ This type of task is complex; recall the quote:
21
  > Half my advertising is wasted; the trouble is, I don't know which half
22
  > -John Wanamaker
23
 
24
+ In this experiment, we exclude other relevant factors, such as the time the email is sent, the day of the week, the recipient, etc.
25
 
26
  Links for project:
27
  - Model - [ModernBERT-Engagement-Content-Regression](https://huggingface.co/Forecast-ing/modernBERT-content-regression)
 
36
 
37
 
38
  ### Our dataset
39
+ We will be using a dataset of 548 emails, with the text of each email in `text` and the CTR we are trying to predict in `labels`.
 
 
40
 
41
+ We look forward to ModernBERT's improvements, allowing us to fine-tune models for each potential user’s email dataset. The variability of email data and the small size of the dataset pose interesting regression challenges.
42
  ### Benchmarking
43
+ We will start by using the CatBoost library as a simple benchmark for text regression. For both the benchmark and the ModernBERT run, we use 'rmse' as the metric. We receive the following results:
 
44
  | Metric | Value |
45
  |--------|------------------|
46
  | MSE | 2.552100633998035 |
 
77
  | SMAPE | 56.61447048187256 |
78
 
79
  We see improvements in all metrics except for SMAPE. We believe that ModernBERT would scale even better with a larger dataset, since 500 examples is very few for fine-tuning, and we are thus happy with the performance of this evaluation.
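For readers unfamiliar with the metrics named above, here is a minimal sketch of MSE, RMSE, and SMAPE in plain Python. The SMAPE variant shown (percentage scale, mean of absolute values in the denominator) is an assumption; the README does not say which definition was used, and the sample values are made up for illustration.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    return math.sqrt(mse(y_true, y_pred))


def smape(y_true, y_pred):
    """Symmetric mean absolute percentage error, on a 0-200 scale (assumed variant)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        denom = (abs(t) + abs(p)) / 2
        # Treat a 0/0 term as zero error to avoid division by zero.
        total += 0.0 if denom == 0 else abs(t - p) / denom
    return 100 * total / len(y_true)


# Hypothetical CTR values, for illustration only.
y_true = [1.0, 2.0, 4.0]
y_pred = [1.0, 3.0, 2.0]
print(mse(y_true, y_pred), rmse(y_true, y_pred), smape(y_true, y_pred))
```

Because SMAPE scales each error by the size of the values involved, small errors on low-CTR emails can dominate it, which is one way MSE can improve while SMAPE does not.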
 
80
  ### Who are we?
81
+ At [Forecast.ing](https://forecast.ing) we are building a platform to help users create more enriching content by automatically researching trends and generating campaign ideas with AgenticAI. We generate the content, and then create fine-tuned scores of how likely we think that content will succeed.
 
82
 
83
  ## Conclusion
84
+ We see that ModernBERT is a powerful model for text regression. We believe that with a larger dataset, we would see even better results. We are excited to see the future of ModernBERT and how it will be used for text regression. If interested, I can be contacted at [email protected]