Update README.md
README.md
CHANGED
@@ -13,15 +13,15 @@ model-index:
|
|
13 |
# ModernBERT Engagement Content Regression
|
14 |
### What is this?
|
15 |
|
16 |
-
This
|
17 |
|
18 |
-
We will
|
19 |
|
20 |
-
This type of task
|
21 |
> Half my advertising is wasted; the trouble is, I don't know which half
|
22 |
> -John Wanamaker
|
23 |
|
24 |
-
|
25 |
|
26 |
Links for project:
|
27 |
- Model - [ModernBERT-Engagement-Content-Regression](https://huggingface.co/Forecast-ing/modernBERT-content-regression)
|
@@ -36,13 +36,11 @@ This work is indebted to the work of many community members and blog posts.
|
|
36 |
|
37 |
|
38 |
### Our dataset
|
39 |
-
We will be using a dataset of 548 emails where we have the text of the email
|
40 |
-
|
41 |
-
We look forward in the improvements of ModernBERT to fine-tune models specifically for each potential users email dataset. The variability of email data, as well as the small size of the dataset pose an interesting regression challenge.
|
42 |
|
|
|
43 |
### Benchmarking
|
44 |
-
We will start by using the Catboost library as a simple benchmark for text regression. For both the benchmark and the ModernBert run, we are using 'rmse' as the metric.
|
45 |
-
We recieve the following results:
|
46 |
| Metric | Value |
|
47 |
|--------|------------------|
|
48 |
| MSE | 2.552100633998035 |
|
@@ -79,11 +77,8 @@ After running hyperparameter tuning for ModernBERT, we get the following results
|
|
79 |
| SMAPE | 56.61447048187256 |
|
80 |
|
81 |
We see improvements in all metrics except for SMAPE. We believe that ModernBERT would scale even better with a larger dataset, since roughly 500 examples is very few for fine-tuning, and we are thus happy with the performance in this evaluation.
|
82 |
-
|
83 |
### Who are we?
|
84 |
-
At [Forecast.ing](https://forecast.ing) we are building a platform to help users create more enriching content by automatically researching trends and generating campaign ideas with AgenticAI.
|
85 |
-
We generate the content, and then create fine-tuned scores of how likely we think that content will succeed.
|
86 |
|
87 |
## Conclusion
|
88 |
-
We see that ModernBERT is a powerful model for text regression. We believe that with a larger dataset, we would see even better results. We are excited to see the future of ModernBERT and how it will be used for text regression.
|
89 |
-
If interested, I can be contacted at [email protected]
|
|
|
13 |
# ModernBERT Engagement Content Regression
|
14 |
### What is this?
|
15 |
|
16 |
+
This project explores using ModernBERT for text regression: predicting engagement metrics from text content. In this case, we predict the click-through rate (CTR) of email text.
|
17 |
|
18 |
+
We will explore hyperparameter tuning for ModernBERT and how to use the model for regression. We will also compare the results to a benchmark model.
|
19 |
|
20 |
+
This type of task is hard, as the well-known quote reminds us:
|
21 |
> Half my advertising is wasted; the trouble is, I don't know which half
|
22 |
> -John Wanamaker
|
23 |
|
24 |
+
In this experiment, we exclude other relevant factors, such as the time the email is sent, the day of the week, the recipient, etc.
|
25 |
|
26 |
Links for project:
|
27 |
- Model - [ModernBERT-Engagement-Content-Regression](https://huggingface.co/Forecast-ing/modernBERT-content-regression)
|
|
|
36 |
|
37 |
|
38 |
### Our dataset
|
39 |
+
We will be using a dataset of 548 emails, each with the email text and the CTR label we are trying to predict.
|
|
|
|
|
40 |
|
41 |
+
We look forward to using ModernBERT's improvements to fine-tune models for each potential user’s email dataset. The variability of email data and the small size of the dataset pose an interesting regression challenge.
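As a rough illustration of how small the evaluation set becomes with 548 examples, here is a minimal sketch of a shuffled train/evaluation split. The `EmailExample` record, the 80/20 ratio, the seed, and the placeholder rows are all assumptions made for illustration; the README does not state how the data was actually split.

```python
import random
from dataclasses import dataclass


@dataclass
class EmailExample:
    text: str   # email body used as model input
    ctr: float  # click-through rate label (the regression target)


def train_eval_split(examples, eval_fraction=0.2, seed=42):
    """Shuffle a copy of the examples and hold out a fraction for evaluation."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_eval = int(len(shuffled) * eval_fraction)
    return shuffled[n_eval:], shuffled[:n_eval]


# Placeholder rows standing in for the real emails (hypothetical data).
examples = [EmailExample(text=f"email body {i}", ctr=0.0) for i in range(548)]
train_set, eval_set = train_eval_split(examples)
print(len(train_set), len(eval_set))
```

With a hold-out set this small, a single unusual email can move the reported metrics noticeably, which is worth keeping in mind when reading the results.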
|
42 |
### Benchmarking
|
43 |
+
We will start by using the CatBoost library as a simple benchmark for text regression. For both the benchmark and the ModernBERT run, we are using 'rmse' as the metric. We get the following results:
|
|
|
44 |
| Metric | Value |
|
45 |
|--------|------------------|
|
46 |
| MSE | 2.552100633998035 |
|
|
|
77 |
| SMAPE | 56.61447048187256 |
|
78 |
|
79 |
We see improvements in all metrics except for SMAPE. We believe that ModernBERT would scale even better with a larger dataset, since roughly 500 examples is very few for fine-tuning, and we are thus happy with the performance in this evaluation.
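For readers who want to reproduce this kind of comparison, here is a minimal sketch of how MSE, RMSE, and SMAPE can be computed in plain Python. The SMAPE definition used here (the mean of `2*|y - y_hat| / (|y| + |y_hat|)`, in percent, with a term counted as zero when both values are zero) is an assumption; several variants exist and the README does not say which one produced the numbers above. The sample values are illustrative only.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: the square root of MSE."""
    return math.sqrt(mse(y_true, y_pred))


def smape(y_true, y_pred):
    """Symmetric mean absolute percentage error, in percent.

    Terms where both the target and the prediction are zero contribute 0.
    """
    total = 0.0
    for t, p in zip(y_true, y_pred):
        denom = abs(t) + abs(p)
        if denom > 0:
            total += 2.0 * abs(t - p) / denom
    return 100.0 * total / len(y_true)


# Illustrative values only, not taken from the dataset.
y_true = [2.0, 4.0, 0.0]
y_pred = [1.0, 4.0, 0.0]
print(mse(y_true, y_pred), rmse(y_true, y_pred), smape(y_true, y_pred))
```

Because SMAPE is a relative error, targets close to zero can dominate it, so it may move differently from MSE or RMSE on the same predictions.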
|
|
|
80 |
### Who are we?
|
81 |
+
At [Forecast.ing](https://forecast.ing) we are building a platform to help users create more enriching content by automatically researching trends and generating campaign ideas with AgenticAI. We generate the content, and then create fine-tuned scores of how likely we think that content will succeed.
|
|
|
82 |
|
83 |
## Conclusion
|
84 |
+
We see that ModernBERT is a powerful model for text regression. We believe that with a larger dataset, we would see even better results. We are excited to see the future of ModernBERT and how it will be used for text regression. If interested, I can be contacted at [email protected]
|
|