update explanations

app.py CHANGED
@@ -66,12 +66,14 @@ def read_climate_change_results():
 sentiment_results, zero_shot_results = read_climate_change_results()
 
 
 # intro to app
 st.title('Survey Analytic Techniques')
 st.write('''
-Organisations collect lots of data
 It can be resource intensive to craft a good survey and get responders to fill in their answers, and we should make full use of the data obtained.
-
 We can employ the help of machines to comb through the data and provide actionable insights.
 ''')
 

@@ -85,7 +87,7 @@ st.write('''
 - Factor Analysis - Clustering responders based on their answers
 - Topic Modelling - Uncovering topics from text responses
 - Zero-shot Classification - Classifying text responses into user-defined labels
-- Sentiment Analysis - Quantifying sentiment of responders text responses
 ''')
 st.write('\n')
 st.write('\n')

@@ -94,20 +96,22 @@ st.markdown('''---''')
 st.header('Clustering Survey Responders')
 st.write('''
 Having knowledge and understanding about different groups of responders can help us to customise our interactions with them.
-E.g. Within
-We want to be able to cluster survey
-This can be achieved
 ''')
 st.write('\n')
 st.write('\n')
 
 # copy data
 df_factor_analysis = data_survey.copy()
 
 st.subheader('Sample Survey Data')
 st.write('''
 Here we have a sample survey dataset where responders answer questions about their personality traits on a scale from 1 (Very Inaccurate) to 6 (Very Accurate).
-Factor Analysis gives us \'factors\' or clusters of responders which provide us insights
 ''')
 
 # split page into two columns

@@ -125,7 +129,7 @@ st.write('\n')
 st.subheader('Factor Analysis Suitability')
 st.write('''
 Before performing Factor Analysis on the data, we need to evaluate if it is suitable to do so.
-We apply two statistical tests (Bartlett's and KMO test) the data.
 These two tests check if the variables in the data are correlated with each other.
 If there isn't any correlation between the variables, then the data is unsuitable for factor analysis as there are no natural clusters.
 ''')

@@ -246,7 +250,7 @@ fa_z_scores = fa_z_scores.groupby('cluster').mean().reset_index()
 fa_z_scores = fa_z_scores.apply(lambda x: round(x, 2))
 
 st.write('''
-Aggregating the scores of the clusters gives us
 The scores here have been normalised to Z-scores, which is a measure of how many standard deviations (SD) the score is away from the mean.
 E.g. A Z-score of 0 indicates the score is identical to the mean, while a Z-score of 1 indicates the score is 1 SD away from the mean.
 ''')

@@ -285,7 +289,7 @@ st.write('\n')
 st.header('Uncovering Topics from Text Responses')
 st.write('''
 With feedback forms or open-ended survey questions, we want to know what the responders are generally talking about.
-One way would be to manually read all the collected
 Using **Topic Modelling**, we can programmatically extract common topics with the help of machine learning.
 ''')
 st.write('\n')

@@ -303,7 +307,7 @@ st.write('\n')
 
 st.subheader('Visualising Topics')
 st.write('''
-
 ''')
 
 # load and plot topics using unclean data

@@ -314,7 +318,7 @@ st.plotly_chart(fig, use_container_width=True)
 st.write('''
 From the chart above, we can see that 'Topic 0' and 'Topic 5' have some words that are not as meaningful.
 For 'Topic 0', we already know that the tweets are about the Tokyo 2020 Olympics, so having a topic for that isn't helpful.
-'Tokyo', '2020', 'Olympics', etc., we refer to these as *stopwords*, and
 ''')
 st.write('\n')
 

@@ -337,7 +341,7 @@ st.plotly_chart(fig, use_container_width=True)
 
 st.write('''
 Now we can see that the topics have improved.
-We can
 ''')
 st.write('\n')
 st.write('\n')

@@ -361,7 +365,7 @@ st.write('''
 The model has an understanding of the relationship between words, e.g. 'Andy Murray' is related to 'tennis'.
 For example:
 *'Cilic vs Menezes, after more than 3 hours and millions of unconverted match points, is one of the worst quality ten…'*
-This tweet is in
 
 Here we can inspect the individual tweets within each topic.
 ''')

@@ -385,7 +389,6 @@ st.write(f'''
 st.dataframe(topic_results.loc[(topic_results['Topic'] == inspect_topic)])
 st.markdown('''---''')
 st.write('\n')
-st.write('\n')
 
 
 

@@ -393,8 +396,8 @@ st.write('\n')
 
 st.header('Classifying Text Responses and Sentiment Analysis')
 st.write(f'''
-With survey responses, sometimes as a business user, we already have
-
 Using **Zero-shot Classification**, we can classify responses into one of these four categories.
 As an added bonus, we can also find out how responders feel about the categories using **Sentiment Analysis**.
 ''')

@@ -495,7 +498,7 @@ st.write(f'''
 Main category score ranges from 0 to 1, with 1 being very likely.
 
 The full set of scores are: {dict(zip(zero_shot_sample['labels'], [round(score, 2) for score in zero_shot_sample['scores']]))}
-Full set of scores
 
 The sentiment is: {emoji[sentiment_label]} **{sentiment_label}** with a score of {round(sentiment_sample, 2)}
 Sentiment score ranges from 0 to 1, with 1 being very positive.

@@ -509,14 +512,14 @@ zero_shot_results = zero_shot_results.rename(columns={'sequence':'tweet', 'label
 
 st.subheader('Zero-Shot Classification and Sentiment Analysis Results')
 st.write(f'''
-
 ''')
 
 st.dataframe(zero_shot_results.style.format(precision=2))
 
 st.write(f'''
 We can observe that the model does not have strong confidence in predicting the categories for some of the tweets.
-It is likely that the tweet does not
 Before performing further analysis on our results, we can set a score threshold to only keep predictions that we're confident in.
 ''')
 st.write('\n')

@@ -535,7 +538,7 @@ zero_shot_results_clean = zero_shot_results.loc[(zero_shot_results['score'] >= u
 sentiment_results.columns = ['tweet', 'sentiment']
 
 st.write(f'''
-The predictions get better with a higher threshold
 Out of the {len(sentiment_results):,} tweets, we are now left with {len(zero_shot_results_clean)}.
 We also add on the sentiment score for the tweets; the score here ranges from 0 (most negative) to 1 (most positive).
 ''')

@@ -548,11 +551,17 @@ classification_sentiment_df = classification_sentiment_df[['tweet', 'category',
 st.dataframe(classification_sentiment_df.style.format(precision=2))
 
 st.write(f'''
-The difficult part
-Some trial and error
 ''')
-st.write('\n')
 
 # group by category, count tweets and get mean of sentiment
 classification_sentiment_agg = classification_sentiment_df.groupby(['category']).agg({'tweet':'count', 'sentiment':'mean'}).reset_index()
 classification_sentiment_agg = classification_sentiment_agg.rename(columns={'tweet':'count'})

@@ -587,13 +596,28 @@ fig.update_yaxes(range=[0, 1])
 fig.add_hline(y=0.5, line_width=3, line_color='darkgreen')
 st.plotly_chart(fig, use_container_width=True)
 
 st.markdown('''---''')
 st.write('\n')
 st.write('\n')
 
 st.write('''
-That's the end of
 ''')
 st.write('\n')
-st.image('https://images.unsplash.com/photo-1620712943543-bcc4688e7485
-st.caption('Photo by [Andrea De Santis](https://unsplash.com/@santesson89) on [Unsplash](https://unsplash.com).')
 sentiment_results, zero_shot_results = read_climate_change_results()
 
 
+
 # intro to app
 st.title('Survey Analytic Techniques')
 st.write('''
+Organisations collect lots of data every day through surveys, to get feedback, understand user behaviour, track trends across time, etc.
 It can be resource intensive to craft a good survey and get responders to fill in their answers, and we should make full use of the data obtained.
+
+Processing and analysing the data is tedious and time-consuming, but it doesn't have to be!
 We can employ the help of machines to comb through the data and provide actionable insights.
 ''')
 
 - Factor Analysis - Clustering responders based on their answers
 - Topic Modelling - Uncovering topics from text responses
 - Zero-shot Classification - Classifying text responses into user-defined labels
+- Sentiment Analysis - Quantifying sentiment of responders' text responses
 ''')
 st.write('\n')
 st.write('\n')
 st.header('Clustering Survey Responders')
 st.write('''
 Having knowledge and understanding about different groups of responders can help us to customise our interactions with them.
+E.g. Within Financial Institutions, we have banks, insurers, and payment services, and they have different structures and behaviours from one another.
+We want to be able to cluster survey responders into various groups based on how they answer.
+This can be achieved through **Factor Analysis**.
 ''')
 st.write('\n')
 st.write('\n')
 
+
+
 # copy data
 df_factor_analysis = data_survey.copy()
 
 st.subheader('Sample Survey Data')
 st.write('''
 Here we have a sample survey dataset where responders answer questions about their personality traits on a scale from 1 (Very Inaccurate) to 6 (Very Accurate).
+Factor Analysis gives us \'factors\' or clusters of responders which provide us insights into the different personalities of the responders.
 ''')
 
 # split page into two columns
 st.subheader('Factor Analysis Suitability')
 st.write('''
 Before performing Factor Analysis on the data, we need to evaluate if it is suitable to do so.
+We apply two statistical tests (Bartlett's and KMO test) to the data.
 These two tests check if the variables in the data are correlated with each other.
 If there isn't any correlation between the variables, then the data is unsuitable for factor analysis as there are no natural clusters.
 ''')
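The app's implementation of these two tests isn't shown in this hunk. As a rough sketch of what they compute (a plain numpy version written for illustration; libraries such as `factor_analyzer` provide ready-made equivalents, and the synthetic `answers` data here is made up):

```python
import numpy as np

def bartlett_sphericity(data: np.ndarray):
    # H0: the correlation matrix is an identity matrix (no correlation at all).
    # A large chi-square (small p-value) suggests the data is suitable for factor analysis.
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)
    chi_square = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(corr))
    dof = p * (p - 1) / 2
    return chi_square, dof

def kmo(data: np.ndarray) -> float:
    # Compare observed correlations against partial (anti-image) correlations.
    # Values above roughly 0.6 are commonly read as adequate for factor analysis.
    corr = np.corrcoef(data, rowvar=False)
    inv = np.linalg.inv(corr)
    partial = -inv / np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    np.fill_diagonal(corr, 0)
    np.fill_diagonal(partial, 0)
    return (corr ** 2).sum() / ((corr ** 2).sum() + (partial ** 2).sum())

# synthetic survey: two latent 'traits', three questions each -> natural clusters exist
rng = np.random.default_rng(0)
base = rng.normal(size=(500, 2))
answers = np.hstack([base[:, [0]] + 0.3 * rng.normal(size=(500, 3)),
                     base[:, [1]] + 0.3 * rng.normal(size=(500, 3))])

chi2_stat, dof = bartlett_sphericity(answers)
print(chi2_stat, dof, round(kmo(answers), 2))
```

With correlated questions like these, Bartlett's statistic is large and the KMO measure is well above 0.5, so factor analysis would be considered suitable.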
 fa_z_scores = fa_z_scores.apply(lambda x: round(x, 2))
 
 st.write('''
+Aggregating the scores of the clusters gives us detailed insights into the personality traits of the responders.
 The scores here have been normalised to Z-scores, which is a measure of how many standard deviations (SD) the score is away from the mean.
 E.g. A Z-score of 0 indicates the score is identical to the mean, while a Z-score of 1 indicates the score is 1 SD away from the mean.
 ''')
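The normalise-then-aggregate step described above can be sketched as follows (a minimal pandas example with made-up scores; the app's actual column names and cluster assignments may differ):

```python
import pandas as pd

# hypothetical raw trait scores for responders in two clusters
df = pd.DataFrame({
    'cluster':      [0, 0, 0, 1, 1, 1],
    'extraversion': [4, 5, 6, 1, 2, 3],
})

# Z-score: how many standard deviations a score is away from the overall mean
col = 'extraversion'
df[col + '_z'] = (df[col] - df[col].mean()) / df[col].std()

# average Z-score per cluster, rounded to 2 decimal places as in the app
z_by_cluster = df.groupby('cluster')[col + '_z'].mean().round(2)
print(z_by_cluster)  # cluster 0 -> 0.8, cluster 1 -> -0.8
```

A cluster whose average Z-score is positive scores above the overall mean on that trait, which is exactly the kind of contrast the aggregated table surfaces.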
 st.header('Uncovering Topics from Text Responses')
 st.write('''
 With feedback forms or open-ended survey questions, we want to know what the responders are generally talking about.
+One way would be to manually read all the collected responses to get a sense of the topics within; however, this is very manual and subjective.
 Using **Topic Modelling**, we can programmatically extract common topics with the help of machine learning.
 ''')
 st.write('\n')
 
 st.subheader('Visualising Topics')
 st.write('''
+Let's generate some topics without performing any cleaning of the data.
 ''')
 
 # load and plot topics using unclean data
 st.write('''
 From the chart above, we can see that 'Topic 0' and 'Topic 5' have some words that are not as meaningful.
 For 'Topic 0', we already know that the tweets are about the Tokyo 2020 Olympics, so having a topic for that isn't helpful.
+We refer to words like 'Tokyo', '2020', 'Olympics', etc. as *stopwords*; let's remove them and regenerate the topics.
 ''')
 st.write('\n')
 
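Why dataset-specific stopwords swamp the topics can be seen without any modelling at all, just by counting words. A stdlib-only sketch (the tweets and stopword list here are invented for illustration, not the app's data):

```python
from collections import Counter
import re

tweets = [
    'Tokyo 2020 Olympics women road race at the Olympics',
    'cheering the road race at Tokyo 2020',
    'what a road race, Tokyo 2020 was amazing',
]

# words that appear in nearly every tweet but carry no topic information
custom_stopwords = {'tokyo', '2020', 'olympics', 'the', 'at', 'a', 'what', 'was'}

def top_words(texts, stopwords):
    # tokenise, drop stopwords, and return the most frequent remaining words
    words = []
    for text in texts:
        words += [w for w in re.findall(r'[a-z0-9]+', text.lower()) if w not in stopwords]
    return Counter(words).most_common(3)

print(top_words(tweets, set()))            # dominated by 'tokyo' and '2020'
print(top_words(tweets, custom_stopwords)) # topical words like 'road' and 'race' surface
```

Topic models weight words in more sophisticated ways, but the same intuition applies: remove the words every document shares, and the distinguishing terms come forward.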
 
 st.write('''
 Now we can see that the topics have improved.
+We can use the top words in each topic to come up with a meaningful name; this has to be done manually and is subjective.
 ''')
 st.write('\n')
 st.write('\n')
 The model has an understanding of the relationship between words, e.g. 'Andy Murray' is related to 'tennis'.
 For example:
 *'Cilic vs Menezes, after more than 3 hours and millions of unconverted match points, is one of the worst quality ten…'*
+This tweet is in Topic 9 - Tennis without the word 'tennis' in it.
 
 Here we can inspect the individual tweets within each topic.
 ''')
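The 'related words' behaviour comes from comparing embeddings, vectors that place related words near each other. A toy sketch of the comparison (the 3-d vectors below are entirely made up for illustration; real sentence embeddings have hundreds of dimensions):

```python
import numpy as np

# hypothetical embeddings: nearby directions = related meanings
embeddings = {
    'tennis':      np.array([0.9, 0.1, 0.0]),
    'andy murray': np.array([0.8, 0.2, 0.1]),
    'stock price': np.array([0.0, 0.1, 0.9]),
}

def cosine(a, b):
    # cosine similarity: close to 1.0 = same direction, close to 0.0 = unrelated
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(embeddings['andy murray'], embeddings['tennis']))       # high
print(cosine(embeddings['andy murray'], embeddings['stock price']))  # low
```

Because 'Andy Murray' sits close to 'tennis' in this space, a tweet can land in the tennis topic even when the word 'tennis' never appears in it.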
 st.dataframe(topic_results.loc[(topic_results['Topic'] == inspect_topic)])
 st.markdown('''---''')
 st.write('\n')
 
 
 
 
 st.header('Classifying Text Responses and Sentiment Analysis')
 st.write(f'''
+With survey responses, sometimes as a business user we already have a general idea of what responders are talking about, and we want to categorise or classify the responses accordingly.
+As an example, within the topic of 'Climate Change', we are interested in finance, politics, technology, and wildlife.
 Using **Zero-shot Classification**, we can classify responses into one of these four categories.
 As an added bonus, we can also find out how responders feel about the categories using **Sentiment Analysis**.
 ''')
 Main category score ranges from 0 to 1, with 1 being very likely.
 
 The full set of scores are: {dict(zip(zero_shot_sample['labels'], [round(score, 2) for score in zero_shot_sample['scores']]))}
+The full set of scores adds up to 1.
 
 The sentiment is: {emoji[sentiment_label]} **{sentiment_label}** with a score of {round(sentiment_sample, 2)}
 Sentiment score ranges from 0 to 1, with 1 being very positive.
 
 st.subheader('Zero-Shot Classification and Sentiment Analysis Results')
 st.write(f'''
+Let's review all the tweets and how they fall into the categories of finance, politics, technology, and wildlife.
 ''')
 
 st.dataframe(zero_shot_results.style.format(precision=2))
 
 st.write(f'''
 We can observe that the model does not have strong confidence in predicting the categories for some of the tweets.
+It is likely that the tweet does not naturally fall into one of the defined categories.
 Before performing further analysis on our results, we can set a score threshold to only keep predictions that we're confident in.
 ''')
 st.write('\n')
 sentiment_results.columns = ['tweet', 'sentiment']
 
 st.write(f'''
+The predictions get better with a higher threshold, but this reduces the final number of tweets available for further analysis.
 Out of the {len(sentiment_results):,} tweets, we are now left with {len(zero_shot_results_clean)}.
 We also add on the sentiment score for the tweets; the score here ranges from 0 (most negative) to 1 (most positive).
 ''')
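The thresholding step can be sketched as follows (the column names mirror the renamed results table in the diff; the tweets, scores, and the `user_threshold` value are made up for illustration):

```python
import pandas as pd

# hypothetical zero-shot output: each tweet's best category and its confidence score
zero_shot_results = pd.DataFrame({
    'tweet':    ['tweet a', 'tweet b', 'tweet c', 'tweet d'],
    'category': ['finance', 'wildlife', 'politics', 'technology'],
    'score':    [0.92, 0.41, 0.77, 0.55],
})

# keep only the predictions we're confident in
user_threshold = 0.7
zero_shot_results_clean = zero_shot_results.loc[zero_shot_results['score'] >= user_threshold]

print(f'Out of the {len(zero_shot_results):,} tweets, we are now left with {len(zero_shot_results_clean)}.')
```

Raising `user_threshold` trades coverage for confidence: fewer tweets survive, but the ones that do are classified more reliably.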
 st.dataframe(classification_sentiment_df.style.format(precision=2))
 
 st.write(f'''
+The difficult part of zero-shot classification is defining the right set of categories for each business case.
+Some trial and error is required to find the appropriate words that can return the optimal results.
+E.g. Do we want to differentiate between 'plants' and 'animals', or is 'wildlife' better as an overall category?
 ''')
 
+st.write(f'''
+With sentiment analysis, the model typically has pitfalls such as not being able to detect sarcasm well.
+However, sarcastic responses are typically outliers in survey data, and these data points would be smoothed out when we look at the average sentiment scores.
+''')
+
+st.write('\n')
 # group by category, count tweets and get mean of sentiment
 classification_sentiment_agg = classification_sentiment_df.groupby(['category']).agg({'tweet':'count', 'sentiment':'mean'}).reset_index()
 classification_sentiment_agg = classification_sentiment_agg.rename(columns={'tweet':'count'})
 fig.add_hline(y=0.5, line_width=3, line_color='darkgreen')
 st.plotly_chart(fig, use_container_width=True)
 
+st.write('''
+To improve the performance of the models, further fine-tuning can be done.
+We would also need labelled data to test against, which is usually not readily available and can be difficult and expensive to obtain.
+
+If you're just thinking of exploring the feasibility of applying text analysis to your dataset, the pre-trained models used in this app will be perfect!
+We've leveraged state-of-the-art deep learning models to jumpstart our analytics capabilities.
+The base models used for sentiment analysis and zero-shot classification are called BERT (developed by Google) and BART (developed by Facebook) respectively.
+
+These language models require large amounts of data and resources to be trained.
+BERT was trained on the whole of Wikipedia (about 2.5 billion words) and 11 thousand books, while BART was trained on the same plus 63 million news articles and other text scraped from the internet.
+An example of a fine-tuned model is FinBERT, which builds on top of BERT and is further trained on financial news to analyse the sentiment of finance-related text.
+''')
+
 st.markdown('''---''')
 st.write('\n')
 st.write('\n')
 
 st.write('''
+That's the end of this demo 😎, the source code can be found on [Github](https://github.com/Greco1899/survey_analytics).
 ''')
 st.write('\n')
+st.image('https://images.unsplash.com/photo-1620712943543-bcc4688e7485')
+st.caption('Photo by [Andrea De Santis](https://unsplash.com/@santesson89) on [Unsplash](https://unsplash.com).')
+
+