Data_Analysis / pages /statistics.py

Bhanuprakashpujala

Update pages/statistics.py

b9390c2 verified 12 months ago

import streamlit as st

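# Illustrative sketch, not part of the original app: the inclusive-to-exclusive
# class conversion described on the "Types of Data" page. The helper name
# to_exclusive_classes and the sample class limits are hypothetical, and the
# sketch assumes at least two classes with the same gap d between neighbours.
def to_exclusive_classes(classes):
    """Widen inclusive (lower, upper) class limits into exclusive ones."""
    # d: gap between the upper bound of one class and the lower bound of the next
    d = classes[1][0] - classes[0][1]
    correction = d / 2  # correction factor, 0.5 when d = 1
    return [(lower - correction, upper + correction) for lower, upper in classes]


# Usage: to_exclusive_classes([(10, 19), (20, 29), (30, 39)])
# returns [(9.5, 19.5), (19.5, 29.5), (29.5, 39.5)]
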
# Function to display the main content
def show_content(topic):
    if topic == "Introduction":
        st.title("STATISTICS")
        st.subheader("Data")
        st.write("""
        Data are raw facts, information, or observations that are collected, recorded, or measured. Data can take many forms, including numbers, text, images, and audio.

        In statistics, data usually means a collection of facts expressed as numerical figures.
        """)
        st.write("Add some examples here.")
        if st.button("Next"):
            st.session_state.page = "Types of Data"
            # Rerun so the new page is drawn right after the click
            # (st.rerun is available in Streamlit 1.27 and later).
            st.rerun()

    elif topic == "Types of Data":
        st.subheader("Types of Data")
        st.image("images/page_1_img_1.png")
        st.write("""
        Quantitative data: Data measured in numerical figures.
        Ex: ages, weights, heights.
        Quantitative data is divided into 2 types:
        - Individual (ungrouped) data: each observation is listed as an individual value.
        Ex: 1, 2, 3, 4, 5 [n = 5]. If n < 30 we use individual data; if n >= 30 we use grouped data.
        - Grouped (frequency) data: the number of times an observation occurs is known as its frequency, and the data is recorded as values together with their frequencies.
        Grouped data is again divided into:
        1. Discrete frequency data: takes distinct values with gaps between them.
        """)
        st.image("images/page_2_img_1.png")
        st.write("""
        2. Continuous frequency data: can take any fractional value within the specified range.
        This is again divided into:
        - Exclusive continuous frequency data: the upper bound of each class is the same as the lower bound of the next class. Lower bounds are included and upper bounds are excluded.
        """)
        st.image("images/page_2_img_2.png")
        st.write("""
        - Inclusive continuous frequency data: the upper bound of each class is not the same as the lower bound of the next class. Both the lower bound and the upper bound are included.
        Because inclusive classes leave gaps between them, we convert them to exclusive classes. Inclusive to exclusive:
        Difference between the upper bound of any class and the lower bound of the next class: d = 1.
        Correction factor c.f = d/2 = 1/2 = 0.5.
        Then subtract c.f from each lower bound and add c.f to each upper bound.
        """)
        st.image("images/page_3_img_1.png")
        st.write("""
        Qualitative data: Also known as categorical data, this describes data that fits into categories. Qualitative data is not numerical. Categorical variables describe features such as a person’s gender or home town. Categorical measures are defined in natural language rather than in numbers. Sometimes categorical data holds numerical values, but those values have no mathematical meaning. Examples of categorical data are birthdate, favorite sport, and school postcode. Here the birthdate and school postcode contain numbers, but arithmetic on them is not meaningful.
        """)
        st.write("""
        Nominal: A type of qualitative data that puts things into categories or groups that have no natural order or ranking. Ex: colors, names, city names.
        Nominal data labels variables without providing a numerical value. It is the simplest form of data and is commonly used for classification or categorization. Nominal data is also called the nominal scale. It cannot be ordered or measured. Sometimes, though, data can be both qualitative and quantitative. Examples of nominal data are letters, symbols, words, gender, etc.
        """)
        st.write("""
        Ordinal: A type of qualitative data that has a natural order or ranking between the categories. In other words, ordinal data allows us to rank the categories by their relative position or importance.
        The significant feature of ordinal data is that the difference between the data values is not determined. This kind of variable is mostly found in surveys, finance, economics, questionnaires, and so on. For example, consider a survey that asks respondents to rate their satisfaction with a restaurant on a scale of 1 to 5, where 1 means “very dissatisfied” and 5 means “very satisfied”. The responses are ordinal data because they categorise the respondents by satisfaction level and the categories have a clear order (1 < 2 < 3 < 4 < 5), but the difference between categories is not quantitatively defined (the difference in satisfaction between 1 and 2 might not be the same as between 4 and 5). Ordinal data is commonly represented using a bar chart. It can also be presented in a table in which each row shows a distinct category.
        """)
        if st.button("Next"):
            st.session_state.page = "Data Collection"
            # Rerun so the new page is drawn right after the click
            # (st.rerun is available in Streamlit 1.27 and later).
            st.rerun()

    elif topic == "Data Collection":
        st.subheader("Data Collection in Statistics")
        st.write("""
        In statistics, data is usually collected in two ways: as a population or as a sample.
        1. Population Data: This refers to data that relates to all members of a particular group or set. The population is the entire group that you want to draw conclusions about. For example, if you wanted to know the average height of all adult men in India, the population would be all adult men in India.
        2. Sample Data: This refers to data collected from a subset of the population. A sample is a group of subjects selected from the population. For example, if it’s not feasible to measure the height of every adult man in India, you might measure the heights of a sample of 1000 men selected randomly from across the country. The idea is that the sample represents the population and can give you a good estimate of the population parameter.

        In both cases, the goal is usually to learn something about the population. When it’s not practical or possible to study the entire population, a sample is used, and statistical inference is used to draw conclusions about the population based on the sample.
        """)
        # Forward slash: the original "images\page_7_img_1.png" contains an
        # invalid escape sequence and does not resolve on Linux or macOS.
        st.image("images/page_7_img_1.png")
        st.write("Examples of ordinal data:")
        st.write("1. Educational levels (e.g., elementary school, high school, bachelor’s degree, master’s degree).")
        st.write("2. Satisfaction levels (e.g., very satisfied, somewhat satisfied, neither satisfied nor dissatisfied, somewhat dissatisfied, very dissatisfied).")
        if st.button("Next"):
            st.session_state.page = "Introduction to Statistics"
            # Rerun so the new page is drawn right after the click
            # (st.rerun is available in Streamlit 1.27 and later).
            st.rerun()

    elif topic == "Introduction to Statistics":
        st.subheader("Introduction to Statistics")
        st.write("""
        Statistics is a branch of mathematics that involves collecting, organizing, analyzing, interpreting, and presenting data. It helps us make sense of the information we gather and make informed decisions. In simpler terms, statistics is a toolbox of methods and techniques we use to understand and describe the world around us by studying numerical information.
        """)
        if st.button("Next"):
            st.session_state.page = "Types of Statistics"
            # Rerun so the new page is drawn right after the click
            # (st.rerun is available in Streamlit 1.27 and later).
            st.rerun()

    elif topic == "Types of Statistics":
        st.subheader("Types of Statistics")
        st.write("""
        The study of statistics can be organized in a variety of ways. One of the main ways is to subdivide statistics into two branches: descriptive statistics and inferential statistics. The difference between them rests on the definitions of population and sample.
        """)
        st.write("""
        Descriptive statistics summarize and organize characteristics of a data set. A data set is a collection of responses or observations from a sample or entire population.

        In quantitative research, after collecting data, the first step of statistical analysis is to describe characteristics of the responses, such as the average of one variable (e.g., age), or the relation between two variables (e.g., age and creativity).

        The next step is inferential statistics, which help you decide whether your data confirms or refutes your hypothesis and whether it is generalizable to a larger population.
        """)
        st.write("""
        Descriptive statistics is again divided into 4 parts:
        1. 1st order measurements or measures of central tendency
        2. 2nd order measurements or measures of dispersion
        3. 3rd order measurements or measures of skewness
        4. 4th order measurements or measures of kurtosis
        """)
        st.write("""
        Key points:
        - Descriptive statistics summarize or describe the characteristics of a data set.
        - Descriptive statistics consist of three basic categories of measures: measures of central tendency, measures of variability (or spread), and frequency distribution.
        - Measures of central tendency describe the center of the data set (mean, median, mode).
        - Measures of variability describe the dispersion of the data set (variance, standard deviation).
        - Measures of frequency distribution describe the occurrence of data within the data set (count).
        """)
        st.write("""
        1st order measurements or measures of central tendency:

        Central tendency means concentrating on the central part of the data. Measures that do so are called measures of central tendency.
        The 3 most common measures of central tendency are the mode, median, and mean.
        - Mode: the most frequent value.
        - Median: the middle number in an ordered dataset.
        - Mean: the sum of all values divided by the total number of values.

        Mean is again divided into:
        1. Arithmetic mean
        2. Geometric mean
        3. Harmonic mean
        """)

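# Illustrative sketch, not part of the original app: the measures of central
# tendency named on the "Types of Statistics" page, written out by hand.
# The helper names below are hypothetical. The standard-library "statistics"
# module is deliberately not imported, because this file is itself named
# statistics.py and could shadow it.
import math
from collections import Counter


def central_tendency(values):
    """Return the arithmetic mean, median, and mode(s) of a non-empty list."""
    ordered = sorted(values)
    n = len(ordered)
    mean = sum(ordered) / n
    mid = n // 2
    # Middle value for odd n, average of the two middle values for even n
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2
    counts = Counter(ordered)
    highest = max(counts.values())
    modes = [value for value, count in counts.items() if count == highest]
    return {"mean": mean, "median": median, "modes": modes}


def geometric_mean(values):
    """n-th root of the product of n positive values, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def harmonic_mean(values):
    """n divided by the sum of reciprocals of n positive values."""
    return len(values) / sum(1 / v for v in values)


# Usage: central_tendency([1, 2, 2, 3, 4]) gives mean 2.4, median 2, modes [2].
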
# Set up sidebar navigation
topics = [
    "Introduction",
    "Types of Data",
    "Data Collection",
    "Introduction to Statistics",
    "Types of Statistics",
]

st.sidebar.title("Statistics Topics")
selection = st.sidebar.radio("Go to", topics)

# Initialize session state if not already done
if 'page' not in st.session_state:
    st.session_state.page = selection

# Apply the sidebar selection only when the Navigate button is clicked
if st.sidebar.button("Navigate"):
    st.session_state.page = selection

# Display the selected content
show_content(st.session_state.page)