import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder, StandardScaler
from streamlit import *
import joblib


# Load the CSV data
data = pd.read_csv('dataset.csv')

# Split the data into features and labels
X = data.drop('PlacedOrNot', axis=1)
y = data['PlacedOrNot']

# Encode categorical features
categorical_features = ['HistoryOfBacklogs']
for feature in categorical_features:
    encoder = LabelEncoder()
    X[feature] = encoder.fit_transform(X[feature])

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create the pipeline
numerical_features = ['Internships', 'CGPA']
numerical_transformer = StandardScaler()
categorical_transformer = SimpleImputer(strategy='most_frequent')
preprocessor = ColumnTransformer(
    transformers=[
        ('num', numerical_transformer, numerical_features),
        ('cat', categorical_transformer, categorical_features)
    ])

pipeline = Pipeline([
    ('preprocessor', preprocessor),
    ('classifier', RandomForestClassifier(random_state=42))
])

# Train the model
pipeline.fit(X_train, y_train)

# Evaluate the model
accuracy = pipeline.score(X_test, y_test)
print('Accuracy:', accuracy)
joblib.dump(pipeline, 'student_placement_model.joblib')

# Prediction helper used by the Streamlit app
def predict_placement(internships, cgpa, history_of_backlogs, stream):
    # Load the trained pipeline
    pipeline = joblib.load('student_placement_model.joblib')
    
    # Prepare input data; column names must match those used in training.
    # 'stream' is collected by the UI but is not a feature of the pipeline.
    input_data = pd.DataFrame({'Internships': [internships],
                               'CGPA': [cgpa],
                               'HistoryOfBacklogs': [history_of_backlogs]})
    
    # Make prediction
    prediction = pipeline.predict(input_data)
    
    return prediction[0]

# Define Streamlit web app
def streamlit_app():
    title('Student Placement Prediction')
    internships = number_input('Number of internships:', min_value=0, max_value=10, step=1)
    cgpa = number_input('CGPA:', min_value=0.0, max_value=10.0, step=0.1)
    history_of_backlogs = number_input('Number of history of backlogs:', min_value=0, max_value=10, step=1)
    stream = selectbox('Stream:', options=['Science', 'Commerce', 'Arts'])
    # Run the prediction only when the button is clicked
    if button('Predict Placement'):
        prediction = predict_placement(internships, cgpa, history_of_backlogs, stream)
        result = 'Placed' if prediction == 1 else 'Not Placed'
        write(f'Result: {result}')

if __name__ == '__main__':
    streamlit_app()