yitingliii committed on
Commit 7044552 · verified · 1 Parent(s): b98218f

Update README.md

Files changed (1)
  1. README.md +3 -10
README.md CHANGED
@@ -20,14 +20,7 @@ import pandas as pd
 from sklearn.svm import SVC
 ```
 
-2. File Description
-- config.json: Configuration file for model and dataset parameters.
-- ml.py: Python script containing the machine learning pipeline.
-- model.pkl: Trained SVM model saved as a pickle file.
-- tfidf.pkl: TF-IDF vectorizer saved as a pickle file.
-- README.md: Documentation for the repository.
-
-3. Data Cleaning
+2. Data Cleaning
 <br> The clean() function performs data preprocessing to prepare the input data for training. This includes:
 - Removing HTML tags using BeautifulSoup.
 - Removing non-alphanumeric characters and extra spaces.
@@ -37,10 +30,10 @@ from sklearn.svm import SVC
 
 
 ```python
-from clean_data import clean
+from data_cleaning import clean
 
 # Load your data
-df = pd.read_csv('your_dataset.csv')
+df = pd.read_csv('test_data_random_subset.csv')
 
 # Clean the data
 cleaned_df = clean(df)
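
For readers of this diff, the two cleaning steps visible in the hunk (stripping HTML tags, then removing non-alphanumeric characters and extra spaces) could be sketched roughly as below. This is not the repository's `clean()`: the README says that function uses BeautifulSoup and takes a DataFrame, while this sketch works on a single string and substitutes the standard library's `html.parser.HTMLParser` to stay dependency-free. The names `clean_text` and `_TextExtractor` are hypothetical, and any steps in the README lines the diff omits are not covered.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment, dropping the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for every run of text found between tags.
        self.parts.append(data)


def clean_text(text):
    """Hypothetical per-string helper; the real clean() operates on a DataFrame."""
    extractor = _TextExtractor()
    extractor.feed(text)
    extractor.close()  # flush any text still buffered by the parser
    no_html = " ".join(extractor.parts)
    # Replace anything that is not an ASCII letter, digit, or whitespace.
    alnum_only = re.sub(r"[^A-Za-z0-9\s]", " ", no_html)
    # Collapse runs of whitespace and trim both ends.
    return re.sub(r"\s+", " ", alnum_only).strip()
```

Under these assumptions, `clean_text("<p>Hello,   <b>world</b>!</p>")` returns `"Hello world"`. Applying it to a text column of a DataFrame would be a one-line `.apply(clean_text)`, but which column the real `clean()` touches is not shown in this diff.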