# Scikit-Learn

To run the scikit-learn examples, make sure you have installed the following library:

```bash
pip install -U scikit-learn
```

The metrics in `evaluate` can be easily integrated with a scikit-learn estimator or [pipeline](https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html#sklearn.pipeline.Pipeline).

However, these metrics require the model's predictions. The predictions and labels from the estimator can be passed to `evaluate` metrics to compute the required values.

```python
import numpy as np
np.random.seed(0)
import evaluate
from sklearn.compose import ColumnTransformer
from sklearn.datasets import fetch_openml
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
```

Load data from https://www.openml.org/d/40945:

```python
X, y = fetch_openml("titanic", version=1, as_frame=True, return_X_y=True)
```

Alternatively, `X` and `y` can be obtained directly from the frame attribute if the dataset is fetched without `return_X_y=True`:

```python
titanic = fetch_openml("titanic", version=1, as_frame=True)
X = titanic.frame.drop('survived', axis=1)
y = titanic.frame['survived']
```

We create the preprocessing pipelines for both numeric and categorical data. Note that `pclass` could be treated as either a categorical or a numeric feature.

```python
numeric_features = ["age", "fare"]
numeric_transformer = Pipeline(
    steps=[("imputer", SimpleImputer(strategy="median")), ("scaler", StandardScaler())]
)

categorical_features = ["embarked", "sex", "pclass"]
categorical_transformer = OneHotEncoder(handle_unknown="ignore")

preprocessor = ColumnTransformer(
    transformers=[
        ("num", numeric_transformer, numeric_features),
        ("cat", categorical_transformer, categorical_features),
    ]
)
```

Append the classifier to the preprocessing pipeline to obtain a full prediction pipeline.

```python
clf = Pipeline(
    steps=[("preprocessor", preprocessor), ("classifier", LogisticRegression())]
)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
```

Since `evaluate` metrics take lists as inputs for references and predictions, we need to convert the labels and predictions to Python lists.


```python
# Evaluate metrics accept lists as inputs for values of references and predictions

y_test = y_test.tolist()
y_pred = y_pred.tolist()

# Accuracy

accuracy_metric = evaluate.load("accuracy")
accuracy = accuracy_metric.compute(references=y_test, predictions=y_pred)
print("Accuracy:", accuracy)
# Accuracy: 0.79
```

You can use any suitable `evaluate` metric with scikit-learn estimators, as long as the metric is compatible with the task and the format of the predictions.
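
For example, the F1 score can be computed from the same predictions following the same pattern. The sketch below assumes the Titanic `survived` labels load as the strings `"0"`/`"1"`, so they are cast to integers before calling the metric:

```python
# Hedged sketch: other classification metrics from `evaluate` follow the same
# pattern. The `survived` labels are assumed to load as the strings "0"/"1",
# so we cast them to integers for the binary F1 computation.
y_test_int = [int(label) for label in y_test]
y_pred_int = [int(label) for label in y_pred]

f1_metric = evaluate.load("f1")
f1 = f1_metric.compute(references=y_test_int, predictions=y_pred_int)
print("F1:", f1)
```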