# Arthur Quickstart

From a Python environment with the `arthurai` package installed, this quickstart code will:
1. Make binary classification predictions on a small dataset
2. Onboard the model with reference data to Arthur
3. Log batches of model inference data with Arthur
4. Get performance results for our model

## Imports

The `arthurai` package can be `pip`-installed from the terminal, along with `numpy` and `pandas`: `pip install arthurai numpy pandas`. Then you can import from the `arthurai` package like this:

```python
# Arthur imports
from arthurai import ArthurAI
from arthurai.common.constants import InputType, OutputType, Stage

# other libraries used in this example
import numpy as np
import pandas as pd
```

## Model Predictions

We write out samples from a Titanic survival prediction dataset explicitly in Python, giving the age of each passenger, the cost of their ticket, the passenger class of their ticket, and the ground-truth label of whether they survived. Our model's outputs are given by a predict function using only the `age` variable. We split the data into
* `reference_data` for onboarding the model
* `inference_data` for in-production inferences the model processes

```{note}
We include model outputs, ground-truth values, and non-input data in `reference_data`. These are optional but recommended.
```

```python
# Define Titanic sample data
titanic_data = pd.DataFrame({
    'age': [16.0, 24.0, 19.0, 58.0, 30.0, 22.0, 40.0, 37.0, 65.0, 32.0],
    'fare': [86.5, 49.5042, 8.05, 153.4625, 7.8958, 7.75, 7.8958, 29.7, 7.75, 7.8958],
    'passenger_class': [1, 1, 3, 1, 3, 3, 3, 1, 3, 3],
    'survived': [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]})

# Split into reference and inference data
# (.copy() avoids a pandas SettingWithCopyWarning when we add columns below)
reference_data, inference_data = titanic_data[:6].copy(), titanic_data[6:].copy()

# Predict the probability of Titanic survival as the inverse percentile of age
def predict(age):
    nearest_age_index = np.argmin(np.abs(np.sort(reference_data['age']) - age))
    return 1 - (nearest_age_index / (len(reference_data) - 1))

# reference_data and inference_data contain the model's inputs and outputs
reference_data['pred_survived'] = reference_data['age'].apply(predict)
inference_data['pred_survived'] = inference_data['age'].apply(predict)
```

## Onboarding

This code will only run once you enter a valid username and password. We register our `arthur_model` with Arthur as a tabular classifier named "TitanicQuickstart". Then we build its model schema from `reference_data`, specifying which attributes belong to which {ref}`stage `. Additionally, we configure extra settings for the `passenger_class` attribute. Finally, we save the model to the platform.

```python
# Connect to Arthur
arthur = ArthurAI(url="https://app.arthur.ai", login="")

# Register the model type with Arthur
arthur_model = arthur.model(partner_model_id="TitanicQuickstart",
                            input_type=InputType.Tabular,
                            output_type=OutputType.Multiclass)

# Map the PredictedValue attribute to its corresponding GroundTruth attribute value.
# This tells Arthur that the `pred_survived` column represents
# the probability that the ground-truth column has the value 1
pred_to_ground_truth_map = {'pred_survived': 1}

# Build the arthur_model schema on the reference dataset,
# specifying which attribute represents ground truth
# and which attributes are NonInputData.
# Arthur will monitor NonInputData attributes even though they are not model inputs.
arthur_model.build(reference_data,
                   ground_truth_column='survived',
                   pred_to_ground_truth_map=pred_to_ground_truth_map,
                   non_input_columns=['fare', 'passenger_class'])

# Configure the `passenger_class` attribute:
# 1. Turn on bias monitoring for the attribute.
# 2. Specify that the passenger_class attribute has possible values [1, 2, 3],
#    since that information was not present in reference_data (only values 1 and 3 are present).
arthur_model.get_attribute(name='passenger_class').set(monitor_for_bias=True,
                                                       categories=[1, 2, 3])

# onboard the model to Arthur
arthur_model.save()
```

## Sending Inferences

Here we send batches of inferences from `inference_data` to Arthur.

```python
# send four batches of inferences
for batch in range(4):
    # Sample the inference dataset with predictions
    inferences = inference_data.sample(np.random.randint(2, 5))

    # Send the inferences to Arthur
    arthur_model.send_inferences(inferences, batch_id=f"batch_{batch}")
```

## Performance Results

With our model onboarded and inferences sent, we can get performance results from Arthur. View your model in your Arthur dashboard, or use the code below to fetch the overall accuracy rate:

```python
# query model accuracy across the batches
query = {
    "select": [
        {"function": "accuracyRate"}
    ]
}
query_result = arthur_model.query(query)
```

If you print `query_result`, you should see `[{'accuracyRate': 1}]`.

## Next Steps

### {doc}`Basic Concepts `

The {doc}`basic_concepts` page contains a quick introduction to important terms and ideas to get familiar with model monitoring using the Arthur platform.

### {doc}`Onboard Your Model `

The {doc}`Model Onboarding walkthrough ` page covers the steps of onboarding a model, formatting attribute data, and sending inferences to Arthur.
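As a closing sanity check, the `[{'accuracyRate': 1}]` result from the Performance Results section above can be reproduced offline with plain `pandas`, without calling the Arthur API. This sketch rebuilds the quickstart's split and model, then thresholds `pred_survived` at 0.5 (an assumed cutoff chosen here for illustration, not taken from the Arthur platform) to compare predicted labels against ground truth:

```python
import numpy as np
import pandas as pd

# Rebuild the quickstart's reference/inference split (only the columns we need)
titanic_data = pd.DataFrame({
    'age': [16.0, 24.0, 19.0, 58.0, 30.0, 22.0, 40.0, 37.0, 65.0, 32.0],
    'survived': [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]})
reference_data, inference_data = titanic_data[:6].copy(), titanic_data[6:].copy()

# Same inverse-percentile-of-age model as in the Model Predictions section
def predict(age):
    nearest_age_index = np.argmin(np.abs(np.sort(reference_data['age']) - age))
    return 1 - (nearest_age_index / (len(reference_data) - 1))

inference_data['pred_survived'] = inference_data['age'].apply(predict)

# Threshold the predicted probability at 0.5 and compare to ground truth
predicted_labels = (inference_data['pred_survived'] >= 0.5).astype(int)
local_accuracy = (predicted_labels == inference_data['survived']).mean()
print(local_accuracy)  # 1.0, matching the accuracyRate returned by Arthur
```

Every inference-set passenger gets a survival probability of at most 0.2, so all predicted labels are 0, agreeing with all four ground-truth values.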