metadata

license: cc-by-sa-4.0
language:
  - en

phytoClassUCSC

Sections and prompts from the model cards paper, v2.

Jump to section:

Model details
Intended use
Factors
Metrics
Evaluation data
Training data
Quantitative analyses
Ethical considerations
Caveats and recommendations

Model details

Review section 4.1 of the model cards paper.

Developed by the Kudela Lab from the Ocean Sciences Department at University of California, Santa Cruz.
Current version trained in February 2023.
phytoClassUCSC-SoftNone02162023
phytoClassUCSC is a depthwise separable convolutional neural network (CNN) based on the Xception architecture (Chollet, F., 2017), with 134 layers and weights pretrained on ImageNet.
An average pooling layer is used.
Paper or other resource for more information
Citation details
License
Email Patrick Daniel ([email protected]) for questions
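
The architecture described above can be approximated with the Xception implementation that ships with Keras. This is a minimal sketch, not the lab's training code. The function name `build_classifier`, the `num_classes` argument, the 299x299 input size, and the softmax classification head are assumptions made for illustration. Only the Xception backbone, the ImageNet weights, and the average pooling layer come from the description above.

```python
def build_classifier(num_classes, weights="imagenet"):
    """Return an Xception-based image classifier (illustrative sketch).

    TensorFlow is imported lazily so this module loads without it installed.
    """
    import tensorflow as tf

    # Xception backbone without its ImageNet classification head.
    # pooling="avg" appends a global average pooling layer to the output.
    backbone = tf.keras.applications.Xception(
        include_top=False,
        weights=weights,
        input_shape=(299, 299, 3),  # assumption: Keras default size for Xception
        pooling="avg",
    )

    # Hypothetical head: one softmax output per phytoplankton class.
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(backbone.output)
    return tf.keras.Model(inputs=backbone.input, outputs=outputs)
```

Passing `weights=None` builds the same topology without downloading the pretrained weights, which can be convenient for a quick structural check.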

Intended use

This model was designed and trained to work with IFCB data generated in Monterey Bay. It may still perform well in other locations, but the distribution of training images reflects the phytoplankton commonly observed at the Santa Cruz Wharf and Power Buoy locations.

Independent model validation should be used when applying the model to other sites.
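
One way to carry out such a validation is to label a sample of images from the new site by hand and compare those labels with the model's predictions, class by class. The sketch below assumes that approach. The function `per_class_report` and the class names `taxon_a`, `taxon_b`, and `taxon_c` are placeholders, not part of the model or its label set.

```python
from collections import Counter


def per_class_report(true_labels, predicted_labels):
    """Compute precision, recall, and support for each class."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")

    true_count = Counter(true_labels)            # manually labeled images per class
    predicted_count = Counter(predicted_labels)  # model predictions per class
    true_pos = Counter(
        truth for truth, pred in zip(true_labels, predicted_labels) if truth == pred
    )

    report = {}
    for label in sorted(set(true_labels) | set(predicted_labels)):
        tp = true_pos[label]
        # None marks an undefined ratio (class never predicted or never observed).
        precision = tp / predicted_count[label] if predicted_count[label] else None
        recall = tp / true_count[label] if true_count[label] else None
        report[label] = {
            "precision": precision,
            "recall": recall,
            "support": true_count[label],
        }
    return report


# Made-up labels standing in for a manually annotated sample from a new site.
manual = ["taxon_a", "taxon_a", "taxon_b", "taxon_c"]
model = ["taxon_a", "taxon_b", "taxon_b", "taxon_c"]
report = per_class_report(manual, model)
```

Classes with low recall or low precision at the new site are the ones where the Monterey Bay training distribution is least likely to transfer.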

Review section 4.2 of the model cards paper.

Primary intended uses

Generalized phytoplankton classifier for common taxa found in Monterey Bay.

Primary intended users

IFCB users or researchers interested in phytoplankton ecology.

Out-of-scope use cases

Observing and identifying rare or non-endemic taxa.

Factors

Model classes were chosen based on common and resolvable phytoplankton taxa. Taxonomic groupings reflect the groups that researchers in the lab felt could be confidently identified, given the lab's expertise and research interests.

Review section 4.3 of the model cards paper.

Relevant factors

Evaluation factors

Metrics

The appropriate metrics to feature in a model card depend on the type of model that is being tested. For example, classification systems in which the primary output is a class label differ significantly from systems whose primary output is a score. In all cases, the reported metrics should be determined based on the model’s structure and intended use.

Review section 4.4 of the model cards paper.

Model performance measures

Decision thresholds

Approaches to uncertainty and variability

Evaluation data

All referenced datasets would ideally point to any set of documents that provide visibility into the source and composition of the dataset. Evaluation datasets should include datasets that are publicly available for third-party use. These could be existing datasets or new ones provided alongside the model card analyses to enable further benchmarking.

Review section 4.5 of the model cards paper.

Datasets

Motivation

Preprocessing

Training data

Review section 4.6 of the model cards paper.

Quantitative analyses

Quantitative analyses should be disaggregated, that is, broken down by the chosen factors. Quantitative analyses should provide the results of evaluating the model according to the chosen metrics, providing confidence interval values when possible.
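
As an illustration of a disaggregated analysis with confidence intervals, the sketch below groups evaluation results by one factor and reports accuracy with a Wilson score interval for each group. The choice of the Wilson interval, the function names, and the example records (with sampling location as the factor) are assumptions for demonstration only and do not describe results for this model.

```python
import math
from collections import defaultdict


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


def accuracy_by_factor(records):
    """Group (factor_value, is_correct) pairs and summarize each group."""
    counts = defaultdict(lambda: [0, 0])  # factor -> [correct, total]
    for factor, is_correct in records:
        counts[factor][0] += int(is_correct)
        counts[factor][1] += 1
    return {
        factor: {"accuracy": c / n, "ci95": wilson_interval(c, n), "n": n}
        for factor, (c, n) in counts.items()
    }


# Made-up evaluation records: (sampling location, prediction was correct).
records = [
    ("wharf", True), ("wharf", True), ("wharf", False), ("wharf", True),
    ("buoy", True), ("buoy", False),
]
summary = accuracy_by_factor(records)
```

Reporting the group size `n` next to each interval makes it clear when a wide interval is simply the result of few evaluation images.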

Review section 4.7 of the model cards paper.

Unitary results

Intersectional results

Ethical considerations

This section is intended to demonstrate the ethical considerations that went into model development, surfacing ethical challenges and solutions to stakeholders. Ethical analysis does not always lead to precise solutions, but the process of ethical contemplation is worthwhile to inform on responsible practices and next steps in future work.

Review section 4.8 of the model cards paper.

Data

Human life

Mitigations

Risks and harms

Use cases

Caveats and recommendations

This section should list additional concerns that were not covered in the previous sections.

Review section 4.9 of the model cards paper.