# How to Replace Datasets in Model Evaluation

## Introduction

This notebook provides a beginner-friendly introduction to using different datasets in the context of JATIC and MAITE. So far, we have relied on ART to load the CIFAR10 dataset. In this notebook, we show how to load two other datasets (MNIST and the Hugging Face version of CIFAR10) and how to use them within JATIC.

:::: {grid} 5
:::{grid-item}
**Intended Audience:** All T&E Users
:::
:::{grid-item}
**Requirements:** Basic Python and Torchvision / ML Skills
:::
:::{grid-item}
**Notebook Runtime:** Full run of the notebook: <1 minute
:::
:::{grid-item}
**Reading time:** ~10 Minutes
:::
:::{grid-item}
**Order of Completion:** 1., then any.
:::
::::

::::{grid} 2

:::{grid-item}
:columns: 8

Before you begin, make sure that you download the how-to guide's companion Jupyter notebook. The notebook lets you follow along in your own environment and interact with the code as you learn. The code snippets are also included in the documentation, but the notebook is provided for ease of use and so that you can try things on your own.
:::
:::{grid-item}
:child-align: center
:columns: 4

```{note}
The [How to Replace Datasets in Model Evaluation Companion Notebook](https://github.com/IBM/heart-library/blob/main/notebooks/how_tos/image_classification/5_How_to_Replace_Datasets_in_Model_Evaluation.ipynb) can be downloaded via the HEART public GitHub.
```
:::
::::

### Contents

1. Imports
1. Load Satellite classification data
1. Load CIFAR10 model and data from ART
1. Load CIFAR10 from Hugging Face
1. Load CIFAR10 from PyTorch
1. Load Single channel dataset from Hugging Face (MNIST)
1. Conclusion
1. Next Steps

### Learning Objectives

- How to load the standard dataset used in the other notebooks (2.)
- Datasets can be imported from many different libraries (3.,4.,5.)
- How to load single channel (black-and-white) image data (6.)

## 1. Imports

We import all necessary libraries for this tutorial.
In order, we first import general-purpose libraries such as NumPy, then the relevant methods from ART, followed by the corresponding HEART functionality and the torch functions that support the model. Lastly, the `%matplotlib inline` magic enables plotting within the notebook.

```python
# General-purpose libraries
import numpy as np
import os
import requests
import matplotlib.pyplot as plt

# ART dataset loader
from art.utils import load_dataset

# HEART classifier wrapper and metric
from heart_library.estimators.classification.pytorch import JaticPyTorchClassifier
from heart_library.metrics import AccuracyPerturbationMetric

# Hugging Face dataset loader, aliased to avoid clashing with ART's load_dataset
from datasets import load_dataset as load_dataset_hf

# Torch and torchvision for the model
import torch
import torchvision
from torchvision import transforms
from torchvision.models import resnet18, ResNet18_Weights

# Render plots inside the notebook
%matplotlib inline
```

## 2. Load Satellite Classification Data

This is how the satellite dataset is loaded in all other how-to guides.

```python
# Map integer labels to human-readable class names
classes = {
    0:'Building',
    1:'Construction Site',
    2:'Engineering Vehicle',
    3:'Fishing Vessel',
    4:'Oil Tanker',
    5:'Vehicle Lot'
}

# Load the first 12 samples of the test split from Hugging Face
data = load_dataset_hf("CDAO/xview-subset-classification", split="test[0:12]")

# Show one sample; the title displays the label stored in the dataset
idx = 3
plt.title(f"Prediction: {classes[data[idx]['label']]}")
plt.imshow(data[idx]['image'])

# Build an untrained ResNet18 (weights=None replaces the legacy positional False)
model = torchvision.models.resnet18(weights=None)
# Replace the final layer so the output size matches the number of classes
num_ftrs = model.fc.in_features
model.fc = torch.nn.Linear(num_ftrs, len(classes.keys()))
# Load the fine-tuned weights and switch to evaluation mode
model.load_state_dict(torch.load('../../../utils/resources/models/xview_model.pt'))
_ = model.eval()
```

```text
Resolving data files: 0%| | 0/31 [00:00