Mini Section:
Dataset Cards
Key Idea: What are Dataset Cards?
Hugging Face Dataset Cards are documentation that accompanies NLP datasets on the Hugging Face Hub. They alert users to potential biases in a dataset and so promote responsible use of that dataset for ML purposes. Like Datasheets for Datasets, Dataset Cards document the provenance, creation, and use of ML datasets; unlike datasheets, they are displayed in the Hugging Face interface and are built into the process of uploading a dataset to the Hub.
Fun Fact! The idea of Dataset Cards was inspired by Model Cards, proposed by Mitchell and colleagues (which we will cover in the next module!).
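To make the Key Idea concrete: on the Hub, a dataset card is stored as a README.md file that starts with a block of YAML metadata and continues with documentation sections. The sketch below assembles such a file as plain text. It is only an illustration under stated assumptions: `build_dataset_card` is a hypothetical helper written for this module (not part of any Hugging Face library), the section names follow the Hugging Face dataset card template, and the metadata values are made-up placeholders.

```python
# Hypothetical helper: builds the text of a minimal dataset card (README.md).
# Section names follow the Hugging Face dataset card template; the metadata
# values used below are illustrative placeholders, not a real dataset.

CARD_SECTIONS = [
    "Dataset Description",
    "Dataset Structure",
    "Dataset Creation",
    "Considerations for Using the Data",
    "Additional Information",
]


def build_dataset_card(name, metadata, notes):
    """Return card text: YAML front matter, then one heading per section."""
    lines = ["---"]
    for key, value in metadata.items():
        if isinstance(value, list):
            # Lists are written as YAML sequences, one item per line.
            lines.append(f"{key}:")
            lines.extend(f"- {item}" for item in value)
        else:
            lines.append(f"{key}: {value}")
    lines.append("---")
    lines.append("")
    lines.append(f"# Dataset Card for {name}")
    for section in CARD_SECTIONS:
        lines.append("")
        lines.append(f"## {section}")
        lines.append("")
        # Sections without notes get a visible placeholder.
        lines.append(notes.get(section, "[More Information Needed]"))
    return "\n".join(lines) + "\n"


card_text = build_dataset_card(
    name="Example Corpus",
    metadata={
        "language": ["en"],
        "license": "cc-by-4.0",
        "pretty_name": "Example Corpus",
    },
    notes={
        "Considerations for Using the Data": "Describe known biases and limitations here.",
    },
)
print(card_text)
```

The "Considerations for Using the Data" section is where a card would describe known biases and limitations, which is the part of the card most directly tied to responsible dataset use.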
Explore: Dataset Card Creator
ML practitioners and dataset creators/curators can create their own dataset card with the Dataset Card Creator, a web application built with React (a JavaScript library for building user interfaces).
Explore the application and read more about dataset cards here.
Explore: SNLI Dataset Card
The Stanford Natural Language Inference (SNLI) corpus (version 1.0) is a collection of 570,000 human-written English sentence pairs, each manually labeled as entailment, contradiction, or neutral.
Explore its dataset card on Hugging Face!