Skip to content

📁 A small MNIST-like The Simpsons character database to at least have some fun while training neural networks.

License

Notifications You must be signed in to change notification settings

alvarobartt/simpsons-mnist

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

29 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Simpsons-MNIST

Simpsons-MNIST is a small dataset of The Simpsons characters consisting of a training set of 8,000 examples and a test set of 2,000 examples. Each example is a 28x28 RGB/grayscale image, associated with a label from 10 classes. This is a small MNIST as it does not contain 60,000 training and 10,000 testing examples, since there's not that much data in the original dataset. This dataset is available in both formats, RGB and grayscale, in RGB as some of The Simpsons features are based on color, and the grayscale one, so as to be as similar as possible to the original MNIST. Anyway, this MNIST is intended to be used with educational purposes as it is small and dirty which means that you can train a neural network using this dataset from almost any computer, and also this can serve as a replacement of all the other well known MNIST datasets as it is not as boring as the rest of MNIST datasets, both the Google's original MNIST, and the other derivate MNIST-like dataset as: Zalando's Fashion-MNIST, ROIS-DS Center for Open Data in the Humanities Kuzushiji-MNIST, and some more created by the open source community.

Here's an example on how the data looks like:

RGB Grayscaled

About the data

The RGB dataset contains the following information:

Name Set Size # Images Download URL
rgb-train.zip Train 12 MBytes 8,000 RGB Train
rgb-test.zip Test 3 MBytes 2,000 RGB Test

And, the grayscaled dataset contains the following information:

Name Set Size # Images Download URL
grayscale-train.zip Train 8 MBytes 8,000 Grayscale Train
grayscale-test.zip Test 2 MBytes 2,000 Grayscale Test

And the labels for both datasets are the following ones:

{
  0: "bart_simpson",
  1: "charles_montgomery_burns",
  2: "homer_simpson",
  3: "krusty_the_clown",
  4: "lisa_simpson",
  5: "marge_simpson",
  6: "milhouse_van_houten",
  7: "moe_szyslak",
  8: "ned_flanders",
  9: "principal_skinner"
}

As an alternative solution you can just clone the repository as both dataset are included under the dataset/ directory. To do so you just need to: git clone git@github.com:alvarobartt/simpsons-mnist.git


Load the data

Here you have some examples on how to load the data (both RGB and grayscale):


License

The License of this dataset is Attribution-NonCommercial-ShareAlike 4.0 International, which is the same one as the one in the original dataset, so this is an inherited license.


Credits

The original dataset of The Simpsons Characters was created by Alexandre Attia in the following Kaggle dataset's page.

About

📁 A small MNIST-like The Simpsons character database to at least have some fun while training neural networks.

Topics

Resources

License

Stars

Watchers

Forks

Packages

No packages published