JSON for MNIST

This repo contains two JSON files, train.json and test.json, which are a human-readable reproduction of the training and testing data from the MNIST database. There are 60,000 entries in train.json and another 10,000 in test.json, matching the sizes of the MNIST training and test sets.

Formatting

A typical line in either JSON file looks like this:

{"84": [[7, 6]], "185": [[7, 7]], "159": [[7, 8]], "151": [[7, 9]], "60": [[7, 10]], "36": [[7, 11]], "222": [[8, 6]], "254": [[8, 7], [8, 8], [8, 9], [8, 10], [9, 12], [9, 14], [9, 15], [9, 16], [9, 19], [9, 20], [10, 20], [13, 18], [14, 18], [15, 17], [17, 16], [19, 15], [20, 14], [21, 13], [21, 14], [22, 13], [23, 12], [23, 13], [24, 12], [24, 13], [25, 11], [25, 12], [26, 11]], "241": [[8, 11]], "198": [[8, 12], [8, 13], [8, 14], [8, 15], [8, 16], [8, 17], [8, 18], [8, 19]], "170": [[8, 20]], "52": [[8, 21], [23, 14], [24, 14]], "67": [[9, 6], [10, 14], [10, 15], [10, 16]], "114": [[9, 7], [9, 9]], "72": [[9, 8]], "163": [[9, 10]], "227": [[9, 11]], "225": [[9, 13]], "250": [[9, 17]], "229": [[9, 18]], "140": [[9, 21]], "17": [[10, 11]], "66": [[10, 12]], "14": [[10, 13]], "59": [[10, 17], [14, 16]], "21": [[10, 18]], "236": [[10, 19]], "106": [[10, 21]], "83": [[11, 18], [12, 20]], "253": [[11, 19]], "209": [[11, 20]], "18": [[11, 21], [26, 13]], "22": [[12, 17]], "233": [[12, 18]], "255": [[12, 19]], "129": [[13, 17]], "238": [[13, 19]], "44": [[13, 20]], "249": [[14, 17]], "62": [[14, 19]], "133": [[15, 16], [23, 11]], "187": [[15, 18]], "5": [[15, 19]], "9": [[16, 15]], "205": [[16, 16]], "248": [[16, 17]], "58": [[16, 18]], "126": [[17, 15]], "182": [[17, 17]], "75": [[18, 14]], "251": [[18, 15]], "240": [[18, 16]], "57": [[18, 17]], "19": [[19, 13]], "221": [[19, 14]], "166": [[19, 16]], "3": [[20, 12]], "203": [[20, 13]], "219": [[20, 15], [25, 13]], "35": [[20, 16]], "38": [[21, 12]], "77": [[21, 15]], "31": [[22, 11]], "224": [[22, 12]], "115": [[22, 14]], "1": [[22, 15]], "61": [[24, 10]], "242": [[24, 11]], "121": [[25, 10], [26, 10]], "40": [[25, 14]], "207": [[26, 12]], "label": 7}

Each line is a JSON object, and it is also a valid dictionary literal in Python. Every key except the last one, label, is a string holding an integer between 0 and 255, which represents a greyscale value. The value for each of these keys is a list of coordinate pairs, indicating which pixels in the image have that greyscale value. The value associated with the label key is the digit that the image is supposed to represent.
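As an illustration of the format, here is a minimal Python sketch that turns one line back into a pixel grid. It rests on a few assumptions that the description above does not state outright: images are 28 by 28 (the standard MNIST size), each pair is (row, column), pixels that are not listed have greyscale value 0, and each file holds one JSON object per line. The names decode_entry and load_entries are made up for this example.

```python
import json

SIZE = 28  # assumption: standard MNIST image side length


def decode_entry(line):
    """Convert one JSON line into (image, label).

    image is a SIZE x SIZE list of lists of greyscale values.
    """
    entry = json.loads(line)
    label = entry.pop("label")
    # Assumption: any pixel not listed in the entry is background (0).
    image = [[0] * SIZE for _ in range(SIZE)]
    for grey, pixels in entry.items():
        # Keys are strings in JSON, so convert them back to integers.
        for row, col in pixels:  # assumption: pairs are (row, column)
            image[row][col] = int(grey)
    return image, label


def load_entries(path):
    """Yield (image, label) for each non-empty line of a file.

    Assumption: the file stores one JSON object per line.
    """
    with open(path) as handle:
        for line in handle:
            if line.strip():
                yield decode_entry(line)


# A shortened, made-up entry in the same format as the example above.
sample = '{"84": [[7, 6]], "254": [[8, 7], [8, 8]], "label": 7}'
image, label = decode_entry(sample)
print(label, image[7][6], image[8][7], image[0][0])  # prints: 7 84 254 0
```

If the assumptions hold, iterating over load_entries("train.json") would yield every training image together with its label.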