{"id":3923,"date":"2024-05-07T09:35:59","date_gmt":"2024-05-07T07:35:59","guid":{"rendered":"https:\/\/www.capri-soft.de\/blog\/?p=3923"},"modified":"2024-05-19T16:23:06","modified_gmt":"2024-05-19T14:23:06","slug":"ki-training-das-format-vom-mnist-datensatz-t10k-images-idx3-ubyte-t10k-labels-idx1-ubyte-train-images-idx3-ubyte-train-labels-idx1-ubyte","status":"publish","type":"post","link":"https:\/\/www.capri-soft.de\/blog\/?p=3923","title":{"rendered":"Neural networks (ANN) \/ AI training: the format \/ structure of the MNIST dataset (MNIST database) files t10k-images-idx3-ubyte, t10k-labels-idx1-ubyte, train-images-idx3-ubyte, train-labels-idx1-ubyte"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\">Intention<\/h2>\n\n\n\n<p>To refresh your knowledge of artificial neural networks (ANNs), you may want to work with frameworks such as PyTorch or TensorFlow.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Problem<\/h2>\n\n\n\n<p>Introductory tutorials usually refer to the &#8220;<a href=\"https:\/\/en.wikipedia.org\/wiki\/MNIST_database\">MNIST dataset<\/a>&#8221; or the &#8220;<a href=\"https:\/\/en.wikipedia.org\/wiki\/MNIST_database\">MNIST database<\/a>&#8221;: 70,000 handwritten digits, each a 28&#215;28 image with 256 gray levels per pixel (one byte per pixel). 60,000 of these images are for training a neural network and 10,000 for testing it. You cannot simply rename the extracted files to *.bmp and open them in, for example, Paint. At first it is unclear what format the files use, so you cannot look at individual digits.<\/p>\n\n\n\n<p>According to &#8220;<a href=\"https:\/\/yann.lecun.com\/exdb\/mnist\">https:\/\/yann.lecun.com\/exdb\/mnist<\/a>&#8221; (sometimes reachable only through an archive.org snapshot), the files are not in a standard image format. You have to write your own program to interpret the images. 
<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Analysis<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">train-images-idx3-ubyte, t10k-images-idx3-ubyte<\/h3>\n\n\n\n<p>These files are compressed with GZip (extension *.gz). In Windows they can be opened directly with a double-click or extracted with a right-click:<br><\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><a href=\"https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-1.png?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" width=\"185\" height=\"177\" data-attachment-id=\"3926\" data-permalink=\"https:\/\/www.capri-soft.de\/blog\/?attachment_id=3926\" data-orig-file=\"https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-1.png?fit=185%2C177&amp;ssl=1\" data-orig-size=\"185,177\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"image-1\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-1.png?fit=185%2C177&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-1.png?resize=185%2C177&#038;ssl=1\" alt=\"\" class=\"wp-image-3926\" style=\"width:180px;height:auto\"\/><\/a><\/figure>\n\n\n\n<p>The *-images* files contain images of handwritten digits from 0 to 9. The digits come from NIST Special Database 3 (written by US Census Bureau employees) and NIST Special Database 1 (written by high-school students). 
MNIST was assembled from these two collections in 1994.<\/p>\n\n\n\n<p>If you open the extracted files in a hex editor, for example the free HxD editor, and set the number of columns (bytes per row) to 28, a pattern of the contained digits is already visible:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-3.png?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" width=\"474\" height=\"232\" data-attachment-id=\"3931\" data-permalink=\"https:\/\/www.capri-soft.de\/blog\/?attachment_id=3931\" data-orig-file=\"https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-3.png?fit=1291%2C632&amp;ssl=1\" data-orig-size=\"1291,632\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"image-3\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-3.png?fit=474%2C232&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-3.png?resize=474%2C232&#038;ssl=1\" alt=\"\" class=\"wp-image-3931\" srcset=\"https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-3.png?resize=1024%2C501&amp;ssl=1 1024w, https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-3.png?resize=300%2C147&amp;ssl=1 300w, https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-3.png?resize=768%2C376&amp;ssl=1 768w, 
https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-3.png?w=1291&amp;ssl=1 1291w, https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-3.png?w=948&amp;ssl=1 948w\" sizes=\"auto, (max-width: 474px) 100vw, 474px\" \/><\/a><\/figure>\n\n\n\n<p>The first 16 bytes have the following structure (shown for train-images-idx3-ubyte; the t10k file holds 10000 images). The 32-bit integers are stored MSB first, i.e. big-endian:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n&#x5B;offset] &#x5B;type]          &#x5B;value]          &#x5B;description]\n0000     32 bit integer  0x00000803(2051) magic number\n0004     32 bit integer  60000            number of images\n0008     32 bit integer  28               number of rows\n0012     32 bit integer  28               number of columns\n0016     unsigned byte   ??               pixel\n0017     unsigned byte   ??               pixel\n........\nxxxx     unsigned byte   ??               pixel\n<\/pre><\/div>\n\n\n<p>The value 0x08 in the third byte of the magic number means that the data consists of unsigned byte (ubyte) values. The third byte can take the following values:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nThe third byte codes the type of the data:\n0x08: unsigned byte\n0x09: signed byte\n0x0B: short (2 bytes)\n0x0C: int (4 bytes)\n0x0D: float (4 bytes)\n0x0E: double (8 bytes)\n<\/pre><\/div>\n\n\n<p>The fourth byte of the magic number is 0x03 here, which means the data has 3 dimensions: image index, row, and column. The sizes of these dimensions are the three 32-bit integers that follow the magic number (60000, 28, 28), and each element is one pixel value (gray level 0-255).<\/p>\n\n\n\n<p>If you remove the highlighted header, i.e. the first 16 bytes (see the image above), 
for example in HxD by simply pressing the Delete key, each row of the hex editor corresponds to one image row and the handwriting is clearly recognizable:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large is-resized\"><a href=\"https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-4.png?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" width=\"474\" height=\"376\" data-attachment-id=\"3933\" data-permalink=\"https:\/\/www.capri-soft.de\/blog\/?attachment_id=3933\" data-orig-file=\"https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-4.png?fit=1311%2C1040&amp;ssl=1\" data-orig-size=\"1311,1040\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"image-4\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-4.png?fit=474%2C376&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-4.png?resize=474%2C376&#038;ssl=1\" alt=\"\" class=\"wp-image-3933\" style=\"width:800px;height:auto\" srcset=\"https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-4.png?resize=1024%2C812&amp;ssl=1 1024w, https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-4.png?resize=300%2C238&amp;ssl=1 300w, https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-4.png?resize=768%2C609&amp;ssl=1 768w, https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-4.png?w=1311&amp;ssl=1 1311w, 
https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-4.png?w=948&amp;ssl=1 948w\" sizes=\"auto, (max-width: 474px) 100vw, 474px\" \/><\/a><\/figure>\n\n\n\n<p>As mentioned above, each pixel is one unsigned byte (see the third byte of the magic number) with a value between 0 (white) and 255 (black); the values in between are linear gradations of gray.<\/p>\n\n\n\n<p>Here is another example, from the Fashion-MNIST database of clothing items (by Zalando), which uses the same file format:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-6.png?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" width=\"474\" height=\"419\" data-attachment-id=\"3962\" data-permalink=\"https:\/\/www.capri-soft.de\/blog\/?attachment_id=3962\" data-orig-file=\"https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-6.png?fit=1098%2C972&amp;ssl=1\" data-orig-size=\"1098,972\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"image-6\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-6.png?fit=474%2C419&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-6.png?resize=474%2C419&#038;ssl=1\" alt=\"\" class=\"wp-image-3962\" srcset=\"https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-6.png?resize=1024%2C906&amp;ssl=1 1024w, 
https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-6.png?resize=300%2C266&amp;ssl=1 300w, https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-6.png?resize=768%2C680&amp;ssl=1 768w, https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-6.png?w=1098&amp;ssl=1 1098w, https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-6.png?w=948&amp;ssl=1 948w\" sizes=\"auto, (max-width: 474px) 100vw, 474px\" \/><\/a><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">train-labels-idx1-ubyte, t10k-labels-idx1-ubyte<\/h3>\n\n\n\n<p>The *-labels* files have a similar structure. The labels are the digit values 0 to 9, one byte each, listed in the same order as the images in the *-images* files. They start after the 8-byte header at offset 8 (here 5 and 0, matching the digits in the screenshot above):<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-5.png?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" width=\"474\" height=\"206\" data-attachment-id=\"3947\" data-permalink=\"https:\/\/www.capri-soft.de\/blog\/?attachment_id=3947\" data-orig-file=\"https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-5.png?fit=655%2C284&amp;ssl=1\" data-orig-size=\"655,284\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"image-5\" data-image-description=\"\" data-image-caption=\"\" 
data-large-file=\"https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-5.png?fit=474%2C206&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-5.png?resize=474%2C206&#038;ssl=1\" alt=\"\" class=\"wp-image-3947\" srcset=\"https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-5.png?w=655&amp;ssl=1 655w, https:\/\/i0.wp.com\/www.capri-soft.de\/blog\/wp-content\/uploads\/2024\/05\/image-5.png?resize=300%2C130&amp;ssl=1 300w\" sizes=\"auto, (max-width: 474px) 100vw, 474px\" \/><\/a><\/figure>\n\n\n\n<p>The format (shown for train-labels-idx1-ubyte; the t10k file holds 10000 items) is therefore:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">[offset] [type]         [value]          [description]<br>0000     32 bit integer 0x00000801(2049) magic number<br>0004     32 bit integer 60000            number of items<br>0008     unsigned byte  [0-9]            label (digit 0-9)<br>\u2026\u2026..<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Intention To refresh your knowledge of artificial neural networks (ANNs), you may want to work with frameworks such as PyTorch or TensorFlow. Problem Introductory tutorials usually refer to the &#8220;MNIST dataset&#8221; or the &#8220;MNIST database&#8221;: 70,000 handwritten digits, each a 28&#215;28 image with 256 gray levels per pixel (one byte per pixel). 
60,000 of these images are &hellip; <a href=\"https:\/\/www.capri-soft.de\/blog\/?p=3923\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Neural networks (ANN) \/ AI training: the format \/ structure of the MNIST dataset (MNIST database) files t10k-images-idx3-ubyte, t10k-labels-idx1-ubyte, train-images-idx3-ubyte, train-labels-idx1-ubyte<\/span> <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[33,374,96],"tags":[],"class_list":["post-3923","post","type-post","status-publish","format-standard","hentry","category-c","category-kuenstliche-intelligenz","category-python"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p4yGeN-11h","jetpack_likes_enabled":true,"jetpack-related-posts":[],
"_links":{"self":[{"href":"https:\/\/www.capri-soft.de\/blog\/index.php?rest_route=\/wp\/v2\/posts\/3923","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.capri-soft.de\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.capri-soft.de\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.capri-soft.de\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.capri-soft.de\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=3923"}],"version-history":[{"count":30,"href":"https:\/\/www.capri-soft.de\/blog\/index.php?rest_route=\/wp\/v2\/posts\/3923\/revisions"}],"predecessor-version":[{"id":3964,"href":"https:\/\/www.capri-soft.de\/blog\/index.php?rest_route=\/wp\/v2\/posts\/3923\/revisions\/3964"}],"wp:attachment":[{"href":"https:\/\/www.capri-soft.de\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=3923"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.capri-soft.de\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=3923"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.capri-soft.de\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=3923"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}