minesweeper

A minesweeper implementation to play around with Hare and Raylib
git clone https://git.tronto.net/minesweeper

stb_image.h (283010B)


      1 /* stb_image - v2.30 - public domain image loader - http://nothings.org/stb
      2                                   no warranty implied; use at your own risk
      3 
      4    Do this:
      5       #define STB_IMAGE_IMPLEMENTATION
      6    before you include this file in *one* C or C++ file to create the implementation.
      7 
      8    // i.e. it should look like this:
      9    #include ...
     10    #include ...
     11    #include ...
     12    #define STB_IMAGE_IMPLEMENTATION
     13    #include "stb_image.h"
     14 
     15    You can #define STBI_ASSERT(x) before the #include to avoid using assert.h.
     16    And #define STBI_MALLOC, STBI_REALLOC, and STBI_FREE to avoid using malloc, realloc, free.
     17 
     18 
     19    QUICK NOTES:
     20       Primarily of interest to game developers and other people who can
     21           avoid problematic images and only need the trivial interface
     22 
     23       JPEG baseline & progressive (12 bpc/arithmetic not supported, same as stock IJG lib)
     24       PNG 1/2/4/8/16-bit-per-channel
     25 
     26       TGA (not sure what subset, if a subset)
     27       BMP non-1bpp, non-RLE
     28       PSD (composited view only, no extra channels, 8/16 bit-per-channel)
     29 
     30       GIF (*comp always reports as 4-channel)
     31       HDR (radiance rgbE format)
     32       PIC (Softimage PIC)
     33       PNM (PPM and PGM binary only)
     34 
     35       Animated GIF still needs a proper API, but here's one way to do it:
     36           http://gist.github.com/urraka/685d9a6340b26b830d49
     37 
     38       - decode from memory or through FILE (define STBI_NO_STDIO to remove code)
     39       - decode from arbitrary I/O callbacks
     40       - SIMD acceleration on x86/x64 (SSE2) and ARM (NEON)
     41 
     42    Full documentation under "DOCUMENTATION" below.
     43 
     44 
     45 LICENSE
     46 
     47   See end of file for license information.
     48 
     49 RECENT REVISION HISTORY:
     50 
     51       2.30  (2024-05-31) avoid erroneous gcc warning
     52       2.29  (2023-05-xx) optimizations
     53       2.28  (2023-01-29) many error fixes, security errors, just tons of stuff
     54       2.27  (2021-07-11) document stbi_info better, 16-bit PNM support, bug fixes
     55       2.26  (2020-07-13) many minor fixes
     56       2.25  (2020-02-02) fix warnings
     57       2.24  (2020-02-02) fix warnings; thread-local failure_reason and flip_vertically
     58       2.23  (2019-08-11) fix clang static analysis warning
     59       2.22  (2019-03-04) gif fixes, fix warnings
     60       2.21  (2019-02-25) fix typo in comment
     61       2.20  (2019-02-07) support utf8 filenames in Windows; fix warnings and platform ifdefs
     62       2.19  (2018-02-11) fix warning
     63       2.18  (2018-01-30) fix warnings
     64       2.17  (2018-01-29) bugfix, 1-bit BMP, 16-bitness query, fix warnings
     65       2.16  (2017-07-23) all functions have 16-bit variants; optimizations; bugfixes
     66       2.15  (2017-03-18) fix png-1,2,4; all Imagenet JPGs; no runtime SSE detection on GCC
     67       2.14  (2017-03-03) remove deprecated STBI_JPEG_OLD; fixes for Imagenet JPGs
     68       2.13  (2016-12-04) experimental 16-bit API, only for PNG so far; fixes
     69       2.12  (2016-04-02) fix typo in 2.11 PSD fix that caused crashes
     70       2.11  (2016-04-02) 16-bit PNGS; enable SSE2 in non-gcc x64
     71                          RGB-format JPEG; remove white matting in PSD;
     72                          allocate large structures on the stack;
     73                          correct channel count for PNG & BMP
     74       2.10  (2016-01-22) avoid warning introduced in 2.09
     75       2.09  (2016-01-16) 16-bit TGA; comments in PNM files; STBI_REALLOC_SIZED
     76 
     77    See end of file for full revision history.
     78 
     79 
     80  ============================    Contributors    =========================
     81 
     82  Image formats                          Extensions, features
     83     Sean Barrett (jpeg, png, bmp)          Jetro Lauha (stbi_info)
     84     Nicolas Schulz (hdr, psd)              Martin "SpartanJ" Golini (stbi_info)
     85     Jonathan Dummer (tga)                  James "moose2000" Brown (iPhone PNG)
     86     Jean-Marc Lienher (gif)                Ben "Disch" Wenger (io callbacks)
     87     Tom Seddon (pic)                       Omar Cornut (1/2/4-bit PNG)
     88     Thatcher Ulrich (psd)                  Nicolas Guillemot (vertical flip)
     89     Ken Miller (pgm, ppm)                  Richard Mitton (16-bit PSD)
     90     github:urraka (animated gif)           Junggon Kim (PNM comments)
     91     Christopher Forseth (animated gif)     Daniel Gibson (16-bit TGA)
     92                                            socks-the-fox (16-bit PNG)
     93                                            Jeremy Sawicki (handle all ImageNet JPGs)
     94  Optimizations & bugfixes                  Mikhail Morozov (1-bit BMP)
     95     Fabian "ryg" Giesen                    Anael Seghezzi (is-16-bit query)
     96     Arseny Kapoulkine                      Simon Breuss (16-bit PNM)
     97     John-Mark Allen
     98     Carmelo J Fdez-Aguera
     99 
    100  Bug & warning fixes
    101     Marc LeBlanc            David Woo          Guillaume George     Martins Mozeiko
    102     Christpher Lloyd        Jerry Jansson      Joseph Thomson       Blazej Dariusz Roszkowski
    103     Phil Jordan                                Dave Moore           Roy Eltham
    104     Hayaki Saito            Nathan Reed        Won Chun
    105     Luke Graham             Johan Duparc       Nick Verigakis       the Horde3D community
    106     Thomas Ruf              Ronny Chevalier                         github:rlyeh
    107     Janez Zemva             John Bartholomew   Michal Cichon        github:romigrou
    108     Jonathan Blow           Ken Hamada         Tero Hanninen        github:svdijk
    109     Eugene Golushkov        Laurent Gomila     Cort Stratton        github:snagar
    110     Aruelien Pocheville     Sergio Gonzalez    Thibault Reuille     github:Zelex
    111     Cass Everitt            Ryamond Barbiero                        github:grim210
    112     Paul Du Bois            Engin Manap        Aldo Culquicondor    github:sammyhw
    113     Philipp Wiesemann       Dale Weiler        Oriol Ferrer Mesia   github:phprus
    114     Josh Tobin              Neil Bickford      Matthew Gregan       github:poppolopoppo
    115     Julian Raschke          Gregory Mullen     Christian Floisand   github:darealshinji
    116     Baldur Karlsson         Kevin Schmidt      JR Smith             github:Michaelangel007
    117                             Brad Weinberger    Matvey Cherevko      github:mosra
    118     Luca Sas                Alexander Veselov  Zack Middleton       [reserved]
    119     Ryan C. Gordon          [reserved]                              [reserved]
    120                      DO NOT ADD YOUR NAME HERE
    121 
    122                      Jacko Dirks
    123 
    124   To add your name to the credits, pick a random blank space in the middle and fill it.
    125   80% of merge conflicts on stb PRs are due to people adding their name at the end
    126   of the credits.
    127 */
    128 
    129 #ifndef STBI_INCLUDE_STB_IMAGE_H
    130 #define STBI_INCLUDE_STB_IMAGE_H
    131 
    132 // DOCUMENTATION
    133 //
    134 // Limitations:
    135 //    - no 12-bit-per-channel JPEG
    136 //    - no JPEGs with arithmetic coding
    137 //    - GIF always returns *comp=4
    138 //
    139 // Basic usage (see HDR discussion below for HDR usage):
    140 //    int x,y,n;
    141 //    unsigned char *data = stbi_load(filename, &x, &y, &n, 0);
    142 //    // ... process data if not NULL ...
    143 //    // ... x = width, y = height, n = # 8-bit components per pixel ...
    144 //    // ... replace '0' with '1'..'4' to force that many components per pixel
    145 //    // ... but 'n' will always be the number that it would have been if you said 0
    146 //    stbi_image_free(data);
    147 //
    148 // Standard parameters:
    149 //    int *x                 -- outputs image width in pixels
    150 //    int *y                 -- outputs image height in pixels
    151 //    int *channels_in_file  -- outputs # of image components in image file
    152 //    int desired_channels   -- if non-zero, # of image components requested in result
    153 //
    154 // The return value from an image loader is an 'unsigned char *' which points
    155 // to the pixel data, or NULL on an allocation failure or if the image is
    156 // corrupt or invalid. The pixel data consists of *y scanlines of *x pixels,
    157 // with each pixel consisting of N interleaved 8-bit components; the first
    158 // pixel pointed to is top-left-most in the image. There is no padding between
    159 // image scanlines or between pixels, regardless of format. The number of
    160 // components N is 'desired_channels' if desired_channels is non-zero, or
    161 // *channels_in_file otherwise. If desired_channels is non-zero,
    162 // *channels_in_file has the number of components that _would_ have been
    163 // output otherwise. E.g. if you set desired_channels to 4, you will always
    164 // get RGBA output, but you can check *channels_in_file to see if it's trivially
    165 // opaque because e.g. there were only 3 channels in the source image.
    166 //
    167 // An output image with N components has the following components interleaved
    168 // in this order in each pixel:
    169 //
    170 //     N=#comp     components
    171 //       1           grey
    172 //       2           grey, alpha
    173 //       3           red, green, blue
    174 //       4           red, green, blue, alpha
    175 //
    176 // If image loading fails for any reason, the return value will be NULL,
    177 // and *x, *y, *channels_in_file will be unchanged. The function
    178 // stbi_failure_reason() can be queried for an extremely brief, end-user
    179 // unfriendly explanation of why the load failed. Define STBI_NO_FAILURE_STRINGS
    180 // to avoid compiling these strings at all, and STBI_FAILURE_USERMSG to get slightly
    181 // more user-friendly ones.
    182 //
    183 // Paletted PNG, BMP, GIF, and PIC images are automatically depalettized.
    184 //
    185 // To query the width, height and component count of an image without having to
    186 // decode the full file, you can use the stbi_info family of functions:
    187 //
    188 //   int x,y,n,ok;
    189 //   ok = stbi_info(filename, &x, &y, &n);
    190 //   // returns ok=1 and sets x, y, n if image is a supported format,
    191 //   // 0 otherwise.
    192 //
    193 // Note that stb_image pervasively uses ints in its public API for sizes,
    194 // including sizes of memory buffers. This is now part of the API and thus
    195 // hard to change without causing breakage. As a result, the various image
    196 // loaders all have certain limits on image size; these differ somewhat
    197 // by format but generally boil down to either just under 2GB or just under
    198 // 1GB. When the decoded image would be larger than this, stb_image decoding
    199 // will fail.
    200 //
    201 // Additionally, stb_image will reject image files that have any of their
    202 // dimensions set to a larger value than the configurable STBI_MAX_DIMENSIONS,
    203 // which defaults to 2**24 = 16777216 pixels. Due to the above memory limit,
    204 // the only way to have an image with such dimensions load correctly
    205 // is for it to have a rather extreme aspect ratio. Either way, the
    206 // assumption here is that such larger images are likely to be malformed
    207 // or malicious. If you do need to load an image with individual dimensions
    208 // larger than that, and it still fits in the overall size limit, you can
    209 // #define STBI_MAX_DIMENSIONS on your own to be something larger.
    210 //
    211 // ===========================================================================
    212 //
    213 // UNICODE:
    214 //
    215 //   If compiling for Windows and you wish to use Unicode filenames, compile
    216 //   with
    217 //       #define STBI_WINDOWS_UTF8
    218 //   and pass utf8-encoded filenames. Call stbi_convert_wchar_to_utf8 to convert
    219 //   Windows wchar_t filenames to utf8.
    220 //
    221 // ===========================================================================
    222 //
    223 // Philosophy
    224 //
    225 // stb libraries are designed with the following priorities:
    226 //
    227 //    1. easy to use
    228 //    2. easy to maintain
    229 //    3. good performance
    230 //
    231 // Sometimes I let "good performance" creep up in priority over "easy to maintain",
    232 // and for best performance I may provide less-easy-to-use APIs that give higher
    233 // performance, in addition to the easy-to-use ones. Nevertheless, it's important
    234 // to keep in mind that from the standpoint of you, a client of this library,
    235 // all you care about is #1 and #3, and stb libraries DO NOT emphasize #3 above all.
    236 //
    237 // Some secondary priorities arise directly from the first two, some of which
    238 // provide more explicit reasons why performance can't be emphasized.
    239 //
    240 //    - Portable ("ease of use")
    241 //    - Small source code footprint ("easy to maintain")
    242 //    - No dependencies ("ease of use")
    243 //
    244 // ===========================================================================
    245 //
    246 // I/O callbacks
    247 //
    248 // I/O callbacks allow you to read from arbitrary sources, like packaged
    249 // files or some other source. Data read from callbacks are processed
    250 // through a small internal buffer (currently 128 bytes) to try to reduce
    251 // overhead.
    252 //
    253 // The three functions you must define are "read" (reads some bytes of data),
    254 // "skip" (skips some bytes of data), "eof" (reports if the stream is at the end).
    255 //
    256 // ===========================================================================
    257 //
    258 // SIMD support
    259 //
    260 // The JPEG decoder will try to automatically use SIMD kernels on x86 when
    261 // supported by the compiler. For ARM Neon support, you must explicitly
    262 // request it.
    263 //
    264 // (The old do-it-yourself SIMD API is no longer supported in the current
    265 // code.)
    266 //
    267 // On x86, SSE2 will automatically be used when available based on a run-time
    268 // test; if not, the generic C versions are used as a fall-back. On ARM targets,
    269 // the typical path is to have separate builds for NEON and non-NEON devices
    270 // (at least this is true for iOS and Android). Therefore, the NEON support is
    271 // toggled by a build flag: define STBI_NEON to get NEON loops.
    272 //
    273 // If for some reason you do not want to use any of the SIMD code, or if
    274 // you have issues compiling it, you can disable it entirely by
    275 // defining STBI_NO_SIMD.
    276 //
    277 // ===========================================================================
    278 //
    279 // HDR image support   (disable by defining STBI_NO_HDR)
    280 //
    281 // stb_image supports loading HDR images in general, and currently the Radiance
    282 // .HDR file format specifically. You can still load any file through the existing
    283 // interface; if you attempt to load an HDR file, it will be automatically remapped
    284 // to LDR, assuming gamma 2.2 and an arbitrary scale factor defaulting to 1;
    285 // both of these constants can be reconfigured through this interface:
    286 //
    287 //     stbi_hdr_to_ldr_gamma(2.2f);
    288 //     stbi_hdr_to_ldr_scale(1.0f);
    289 //
    290 // (note, do not use _inverse_ constants; stb_image will invert them
    291 // appropriately).
    292 //
    293 // Additionally, there is a new, parallel interface for loading files as
    294 // (linear) floats to preserve the full dynamic range:
    295 //
    296 //    float *data = stbi_loadf(filename, &x, &y, &n, 0);
    297 //
    298 // If you load LDR images through this interface, those images will
    299 // be promoted to floating point values, run through the inverse of
    300 // constants corresponding to the above:
    301 //
    302 //     stbi_ldr_to_hdr_scale(1.0f);
    303 //     stbi_ldr_to_hdr_gamma(2.2f);
    304 //
    305 // Finally, given a filename (or an open file or memory block--see header
    306 // file for details) containing image data, you can query for the "most
    307 // appropriate" interface to use (that is, whether the image is HDR or
    308 // not), using:
    309 //
    310 //     stbi_is_hdr(char *filename);
    311 //
    312 // ===========================================================================
    313 //
    314 // iPhone PNG support:
    315 //
    316 // We optionally support converting iPhone-formatted PNGs (which store
    317 // premultiplied BGRA) back to RGB, even though they're internally encoded
    318 // differently. To enable this conversion, call
    319 // stbi_convert_iphone_png_to_rgb(1).
    320 //
    321 // Call stbi_set_unpremultiply_on_load(1) as well to force a divide per
    322 // pixel to remove any premultiplied alpha *only* if the image file explicitly
    323 // says there's premultiplied data (currently only happens in iPhone images,
    324 // and only if iPhone convert-to-rgb processing is on).
    325 //
    326 // ===========================================================================
    327 //
    328 // ADDITIONAL CONFIGURATION
    329 //
    330 //  - You can suppress implementation of any of the decoders to reduce
    331 //    your code footprint by #defining one or more of the following
    332 //    symbols before creating the implementation.
    333 //
    334 //        STBI_NO_JPEG
    335 //        STBI_NO_PNG
    336 //        STBI_NO_BMP
    337 //        STBI_NO_PSD
    338 //        STBI_NO_TGA
    339 //        STBI_NO_GIF
    340 //        STBI_NO_HDR
    341 //        STBI_NO_PIC
    342 //        STBI_NO_PNM   (.ppm and .pgm)
    343 //
    344 //  - You can request *only* certain decoders and suppress all other ones
    345 //    (this will be more forward-compatible, as addition of new decoders
    346 //    doesn't require you to disable them explicitly):
    347 //
    348 //        STBI_ONLY_JPEG
    349 //        STBI_ONLY_PNG
    350 //        STBI_ONLY_BMP
    351 //        STBI_ONLY_PSD
    352 //        STBI_ONLY_TGA
    353 //        STBI_ONLY_GIF
    354 //        STBI_ONLY_HDR
    355 //        STBI_ONLY_PIC
    356 //        STBI_ONLY_PNM   (.ppm and .pgm)
    357 //
    358 //  - If you use STBI_NO_PNG (or _ONLY_ without PNG), and you still
    359 //    want the zlib decoder to be available, #define STBI_SUPPORT_ZLIB
    360 //
    361 //  - If you define STBI_MAX_DIMENSIONS, stb_image will reject images greater
    362 //    than that size (in either width or height) without further processing.
    363 //    This is to let programs in the wild set an upper bound to prevent
    364 //    denial-of-service attacks on untrusted data, as one could generate a
    365 //    valid image of gigantic dimensions and force stb_image to allocate a
    366 //    huge block of memory and spend disproportionate time decoding it. By
    367 //    default this is set to (1 << 24), which is 16777216, but that's still
    368 //    very big.
    369 
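The documentation comment above describes the decoded buffer as *y scanlines of *x pixels, each with N interleaved 8-bit components, top-left first and with no padding. That layout implies a simple index calculation, sketched below. This is only an illustration and not part of stb_image: `example_pixel_offset` and `example_image_bytes` are hypothetical helper names, and the overflow check is one possible way to respect the "sizes are ints" limitation, not the library's own code.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper (not part of stb_image): byte offset of component c
   of pixel (px, py) in a tightly packed image that is w pixels wide and has
   n interleaved 8-bit components per pixel. */
static size_t example_pixel_offset(int w, int n, int px, int py, int c)
{
   assert(w > 0 && n >= 1 && n <= 4);
   assert(px >= 0 && px < w && py >= 0 && c >= 0 && c < n);
   return ((size_t) py * (size_t) w + (size_t) px) * (size_t) n + (size_t) c;
}

/* Hypothetical helper: total decoded size in bytes, or 0 if the arguments
   are not positive or w*h*n would not fit in an int. */
static int example_image_bytes(int w, int h, int n)
{
   if (w <= 0 || h <= 0 || n <= 0) return 0;
   if (w > INT_MAX / h) return 0;       /* w*h would overflow */
   if (w * h > INT_MAX / n) return 0;   /* w*h*n would overflow */
   return w * h * n;
}
```

With `data`, `x` and `n` obtained from `stbi_load(filename, &x, &y, &n, 0)`, the first component of the top-left pixel would be `data[example_pixel_offset(x, n, 0, 0, 0)]`.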
    370 #ifndef STBI_NO_STDIO
    371 #include <stdio.h>
    372 #endif // STBI_NO_STDIO
    373 
    374 #define STBI_VERSION 1
    375 
    376 enum
    377 {
    378    STBI_default = 0, // only used for desired_channels
    379 
    380    STBI_grey       = 1,
    381    STBI_grey_alpha = 2,
    382    STBI_rgb        = 3,
    383    STBI_rgb_alpha  = 4
    384 };
    385 
    386 #include <stdlib.h>
    387 typedef unsigned char stbi_uc;
    388 typedef unsigned short stbi_us;
    389 
    390 #ifdef __cplusplus
    391 extern "C" {
    392 #endif
    393 
    394 #ifndef STBIDEF
    395 #ifdef STB_IMAGE_STATIC
    396 #define STBIDEF static
    397 #else
    398 #define STBIDEF extern
    399 #endif
    400 #endif
    401 
    402 //////////////////////////////////////////////////////////////////////////////
    403 //
    404 // PRIMARY API - works on images of any type
    405 //
    406 
    407 //
    408 // load image by filename, open file, or memory buffer
    409 //
    410 
    411 typedef struct
    412 {
    413    int      (*read)  (void *user,char *data,int size);   // fill 'data' with 'size' bytes.  return number of bytes actually read
    414    void     (*skip)  (void *user,int n);                 // skip the next 'n' bytes, or 'unget' the last -n bytes if negative
    415    int      (*eof)   (void *user);                       // returns nonzero if we are at end of file/data
    416 } stbi_io_callbacks;
    417 
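As an illustration of the three callbacks declared above, here is a minimal sketch that serves bytes from a memory buffer. To compile without stb_image.h it uses a local mirror of the struct, `example_io_callbacks`; that type, `example_mem_stream`, and the `example_*` functions are hypothetical names introduced here. With the real header you would fill in a `stbi_io_callbacks` instead.

```c
#include <string.h>

/* Local mirror of the stbi_io_callbacks layout shown above, so this sketch
   compiles on its own; not a type provided by stb_image. */
typedef struct
{
   int  (*read)(void *user, char *data, int size);
   void (*skip)(void *user, int n);
   int  (*eof) (void *user);
} example_io_callbacks;

/* Hypothetical in-memory stream, passed as the 'user' pointer. */
typedef struct
{
   const unsigned char *bytes;
   int len;
   int pos;
} example_mem_stream;

static int example_read(void *user, char *data, int size)
{
   example_mem_stream *s = (example_mem_stream *) user;
   int avail = s->len - s->pos;
   if (size > avail) size = avail;
   if (size <= 0) return 0;
   memcpy(data, s->bytes + s->pos, (size_t) size);
   s->pos += size;
   return size;               /* number of bytes actually read */
}

static void example_skip(void *user, int n)
{
   example_mem_stream *s = (example_mem_stream *) user;
   s->pos += n;               /* a negative n "ungets" bytes */
   if (s->pos < 0) s->pos = 0;
   if (s->pos > s->len) s->pos = s->len;
}

static int example_eof(void *user)
{
   example_mem_stream *s = (example_mem_stream *) user;
   return s->pos >= s->len;   /* nonzero at end of data */
}

static const example_io_callbacks example_callbacks = {
   example_read, example_skip, example_eof
};
```

With the real header, a `stbi_io_callbacks` filled with the same three functions and a pointer to the stream would be passed to `stbi_load_from_callbacks` as its `clbk` and `user` arguments.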
    418 ////////////////////////////////////
    419 //
    420 // 8-bits-per-channel interface
    421 //
    422 
    423 STBIDEF stbi_uc *stbi_load_from_memory   (stbi_uc           const *buffer, int len   , int *x, int *y, int *channels_in_file, int desired_channels);
    424 STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk  , void *user, int *x, int *y, int *channels_in_file, int desired_channels);
    425 
    426 #ifndef STBI_NO_STDIO
    427 STBIDEF stbi_uc *stbi_load            (char const *filename, int *x, int *y, int *channels_in_file, int desired_channels);
    428 STBIDEF stbi_uc *stbi_load_from_file  (FILE *f, int *x, int *y, int *channels_in_file, int desired_channels);
    429 // for stbi_load_from_file, file pointer is left pointing immediately after image
    430 #endif
    431 
    432 #ifndef STBI_NO_GIF
    433 STBIDEF stbi_uc *stbi_load_gif_from_memory(stbi_uc const *buffer, int len, int **delays, int *x, int *y, int *z, int *comp, int req_comp);
    434 #endif
    435 
    436 #ifdef STBI_WINDOWS_UTF8
    437 STBIDEF int stbi_convert_wchar_to_utf8(char *buffer, size_t bufferlen, const wchar_t* input);
    438 #endif
    439 
    440 ////////////////////////////////////
    441 //
    442 // 16-bits-per-channel interface
    443 //
    444 
    445 STBIDEF stbi_us *stbi_load_16_from_memory   (stbi_uc const *buffer, int len, int *x, int *y, int *channels_in_file, int desired_channels);
    446 STBIDEF stbi_us *stbi_load_16_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *channels_in_file, int desired_channels);
    447 
    448 #ifndef STBI_NO_STDIO
    449 STBIDEF stbi_us *stbi_load_16          (char const *filename, int *x, int *y, int *channels_in_file, int desired_channels);
    450 STBIDEF stbi_us *stbi_load_from_file_16(FILE *f, int *x, int *y, int *channels_in_file, int desired_channels);
    451 #endif
    452 
    453 ////////////////////////////////////
    454 //
    455 // float-per-channel interface
    456 //
    457 #ifndef STBI_NO_LINEAR
    458    STBIDEF float *stbi_loadf_from_memory     (stbi_uc const *buffer, int len, int *x, int *y, int *channels_in_file, int desired_channels);
    459    STBIDEF float *stbi_loadf_from_callbacks  (stbi_io_callbacks const *clbk, void *user, int *x, int *y,  int *channels_in_file, int desired_channels);
    460 
    461    #ifndef STBI_NO_STDIO
    462    STBIDEF float *stbi_loadf            (char const *filename, int *x, int *y, int *channels_in_file, int desired_channels);
    463    STBIDEF float *stbi_loadf_from_file  (FILE *f, int *x, int *y, int *channels_in_file, int desired_channels);
    464    #endif
    465 #endif
    466 
    467 #ifndef STBI_NO_HDR
    468    STBIDEF void   stbi_hdr_to_ldr_gamma(float gamma);
    469    STBIDEF void   stbi_hdr_to_ldr_scale(float scale);
    470 #endif // STBI_NO_HDR
    471 
    472 #ifndef STBI_NO_LINEAR
    473    STBIDEF void   stbi_ldr_to_hdr_gamma(float gamma);
    474    STBIDEF void   stbi_ldr_to_hdr_scale(float scale);
    475 #endif // STBI_NO_LINEAR
    476 
    477 // stbi_is_hdr is always defined, but always returns false if STBI_NO_HDR
    478 STBIDEF int    stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user);
    479 STBIDEF int    stbi_is_hdr_from_memory(stbi_uc const *buffer, int len);
    480 #ifndef STBI_NO_STDIO
    481 STBIDEF int      stbi_is_hdr          (char const *filename);
    482 STBIDEF int      stbi_is_hdr_from_file(FILE *f);
    483 #endif // STBI_NO_STDIO
    484 
    485 
    486 // get a VERY brief reason for failure
    487 // on most compilers (and ALL modern mainstream compilers) this is threadsafe
    488 STBIDEF const char *stbi_failure_reason  (void);
    489 
    490 // free the loaded image -- this is just free()
    491 STBIDEF void     stbi_image_free      (void *retval_from_stbi_load);
    492 
    493 // get image dimensions & components without fully decoding
    494 STBIDEF int      stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
    495 STBIDEF int      stbi_info_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp);
    496 STBIDEF int      stbi_is_16_bit_from_memory(stbi_uc const *buffer, int len);
    497 STBIDEF int      stbi_is_16_bit_from_callbacks(stbi_io_callbacks const *clbk, void *user);
    498 
    499 #ifndef STBI_NO_STDIO
    500 STBIDEF int      stbi_info               (char const *filename,     int *x, int *y, int *comp);
    501 STBIDEF int      stbi_info_from_file     (FILE *f,                  int *x, int *y, int *comp);
    502 STBIDEF int      stbi_is_16_bit          (char const *filename);
    503 STBIDEF int      stbi_is_16_bit_from_file(FILE *f);
    504 #endif
    505 
    506 
    507 
    508 // for image formats that explicitly notate that they have premultiplied alpha,
    509 // we just return the colors as stored in the file. set this flag to force
    510 // unpremultiplication. results are undefined if the unpremultiply overflows.
    511 STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply);
    512 
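To make "unpremultiplication" concrete, the sketch below divides each colour component of one RGBA pixel by its alpha. It is an illustration only, not stb_image's code: `example_unpremultiply_rgba` is a hypothetical name, and clamping on overflow is a choice made for this example (the comment above leaves that case undefined).

```c
/* Hypothetical helper (not stb_image's code): undo premultiplied alpha on
   one 8-bit RGBA pixel in place. A pixel with alpha 0 is left unchanged. */
static void example_unpremultiply_rgba(unsigned char *p)
{
   unsigned int a = p[3];
   int i;
   if (a == 0) return;
   for (i = 0; i < 3; ++i) {
      unsigned int v = (p[i] * 255u + a / 2) / a;   /* rounded divide */
      p[i] = (unsigned char) (v > 255 ? 255 : v);   /* clamp invalid input */
   }
}
```

The clamp only matters for inputs where a colour component exceeds alpha, which a correctly premultiplied image should not contain.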
    513 // indicate whether we should process iphone images back to canonical format,
    514 // or just pass them through "as-is"
    515 STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert);
    516 
    517 // flip the image vertically, so the first pixel in the output array is the bottom left
    518 STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip);
    519 
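A vertical flip of a tightly packed image amounts to swapping row r with row h-1-r. The sketch below shows one way to do that in place with a small fixed-size scratch buffer. `example_flip_vertically` is a hypothetical name and this is not claimed to be how stb_image implements the flag above.

```c
#include <string.h>

/* Hypothetical helper (not stb_image's code): flip a tightly packed
   w x h image with n bytes per pixel in place, so that the first row in
   memory becomes the bottom row of the original image. */
static void example_flip_vertically(unsigned char *pixels, int w, int h, int n)
{
   size_t stride = (size_t) w * (size_t) n;   /* bytes per scanline */
   unsigned char tmp[256];
   int r;
   for (r = 0; r < h / 2; ++r) {
      unsigned char *top = pixels + (size_t) r * stride;
      unsigned char *bot = pixels + (size_t) (h - 1 - r) * stride;
      size_t left = stride;
      while (left > 0) {                      /* swap the rows in chunks */
         size_t chunk = left < sizeof(tmp) ? left : sizeof(tmp);
         memcpy(tmp, top, chunk);
         memcpy(top, bot, chunk);
         memcpy(bot, tmp, chunk);
         top += chunk;
         bot += chunk;
         left -= chunk;
      }
   }
}
```

For an image with an odd number of rows the middle row stays where it is.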
    520 // as above, but only applies to images loaded on the thread that calls the function
    521 // this function is only available if your compiler supports thread-local variables;
    522 // calling it will fail to link if your compiler doesn't
    523 STBIDEF void stbi_set_unpremultiply_on_load_thread(int flag_true_if_should_unpremultiply);
    524 STBIDEF void stbi_convert_iphone_png_to_rgb_thread(int flag_true_if_should_convert);
    525 STBIDEF void stbi_set_flip_vertically_on_load_thread(int flag_true_if_should_flip);
    526 
    527 // ZLIB client - used by PNG, available for other purposes
    528 
    529 STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen);
    530 STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header);
    531 STBIDEF char *stbi_zlib_decode_malloc(const char *buffer, int len, int *outlen);
    532 STBIDEF int   stbi_zlib_decode_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
    533 
    534 STBIDEF char *stbi_zlib_decode_noheader_malloc(const char *buffer, int len, int *outlen);
    535 STBIDEF int   stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
    536 
    537 
    538 #ifdef __cplusplus
    539 }
    540 #endif
    541 
    542 //
    543 //
    544 ////   end header file   /////////////////////////////////////////////////////
    545 #endif // STBI_INCLUDE_STB_IMAGE_H
    546 
    547 #ifdef STB_IMAGE_IMPLEMENTATION
    548 
    549 #if defined(STBI_ONLY_JPEG) || defined(STBI_ONLY_PNG) || defined(STBI_ONLY_BMP) \
    550   || defined(STBI_ONLY_TGA) || defined(STBI_ONLY_GIF) || defined(STBI_ONLY_PSD) \
    551   || defined(STBI_ONLY_HDR) || defined(STBI_ONLY_PIC) || defined(STBI_ONLY_PNM) \
    552   || defined(STBI_ONLY_ZLIB)
    553    #ifndef STBI_ONLY_JPEG
    554    #define STBI_NO_JPEG
    555    #endif
    556    #ifndef STBI_ONLY_PNG
    557    #define STBI_NO_PNG
    558    #endif
    559    #ifndef STBI_ONLY_BMP
    560    #define STBI_NO_BMP
    561    #endif
    562    #ifndef STBI_ONLY_PSD
    563    #define STBI_NO_PSD
    564    #endif
    565    #ifndef STBI_ONLY_TGA
    566    #define STBI_NO_TGA
    567    #endif
    568    #ifndef STBI_ONLY_GIF
    569    #define STBI_NO_GIF
    570    #endif
    571    #ifndef STBI_ONLY_HDR
    572    #define STBI_NO_HDR
    573    #endif
    574    #ifndef STBI_ONLY_PIC
    575    #define STBI_NO_PIC
    576    #endif
    577    #ifndef STBI_ONLY_PNM
    578    #define STBI_NO_PNM
    579    #endif
    580 #endif
    581 
    582 #if defined(STBI_NO_PNG) && !defined(STBI_SUPPORT_ZLIB) && !defined(STBI_NO_ZLIB)
    583 #define STBI_NO_ZLIB
    584 #endif
    585 
    586 
    587 #include <stdarg.h>
    588 #include <stddef.h> // ptrdiff_t on osx
    589 #include <stdlib.h>
    590 #include <string.h>
    591 #include <limits.h>
    592 
    593 #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR)
    594 #include <math.h>  // ldexp, pow
    595 #endif
    596 
    597 #ifndef STBI_NO_STDIO
    598 #include <stdio.h>
    599 #endif
    600 
    601 #ifndef STBI_ASSERT
    602 #include <assert.h>
    603 #define STBI_ASSERT(x) assert(x)
    604 #endif
    605 
    606 #ifdef __cplusplus
    607 #define STBI_EXTERN extern "C"
    608 #else
    609 #define STBI_EXTERN extern
    610 #endif
    611 
    612 
    613 #ifndef _MSC_VER
    614    #ifdef __cplusplus
    615    #define stbi_inline inline
    616    #else
    617    #define stbi_inline
    618    #endif
    619 #else
    620    #define stbi_inline __forceinline
    621 #endif
    622 
    623 #ifndef STBI_NO_THREAD_LOCALS
    624    #if defined(__cplusplus) &&  __cplusplus >= 201103L
    625       #define STBI_THREAD_LOCAL       thread_local
    626    #elif defined(__GNUC__) && __GNUC__ < 5
    627       #define STBI_THREAD_LOCAL       __thread
    628    #elif defined(_MSC_VER)
    629       #define STBI_THREAD_LOCAL       __declspec(thread)
    630    #elif defined (__STDC_VERSION__) && __STDC_VERSION__ >= 201112L && !defined(__STDC_NO_THREADS__)
    631       #define STBI_THREAD_LOCAL       _Thread_local
    632    #endif
    633 
    634    #ifndef STBI_THREAD_LOCAL
    635       #if defined(__GNUC__)
    636         #define STBI_THREAD_LOCAL       __thread
    637       #endif
    638    #endif
    639 #endif
    640 
    641 #if defined(_MSC_VER) || defined(__SYMBIAN32__)
    642 typedef unsigned short stbi__uint16;
    643 typedef   signed short stbi__int16;
    644 typedef unsigned int   stbi__uint32;
    645 typedef   signed int   stbi__int32;
    646 #else
    647 #include <stdint.h>
    648 typedef uint16_t stbi__uint16;
    649 typedef int16_t  stbi__int16;
    650 typedef uint32_t stbi__uint32;
    651 typedef int32_t  stbi__int32;
    652 #endif
    653 
    654 // should produce compiler error if size is wrong
    655 typedef unsigned char validate_uint32[sizeof(stbi__uint32)==4 ? 1 : -1];
    656 
    657 #ifdef _MSC_VER
    658 #define STBI_NOTUSED(v)  (void)(v)
    659 #else
    660 #define STBI_NOTUSED(v)  (void)sizeof(v)
    661 #endif
    662 
    663 #ifdef _MSC_VER
    664 #define STBI_HAS_LROTL
    665 #endif
    666 
    667 #ifdef STBI_HAS_LROTL
    668    #define stbi_lrot(x,y)  _lrotl(x,y)
    669 #else
    670    #define stbi_lrot(x,y)  (((x) << (y)) | ((x) >> (-(y) & 31)))
    671 #endif
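// Sketch (not part of stb_image): the portable branch of stbi_lrot written as a
// function, assuming a 32-bit unsigned operand. The name sketch_rotl32 is
// hypothetical. Masking the right-shift count with 31 keeps it in 0..31, so a
// rotate by 0 never shifts by the full width of the type.
```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value left by y bits; y is reduced modulo 32. */
static uint32_t sketch_rotl32(uint32_t x, unsigned y)
{
    y &= 31u;
    /* (0 - y) & 31 equals 32 - y for y in 1..31 and 0 for y == 0. */
    return (x << y) | (x >> ((0u - y) & 31u));
}
```
// For instance, sketch_rotl32(0x12345678u, 8) moves the top byte to the bottom
// and gives 0x34567812u.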
    672 
    673 #if defined(STBI_MALLOC) && defined(STBI_FREE) && (defined(STBI_REALLOC) || defined(STBI_REALLOC_SIZED))
    674 // ok
    675 #elif !defined(STBI_MALLOC) && !defined(STBI_FREE) && !defined(STBI_REALLOC) && !defined(STBI_REALLOC_SIZED)
    676 // ok
    677 #else
    678 #error "Must define all or none of STBI_MALLOC, STBI_FREE, and STBI_REALLOC (or STBI_REALLOC_SIZED)."
    679 #endif
    680 
    681 #ifndef STBI_MALLOC
    682 #define STBI_MALLOC(sz)           malloc(sz)
    683 #define STBI_REALLOC(p,newsz)     realloc(p,newsz)
    684 #define STBI_FREE(p)              free(p)
    685 #endif
    686 
    687 #ifndef STBI_REALLOC_SIZED
    688 #define STBI_REALLOC_SIZED(p,oldsz,newsz) STBI_REALLOC(p,newsz)
    689 #endif
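// Sketch (not part of stb_image): one way a client translation unit might supply
// all three allocator macros together, as the all-or-none check above requires.
// The sketch_* wrappers and the live-block counter are hypothetical; the two
// commented-out lines mark where the real implementation include would go.
```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical counter, used only to make the wrappers observable. */
static int sketch_live_blocks = 0;

static void *sketch_malloc(size_t sz)
{
    void *p = malloc(sz);
    if (p) ++sketch_live_blocks;
    return p;
}

static void *sketch_realloc(void *p, size_t newsz)
{
    void *q = realloc(p, newsz);
    if (q && !p) ++sketch_live_blocks;   /* realloc(NULL, n) acts like malloc */
    return q;
}

static void sketch_free(void *p)
{
    if (p) --sketch_live_blocks;
    free(p);
}

/* All three are defined together, before the implementation is included. */
#define STBI_MALLOC(sz)        sketch_malloc(sz)
#define STBI_REALLOC(p,newsz)  sketch_realloc(p,newsz)
#define STBI_FREE(p)           sketch_free(p)
/* #define STB_IMAGE_IMPLEMENTATION */
/* #include "stb_image.h" */

/* Allocates, grows and releases one block through the macros.
   Returns the number of live blocks afterwards, or -1 if allocation failed. */
static int sketch_alloc_roundtrip(void)
{
    void *p = STBI_MALLOC(16);
    void *q;
    if (!p) return -1;
    q = STBI_REALLOC(p, 64);
    if (!q) { STBI_FREE(p); return -1; }
    STBI_FREE(q);
    return sketch_live_blocks;
}
```
// Defining only one or two of the three macros would hit the #error above, which
// is why the sketch wraps malloc, realloc and free as a set.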
    690 
    691 // x86/x64 detection
    692 #if defined(__x86_64__) || defined(_M_X64)
    693 #define STBI__X64_TARGET
    694 #elif defined(__i386) || defined(_M_IX86)
    695 #define STBI__X86_TARGET
    696 #endif
    697 
    698 #if defined(__GNUC__) && defined(STBI__X86_TARGET) && !defined(__SSE2__) && !defined(STBI_NO_SIMD)
    699 // gcc doesn't support sse2 intrinsics unless you compile with -msse2,
    700 // which in turn means it gets to use SSE2 everywhere. This is unfortunate,
    701 // but previous attempts to provide the SSE2 functions with runtime
    702 // detection caused numerous issues. The way architecture extensions are
    703 // exposed in GCC/Clang is, sadly, not really suited for one-file libs.
    704 // New behavior: if compiled with -msse2, we use SSE2 without any
    705 // detection; if not, we don't use it at all.
    706 #define STBI_NO_SIMD
    707 #endif
    708 
    709 #if defined(__MINGW32__) && defined(STBI__X86_TARGET) && !defined(STBI_MINGW_ENABLE_SSE2) && !defined(STBI_NO_SIMD)
    710 // Note that __MINGW32__ doesn't actually mean 32-bit, so we have to avoid STBI__X64_TARGET
    711 //
    712 // 32-bit MinGW wants ESP to be 16-byte aligned, but this is not in the
    713 // Windows ABI and VC++ as well as Windows DLLs don't maintain that invariant.
    714 // As a result, enabling SSE2 on 32-bit MinGW is dangerous when not
    715 // simultaneously enabling "-mstackrealign".
    716 //
    717 // See https://github.com/nothings/stb/issues/81 for more information.
    718 //
    719 // So default to no SSE2 on 32-bit MinGW. If you've read this far and added
    720 // -mstackrealign to your build settings, feel free to #define STBI_MINGW_ENABLE_SSE2.
    721 #define STBI_NO_SIMD
    722 #endif
    723 
    724 #if !defined(STBI_NO_SIMD) && (defined(STBI__X86_TARGET) || defined(STBI__X64_TARGET))
    725 #define STBI_SSE2
    726 #include <emmintrin.h>
    727 
    728 #ifdef _MSC_VER
    729 
    730 #if _MSC_VER >= 1400  // not VC6
    731 #include <intrin.h> // __cpuid
    732 static int stbi__cpuid3(void)
    733 {
    734    int info[4];
    735    __cpuid(info,1);
    736    return info[3];
    737 }
    738 #else
    739 static int stbi__cpuid3(void)
    740 {
    741    int res;
    742    __asm {
    743       mov  eax,1
    744       cpuid
    745       mov  res,edx
    746    }
    747    return res;
    748 }
    749 #endif
    750 
    751 #define STBI_SIMD_ALIGN(type, name) __declspec(align(16)) type name
    752 
    753 #if !defined(STBI_NO_JPEG) && defined(STBI_SSE2)
    754 static int stbi__sse2_available(void)
    755 {
    756    int info3 = stbi__cpuid3();
    757    return ((info3 >> 26) & 1) != 0;
    758 }
    759 #endif
    760 
    761 #else // assume GCC-style if not VC++
    762 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
    763 
    764 #if !defined(STBI_NO_JPEG) && defined(STBI_SSE2)
    765 static int stbi__sse2_available(void)
    766 {
    767    // If we're even attempting to compile this on GCC/Clang, that means
    768    // -msse2 is on, which means the compiler is allowed to use SSE2
    769    // instructions at will, and so are we.
    770    return 1;
    771 }
    772 #endif
    773 
    774 #endif
    775 #endif
    776 
    777 // ARM NEON
    778 #if defined(STBI_NO_SIMD) && defined(STBI_NEON)
    779 #undef STBI_NEON
    780 #endif
    781 
    782 #ifdef STBI_NEON
    783 #include <arm_neon.h>
    784 #ifdef _MSC_VER
    785 #define STBI_SIMD_ALIGN(type, name) __declspec(align(16)) type name
    786 #else
    787 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
    788 #endif
    789 #endif
    790 
    791 #ifndef STBI_SIMD_ALIGN
    792 #define STBI_SIMD_ALIGN(type, name) type name
    793 #endif
    794 
    795 #ifndef STBI_MAX_DIMENSIONS
    796 #define STBI_MAX_DIMENSIONS (1 << 24)
    797 #endif
    798 
    799 ///////////////////////////////////////////////
    800 //
    801 //  stbi__context struct and start_xxx functions
    802 
    803 // stbi__context structure is our basic context used by all images, so it
    804 // contains all the IO context, plus some basic image information
    805 typedef struct
    806 {
    807    stbi__uint32 img_x, img_y;
    808    int img_n, img_out_n;
    809 
    810    stbi_io_callbacks io;
    811    void *io_user_data;
    812 
    813    int read_from_callbacks;
    814    int buflen;
    815    stbi_uc buffer_start[128];
    816    int callback_already_read;
    817 
    818    stbi_uc *img_buffer, *img_buffer_end;
    819    stbi_uc *img_buffer_original, *img_buffer_original_end;
    820 } stbi__context;
    821 
    822 
    823 static void stbi__refill_buffer(stbi__context *s);
    824 
    825 // initialize a memory-decode context
    826 static void stbi__start_mem(stbi__context *s, stbi_uc const *buffer, int len)
    827 {
    828    s->io.read = NULL;
    829    s->read_from_callbacks = 0;
    830    s->callback_already_read = 0;
    831    s->img_buffer = s->img_buffer_original = (stbi_uc *) buffer;
    832    s->img_buffer_end = s->img_buffer_original_end = (stbi_uc *) buffer+len;
    833 }
    834 
    835 // initialize a callback-based context
    836 static void stbi__start_callbacks(stbi__context *s, stbi_io_callbacks *c, void *user)
    837 {
    838    s->io = *c;
    839    s->io_user_data = user;
    840    s->buflen = sizeof(s->buffer_start);
    841    s->read_from_callbacks = 1;
    842    s->callback_already_read = 0;
    843    s->img_buffer = s->img_buffer_original = s->buffer_start;
    844    stbi__refill_buffer(s);
    845    s->img_buffer_original_end = s->img_buffer_end;
    846 }
    847 
    848 #ifndef STBI_NO_STDIO
    849 
    850 static int stbi__stdio_read(void *user, char *data, int size)
    851 {
    852    return (int) fread(data,1,size,(FILE*) user);
    853 }
    854 
    855 static void stbi__stdio_skip(void *user, int n)
    856 {
    857    int ch;
    858    fseek((FILE*) user, n, SEEK_CUR);
    859    ch = fgetc((FILE*) user);  /* have to read a byte to reset feof()'s flag */
    860    if (ch != EOF) {
    861       ungetc(ch, (FILE *) user);  /* push byte back onto stream if valid. */
    862    }
    863 }
    864 
    865 static int stbi__stdio_eof(void *user)
    866 {
    867    return feof((FILE*) user) || ferror((FILE *) user);
    868 }
    869 
    870 static stbi_io_callbacks stbi__stdio_callbacks =
    871 {
    872    stbi__stdio_read,
    873    stbi__stdio_skip,
    874    stbi__stdio_eof,
    875 };
    876 
    877 static void stbi__start_file(stbi__context *s, FILE *f)
    878 {
    879    stbi__start_callbacks(s, &stbi__stdio_callbacks, (void *) f);
    880 }
    881 
    882 //static void stop_file(stbi__context *s) { }
    883 
    884 #endif // !STBI_NO_STDIO
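// Sketch (not part of stb_image): a callback set with the same read/skip/eof shape
// as the stdio callbacks above, but backed by a byte array. sketch_io_callbacks is
// a local stand-in for stbi_io_callbacks so the example compiles on its own, and
// every sketch_* name is hypothetical.
```c
#include <assert.h>
#include <string.h>

/* Local mirror of the three-callback shape used by stbi__stdio_read/skip/eof. */
typedef struct {
    int  (*read)(void *user, char *data, int size);
    void (*skip)(void *user, int n);
    int  (*eof)(void *user);
} sketch_io_callbacks;

/* Hypothetical user-data type: a read cursor over a byte array. */
typedef struct {
    const unsigned char *data;
    int len;
    int pos;
} sketch_mem_stream;

static int sketch_mem_read(void *user, char *data, int size)
{
    sketch_mem_stream *m = (sketch_mem_stream *)user;
    int avail = m->len - m->pos;
    if (size > avail) size = avail;
    if (size <= 0) return 0;
    memcpy(data, m->data + m->pos, (size_t)size);
    m->pos += size;
    return size;                 /* bytes actually delivered */
}

static void sketch_mem_skip(void *user, int n)
{
    sketch_mem_stream *m = (sketch_mem_stream *)user;
    int target = m->pos + n;     /* n may be negative */
    if (target < 0) target = 0;
    if (target > m->len) target = m->len;
    m->pos = target;
}

static int sketch_mem_eof(void *user)
{
    sketch_mem_stream *m = (sketch_mem_stream *)user;
    return m->pos >= m->len;
}

static const sketch_io_callbacks sketch_mem_callbacks = {
    sketch_mem_read, sketch_mem_skip, sketch_mem_eof
};

/* Drives the callbacks the way a decoder might: read, skip, read, test eof. */
static int sketch_io_demo(void)
{
    static const unsigned char bytes[] = { 1, 2, 3, 4, 5, 6 };
    sketch_mem_stream m = { bytes, 6, 0 };
    char buf[4];
    int got = sketch_mem_callbacks.read(&m, buf, 2);      /* reads 1,2 */
    sketch_mem_callbacks.skip(&m, 2);                     /* skips 3,4 */
    got += sketch_mem_callbacks.read(&m, buf + 2, 4);     /* only 5,6 remain */
    return got == 4 && buf[2] == 5 && buf[3] == 6 && sketch_mem_callbacks.eof(&m);
}
```
// A short read (fewer bytes than requested) is how this sketch signals that the
// data ran out; the eof callback then reports the same condition explicitly.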
    885 
    886 static void stbi__rewind(stbi__context *s)
    887 {
    888    // conceptually rewind SHOULD rewind to the beginning of the stream,
    889    // but we just rewind to the beginning of the initial buffer, because
    890    // we only use it after doing 'test', which only ever looks at 92 bytes at most
    891    s->img_buffer = s->img_buffer_original;
    892    s->img_buffer_end = s->img_buffer_original_end;
    893 }
    894 
    895 enum
    896 {
    897    STBI_ORDER_RGB,
    898    STBI_ORDER_BGR
    899 };
    900 
    901 typedef struct
    902 {
    903    int bits_per_channel;
    904    int num_channels;
    905    int channel_order;
    906 } stbi__result_info;
    907 
    908 #ifndef STBI_NO_JPEG
    909 static int      stbi__jpeg_test(stbi__context *s);
    910 static void    *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
    911 static int      stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp);
    912 #endif
    913 
    914 #ifndef STBI_NO_PNG
    915 static int      stbi__png_test(stbi__context *s);
    916 static void    *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
    917 static int      stbi__png_info(stbi__context *s, int *x, int *y, int *comp);
    918 static int      stbi__png_is16(stbi__context *s);
    919 #endif
    920 
    921 #ifndef STBI_NO_BMP
    922 static int      stbi__bmp_test(stbi__context *s);
    923 static void    *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
    924 static int      stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp);
    925 #endif
    926 
    927 #ifndef STBI_NO_TGA
    928 static int      stbi__tga_test(stbi__context *s);
    929 static void    *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
    930 static int      stbi__tga_info(stbi__context *s, int *x, int *y, int *comp);
    931 #endif
    932 
    933 #ifndef STBI_NO_PSD
    934 static int      stbi__psd_test(stbi__context *s);
    935 static void    *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri, int bpc);
    936 static int      stbi__psd_info(stbi__context *s, int *x, int *y, int *comp);
    937 static int      stbi__psd_is16(stbi__context *s);
    938 #endif
    939 
    940 #ifndef STBI_NO_HDR
    941 static int      stbi__hdr_test(stbi__context *s);
    942 static float   *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
    943 static int      stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp);
    944 #endif
    945 
    946 #ifndef STBI_NO_PIC
    947 static int      stbi__pic_test(stbi__context *s);
    948 static void    *stbi__pic_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
    949 static int      stbi__pic_info(stbi__context *s, int *x, int *y, int *comp);
    950 #endif
    951 
    952 #ifndef STBI_NO_GIF
    953 static int      stbi__gif_test(stbi__context *s);
    954 static void    *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
    955 static void    *stbi__load_gif_main(stbi__context *s, int **delays, int *x, int *y, int *z, int *comp, int req_comp);
    956 static int      stbi__gif_info(stbi__context *s, int *x, int *y, int *comp);
    957 #endif
    958 
    959 #ifndef STBI_NO_PNM
    960 static int      stbi__pnm_test(stbi__context *s);
    961 static void    *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
    962 static int      stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp);
    963 static int      stbi__pnm_is16(stbi__context *s);
    964 #endif
    965 
    966 static
    967 #ifdef STBI_THREAD_LOCAL
    968 STBI_THREAD_LOCAL
    969 #endif
    970 const char *stbi__g_failure_reason;
    971 
    972 STBIDEF const char *stbi_failure_reason(void)
    973 {
    974    return stbi__g_failure_reason;
    975 }
    976 
    977 #ifndef STBI_NO_FAILURE_STRINGS
    978 static int stbi__err(const char *str)
    979 {
    980    stbi__g_failure_reason = str;
    981    return 0;
    982 }
    983 #endif
    984 
    985 static void *stbi__malloc(size_t size)
    986 {
    987     return STBI_MALLOC(size);
    988 }
    989 
    990 // stb_image uses ints pervasively, including for offset calculations.
    991 // therefore the largest decoded image size we can support with the
    992 // current code, even on 64-bit targets, is INT_MAX. this is not a
    993 // significant limitation for the intended use case.
    994 //
    995 // we do, however, need to make sure our size calculations don't
    996 // overflow. hence a few helper functions for size calculations that
    997 // multiply integers together, making sure that they're non-negative
    998 // and no overflow occurs.
    999 
   1000 // returns 1 if the sum is valid, 0 on overflow.
   1001 // negative terms are considered invalid.
   1002 static int stbi__addsizes_valid(int a, int b)
   1003 {
   1004    if (b < 0) return 0;
   1005    // now 0 <= b <= INT_MAX, hence also
   1006    // 0 <= INT_MAX - b <= INT_MAX.
   1007    // And "a + b <= INT_MAX" (which might overflow) is the
   1008    // same as a <= INT_MAX - b (no overflow)
   1009    return a <= INT_MAX - b;
   1010 }
   1011 
   1012 // returns 1 if the product is valid, 0 on overflow.
   1013 // negative factors are considered invalid.
   1014 static int stbi__mul2sizes_valid(int a, int b)
   1015 {
   1016    if (a < 0 || b < 0) return 0;
   1017    if (b == 0) return 1; // mul-by-0 is always safe
   1018    // portable way to check for no overflows in a*b
   1019    return a <= INT_MAX/b;
   1020 }
   1021 
   1022 #if !defined(STBI_NO_JPEG) || !defined(STBI_NO_PNG) || !defined(STBI_NO_TGA) || !defined(STBI_NO_HDR)
   1023 // returns 1 if "a*b + add" has no negative terms/factors and doesn't overflow
   1024 static int stbi__mad2sizes_valid(int a, int b, int add)
   1025 {
   1026    return stbi__mul2sizes_valid(a, b) && stbi__addsizes_valid(a*b, add);
   1027 }
   1028 #endif
   1029 
   1030 // returns 1 if "a*b*c + add" has no negative terms/factors and doesn't overflow
   1031 static int stbi__mad3sizes_valid(int a, int b, int c, int add)
   1032 {
   1033    return stbi__mul2sizes_valid(a, b) && stbi__mul2sizes_valid(a*b, c) &&
   1034       stbi__addsizes_valid(a*b*c, add);
   1035 }
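// Sketch (not part of stb_image): the divide-before-multiply checks above applied
// to a pixel-buffer size. The sketch_* names are hypothetical, and the figures in
// the note after the code assume a 32-bit int.
```c
#include <assert.h>
#include <limits.h>

/* 1 if a*b is representable as a non-negative int, else 0. */
static int sketch_mul_ok(int a, int b)
{
    if (a < 0 || b < 0) return 0;
    if (b == 0) return 1;
    return a <= INT_MAX / b;      /* the division itself cannot overflow */
}

/* 1 if a+b does not exceed INT_MAX; a negative b is rejected. */
static int sketch_add_ok(int a, int b)
{
    if (b < 0) return 0;
    return a <= INT_MAX - b;
}

/* Bytes needed for w*h pixels of `channels` bytes plus `extra`, or -1 if the
   total is negative or does not fit in an int. Each product is only evaluated
   after the check that proves it safe. */
static int sketch_buffer_bytes(int w, int h, int channels, int extra)
{
    if (!sketch_mul_ok(w, h)) return -1;
    if (!sketch_mul_ok(w * h, channels)) return -1;
    if (!sketch_add_ok(w * h * channels, extra)) return -1;
    return w * h * channels + extra;
}
```
// With a 32-bit int, 46340*46340 still fits but 46341*46341 does not, so
// sketch_buffer_bytes(46341, 46341, 1, 0) returns -1 instead of overflowing.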
   1036 
   1037 // returns 1 if "a*b*c*d + add" has no negative terms/factors and doesn't overflow
   1038 #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR) || !defined(STBI_NO_PNM)
   1039 static int stbi__mad4sizes_valid(int a, int b, int c, int d, int add)
   1040 {
   1041    return stbi__mul2sizes_valid(a, b) && stbi__mul2sizes_valid(a*b, c) &&
   1042       stbi__mul2sizes_valid(a*b*c, d) && stbi__addsizes_valid(a*b*c*d, add);
   1043 }
   1044 #endif
   1045 
   1046 #if !defined(STBI_NO_JPEG) || !defined(STBI_NO_PNG) || !defined(STBI_NO_TGA) || !defined(STBI_NO_HDR)
   1047 // mallocs with size overflow checking
   1048 static void *stbi__malloc_mad2(int a, int b, int add)
   1049 {
   1050    if (!stbi__mad2sizes_valid(a, b, add)) return NULL;
   1051    return stbi__malloc(a*b + add);
   1052 }
   1053 #endif
   1054 
   1055 static void *stbi__malloc_mad3(int a, int b, int c, int add)
   1056 {
   1057    if (!stbi__mad3sizes_valid(a, b, c, add)) return NULL;
   1058    return stbi__malloc(a*b*c + add);
   1059 }
   1060 
   1061 #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR) || !defined(STBI_NO_PNM)
   1062 static void *stbi__malloc_mad4(int a, int b, int c, int d, int add)
   1063 {
   1064    if (!stbi__mad4sizes_valid(a, b, c, d, add)) return NULL;
   1065    return stbi__malloc(a*b*c*d + add);
   1066 }
   1067 #endif
   1068 
   1069 // returns 1 if the sum of two signed ints is valid (between -2^31 and 2^31-1 inclusive), 0 on overflow.
   1070 static int stbi__addints_valid(int a, int b)
   1071 {
   1072    if ((a >= 0) != (b >= 0)) return 1; // a and b have different signs, so no overflow
   1073    if (a < 0 && b < 0) return a >= INT_MIN - b; // same as a + b >= INT_MIN; INT_MIN - b cannot overflow since b < 0.
   1074    return a <= INT_MAX - b;
   1075 }
   1076 
   1077 // returns 1 if the product of two ints fits in a signed short, 0 on overflow.
   1078 static int stbi__mul2shorts_valid(int a, int b)
   1079 {
   1080    if (b == 0 || b == -1) return 1; // multiplication by 0 is always 0; check for -1 so SHRT_MIN/b doesn't overflow
   1081    if ((a >= 0) == (b >= 0)) return a <= SHRT_MAX/b; // product is positive, so similar to mul2sizes_valid
   1082    if (b < 0) return a <= SHRT_MIN / b; // same as a * b >= SHRT_MIN
   1083    return a >= SHRT_MIN / b;
   1084 }
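// Sketch (not part of stb_image): a reference for testing sign-case checks such
// as the two above. It assumes long long is wider than int, widens both operands,
// and range-tests the exact result. The functions above avoid widening and use
// comparisons and divisions instead; the sketch_* names are hypothetical.
```c
#include <assert.h>
#include <limits.h>

/* Reference check: widen, multiply exactly, then compare with the short range. */
static int sketch_mul_fits_short(int a, int b)
{
    long long p = (long long)a * (long long)b;
    return p >= SHRT_MIN && p <= SHRT_MAX;
}

/* Reference check for int addition, using the same widening idea. */
static int sketch_add_fits_int(int a, int b)
{
    long long s = (long long)a + (long long)b;
    return s >= INT_MIN && s <= INT_MAX;
}
```
// The reference treats every sign combination uniformly: (-1, -2) fits because the
// product is 2, while (-30000, -2) does not because the product is 60000.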
   1085 
   1086 // stbi__err - error
   1087 // stbi__errpf - error returning pointer to float
   1088 // stbi__errpuc - error returning pointer to unsigned char
   1089 
   1090 #ifdef STBI_NO_FAILURE_STRINGS
   1091    #define stbi__err(x,y)  0
   1092 #elif defined(STBI_FAILURE_USERMSG)
   1093    #define stbi__err(x,y)  stbi__err(y)
   1094 #else
   1095    #define stbi__err(x,y)  stbi__err(x)
   1096 #endif
   1097 
   1098 #define stbi__errpf(x,y)   ((float *)(size_t) (stbi__err(x,y)?NULL:NULL))
   1099 #define stbi__errpuc(x,y)  ((unsigned char *)(size_t) (stbi__err(x,y)?NULL:NULL))
   1100 
   1101 STBIDEF void stbi_image_free(void *retval_from_stbi_load)
   1102 {
   1103    STBI_FREE(retval_from_stbi_load);
   1104 }
   1105 
   1106 #ifndef STBI_NO_LINEAR
   1107 static float   *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp);
   1108 #endif
   1109 
   1110 #ifndef STBI_NO_HDR
   1111 static stbi_uc *stbi__hdr_to_ldr(float   *data, int x, int y, int comp);
   1112 #endif
   1113 
   1114 static int stbi__vertically_flip_on_load_global = 0;
   1115 
   1116 STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip)
   1117 {
   1118    stbi__vertically_flip_on_load_global = flag_true_if_should_flip;
   1119 }
   1120 
   1121 #ifndef STBI_THREAD_LOCAL
   1122 #define stbi__vertically_flip_on_load  stbi__vertically_flip_on_load_global
   1123 #else
   1124 static STBI_THREAD_LOCAL int stbi__vertically_flip_on_load_local, stbi__vertically_flip_on_load_set;
   1125 
   1126 STBIDEF void stbi_set_flip_vertically_on_load_thread(int flag_true_if_should_flip)
   1127 {
   1128    stbi__vertically_flip_on_load_local = flag_true_if_should_flip;
   1129    stbi__vertically_flip_on_load_set = 1;
   1130 }
   1131 
   1132 #define stbi__vertically_flip_on_load  (stbi__vertically_flip_on_load_set       \
   1133                                          ? stbi__vertically_flip_on_load_local  \
   1134                                          : stbi__vertically_flip_on_load_global)
   1135 #endif // STBI_THREAD_LOCAL
   1136 
   1137 static void *stbi__load_main(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri, int bpc)
   1138 {
   1139    memset(ri, 0, sizeof(*ri)); // make sure it's initialized if we add new fields
   1140    ri->bits_per_channel = 8; // default is 8 so most paths don't have to be changed
   1141    ri->channel_order = STBI_ORDER_RGB; // all current input & output are this, but this is here so we can add BGR order
   1142    ri->num_channels = 0;
   1143 
   1144    // test the formats with a very explicit header first (at least a FOURCC
   1145    // or distinctive magic number)
   1146    #ifndef STBI_NO_PNG
   1147    if (stbi__png_test(s))  return stbi__png_load(s,x,y,comp,req_comp, ri);
   1148    #endif
   1149    #ifndef STBI_NO_BMP
   1150    if (stbi__bmp_test(s))  return stbi__bmp_load(s,x,y,comp,req_comp, ri);
   1151    #endif
   1152    #ifndef STBI_NO_GIF
   1153    if (stbi__gif_test(s))  return stbi__gif_load(s,x,y,comp,req_comp, ri);
   1154    #endif
   1155    #ifndef STBI_NO_PSD
   1156    if (stbi__psd_test(s))  return stbi__psd_load(s,x,y,comp,req_comp, ri, bpc);
   1157    #else
   1158    STBI_NOTUSED(bpc);
   1159    #endif
   1160    #ifndef STBI_NO_PIC
   1161    if (stbi__pic_test(s))  return stbi__pic_load(s,x,y,comp,req_comp, ri);
   1162    #endif
   1163 
   1164    // then the formats that can end up attempting to load with just 1 or 2
   1165    // bytes matching expectations; these are prone to false positives, so
   1166    // try them later
   1167    #ifndef STBI_NO_JPEG
   1168    if (stbi__jpeg_test(s)) return stbi__jpeg_load(s,x,y,comp,req_comp, ri);
   1169    #endif
   1170    #ifndef STBI_NO_PNM
   1171    if (stbi__pnm_test(s))  return stbi__pnm_load(s,x,y,comp,req_comp, ri);
   1172    #endif
   1173 
   1174    #ifndef STBI_NO_HDR
   1175    if (stbi__hdr_test(s)) {
   1176       float *hdr = stbi__hdr_load(s, x,y,comp,req_comp, ri);
   1177       return stbi__hdr_to_ldr(hdr, *x, *y, req_comp ? req_comp : *comp);
   1178    }
   1179    #endif
   1180 
   1181    #ifndef STBI_NO_TGA
   1182    // test tga last because it's a crappy test!
   1183    if (stbi__tga_test(s))
   1184       return stbi__tga_load(s,x,y,comp,req_comp, ri);
   1185    #endif
   1186 
   1187    return stbi__errpuc("unknown image type", "Image not of any known type, or corrupt");
   1188 }
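// Sketch (not part of stb_image): the "distinctive signatures first, weak ones
// last" ordering used above, reduced to a byte-prefix check. This is a deliberate
// simplification, since the real *_test functions parse more of each header. The
// sketch_* names and the enum are hypothetical.
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { SKETCH_UNKNOWN, SKETCH_PNG, SKETCH_BMP, SKETCH_GIF, SKETCH_JPEG } sketch_format;

/* Tries long, distinctive signatures before short ones. */
static sketch_format sketch_sniff(const unsigned char *p, size_t len)
{
    static const unsigned char png_sig[8] = { 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A };
    if (len >= 8 && memcmp(p, png_sig, 8) == 0) return SKETCH_PNG;
    if (len >= 6 && (memcmp(p, "GIF87a", 6) == 0 || memcmp(p, "GIF89a", 6) == 0)) return SKETCH_GIF;
    if (len >= 2 && p[0] == 'B' && p[1] == 'M') return SKETCH_BMP;
    /* Only the two-byte SOI marker to go on, so this check goes last. */
    if (len >= 2 && p[0] == 0xFF && p[1] == 0xD8) return SKETCH_JPEG;
    return SKETCH_UNKNOWN;
}
```
// A two-byte match is easy to hit by accident, which is the reason for trying the
// eight-byte and six-byte signatures before it.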
   1189 
   1190 static stbi_uc *stbi__convert_16_to_8(stbi__uint16 *orig, int w, int h, int channels)
   1191 {
   1192    int i;
   1193    int img_len = w * h * channels;
   1194    stbi_uc *reduced;
   1195 
   1196    reduced = (stbi_uc *) stbi__malloc(img_len);
   1197    if (reduced == NULL) return stbi__errpuc("outofmem", "Out of memory");
   1198 
   1199    for (i = 0; i < img_len; ++i)
   1200       reduced[i] = (stbi_uc)((orig[i] >> 8) & 0xFF); // top byte of each 16-bit sample is a sufficient approximation of 16->8 bit scaling
   1201 
   1202    STBI_FREE(orig);
   1203    return reduced;
   1204 }
   1205 
   1206 static stbi__uint16 *stbi__convert_8_to_16(stbi_uc *orig, int w, int h, int channels)
   1207 {
   1208    int i;
   1209    int img_len = w * h * channels;
   1210    stbi__uint16 *enlarged;
   1211 
   1212    enlarged = (stbi__uint16 *) stbi__malloc(img_len*2);
   1213    if (enlarged == NULL) return (stbi__uint16 *) stbi__errpuc("outofmem", "Out of memory");
   1214 
   1215    for (i = 0; i < img_len; ++i)
   1216       enlarged[i] = (stbi__uint16)((orig[i] << 8) + orig[i]); // replicate to high and low byte, maps 0->0, 255->0xffff
   1217 
   1218    STBI_FREE(orig);
   1219    return enlarged;
   1220 }
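// Sketch (not part of stb_image): the two per-sample conversions above, isolated
// so their mapping can be checked. Replicating a byte into both halves equals
// multiplying by 257, which sends 0 to 0 and 255 to 0xFFFF. The sketch_* names
// are hypothetical.
```c
#include <assert.h>
#include <stdint.h>

/* 16 -> 8 bits: keep the high byte of the sample. */
static uint8_t sketch_16_to_8(uint16_t v) { return (uint8_t)((v >> 8) & 0xFF); }

/* 8 -> 16 bits: replicate the byte into both halves, i.e. multiply by 257. */
static uint16_t sketch_8_to_16(uint8_t v) { return (uint16_t)((v << 8) + v); }

/* 1 if widening then narrowing returns every 8-bit value unchanged. */
static int sketch_roundtrip_is_lossless(void)
{
    int v;
    for (v = 0; v < 256; ++v)
        if (sketch_16_to_8(sketch_8_to_16((uint8_t)v)) != v) return 0;
    return 1;
}
```
// Widening followed by narrowing is lossless; the opposite direction discards the
// low byte, so 0xABCD narrows to 0xAB and widens back to 0xABAB.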
   1221 
   1222 static void stbi__vertical_flip(void *image, int w, int h, int bytes_per_pixel)
   1223 {
   1224    int row;
   1225    size_t bytes_per_row = (size_t)w * bytes_per_pixel;
   1226    stbi_uc temp[2048];
   1227    stbi_uc *bytes = (stbi_uc *)image;
   1228 
   1229    for (row = 0; row < (h>>1); row++) {
   1230       stbi_uc *row0 = bytes + row*bytes_per_row;
   1231       stbi_uc *row1 = bytes + (h - row - 1)*bytes_per_row;
   1232       // swap row0 with row1
   1233       size_t bytes_left = bytes_per_row;
   1234       while (bytes_left) {
   1235          size_t bytes_copy = (bytes_left < sizeof(temp)) ? bytes_left : sizeof(temp);
   1236          memcpy(temp, row0, bytes_copy);
   1237          memcpy(row0, row1, bytes_copy);
   1238          memcpy(row1, temp, bytes_copy);
   1239          row0 += bytes_copy;
   1240          row1 += bytes_copy;
   1241          bytes_left -= bytes_copy;
   1242       }
   1243    }
   1244 }
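// Sketch (not part of stb_image): the chunked row swap above with a deliberately
// tiny scratch buffer, so the inner loop runs more than once per row. The 8-byte
// buffer size and the sketch_* names are assumptions made for the demonstration.
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Swap rows top-to-bottom in place, moving at most sizeof(temp) bytes at a time. */
static void sketch_flip_rows(unsigned char *bytes, int w, int h, int bytes_per_pixel)
{
    unsigned char temp[8];
    size_t bytes_per_row = (size_t)w * (size_t)bytes_per_pixel;
    int row;
    for (row = 0; row < h / 2; ++row) {
        unsigned char *row0 = bytes + (size_t)row * bytes_per_row;
        unsigned char *row1 = bytes + (size_t)(h - row - 1) * bytes_per_row;
        size_t left = bytes_per_row;
        while (left) {
            size_t n = left < sizeof(temp) ? left : sizeof(temp);
            memcpy(temp, row0, n);
            memcpy(row0, row1, n);
            memcpy(row1, temp, n);
            row0 += n; row1 += n; left -= n;
        }
    }
}

/* Flips a 4x3 RGB image whose bytes all hold their row index; returns 1 on success. */
static int sketch_flip_demo(void)
{
    unsigned char img[3 * 4 * 3];
    int r, i;
    for (r = 0; r < 3; ++r) memset(img + r * 12, r, 12);
    sketch_flip_rows(img, 4, 3, 3);
    for (i = 0; i < 12; ++i)
        if (img[i] != 2 || img[12 + i] != 1 || img[24 + i] != 0) return 0;
    return 1;
}
```
// With an odd height the middle row is left where it is, because the loop only
// visits the first h/2 rows.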
   1245 
   1246 #ifndef STBI_NO_GIF
   1247 static void stbi__vertical_flip_slices(void *image, int w, int h, int z, int bytes_per_pixel)
   1248 {
   1249    int slice;
   1250    int slice_size = w * h * bytes_per_pixel;
   1251 
   1252    stbi_uc *bytes = (stbi_uc *)image;
   1253    for (slice = 0; slice < z; ++slice) {
   1254       stbi__vertical_flip(bytes, w, h, bytes_per_pixel);
   1255       bytes += slice_size;
   1256    }
   1257 }
   1258 #endif
   1259 
   1260 static unsigned char *stbi__load_and_postprocess_8bit(stbi__context *s, int *x, int *y, int *comp, int req_comp)
   1261 {
   1262    stbi__result_info ri;
   1263    void *result = stbi__load_main(s, x, y, comp, req_comp, &ri, 8);
   1264 
   1265    if (result == NULL)
   1266       return NULL;
   1267 
   1268    // it is the responsibility of the loaders to make sure we get either 8 or 16 bit.
   1269    STBI_ASSERT(ri.bits_per_channel == 8 || ri.bits_per_channel == 16);
   1270 
   1271    if (ri.bits_per_channel != 8) {
   1272       result = stbi__convert_16_to_8((stbi__uint16 *) result, *x, *y, req_comp == 0 ? *comp : req_comp);
   1273       ri.bits_per_channel = 8;
   1274    }
   1275 
   1276    // @TODO: move stbi__convert_format to here
   1277 
   1278    if (stbi__vertically_flip_on_load) {
   1279       int channels = req_comp ? req_comp : *comp;
   1280       stbi__vertical_flip(result, *x, *y, channels * sizeof(stbi_uc));
   1281    }
   1282 
   1283    return (unsigned char *) result;
   1284 }
   1285 
   1286 static stbi__uint16 *stbi__load_and_postprocess_16bit(stbi__context *s, int *x, int *y, int *comp, int req_comp)
   1287 {
   1288    stbi__result_info ri;
   1289    void *result = stbi__load_main(s, x, y, comp, req_comp, &ri, 16);
   1290 
   1291    if (result == NULL)
   1292       return NULL;
   1293 
   1294    // it is the responsibility of the loaders to make sure we get either 8 or 16 bit.
   1295    STBI_ASSERT(ri.bits_per_channel == 8 || ri.bits_per_channel == 16);
   1296 
   1297    if (ri.bits_per_channel != 16) {
   1298       result = stbi__convert_8_to_16((stbi_uc *) result, *x, *y, req_comp == 0 ? *comp : req_comp);
   1299       ri.bits_per_channel = 16;
   1300    }
   1301 
   1302    // @TODO: move stbi__convert_format16 to here
   1303    // @TODO: special case RGB-to-Y (and RGBA-to-YA) for 8-bit-to-16-bit case to keep more precision
   1304 
   1305    if (stbi__vertically_flip_on_load) {
   1306       int channels = req_comp ? req_comp : *comp;
   1307       stbi__vertical_flip(result, *x, *y, channels * sizeof(stbi__uint16));
   1308    }
   1309 
   1310    return (stbi__uint16 *) result;
   1311 }
   1312 
   1313 #if !defined(STBI_NO_HDR) && !defined(STBI_NO_LINEAR)
   1314 static void stbi__float_postprocess(float *result, int *x, int *y, int *comp, int req_comp)
   1315 {
   1316    if (stbi__vertically_flip_on_load && result != NULL) {
   1317       int channels = req_comp ? req_comp : *comp;
   1318       stbi__vertical_flip(result, *x, *y, channels * sizeof(float));
   1319    }
   1320 }
   1321 #endif
   1322 
   1323 #ifndef STBI_NO_STDIO
   1324 
   1325 #if defined(_WIN32) && defined(STBI_WINDOWS_UTF8)
   1326 STBI_EXTERN __declspec(dllimport) int __stdcall MultiByteToWideChar(unsigned int cp, unsigned long flags, const char *str, int cbmb, wchar_t *widestr, int cchwide);
   1327 STBI_EXTERN __declspec(dllimport) int __stdcall WideCharToMultiByte(unsigned int cp, unsigned long flags, const wchar_t *widestr, int cchwide, char *str, int cbmb, const char *defchar, int *used_default);
   1328 #endif
   1329 
   1330 #if defined(_WIN32) && defined(STBI_WINDOWS_UTF8)
   1331 STBIDEF int stbi_convert_wchar_to_utf8(char *buffer, size_t bufferlen, const wchar_t* input)
   1332 {
   1333    return WideCharToMultiByte(65001 /* UTF8 */, 0, input, -1, buffer, (int) bufferlen, NULL, NULL);
   1334 }
   1335 #endif
   1336 
   1337 static FILE *stbi__fopen(char const *filename, char const *mode)
   1338 {
   1339    FILE *f;
   1340 #if defined(_WIN32) && defined(STBI_WINDOWS_UTF8)
   1341    wchar_t wMode[64];
   1342    wchar_t wFilename[1024];
   1343    if (0 == MultiByteToWideChar(65001 /* UTF8 */, 0, filename, -1, wFilename, sizeof(wFilename)/sizeof(*wFilename)))
   1344       return 0;
   1345 
   1346    if (0 == MultiByteToWideChar(65001 /* UTF8 */, 0, mode, -1, wMode, sizeof(wMode)/sizeof(*wMode)))
   1347       return 0;
   1348 
   1349 #if defined(_MSC_VER) && _MSC_VER >= 1400
   1350    if (0 != _wfopen_s(&f, wFilename, wMode))
   1351       f = 0;
   1352 #else
   1353    f = _wfopen(wFilename, wMode);
   1354 #endif
   1355 
   1356 #elif defined(_MSC_VER) && _MSC_VER >= 1400
   1357    if (0 != fopen_s(&f, filename, mode))
   1358       f=0;
   1359 #else
   1360    f = fopen(filename, mode);
   1361 #endif
   1362    return f;
   1363 }
   1364 
   1365 
   1366 STBIDEF stbi_uc *stbi_load(char const *filename, int *x, int *y, int *comp, int req_comp)
   1367 {
   1368    FILE *f = stbi__fopen(filename, "rb");
   1369    unsigned char *result;
   1370    if (!f) return stbi__errpuc("can't fopen", "Unable to open file");
   1371    result = stbi_load_from_file(f,x,y,comp,req_comp);
   1372    fclose(f);
   1373    return result;
   1374 }
   1375 
   1376 STBIDEF stbi_uc *stbi_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
   1377 {
   1378    unsigned char *result;
   1379    stbi__context s;
   1380    stbi__start_file(&s,f);
   1381    result = stbi__load_and_postprocess_8bit(&s,x,y,comp,req_comp);
   1382    if (result) {
   1383       // need to 'unget' all the characters in the IO buffer
   1384       fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR);
   1385    }
   1386    return result;
   1387 }
   1388 
   1389 STBIDEF stbi__uint16 *stbi_load_from_file_16(FILE *f, int *x, int *y, int *comp, int req_comp)
   1390 {
   1391    stbi__uint16 *result;
   1392    stbi__context s;
   1393    stbi__start_file(&s,f);
   1394    result = stbi__load_and_postprocess_16bit(&s,x,y,comp,req_comp);
   1395    if (result) {
   1396       // need to 'unget' all the characters in the IO buffer
   1397       fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR);
   1398    }
   1399    return result;
   1400 }
   1401 
   1402 STBIDEF stbi_us *stbi_load_16(char const *filename, int *x, int *y, int *comp, int req_comp)
   1403 {
   1404    FILE *f = stbi__fopen(filename, "rb");
   1405    stbi__uint16 *result;
   1406    if (!f) return (stbi_us *) stbi__errpuc("can't fopen", "Unable to open file");
   1407    result = stbi_load_from_file_16(f,x,y,comp,req_comp);
   1408    fclose(f);
   1409    return result;
   1410 }
   1411 
   1412 
   1413 #endif //!STBI_NO_STDIO
   1414 
   1415 STBIDEF stbi_us *stbi_load_16_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *channels_in_file, int desired_channels)
   1416 {
   1417    stbi__context s;
   1418    stbi__start_mem(&s,buffer,len);
   1419    return stbi__load_and_postprocess_16bit(&s,x,y,channels_in_file,desired_channels);
   1420 }
   1421 
   1422 STBIDEF stbi_us *stbi_load_16_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *channels_in_file, int desired_channels)
   1423 {
   1424    stbi__context s;
   1425    stbi__start_callbacks(&s, (stbi_io_callbacks *)clbk, user);
   1426    return stbi__load_and_postprocess_16bit(&s,x,y,channels_in_file,desired_channels);
   1427 }
   1428 
   1429 STBIDEF stbi_uc *stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
   1430 {
   1431    stbi__context s;
   1432    stbi__start_mem(&s,buffer,len);
   1433    return stbi__load_and_postprocess_8bit(&s,x,y,comp,req_comp);
   1434 }
   1435 
   1436 STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
   1437 {
   1438    stbi__context s;
   1439    stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
   1440    return stbi__load_and_postprocess_8bit(&s,x,y,comp,req_comp);
   1441 }
   1442 
   1443 #ifndef STBI_NO_GIF
   1444 STBIDEF stbi_uc *stbi_load_gif_from_memory(stbi_uc const *buffer, int len, int **delays, int *x, int *y, int *z, int *comp, int req_comp)
   1445 {
   1446    unsigned char *result;
   1447    stbi__context s;
   1448    stbi__start_mem(&s,buffer,len);
   1449 
   1450    result = (unsigned char*) stbi__load_gif_main(&s, delays, x, y, z, comp, req_comp);
   1451    if (stbi__vertically_flip_on_load) {
   1452       stbi__vertical_flip_slices( result, *x, *y, *z, *comp );
   1453    }
   1454 
   1455    return result;
   1456 }
   1457 #endif
   1458 
   1459 #ifndef STBI_NO_LINEAR
   1460 static float *stbi__loadf_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
   1461 {
   1462    unsigned char *data;
   1463    #ifndef STBI_NO_HDR
   1464    if (stbi__hdr_test(s)) {
   1465       stbi__result_info ri;
   1466       float *hdr_data = stbi__hdr_load(s,x,y,comp,req_comp, &ri);
   1467       if (hdr_data)
   1468          stbi__float_postprocess(hdr_data,x,y,comp,req_comp);
   1469       return hdr_data;
   1470    }
   1471    #endif
   1472    data = stbi__load_and_postprocess_8bit(s, x, y, comp, req_comp);
   1473    if (data)
   1474       return stbi__ldr_to_hdr(data, *x, *y, req_comp ? req_comp : *comp);
   1475    return stbi__errpf("unknown image type", "Image not of any known type, or corrupt");
   1476 }
   1477 
   1478 STBIDEF float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
   1479 {
   1480    stbi__context s;
   1481    stbi__start_mem(&s,buffer,len);
   1482    return stbi__loadf_main(&s,x,y,comp,req_comp);
   1483 }
   1484 
   1485 STBIDEF float *stbi_loadf_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
   1486 {
   1487    stbi__context s;
   1488    stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
   1489    return stbi__loadf_main(&s,x,y,comp,req_comp);
   1490 }
   1491 
   1492 #ifndef STBI_NO_STDIO
   1493 STBIDEF float *stbi_loadf(char const *filename, int *x, int *y, int *comp, int req_comp)
   1494 {
   1495    float *result;
   1496    FILE *f = stbi__fopen(filename, "rb");
   1497    if (!f) return stbi__errpf("can't fopen", "Unable to open file");
   1498    result = stbi_loadf_from_file(f,x,y,comp,req_comp);
   1499    fclose(f);
   1500    return result;
   1501 }
   1502 
   1503 STBIDEF float *stbi_loadf_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
   1504 {
   1505    stbi__context s;
   1506    stbi__start_file(&s,f);
   1507    return stbi__loadf_main(&s,x,y,comp,req_comp);
   1508 }
   1509 #endif // !STBI_NO_STDIO
   1510 
   1511 #endif // !STBI_NO_LINEAR
   1512 
   1513 // these is-hdr-or-not functions are defined regardless of whether
   1514 // STBI_NO_LINEAR is defined, for API simplicity; if STBI_NO_HDR is defined,
   1515 // they always report false!
   1516 
   1517 STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len)
   1518 {
   1519    #ifndef STBI_NO_HDR
   1520    stbi__context s;
   1521    stbi__start_mem(&s,buffer,len);
   1522    return stbi__hdr_test(&s);
   1523    #else
   1524    STBI_NOTUSED(buffer);
   1525    STBI_NOTUSED(len);
   1526    return 0;
   1527    #endif
   1528 }
   1529 
   1530 #ifndef STBI_NO_STDIO
   1531 STBIDEF int      stbi_is_hdr          (char const *filename)
   1532 {
   1533    FILE *f = stbi__fopen(filename, "rb");
   1534    int result=0;
   1535    if (f) {
   1536       result = stbi_is_hdr_from_file(f);
   1537       fclose(f);
   1538    }
   1539    return result;
   1540 }
   1541 
   1542 STBIDEF int stbi_is_hdr_from_file(FILE *f)
   1543 {
   1544    #ifndef STBI_NO_HDR
   1545    long pos = ftell(f);
   1546    int res;
   1547    stbi__context s;
   1548    stbi__start_file(&s,f);
   1549    res = stbi__hdr_test(&s);
   1550    fseek(f, pos, SEEK_SET);
   1551    return res;
   1552    #else
   1553    STBI_NOTUSED(f);
   1554    return 0;
   1555    #endif
   1556 }
   1557 #endif // !STBI_NO_STDIO
   1558 
   1559 STBIDEF int      stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user)
   1560 {
   1561    #ifndef STBI_NO_HDR
   1562    stbi__context s;
   1563    stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
   1564    return stbi__hdr_test(&s);
   1565    #else
   1566    STBI_NOTUSED(clbk);
   1567    STBI_NOTUSED(user);
   1568    return 0;
   1569    #endif
   1570 }
   1571 
   1572 #ifndef STBI_NO_LINEAR
   1573 static float stbi__l2h_gamma=2.2f, stbi__l2h_scale=1.0f;
   1574 
   1575 STBIDEF void   stbi_ldr_to_hdr_gamma(float gamma) { stbi__l2h_gamma = gamma; }
   1576 STBIDEF void   stbi_ldr_to_hdr_scale(float scale) { stbi__l2h_scale = scale; }
   1577 #endif
   1578 
   1579 static float stbi__h2l_gamma_i=1.0f/2.2f, stbi__h2l_scale_i=1.0f;
   1580 
   1581 STBIDEF void   stbi_hdr_to_ldr_gamma(float gamma) { stbi__h2l_gamma_i = 1/gamma; }
   1582 STBIDEF void   stbi_hdr_to_ldr_scale(float scale) { stbi__h2l_scale_i = 1/scale; }
   1583 
   1584 
   1585 //////////////////////////////////////////////////////////////////////////////
   1586 //
   1587 // Common code used by all image loaders
   1588 //
   1589 
   1590 enum
   1591 {
   1592    STBI__SCAN_load=0,
   1593    STBI__SCAN_type,
   1594    STBI__SCAN_header
   1595 };
   1596 
   1597 static void stbi__refill_buffer(stbi__context *s)
   1598 {
   1599    int n = (s->io.read)(s->io_user_data,(char*)s->buffer_start,s->buflen);
   1600    s->callback_already_read += (int) (s->img_buffer - s->img_buffer_original);
   1601    if (n == 0) {
   1602       // at end of file, treat same as if from memory, but need to handle case
   1603       // where s->img_buffer isn't pointing to safe memory, e.g. 0-byte file
   1604       s->read_from_callbacks = 0;
   1605       s->img_buffer = s->buffer_start;
   1606       s->img_buffer_end = s->buffer_start+1;
   1607       *s->img_buffer = 0;
   1608    } else {
   1609       s->img_buffer = s->buffer_start;
   1610       s->img_buffer_end = s->buffer_start + n;
   1611    }
   1612 }
   1613 
   1614 stbi_inline static stbi_uc stbi__get8(stbi__context *s)
   1615 {
   1616    if (s->img_buffer < s->img_buffer_end)
   1617       return *s->img_buffer++;
   1618    if (s->read_from_callbacks) {
   1619       stbi__refill_buffer(s);
   1620       return *s->img_buffer++;
   1621    }
   1622    return 0;
   1623 }
   1624 
   1625 #if defined(STBI_NO_JPEG) && defined(STBI_NO_HDR) && defined(STBI_NO_PIC) && defined(STBI_NO_PNM)
   1626 // nothing
   1627 #else
   1628 stbi_inline static int stbi__at_eof(stbi__context *s)
   1629 {
   1630    if (s->io.read) {
   1631       if (!(s->io.eof)(s->io_user_data)) return 0;
   1632       // if feof() is true, check if buffer = end
   1633       // special case: we've only got the special 0 character at the end
   1634       if (s->read_from_callbacks == 0) return 1;
   1635    }
   1636 
   1637    return s->img_buffer >= s->img_buffer_end;
   1638 }
   1639 #endif
   1640 
   1641 #if defined(STBI_NO_JPEG) && defined(STBI_NO_PNG) && defined(STBI_NO_BMP) && defined(STBI_NO_PSD) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF) && defined(STBI_NO_PIC)
   1642 // nothing
   1643 #else
   1644 static void stbi__skip(stbi__context *s, int n)
   1645 {
   1646    if (n == 0) return;  // already there!
   1647    if (n < 0) {
   1648       s->img_buffer = s->img_buffer_end;
   1649       return;
   1650    }
   1651    if (s->io.read) {
   1652       int blen = (int) (s->img_buffer_end - s->img_buffer);
   1653       if (blen < n) {
   1654          s->img_buffer = s->img_buffer_end;
   1655          (s->io.skip)(s->io_user_data, n - blen);
   1656          return;
   1657       }
   1658    }
   1659    s->img_buffer += n;
   1660 }
   1661 #endif
   1662 
   1663 #if defined(STBI_NO_PNG) && defined(STBI_NO_TGA) && defined(STBI_NO_HDR) && defined(STBI_NO_PNM)
   1664 // nothing
   1665 #else
   1666 static int stbi__getn(stbi__context *s, stbi_uc *buffer, int n)
   1667 {
   1668    if (s->io.read) {
   1669       int blen = (int) (s->img_buffer_end - s->img_buffer);
   1670       if (blen < n) {
   1671          int res, count;
   1672 
   1673          memcpy(buffer, s->img_buffer, blen);
   1674 
   1675          count = (s->io.read)(s->io_user_data, (char*) buffer + blen, n - blen);
   1676          res = (count == (n-blen));
   1677          s->img_buffer = s->img_buffer_end;
   1678          return res;
   1679       }
   1680    }
   1681 
   1682    if (s->img_buffer+n <= s->img_buffer_end) {
   1683       memcpy(buffer, s->img_buffer, n);
   1684       s->img_buffer += n;
   1685       return 1;
   1686    } else
   1687       return 0;
   1688 }
   1689 #endif
   1690 
   1691 #if defined(STBI_NO_JPEG) && defined(STBI_NO_PNG) && defined(STBI_NO_PSD) && defined(STBI_NO_PIC)
   1692 // nothing
   1693 #else
   1694 static int stbi__get16be(stbi__context *s)
   1695 {
   1696    int z = stbi__get8(s);
   1697    return (z << 8) + stbi__get8(s);
   1698 }
   1699 #endif
   1700 
   1701 #if defined(STBI_NO_PNG) && defined(STBI_NO_PSD) && defined(STBI_NO_PIC)
   1702 // nothing
   1703 #else
   1704 static stbi__uint32 stbi__get32be(stbi__context *s)
   1705 {
   1706    stbi__uint32 z = stbi__get16be(s);
   1707    return (z << 16) + stbi__get16be(s);
   1708 }
   1709 #endif
   1710 
   1711 #if defined(STBI_NO_BMP) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF)
   1712 // nothing
   1713 #else
   1714 static int stbi__get16le(stbi__context *s)
   1715 {
   1716    int z = stbi__get8(s);
   1717    return z + (stbi__get8(s) << 8);
   1718 }
   1719 #endif
   1720 
   1721 #ifndef STBI_NO_BMP
   1722 static stbi__uint32 stbi__get32le(stbi__context *s)
   1723 {
   1724    stbi__uint32 z = stbi__get16le(s);
   1725    z += (stbi__uint32)stbi__get16le(s) << 16;
   1726    return z;
   1727 }
   1728 #endif
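/* Illustrative sketch, not part of stb_image: the stbi__get16be, stbi__get32be,
   stbi__get16le and stbi__get32le helpers above build multi-byte integers one
   byte at a time, so the result does not depend on the host's byte order. The
   version below does the same over a plain in-memory buffer. The names
   byte_cursor, cursor_over, rd8, rd16be, rd32be, rd16le and rd32le are
   hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical cursor over an in-memory buffer. Reads past the end yield 0,
// mirroring how stbi__get8 returns 0 once the data runs out.
typedef struct { const uint8_t *p, *end; } byte_cursor;

static byte_cursor cursor_over(const uint8_t *data, size_t len)
{
    byte_cursor c;
    assert(data != NULL);
    c.p = data;
    c.end = data + len;
    return c;
}

static uint8_t rd8(byte_cursor *c) { return c->p < c->end ? *c->p++ : 0; }

// Big-endian: the first byte read is the most significant.
static uint32_t rd16be(byte_cursor *c) { uint32_t hi = rd8(c); return (hi << 8) | rd8(c); }
static uint32_t rd32be(byte_cursor *c) { uint32_t hi = rd16be(c); return (hi << 16) | rd16be(c); }

// Little-endian: the first byte read is the least significant.
static uint32_t rd16le(byte_cursor *c) { uint32_t lo = rd8(c); return lo | ((uint32_t) rd8(c) << 8); }
static uint32_t rd32le(byte_cursor *c) { uint32_t lo = rd16le(c); return lo | (rd16le(c) << 16); }
```

   Reading the bytes 12 34 56 78 (hex) with rd32be gives 0x12345678, while
   rd32le on the same bytes gives 0x78563412.
*/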
   1729 
   1730 #define STBI__BYTECAST(x)  ((stbi_uc) ((x) & 255))  // truncate int to byte without warnings
   1731 
   1732 #if defined(STBI_NO_JPEG) && defined(STBI_NO_PNG) && defined(STBI_NO_BMP) && defined(STBI_NO_PSD) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF) && defined(STBI_NO_PIC) && defined(STBI_NO_PNM)
   1733 // nothing
   1734 #else
   1735 //////////////////////////////////////////////////////////////////////////////
   1736 //
   1737 //  generic converter from built-in img_n to req_comp
   1738 //    individual types do this automatically as much as possible (e.g. jpeg
   1739 //    does all cases internally since it needs to colorspace convert anyway,
   1740 //    and it never has alpha, so very few cases). png can automatically
   1741 //    interleave an alpha=255 channel, but falls back to this for other cases
   1742 //
   1743 //  assume data buffer is malloced, so malloc a new one and free that one
   1744 //  fails only if malloc fails or the requested conversion is unsupported
   1745 
   1746 static stbi_uc stbi__compute_y(int r, int g, int b)
   1747 {
   1748    return (stbi_uc) (((r*77) + (g*150) +  (29*b)) >> 8);
   1749 }
   1750 #endif
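/* Illustrative sketch, not part of stb_image: stbi__compute_y above is a
   fixed-point weighted sum. The weights 77, 150 and 29 add up to 256, so the
   final shift by 8 divides by the sum of the weights and a gray input maps
   back to itself. The function name luma8 is hypothetical.

```c
#include <assert.h>

// Fixed-point luma: the weights 77, 150 and 29 sum to 256, so shifting the
// weighted sum right by 8 keeps the result in 0..255.
static unsigned char luma8(int r, int g, int b)
{
    assert(r >= 0 && r <= 255 && g >= 0 && g <= 255 && b >= 0 && b <= 255);
    return (unsigned char) ((r * 77 + g * 150 + b * 29) >> 8);
}
```

   For example, luma8(255, 0, 0) is (255 * 77) >> 8 = 76, and any gray input
   such as luma8(100, 100, 100) returns the input value, here 100.
*/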
   1751 
   1752 #if defined(STBI_NO_PNG) && defined(STBI_NO_BMP) && defined(STBI_NO_PSD) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF) && defined(STBI_NO_PIC) && defined(STBI_NO_PNM)
   1753 // nothing
   1754 #else
   1755 static unsigned char *stbi__convert_format(unsigned char *data, int img_n, int req_comp, unsigned int x, unsigned int y)
   1756 {
   1757    int i,j;
   1758    unsigned char *good;
   1759 
   1760    if (req_comp == img_n) return data;
   1761    STBI_ASSERT(req_comp >= 1 && req_comp <= 4);
   1762 
   1763    good = (unsigned char *) stbi__malloc_mad3(req_comp, x, y, 0);
   1764    if (good == NULL) {
   1765       STBI_FREE(data);
   1766       return stbi__errpuc("outofmem", "Out of memory");
   1767    }
   1768 
   1769    for (j=0; j < (int) y; ++j) {
   1770       unsigned char *src  = data + j * x * img_n   ;
   1771       unsigned char *dest = good + j * x * req_comp;
   1772 
   1773       #define STBI__COMBO(a,b)  ((a)*8+(b))
   1774       #define STBI__CASE(a,b)   case STBI__COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
   1775       // convert source image with img_n components to one with req_comp components;
   1776       // avoid switch per pixel, so use switch per scanline and massive macros
   1777       switch (STBI__COMBO(img_n, req_comp)) {
   1778          STBI__CASE(1,2) { dest[0]=src[0]; dest[1]=255;                                     } break;
   1779          STBI__CASE(1,3) { dest[0]=dest[1]=dest[2]=src[0];                                  } break;
   1780          STBI__CASE(1,4) { dest[0]=dest[1]=dest[2]=src[0]; dest[3]=255;                     } break;
   1781          STBI__CASE(2,1) { dest[0]=src[0];                                                  } break;
   1782          STBI__CASE(2,3) { dest[0]=dest[1]=dest[2]=src[0];                                  } break;
   1783          STBI__CASE(2,4) { dest[0]=dest[1]=dest[2]=src[0]; dest[3]=src[1];                  } break;
   1784          STBI__CASE(3,4) { dest[0]=src[0];dest[1]=src[1];dest[2]=src[2];dest[3]=255;        } break;
   1785          STBI__CASE(3,1) { dest[0]=stbi__compute_y(src[0],src[1],src[2]);                   } break;
   1786          STBI__CASE(3,2) { dest[0]=stbi__compute_y(src[0],src[1],src[2]); dest[1] = 255;    } break;
   1787          STBI__CASE(4,1) { dest[0]=stbi__compute_y(src[0],src[1],src[2]);                   } break;
   1788          STBI__CASE(4,2) { dest[0]=stbi__compute_y(src[0],src[1],src[2]); dest[1] = src[3]; } break;
   1789          STBI__CASE(4,3) { dest[0]=src[0];dest[1]=src[1];dest[2]=src[2];                    } break;
   1790          default: STBI_ASSERT(0); STBI_FREE(data); STBI_FREE(good); return stbi__errpuc("unsupported", "Unsupported format conversion");
   1791       }
   1792       #undef STBI__CASE
   1793    }
   1794 
   1795    STBI_FREE(data);
   1796    return good;
   1797 }
   1798 #endif
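/* Illustrative sketch, not part of stb_image: stbi__convert_format above
   folds the (source channels, destination channels) pair into one integer so
   that a single switch per scanline selects the conversion loop. The reduced
   version below handles only three pairs. COMBO and convert_row are
   hypothetical names for this example.

```c
#include <assert.h>
#include <stddef.h>

// Pack (source channels, destination channels) into one switch key.
// Channel counts are 1..4, so a*8+b is distinct for every pair.
#define COMBO(a, b) ((a) * 8 + (b))

// Convert one row of `count` pixels; returns 0 for pairs this sketch omits.
static int convert_row(const unsigned char *src, int src_n,
                       unsigned char *dst, int dst_n, size_t count)
{
    size_t i;
    assert(src != NULL && dst != NULL);
    switch (COMBO(src_n, dst_n)) {
    case COMBO(1, 3):  // gray -> RGB: replicate the gray value
        for (i = 0; i < count; ++i, src += 1, dst += 3)
            dst[0] = dst[1] = dst[2] = src[0];
        return 1;
    case COMBO(3, 4):  // RGB -> RGBA: append an opaque alpha
        for (i = 0; i < count; ++i, src += 3, dst += 4) {
            dst[0] = src[0]; dst[1] = src[1]; dst[2] = src[2]; dst[3] = 255;
        }
        return 1;
    case COMBO(4, 3):  // RGBA -> RGB: drop the alpha
        for (i = 0; i < count; ++i, src += 4, dst += 3) {
            dst[0] = src[0]; dst[1] = src[1]; dst[2] = src[2];
        }
        return 1;
    default:
        return 0;
    }
}
```

   The switch is evaluated once per row rather than once per pixel, which is
   the point of the trick.
*/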
   1799 
   1800 #if defined(STBI_NO_PNG) && defined(STBI_NO_PSD)
   1801 // nothing
   1802 #else
   1803 static stbi__uint16 stbi__compute_y_16(int r, int g, int b)
   1804 {
   1805    return (stbi__uint16) (((r*77) + (g*150) +  (29*b)) >> 8);
   1806 }
   1807 #endif
   1808 
   1809 #if defined(STBI_NO_PNG) && defined(STBI_NO_PSD)
   1810 // nothing
   1811 #else
   1812 static stbi__uint16 *stbi__convert_format16(stbi__uint16 *data, int img_n, int req_comp, unsigned int x, unsigned int y)
   1813 {
   1814    int i,j;
   1815    stbi__uint16 *good;
   1816 
   1817    if (req_comp == img_n) return data;
   1818    STBI_ASSERT(req_comp >= 1 && req_comp <= 4);
   1819 
   1820    good = (stbi__uint16 *) stbi__malloc(req_comp * x * y * 2);
   1821    if (good == NULL) {
   1822       STBI_FREE(data);
   1823       return (stbi__uint16 *) stbi__errpuc("outofmem", "Out of memory");
   1824    }
   1825 
   1826    for (j=0; j < (int) y; ++j) {
   1827       stbi__uint16 *src  = data + j * x * img_n   ;
   1828       stbi__uint16 *dest = good + j * x * req_comp;
   1829 
   1830       #define STBI__COMBO(a,b)  ((a)*8+(b))
   1831       #define STBI__CASE(a,b)   case STBI__COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
   1832       // convert source image with img_n components to one with req_comp components;
   1833       // avoid switch per pixel, so use switch per scanline and massive macros
   1834       switch (STBI__COMBO(img_n, req_comp)) {
   1835          STBI__CASE(1,2) { dest[0]=src[0]; dest[1]=0xffff;                                     } break;
   1836          STBI__CASE(1,3) { dest[0]=dest[1]=dest[2]=src[0];                                     } break;
   1837          STBI__CASE(1,4) { dest[0]=dest[1]=dest[2]=src[0]; dest[3]=0xffff;                     } break;
   1838          STBI__CASE(2,1) { dest[0]=src[0];                                                     } break;
   1839          STBI__CASE(2,3) { dest[0]=dest[1]=dest[2]=src[0];                                     } break;
   1840          STBI__CASE(2,4) { dest[0]=dest[1]=dest[2]=src[0]; dest[3]=src[1];                     } break;
   1841          STBI__CASE(3,4) { dest[0]=src[0];dest[1]=src[1];dest[2]=src[2];dest[3]=0xffff;        } break;
   1842          STBI__CASE(3,1) { dest[0]=stbi__compute_y_16(src[0],src[1],src[2]);                   } break;
   1843          STBI__CASE(3,2) { dest[0]=stbi__compute_y_16(src[0],src[1],src[2]); dest[1] = 0xffff; } break;
   1844          STBI__CASE(4,1) { dest[0]=stbi__compute_y_16(src[0],src[1],src[2]);                   } break;
   1845          STBI__CASE(4,2) { dest[0]=stbi__compute_y_16(src[0],src[1],src[2]); dest[1] = src[3]; } break;
   1846          STBI__CASE(4,3) { dest[0]=src[0];dest[1]=src[1];dest[2]=src[2];                       } break;
   1847          default: STBI_ASSERT(0); STBI_FREE(data); STBI_FREE(good); return (stbi__uint16*) stbi__errpuc("unsupported", "Unsupported format conversion");
   1848       }
   1849       #undef STBI__CASE
   1850    }
   1851 
   1852    STBI_FREE(data);
   1853    return good;
   1854 }
   1855 #endif
   1856 
   1857 #ifndef STBI_NO_LINEAR
   1858 static float   *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp)
   1859 {
   1860    int i,k,n;
   1861    float *output;
   1862    if (!data) return NULL;
   1863    output = (float *) stbi__malloc_mad4(x, y, comp, sizeof(float), 0);
   1864    if (output == NULL) { STBI_FREE(data); return stbi__errpf("outofmem", "Out of memory"); }
   1865    // compute number of non-alpha components
   1866    if (comp & 1) n = comp; else n = comp-1;
   1867    for (i=0; i < x*y; ++i) {
   1868       for (k=0; k < n; ++k) {
   1869          output[i*comp + k] = (float) (pow(data[i*comp+k]/255.0f, stbi__l2h_gamma) * stbi__l2h_scale);
   1870       }
   1871    }
   1872    if (n < comp) {
   1873       for (i=0; i < x*y; ++i) {
   1874          output[i*comp + n] = data[i*comp + n]/255.0f;
   1875       }
   1876    }
   1877    STBI_FREE(data);
   1878    return output;
   1879 }
   1880 #endif
   1881 
   1882 #ifndef STBI_NO_HDR
   1883 #define stbi__float2int(x)   ((int) (x))
   1884 static stbi_uc *stbi__hdr_to_ldr(float   *data, int x, int y, int comp)
   1885 {
   1886    int i,k,n;
   1887    stbi_uc *output;
   1888    if (!data) return NULL;
   1889    output = (stbi_uc *) stbi__malloc_mad3(x, y, comp, 0);
   1890    if (output == NULL) { STBI_FREE(data); return stbi__errpuc("outofmem", "Out of memory"); }
   1891    // compute number of non-alpha components
   1892    if (comp & 1) n = comp; else n = comp-1;
   1893    for (i=0; i < x*y; ++i) {
   1894       for (k=0; k < n; ++k) {
   1895          float z = (float) pow(data[i*comp+k]*stbi__h2l_scale_i, stbi__h2l_gamma_i) * 255 + 0.5f;
   1896          if (z < 0) z = 0;
   1897          if (z > 255) z = 255;
   1898          output[i*comp + k] = (stbi_uc) stbi__float2int(z);
   1899       }
   1900       if (k < comp) {
   1901          float z = data[i*comp+k] * 255 + 0.5f;
   1902          if (z < 0) z = 0;
   1903          if (z > 255) z = 255;
   1904          output[i*comp + k] = (stbi_uc) stbi__float2int(z);
   1905       }
   1906    }
   1907    STBI_FREE(data);
   1908    return output;
   1909 }
   1910 #endif
   1911 
   1912 //////////////////////////////////////////////////////////////////////////////
   1913 //
   1914 //  "baseline" JPEG/JFIF decoder
   1915 //
   1916 //    simple implementation
   1917 //      - doesn't support delayed output of y-dimension
   1918 //      - simple interface (only one output format: 8-bit interleaved RGB)
   1919 //      - doesn't try to recover corrupt jpegs
   1920 //      - doesn't allow partial loading, loading multiple at once
   1921 //      - still fast on x86 (copying globals into locals doesn't help x86)
   1922 //      - allocates lots of intermediate memory (full size of all components)
   1923 //        - non-interleaved case requires this anyway
   1924 //        - allows good upsampling (see next)
   1925 //    high-quality
   1926 //      - upsampled channels are bilinearly interpolated, even across blocks
   1927 //      - quality integer IDCT derived from IJG's 'slow'
   1928 //    performance
   1929 //      - fast huffman; reasonable integer IDCT
   1930 //      - some SIMD kernels for common paths on targets with SSE2/NEON
   1931 //      - uses a lot of intermediate memory, could cache poorly
   1932 
   1933 #ifndef STBI_NO_JPEG
   1934 
   1935 // huffman decoding acceleration
   1936 #define FAST_BITS   9  // larger handles more cases; smaller stomps less cache
   1937 
   1938 typedef struct
   1939 {
   1940    stbi_uc  fast[1 << FAST_BITS];
   1941    // weirdly, repacking this into AoS is a 10% speed loss, instead of a win
   1942    stbi__uint16 code[256];
   1943    stbi_uc  values[256];
   1944    stbi_uc  size[257];
   1945    unsigned int maxcode[18];
   1946    int    delta[17];   // old 'firstsymbol' - old 'firstcode'
   1947 } stbi__huffman;
   1948 
   1949 typedef struct
   1950 {
   1951    stbi__context *s;
   1952    stbi__huffman huff_dc[4];
   1953    stbi__huffman huff_ac[4];
   1954    stbi__uint16 dequant[4][64];
   1955    stbi__int16 fast_ac[4][1 << FAST_BITS];
   1956 
   1957 // sizes for components, interleaved MCUs
   1958    int img_h_max, img_v_max;
   1959    int img_mcu_x, img_mcu_y;
   1960    int img_mcu_w, img_mcu_h;
   1961 
   1962 // definition of jpeg image component
   1963    struct
   1964    {
   1965       int id;
   1966       int h,v;
   1967       int tq;
   1968       int hd,ha;
   1969       int dc_pred;
   1970 
   1971       int x,y,w2,h2;
   1972       stbi_uc *data;
   1973       void *raw_data, *raw_coeff;
   1974       stbi_uc *linebuf;
   1975       short   *coeff;   // progressive only
   1976       int      coeff_w, coeff_h; // number of 8x8 coefficient blocks
   1977    } img_comp[4];
   1978 
   1979    stbi__uint32   code_buffer; // jpeg entropy-coded buffer
   1980    int            code_bits;   // number of valid bits
   1981    unsigned char  marker;      // marker seen while filling entropy buffer
   1982    int            nomore;      // flag if we saw a marker so must stop
   1983 
   1984    int            progressive;
   1985    int            spec_start;
   1986    int            spec_end;
   1987    int            succ_high;
   1988    int            succ_low;
   1989    int            eob_run;
   1990    int            jfif;
   1991    int            app14_color_transform; // Adobe APP14 tag
   1992    int            rgb;
   1993 
   1994    int scan_n, order[4];
   1995    int restart_interval, todo;
   1996 
   1997 // kernels
   1998    void (*idct_block_kernel)(stbi_uc *out, int out_stride, short data[64]);
   1999    void (*YCbCr_to_RGB_kernel)(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step);
   2000    stbi_uc *(*resample_row_hv_2_kernel)(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs);
   2001 } stbi__jpeg;
   2002 
   2003 static int stbi__build_huffman(stbi__huffman *h, int *count)
   2004 {
   2005    int i,j,k=0;
   2006    unsigned int code;
   2007    // build size list for each symbol (from JPEG spec)
   2008    for (i=0; i < 16; ++i) {
   2009       for (j=0; j < count[i]; ++j) {
   2010          h->size[k++] = (stbi_uc) (i+1);
   2011          if(k >= 257) return stbi__err("bad size list","Corrupt JPEG");
   2012       }
   2013    }
   2014    h->size[k] = 0;
   2015 
   2016    // compute actual symbols (from JPEG spec)
   2017    code = 0;
   2018    k = 0;
   2019    for(j=1; j <= 16; ++j) {
   2020       // compute delta to add to code to compute symbol id
   2021       h->delta[j] = k - code;
   2022       if (h->size[k] == j) {
   2023          while (h->size[k] == j)
   2024             h->code[k++] = (stbi__uint16) (code++);
   2025          if (code-1 >= (1u << j)) return stbi__err("bad code lengths","Corrupt JPEG");
   2026       }
   2027       // compute largest code + 1 for this size, preshifted as needed later
   2028       h->maxcode[j] = code << (16-j);
   2029       code <<= 1;
   2030    }
   2031    h->maxcode[j] = 0xffffffff;
   2032 
   2033    // build non-spec acceleration table; 255 is flag for not-accelerated
   2034    memset(h->fast, 255, 1 << FAST_BITS);
   2035    for (i=0; i < k; ++i) {
   2036       int s = h->size[i];
   2037       if (s <= FAST_BITS) {
   2038          int c = h->code[i] << (FAST_BITS-s);
   2039          int m = 1 << (FAST_BITS-s);
   2040          for (j=0; j < m; ++j) {
   2041             h->fast[c+j] = (stbi_uc) i;
   2042          }
   2043       }
   2044    }
   2045    return 1;
   2046 }
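/* Illustrative sketch, not part of stb_image: the first two loops of
   stbi__build_huffman above assign canonical Huffman codes from the JPEG
   count list. Codes of each length are consecutive integers, and the running
   code is doubled when moving to the next length. The reduced version below
   shows only that assignment and leaves out the maxcode, delta and fast
   tables. The name assign_codes and its output arrays are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

// count[i] is the number of symbols whose code is i+1 bits long.
// Writes one (length, code) pair per symbol, in order, and returns the number
// of symbols, or -1 if a code would not fit in its declared length.
static int assign_codes(const int count[16], unsigned char len_out[256],
                        unsigned short code_out[256])
{
    int len, j, k = 0;
    unsigned int code = 0;
    assert(count != NULL && len_out != NULL && code_out != NULL);
    for (len = 1; len <= 16; ++len) {
        for (j = 0; j < count[len - 1]; ++j) {
            if (k >= 256 || code >= (1u << len)) return -1;
            len_out[k] = (unsigned char) len;
            code_out[k] = (unsigned short) code++;
            ++k;
        }
        code <<= 1;  // codes of the next length continue where this one stopped
    }
    return k;
}
```

   With one 1-bit code and two 2-bit codes, the three symbols receive the
   codes 0, 10 and 11 in binary.
*/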
   2047 
   2048 // build a table that decodes both magnitude and value of small ACs in
   2049 // one go.
   2050 static void stbi__build_fast_ac(stbi__int16 *fast_ac, stbi__huffman *h)
   2051 {
   2052    int i;
   2053    for (i=0; i < (1 << FAST_BITS); ++i) {
   2054       stbi_uc fast = h->fast[i];
   2055       fast_ac[i] = 0;
   2056       if (fast < 255) {
   2057          int rs = h->values[fast];
   2058          int run = (rs >> 4) & 15;
   2059          int magbits = rs & 15;
   2060          int len = h->size[fast];
   2061 
   2062          if (magbits && len + magbits <= FAST_BITS) {
   2063             // magnitude code followed by receive_extend code
   2064             int k = ((i << len) & ((1 << FAST_BITS) - 1)) >> (FAST_BITS - magbits);
   2065             int m = 1 << (magbits - 1);
   2066             if (k < m) k += (~0U << magbits) + 1;
   2067             // if the result is small enough, we can fit it in fast_ac table
   2068             if (k >= -128 && k <= 127)
   2069                fast_ac[i] = (stbi__int16) ((k * 256) + (run * 16) + (len + magbits));
   2070          }
   2071       }
   2072    }
   2073 }
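/* Illustrative sketch, not part of stb_image: stbi__build_fast_ac above packs
   three fields into each 16-bit table entry, which stbi__jpeg_decode_block
   later unpacks with r >> 8, (r >> 4) & 15 and r & 15. The helpers below show
   the packing and unpacking on their own. The names pack_fast_ac,
   fast_ac_value, fast_ac_run and fast_ac_len are hypothetical.

```c
#include <assert.h>

// Pack a small signed coefficient, a zero-run and a bit length into 16 bits:
// value in the top 8 bits, run in bits 4..7, total length in bits 0..3.
static short pack_fast_ac(int value, int run, int len)
{
    assert(value >= -128 && value <= 127);
    assert(run >= 0 && run <= 15 && len >= 1 && len <= 15);
    return (short) (value * 256 + run * 16 + len);
}

// Unpacking relies on >> of a negative int being an arithmetic shift, which
// is implementation-defined in C but holds on mainstream compilers.
static int fast_ac_value(short r) { return r >> 8; }
static int fast_ac_run(short r)   { return (r >> 4) & 15; }
static int fast_ac_len(short r)   { return r & 15; }
```

   Since len is at least 1 in this sketch, a packed entry is never 0, which is
   why the real table can use 0 to mean "no fast path for this bit pattern".
*/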
   2074 
   2075 static void stbi__grow_buffer_unsafe(stbi__jpeg *j)
   2076 {
   2077    do {
   2078       unsigned int b = j->nomore ? 0 : stbi__get8(j->s);
   2079       if (b == 0xff) {
   2080          int c = stbi__get8(j->s);
   2081          while (c == 0xff) c = stbi__get8(j->s); // consume fill bytes
   2082          if (c != 0) {
   2083             j->marker = (unsigned char) c;
   2084             j->nomore = 1;
   2085             return;
   2086          }
   2087       }
   2088       j->code_buffer |= b << (24 - j->code_bits);
   2089       j->code_bits += 8;
   2090    } while (j->code_bits <= 24);
   2091 }
   2092 
   2093 // (1 << n) - 1
   2094 static const stbi__uint32 stbi__bmask[17]={0,1,3,7,15,31,63,127,255,511,1023,2047,4095,8191,16383,32767,65535};
   2095 
   2096 // decode a jpeg huffman value from the bitstream
   2097 stbi_inline static int stbi__jpeg_huff_decode(stbi__jpeg *j, stbi__huffman *h)
   2098 {
   2099    unsigned int temp;
   2100    int c,k;
   2101 
   2102    if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
   2103 
   2104    // look at the top FAST_BITS bits and determine what symbol ID they encode,
   2105    // if the code length is <= FAST_BITS
   2106    c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
   2107    k = h->fast[c];
   2108    if (k < 255) {
   2109       int s = h->size[k];
   2110       if (s > j->code_bits)
   2111          return -1;
   2112       j->code_buffer <<= s;
   2113       j->code_bits -= s;
   2114       return h->values[k];
   2115    }
   2116 
   2117    // naive test is to shift the code_buffer down so k bits are
   2118    // valid, then test against maxcode. To speed this up, we've
   2119    // preshifted maxcode left so that it has (16-k) 0s at the
   2120    // end; in other words, regardless of the number of bits, it
   2121    // wants to be compared against something shifted to have 16;
   2122    // that way we don't need to shift inside the loop.
   2123    temp = j->code_buffer >> 16;
   2124    for (k=FAST_BITS+1 ; ; ++k)
   2125       if (temp < h->maxcode[k])
   2126          break;
   2127    if (k == 17) {
   2128       // error! code not found
   2129       j->code_bits -= 16;
   2130       return -1;
   2131    }
   2132 
   2133    if (k > j->code_bits)
   2134       return -1;
   2135 
   2136    // convert the huffman code to the symbol id
   2137    c = ((j->code_buffer >> (32 - k)) & stbi__bmask[k]) + h->delta[k];
   2138    if(c < 0 || c >= 256) // symbol id out of bounds!
   2139        return -1;
   2140    STBI_ASSERT((((j->code_buffer) >> (32 - h->size[c])) & stbi__bmask[h->size[c]]) == h->code[c]);
   2141 
   2142    // convert the id to a symbol
   2143    j->code_bits -= k;
   2144    j->code_buffer <<= k;
   2145    return h->values[c];
   2146 }
   2147 
   2148 // bias[n] = (-1<<n) + 1
   2149 static const int stbi__jbias[16] = {0,-1,-3,-7,-15,-31,-63,-127,-255,-511,-1023,-2047,-4095,-8191,-16383,-32767};
   2150 
   2151 // combined JPEG 'receive' and JPEG 'extend', since baseline
   2152 // always extends everything it receives.
   2153 stbi_inline static int stbi__extend_receive(stbi__jpeg *j, int n)
   2154 {
   2155    unsigned int k;
   2156    int sgn;
   2157    if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
   2158    if (j->code_bits < n) return 0; // ran out of bits from stream, return 0s instead of continuing
   2159 
   2160    sgn = j->code_buffer >> 31; // sign bit always in MSB; 0 if MSB clear (positive), 1 if MSB set (negative)
   2161    k = stbi_lrot(j->code_buffer, n);
   2162    j->code_buffer = k & ~stbi__bmask[n];
   2163    k &= stbi__bmask[n];
   2164    j->code_bits -= n;
   2165    return k + (stbi__jbias[n] & (sgn - 1));
   2166 }
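/* Illustrative sketch, not part of stb_image: stbi__extend_receive above
   reads n bits and applies the JPEG "extend" step in a branch-free way using
   stbi__jbias. The branching version below shows only the extend step on an
   already extracted n-bit field. The name jpeg_extend is hypothetical.

```c
#include <assert.h>

// JPEG "extend": interpret an n-bit field as a signed value. A leading 1 bit
// means the field is the value itself; a leading 0 bit means a negative value
// obtained by adding the bias -(2^n) + 1.
static int jpeg_extend(unsigned int bits, int n)
{
    assert(n >= 1 && n <= 15);
    assert(bits < (1u << n));
    if (bits & (1u << (n - 1)))
        return (int) bits;
    return (int) bits - (1 << n) + 1;
}
```

   For n = 3, the field 101 (binary) decodes to 5 and the field 010 decodes to
   2 - 8 + 1 = -5, so each magnitude category covers a positive range and its
   mirror image.
*/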
   2167 
   2168 // get some unsigned bits
   2169 stbi_inline static int stbi__jpeg_get_bits(stbi__jpeg *j, int n)
   2170 {
   2171    unsigned int k;
   2172    if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
   2173    if (j->code_bits < n) return 0; // ran out of bits from stream, return 0s instead of continuing
   2174    k = stbi_lrot(j->code_buffer, n);
   2175    j->code_buffer = k & ~stbi__bmask[n];
   2176    k &= stbi__bmask[n];
   2177    j->code_bits -= n;
   2178    return k;
   2179 }
   2180 
   2181 stbi_inline static int stbi__jpeg_get_bit(stbi__jpeg *j)
   2182 {
   2183    unsigned int k;
   2184    if (j->code_bits < 1) stbi__grow_buffer_unsafe(j);
   2185    if (j->code_bits < 1) return 0; // ran out of bits from stream, return 0s instead of continuing
   2186    k = j->code_buffer;
   2187    j->code_buffer <<= 1;
   2188    --j->code_bits;
   2189    return k & 0x80000000;
   2190 }
   2191 
   2192 // given a value that's at position X in the zigzag stream,
   2193 // where does it appear in the 8x8 matrix coded as row-major?
   2194 static const stbi_uc stbi__jpeg_dezigzag[64+15] =
   2195 {
   2196     0,  1,  8, 16,  9,  2,  3, 10,
   2197    17, 24, 32, 25, 18, 11,  4,  5,
   2198    12, 19, 26, 33, 40, 48, 41, 34,
   2199    27, 20, 13,  6,  7, 14, 21, 28,
   2200    35, 42, 49, 56, 57, 50, 43, 36,
   2201    29, 22, 15, 23, 30, 37, 44, 51,
   2202    58, 59, 52, 45, 38, 31, 39, 46,
   2203    53, 60, 61, 54, 47, 55, 62, 63,
   2204    // let corrupt input sample past end
   2205    63, 63, 63, 63, 63, 63, 63, 63,
   2206    63, 63, 63, 63, 63, 63, 63
   2207 };
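/* Illustrative sketch, not part of stb_image: the first 64 entries of
   stbi__jpeg_dezigzag above follow the anti-diagonals of an 8x8 block,
   alternating direction on each diagonal. The function below regenerates
   those 64 entries, which can help when checking the table by hand. It does
   not produce the 15 padding entries. The name build_dezigzag is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

// Fill out[0..63] with the row-major index of each zigzag position by
// walking the 15 anti-diagonals of an 8x8 block, alternating direction.
static void build_dezigzag(unsigned char out[64])
{
    int s, row, n = 0;
    assert(out != NULL);
    for (s = 0; s <= 14; ++s) {        // s = row + column
        int lo = s > 7 ? s - 7 : 0;    // smallest valid row on this diagonal
        int hi = s < 7 ? s : 7;        // largest valid row on this diagonal
        if (s & 1) {                   // odd diagonals: top-right to bottom-left
            for (row = lo; row <= hi; ++row)
                out[n++] = (unsigned char) (row * 8 + (s - row));
        } else {                       // even diagonals: bottom-left to top-right
            for (row = hi; row >= lo; --row)
                out[n++] = (unsigned char) (row * 8 + (s - row));
        }
    }
    assert(n == 64);
}
```

   The output starts 0, 1, 8, 16, 9, 2 and ends with 63, matching the table.
*/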
   2208 
   2209 // decode one 64-entry block: the DC coefficient first, then the run-length coded ACs
   2210 static int stbi__jpeg_decode_block(stbi__jpeg *j, short data[64], stbi__huffman *hdc, stbi__huffman *hac, stbi__int16 *fac, int b, stbi__uint16 *dequant)
   2211 {
   2212    int diff,dc,k;
   2213    int t;
   2214 
   2215    if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
   2216    t = stbi__jpeg_huff_decode(j, hdc);
   2217    if (t < 0 || t > 15) return stbi__err("bad huffman code","Corrupt JPEG");
   2218 
   2219    // 0 all the ac values now so we can do it 32-bits at a time
   2220    memset(data,0,64*sizeof(data[0]));
   2221 
   2222    diff = t ? stbi__extend_receive(j, t) : 0;
   2223    if (!stbi__addints_valid(j->img_comp[b].dc_pred, diff)) return stbi__err("bad delta","Corrupt JPEG");
   2224    dc = j->img_comp[b].dc_pred + diff;
   2225    j->img_comp[b].dc_pred = dc;
   2226    if (!stbi__mul2shorts_valid(dc, dequant[0])) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
   2227    data[0] = (short) (dc * dequant[0]);
   2228 
   2229    // decode AC components, see JPEG spec
   2230    k = 1;
   2231    do {
   2232       unsigned int zig;
   2233       int c,r,s;
   2234       if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
   2235       c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
   2236       r = fac[c];
   2237       if (r) { // fast-AC path
   2238          k += (r >> 4) & 15; // run
   2239          s = r & 15; // combined length
   2240          if (s > j->code_bits) return stbi__err("bad huffman code", "Combined length longer than code bits available");
   2241          j->code_buffer <<= s;
   2242          j->code_bits -= s;
   2243          // decode into unzigzag'd location
   2244          zig = stbi__jpeg_dezigzag[k++];
   2245          data[zig] = (short) ((r >> 8) * dequant[zig]);
   2246       } else {
   2247          int rs = stbi__jpeg_huff_decode(j, hac);
   2248          if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
   2249          s = rs & 15;
   2250          r = rs >> 4;
   2251          if (s == 0) {
   2252             if (rs != 0xf0) break; // end block
   2253             k += 16;
   2254          } else {
   2255             k += r;
   2256             // decode into unzigzag'd location
   2257             zig = stbi__jpeg_dezigzag[k++];
   2258             data[zig] = (short) (stbi__extend_receive(j,s) * dequant[zig]);
   2259          }
   2260       }
   2261    } while (k < 64);
   2262    return 1;
   2263 }
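/* Illustrative sketch, not part of stb_image: in the slow path above, the
   decoded AC symbol rs packs a zero-run length in its high nibble and a bit
   size in its low nibble. The helper below mirrors only that branching, under
   the assumption that rs is already a valid byte; AC_EOB, AC_ZRL, AC_COEFF and
   ac_symbol_step are hypothetical names for this example.

```c
#include <assert.h>

// Possible meanings of a baseline AC symbol.
enum { AC_EOB, AC_ZRL, AC_COEFF };

// Classify a run/size byte and report how far the coefficient index k
// advances. For AC_COEFF the value is stored at k + run, and k then
// moves one past it.
static int ac_symbol_step(int rs, int *advance)
{
   int run  = rs >> 4;  // number of zero coefficients to skip
   int size = rs & 15;  // bit length of the coefficient that follows
   assert(rs >= 0 && rs <= 0xff);
   if (size == 0) {
      if (rs != 0xf0) { *advance = 0; return AC_EOB; } // rest of block stays zero
      *advance = 16;                                     // sixteen zeros, no value
      return AC_ZRL;
   }
   *advance = run + 1;
   return AC_COEFF;
}
```

   For example, rs == 0x23 means "skip two zeros, then read a 3-bit value",
   so k advances by three in total.
*/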
   2264 
   2265 static int stbi__jpeg_decode_block_prog_dc(stbi__jpeg *j, short data[64], stbi__huffman *hdc, int b)
   2266 {
   2267    int diff,dc;
   2268    int t;
   2269    if (j->spec_end != 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
   2270 
   2271    if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
   2272 
   2273    if (j->succ_high == 0) {
   2274       // first scan for DC coefficient, must be first
   2275       memset(data,0,64*sizeof(data[0])); // 0 all the ac values now
   2276       t = stbi__jpeg_huff_decode(j, hdc);
   2277       if (t < 0 || t > 15) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
   2278       diff = t ? stbi__extend_receive(j, t) : 0;
   2279 
   2280       if (!stbi__addints_valid(j->img_comp[b].dc_pred, diff)) return stbi__err("bad delta", "Corrupt JPEG");
   2281       dc = j->img_comp[b].dc_pred + diff;
   2282       j->img_comp[b].dc_pred = dc;
   2283       if (!stbi__mul2shorts_valid(dc, 1 << j->succ_low)) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
   2284       data[0] = (short) (dc * (1 << j->succ_low));
   2285    } else {
   2286       // refinement scan for DC coefficient
   2287       if (stbi__jpeg_get_bit(j))
   2288          data[0] += (short) (1 << j->succ_low);
   2289    }
   2290    return 1;
   2291 }
   2292 
   2293 // @OPTIMIZE: store non-zigzagged during the decode passes,
   2294 // and only de-zigzag when dequantizing
   2295 static int stbi__jpeg_decode_block_prog_ac(stbi__jpeg *j, short data[64], stbi__huffman *hac, stbi__int16 *fac)
   2296 {
   2297    int k;
   2298    if (j->spec_start == 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
   2299 
   2300    if (j->succ_high == 0) {
   2301       int shift = j->succ_low;
   2302 
   2303       if (j->eob_run) {
   2304          --j->eob_run;
   2305          return 1;
   2306       }
   2307 
   2308       k = j->spec_start;
   2309       do {
   2310          unsigned int zig;
   2311          int c,r,s;
   2312          if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
   2313          c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
   2314          r = fac[c];
   2315          if (r) { // fast-AC path
   2316             k += (r >> 4) & 15; // run
   2317             s = r & 15; // combined length
   2318             if (s > j->code_bits) return stbi__err("bad huffman code", "Combined length longer than code bits available");
   2319             j->code_buffer <<= s;
   2320             j->code_bits -= s;
   2321             zig = stbi__jpeg_dezigzag[k++];
   2322             data[zig] = (short) ((r >> 8) * (1 << shift));
   2323          } else {
   2324             int rs = stbi__jpeg_huff_decode(j, hac);
   2325             if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
   2326             s = rs & 15;
   2327             r = rs >> 4;
   2328             if (s == 0) {
   2329                if (r < 15) {
   2330                   j->eob_run = (1 << r);
   2331                   if (r)
   2332                      j->eob_run += stbi__jpeg_get_bits(j, r);
   2333                   --j->eob_run;
   2334                   break;
   2335                }
   2336                k += 16;
   2337             } else {
   2338                k += r;
   2339                zig = stbi__jpeg_dezigzag[k++];
   2340                data[zig] = (short) (stbi__extend_receive(j,s) * (1 << shift));
   2341             }
   2342          }
   2343       } while (k <= j->spec_end);
   2344    } else {
   2345       // refinement scan for these AC coefficients
   2346 
   2347       short bit = (short) (1 << j->succ_low);
   2348 
   2349       if (j->eob_run) {
   2350          --j->eob_run;
   2351          for (k = j->spec_start; k <= j->spec_end; ++k) {
   2352             short *p = &data[stbi__jpeg_dezigzag[k]];
   2353             if (*p != 0)
   2354                if (stbi__jpeg_get_bit(j))
   2355                   if ((*p & bit)==0) {
   2356                      if (*p > 0)
   2357                         *p += bit;
   2358                      else
   2359                         *p -= bit;
   2360                   }
   2361          }
   2362       } else {
   2363          k = j->spec_start;
   2364          do {
   2365             int r,s;
   2366             int rs = stbi__jpeg_huff_decode(j, hac); // @OPTIMIZE see if we can use the fast path here, advance-by-r is so slow, eh
   2367             if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
   2368             s = rs & 15;
   2369             r = rs >> 4;
   2370             if (s == 0) {
   2371                if (r < 15) {
   2372                   j->eob_run = (1 << r) - 1;
   2373                   if (r)
   2374                      j->eob_run += stbi__jpeg_get_bits(j, r);
   2375                   r = 64; // force end of block
   2376                } else {
   2377                   // r=15 s=0 should write 16 0s, so we just do
   2378                   // a run of 15 0s and then write s (which is 0),
   2379                   // so we don't have to do anything special here
   2380                }
   2381             } else {
   2382                if (s != 1) return stbi__err("bad huffman code", "Corrupt JPEG");
   2383                // sign bit
   2384                if (stbi__jpeg_get_bit(j))
   2385                   s = bit;
   2386                else
   2387                   s = -bit;
   2388             }
   2389 
   2390             // advance by r
   2391             while (k <= j->spec_end) {
   2392                short *p = &data[stbi__jpeg_dezigzag[k++]];
   2393                if (*p != 0) {
   2394                   if (stbi__jpeg_get_bit(j))
   2395                      if ((*p & bit)==0) {
   2396                         if (*p > 0)
   2397                            *p += bit;
   2398                         else
   2399                            *p -= bit;
   2400                      }
   2401                } else {
   2402                   if (r == 0) {
   2403                      *p = (short) s;
   2404                      break;
   2405                   }
   2406                   --r;
   2407                }
   2408             }
   2409          } while (k <= j->spec_end);
   2410       }
   2411    }
   2412    return 1;
   2413 }
   2414 
   2415 // clamp x to the range 0..255 and return it as a stbi_uc
   2416 stbi_inline static stbi_uc stbi__clamp(int x)
   2417 {
   2418    // trick to use a single test to catch both cases
   2419    if ((unsigned int) x > 255) {
   2420       if (x < 0) return 0;
   2421       if (x > 255) return 255;
   2422    }
   2423    return (stbi_uc) x;
   2424 }
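/* Illustrative sketch, not part of stb_image: the "single test" trick above
   relies on the fact that converting a negative int to unsigned int yields a
   very large value, so one comparison against 255 rejects both out-of-range
   cases. A standalone version, with the hypothetical name clamp_u8, might
   look like this.

```c
#include <assert.h>

// Clamp x to 0..255. In-range values pass a single comparison; only
// out-of-range values pay for the second test that picks 0 or 255.
static unsigned char clamp_u8(int x)
{
   if ((unsigned int) x > 255)
      return (unsigned char) (x < 0 ? 0 : 255);
   return (unsigned char) x;
}
```

   The common case (a value already in range) costs one compare and one
   branch, which is presumably why the check is written this way.
*/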
   2425 
   2426 #define stbi__f2f(x)  ((int) (((x) * 4096 + 0.5)))
   2427 #define stbi__fsh(x)  ((x) * 4096)
   2428 
   2429 // derived from jidctint -- DCT_ISLOW
   2430 #define STBI__IDCT_1D(s0,s1,s2,s3,s4,s5,s6,s7) \
   2431    int t0,t1,t2,t3,p1,p2,p3,p4,p5,x0,x1,x2,x3; \
   2432    p2 = s2;                                    \
   2433    p3 = s6;                                    \
   2434    p1 = (p2+p3) * stbi__f2f(0.5411961f);       \
   2435    t2 = p1 + p3*stbi__f2f(-1.847759065f);      \
   2436    t3 = p1 + p2*stbi__f2f( 0.765366865f);      \
   2437    p2 = s0;                                    \
   2438    p3 = s4;                                    \
   2439    t0 = stbi__fsh(p2+p3);                      \
   2440    t1 = stbi__fsh(p2-p3);                      \
   2441    x0 = t0+t3;                                 \
   2442    x3 = t0-t3;                                 \
   2443    x1 = t1+t2;                                 \
   2444    x2 = t1-t2;                                 \
   2445    t0 = s7;                                    \
   2446    t1 = s5;                                    \
   2447    t2 = s3;                                    \
   2448    t3 = s1;                                    \
   2449    p3 = t0+t2;                                 \
   2450    p4 = t1+t3;                                 \
   2451    p1 = t0+t3;                                 \
   2452    p2 = t1+t2;                                 \
   2453    p5 = (p3+p4)*stbi__f2f( 1.175875602f);      \
   2454    t0 = t0*stbi__f2f( 0.298631336f);           \
   2455    t1 = t1*stbi__f2f( 2.053119869f);           \
   2456    t2 = t2*stbi__f2f( 3.072711026f);           \
   2457    t3 = t3*stbi__f2f( 1.501321110f);           \
   2458    p1 = p5 + p1*stbi__f2f(-0.899976223f);      \
   2459    p2 = p5 + p2*stbi__f2f(-2.562915447f);      \
   2460    p3 = p3*stbi__f2f(-1.961570560f);           \
   2461    p4 = p4*stbi__f2f(-0.390180644f);           \
   2462    t3 += p1+p4;                                \
   2463    t2 += p2+p3;                                \
   2464    t1 += p2+p4;                                \
   2465    t0 += p1+p3;
   2466 
   2467 static void stbi__idct_block(stbi_uc *out, int out_stride, short data[64])
   2468 {
   2469    int i,val[64],*v=val;
   2470    stbi_uc *o;
   2471    short *d = data;
   2472 
   2473    // columns
   2474    for (i=0; i < 8; ++i,++d, ++v) {
   2475       // if all zeroes, shortcut -- this avoids dequantizing 0s and IDCTing
   2476       if (d[ 8]==0 && d[16]==0 && d[24]==0 && d[32]==0
   2477            && d[40]==0 && d[48]==0 && d[56]==0) {
   2478          //    no shortcut                 0     seconds
   2479          //    (1|2|3|4|5|6|7)==0          0     seconds
   2480          //    all separate               -0.047 seconds
   2481          //    1 && 2|3 && 4|5 && 6|7:    -0.047 seconds
   2482          int dcterm = d[0]*4;
   2483          v[0] = v[8] = v[16] = v[24] = v[32] = v[40] = v[48] = v[56] = dcterm;
   2484       } else {
   2485          STBI__IDCT_1D(d[ 0],d[ 8],d[16],d[24],d[32],d[40],d[48],d[56])
   2486          // constants scaled things up by 1<<12; let's bring them back
   2487          // down, but keep 2 extra bits of precision
   2488          x0 += 512; x1 += 512; x2 += 512; x3 += 512;
   2489          v[ 0] = (x0+t3) >> 10;
   2490          v[56] = (x0-t3) >> 10;
   2491          v[ 8] = (x1+t2) >> 10;
   2492          v[48] = (x1-t2) >> 10;
   2493          v[16] = (x2+t1) >> 10;
   2494          v[40] = (x2-t1) >> 10;
   2495          v[24] = (x3+t0) >> 10;
   2496          v[32] = (x3-t0) >> 10;
   2497       }
   2498    }
   2499 
   2500    for (i=0, v=val, o=out; i < 8; ++i,v+=8,o+=out_stride) {
   2501       // no fast case since the first 1D IDCT spread components out
   2502       STBI__IDCT_1D(v[0],v[1],v[2],v[3],v[4],v[5],v[6],v[7])
   2503       // constants scaled things up by 1<<12, plus we had 1<<2 from first
   2504       // loop, plus horizontal and vertical each scale by sqrt(8) so together
   2505       // we've got an extra 1<<3, so 1<<17 total we need to remove.
   2506       // so we want to round that, which means adding 0.5 * 1<<17,
   2507       // aka 65536. Also, we'll end up with -128 to 127 that we want
   2508       // to encode as 0..255 by adding 128, so we'll add that before the shift
   2509       x0 += 65536 + (128<<17);
   2510       x1 += 65536 + (128<<17);
   2511       x2 += 65536 + (128<<17);
   2512       x3 += 65536 + (128<<17);
   2513       // tried computing the shifts into temps, or'ing the temps to see
   2514       // if any were out of range, but that was slower
   2515       o[0] = stbi__clamp((x0+t3) >> 17);
   2516       o[7] = stbi__clamp((x0-t3) >> 17);
   2517       o[1] = stbi__clamp((x1+t2) >> 17);
   2518       o[6] = stbi__clamp((x1-t2) >> 17);
   2519       o[2] = stbi__clamp((x2+t1) >> 17);
   2520       o[5] = stbi__clamp((x2-t1) >> 17);
   2521       o[3] = stbi__clamp((x3+t0) >> 17);
   2522       o[4] = stbi__clamp((x3-t0) >> 17);
   2523    }
   2524 }
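/* Illustrative sketch, not part of stb_image: the row pass above removes a
   total scale of 1<<17, rounds to nearest and re-centers on 128 in a single
   add-then-shift. The helper below isolates that arithmetic for one value.
   ROW_SHIFT and descale_to_pixel are hypothetical names, and the code assumes
   that >> on a negative int is an arithmetic shift, as it is on common
   compilers.

```c
#include <assert.h>

enum { ROW_SHIFT = 17 }; // total scale to remove after the row pass

// Turn a row-pass result (scaled by 1<<17, centered on 0) into a pixel.
// Adding 65536 (half of 1<<17) rounds to nearest; adding 128<<17 moves
// the -128..127 range up to 0..255 before the shift.
static unsigned char descale_to_pixel(int x)
{
   int v = (x + 65536 + (128 << ROW_SHIFT)) >> ROW_SHIFT;
   if (v < 0) return 0;
   if (v > 255) return 255;
   return (unsigned char) v;
}
```

   A scaled input of 0 maps to mid-gray (128), and an input just below half a
   step (65535) still rounds down to 128 while 65536 rounds up to 129.
*/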
   2525 
   2526 #ifdef STBI_SSE2
   2527 // sse2 integer IDCT. not the fastest possible implementation but it
   2528 // produces bit-identical results to the generic C version so it's
   2529 // fully "transparent".
   2530 static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
   2531 {
   2532    // This is constructed to match our regular (generic) integer IDCT exactly.
   2533    __m128i row0, row1, row2, row3, row4, row5, row6, row7;
   2534    __m128i tmp;
   2535 
   2536    // dot product constant: even elems=x, odd elems=y
   2537    #define dct_const(x,y)  _mm_setr_epi16((x),(y),(x),(y),(x),(y),(x),(y))
   2538 
   2539    // out(0) = c0[even]*x + c0[odd]*y   (c0, x, y 16-bit, out 32-bit)
   2540    // out(1) = c1[even]*x + c1[odd]*y
   2541    #define dct_rot(out0,out1, x,y,c0,c1) \
   2542       __m128i c0##lo = _mm_unpacklo_epi16((x),(y)); \
   2543       __m128i c0##hi = _mm_unpackhi_epi16((x),(y)); \
   2544       __m128i out0##_l = _mm_madd_epi16(c0##lo, c0); \
   2545       __m128i out0##_h = _mm_madd_epi16(c0##hi, c0); \
   2546       __m128i out1##_l = _mm_madd_epi16(c0##lo, c1); \
   2547       __m128i out1##_h = _mm_madd_epi16(c0##hi, c1)
   2548 
   2549    // out = in << 12  (in 16-bit, out 32-bit)
   2550    #define dct_widen(out, in) \
   2551       __m128i out##_l = _mm_srai_epi32(_mm_unpacklo_epi16(_mm_setzero_si128(), (in)), 4); \
   2552       __m128i out##_h = _mm_srai_epi32(_mm_unpackhi_epi16(_mm_setzero_si128(), (in)), 4)
   2553 
   2554    // wide add
   2555    #define dct_wadd(out, a, b) \
   2556       __m128i out##_l = _mm_add_epi32(a##_l, b##_l); \
   2557       __m128i out##_h = _mm_add_epi32(a##_h, b##_h)
   2558 
   2559    // wide sub
   2560    #define dct_wsub(out, a, b) \
   2561       __m128i out##_l = _mm_sub_epi32(a##_l, b##_l); \
   2562       __m128i out##_h = _mm_sub_epi32(a##_h, b##_h)
   2563 
   2564    // butterfly a/b, add bias, then shift by "s" and pack
   2565    #define dct_bfly32o(out0, out1, a,b,bias,s) \
   2566       { \
   2567          __m128i abiased_l = _mm_add_epi32(a##_l, bias); \
   2568          __m128i abiased_h = _mm_add_epi32(a##_h, bias); \
   2569          dct_wadd(sum, abiased, b); \
   2570          dct_wsub(dif, abiased, b); \
   2571          out0 = _mm_packs_epi32(_mm_srai_epi32(sum_l, s), _mm_srai_epi32(sum_h, s)); \
   2572          out1 = _mm_packs_epi32(_mm_srai_epi32(dif_l, s), _mm_srai_epi32(dif_h, s)); \
   2573       }
   2574 
   2575    // 8-bit interleave step (for transposes)
   2576    #define dct_interleave8(a, b) \
   2577       tmp = a; \
   2578       a = _mm_unpacklo_epi8(a, b); \
   2579       b = _mm_unpackhi_epi8(tmp, b)
   2580 
   2581    // 16-bit interleave step (for transposes)
   2582    #define dct_interleave16(a, b) \
   2583       tmp = a; \
   2584       a = _mm_unpacklo_epi16(a, b); \
   2585       b = _mm_unpackhi_epi16(tmp, b)
   2586 
   2587    #define dct_pass(bias,shift) \
   2588       { \
   2589          /* even part */ \
   2590          dct_rot(t2e,t3e, row2,row6, rot0_0,rot0_1); \
   2591          __m128i sum04 = _mm_add_epi16(row0, row4); \
   2592          __m128i dif04 = _mm_sub_epi16(row0, row4); \
   2593          dct_widen(t0e, sum04); \
   2594          dct_widen(t1e, dif04); \
   2595          dct_wadd(x0, t0e, t3e); \
   2596          dct_wsub(x3, t0e, t3e); \
   2597          dct_wadd(x1, t1e, t2e); \
   2598          dct_wsub(x2, t1e, t2e); \
   2599          /* odd part */ \
   2600          dct_rot(y0o,y2o, row7,row3, rot2_0,rot2_1); \
   2601          dct_rot(y1o,y3o, row5,row1, rot3_0,rot3_1); \
   2602          __m128i sum17 = _mm_add_epi16(row1, row7); \
   2603          __m128i sum35 = _mm_add_epi16(row3, row5); \
   2604          dct_rot(y4o,y5o, sum17,sum35, rot1_0,rot1_1); \
   2605          dct_wadd(x4, y0o, y4o); \
   2606          dct_wadd(x5, y1o, y5o); \
   2607          dct_wadd(x6, y2o, y5o); \
   2608          dct_wadd(x7, y3o, y4o); \
   2609          dct_bfly32o(row0,row7, x0,x7,bias,shift); \
   2610          dct_bfly32o(row1,row6, x1,x6,bias,shift); \
   2611          dct_bfly32o(row2,row5, x2,x5,bias,shift); \
   2612          dct_bfly32o(row3,row4, x3,x4,bias,shift); \
   2613       }
   2614 
   2615    __m128i rot0_0 = dct_const(stbi__f2f(0.5411961f), stbi__f2f(0.5411961f) + stbi__f2f(-1.847759065f));
   2616    __m128i rot0_1 = dct_const(stbi__f2f(0.5411961f) + stbi__f2f( 0.765366865f), stbi__f2f(0.5411961f));
   2617    __m128i rot1_0 = dct_const(stbi__f2f(1.175875602f) + stbi__f2f(-0.899976223f), stbi__f2f(1.175875602f));
   2618    __m128i rot1_1 = dct_const(stbi__f2f(1.175875602f), stbi__f2f(1.175875602f) + stbi__f2f(-2.562915447f));
   2619    __m128i rot2_0 = dct_const(stbi__f2f(-1.961570560f) + stbi__f2f( 0.298631336f), stbi__f2f(-1.961570560f));
   2620    __m128i rot2_1 = dct_const(stbi__f2f(-1.961570560f), stbi__f2f(-1.961570560f) + stbi__f2f( 3.072711026f));
   2621    __m128i rot3_0 = dct_const(stbi__f2f(-0.390180644f) + stbi__f2f( 2.053119869f), stbi__f2f(-0.390180644f));
   2622    __m128i rot3_1 = dct_const(stbi__f2f(-0.390180644f), stbi__f2f(-0.390180644f) + stbi__f2f( 1.501321110f));
   2623 
   2624    // rounding biases in column/row passes, see stbi__idct_block for explanation.
   2625    __m128i bias_0 = _mm_set1_epi32(512);
   2626    __m128i bias_1 = _mm_set1_epi32(65536 + (128<<17));
   2627 
   2628    // load
   2629    row0 = _mm_load_si128((const __m128i *) (data + 0*8));
   2630    row1 = _mm_load_si128((const __m128i *) (data + 1*8));
   2631    row2 = _mm_load_si128((const __m128i *) (data + 2*8));
   2632    row3 = _mm_load_si128((const __m128i *) (data + 3*8));
   2633    row4 = _mm_load_si128((const __m128i *) (data + 4*8));
   2634    row5 = _mm_load_si128((const __m128i *) (data + 5*8));
   2635    row6 = _mm_load_si128((const __m128i *) (data + 6*8));
   2636    row7 = _mm_load_si128((const __m128i *) (data + 7*8));
   2637 
   2638    // column pass
   2639    dct_pass(bias_0, 10);
   2640 
   2641    {
   2642       // 16bit 8x8 transpose pass 1
   2643       dct_interleave16(row0, row4);
   2644       dct_interleave16(row1, row5);
   2645       dct_interleave16(row2, row6);
   2646       dct_interleave16(row3, row7);
   2647 
   2648       // transpose pass 2
   2649       dct_interleave16(row0, row2);
   2650       dct_interleave16(row1, row3);
   2651       dct_interleave16(row4, row6);
   2652       dct_interleave16(row5, row7);
   2653 
   2654       // transpose pass 3
   2655       dct_interleave16(row0, row1);
   2656       dct_interleave16(row2, row3);
   2657       dct_interleave16(row4, row5);
   2658       dct_interleave16(row6, row7);
   2659    }
   2660 
   2661    // row pass
   2662    dct_pass(bias_1, 17);
   2663 
   2664    {
   2665       // pack
   2666       __m128i p0 = _mm_packus_epi16(row0, row1); // a0a1a2a3...a7b0b1b2b3...b7
   2667       __m128i p1 = _mm_packus_epi16(row2, row3);
   2668       __m128i p2 = _mm_packus_epi16(row4, row5);
   2669       __m128i p3 = _mm_packus_epi16(row6, row7);
   2670 
   2671       // 8bit 8x8 transpose pass 1
   2672       dct_interleave8(p0, p2); // a0e0a1e1...
   2673       dct_interleave8(p1, p3); // c0g0c1g1...
   2674 
   2675       // transpose pass 2
   2676       dct_interleave8(p0, p1); // a0c0e0g0...
   2677       dct_interleave8(p2, p3); // b0d0f0h0...
   2678 
   2679       // transpose pass 3
   2680       dct_interleave8(p0, p2); // a0b0c0d0...
   2681       dct_interleave8(p1, p3); // a4b4c4d4...
   2682 
   2683       // store
   2684       _mm_storel_epi64((__m128i *) out, p0); out += out_stride;
   2685       _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p0, 0x4e)); out += out_stride;
   2686       _mm_storel_epi64((__m128i *) out, p2); out += out_stride;
   2687       _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p2, 0x4e)); out += out_stride;
   2688       _mm_storel_epi64((__m128i *) out, p1); out += out_stride;
   2689       _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p1, 0x4e)); out += out_stride;
   2690       _mm_storel_epi64((__m128i *) out, p3); out += out_stride;
   2691       _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p3, 0x4e));
   2692    }
   2693 
   2694 #undef dct_const
   2695 #undef dct_rot
   2696 #undef dct_widen
   2697 #undef dct_wadd
   2698 #undef dct_wsub
   2699 #undef dct_bfly32o
   2700 #undef dct_interleave8
   2701 #undef dct_interleave16
   2702 #undef dct_pass
   2703 }
   2704 
   2705 #endif // STBI_SSE2
   2706 
   2707 #ifdef STBI_NEON
   2708 
   2709 // NEON integer IDCT. should produce bit-identical
   2710 // results to the generic C version.
   2711 static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
   2712 {
   2713    int16x8_t row0, row1, row2, row3, row4, row5, row6, row7;
   2714 
   2715    int16x4_t rot0_0 = vdup_n_s16(stbi__f2f(0.5411961f));
   2716    int16x4_t rot0_1 = vdup_n_s16(stbi__f2f(-1.847759065f));
   2717    int16x4_t rot0_2 = vdup_n_s16(stbi__f2f( 0.765366865f));
   2718    int16x4_t rot1_0 = vdup_n_s16(stbi__f2f( 1.175875602f));
   2719    int16x4_t rot1_1 = vdup_n_s16(stbi__f2f(-0.899976223f));
   2720    int16x4_t rot1_2 = vdup_n_s16(stbi__f2f(-2.562915447f));
   2721    int16x4_t rot2_0 = vdup_n_s16(stbi__f2f(-1.961570560f));
   2722    int16x4_t rot2_1 = vdup_n_s16(stbi__f2f(-0.390180644f));
   2723    int16x4_t rot3_0 = vdup_n_s16(stbi__f2f( 0.298631336f));
   2724    int16x4_t rot3_1 = vdup_n_s16(stbi__f2f( 2.053119869f));
   2725    int16x4_t rot3_2 = vdup_n_s16(stbi__f2f( 3.072711026f));
   2726    int16x4_t rot3_3 = vdup_n_s16(stbi__f2f( 1.501321110f));
   2727 
   2728 #define dct_long_mul(out, inq, coeff) \
   2729    int32x4_t out##_l = vmull_s16(vget_low_s16(inq), coeff); \
   2730    int32x4_t out##_h = vmull_s16(vget_high_s16(inq), coeff)
   2731 
   2732 #define dct_long_mac(out, acc, inq, coeff) \
   2733    int32x4_t out##_l = vmlal_s16(acc##_l, vget_low_s16(inq), coeff); \
   2734    int32x4_t out##_h = vmlal_s16(acc##_h, vget_high_s16(inq), coeff)
   2735 
   2736 #define dct_widen(out, inq) \
   2737    int32x4_t out##_l = vshll_n_s16(vget_low_s16(inq), 12); \
   2738    int32x4_t out##_h = vshll_n_s16(vget_high_s16(inq), 12)
   2739 
   2740 // wide add
   2741 #define dct_wadd(out, a, b) \
   2742    int32x4_t out##_l = vaddq_s32(a##_l, b##_l); \
   2743    int32x4_t out##_h = vaddq_s32(a##_h, b##_h)
   2744 
   2745 // wide sub
   2746 #define dct_wsub(out, a, b) \
   2747    int32x4_t out##_l = vsubq_s32(a##_l, b##_l); \
   2748    int32x4_t out##_h = vsubq_s32(a##_h, b##_h)
   2749 
   2750 // butterfly a/b, then shift using "shiftop" by "s" and pack
   2751 #define dct_bfly32o(out0,out1, a,b,shiftop,s) \
   2752    { \
   2753       dct_wadd(sum, a, b); \
   2754       dct_wsub(dif, a, b); \
   2755       out0 = vcombine_s16(shiftop(sum_l, s), shiftop(sum_h, s)); \
   2756       out1 = vcombine_s16(shiftop(dif_l, s), shiftop(dif_h, s)); \
   2757    }
   2758 
   2759 #define dct_pass(shiftop, shift) \
   2760    { \
   2761       /* even part */ \
   2762       int16x8_t sum26 = vaddq_s16(row2, row6); \
   2763       dct_long_mul(p1e, sum26, rot0_0); \
   2764       dct_long_mac(t2e, p1e, row6, rot0_1); \
   2765       dct_long_mac(t3e, p1e, row2, rot0_2); \
   2766       int16x8_t sum04 = vaddq_s16(row0, row4); \
   2767       int16x8_t dif04 = vsubq_s16(row0, row4); \
   2768       dct_widen(t0e, sum04); \
   2769       dct_widen(t1e, dif04); \
   2770       dct_wadd(x0, t0e, t3e); \
   2771       dct_wsub(x3, t0e, t3e); \
   2772       dct_wadd(x1, t1e, t2e); \
   2773       dct_wsub(x2, t1e, t2e); \
   2774       /* odd part */ \
   2775       int16x8_t sum15 = vaddq_s16(row1, row5); \
   2776       int16x8_t sum17 = vaddq_s16(row1, row7); \
   2777       int16x8_t sum35 = vaddq_s16(row3, row5); \
   2778       int16x8_t sum37 = vaddq_s16(row3, row7); \
   2779       int16x8_t sumodd = vaddq_s16(sum17, sum35); \
   2780       dct_long_mul(p5o, sumodd, rot1_0); \
   2781       dct_long_mac(p1o, p5o, sum17, rot1_1); \
   2782       dct_long_mac(p2o, p5o, sum35, rot1_2); \
   2783       dct_long_mul(p3o, sum37, rot2_0); \
   2784       dct_long_mul(p4o, sum15, rot2_1); \
   2785       dct_wadd(sump13o, p1o, p3o); \
   2786       dct_wadd(sump24o, p2o, p4o); \
   2787       dct_wadd(sump23o, p2o, p3o); \
   2788       dct_wadd(sump14o, p1o, p4o); \
   2789       dct_long_mac(x4, sump13o, row7, rot3_0); \
   2790       dct_long_mac(x5, sump24o, row5, rot3_1); \
   2791       dct_long_mac(x6, sump23o, row3, rot3_2); \
   2792       dct_long_mac(x7, sump14o, row1, rot3_3); \
   2793       dct_bfly32o(row0,row7, x0,x7,shiftop,shift); \
   2794       dct_bfly32o(row1,row6, x1,x6,shiftop,shift); \
   2795       dct_bfly32o(row2,row5, x2,x5,shiftop,shift); \
   2796       dct_bfly32o(row3,row4, x3,x4,shiftop,shift); \
   2797    }
   2798 
   2799    // load
   2800    row0 = vld1q_s16(data + 0*8);
   2801    row1 = vld1q_s16(data + 1*8);
   2802    row2 = vld1q_s16(data + 2*8);
   2803    row3 = vld1q_s16(data + 3*8);
   2804    row4 = vld1q_s16(data + 4*8);
   2805    row5 = vld1q_s16(data + 5*8);
   2806    row6 = vld1q_s16(data + 6*8);
   2807    row7 = vld1q_s16(data + 7*8);
   2808 
   2809    // add DC bias
   2810    row0 = vaddq_s16(row0, vsetq_lane_s16(1024, vdupq_n_s16(0), 0));
   2811 
   2812    // column pass
   2813    dct_pass(vrshrn_n_s32, 10);
   2814 
   2815    // 16bit 8x8 transpose
   2816    {
   2817 // these three map to a single VTRN.16, VTRN.32, and VSWP, respectively.
   2818 // whether compilers actually get this is another story, sadly.
   2819 #define dct_trn16(x, y) { int16x8x2_t t = vtrnq_s16(x, y); x = t.val[0]; y = t.val[1]; }
   2820 #define dct_trn32(x, y) { int32x4x2_t t = vtrnq_s32(vreinterpretq_s32_s16(x), vreinterpretq_s32_s16(y)); x = vreinterpretq_s16_s32(t.val[0]); y = vreinterpretq_s16_s32(t.val[1]); }
   2821 #define dct_trn64(x, y) { int16x8_t x0 = x; int16x8_t y0 = y; x = vcombine_s16(vget_low_s16(x0), vget_low_s16(y0)); y = vcombine_s16(vget_high_s16(x0), vget_high_s16(y0)); }
   2822 
   2823       // pass 1
   2824       dct_trn16(row0, row1); // a0b0a2b2a4b4a6b6
   2825       dct_trn16(row2, row3);
   2826       dct_trn16(row4, row5);
   2827       dct_trn16(row6, row7);
   2828 
   2829       // pass 2
   2830       dct_trn32(row0, row2); // a0b0c0d0a4b4c4d4
   2831       dct_trn32(row1, row3);
   2832       dct_trn32(row4, row6);
   2833       dct_trn32(row5, row7);
   2834 
   2835       // pass 3
   2836       dct_trn64(row0, row4); // a0b0c0d0e0f0g0h0
   2837       dct_trn64(row1, row5);
   2838       dct_trn64(row2, row6);
   2839       dct_trn64(row3, row7);
   2840 
   2841 #undef dct_trn16
   2842 #undef dct_trn32
   2843 #undef dct_trn64
   2844    }
   2845 
   2846    // row pass
   2847    // vrshrn_n_s32 only supports shifts up to 16, we need
   2848    // 17. so do a non-rounding shift of 16 first then follow
   2849    // up with a rounding shift by 1.
   2850    dct_pass(vshrn_n_s32, 16);
   2851 
   2852    {
   2853       // pack and round
   2854       uint8x8_t p0 = vqrshrun_n_s16(row0, 1);
   2855       uint8x8_t p1 = vqrshrun_n_s16(row1, 1);
   2856       uint8x8_t p2 = vqrshrun_n_s16(row2, 1);
   2857       uint8x8_t p3 = vqrshrun_n_s16(row3, 1);
   2858       uint8x8_t p4 = vqrshrun_n_s16(row4, 1);
   2859       uint8x8_t p5 = vqrshrun_n_s16(row5, 1);
   2860       uint8x8_t p6 = vqrshrun_n_s16(row6, 1);
   2861       uint8x8_t p7 = vqrshrun_n_s16(row7, 1);
   2862 
   2863       // again, these can translate into one instruction, but often don't.
   2864 #define dct_trn8_8(x, y) { uint8x8x2_t t = vtrn_u8(x, y); x = t.val[0]; y = t.val[1]; }
   2865 #define dct_trn8_16(x, y) { uint16x4x2_t t = vtrn_u16(vreinterpret_u16_u8(x), vreinterpret_u16_u8(y)); x = vreinterpret_u8_u16(t.val[0]); y = vreinterpret_u8_u16(t.val[1]); }
   2866 #define dct_trn8_32(x, y) { uint32x2x2_t t = vtrn_u32(vreinterpret_u32_u8(x), vreinterpret_u32_u8(y)); x = vreinterpret_u8_u32(t.val[0]); y = vreinterpret_u8_u32(t.val[1]); }
   2867 
   2868       // sadly can't use interleaved stores here since we only write
   2869       // 8 bytes to each scan line!
   2870 
   2871       // 8x8 8-bit transpose pass 1
   2872       dct_trn8_8(p0, p1);
   2873       dct_trn8_8(p2, p3);
   2874       dct_trn8_8(p4, p5);
   2875       dct_trn8_8(p6, p7);
   2876 
   2877       // pass 2
   2878       dct_trn8_16(p0, p2);
   2879       dct_trn8_16(p1, p3);
   2880       dct_trn8_16(p4, p6);
   2881       dct_trn8_16(p5, p7);
   2882 
   2883       // pass 3
   2884       dct_trn8_32(p0, p4);
   2885       dct_trn8_32(p1, p5);
   2886       dct_trn8_32(p2, p6);
   2887       dct_trn8_32(p3, p7);
   2888 
   2889       // store
   2890       vst1_u8(out, p0); out += out_stride;
   2891       vst1_u8(out, p1); out += out_stride;
   2892       vst1_u8(out, p2); out += out_stride;
   2893       vst1_u8(out, p3); out += out_stride;
   2894       vst1_u8(out, p4); out += out_stride;
   2895       vst1_u8(out, p5); out += out_stride;
   2896       vst1_u8(out, p6); out += out_stride;
   2897       vst1_u8(out, p7);
   2898 
   2899 #undef dct_trn8_8
   2900 #undef dct_trn8_16
   2901 #undef dct_trn8_32
   2902    }
   2903 
   2904 #undef dct_long_mul
   2905 #undef dct_long_mac
   2906 #undef dct_widen
   2907 #undef dct_wadd
   2908 #undef dct_wsub
   2909 #undef dct_bfly32o
   2910 #undef dct_pass
   2911 }
   2912 
   2913 #endif // STBI_NEON
   2914 
   2915 #define STBI__MARKER_none  0xff
   2916 // if there's a pending marker from the entropy stream, return that
   2917 // otherwise, fetch from the stream and get a marker. if there's no
   2918 // marker, return 0xff, which is never a valid marker value
   2919 static stbi_uc stbi__get_marker(stbi__jpeg *j)
   2920 {
   2921    stbi_uc x;
   2922    if (j->marker != STBI__MARKER_none) { x = j->marker; j->marker = STBI__MARKER_none; return x; }
   2923    x = stbi__get8(j->s);
   2924    if (x != 0xff) return STBI__MARKER_none;
   2925    while (x == 0xff)
   2926       x = stbi__get8(j->s); // consume repeated 0xff fill bytes
   2927    return x;
   2928 }
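/* Illustrative sketch, not part of stb_image: the function above treats a
   marker as one or more 0xff bytes followed by a non-0xff code byte. The
   version below applies the same rule to a plain byte buffer instead of the
   stbi__context stream. MARKER_NONE and read_marker are hypothetical names,
   and unlike the original this sketch reports "no marker" when the buffer
   ends inside the fill bytes.

```c
#include <assert.h>
#include <stddef.h>

enum { MARKER_NONE = 0xff };

// Read a marker starting at buf[*pos] and advance *pos past what was
// consumed. Returns MARKER_NONE if the next byte is not 0xff or if the
// buffer runs out before a code byte is found.
static int read_marker(const unsigned char *buf, size_t len, size_t *pos)
{
   if (*pos >= len || buf[(*pos)++] != 0xff) return MARKER_NONE;
   while (*pos < len) {
      int x = buf[(*pos)++];
      if (x != 0xff) return x; // first non-fill byte is the marker code
   }
   return MARKER_NONE;
}
```

   Given the bytes ff ff d8, this returns 0xd8 after skipping the repeated
   fill byte.
*/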
   2929 
   2930 // in each scan, we'll have scan_n components, and the order
   2931 // of the components is specified by order[]
   2932 #define STBI__RESTART(x)     ((x) >= 0xd0 && (x) <= 0xd7)
   2933 
   2934 // after a restart interval, reset the entropy decoder and
   2935 // the dc prediction
   2936 static void stbi__jpeg_reset(stbi__jpeg *j)
   2937 {
   2938    j->code_bits = 0;
   2939    j->code_buffer = 0;
   2940    j->nomore = 0;
   2941    j->img_comp[0].dc_pred = j->img_comp[1].dc_pred = j->img_comp[2].dc_pred = j->img_comp[3].dc_pred = 0;
   2942    j->marker = STBI__MARKER_none;
   2943    j->todo = j->restart_interval ? j->restart_interval : 0x7fffffff;
   2944    j->eob_run = 0;
   2945    // no more than 1<<31 MCUs if no restart_interval? that's plenty safe,
   2946    // since we don't even allow 1<<30 pixels
   2947 }
   2948 
   2949 static int stbi__parse_entropy_coded_data(stbi__jpeg *z)
   2950 {
   2951    stbi__jpeg_reset(z);
   2952    if (!z->progressive) {
   2953       if (z->scan_n == 1) {
   2954          int i,j;
   2955          STBI_SIMD_ALIGN(short, data[64]);
   2956          int n = z->order[0];
   2957          // non-interleaved data, we just need to process one block at a time,
   2958          // in trivial scanline order
   2959          // number of blocks to do just depends on how many actual "pixels" this
   2960          // component has, independent of interleaved MCU blocking and such
   2961          int w = (z->img_comp[n].x+7) >> 3;
   2962          int h = (z->img_comp[n].y+7) >> 3;
   2963          for (j=0; j < h; ++j) {
   2964             for (i=0; i < w; ++i) {
   2965                int ha = z->img_comp[n].ha;
   2966                if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
   2967                z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
   2968                // every data block is an MCU, so countdown the restart interval
   2969                if (--z->todo <= 0) {
   2970                   if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
   2971                   // if it's NOT a restart, then just bail, so we get corrupt data
   2972                   // rather than no data
   2973                   if (!STBI__RESTART(z->marker)) return 1;
   2974                   stbi__jpeg_reset(z);
   2975                }
   2976             }
   2977          }
   2978          return 1;
   2979       } else { // interleaved
   2980          int i,j,k,x,y;
   2981          STBI_SIMD_ALIGN(short, data[64]);
   2982          for (j=0; j < z->img_mcu_y; ++j) {
   2983             for (i=0; i < z->img_mcu_x; ++i) {
   2984                // scan an interleaved mcu... process scan_n components in order
   2985                for (k=0; k < z->scan_n; ++k) {
   2986                   int n = z->order[k];
   2987                   // scan out an mcu's worth of this component; that's just determined
   2988                   // by the basic H and V specified for the component
   2989                   for (y=0; y < z->img_comp[n].v; ++y) {
   2990                      for (x=0; x < z->img_comp[n].h; ++x) {
   2991                         int x2 = (i*z->img_comp[n].h + x)*8;
   2992                         int y2 = (j*z->img_comp[n].v + y)*8;
   2993                         int ha = z->img_comp[n].ha;
   2994                         if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
   2995                         z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*y2+x2, z->img_comp[n].w2, data);
   2996                      }
   2997                   }
   2998                }
   2999                // after all interleaved components, that's an interleaved MCU,
   3000                // so now count down the restart interval
   3001                if (--z->todo <= 0) {
   3002                   if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
   3003                   if (!STBI__RESTART(z->marker)) return 1;
   3004                   stbi__jpeg_reset(z);
   3005                }
   3006             }
   3007          }
   3008          return 1;
   3009       }
   3010    } else {
   3011       if (z->scan_n == 1) {
   3012          int i,j;
   3013          int n = z->order[0];
   3014          // non-interleaved data, we just need to process one block at a time,
   3015          // in trivial scanline order
   3016          // number of blocks to do just depends on how many actual "pixels" this
   3017          // component has, independent of interleaved MCU blocking and such
   3018          int w = (z->img_comp[n].x+7) >> 3;
   3019          int h = (z->img_comp[n].y+7) >> 3;
   3020          for (j=0; j < h; ++j) {
   3021             for (i=0; i < w; ++i) {
   3022                short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
   3023                if (z->spec_start == 0) {
   3024                   if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
   3025                      return 0;
   3026                } else {
   3027                   int ha = z->img_comp[n].ha;
   3028                   if (!stbi__jpeg_decode_block_prog_ac(z, data, &z->huff_ac[ha], z->fast_ac[ha]))
   3029                      return 0;
   3030                }
   3031                // every data block is an MCU, so countdown the restart interval
   3032                if (--z->todo <= 0) {
   3033                   if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
   3034                   if (!STBI__RESTART(z->marker)) return 1;
   3035                   stbi__jpeg_reset(z);
   3036                }
   3037             }
   3038          }
   3039          return 1;
   3040       } else { // interleaved
   3041          int i,j,k,x,y;
   3042          for (j=0; j < z->img_mcu_y; ++j) {
   3043             for (i=0; i < z->img_mcu_x; ++i) {
   3044                // scan an interleaved mcu... process scan_n components in order
   3045                for (k=0; k < z->scan_n; ++k) {
   3046                   int n = z->order[k];
   3047                   // scan out an mcu's worth of this component; that's just determined
   3048                   // by the basic H and V specified for the component
   3049                   for (y=0; y < z->img_comp[n].v; ++y) {
   3050                      for (x=0; x < z->img_comp[n].h; ++x) {
   3051                         int x2 = (i*z->img_comp[n].h + x);
   3052                         int y2 = (j*z->img_comp[n].v + y);
   3053                         short *data = z->img_comp[n].coeff + 64 * (x2 + y2 * z->img_comp[n].coeff_w);
   3054                         if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
   3055                            return 0;
   3056                      }
   3057                   }
   3058                }
   3059                // after all interleaved components, that's an interleaved MCU,
   3060                // so now count down the restart interval
   3061                if (--z->todo <= 0) {
   3062                   if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
   3063                   if (!STBI__RESTART(z->marker)) return 1;
   3064                   stbi__jpeg_reset(z);
   3065                }
   3066             }
   3067          }
   3068          return 1;
   3069       }
   3070    }
   3071 }
   3072 
   3073 static void stbi__jpeg_dequantize(short *data, stbi__uint16 *dequant)
   3074 {
   3075    int i;
   3076    for (i=0; i < 64; ++i)
   3077       data[i] *= dequant[i];
   3078 }
   3079 
   3080 static void stbi__jpeg_finish(stbi__jpeg *z)
   3081 {
   3082    if (z->progressive) {
   3083       // dequantize and idct the data
   3084       int i,j,n;
   3085       for (n=0; n < z->s->img_n; ++n) {
   3086          int w = (z->img_comp[n].x+7) >> 3;
   3087          int h = (z->img_comp[n].y+7) >> 3;
   3088          for (j=0; j < h; ++j) {
   3089             for (i=0; i < w; ++i) {
   3090                short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
   3091                stbi__jpeg_dequantize(data, z->dequant[z->img_comp[n].tq]);
   3092                z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
   3093             }
   3094          }
   3095       }
   3096    }
   3097 }
   3098 
   3099 static int stbi__process_marker(stbi__jpeg *z, int m)
   3100 {
   3101    int L;
   3102    switch (m) {
   3103       case STBI__MARKER_none: // no marker found
   3104          return stbi__err("expected marker","Corrupt JPEG");
   3105 
   3106       case 0xDD: // DRI - specify restart interval
   3107          if (stbi__get16be(z->s) != 4) return stbi__err("bad DRI len","Corrupt JPEG");
   3108          z->restart_interval = stbi__get16be(z->s);
   3109          return 1;
   3110 
   3111       case 0xDB: // DQT - define quantization table
   3112          L = stbi__get16be(z->s)-2;
   3113          while (L > 0) {
   3114             int q = stbi__get8(z->s);
   3115             int p = q >> 4, sixteen = (p != 0);
   3116             int t = q & 15,i;
   3117             if (p != 0 && p != 1) return stbi__err("bad DQT type","Corrupt JPEG");
   3118             if (t > 3) return stbi__err("bad DQT table","Corrupt JPEG");
   3119 
   3120             for (i=0; i < 64; ++i)
   3121                z->dequant[t][stbi__jpeg_dezigzag[i]] = (stbi__uint16)(sixteen ? stbi__get16be(z->s) : stbi__get8(z->s));
   3122             L -= (sixteen ? 129 : 65);
   3123          }
   3124          return L==0;
   3125 
   3126       case 0xC4: // DHT - define huffman table
   3127          L = stbi__get16be(z->s)-2;
   3128          while (L > 0) {
   3129             stbi_uc *v;
   3130             int sizes[16],i,n=0;
   3131             int q = stbi__get8(z->s);
   3132             int tc = q >> 4;
   3133             int th = q & 15;
   3134             if (tc > 1 || th > 3) return stbi__err("bad DHT header","Corrupt JPEG");
   3135             for (i=0; i < 16; ++i) {
   3136                sizes[i] = stbi__get8(z->s);
   3137                n += sizes[i];
   3138             }
   3139             if(n > 256) return stbi__err("bad DHT header","Corrupt JPEG"); // Loop over i < n would write past end of values!
   3140             L -= 17;
   3141             if (tc == 0) {
   3142                if (!stbi__build_huffman(z->huff_dc+th, sizes)) return 0;
   3143                v = z->huff_dc[th].values;
   3144             } else {
   3145                if (!stbi__build_huffman(z->huff_ac+th, sizes)) return 0;
   3146                v = z->huff_ac[th].values;
   3147             }
   3148             for (i=0; i < n; ++i)
   3149                v[i] = stbi__get8(z->s);
   3150             if (tc != 0)
   3151                stbi__build_fast_ac(z->fast_ac[th], z->huff_ac + th);
   3152             L -= n;
   3153          }
   3154          return L==0;
   3155    }
   3156 
   3157    // check for comment block or APP blocks
   3158    if ((m >= 0xE0 && m <= 0xEF) || m == 0xFE) {
   3159       L = stbi__get16be(z->s);
   3160       if (L < 2) {
   3161          if (m == 0xFE)
   3162             return stbi__err("bad COM len","Corrupt JPEG");
   3163          else
   3164             return stbi__err("bad APP len","Corrupt JPEG");
   3165       }
   3166       L -= 2;
   3167 
   3168       if (m == 0xE0 && L >= 5) { // JFIF APP0 segment
   3169          static const unsigned char tag[5] = {'J','F','I','F','\0'};
   3170          int ok = 1;
   3171          int i;
   3172          for (i=0; i < 5; ++i)
   3173             if (stbi__get8(z->s) != tag[i])
   3174                ok = 0;
   3175          L -= 5;
   3176          if (ok)
   3177             z->jfif = 1;
   3178       } else if (m == 0xEE && L >= 12) { // Adobe APP14 segment
   3179          static const unsigned char tag[6] = {'A','d','o','b','e','\0'};
   3180          int ok = 1;
   3181          int i;
   3182          for (i=0; i < 6; ++i)
   3183             if (stbi__get8(z->s) != tag[i])
   3184                ok = 0;
   3185          L -= 6;
   3186          if (ok) {
   3187             stbi__get8(z->s); // version
   3188             stbi__get16be(z->s); // flags0
   3189             stbi__get16be(z->s); // flags1
   3190             z->app14_color_transform = stbi__get8(z->s); // color transform
   3191             L -= 6;
   3192          }
   3193       }
   3194 
   3195       stbi__skip(z->s, L);
   3196       return 1;
   3197    }
   3198 
   3199    return stbi__err("unknown marker","Corrupt JPEG");
   3200 }
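
        // Illustrative sketch (not part of stb_image): the DQT length accounting
        // from stbi__process_marker on its own. Each table costs one
        // precision/id byte plus 64 entries of one or two bytes, so 65 or 129
        // bytes, and the segment payload must be consumed exactly. The name
        // example_dqt_payload_ok is hypothetical; payload is assumed to exclude
        // the 2-byte length field.
        static int example_dqt_payload_ok(const unsigned char *payload, int len)
        {
           int pos = 0;
           while (pos < len) {
              int p = payload[pos] >> 4;      // precision: 0 = 8-bit, 1 = 16-bit
              int t = payload[pos] & 15;      // destination table index
              if (p > 1 || t > 3) return 0;   // same limits as the DQT case
              pos += p ? 129 : 65;            // skip this table's bytes
           }
           return pos == len;                 // mirrors "return L==0"
        }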
   3201 
   3202 // after we see SOS
   3203 static int stbi__process_scan_header(stbi__jpeg *z)
   3204 {
   3205    int i;
   3206    int Ls = stbi__get16be(z->s);
   3207    z->scan_n = stbi__get8(z->s);
   3208    if (z->scan_n < 1 || z->scan_n > 4 || z->scan_n > (int) z->s->img_n) return stbi__err("bad SOS component count","Corrupt JPEG");
   3209    if (Ls != 6+2*z->scan_n) return stbi__err("bad SOS len","Corrupt JPEG");
   3210    for (i=0; i < z->scan_n; ++i) {
   3211       int id = stbi__get8(z->s), which;
   3212       int q = stbi__get8(z->s);
   3213       for (which = 0; which < z->s->img_n; ++which)
   3214          if (z->img_comp[which].id == id)
   3215             break;
   3216       if (which == z->s->img_n) return 0; // no match
   3217       z->img_comp[which].hd = q >> 4;   if (z->img_comp[which].hd > 3) return stbi__err("bad DC huff","Corrupt JPEG");
   3218       z->img_comp[which].ha = q & 15;   if (z->img_comp[which].ha > 3) return stbi__err("bad AC huff","Corrupt JPEG");
   3219       z->order[i] = which;
   3220    }
   3221 
   3222    {
   3223       int aa;
   3224       z->spec_start = stbi__get8(z->s);
   3225       z->spec_end   = stbi__get8(z->s); // should be 63, but might be 0
   3226       aa = stbi__get8(z->s);
   3227       z->succ_high = (aa >> 4);
   3228       z->succ_low  = (aa & 15);
   3229       if (z->progressive) {
   3230          if (z->spec_start > 63 || z->spec_end > 63  || z->spec_start > z->spec_end || z->succ_high > 13 || z->succ_low > 13)
   3231             return stbi__err("bad SOS", "Corrupt JPEG");
   3232       } else {
   3233          if (z->spec_start != 0) return stbi__err("bad SOS","Corrupt JPEG");
   3234          if (z->succ_high != 0 || z->succ_low != 0) return stbi__err("bad SOS","Corrupt JPEG");
   3235          z->spec_end = 63;
   3236       }
   3237    }
   3238 
   3239    return 1;
   3240 }
   3241 
   3242 static int stbi__free_jpeg_components(stbi__jpeg *z, int ncomp, int why)
   3243 {
   3244    int i;
   3245    for (i=0; i < ncomp; ++i) {
   3246       if (z->img_comp[i].raw_data) {
   3247          STBI_FREE(z->img_comp[i].raw_data);
   3248          z->img_comp[i].raw_data = NULL;
   3249          z->img_comp[i].data = NULL;
   3250       }
   3251       if (z->img_comp[i].raw_coeff) {
   3252          STBI_FREE(z->img_comp[i].raw_coeff);
   3253          z->img_comp[i].raw_coeff = 0;
   3254          z->img_comp[i].coeff = 0;
   3255       }
   3256       if (z->img_comp[i].linebuf) {
   3257          STBI_FREE(z->img_comp[i].linebuf);
   3258          z->img_comp[i].linebuf = NULL;
   3259       }
   3260    }
   3261    return why;
   3262 }
   3263 
   3264 static int stbi__process_frame_header(stbi__jpeg *z, int scan)
   3265 {
   3266    stbi__context *s = z->s;
   3267    int Lf,p,i,q, h_max=1,v_max=1,c;
   3268    Lf = stbi__get16be(s);         if (Lf < 11) return stbi__err("bad SOF len","Corrupt JPEG"); // JPEG
   3269    p  = stbi__get8(s);            if (p != 8) return stbi__err("only 8-bit","JPEG format not supported: 8-bit only"); // JPEG baseline
   3270    s->img_y = stbi__get16be(s);   if (s->img_y == 0) return stbi__err("no header height", "JPEG format not supported: delayed height"); // Legal, but we don't handle it--but neither does IJG
   3271    s->img_x = stbi__get16be(s);   if (s->img_x == 0) return stbi__err("0 width","Corrupt JPEG"); // JPEG requires
   3272    if (s->img_y > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
   3273    if (s->img_x > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
   3274    c = stbi__get8(s);
   3275    if (c != 3 && c != 1 && c != 4) return stbi__err("bad component count","Corrupt JPEG");
   3276    s->img_n = c;
   3277    for (i=0; i < c; ++i) {
   3278       z->img_comp[i].data = NULL;
   3279       z->img_comp[i].linebuf = NULL;
   3280    }
   3281 
   3282    if (Lf != 8+3*s->img_n) return stbi__err("bad SOF len","Corrupt JPEG");
   3283 
   3284    z->rgb = 0;
   3285    for (i=0; i < s->img_n; ++i) {
   3286       static const unsigned char rgb[3] = { 'R', 'G', 'B' };
   3287       z->img_comp[i].id = stbi__get8(s);
   3288       if (s->img_n == 3 && z->img_comp[i].id == rgb[i])
   3289          ++z->rgb;
   3290       q = stbi__get8(s);
   3291       z->img_comp[i].h = (q >> 4);  if (!z->img_comp[i].h || z->img_comp[i].h > 4) return stbi__err("bad H","Corrupt JPEG");
   3292       z->img_comp[i].v = q & 15;    if (!z->img_comp[i].v || z->img_comp[i].v > 4) return stbi__err("bad V","Corrupt JPEG");
   3293       z->img_comp[i].tq = stbi__get8(s);  if (z->img_comp[i].tq > 3) return stbi__err("bad TQ","Corrupt JPEG");
   3294    }
   3295 
   3296    if (scan != STBI__SCAN_load) return 1;
   3297 
   3298    if (!stbi__mad3sizes_valid(s->img_x, s->img_y, s->img_n, 0)) return stbi__err("too large", "Image too large to decode");
   3299 
   3300    for (i=0; i < s->img_n; ++i) {
   3301       if (z->img_comp[i].h > h_max) h_max = z->img_comp[i].h;
   3302       if (z->img_comp[i].v > v_max) v_max = z->img_comp[i].v;
   3303    }
   3304 
   3305    // check that plane subsampling factors are integer ratios; our resamplers can't deal with fractional ratios
   3306    // and I've never seen a non-corrupted JPEG file actually use them
   3307    for (i=0; i < s->img_n; ++i) {
   3308       if (h_max % z->img_comp[i].h != 0) return stbi__err("bad H","Corrupt JPEG");
   3309       if (v_max % z->img_comp[i].v != 0) return stbi__err("bad V","Corrupt JPEG");
   3310    }
   3311 
   3312    // compute interleaved mcu info
   3313    z->img_h_max = h_max;
   3314    z->img_v_max = v_max;
   3315    z->img_mcu_w = h_max * 8;
   3316    z->img_mcu_h = v_max * 8;
   3317    // these sizes can't be more than 17 bits
   3318    z->img_mcu_x = (s->img_x + z->img_mcu_w-1) / z->img_mcu_w;
   3319    z->img_mcu_y = (s->img_y + z->img_mcu_h-1) / z->img_mcu_h;
   3320 
   3321    for (i=0; i < s->img_n; ++i) {
   3322       // number of effective pixels (e.g. for non-interleaved MCU)
   3323       z->img_comp[i].x = (s->img_x * z->img_comp[i].h + h_max-1) / h_max;
   3324       z->img_comp[i].y = (s->img_y * z->img_comp[i].v + v_max-1) / v_max;
   3325       // to simplify generation, we'll allocate enough memory to decode
   3326       // the bogus oversized data from using interleaved MCUs and their
   3327       // big blocks (e.g. a 16x16 iMCU on an image of width 33); we won't
   3328       // discard the extra data until colorspace conversion
   3329       //
   3330       // img_mcu_x, img_mcu_y: <=17 bits; comp[i].h and .v are <=4 (checked earlier)
   3331       // so these muls can't overflow with 32-bit ints (which we require)
   3332       z->img_comp[i].w2 = z->img_mcu_x * z->img_comp[i].h * 8;
   3333       z->img_comp[i].h2 = z->img_mcu_y * z->img_comp[i].v * 8;
   3334       z->img_comp[i].coeff = 0;
   3335       z->img_comp[i].raw_coeff = 0;
   3336       z->img_comp[i].linebuf = NULL;
   3337       z->img_comp[i].raw_data = stbi__malloc_mad2(z->img_comp[i].w2, z->img_comp[i].h2, 15);
   3338       if (z->img_comp[i].raw_data == NULL)
   3339          return stbi__free_jpeg_components(z, i+1, stbi__err("outofmem", "Out of memory"));
   3340       // align blocks for idct using mmx/sse
   3341       z->img_comp[i].data = (stbi_uc*) (((size_t) z->img_comp[i].raw_data + 15) & ~15);
   3342       if (z->progressive) {
   3343          // w2, h2 are multiples of 8 (see above)
   3344          z->img_comp[i].coeff_w = z->img_comp[i].w2 / 8;
   3345          z->img_comp[i].coeff_h = z->img_comp[i].h2 / 8;
   3346          z->img_comp[i].raw_coeff = stbi__malloc_mad3(z->img_comp[i].w2, z->img_comp[i].h2, sizeof(short), 15);
   3347          if (z->img_comp[i].raw_coeff == NULL)
   3348             return stbi__free_jpeg_components(z, i+1, stbi__err("outofmem", "Out of memory"));
   3349          z->img_comp[i].coeff = (short*) (((size_t) z->img_comp[i].raw_coeff + 15) & ~15);
   3350       }
   3351    }
   3352 
   3353    return 1;
   3354 }
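
        // Illustrative sketch (not part of stb_image): the MCU and plane-size
        // arithmetic from stbi__process_frame_header as free functions. All
        // three names are hypothetical. "factor" is a component's H or V
        // sampling factor and "max_factor" is h_max or v_max. For the comment's
        // case of width 33 with a 16x16 iMCU (max_factor 2) this gives 3 MCUs,
        // a padded luma width of 48 and an effective chroma width of 17.
        static int example_mcu_count(int pixels, int max_factor)
        {
           int mcu = max_factor * 8;                    // MCU extent in pixels
           return (pixels + mcu - 1) / mcu;             // round up
        }

        static int example_padded_extent(int pixels, int factor, int max_factor)
        {
           // corresponds to w2 / h2: whole MCUs, including the oversized edge
           return example_mcu_count(pixels, max_factor) * factor * 8;
        }

        static int example_effective_extent(int pixels, int factor, int max_factor)
        {
           // corresponds to img_comp[i].x / .y: samples that carry image data
           return (pixels * factor + max_factor - 1) / max_factor;
        }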
   3355 
   3356 // use comparisons since in some cases we handle more than one case (e.g. SOF)
   3357 #define stbi__DNL(x)         ((x) == 0xdc)
   3358 #define stbi__SOI(x)         ((x) == 0xd8)
   3359 #define stbi__EOI(x)         ((x) == 0xd9)
   3360 #define stbi__SOF(x)         ((x) == 0xc0 || (x) == 0xc1 || (x) == 0xc2)
   3361 #define stbi__SOS(x)         ((x) == 0xda)
   3362 
   3363 #define stbi__SOF_progressive(x)   ((x) == 0xc2)
   3364 
   3365 static int stbi__decode_jpeg_header(stbi__jpeg *z, int scan)
   3366 {
   3367    int m;
   3368    z->jfif = 0;
   3369    z->app14_color_transform = -1; // valid values are 0,1,2
   3370    z->marker = STBI__MARKER_none; // initialize cached marker to empty
   3371    m = stbi__get_marker(z);
   3372    if (!stbi__SOI(m)) return stbi__err("no SOI","Corrupt JPEG");
   3373    if (scan == STBI__SCAN_type) return 1;
   3374    m = stbi__get_marker(z);
   3375    while (!stbi__SOF(m)) {
   3376       if (!stbi__process_marker(z,m)) return 0;
   3377       m = stbi__get_marker(z);
   3378       while (m == STBI__MARKER_none) {
   3379          // some files have extra padding after their blocks, so ok, we'll scan
   3380          if (stbi__at_eof(z->s)) return stbi__err("no SOF", "Corrupt JPEG");
   3381          m = stbi__get_marker(z);
   3382       }
   3383    }
   3384    z->progressive = stbi__SOF_progressive(m);
   3385    if (!stbi__process_frame_header(z, scan)) return 0;
   3386    return 1;
   3387 }
   3388 
   3389 static stbi_uc stbi__skip_jpeg_junk_at_end(stbi__jpeg *j)
   3390 {
   3391    // some JPEGs have junk at end, skip over it but if we find what looks
   3392    // like a valid marker, resume there
   3393    while (!stbi__at_eof(j->s)) {
   3394       stbi_uc x = stbi__get8(j->s);
   3395       while (x == 0xff) { // might be a marker
   3396          if (stbi__at_eof(j->s)) return STBI__MARKER_none;
   3397          x = stbi__get8(j->s);
   3398          if (x != 0x00 && x != 0xff) {
   3399             // not a stuffed zero or lead-in to another marker, looks
   3400             // like an actual marker, return it
   3401             return x;
   3402          }
   3403          // stuffed zero has x=0 now which ends the loop, meaning we go
   3404          // back to regular scan loop.
   3405          // repeated 0xff keeps trying to read the next byte of the marker.
   3406       }
   3407    }
   3408    return STBI__MARKER_none;
   3409 }
   3410 
   3411 // decode image to YCbCr format
   3412 static int stbi__decode_jpeg_image(stbi__jpeg *j)
   3413 {
   3414    int m;
   3415    for (m = 0; m < 4; m++) {
   3416       j->img_comp[m].raw_data = NULL;
   3417       j->img_comp[m].raw_coeff = NULL;
   3418    }
   3419    j->restart_interval = 0;
   3420    if (!stbi__decode_jpeg_header(j, STBI__SCAN_load)) return 0;
   3421    m = stbi__get_marker(j);
   3422    while (!stbi__EOI(m)) {
   3423       if (stbi__SOS(m)) {
   3424          if (!stbi__process_scan_header(j)) return 0;
   3425          if (!stbi__parse_entropy_coded_data(j)) return 0;
   3426          if (j->marker == STBI__MARKER_none ) {
   3427             j->marker = stbi__skip_jpeg_junk_at_end(j);
   3428             // if we reach eof without hitting a marker, stbi__get_marker() below will fail and we'll eventually return 0
   3429          }
   3430          m = stbi__get_marker(j);
   3431          if (STBI__RESTART(m))
   3432             m = stbi__get_marker(j);
   3433       } else if (stbi__DNL(m)) {
   3434          int Ld = stbi__get16be(j->s);
   3435          stbi__uint32 NL = stbi__get16be(j->s);
   3436          if (Ld != 4) return stbi__err("bad DNL len", "Corrupt JPEG");
   3437          if (NL != j->s->img_y) return stbi__err("bad DNL height", "Corrupt JPEG");
   3438          m = stbi__get_marker(j);
   3439       } else {
   3440          if (!stbi__process_marker(j, m)) return 1;
   3441          m = stbi__get_marker(j);
   3442       }
   3443    }
   3444    if (j->progressive)
   3445       stbi__jpeg_finish(j);
   3446    return 1;
   3447 }
   3448 
   3449 // static jfif-centered resampling (across block boundaries)
   3450 
   3451 typedef stbi_uc *(*resample_row_func)(stbi_uc *out, stbi_uc *in0, stbi_uc *in1,
   3452                                     int w, int hs);
   3453 
   3454 #define stbi__div4(x) ((stbi_uc) ((x) >> 2))
   3455 
   3456 static stbi_uc *resample_row_1(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
   3457 {
   3458    STBI_NOTUSED(out);
   3459    STBI_NOTUSED(in_far);
   3460    STBI_NOTUSED(w);
   3461    STBI_NOTUSED(hs);
   3462    return in_near;
   3463 }
   3464 
   3465 static stbi_uc* stbi__resample_row_v_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
   3466 {
   3467    // need to generate two samples vertically for every one in input
   3468    int i;
   3469    STBI_NOTUSED(hs);
   3470    for (i=0; i < w; ++i)
   3471       out[i] = stbi__div4(3*in_near[i] + in_far[i] + 2);
   3472    return out;
   3473 }
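
        // Illustrative sketch (not part of stb_image): the per-sample blend used
        // by stbi__resample_row_v_2, written out for a single pair of samples.
        // The nearer row gets weight 3, the farther row weight 1, and the +2
        // rounds before the divide by 4. The name example_blend_3_1 is
        // hypothetical.
        static unsigned char example_blend_3_1(unsigned char near_sample, unsigned char far_sample)
        {
           return (unsigned char) ((3*near_sample + far_sample + 2) >> 2);
        }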
   3474 
   3475 static stbi_uc*  stbi__resample_row_h_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
   3476 {
   3477    // need to generate two samples horizontally for every one in input
   3478    int i;
   3479    stbi_uc *input = in_near;
   3480 
   3481    if (w == 1) {
   3482       // if only one sample, can't do any interpolation
   3483       out[0] = out[1] = input[0];
   3484       return out;
   3485    }
   3486 
   3487    out[0] = input[0];
   3488    out[1] = stbi__div4(input[0]*3 + input[1] + 2);
   3489    for (i=1; i < w-1; ++i) {
   3490       int n = 3*input[i]+2;
   3491       out[i*2+0] = stbi__div4(n+input[i-1]);
   3492       out[i*2+1] = stbi__div4(n+input[i+1]);
   3493    }
   3494    out[i*2+0] = stbi__div4(input[w-2]*3 + input[w-1] + 2);
   3495    out[i*2+1] = input[w-1];
   3496 
   3497    STBI_NOTUSED(in_far);
   3498    STBI_NOTUSED(hs);
   3499 
   3500    return out;
   3501 }
   3502 
   3503 #define stbi__div16(x) ((stbi_uc) ((x) >> 4))
   3504 
   3505 static stbi_uc *stbi__resample_row_hv_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
   3506 {
   3507    // need to generate 2x2 samples for every one in input
   3508    int i,t0,t1;
   3509    if (w == 1) {
   3510       out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
   3511       return out;
   3512    }
   3513 
   3514    t1 = 3*in_near[0] + in_far[0];
   3515    out[0] = stbi__div4(t1+2);
   3516    for (i=1; i < w; ++i) {
   3517       t0 = t1;
   3518       t1 = 3*in_near[i]+in_far[i];
   3519       out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
   3520       out[i*2  ] = stbi__div16(3*t1 + t0 + 8);
   3521    }
   3522    out[w*2-1] = stbi__div4(t1+2);
   3523 
   3524    STBI_NOTUSED(hs);
   3525 
   3526    return out;
   3527 }
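
        // Illustrative sketch (not part of stb_image): one interior output
        // sample of stbi__resample_row_hv_2. Applying the 3:1 blend vertically
        // and then horizontally gives effective weights 9:3:3:1 over sixteen,
        // with +8 for rounding. The name example_blend_9_3_3_1 and its
        // parameter names are hypothetical; "cur" is the column nearest the
        // output sample and "adj" is the neighboring column.
        static unsigned char example_blend_9_3_3_1(int near_cur, int far_cur, int near_adj, int far_adj)
        {
           int t_cur = 3*near_cur + far_cur;   // vertical pass, nearest column
           int t_adj = 3*near_adj + far_adj;   // vertical pass, neighboring column
           return (unsigned char) ((3*t_cur + t_adj + 8) >> 4);
        }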
   3528 
   3529 #if defined(STBI_SSE2) || defined(STBI_NEON)
   3530 static stbi_uc *stbi__resample_row_hv_2_simd(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
   3531 {
   3532    // need to generate 2x2 samples for every one in input
   3533    int i=0,t0,t1;
   3534 
   3535    if (w == 1) {
   3536       out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
   3537       return out;
   3538    }
   3539 
   3540    t1 = 3*in_near[0] + in_far[0];
   3541    // process groups of 8 pixels for as long as we can.
   3542    // note we can't handle the last pixel in a row in this loop
   3543    // because we need to handle the filter boundary conditions.
   3544    for (; i < ((w-1) & ~7); i += 8) {
   3545 #if defined(STBI_SSE2)
   3546       // load and perform the vertical filtering pass
   3547       // this uses 3*x + y = 4*x + (y - x)
   3548       __m128i zero  = _mm_setzero_si128();
   3549       __m128i farb  = _mm_loadl_epi64((__m128i *) (in_far + i));
   3550       __m128i nearb = _mm_loadl_epi64((__m128i *) (in_near + i));
   3551       __m128i farw  = _mm_unpacklo_epi8(farb, zero);
   3552       __m128i nearw = _mm_unpacklo_epi8(nearb, zero);
   3553       __m128i diff  = _mm_sub_epi16(farw, nearw);
   3554       __m128i nears = _mm_slli_epi16(nearw, 2);
   3555       __m128i curr  = _mm_add_epi16(nears, diff); // current row
   3556 
   3557       // horizontal filter works the same based on shifted vers of current
   3558       // row. "prev" is current row shifted right by 1 pixel; we need to
   3559       // insert the previous pixel value (from t1).
   3560       // "next" is current row shifted left by 1 pixel, with first pixel
   3561       // of next block of 8 pixels added in.
   3562       __m128i prv0 = _mm_slli_si128(curr, 2);
   3563       __m128i nxt0 = _mm_srli_si128(curr, 2);
   3564       __m128i prev = _mm_insert_epi16(prv0, t1, 0);
   3565       __m128i next = _mm_insert_epi16(nxt0, 3*in_near[i+8] + in_far[i+8], 7);
   3566 
   3567       // horizontal filter, polyphase implementation since it's convenient:
   3568       // even pixels = 3*cur + prev = cur*4 + (prev - cur)
   3569       // odd  pixels = 3*cur + next = cur*4 + (next - cur)
   3570       // note the shared term.
   3571       __m128i bias  = _mm_set1_epi16(8);
   3572       __m128i curs = _mm_slli_epi16(curr, 2);
   3573       __m128i prvd = _mm_sub_epi16(prev, curr);
   3574       __m128i nxtd = _mm_sub_epi16(next, curr);
   3575       __m128i curb = _mm_add_epi16(curs, bias);
   3576       __m128i even = _mm_add_epi16(prvd, curb);
   3577       __m128i odd  = _mm_add_epi16(nxtd, curb);
   3578 
   3579       // interleave even and odd pixels, then undo scaling.
   3580       __m128i int0 = _mm_unpacklo_epi16(even, odd);
   3581       __m128i int1 = _mm_unpackhi_epi16(even, odd);
   3582       __m128i de0  = _mm_srli_epi16(int0, 4);
   3583       __m128i de1  = _mm_srli_epi16(int1, 4);
   3584 
   3585       // pack and write output
   3586       __m128i outv = _mm_packus_epi16(de0, de1);
   3587       _mm_storeu_si128((__m128i *) (out + i*2), outv);
   3588 #elif defined(STBI_NEON)
   3589       // load and perform the vertical filtering pass
   3590       // this uses 3*x + y = 4*x + (y - x)
   3591       uint8x8_t farb  = vld1_u8(in_far + i);
   3592       uint8x8_t nearb = vld1_u8(in_near + i);
   3593       int16x8_t diff  = vreinterpretq_s16_u16(vsubl_u8(farb, nearb));
   3594       int16x8_t nears = vreinterpretq_s16_u16(vshll_n_u8(nearb, 2));
   3595       int16x8_t curr  = vaddq_s16(nears, diff); // current row
   3596 
   3597       // horizontal filter works the same based on shifted vers of current
   3598       // row. "prev" is current row shifted right by 1 pixel; we need to
   3599       // insert the previous pixel value (from t1).
   3600       // "next" is current row shifted left by 1 pixel, with first pixel
   3601       // of next block of 8 pixels added in.
   3602       int16x8_t prv0 = vextq_s16(curr, curr, 7);
   3603       int16x8_t nxt0 = vextq_s16(curr, curr, 1);
   3604       int16x8_t prev = vsetq_lane_s16(t1, prv0, 0);
   3605       int16x8_t next = vsetq_lane_s16(3*in_near[i+8] + in_far[i+8], nxt0, 7);
   3606 
   3607       // horizontal filter, polyphase implementation since it's convenient:
   3608       // even pixels = 3*cur + prev = cur*4 + (prev - cur)
   3609       // odd  pixels = 3*cur + next = cur*4 + (next - cur)
   3610       // note the shared term.
   3611       int16x8_t curs = vshlq_n_s16(curr, 2);
   3612       int16x8_t prvd = vsubq_s16(prev, curr);
   3613       int16x8_t nxtd = vsubq_s16(next, curr);
   3614       int16x8_t even = vaddq_s16(curs, prvd);
   3615       int16x8_t odd  = vaddq_s16(curs, nxtd);
   3616 
   3617       // undo scaling and round, then store with even/odd phases interleaved
   3618       uint8x8x2_t o;
   3619       o.val[0] = vqrshrun_n_s16(even, 4);
   3620       o.val[1] = vqrshrun_n_s16(odd,  4);
   3621       vst2_u8(out + i*2, o);
   3622 #endif
   3623 
   3624       // "previous" value for next iter
   3625       t1 = 3*in_near[i+7] + in_far[i+7];
   3626    }
   3627 
   3628    t0 = t1;
   3629    t1 = 3*in_near[i] + in_far[i];
   3630    out[i*2] = stbi__div16(3*t1 + t0 + 8);
   3631 
   3632    for (++i; i < w; ++i) {
   3633       t0 = t1;
   3634       t1 = 3*in_near[i]+in_far[i];
   3635       out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
   3636       out[i*2  ] = stbi__div16(3*t1 + t0 + 8);
   3637    }
   3638    out[w*2-1] = stbi__div4(t1+2);
   3639 
   3640    STBI_NOTUSED(hs);
   3641 
   3642    return out;
   3643 }
   3644 #endif
   3645 
   3646 static stbi_uc *stbi__resample_row_generic(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
   3647 {
   3648    // resample with nearest-neighbor
   3649    int i,j;
   3650    STBI_NOTUSED(in_far);
   3651    for (i=0; i < w; ++i)
   3652       for (j=0; j < hs; ++j)
   3653          out[i*hs+j] = in_near[i];
   3654    return out;
   3655 }
   3656 
   3657 // this is a reduced-precision calculation of YCbCr-to-RGB introduced
   3658 // to make sure the code produces the same results in both SIMD and scalar
   3659 #define stbi__float2fixed(x)  (((int) ((x) * 4096.0f + 0.5f)) << 8)
   3660 static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
   3661 {
   3662    int i;
   3663    for (i=0; i < count; ++i) {
   3664       int y_fixed = (y[i] << 20) + (1<<19); // rounding
   3665       int r,g,b;
   3666       int cr = pcr[i] - 128;
   3667       int cb = pcb[i] - 128;
   3668       r = y_fixed +  cr* stbi__float2fixed(1.40200f);
   3669       g = y_fixed + (cr*-stbi__float2fixed(0.71414f)) + ((cb*-stbi__float2fixed(0.34414f)) & 0xffff0000);
   3670       b = y_fixed                                     +   cb* stbi__float2fixed(1.77200f);
   3671       r >>= 20;
   3672       g >>= 20;
   3673       b >>= 20;
   3674       if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
   3675       if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
   3676       if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
   3677       out[0] = (stbi_uc)r;
   3678       out[1] = (stbi_uc)g;
   3679       out[2] = (stbi_uc)b;
   3680       out[3] = 255;
   3681       out += step;
   3682    }
   3683 }
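        // Worked example (illustrative, computed by hand from the code above):
        // stbi__float2fixed scales by 4096 (2^12) and then shifts left by 8,
        // so its constants carry 20 fractional bits, matching y[i] << 20.
        //   stbi__float2fixed(1.40200f) = 5743 << 8 = 1470208
        // neutral chroma, y = 128, pcb = pcr = 128 (so cb = cr = 0):
        //   y_fixed = (128 << 20) + (1 << 19) = 134742016
        //   r = g = b = 134742016 >> 20 = 128
        // strong red, y = 128, pcr = 255 (so cr = 127):
        //   r = 134742016 + 127*1470208 = 321458432;  r >> 20 = 306 -> clamped to 255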
   3684 
   3685 #if defined(STBI_SSE2) || defined(STBI_NEON)
   3686 static void stbi__YCbCr_to_RGB_simd(stbi_uc *out, stbi_uc const *y, stbi_uc const *pcb, stbi_uc const *pcr, int count, int step)
   3687 {
   3688    int i = 0;
   3689 
   3690 #ifdef STBI_SSE2
   3691    // step == 3 is pretty ugly on the final interleave, and i'm not convinced
   3692    // it's useful in practice (you wouldn't use it for textures, for example).
   3693    // so just accelerate step == 4 case.
   3694    if (step == 4) {
   3695       // this is a fairly straightforward implementation and not super-optimized.
   3696       __m128i signflip  = _mm_set1_epi8(-0x80);
   3697       __m128i cr_const0 = _mm_set1_epi16(   (short) ( 1.40200f*4096.0f+0.5f));
   3698       __m128i cr_const1 = _mm_set1_epi16( - (short) ( 0.71414f*4096.0f+0.5f));
   3699       __m128i cb_const0 = _mm_set1_epi16( - (short) ( 0.34414f*4096.0f+0.5f));
   3700       __m128i cb_const1 = _mm_set1_epi16(   (short) ( 1.77200f*4096.0f+0.5f));
   3701       __m128i y_bias = _mm_set1_epi8((char) (unsigned char) 128);
   3702       __m128i xw = _mm_set1_epi16(255); // alpha channel
   3703 
   3704       for (; i+7 < count; i += 8) {
   3705          // load
   3706          __m128i y_bytes = _mm_loadl_epi64((__m128i *) (y+i));
   3707          __m128i cr_bytes = _mm_loadl_epi64((__m128i *) (pcr+i));
   3708          __m128i cb_bytes = _mm_loadl_epi64((__m128i *) (pcb+i));
   3709          __m128i cr_biased = _mm_xor_si128(cr_bytes, signflip); // -128
   3710          __m128i cb_biased = _mm_xor_si128(cb_bytes, signflip); // -128
   3711 
   3712          // unpack to short (and left-shift cr, cb by 8)
   3713          __m128i yw  = _mm_unpacklo_epi8(y_bias, y_bytes);
   3714          __m128i crw = _mm_unpacklo_epi8(_mm_setzero_si128(), cr_biased);
   3715          __m128i cbw = _mm_unpacklo_epi8(_mm_setzero_si128(), cb_biased);
   3716 
   3717          // color transform
   3718          __m128i yws = _mm_srli_epi16(yw, 4);
   3719          __m128i cr0 = _mm_mulhi_epi16(cr_const0, crw);
   3720          __m128i cb0 = _mm_mulhi_epi16(cb_const0, cbw);
   3721          __m128i cb1 = _mm_mulhi_epi16(cbw, cb_const1);
   3722          __m128i cr1 = _mm_mulhi_epi16(crw, cr_const1);
   3723          __m128i rws = _mm_add_epi16(cr0, yws);
   3724          __m128i gwt = _mm_add_epi16(cb0, yws);
   3725          __m128i bws = _mm_add_epi16(yws, cb1);
   3726          __m128i gws = _mm_add_epi16(gwt, cr1);
   3727 
   3728          // descale
   3729          __m128i rw = _mm_srai_epi16(rws, 4);
   3730          __m128i bw = _mm_srai_epi16(bws, 4);
   3731          __m128i gw = _mm_srai_epi16(gws, 4);
   3732 
   3733          // back to byte, set up for transpose
   3734          __m128i brb = _mm_packus_epi16(rw, bw);
   3735          __m128i gxb = _mm_packus_epi16(gw, xw);
   3736 
   3737          // transpose to interleave channels
   3738          __m128i t0 = _mm_unpacklo_epi8(brb, gxb);
   3739          __m128i t1 = _mm_unpackhi_epi8(brb, gxb);
   3740          __m128i o0 = _mm_unpacklo_epi16(t0, t1);
   3741          __m128i o1 = _mm_unpackhi_epi16(t0, t1);
   3742 
   3743          // store
   3744          _mm_storeu_si128((__m128i *) (out + 0), o0);
   3745          _mm_storeu_si128((__m128i *) (out + 16), o1);
   3746          out += 32;
   3747       }
   3748    }
   3749 #endif
   3750 
   3751 #ifdef STBI_NEON
   3752    // in this version, step=3 support would be easy to add. but is there demand?
   3753    if (step == 4) {
   3754       // this is a fairly straightforward implementation and not super-optimized.
   3755       uint8x8_t signflip = vdup_n_u8(0x80);
   3756       int16x8_t cr_const0 = vdupq_n_s16(   (short) ( 1.40200f*4096.0f+0.5f));
   3757       int16x8_t cr_const1 = vdupq_n_s16( - (short) ( 0.71414f*4096.0f+0.5f));
   3758       int16x8_t cb_const0 = vdupq_n_s16( - (short) ( 0.34414f*4096.0f+0.5f));
   3759       int16x8_t cb_const1 = vdupq_n_s16(   (short) ( 1.77200f*4096.0f+0.5f));
   3760 
   3761       for (; i+7 < count; i += 8) {
   3762          // load
   3763          uint8x8_t y_bytes  = vld1_u8(y + i);
   3764          uint8x8_t cr_bytes = vld1_u8(pcr + i);
   3765          uint8x8_t cb_bytes = vld1_u8(pcb + i);
   3766          int8x8_t cr_biased = vreinterpret_s8_u8(vsub_u8(cr_bytes, signflip));
   3767          int8x8_t cb_biased = vreinterpret_s8_u8(vsub_u8(cb_bytes, signflip));
   3768 
   3769          // expand to s16
   3770          int16x8_t yws = vreinterpretq_s16_u16(vshll_n_u8(y_bytes, 4));
   3771          int16x8_t crw = vshll_n_s8(cr_biased, 7);
   3772          int16x8_t cbw = vshll_n_s8(cb_biased, 7);
   3773 
   3774          // color transform
   3775          int16x8_t cr0 = vqdmulhq_s16(crw, cr_const0);
   3776          int16x8_t cb0 = vqdmulhq_s16(cbw, cb_const0);
   3777          int16x8_t cr1 = vqdmulhq_s16(crw, cr_const1);
   3778          int16x8_t cb1 = vqdmulhq_s16(cbw, cb_const1);
   3779          int16x8_t rws = vaddq_s16(yws, cr0);
   3780          int16x8_t gws = vaddq_s16(vaddq_s16(yws, cb0), cr1);
   3781          int16x8_t bws = vaddq_s16(yws, cb1);
   3782 
   3783          // undo scaling, round, convert to byte
   3784          uint8x8x4_t o;
   3785          o.val[0] = vqrshrun_n_s16(rws, 4);
   3786          o.val[1] = vqrshrun_n_s16(gws, 4);
   3787          o.val[2] = vqrshrun_n_s16(bws, 4);
   3788          o.val[3] = vdup_n_u8(255);
   3789 
   3790          // store, interleaving r/g/b/a
   3791          vst4_u8(out, o);
   3792          out += 8*4;
   3793       }
   3794    }
   3795 #endif
   3796 
   3797    for (; i < count; ++i) {
   3798       int y_fixed = (y[i] << 20) + (1<<19); // rounding
   3799       int r,g,b;
   3800       int cr = pcr[i] - 128;
   3801       int cb = pcb[i] - 128;
   3802       r = y_fixed + cr* stbi__float2fixed(1.40200f);
   3803       g = y_fixed + cr*-stbi__float2fixed(0.71414f) + ((cb*-stbi__float2fixed(0.34414f)) & 0xffff0000);
   3804       b = y_fixed                                   +   cb* stbi__float2fixed(1.77200f);
   3805       r >>= 20;
   3806       g >>= 20;
   3807       b >>= 20;
   3808       if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
   3809       if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
   3810       if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
   3811       out[0] = (stbi_uc)r;
   3812       out[1] = (stbi_uc)g;
   3813       out[2] = (stbi_uc)b;
   3814       out[3] = 255;
   3815       out += step;
   3816    }
   3817 }
   3818 #endif
   3819 
   3820 // set up the kernels
   3821 static void stbi__setup_jpeg(stbi__jpeg *j)
   3822 {
   3823    j->idct_block_kernel = stbi__idct_block;
   3824    j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_row;
   3825    j->resample_row_hv_2_kernel = stbi__resample_row_hv_2;
   3826 
   3827 #ifdef STBI_SSE2
   3828    if (stbi__sse2_available()) {
   3829       j->idct_block_kernel = stbi__idct_simd;
   3830       j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
   3831       j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
   3832    }
   3833 #endif
   3834 
   3835 #ifdef STBI_NEON
   3836    j->idct_block_kernel = stbi__idct_simd;
   3837    j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
   3838    j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
   3839 #endif
   3840 }
   3841 
   3842 // clean up the temporary component buffers
   3843 static void stbi__cleanup_jpeg(stbi__jpeg *j)
   3844 {
   3845    stbi__free_jpeg_components(j, j->s->img_n, 0);
   3846 }
   3847 
   3848 typedef struct
   3849 {
   3850    resample_row_func resample;
   3851    stbi_uc *line0,*line1;
   3852    int hs,vs;   // expansion factor in each axis
   3853    int w_lores; // horizontal pixels pre-expansion
   3854    int ystep;   // how far through vertical expansion we are
   3855    int ypos;    // which pre-expansion row we're on
   3856 } stbi__resample;
   3857 
   3858 // fast 0..255 * 0..255 => 0..255 rounded multiplication
   3859 static stbi_uc stbi__blinn_8x8(stbi_uc x, stbi_uc y)
   3860 {
   3861    unsigned int t = x*y + 128;
   3862    return (stbi_uc) ((t + (t >>8)) >> 8);
   3863 }
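        // Worked examples (illustrative, computed by hand from the code above);
        // the result approximates round(x*y/255) without a division:
        //   x=255, y=255: t = 65025+128 = 65153; (65153 + (65153>>8)) >> 8 = 65407 >> 8 = 255
        //   x=128, y=255: t = 32640+128 = 32768; (32768 + 128) >> 8        = 32896 >> 8 = 128
        //   x=100, y=200: t = 20000+128 = 20128; (20128 + 78) >> 8         = 20206 >> 8 = 78
        //                 (exact value: 20000/255 = 78.43)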
   3864 
   3865 static stbi_uc *load_jpeg_image(stbi__jpeg *z, int *out_x, int *out_y, int *comp, int req_comp)
   3866 {
   3867    int n, decode_n, is_rgb;
   3868    z->s->img_n = 0; // make stbi__cleanup_jpeg safe
   3869 
   3870    // validate req_comp
   3871    if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
   3872 
   3873    // load a jpeg image from whichever source, but leave in YCbCr format
   3874    if (!stbi__decode_jpeg_image(z)) { stbi__cleanup_jpeg(z); return NULL; }
   3875 
   3876    // determine actual number of components to generate
   3877    n = req_comp ? req_comp : z->s->img_n >= 3 ? 3 : 1;
   3878 
   3879    is_rgb = z->s->img_n == 3 && (z->rgb == 3 || (z->app14_color_transform == 0 && !z->jfif));
   3880 
   3881    if (z->s->img_n == 3 && n < 3 && !is_rgb)
   3882       decode_n = 1;
   3883    else
   3884       decode_n = z->s->img_n;
   3885 
   3886    // nothing to do if no components requested; check this now to avoid
   3887    // accessing uninitialized coutput[0] later
   3888    if (decode_n <= 0) { stbi__cleanup_jpeg(z); return NULL; }
   3889 
   3890    // resample and color-convert
   3891    {
   3892       int k;
   3893       unsigned int i,j;
   3894       stbi_uc *output;
   3895       stbi_uc *coutput[4] = { NULL, NULL, NULL, NULL };
   3896 
   3897       stbi__resample res_comp[4];
   3898 
   3899       for (k=0; k < decode_n; ++k) {
   3900          stbi__resample *r = &res_comp[k];
   3901 
   3902          // allocate line buffer big enough for upsampling off the edges
   3903          // with upsample factor of 4
   3904          z->img_comp[k].linebuf = (stbi_uc *) stbi__malloc(z->s->img_x + 3);
   3905          if (!z->img_comp[k].linebuf) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
   3906 
   3907          r->hs      = z->img_h_max / z->img_comp[k].h;
   3908          r->vs      = z->img_v_max / z->img_comp[k].v;
   3909          r->ystep   = r->vs >> 1;
   3910          r->w_lores = (z->s->img_x + r->hs-1) / r->hs;
   3911          r->ypos    = 0;
   3912          r->line0   = r->line1 = z->img_comp[k].data;
   3913 
   3914          if      (r->hs == 1 && r->vs == 1) r->resample = resample_row_1;
   3915          else if (r->hs == 1 && r->vs == 2) r->resample = stbi__resample_row_v_2;
   3916          else if (r->hs == 2 && r->vs == 1) r->resample = stbi__resample_row_h_2;
   3917          else if (r->hs == 2 && r->vs == 2) r->resample = z->resample_row_hv_2_kernel;
   3918          else                               r->resample = stbi__resample_row_generic;
   3919       }
   3920 
   3921       // can't error after this, so this is safe
   3922       output = (stbi_uc *) stbi__malloc_mad3(n, z->s->img_x, z->s->img_y, 1);
   3923       if (!output) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
   3924 
   3925       // now go ahead and resample
   3926       for (j=0; j < z->s->img_y; ++j) {
   3927          stbi_uc *out = output + n * z->s->img_x * j;
   3928          for (k=0; k < decode_n; ++k) {
   3929             stbi__resample *r = &res_comp[k];
   3930             int y_bot = r->ystep >= (r->vs >> 1);
   3931             coutput[k] = r->resample(z->img_comp[k].linebuf,
   3932                                      y_bot ? r->line1 : r->line0,
   3933                                      y_bot ? r->line0 : r->line1,
   3934                                      r->w_lores, r->hs);
   3935             if (++r->ystep >= r->vs) {
   3936                r->ystep = 0;
   3937                r->line0 = r->line1;
   3938                if (++r->ypos < z->img_comp[k].y)
   3939                   r->line1 += z->img_comp[k].w2;
   3940             }
   3941          }
   3942          if (n >= 3) {
   3943             stbi_uc *y = coutput[0];
   3944             if (z->s->img_n == 3) {
   3945                if (is_rgb) {
   3946                   for (i=0; i < z->s->img_x; ++i) {
   3947                      out[0] = y[i];
   3948                      out[1] = coutput[1][i];
   3949                      out[2] = coutput[2][i];
   3950                      out[3] = 255;
   3951                      out += n;
   3952                   }
   3953                } else {
   3954                   z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
   3955                }
   3956             } else if (z->s->img_n == 4) {
   3957                if (z->app14_color_transform == 0) { // CMYK
   3958                   for (i=0; i < z->s->img_x; ++i) {
   3959                      stbi_uc m = coutput[3][i];
   3960                      out[0] = stbi__blinn_8x8(coutput[0][i], m);
   3961                      out[1] = stbi__blinn_8x8(coutput[1][i], m);
   3962                      out[2] = stbi__blinn_8x8(coutput[2][i], m);
   3963                      out[3] = 255;
   3964                      out += n;
   3965                   }
   3966                } else if (z->app14_color_transform == 2) { // YCCK
   3967                   z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
   3968                   for (i=0; i < z->s->img_x; ++i) {
   3969                      stbi_uc m = coutput[3][i];
   3970                      out[0] = stbi__blinn_8x8(255 - out[0], m);
   3971                      out[1] = stbi__blinn_8x8(255 - out[1], m);
   3972                      out[2] = stbi__blinn_8x8(255 - out[2], m);
   3973                      out += n;
   3974                   }
   3975                } else { // YCbCr + alpha?  Ignore the fourth channel for now
   3976                   z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
   3977                }
   3978             } else
   3979                for (i=0; i < z->s->img_x; ++i) {
   3980                   out[0] = out[1] = out[2] = y[i];
   3981                   out[3] = 255; // not used if n==3
   3982                   out += n;
   3983                }
   3984          } else {
   3985             if (is_rgb) {
   3986                if (n == 1)
   3987                   for (i=0; i < z->s->img_x; ++i)
   3988                      *out++ = stbi__compute_y(coutput[0][i], coutput[1][i], coutput[2][i]);
   3989                else {
   3990                   for (i=0; i < z->s->img_x; ++i, out += 2) {
   3991                      out[0] = stbi__compute_y(coutput[0][i], coutput[1][i], coutput[2][i]);
   3992                      out[1] = 255;
   3993                   }
   3994                }
   3995             } else if (z->s->img_n == 4 && z->app14_color_transform == 0) {
   3996                for (i=0; i < z->s->img_x; ++i) {
   3997                   stbi_uc m = coutput[3][i];
   3998                   stbi_uc r = stbi__blinn_8x8(coutput[0][i], m);
   3999                   stbi_uc g = stbi__blinn_8x8(coutput[1][i], m);
   4000                   stbi_uc b = stbi__blinn_8x8(coutput[2][i], m);
   4001                   out[0] = stbi__compute_y(r, g, b);
   4002                   out[1] = 255;
   4003                   out += n;
   4004                }
   4005             } else if (z->s->img_n == 4 && z->app14_color_transform == 2) {
   4006                for (i=0; i < z->s->img_x; ++i) {
   4007                   out[0] = stbi__blinn_8x8(255 - coutput[0][i], coutput[3][i]);
   4008                   out[1] = 255;
   4009                   out += n;
   4010                }
   4011             } else {
   4012                stbi_uc *y = coutput[0];
   4013                if (n == 1)
   4014                   for (i=0; i < z->s->img_x; ++i) out[i] = y[i];
   4015                else
   4016                   for (i=0; i < z->s->img_x; ++i) { *out++ = y[i]; *out++ = 255; }
   4017             }
   4018          }
   4019       }
   4020       stbi__cleanup_jpeg(z);
   4021       *out_x = z->s->img_x;
   4022       *out_y = z->s->img_y;
   4023       if (comp) *comp = z->s->img_n >= 3 ? 3 : 1; // report original components, not output
   4024       return output;
   4025    }
   4026 }
   4027 
   4028 static void *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
   4029 {
   4030    unsigned char* result;
   4031    stbi__jpeg* j = (stbi__jpeg*) stbi__malloc(sizeof(stbi__jpeg));
   4032    if (!j) return stbi__errpuc("outofmem", "Out of memory");
   4033    memset(j, 0, sizeof(stbi__jpeg));
   4034    STBI_NOTUSED(ri);
   4035    j->s = s;
   4036    stbi__setup_jpeg(j);
   4037    result = load_jpeg_image(j, x,y,comp,req_comp);
   4038    STBI_FREE(j);
   4039    return result;
   4040 }
   4041 
   4042 static int stbi__jpeg_test(stbi__context *s)
   4043 {
   4044    int r;
   4045    stbi__jpeg* j = (stbi__jpeg*)stbi__malloc(sizeof(stbi__jpeg));
   4046    if (!j) return stbi__err("outofmem", "Out of memory");
   4047    memset(j, 0, sizeof(stbi__jpeg));
   4048    j->s = s;
   4049    stbi__setup_jpeg(j);
   4050    r = stbi__decode_jpeg_header(j, STBI__SCAN_type);
   4051    stbi__rewind(s);
   4052    STBI_FREE(j);
   4053    return r;
   4054 }
   4055 
   4056 static int stbi__jpeg_info_raw(stbi__jpeg *j, int *x, int *y, int *comp)
   4057 {
   4058    if (!stbi__decode_jpeg_header(j, STBI__SCAN_header)) {
   4059       stbi__rewind( j->s );
   4060       return 0;
   4061    }
   4062    if (x) *x = j->s->img_x;
   4063    if (y) *y = j->s->img_y;
   4064    if (comp) *comp = j->s->img_n >= 3 ? 3 : 1;
   4065    return 1;
   4066 }
   4067 
   4068 static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp)
   4069 {
   4070    int result;
   4071    stbi__jpeg* j = (stbi__jpeg*) (stbi__malloc(sizeof(stbi__jpeg)));
   4072    if (!j) return stbi__err("outofmem", "Out of memory");
   4073    memset(j, 0, sizeof(stbi__jpeg));
   4074    j->s = s;
   4075    result = stbi__jpeg_info_raw(j, x, y, comp);
   4076    STBI_FREE(j);
   4077    return result;
   4078 }
   4079 #endif
   4080 
   4081 // public domain zlib decode    v0.2  Sean Barrett 2006-11-18
   4082 //    simple implementation
   4083 //      - all input must be provided in an upfront buffer
   4084 //      - all output is written to a single output buffer (can malloc/realloc)
   4085 //    performance
   4086 //      - fast huffman
   4087 
   4088 #ifndef STBI_NO_ZLIB
   4089 
   4090 // fast-way is faster to check than jpeg huffman, but slow way is slower
   4091 #define STBI__ZFAST_BITS  9 // accelerate all cases in default tables
   4092 #define STBI__ZFAST_MASK  ((1 << STBI__ZFAST_BITS) - 1)
   4093 #define STBI__ZNSYMS 288 // number of symbols in literal/length alphabet
   4094 
   4095 // zlib-style huffman encoding
   4096 // (jpeg packs from left, zlib from right, so can't share code)
   4097 typedef struct
   4098 {
   4099    stbi__uint16 fast[1 << STBI__ZFAST_BITS];
   4100    stbi__uint16 firstcode[16];
   4101    int maxcode[17];
   4102    stbi__uint16 firstsymbol[16];
   4103    stbi_uc  size[STBI__ZNSYMS];
   4104    stbi__uint16 value[STBI__ZNSYMS];
   4105 } stbi__zhuffman;
   4106 
   4107 stbi_inline static int stbi__bitreverse16(int n)
   4108 {
   4109   n = ((n & 0xAAAA) >>  1) | ((n & 0x5555) << 1);
   4110   n = ((n & 0xCCCC) >>  2) | ((n & 0x3333) << 2);
   4111   n = ((n & 0xF0F0) >>  4) | ((n & 0x0F0F) << 4);
   4112   n = ((n & 0xFF00) >>  8) | ((n & 0x00FF) << 8);
   4113   return n;
   4114 }
   4115 
   4116 stbi_inline static int stbi__bit_reverse(int v, int bits)
   4117 {
   4118    STBI_ASSERT(bits <= 16);
   4119    // to bit reverse n bits, reverse 16 and shift
   4120    // e.g. 11 bits, bit reverse and shift away 5
   4121    return stbi__bitreverse16(v) >> (16-bits);
   4122 }
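        // Worked example (illustrative, computed by hand from the code above):
        //   v = 6 (binary 110), bits = 3
        //   stbi__bitreverse16(0x0006) = 0x6000
        //   0x6000 >> (16-3) = 3 (binary 011), i.e. the low 3 bits of v reversed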
   4123 
   4124 static int stbi__zbuild_huffman(stbi__zhuffman *z, const stbi_uc *sizelist, int num)
   4125 {
   4126    int i,k=0;
   4127    int code, next_code[16], sizes[17];
   4128 
   4129    // DEFLATE spec for generating codes
   4130    memset(sizes, 0, sizeof(sizes));
   4131    memset(z->fast, 0, sizeof(z->fast));
   4132    for (i=0; i < num; ++i)
   4133       ++sizes[sizelist[i]];
   4134    sizes[0] = 0;
   4135    for (i=1; i < 16; ++i)
   4136       if (sizes[i] > (1 << i))
   4137          return stbi__err("bad sizes", "Corrupt PNG");
   4138    code = 0;
   4139    for (i=1; i < 16; ++i) {
   4140       next_code[i] = code;
   4141       z->firstcode[i] = (stbi__uint16) code;
   4142       z->firstsymbol[i] = (stbi__uint16) k;
   4143       code = (code + sizes[i]);
   4144       if (sizes[i])
   4145          if (code-1 >= (1 << i)) return stbi__err("bad codelengths","Corrupt PNG");
   4146       z->maxcode[i] = code << (16-i); // preshift for inner loop
   4147       code <<= 1;
   4148       k += sizes[i];
   4149    }
   4150    z->maxcode[16] = 0x10000; // sentinel
   4151    for (i=0; i < num; ++i) {
   4152       int s = sizelist[i];
   4153       if (s) {
   4154          int c = next_code[s] - z->firstcode[s] + z->firstsymbol[s];
   4155          stbi__uint16 fastv = (stbi__uint16) ((s << 9) | i);
   4156          z->size [c] = (stbi_uc     ) s;
   4157          z->value[c] = (stbi__uint16) i;
   4158          if (s <= STBI__ZFAST_BITS) {
   4159             int j = stbi__bit_reverse(next_code[s],s);
   4160             while (j < (1 << STBI__ZFAST_BITS)) {
   4161                z->fast[j] = fastv;
   4162                j += (1 << s);
   4163             }
   4164          }
   4165          ++next_code[s];
   4166       }
   4167    }
   4168    return 1;
   4169 }
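        // Worked example of a fast-table entry (illustrative, computed by hand):
        //   a symbol i = 65 with code length s = 7 is stored as
        //     fastv = (7 << 9) | 65 = 3584 + 65 = 3649
        //   stbi__zhuffman_decode recovers both fields from it:
        //     3649 >> 9  = 7   (code length)
        //     3649 & 511 = 65  (symbol)
        //   this packing works because symbols are below 512 (STBI__ZNSYMS is 288)
        //   and fast entries only exist for s <= STBI__ZFAST_BITS.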
   4170 
   4171 // zlib-from-memory implementation for PNG reading
   4172 //    because PNG allows splitting the zlib stream arbitrarily,
   4173 //    and it's annoying structurally to have PNG call ZLIB call PNG,
   4174 //    we require PNG read all the IDATs and combine them into a single
   4175 //    memory buffer
   4176 
   4177 typedef struct
   4178 {
   4179    stbi_uc *zbuffer, *zbuffer_end;
   4180    int num_bits;
   4181    int hit_zeof_once;
   4182    stbi__uint32 code_buffer;
   4183 
   4184    char *zout;
   4185    char *zout_start;
   4186    char *zout_end;
   4187    int   z_expandable;
   4188 
   4189    stbi__zhuffman z_length, z_distance;
   4190 } stbi__zbuf;
   4191 
   4192 stbi_inline static int stbi__zeof(stbi__zbuf *z)
   4193 {
   4194    return (z->zbuffer >= z->zbuffer_end);
   4195 }
   4196 
   4197 stbi_inline static stbi_uc stbi__zget8(stbi__zbuf *z)
   4198 {
   4199    return stbi__zeof(z) ? 0 : *z->zbuffer++;
   4200 }
   4201 
   4202 static void stbi__fill_bits(stbi__zbuf *z)
   4203 {
   4204    do {
   4205       if (z->code_buffer >= (1U << z->num_bits)) {
   4206         z->zbuffer = z->zbuffer_end;  /* treat this as EOF so we fail. */
   4207         return;
   4208       }
   4209       z->code_buffer |= (unsigned int) stbi__zget8(z) << z->num_bits;
   4210       z->num_bits += 8;
   4211    } while (z->num_bits <= 24);
   4212 }
   4213 
   4214 stbi_inline static unsigned int stbi__zreceive(stbi__zbuf *z, int n)
   4215 {
   4216    unsigned int k;
   4217    if (z->num_bits < n) stbi__fill_bits(z);
   4218    k = z->code_buffer & ((1 << n) - 1);
   4219    z->code_buffer >>= n;
   4220    z->num_bits -= n;
   4221    return k;
   4222 }
   4223 
   4224 static int stbi__zhuffman_decode_slowpath(stbi__zbuf *a, stbi__zhuffman *z)
   4225 {
   4226    int b,s,k;
   4227    // not resolved by fast table, so compute it the slow way
   4228    // use jpeg approach, which requires MSbits at top
   4229    k = stbi__bit_reverse(a->code_buffer, 16);
   4230    for (s=STBI__ZFAST_BITS+1; ; ++s)
   4231       if (k < z->maxcode[s])
   4232          break;
   4233    if (s >= 16) return -1; // invalid code!
   4234    // code size is s, so:
   4235    b = (k >> (16-s)) - z->firstcode[s] + z->firstsymbol[s];
   4236    if (b >= STBI__ZNSYMS) return -1; // some data was corrupt somewhere!
   4237    if (z->size[b] != s) return -1;  // was originally an assert, but report failure instead.
   4238    a->code_buffer >>= s;
   4239    a->num_bits -= s;
   4240    return z->value[b];
   4241 }
   4242 
   4243 stbi_inline static int stbi__zhuffman_decode(stbi__zbuf *a, stbi__zhuffman *z)
   4244 {
   4245    int b,s;
   4246    if (a->num_bits < 16) {
   4247       if (stbi__zeof(a)) {
   4248          if (!a->hit_zeof_once) {
   4249             // This is the first time we hit eof, insert 16 extra padding bits
   4250             // to allow us to keep going; if we actually consume any of them
   4251             // though, that is invalid data. This is caught later.
   4252             a->hit_zeof_once = 1;
   4253             a->num_bits += 16; // add 16 implicit zero bits
   4254          } else {
   4255             // We already inserted our extra 16 padding bits and are again
   4256             // out; this stream is actually prematurely terminated.
   4257             return -1;
   4258          }
   4259       } else {
   4260          stbi__fill_bits(a);
   4261       }
   4262    }
   4263    b = z->fast[a->code_buffer & STBI__ZFAST_MASK];
   4264    if (b) {
   4265       s = b >> 9;
   4266       a->code_buffer >>= s;
   4267       a->num_bits -= s;
   4268       return b & 511;
   4269    }
   4270    return stbi__zhuffman_decode_slowpath(a, z);
   4271 }
   4272 
   4273 static int stbi__zexpand(stbi__zbuf *z, char *zout, int n)  // need to make room for n bytes
   4274 {
   4275    char *q;
   4276    unsigned int cur, limit, old_limit;
   4277    z->zout = zout;
   4278    if (!z->z_expandable) return stbi__err("output buffer limit","Corrupt PNG");
   4279    cur   = (unsigned int) (z->zout - z->zout_start);
   4280    limit = old_limit = (unsigned) (z->zout_end - z->zout_start);
   4281    if (UINT_MAX - cur < (unsigned) n) return stbi__err("outofmem", "Out of memory");
   4282    while (cur + n > limit) {
   4283       if(limit > UINT_MAX / 2) return stbi__err("outofmem", "Out of memory");
   4284       limit *= 2;
   4285    }
   4286    q = (char *) STBI_REALLOC_SIZED(z->zout_start, old_limit, limit);
   4287    STBI_NOTUSED(old_limit);
   4288    if (q == NULL) return stbi__err("outofmem", "Out of memory");
   4289    z->zout_start = q;
   4290    z->zout       = q + cur;
   4291    z->zout_end   = q + limit;
   4292    return 1;
   4293 }
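        // Worked example of the growth policy (illustrative, values chosen by hand):
        //   cur = 100 bytes written, limit = 128 bytes allocated, n = 200 bytes needed
        //   cur + n = 300 > 128  -> limit = 256
        //   cur + n = 300 > 256  -> limit = 512
        //   cur + n = 300 <= 512 -> realloc to 512 bytes; zout resumes at q + 100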
   4294 
   4295 static const int stbi__zlength_base[31] = {
   4296    3,4,5,6,7,8,9,10,11,13,
   4297    15,17,19,23,27,31,35,43,51,59,
   4298    67,83,99,115,131,163,195,227,258,0,0 };
   4299 
   4300 static const int stbi__zlength_extra[31]=
   4301 { 0,0,0,0,0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,0,0,0 };
   4302 
   4303 static const int stbi__zdist_base[32] = { 1,2,3,4,5,7,9,13,17,25,33,49,65,97,129,193,
   4304 257,385,513,769,1025,1537,2049,3073,4097,6145,8193,12289,16385,24577,0,0};
   4305 
   4306 static const int stbi__zdist_extra[32] =
   4307 { 0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13};
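        // Worked examples of the table lookups (illustrative, computed by hand;
        // the index is the decoded symbol, minus 257 for length codes):
        //   length symbol 265 -> index 8:  base 11,  1 extra bit  -> lengths 11..12
        //   length symbol 285 -> index 28: base 258, 0 extra bits -> length 258
        //   distance code 4:   base 5,     1 extra bit            -> distances 5..6
        //   distance code 29:  base 24577, 13 extra bits          -> distances 24577..32768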
   4308 
   4309 static int stbi__parse_huffman_block(stbi__zbuf *a)
   4310 {
   4311    char *zout = a->zout;
   4312    for(;;) {
   4313       int z = stbi__zhuffman_decode(a, &a->z_length);
   4314       if (z < 256) {
   4315          if (z < 0) return stbi__err("bad huffman code","Corrupt PNG"); // error in huffman codes
   4316          if (zout >= a->zout_end) {
   4317             if (!stbi__zexpand(a, zout, 1)) return 0;
   4318             zout = a->zout;
   4319          }
   4320          *zout++ = (char) z;
   4321       } else {
   4322          stbi_uc *p;
   4323          int len,dist;
   4324          if (z == 256) {
   4325             a->zout = zout;
   4326             if (a->hit_zeof_once && a->num_bits < 16) {
   4327                // The first time we hit zeof, we inserted 16 extra zero bits into our bit
   4328                // buffer so the decoder can just do its speculative decoding. But if we
   4329                // actually consumed any of those bits (which is the case when num_bits < 16),
   4330                // the stream actually read past the end so it is malformed.
   4331                return stbi__err("unexpected end","Corrupt PNG");
   4332             }
   4333             return 1;
   4334          }
   4335          if (z >= 286) return stbi__err("bad huffman code","Corrupt PNG"); // per DEFLATE, length codes 286 and 287 must not appear in compressed data
   4336          z -= 257;
   4337          len = stbi__zlength_base[z];
   4338          if (stbi__zlength_extra[z]) len += stbi__zreceive(a, stbi__zlength_extra[z]);
   4339          z = stbi__zhuffman_decode(a, &a->z_distance);
   4340          if (z < 0 || z >= 30) return stbi__err("bad huffman code","Corrupt PNG"); // per DEFLATE, distance codes 30 and 31 must not appear in compressed data
   4341          dist = stbi__zdist_base[z];
   4342          if (stbi__zdist_extra[z]) dist += stbi__zreceive(a, stbi__zdist_extra[z]);
   4343          if (zout - a->zout_start < dist) return stbi__err("bad dist","Corrupt PNG");
   4344          if (len > a->zout_end - zout) {
   4345             if (!stbi__zexpand(a, zout, len)) return 0;
   4346             zout = a->zout;
   4347          }
   4348          p = (stbi_uc *) (zout - dist);
   4349          if (dist == 1) { // run of one byte; common in images.
   4350             stbi_uc v = *p;
   4351             if (len) { do *zout++ = v; while (--len); }
   4352          } else {
   4353             if (len) { do *zout++ = *p++; while (--len); }
   4354          }
   4355       }
   4356    }
   4357 }
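/*
   Illustration only, not part of stb_image. The back-reference copy above runs
   one byte at a time, front to back, because the source may overlap the bytes
   being written (dist < len); with dist == 1 a single byte becomes a run.
   A minimal sketch of that behaviour, assuming a hypothetical helper `lz_copy`
   that works on a plain buffer instead of a stbi__zbuf:

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper: append `len` bytes to out[0..pos) by copying from
// `dist` bytes back. Copies forward one byte at a time so that overlapping
// references (dist < len) repeat the most recent `dist` bytes.
// Returns the new write position.
static size_t lz_copy(unsigned char *out, size_t pos, size_t dist, size_t len)
{
   assert(dist >= 1 && dist <= pos);   // a reference may not reach before the start
   while (len--) {
      out[pos] = out[pos - dist];
      ++pos;
   }
   return pos;
}
```

   Starting from "ab", lz_copy(buf, 2, 2, 5) extends the buffer to "abababa".
   A memcpy or memmove would not be a safe substitute here, since neither
   repeats the pattern when the ranges overlap.
*/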
   4358 
   4359 static int stbi__compute_huffman_codes(stbi__zbuf *a)
   4360 {
   4361    static const stbi_uc length_dezigzag[19] = { 16,17,18,0,8,7,9,6,10,5,11,4,12,3,13,2,14,1,15 };
   4362    stbi__zhuffman z_codelength;
   4363    stbi_uc lencodes[286+32+137];//padding for maximum single op
   4364    stbi_uc codelength_sizes[19];
   4365    int i,n;
   4366 
   4367    int hlit  = stbi__zreceive(a,5) + 257;
   4368    int hdist = stbi__zreceive(a,5) + 1;
   4369    int hclen = stbi__zreceive(a,4) + 4;
   4370    int ntot  = hlit + hdist;
   4371 
   4372    memset(codelength_sizes, 0, sizeof(codelength_sizes));
   4373    for (i=0; i < hclen; ++i) {
   4374       int s = stbi__zreceive(a,3);
   4375       codelength_sizes[length_dezigzag[i]] = (stbi_uc) s;
   4376    }
   4377    if (!stbi__zbuild_huffman(&z_codelength, codelength_sizes, 19)) return 0;
   4378 
   4379    n = 0;
   4380    while (n < ntot) {
   4381       int c = stbi__zhuffman_decode(a, &z_codelength);
   4382       if (c < 0 || c >= 19) return stbi__err("bad codelengths", "Corrupt PNG");
   4383       if (c < 16)
   4384          lencodes[n++] = (stbi_uc) c;
   4385       else {
   4386          stbi_uc fill = 0;
   4387          if (c == 16) {
   4388             c = stbi__zreceive(a,2)+3;
   4389             if (n == 0) return stbi__err("bad codelengths", "Corrupt PNG");
   4390             fill = lencodes[n-1];
   4391          } else if (c == 17) {
   4392             c = stbi__zreceive(a,3)+3;
   4393          } else if (c == 18) {
   4394             c = stbi__zreceive(a,7)+11;
   4395          } else {
   4396             return stbi__err("bad codelengths", "Corrupt PNG");
   4397          }
   4398          if (ntot - n < c) return stbi__err("bad codelengths", "Corrupt PNG");
   4399          memset(lencodes+n, fill, c);
   4400          n += c;
   4401       }
   4402    }
   4403    if (n != ntot) return stbi__err("bad codelengths","Corrupt PNG");
   4404    if (!stbi__zbuild_huffman(&a->z_length, lencodes, hlit)) return 0;
   4405    if (!stbi__zbuild_huffman(&a->z_distance, lencodes+hlit, hdist)) return 0;
   4406    return 1;
   4407 }
   4408 
   4409 static int stbi__parse_uncompressed_block(stbi__zbuf *a)
   4410 {
   4411    stbi_uc header[4];
   4412    int len,nlen,k;
   4413    if (a->num_bits & 7)
   4414       stbi__zreceive(a, a->num_bits & 7); // discard
   4415    // drain the bit-packed data into header
   4416    k = 0;
   4417    while (a->num_bits > 0) {
   4418       header[k++] = (stbi_uc) (a->code_buffer & 255); // suppress MSVC run-time check
   4419       a->code_buffer >>= 8;
   4420       a->num_bits -= 8;
   4421    }
   4422    if (a->num_bits < 0) return stbi__err("zlib corrupt","Corrupt PNG");
   4423    // now fill header the normal way
   4424    while (k < 4)
   4425       header[k++] = stbi__zget8(a);
   4426    len  = header[1] * 256 + header[0];
   4427    nlen = header[3] * 256 + header[2];
   4428    if (nlen != (len ^ 0xffff)) return stbi__err("zlib corrupt","Corrupt PNG");
   4429    if (a->zbuffer + len > a->zbuffer_end) return stbi__err("read past buffer","Corrupt PNG");
   4430    if (a->zout + len > a->zout_end)
   4431       if (!stbi__zexpand(a, a->zout, len)) return 0;
   4432    memcpy(a->zout, a->zbuffer, len);
   4433    a->zbuffer += len;
   4434    a->zout += len;
   4435    return 1;
   4436 }
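/*
   Illustration only, not part of stb_image. A stored (uncompressed) block starts
   with LEN and NLEN, two little-endian 16-bit values where NLEN must be the
   one's complement of LEN. A minimal sketch of that check, assuming a
   hypothetical helper `stored_block_len` that receives the four header bytes:

```c
#include <assert.h>

// Hypothetical helper: validate the 4-byte header of a DEFLATE stored block.
// hdr holds LEN then NLEN, both little-endian 16-bit. Returns LEN on success,
// or -1 if NLEN is not the one's complement of LEN.
static int stored_block_len(const unsigned char hdr[4])
{
   int len, nlen;
   assert(hdr != 0);
   len  = hdr[1] * 256 + hdr[0];
   nlen = hdr[3] * 256 + hdr[2];
   return (nlen == (len ^ 0xffff)) ? len : -1;
}
```

   For example, the bytes 05 00 FA FF describe a 5-byte stored block, while
   05 00 FA FE would be rejected.
*/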
   4437 
   4438 static int stbi__parse_zlib_header(stbi__zbuf *a)
   4439 {
   4440    int cmf   = stbi__zget8(a);
   4441    int cm    = cmf & 15;
   4442    /* int cinfo = cmf >> 4; */
   4443    int flg   = stbi__zget8(a);
   4444    if (stbi__zeof(a)) return stbi__err("bad zlib header","Corrupt PNG"); // zlib spec
   4445    if ((cmf*256+flg) % 31 != 0) return stbi__err("bad zlib header","Corrupt PNG"); // zlib spec
   4446    if (flg & 32) return stbi__err("no preset dict","Corrupt PNG"); // preset dictionary not allowed in png
   4447    if (cm != 8) return stbi__err("bad compression","Corrupt PNG"); // DEFLATE required for png
   4448    // window = 1 << (8 + cinfo)... but who cares, we fully buffer output
   4449    return 1;
   4450 }
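/*
   Illustration only, not part of stb_image. The three header checks above can be
   tried on concrete byte pairs. A minimal sketch, assuming a hypothetical helper
   `zlib_header_ok` that takes the CMF and FLG bytes directly:

```c
#include <assert.h>

// Hypothetical helper mirroring the checks above on the two zlib header bytes.
// Returns 1 if the header is acceptable for a PNG stream, 0 otherwise.
static int zlib_header_ok(int cmf, int flg)
{
   assert(cmf >= 0 && cmf <= 255 && flg >= 0 && flg <= 255);
   if ((cmf * 256 + flg) % 31 != 0) return 0; // FCHECK makes the 16-bit value a multiple of 31
   if (flg & 32) return 0;                    // FDICT: preset dictionary, not allowed in PNG
   if ((cmf & 15) != 8) return 0;             // CM must be 8 (DEFLATE)
   return 1;
}
```

   The common header 78 9C passes (0x789C is 31 * 996), 78 9D fails the
   multiple-of-31 test, and 78 BB passes that test but has FDICT set.
*/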
   4451 
   4452 static const stbi_uc stbi__zdefault_length[STBI__ZNSYMS] =
   4453 {
   4454    8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
   4455    8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
   4456    8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
   4457    8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
   4458    8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
   4459    9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
   4460    9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
   4461    9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
   4462    7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7, 7,7,7,7,7,7,7,7,8,8,8,8,8,8,8,8
   4463 };
   4464 static const stbi_uc stbi__zdefault_distance[32] =
   4465 {
   4466    5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5
   4467 };
   4468 /*
   4469 Init algorithm:
   4470 {
   4471    int i;   // use <= to match clearly with spec
   4472    for (i=0; i <= 143; ++i)     stbi__zdefault_length[i]   = 8;
   4473    for (   ; i <= 255; ++i)     stbi__zdefault_length[i]   = 9;
   4474    for (   ; i <= 279; ++i)     stbi__zdefault_length[i]   = 7;
   4475    for (   ; i <= 287; ++i)     stbi__zdefault_length[i]   = 8;
   4476 
   4477    for (i=0; i <=  31; ++i)     stbi__zdefault_distance[i] = 5;
   4478 }
   4479 */
   4480 
   4481 static int stbi__parse_zlib(stbi__zbuf *a, int parse_header)
   4482 {
   4483    int final, type;
   4484    if (parse_header)
   4485       if (!stbi__parse_zlib_header(a)) return 0;
   4486    a->num_bits = 0;
   4487    a->code_buffer = 0;
   4488    a->hit_zeof_once = 0;
   4489    do {
   4490       final = stbi__zreceive(a,1);
   4491       type = stbi__zreceive(a,2);
   4492       if (type == 0) {
   4493          if (!stbi__parse_uncompressed_block(a)) return 0;
   4494       } else if (type == 3) {
   4495          return 0; // block type 3 is reserved in DEFLATE, so the stream is invalid
   4496       } else {
   4497          if (type == 1) {
   4498             // use fixed code lengths
   4499             if (!stbi__zbuild_huffman(&a->z_length  , stbi__zdefault_length  , STBI__ZNSYMS)) return 0;
   4500             if (!stbi__zbuild_huffman(&a->z_distance, stbi__zdefault_distance,  32)) return 0;
   4501          } else {
   4502             if (!stbi__compute_huffman_codes(a)) return 0;
   4503          }
   4504          if (!stbi__parse_huffman_block(a)) return 0;
   4505       }
   4506    } while (!final);
   4507    return 1;
   4508 }
   4509 
   4510 static int stbi__do_zlib(stbi__zbuf *a, char *obuf, int olen, int exp, int parse_header)
   4511 {
   4512    a->zout_start = obuf;
   4513    a->zout       = obuf;
   4514    a->zout_end   = obuf + olen;
   4515    a->z_expandable = exp;
   4516 
   4517    return stbi__parse_zlib(a, parse_header);
   4518 }
   4519 
   4520 STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen)
   4521 {
   4522    stbi__zbuf a;
   4523    char *p = (char *) stbi__malloc(initial_size);
   4524    if (p == NULL) return NULL;
   4525    a.zbuffer = (stbi_uc *) buffer;
   4526    a.zbuffer_end = (stbi_uc *) buffer + len;
   4527    if (stbi__do_zlib(&a, p, initial_size, 1, 1)) {
   4528       if (outlen) *outlen = (int) (a.zout - a.zout_start);
   4529       return a.zout_start;
   4530    } else {
   4531       STBI_FREE(a.zout_start);
   4532       return NULL;
   4533    }
   4534 }
   4535 
   4536 STBIDEF char *stbi_zlib_decode_malloc(char const *buffer, int len, int *outlen)
   4537 {
   4538    return stbi_zlib_decode_malloc_guesssize(buffer, len, 16384, outlen);
   4539 }
   4540 
   4541 STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header)
   4542 {
   4543    stbi__zbuf a;
   4544    char *p = (char *) stbi__malloc(initial_size);
   4545    if (p == NULL) return NULL;
   4546    a.zbuffer = (stbi_uc *) buffer;
   4547    a.zbuffer_end = (stbi_uc *) buffer + len;
   4548    if (stbi__do_zlib(&a, p, initial_size, 1, parse_header)) {
   4549       if (outlen) *outlen = (int) (a.zout - a.zout_start);
   4550       return a.zout_start;
   4551    } else {
   4552       STBI_FREE(a.zout_start);
   4553       return NULL;
   4554    }
   4555 }
   4556 
   4557 STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, char const *ibuffer, int ilen)
   4558 {
   4559    stbi__zbuf a;
   4560    a.zbuffer = (stbi_uc *) ibuffer;
   4561    a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
   4562    if (stbi__do_zlib(&a, obuffer, olen, 0, 1))
   4563       return (int) (a.zout - a.zout_start);
   4564    else
   4565       return -1;
   4566 }
   4567 
   4568 STBIDEF char *stbi_zlib_decode_noheader_malloc(char const *buffer, int len, int *outlen)
   4569 {
   4570    stbi__zbuf a;
   4571    char *p = (char *) stbi__malloc(16384);
   4572    if (p == NULL) return NULL;
   4573    a.zbuffer = (stbi_uc *) buffer;
   4574    a.zbuffer_end = (stbi_uc *) buffer+len;
   4575    if (stbi__do_zlib(&a, p, 16384, 1, 0)) {
   4576       if (outlen) *outlen = (int) (a.zout - a.zout_start);
   4577       return a.zout_start;
   4578    } else {
   4579       STBI_FREE(a.zout_start);
   4580       return NULL;
   4581    }
   4582 }
   4583 
   4584 STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen)
   4585 {
   4586    stbi__zbuf a;
   4587    a.zbuffer = (stbi_uc *) ibuffer;
   4588    a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
   4589    if (stbi__do_zlib(&a, obuffer, olen, 0, 0))
   4590       return (int) (a.zout - a.zout_start);
   4591    else
   4592       return -1;
   4593 }
   4594 #endif
   4595 
   4596 // public domain "baseline" PNG decoder   v0.10  Sean Barrett 2006-11-18
   4597 //    simple implementation
   4598 //      - only 8-bit samples
   4599 //      - no CRC checking
   4600 //      - allocates lots of intermediate memory
   4601 //        - avoids problem of streaming data between subsystems
   4602 //        - avoids explicit window management
   4603 //    performance
   4604 //      - uses stb_zlib, a PD zlib implementation with fast huffman decoding
   4605 
   4606 #ifndef STBI_NO_PNG
   4607 typedef struct
   4608 {
   4609    stbi__uint32 length;
   4610    stbi__uint32 type;
   4611 } stbi__pngchunk;
   4612 
   4613 static stbi__pngchunk stbi__get_chunk_header(stbi__context *s)
   4614 {
   4615    stbi__pngchunk c;
   4616    c.length = stbi__get32be(s);
   4617    c.type   = stbi__get32be(s);
   4618    return c;
   4619 }
   4620 
   4621 static int stbi__check_png_header(stbi__context *s)
   4622 {
   4623    static const stbi_uc png_sig[8] = { 137,80,78,71,13,10,26,10 };
   4624    int i;
   4625    for (i=0; i < 8; ++i)
   4626       if (stbi__get8(s) != png_sig[i]) return stbi__err("bad png sig","Not a PNG");
   4627    return 1;
   4628 }
   4629 
   4630 typedef struct
   4631 {
   4632    stbi__context *s;
   4633    stbi_uc *idata, *expanded, *out;
   4634    int depth;
   4635 } stbi__png;
   4636 
   4637 
   4638 enum {
   4639    STBI__F_none=0,
   4640    STBI__F_sub=1,
   4641    STBI__F_up=2,
   4642    STBI__F_avg=3,
   4643    STBI__F_paeth=4,
   4644    // synthetic filter used for first scanline to avoid needing a dummy row of 0s
   4645    STBI__F_avg_first
   4646 };
   4647 
   4648 static stbi_uc first_row_filter[5] =
   4649 {
   4650    STBI__F_none,
   4651    STBI__F_sub,
   4652    STBI__F_none,
   4653    STBI__F_avg_first,
   4654    STBI__F_sub // Paeth with b=c=0 turns out to be equivalent to sub
   4655 };
   4656 
   4657 static int stbi__paeth(int a, int b, int c)
   4658 {
   4659    // This formulation looks very different from the reference in the PNG spec, but is
   4660    // actually equivalent and has favorable data dependencies and admits straightforward
   4661    // generation of branch-free code, which helps performance significantly.
   4662    int thresh = c*3 - (a + b);
   4663    int lo = a < b ? a : b;
   4664    int hi = a < b ? b : a;
   4665    int t0 = (hi <= thresh) ? lo : c;
   4666    int t1 = (thresh <= lo) ? hi : t0;
   4667    return t1;
   4668 }
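/*
   Illustration only, not part of stb_image. The comment in stbi__paeth states
   that the formulation is equivalent to the reference predictor in the PNG
   specification. One way to convince yourself is to compare both over every
   triple of byte values. A minimal sketch; `paeth_reference`, `paeth_fast` and
   `paeth_mismatches` are hypothetical names, and `paeth_fast` is a copy of the
   function above:

```c
#include <assert.h>
#include <stdlib.h>

// Reference Paeth predictor as given in the PNG specification.
static int paeth_reference(int a, int b, int c)
{
   int p, pa, pb, pc;
   assert(a >= 0 && b >= 0 && c >= 0);
   p  = a + b - c;
   pa = abs(p - a);
   pb = abs(p - b);
   pc = abs(p - c);
   if (pa <= pb && pa <= pc) return a;
   if (pb <= pc) return b;
   return c;
}

// Branch-friendly formulation, copied from stbi__paeth above.
static int paeth_fast(int a, int b, int c)
{
   int thresh = c*3 - (a + b);
   int lo = a < b ? a : b;
   int hi = a < b ? b : a;
   int t0 = (hi <= thresh) ? lo : c;
   int t1 = (thresh <= lo) ? hi : t0;
   return t1;
}

// Returns the number of byte triples on which the two versions disagree.
static long paeth_mismatches(void)
{
   long bad = 0;
   int a, b, c;
   for (a = 0; a < 256; ++a)
      for (b = 0; b < 256; ++b)
         for (c = 0; c < 256; ++c)
            if (paeth_reference(a, b, c) != paeth_fast(a, b, c)) ++bad;
   return bad;
}
```

   The loop covers 256^3 inputs, which is the whole domain the PNG filter can
   produce, so a result of zero mismatches is an exhaustive check and not a
   sampled one.
*/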
   4669 
   4670 static const stbi_uc stbi__depth_scale_table[9] = { 0, 0xff, 0x55, 0, 0x11, 0,0,0, 0x01 };
   4671 
   4672 // adds an extra all-255 alpha channel
   4673 // dest == src is legal
   4674 // img_n must be 1 or 3
   4675 static void stbi__create_png_alpha_expand8(stbi_uc *dest, stbi_uc *src, stbi__uint32 x, int img_n)
   4676 {
   4677    int i;
   4678    // must process data backwards since we allow dest==src
   4679    if (img_n == 1) {
   4680       for (i=x-1; i >= 0; --i) {
   4681          dest[i*2+1] = 255;
   4682          dest[i*2+0] = src[i];
   4683       }
   4684    } else {
   4685       STBI_ASSERT(img_n == 3);
   4686       for (i=x-1; i >= 0; --i) {
   4687          dest[i*4+3] = 255;
   4688          dest[i*4+2] = src[i*3+2];
   4689          dest[i*4+1] = src[i*3+1];
   4690          dest[i*4+0] = src[i*3+0];
   4691       }
   4692    }
   4693 }
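/*
   Illustration only, not part of stb_image. The backwards loop above is what
   makes dest == src legal: the output is wider than the input, so writing from
   the last pixel to the first never overwrites a source byte that has not been
   read yet. A minimal sketch for the one-channel case, assuming a hypothetical
   helper `gray_to_gray_alpha_inplace`:

```c
#include <assert.h>

// Hypothetical helper: expand `x` gray pixels stored at the start of buf
// into gray+alpha pairs in the same buffer (buf must hold 2*x bytes).
// Walking from the last pixel to the first means every source byte is read
// before the wider output can overwrite it.
static void gray_to_gray_alpha_inplace(unsigned char *buf, int x)
{
   int i;
   assert(buf != 0 && x >= 0);
   for (i = x - 1; i >= 0; --i) {
      buf[i*2 + 1] = 255;
      buf[i*2 + 0] = buf[i];
   }
}
```

   With buf = { 10, 20, 30, ... } and x = 3 the buffer becomes
   { 10, 255, 20, 255, 30, 255 }. A forward loop would clobber the value 20
   with the first alpha byte before it was copied.
*/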
   4694 
   4695 // create the png data from post-deflated data
   4696 static int stbi__create_png_image_raw(stbi__png *a, stbi_uc *raw, stbi__uint32 raw_len, int out_n, stbi__uint32 x, stbi__uint32 y, int depth, int color)
   4697 {
   4698    int bytes = (depth == 16 ? 2 : 1);
   4699    stbi__context *s = a->s;
   4700    stbi__uint32 i,j,stride = x*out_n*bytes;
   4701    stbi__uint32 img_len, img_width_bytes;
   4702    stbi_uc *filter_buf;
   4703    int all_ok = 1;
   4704    int k;
   4705    int img_n = s->img_n; // copy it into a local for later
   4706 
   4707    int output_bytes = out_n*bytes;
   4708    int filter_bytes = img_n*bytes;
   4709    int width = x;
   4710 
   4711    STBI_ASSERT(out_n == s->img_n || out_n == s->img_n+1);
   4712    a->out = (stbi_uc *) stbi__malloc_mad3(x, y, output_bytes, 0); // x*y*output_bytes with overflow check; no extra padding bytes
   4713    if (!a->out) return stbi__err("outofmem", "Out of memory");
   4714 
   4715    // note: error exits here don't need to clean up a->out individually,
   4716    // stbi__do_png always does on error.
   4717    if (!stbi__mad3sizes_valid(img_n, x, depth, 7)) return stbi__err("too large", "Corrupt PNG");
   4718    img_width_bytes = (((img_n * x * depth) + 7) >> 3);
   4719    if (!stbi__mad2sizes_valid(img_width_bytes, y, img_width_bytes)) return stbi__err("too large", "Corrupt PNG");
   4720    img_len = (img_width_bytes + 1) * y;
   4721 
   4722    // we used to check for exact match between raw_len and img_len on non-interlaced PNGs,
   4723    // but issue #276 reported a PNG in the wild that had extra data at the end (all zeros),
   4724    // so just check for raw_len < img_len always.
   4725    if (raw_len < img_len) return stbi__err("not enough pixels","Corrupt PNG");
   4726 
   4727    // Allocate two scan lines worth of filter workspace buffer.
   4728    filter_buf = (stbi_uc *) stbi__malloc_mad2(img_width_bytes, 2, 0);
   4729    if (!filter_buf) return stbi__err("outofmem", "Out of memory");
   4730 
   4731    // Filtering for low-bit-depth images
   4732    if (depth < 8) {
   4733       filter_bytes = 1;
   4734       width = img_width_bytes;
   4735    }
   4736 
   4737    for (j=0; j < y; ++j) {
   4738       // cur/prior filter buffers alternate
   4739       stbi_uc *cur = filter_buf + (j & 1)*img_width_bytes;
   4740       stbi_uc *prior = filter_buf + (~j & 1)*img_width_bytes;
   4741       stbi_uc *dest = a->out + stride*j;
   4742       int nk = width * filter_bytes;
   4743       int filter = *raw++;
   4744 
   4745       // check filter type
   4746       if (filter > 4) {
   4747          all_ok = stbi__err("invalid filter","Corrupt PNG");
   4748          break;
   4749       }
   4750 
   4751       // if first row, use special filter that doesn't sample previous row
   4752       if (j == 0) filter = first_row_filter[filter];
   4753 
   4754       // perform actual filtering
   4755       switch (filter) {
   4756       case STBI__F_none:
   4757          memcpy(cur, raw, nk);
   4758          break;
   4759       case STBI__F_sub:
   4760          memcpy(cur, raw, filter_bytes);
   4761          for (k = filter_bytes; k < nk; ++k)
   4762             cur[k] = STBI__BYTECAST(raw[k] + cur[k-filter_bytes]);
   4763          break;
   4764       case STBI__F_up:
   4765          for (k = 0; k < nk; ++k)
   4766             cur[k] = STBI__BYTECAST(raw[k] + prior[k]);
   4767          break;
   4768       case STBI__F_avg:
   4769          for (k = 0; k < filter_bytes; ++k)
   4770             cur[k] = STBI__BYTECAST(raw[k] + (prior[k]>>1));
   4771          for (k = filter_bytes; k < nk; ++k)
   4772             cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-filter_bytes])>>1));
   4773          break;
   4774       case STBI__F_paeth:
   4775          for (k = 0; k < filter_bytes; ++k)
   4776             cur[k] = STBI__BYTECAST(raw[k] + prior[k]); // prior[k] == stbi__paeth(0,prior[k],0)
   4777          for (k = filter_bytes; k < nk; ++k)
   4778             cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes], prior[k], prior[k-filter_bytes]));
   4779          break;
   4780       case STBI__F_avg_first:
   4781          memcpy(cur, raw, filter_bytes);
   4782          for (k = filter_bytes; k < nk; ++k)
   4783             cur[k] = STBI__BYTECAST(raw[k] + (cur[k-filter_bytes] >> 1));
   4784          break;
   4785       }
   4786 
   4787       raw += nk;
   4788 
   4789       // expand decoded bits in cur to dest, also adding an extra alpha channel if desired
   4790       if (depth < 8) {
   4791          stbi_uc scale = (color == 0) ? stbi__depth_scale_table[depth] : 1; // scale grayscale values to 0..255 range
   4792          stbi_uc *in = cur;
   4793          stbi_uc *out = dest;
   4794          stbi_uc inb = 0;
   4795          stbi__uint32 nsmp = x*img_n;
   4796 
   4797          // expand bits to bytes first
   4798          if (depth == 4) {
   4799             for (i=0; i < nsmp; ++i) {
   4800                if ((i & 1) == 0) inb = *in++;
   4801                *out++ = scale * (inb >> 4);
   4802                inb <<= 4;
   4803             }
   4804          } else if (depth == 2) {
   4805             for (i=0; i < nsmp; ++i) {
   4806                if ((i & 3) == 0) inb = *in++;
   4807                *out++ = scale * (inb >> 6);
   4808                inb <<= 2;
   4809             }
   4810          } else {
   4811             STBI_ASSERT(depth == 1);
   4812             for (i=0; i < nsmp; ++i) {
   4813                if ((i & 7) == 0) inb = *in++;
   4814                *out++ = scale * (inb >> 7);
   4815                inb <<= 1;
   4816             }
   4817          }
   4818 
   4819          // insert alpha=255 values if desired
   4820          if (img_n != out_n)
   4821             stbi__create_png_alpha_expand8(dest, dest, x, img_n);
   4822       } else if (depth == 8) {
   4823          if (img_n == out_n)
   4824             memcpy(dest, cur, x*img_n);
   4825          else
   4826             stbi__create_png_alpha_expand8(dest, cur, x, img_n);
   4827       } else if (depth == 16) {
   4828          // convert the image data from big-endian to platform-native
   4829          stbi__uint16 *dest16 = (stbi__uint16*)dest;
   4830          stbi__uint32 nsmp = x*img_n;
   4831 
   4832          if (img_n == out_n) {
   4833             for (i = 0; i < nsmp; ++i, ++dest16, cur += 2)
   4834                *dest16 = (cur[0] << 8) | cur[1];
   4835          } else {
   4836             STBI_ASSERT(img_n+1 == out_n);
   4837             if (img_n == 1) {
   4838                for (i = 0; i < x; ++i, dest16 += 2, cur += 2) {
   4839                   dest16[0] = (cur[0] << 8) | cur[1];
   4840                   dest16[1] = 0xffff;
   4841                }
   4842             } else {
   4843                STBI_ASSERT(img_n == 3);
   4844                for (i = 0; i < x; ++i, dest16 += 4, cur += 6) {
   4845                   dest16[0] = (cur[0] << 8) | cur[1];
   4846                   dest16[1] = (cur[2] << 8) | cur[3];
   4847                   dest16[2] = (cur[4] << 8) | cur[5];
   4848                   dest16[3] = 0xffff;
   4849                }
   4850             }
   4851          }
   4852       }
   4853    }
   4854 
   4855    STBI_FREE(filter_buf);
   4856    if (!all_ok) return 0;
   4857 
   4858    return 1;
   4859 }
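/*
   Illustration only, not part of stb_image. For depths below 8 the loop above
   unpacks samples most-significant-bit first and, for grayscale, multiplies
   them by the entry from stbi__depth_scale_table so the full 0..255 range is
   used. A minimal sketch that folds the three depth cases into one loop,
   assuming a hypothetical helper `unpack_samples`:

```c
#include <assert.h>

// Hypothetical helper: unpack `nsmp` samples of `depth` bits (1, 2 or 4),
// packed most-significant-bit first, into one byte each. Each sample is
// multiplied by `scale`, e.g. 0x55 for 2-bit gray so that 0..3 maps to 0..255.
static void unpack_samples(unsigned char *out, const unsigned char *in,
                           unsigned nsmp, int depth, unsigned char scale)
{
   unsigned i, per_byte;
   unsigned char inb = 0;
   assert(depth == 1 || depth == 2 || depth == 4);
   per_byte = 8u / (unsigned) depth;
   for (i = 0; i < nsmp; ++i) {
      if (i % per_byte == 0) inb = *in++;                       // fetch the next packed byte
      *out++ = (unsigned char) (scale * (inb >> (8 - depth)));   // top `depth` bits are the sample
      inb = (unsigned char) (inb << depth);                     // shift the next sample to the top
   }
}
```

   The byte 0x1B holds the 2-bit samples 0, 1, 2, 3; with scale 0x55 they
   unpack to 0x00, 0x55, 0xAA, 0xFF. For paletted images the scale is 1, so
   the samples stay usable as palette indices.
*/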
   4860 
   4861 static int stbi__create_png_image(stbi__png *a, stbi_uc *image_data, stbi__uint32 image_data_len, int out_n, int depth, int color, int interlaced)
   4862 {
   4863    int bytes = (depth == 16 ? 2 : 1);
   4864    int out_bytes = out_n * bytes;
   4865    stbi_uc *final;
   4866    int p;
   4867    if (!interlaced)
   4868       return stbi__create_png_image_raw(a, image_data, image_data_len, out_n, a->s->img_x, a->s->img_y, depth, color);
   4869 
   4870    // de-interlacing
   4871    final = (stbi_uc *) stbi__malloc_mad3(a->s->img_x, a->s->img_y, out_bytes, 0);
   4872    if (!final) return stbi__err("outofmem", "Out of memory");
   4873    for (p=0; p < 7; ++p) {
   4874       int xorig[] = { 0,4,0,2,0,1,0 };
   4875       int yorig[] = { 0,0,4,0,2,0,1 };
   4876       int xspc[]  = { 8,8,4,4,2,2,1 };
   4877       int yspc[]  = { 8,8,8,4,4,2,2 };
   4878       int i,j,x,y;
   4879       // size of this pass, rounded up; e.g. for p == 1 (xorig 4, xspc 8): img_x 4 -> x = 0, img_x 5 -> x = 1, img_x 12 -> x = 1
   4880       x = (a->s->img_x - xorig[p] + xspc[p]-1) / xspc[p];
   4881       y = (a->s->img_y - yorig[p] + yspc[p]-1) / yspc[p];
   4882       if (x && y) {
   4883          stbi__uint32 img_len = ((((a->s->img_n * x * depth) + 7) >> 3) + 1) * y;
   4884          if (!stbi__create_png_image_raw(a, image_data, image_data_len, out_n, x, y, depth, color)) {
   4885             STBI_FREE(final);
   4886             return 0;
   4887          }
   4888          for (j=0; j < y; ++j) {
   4889             for (i=0; i < x; ++i) {
   4890                int out_y = j*yspc[p]+yorig[p];
   4891                int out_x = i*xspc[p]+xorig[p];
   4892                memcpy(final + out_y*a->s->img_x*out_bytes + out_x*out_bytes,
   4893                       a->out + (j*x+i)*out_bytes, out_bytes);
   4894             }
   4895          }
   4896          STBI_FREE(a->out);
   4897          image_data += img_len;
   4898          image_data_len -= img_len;
   4899       }
   4900    }
   4901    a->out = final;
   4902 
   4903    return 1;
   4904 }
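/*
   Illustration only, not part of stb_image. The de-interlacing loop above
   derives each Adam7 pass size from an origin and a spacing per axis. A minimal
   sketch of that arithmetic, assuming hypothetical helpers `adam7_pass_size`
   and `adam7_total_pixels`; the tables are copied from the function above:

```c
#include <assert.h>

// Hypothetical helper: width and height of Adam7 pass `p` (0..6) for an image
// of img_x by img_y pixels, using the same origin and spacing tables as above.
static void adam7_pass_size(unsigned img_x, unsigned img_y, int p,
                            unsigned *x, unsigned *y)
{
   static const unsigned xorig[7] = { 0,4,0,2,0,1,0 };
   static const unsigned yorig[7] = { 0,0,4,0,2,0,1 };
   static const unsigned xspc[7]  = { 8,8,4,4,2,2,1 };
   static const unsigned yspc[7]  = { 8,8,8,4,4,2,2 };
   assert(p >= 0 && p < 7);
   // ceil((size - orig) / spc); the intermediate unsigned wraparound when
   // size < orig is harmless because spc is always larger than orig
   *x = (img_x - xorig[p] + xspc[p] - 1) / xspc[p];
   *y = (img_y - yorig[p] + yspc[p] - 1) / yspc[p];
}

// Total pixels over all passes in which both dimensions are non-zero.
static unsigned adam7_total_pixels(unsigned img_x, unsigned img_y)
{
   unsigned total = 0, x, y;
   int p;
   for (p = 0; p < 7; ++p) {
      adam7_pass_size(img_x, img_y, p, &x, &y);
      if (x && y) total += x * y;
   }
   return total;
}
```

   Because the seven passes partition the image, the total equals
   img_x * img_y; for an 8x8 image the passes hold 1, 1, 2, 4, 8, 16 and 32
   pixels.
*/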
   4905 
   4906 static int stbi__compute_transparency(stbi__png *z, stbi_uc tc[3], int out_n)
   4907 {
   4908    stbi__context *s = z->s;
   4909    stbi__uint32 i, pixel_count = s->img_x * s->img_y;
   4910    stbi_uc *p = z->out;
   4911 
   4912    // compute color-based transparency, assuming we've
   4913    // already got 255 as the alpha value in the output
   4914    STBI_ASSERT(out_n == 2 || out_n == 4);
   4915 
   4916    if (out_n == 2) {
   4917       for (i=0; i < pixel_count; ++i) {
   4918          p[1] = (p[0] == tc[0] ? 0 : 255);
   4919          p += 2;
   4920       }
   4921    } else {
   4922       for (i=0; i < pixel_count; ++i) {
   4923          if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
   4924             p[3] = 0;
   4925          p += 4;
   4926       }
   4927    }
   4928    return 1;
   4929 }
   4930 
   4931 static int stbi__compute_transparency16(stbi__png *z, stbi__uint16 tc[3], int out_n)
   4932 {
   4933    stbi__context *s = z->s;
   4934    stbi__uint32 i, pixel_count = s->img_x * s->img_y;
   4935    stbi__uint16 *p = (stbi__uint16*) z->out;
   4936 
   4937    // compute color-based transparency, assuming we've
   4938    // already got 65535 as the alpha value in the output
   4939    STBI_ASSERT(out_n == 2 || out_n == 4);
   4940 
   4941    if (out_n == 2) {
   4942       for (i = 0; i < pixel_count; ++i) {
   4943          p[1] = (p[0] == tc[0] ? 0 : 65535);
   4944          p += 2;
   4945       }
   4946    } else {
   4947       for (i = 0; i < pixel_count; ++i) {
   4948          if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
   4949             p[3] = 0;
   4950          p += 4;
   4951       }
   4952    }
   4953    return 1;
   4954 }
   4955 
   4956 static int stbi__expand_png_palette(stbi__png *a, stbi_uc *palette, int len, int pal_img_n)
   4957 {
   4958    stbi__uint32 i, pixel_count = a->s->img_x * a->s->img_y;
   4959    stbi_uc *p, *temp_out, *orig = a->out;
   4960 
   4961    p = (stbi_uc *) stbi__malloc_mad2(pixel_count, pal_img_n, 0);
   4962    if (p == NULL) return stbi__err("outofmem", "Out of memory");
   4963 
   4964    // between here and free(out) below, exiting would leak
   4965    temp_out = p;
   4966 
   4967    if (pal_img_n == 3) {
   4968       for (i=0; i < pixel_count; ++i) {
   4969          int n = orig[i]*4;
   4970          p[0] = palette[n  ];
   4971          p[1] = palette[n+1];
   4972          p[2] = palette[n+2];
   4973          p += 3;
   4974       }
   4975    } else {
   4976       for (i=0; i < pixel_count; ++i) {
   4977          int n = orig[i]*4;
   4978          p[0] = palette[n  ];
   4979          p[1] = palette[n+1];
   4980          p[2] = palette[n+2];
   4981          p[3] = palette[n+3];
   4982          p += 4;
   4983       }
   4984    }
   4985    STBI_FREE(a->out);
   4986    a->out = temp_out;
   4987 
   4988    STBI_NOTUSED(len);
   4989 
   4990    return 1;
   4991 }
   4992 
   4993 static int stbi__unpremultiply_on_load_global = 0;
   4994 static int stbi__de_iphone_flag_global = 0;
   4995 
   4996 STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply)
   4997 {
   4998    stbi__unpremultiply_on_load_global = flag_true_if_should_unpremultiply;
   4999 }
   5000 
   5001 STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert)
   5002 {
   5003    stbi__de_iphone_flag_global = flag_true_if_should_convert;
   5004 }
   5005 
   5006 #ifndef STBI_THREAD_LOCAL
   5007 #define stbi__unpremultiply_on_load  stbi__unpremultiply_on_load_global
   5008 #define stbi__de_iphone_flag  stbi__de_iphone_flag_global
   5009 #else
   5010 static STBI_THREAD_LOCAL int stbi__unpremultiply_on_load_local, stbi__unpremultiply_on_load_set;
   5011 static STBI_THREAD_LOCAL int stbi__de_iphone_flag_local, stbi__de_iphone_flag_set;
   5012 
   5013 STBIDEF void stbi_set_unpremultiply_on_load_thread(int flag_true_if_should_unpremultiply)
   5014 {
   5015    stbi__unpremultiply_on_load_local = flag_true_if_should_unpremultiply;
   5016    stbi__unpremultiply_on_load_set = 1;
   5017 }
   5018 
   5019 STBIDEF void stbi_convert_iphone_png_to_rgb_thread(int flag_true_if_should_convert)
   5020 {
   5021    stbi__de_iphone_flag_local = flag_true_if_should_convert;
   5022    stbi__de_iphone_flag_set = 1;
   5023 }
   5024 
   5025 #define stbi__unpremultiply_on_load  (stbi__unpremultiply_on_load_set           \
   5026                                        ? stbi__unpremultiply_on_load_local      \
   5027                                        : stbi__unpremultiply_on_load_global)
   5028 #define stbi__de_iphone_flag  (stbi__de_iphone_flag_set                         \
   5029                                 ? stbi__de_iphone_flag_local                    \
   5030                                 : stbi__de_iphone_flag_global)
   5031 #endif // STBI_THREAD_LOCAL
   5032 
   5033 static void stbi__de_iphone(stbi__png *z)
   5034 {
   5035    stbi__context *s = z->s;
   5036    stbi__uint32 i, pixel_count = s->img_x * s->img_y;
   5037    stbi_uc *p = z->out;
   5038 
   5039    if (s->img_out_n == 3) {  // convert bgr to rgb
   5040       for (i=0; i < pixel_count; ++i) {
   5041          stbi_uc t = p[0];
   5042          p[0] = p[2];
   5043          p[2] = t;
   5044          p += 3;
   5045       }
   5046    } else {
   5047       STBI_ASSERT(s->img_out_n == 4);
   5048       if (stbi__unpremultiply_on_load) {
   5049          // convert bgr to rgb and unpremultiply
   5050          for (i=0; i < pixel_count; ++i) {
   5051             stbi_uc a = p[3];
   5052             stbi_uc t = p[0];
   5053             if (a) {
   5054                stbi_uc half = a / 2;
   5055                p[0] = (p[2] * 255 + half) / a;
   5056                p[1] = (p[1] * 255 + half) / a;
   5057                p[2] = ( t   * 255 + half) / a;
   5058             } else {
   5059                p[0] = p[2];
   5060                p[2] = t;
   5061             }
   5062             p += 4;
   5063          }
   5064       } else {
   5065          // convert bgr to rgb
   5066          for (i=0; i < pixel_count; ++i) {
   5067             stbi_uc t = p[0];
   5068             p[0] = p[2];
   5069             p[2] = t;
   5070             p += 4;
   5071          }
   5072       }
   5073    }
   5074 }
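/*
   Illustration only, not part of stb_image. The unpremultiply branch above
   divides each colour channel by alpha and rounds to nearest by adding half
   the divisor first. A minimal sketch for a single channel, assuming a
   hypothetical helper `unpremultiply8`; the precondition in the assert is an
   assumption of this sketch, not something stbi__de_iphone checks:

```c
#include <assert.h>

// Hypothetical helper: undo premultiplied alpha for one 8-bit channel value,
// rounding to nearest by adding half the divisor before dividing.
// With a == 0 the colour is unrecoverable, so the value is passed through.
static unsigned char unpremultiply8(unsigned char v, unsigned char a)
{
   assert(v <= a || a == 0);   // assumed: premultiplied data never exceeds its alpha
   if (a == 0) return v;
   return (unsigned char) ((v * 255 + a / 2) / a);
}
```

   For example, a channel value of 64 at alpha 128 maps back to 128, and a
   value equal to its alpha always maps back to 255.
*/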
   5075 
   5076 #define STBI__PNG_TYPE(a,b,c,d)  (((unsigned) (a) << 24) + ((unsigned) (b) << 16) + ((unsigned) (c) << 8) + (unsigned) (d))
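/*
   Illustration only, not part of stb_image. STBI__PNG_TYPE packs the four ASCII
   letters of a chunk name into the same 32-bit value that a big-endian read of
   the chunk header yields, which is why the parser can switch on it directly.
   A minimal sketch, assuming hypothetical helpers `png_type` and `read_be32`:

```c
#include <assert.h>

// Same packing as STBI__PNG_TYPE above, written as a function: four ASCII
// bytes become one big-endian 32-bit tag.
static unsigned long png_type(char a, char b, char c, char d)
{
   return ((unsigned long) (unsigned char) a << 24) + ((unsigned long) (unsigned char) b << 16)
        + ((unsigned long) (unsigned char) c <<  8) +  (unsigned long) (unsigned char) d;
}

// Hypothetical helper: read a 32-bit big-endian value from four bytes, as a
// chunk header field would be read from the file.
static unsigned long read_be32(const unsigned char p[4])
{
   assert(p != 0);
   return ((unsigned long) p[0] << 24) | ((unsigned long) p[1] << 16)
        | ((unsigned long) p[2] <<  8) |  (unsigned long) p[3];
}
```

   The bytes 49 48 44 52 read as 0x49484452, which is also the value of
   png_type('I','H','D','R').
*/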
   5077 
   5078 static int stbi__parse_png_file(stbi__png *z, int scan, int req_comp)
   5079 {
   5080    stbi_uc palette[1024], pal_img_n=0;
   5081    stbi_uc has_trans=0, tc[3]={0};
   5082    stbi__uint16 tc16[3];
   5083    stbi__uint32 ioff=0, idata_limit=0, i, pal_len=0;
   5084    int first=1,k,interlace=0, color=0, is_iphone=0;
   5085    stbi__context *s = z->s;
   5086 
   5087    z->expanded = NULL;
   5088    z->idata = NULL;
   5089    z->out = NULL;
   5090 
   5091    if (!stbi__check_png_header(s)) return 0;
   5092 
   5093    if (scan == STBI__SCAN_type) return 1;
   5094 
   5095    for (;;) {
   5096       stbi__pngchunk c = stbi__get_chunk_header(s);
   5097       switch (c.type) {
   5098          case STBI__PNG_TYPE('C','g','B','I'):
   5099             is_iphone = 1;
   5100             stbi__skip(s, c.length);
   5101             break;
   5102          case STBI__PNG_TYPE('I','H','D','R'): {
   5103             int comp,filter;
   5104             if (!first) return stbi__err("multiple IHDR","Corrupt PNG");
   5105             first = 0;
   5106             if (c.length != 13) return stbi__err("bad IHDR len","Corrupt PNG");
   5107             s->img_x = stbi__get32be(s);
   5108             s->img_y = stbi__get32be(s);
   5109             if (s->img_y > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
   5110             if (s->img_x > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
   5111             z->depth = stbi__get8(s);  if (z->depth != 1 && z->depth != 2 && z->depth != 4 && z->depth != 8 && z->depth != 16)  return stbi__err("1/2/4/8/16-bit only","PNG not supported: 1/2/4/8/16-bit only");
   5112             color = stbi__get8(s);  if (color > 6)         return stbi__err("bad ctype","Corrupt PNG");
   5113             if (color == 3 && z->depth == 16)                  return stbi__err("bad ctype","Corrupt PNG");
   5114             if (color == 3) pal_img_n = 3; else if (color & 1) return stbi__err("bad ctype","Corrupt PNG");
   5115             comp  = stbi__get8(s);  if (comp) return stbi__err("bad comp method","Corrupt PNG");
   5116             filter= stbi__get8(s);  if (filter) return stbi__err("bad filter method","Corrupt PNG");
   5117             interlace = stbi__get8(s); if (interlace>1) return stbi__err("bad interlace method","Corrupt PNG");
   5118             if (!s->img_x || !s->img_y) return stbi__err("0-pixel image","Corrupt PNG");
   5119             if (!pal_img_n) {
   5120                s->img_n = (color & 2 ? 3 : 1) + (color & 4 ? 1 : 0);
   5121                if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
   5122             } else {
   5123                // if paletted, then pal_img_n is our final component count, and
   5124                // img_n is the number of components to decompress/filter.
   5125                s->img_n = 1;
   5126                if ((1 << 30) / s->img_x / 4 < s->img_y) return stbi__err("too large","Corrupt PNG");
   5127             }
   5128             // even with SCAN_header, have to scan to see if we have a tRNS
   5129             break;
   5130          }
   5131 
   5132          case STBI__PNG_TYPE('P','L','T','E'):  {
   5133             if (first) return stbi__err("first not IHDR", "Corrupt PNG");
   5134             if (c.length > 256*3) return stbi__err("invalid PLTE","Corrupt PNG");
   5135             pal_len = c.length / 3;
   5136             if (pal_len * 3 != c.length) return stbi__err("invalid PLTE","Corrupt PNG");
   5137             for (i=0; i < pal_len; ++i) {
   5138                palette[i*4+0] = stbi__get8(s);
   5139                palette[i*4+1] = stbi__get8(s);
   5140                palette[i*4+2] = stbi__get8(s);
   5141                palette[i*4+3] = 255;
   5142             }
   5143             break;
   5144          }
   5145 
   5146          case STBI__PNG_TYPE('t','R','N','S'): {
   5147             if (first) return stbi__err("first not IHDR", "Corrupt PNG");
   5148             if (z->idata) return stbi__err("tRNS after IDAT","Corrupt PNG");
   5149             if (pal_img_n) {
   5150                if (scan == STBI__SCAN_header) { s->img_n = 4; return 1; }
   5151                if (pal_len == 0) return stbi__err("tRNS before PLTE","Corrupt PNG");
   5152                if (c.length > pal_len) return stbi__err("bad tRNS len","Corrupt PNG");
   5153                pal_img_n = 4;
   5154                for (i=0; i < c.length; ++i)
   5155                   palette[i*4+3] = stbi__get8(s);
   5156             } else {
   5157                if (!(s->img_n & 1)) return stbi__err("tRNS with alpha","Corrupt PNG");
   5158                if (c.length != (stbi__uint32) s->img_n*2) return stbi__err("bad tRNS len","Corrupt PNG");
   5159                has_trans = 1;
   5160                // non-paletted with tRNS = constant alpha. if header-scanning, we can stop now.
   5161                if (scan == STBI__SCAN_header) { ++s->img_n; return 1; }
   5162                if (z->depth == 16) {
   5163                   for (k = 0; k < s->img_n && k < 3; ++k) // extra loop test to suppress false GCC warning
   5164                      tc16[k] = (stbi__uint16)stbi__get16be(s); // copy the values as-is
   5165                } else {
   5166                   for (k = 0; k < s->img_n && k < 3; ++k)
   5167                      tc[k] = (stbi_uc)(stbi__get16be(s) & 255) * stbi__depth_scale_table[z->depth]; // non 8-bit images will be larger
   5168                }
   5169             }
   5170             break;
   5171          }
   5172 
   5173          case STBI__PNG_TYPE('I','D','A','T'): {
   5174             if (first) return stbi__err("first not IHDR", "Corrupt PNG");
   5175             if (pal_img_n && !pal_len) return stbi__err("no PLTE","Corrupt PNG");
   5176             if (scan == STBI__SCAN_header) {
   5177                // header scan definitely stops at first IDAT
   5178                if (pal_img_n)
   5179                   s->img_n = pal_img_n;
   5180                return 1;
   5181             }
   5182             if (c.length > (1u << 30)) return stbi__err("IDAT size limit", "IDAT section larger than 2^30 bytes");
   5183             if ((int)(ioff + c.length) < (int)ioff) return 0;
   5184             if (ioff + c.length > idata_limit) {
   5185                stbi__uint32 idata_limit_old = idata_limit;
   5186                stbi_uc *p;
   5187                if (idata_limit == 0) idata_limit = c.length > 4096 ? c.length : 4096;
   5188                while (ioff + c.length > idata_limit)
   5189                   idata_limit *= 2;
   5190                STBI_NOTUSED(idata_limit_old);
   5191                p = (stbi_uc *) STBI_REALLOC_SIZED(z->idata, idata_limit_old, idata_limit); if (p == NULL) return stbi__err("outofmem", "Out of memory");
   5192                z->idata = p;
   5193             }
   5194             if (!stbi__getn(s, z->idata+ioff,c.length)) return stbi__err("outofdata","Corrupt PNG");
   5195             ioff += c.length;
   5196             break;
   5197          }
   5198 
   5199          case STBI__PNG_TYPE('I','E','N','D'): {
   5200             stbi__uint32 raw_len, bpl;
   5201             if (first) return stbi__err("first not IHDR", "Corrupt PNG");
   5202             if (scan != STBI__SCAN_load) return 1;
   5203             if (z->idata == NULL) return stbi__err("no IDAT","Corrupt PNG");
   5204             // initial guess for decoded data size to avoid unnecessary reallocs
   5205             bpl = (s->img_x * z->depth + 7) / 8; // bytes per line, per component
   5206             raw_len = bpl * s->img_y * s->img_n /* pixels */ + s->img_y /* filter mode per row */;
   5207             z->expanded = (stbi_uc *) stbi_zlib_decode_malloc_guesssize_headerflag((char *) z->idata, ioff, raw_len, (int *) &raw_len, !is_iphone);
   5208             if (z->expanded == NULL) return 0; // zlib should set error
   5209             STBI_FREE(z->idata); z->idata = NULL;
   5210             if ((req_comp == s->img_n+1 && req_comp != 3 && !pal_img_n) || has_trans)
   5211                s->img_out_n = s->img_n+1;
   5212             else
   5213                s->img_out_n = s->img_n;
   5214             if (!stbi__create_png_image(z, z->expanded, raw_len, s->img_out_n, z->depth, color, interlace)) return 0;
   5215             if (has_trans) {
   5216                if (z->depth == 16) {
   5217                   if (!stbi__compute_transparency16(z, tc16, s->img_out_n)) return 0;
   5218                } else {
   5219                   if (!stbi__compute_transparency(z, tc, s->img_out_n)) return 0;
   5220                }
   5221             }
   5222             if (is_iphone && stbi__de_iphone_flag && s->img_out_n > 2)
   5223                stbi__de_iphone(z);
   5224             if (pal_img_n) {
   5225                // pal_img_n == 3 or 4
   5226                s->img_n = pal_img_n; // record the actual colors we had
   5227                s->img_out_n = pal_img_n;
   5228                if (req_comp >= 3) s->img_out_n = req_comp;
   5229                if (!stbi__expand_png_palette(z, palette, pal_len, s->img_out_n))
   5230                   return 0;
   5231             } else if (has_trans) {
   5232                // non-paletted image with tRNS -> source image has (constant) alpha
   5233                ++s->img_n;
   5234             }
   5235             STBI_FREE(z->expanded); z->expanded = NULL;
   5236             // end of PNG chunk, read and skip CRC
   5237             stbi__get32be(s);
   5238             return 1;
   5239          }
   5240 
   5241          default:
   5242             // if critical, fail
   5243             if (first) return stbi__err("first not IHDR", "Corrupt PNG");
   5244             if ((c.type & (1 << 29)) == 0) {
   5245                #ifndef STBI_NO_FAILURE_STRINGS
   5246                // not threadsafe
   5247                static char invalid_chunk[] = "XXXX PNG chunk not known";
   5248                invalid_chunk[0] = STBI__BYTECAST(c.type >> 24);
   5249                invalid_chunk[1] = STBI__BYTECAST(c.type >> 16);
   5250                invalid_chunk[2] = STBI__BYTECAST(c.type >>  8);
   5251                invalid_chunk[3] = STBI__BYTECAST(c.type >>  0);
   5252                #endif
   5253                return stbi__err(invalid_chunk, "PNG not supported: unknown PNG chunk type");
   5254             }
   5255             stbi__skip(s, c.length);
   5256             break;
   5257       }
   5258       // end of PNG chunk, read and skip CRC
   5259       stbi__get32be(s);
   5260    }
   5261 }
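// Illustrative sketch, not part of stb_image: the chunk loop above compares
// packed four-character chunk names and treats a chunk as skippable when
// bit 29 of the packed value is set. The helpers below restate that idea
// standalone; demo_png_type and demo_png_is_ancillary are hypothetical names,
// and the big-endian packing is an assumption about what STBI__PNG_TYPE does.

```c
#include <stdint.h>

/* Hypothetical helper: pack a four-character chunk name, first letter in
   the most significant byte. */
static uint32_t demo_png_type(char a, char b, char c, char d)
{
   return ((uint32_t)(unsigned char)a << 24) |
          ((uint32_t)(unsigned char)b << 16) |
          ((uint32_t)(unsigned char)c <<  8) |
           (uint32_t)(unsigned char)d;
}

/* Nonzero when the chunk is ancillary: bit 5 of the first byte (bit 29 of
   the packed value) is set, which means the first letter is lowercase. */
static int demo_png_is_ancillary(uint32_t type)
{
   return (type & (UINT32_C(1) << 29)) != 0;
}
```

// Under these assumptions an unknown 'tEXt' chunk would be skipped, while an
// unknown chunk whose name starts with an uppercase letter would be rejected.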
   5262 
   5263 static void *stbi__do_png(stbi__png *p, int *x, int *y, int *n, int req_comp, stbi__result_info *ri)
   5264 {
   5265    void *result=NULL;
   5266    if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
   5267    if (stbi__parse_png_file(p, STBI__SCAN_load, req_comp)) {
   5268       if (p->depth <= 8)
   5269          ri->bits_per_channel = 8;
   5270       else if (p->depth == 16)
   5271          ri->bits_per_channel = 16;
   5272       else
   5273          return stbi__errpuc("bad bits_per_channel", "PNG not supported: unsupported color depth");
   5274       result = p->out;
   5275       p->out = NULL;
   5276       if (req_comp && req_comp != p->s->img_out_n) {
   5277          if (ri->bits_per_channel == 8)
   5278             result = stbi__convert_format((unsigned char *) result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
   5279          else
   5280             result = stbi__convert_format16((stbi__uint16 *) result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
   5281          p->s->img_out_n = req_comp;
   5282          if (result == NULL) return result;
   5283       }
   5284       *x = p->s->img_x;
   5285       *y = p->s->img_y;
   5286       if (n) *n = p->s->img_n;
   5287    }
   5288    STBI_FREE(p->out);      p->out      = NULL;
   5289    STBI_FREE(p->expanded); p->expanded = NULL;
   5290    STBI_FREE(p->idata);    p->idata    = NULL;
   5291 
   5292    return result;
   5293 }
   5294 
   5295 static void *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
   5296 {
   5297    stbi__png p;
   5298    p.s = s;
   5299    return stbi__do_png(&p, x,y,comp,req_comp, ri);
   5300 }
   5301 
   5302 static int stbi__png_test(stbi__context *s)
   5303 {
   5304    int r;
   5305    r = stbi__check_png_header(s);
   5306    stbi__rewind(s);
   5307    return r;
   5308 }
   5309 
   5310 static int stbi__png_info_raw(stbi__png *p, int *x, int *y, int *comp)
   5311 {
   5312    if (!stbi__parse_png_file(p, STBI__SCAN_header, 0)) {
   5313       stbi__rewind( p->s );
   5314       return 0;
   5315    }
   5316    if (x) *x = p->s->img_x;
   5317    if (y) *y = p->s->img_y;
   5318    if (comp) *comp = p->s->img_n;
   5319    return 1;
   5320 }
   5321 
   5322 static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp)
   5323 {
   5324    stbi__png p;
   5325    p.s = s;
   5326    return stbi__png_info_raw(&p, x, y, comp);
   5327 }
   5328 
   5329 static int stbi__png_is16(stbi__context *s)
   5330 {
   5331    stbi__png p;
   5332    p.s = s;
   5333    if (!stbi__png_info_raw(&p, NULL, NULL, NULL))
   5334       return 0;
   5335    if (p.depth != 16) {
   5336       stbi__rewind(p.s);
   5337       return 0;
   5338    }
   5339    return 1;
   5340 }
   5341 #endif
   5342 
   5343 // Microsoft/Windows BMP image
   5344 
   5345 #ifndef STBI_NO_BMP
   5346 static int stbi__bmp_test_raw(stbi__context *s)
   5347 {
   5348    int r;
   5349    int sz;
   5350    if (stbi__get8(s) != 'B') return 0;
   5351    if (stbi__get8(s) != 'M') return 0;
   5352    stbi__get32le(s); // discard filesize
   5353    stbi__get16le(s); // discard reserved
   5354    stbi__get16le(s); // discard reserved
   5355    stbi__get32le(s); // discard data offset
   5356    sz = stbi__get32le(s);
   5357    r = (sz == 12 || sz == 40 || sz == 56 || sz == 108 || sz == 124);
   5358    return r;
   5359 }
   5360 
   5361 static int stbi__bmp_test(stbi__context *s)
   5362 {
   5363    int r = stbi__bmp_test_raw(s);
   5364    stbi__rewind(s);
   5365    return r;
   5366 }
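// Illustrative sketch, not part of stb_image: the same signature test as
// stbi__bmp_test_raw, applied to a plain memory buffer instead of an
// stbi__context. demo_read32le and demo_bmp_signature_ok are hypothetical
// names; the offset 14 follows from the fields discarded above
// (2 + 4 + 2 + 2 + 4 bytes).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a little-endian 32-bit value from a buffer. */
static uint32_t demo_read32le(const unsigned char *p)
{
   return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
          ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if buf starts with 'B','M' and carries one of the accepted
   DIB header sizes (12, 40, 56, 108, 124) at byte offset 14. */
static int demo_bmp_signature_ok(const unsigned char *buf, size_t len)
{
   uint32_t sz;
   if (len < 18) return 0;                       /* too short to hold both fields */
   if (buf[0] != 'B' || buf[1] != 'M') return 0; /* wrong signature */
   sz = demo_read32le(buf + 14);
   return sz == 12 || sz == 40 || sz == 56 || sz == 108 || sz == 124;
}
```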
   5367 
   5368 
   5369 // returns 0..31 for the highest set bit, or -1 if z is 0
   5370 static int stbi__high_bit(unsigned int z)
   5371 {
   5372    int n=0;
   5373    if (z == 0) return -1;
   5374    if (z >= 0x10000) { n += 16; z >>= 16; }
   5375    if (z >= 0x00100) { n +=  8; z >>=  8; }
   5376    if (z >= 0x00010) { n +=  4; z >>=  4; }
   5377    if (z >= 0x00004) { n +=  2; z >>=  2; }
   5378    if (z >= 0x00002) { n +=  1;/* >>=  1;*/ }
   5379    return n;
   5380 }
   5381 
   5382 static int stbi__bitcount(unsigned int a)
   5383 {
   5384    a = (a & 0x55555555) + ((a >>  1) & 0x55555555); // max 2
   5385    a = (a & 0x33333333) + ((a >>  2) & 0x33333333); // max 4
   5386    a = (a + (a >> 4)) & 0x0f0f0f0f; // max 8 per 4, now 8 bits
   5387    a = (a + (a >> 8)); // max 16 per 8 bits
   5388    a = (a + (a >> 16)); // max 32 per 8 bits
   5389    return a & 0xff;
   5390 }
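// Illustrative sketch, not part of stb_image: plain-loop versions of the two
// helpers above, handy for checking by hand what they report for a channel
// mask. demo_high_bit and demo_bitcount are hypothetical names.

```c
/* Index (0..31) of the highest set bit, or -1 for zero. */
static int demo_high_bit(unsigned int z)
{
   int n = -1;
   while (z) { ++n; z >>= 1; }
   return n;
}

/* Number of set bits. */
static int demo_bitcount(unsigned int a)
{
   int n = 0;
   while (a) { n += (int)(a & 1u); a >>= 1; }
   return n;
}
```

// For the default 16-bit red mask (31u << 10) these give a high bit of 14 and
// a count of 5, so the BMP loader would shift right by 14-7 = 7 to bring the
// top mask bit to position 7.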
   5391 
   5392 // extract an arbitrarily-aligned N-bit value (N=bits)
   5393 // from v, then make it 8 bits long by fractionally
   5394 // extending it to the full range.
   5395 static int stbi__shiftsigned(unsigned int v, int shift, int bits)
   5396 {
   5397    static unsigned int mul_table[9] = {
   5398       0,
   5399       0xff/*0b11111111*/, 0x55/*0b01010101*/, 0x49/*0b01001001*/, 0x11/*0b00010001*/,
   5400       0x21/*0b00100001*/, 0x41/*0b01000001*/, 0x81/*0b10000001*/, 0x01/*0b00000001*/,
   5401    };
   5402    static unsigned int shift_table[9] = {
   5403       0, 0,0,1,0,2,4,6,0,
   5404    };
   5405    if (shift < 0)
   5406       v <<= -shift;
   5407    else
   5408       v >>= shift;
   5409    STBI_ASSERT(v < 256);
   5410    v >>= (8-bits);
   5411    STBI_ASSERT(bits >= 0 && bits <= 8);
   5412    return (int) ((unsigned) v * mul_table[bits]) >> shift_table[bits];
   5413 }
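// Illustrative sketch, not part of stb_image: the "fractional extension" done
// by the multiply-and-shift tables above can also be written as repeating the
// N-bit pattern from the top of the byte downwards. demo_expand_bits is a
// hypothetical name; it takes the value already reduced to its N bits and
// should agree with the table method for bits = 1..8.

```c
/* Expand an N-bit value (0 <= v < 2^bits) to 8 bits by bit replication. */
static unsigned int demo_expand_bits(unsigned int v, int bits)
{
   unsigned int out = 0;
   int filled = 0;
   if (bits <= 0 || bits > 8) return 0;
   while (filled < 8) {
      int shift = 8 - bits - filled;   /* where the next copy of v lands */
      out |= (shift >= 0) ? (v << shift) : (v >> -shift);
      filled += bits;
   }
   return out & 0xffu;
}
```

// Example: a 5-bit channel value of 16 becomes (16 << 3) | (16 >> 2) = 132,
// the same as (16 * 0x21) >> 2 from the tables.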
   5414 
   5415 typedef struct
   5416 {
   5417    int bpp, offset, hsz;
   5418    unsigned int mr,mg,mb,ma, all_a;
   5419    int extra_read;
   5420 } stbi__bmp_data;
   5421 
   5422 static int stbi__bmp_set_mask_defaults(stbi__bmp_data *info, int compress)
   5423 {
   5424    // BI_BITFIELDS specifies masks explicitly, don't override
   5425    if (compress == 3)
   5426       return 1;
   5427 
   5428    if (compress == 0) {
   5429       if (info->bpp == 16) {
   5430          info->mr = 31u << 10;
   5431          info->mg = 31u <<  5;
   5432          info->mb = 31u <<  0;
   5433       } else if (info->bpp == 32) {
   5434          info->mr = 0xffu << 16;
   5435          info->mg = 0xffu <<  8;
   5436          info->mb = 0xffu <<  0;
   5437          info->ma = 0xffu << 24;
   5438          info->all_a = 0; // if all_a is 0 at end, then we loaded alpha channel but it was all 0
   5439       } else {
   5440          // otherwise, use defaults, which is all-0
   5441          info->mr = info->mg = info->mb = info->ma = 0;
   5442       }
   5443       return 1;
   5444    }
   5445    return 0; // error
   5446 }
   5447 
   5448 static void *stbi__bmp_parse_header(stbi__context *s, stbi__bmp_data *info)
   5449 {
   5450    int hsz;
   5451    if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') return stbi__errpuc("not BMP", "Corrupt BMP");
   5452    stbi__get32le(s); // discard filesize
   5453    stbi__get16le(s); // discard reserved
   5454    stbi__get16le(s); // discard reserved
   5455    info->offset = stbi__get32le(s);
   5456    info->hsz = hsz = stbi__get32le(s);
   5457    info->mr = info->mg = info->mb = info->ma = 0;
   5458    info->extra_read = 14;
   5459 
   5460    if (info->offset < 0) return stbi__errpuc("bad BMP", "bad BMP");
   5461 
   5462    if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) return stbi__errpuc("unknown BMP", "BMP type not supported: unknown");
   5463    if (hsz == 12) {
   5464       s->img_x = stbi__get16le(s);
   5465       s->img_y = stbi__get16le(s);
   5466    } else {
   5467       s->img_x = stbi__get32le(s);
   5468       s->img_y = stbi__get32le(s);
   5469    }
   5470    if (stbi__get16le(s) != 1) return stbi__errpuc("bad BMP", "bad BMP");
   5471    info->bpp = stbi__get16le(s);
   5472    if (hsz != 12) {
   5473       int compress = stbi__get32le(s);
   5474       if (compress == 1 || compress == 2) return stbi__errpuc("BMP RLE", "BMP type not supported: RLE");
   5475       if (compress >= 4) return stbi__errpuc("BMP JPEG/PNG", "BMP type not supported: unsupported compression"); // this includes PNG/JPEG modes
   5476       if (compress == 3 && info->bpp != 16 && info->bpp != 32) return stbi__errpuc("bad BMP", "bad BMP"); // bitfields requires 16 or 32 bits/pixel
   5477       stbi__get32le(s); // discard sizeof
   5478       stbi__get32le(s); // discard hres
   5479       stbi__get32le(s); // discard vres
   5480       stbi__get32le(s); // discard colorsused
   5481       stbi__get32le(s); // discard max important
   5482       if (hsz == 40 || hsz == 56) {
   5483          if (hsz == 56) {
   5484             stbi__get32le(s);
   5485             stbi__get32le(s);
   5486             stbi__get32le(s);
   5487             stbi__get32le(s);
   5488          }
   5489          if (info->bpp == 16 || info->bpp == 32) {
   5490             if (compress == 0) {
   5491                stbi__bmp_set_mask_defaults(info, compress);
   5492             } else if (compress == 3) {
   5493                info->mr = stbi__get32le(s);
   5494                info->mg = stbi__get32le(s);
   5495                info->mb = stbi__get32le(s);
   5496                info->extra_read += 12;
   5497                // not documented, but generated by photoshop and handled by mspaint
   5498                if (info->mr == info->mg && info->mg == info->mb) {
   5499                   // ?!?!?
   5500                   return stbi__errpuc("bad BMP", "bad BMP");
   5501                }
   5502             } else
   5503                return stbi__errpuc("bad BMP", "bad BMP");
   5504          }
   5505       } else {
   5506          // V4/V5 header
   5507          int i;
   5508          if (hsz != 108 && hsz != 124)
   5509             return stbi__errpuc("bad BMP", "bad BMP");
   5510          info->mr = stbi__get32le(s);
   5511          info->mg = stbi__get32le(s);
   5512          info->mb = stbi__get32le(s);
   5513          info->ma = stbi__get32le(s);
   5514          if (compress != 3) // override mr/mg/mb unless in BI_BITFIELDS mode, as per docs
   5515             stbi__bmp_set_mask_defaults(info, compress);
   5516          stbi__get32le(s); // discard color space
   5517          for (i=0; i < 12; ++i)
   5518             stbi__get32le(s); // discard color space parameters
   5519          if (hsz == 124) {
   5520             stbi__get32le(s); // discard rendering intent
   5521             stbi__get32le(s); // discard offset of profile data
   5522             stbi__get32le(s); // discard size of profile data
   5523             stbi__get32le(s); // discard reserved
   5524          }
   5525       }
   5526    }
   5527    return (void *) 1;
   5528 }
   5529 
   5530 
   5531 static void *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
   5532 {
   5533    stbi_uc *out;
   5534    unsigned int mr=0,mg=0,mb=0,ma=0, all_a;
   5535    stbi_uc pal[256][4];
   5536    int psize=0,i,j,width;
   5537    int flip_vertically, pad, target;
   5538    stbi__bmp_data info;
   5539    STBI_NOTUSED(ri);
   5540 
   5541    info.all_a = 255;
   5542    if (stbi__bmp_parse_header(s, &info) == NULL)
   5543       return NULL; // error code already set
   5544 
   5545    flip_vertically = ((int) s->img_y) > 0;
   5546    s->img_y = abs((int) s->img_y);
   5547 
   5548    if (s->img_y > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
   5549    if (s->img_x > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
   5550 
   5551    mr = info.mr;
   5552    mg = info.mg;
   5553    mb = info.mb;
   5554    ma = info.ma;
   5555    all_a = info.all_a;
   5556 
   5557    if (info.hsz == 12) {
   5558       if (info.bpp < 24)
   5559          psize = (info.offset - info.extra_read - 24) / 3;
   5560    } else {
   5561       if (info.bpp < 16)
   5562          psize = (info.offset - info.extra_read - info.hsz) >> 2;
   5563    }
   5564    if (psize == 0) {
   5565       // accept some number of extra bytes after the header, but if the offset points either to before
   5566       // the header ends or implies a large amount of extra data, reject the file as malformed
   5567       int bytes_read_so_far = s->callback_already_read + (int)(s->img_buffer - s->img_buffer_original);
   5568       int header_limit = 1024; // max we actually read is below 256 bytes currently.
   5569       int extra_data_limit = 256*4; // what ordinarily goes here is a palette; 256 entries*4 bytes is its max size.
   5570       if (bytes_read_so_far <= 0 || bytes_read_so_far > header_limit) {
   5571          return stbi__errpuc("bad header", "Corrupt BMP");
   5572       }
   5573       // we established that bytes_read_so_far is positive and sensible.
   5574       // the first half of this test rejects offsets that are negative or too small,
   5575       // and guarantees that info.offset >= bytes_read_so_far > 0. this in turn
   5576       // ensures the number computed in the second half of the test can't overflow.
   5577       if (info.offset < bytes_read_so_far || info.offset - bytes_read_so_far > extra_data_limit) {
   5578          return stbi__errpuc("bad offset", "Corrupt BMP");
   5579       } else {
   5580          stbi__skip(s, info.offset - bytes_read_so_far);
   5581       }
   5582    }
   5583 
   5584    if (info.bpp == 24 && ma == 0xff000000)
   5585       s->img_n = 3;
   5586    else
   5587       s->img_n = ma ? 4 : 3;
   5588    if (req_comp && req_comp >= 3) // we can directly decode 3 or 4
   5589       target = req_comp;
   5590    else
   5591       target = s->img_n; // if they want monochrome, we'll post-convert
   5592 
   5593    // sanity-check size
   5594    if (!stbi__mad3sizes_valid(target, s->img_x, s->img_y, 0))
   5595       return stbi__errpuc("too large", "Corrupt BMP");
   5596 
   5597    out = (stbi_uc *) stbi__malloc_mad3(target, s->img_x, s->img_y, 0);
   5598    if (!out) return stbi__errpuc("outofmem", "Out of memory");
   5599    if (info.bpp < 16) {
   5600       int z=0;
   5601       if (psize == 0 || psize > 256) { STBI_FREE(out); return stbi__errpuc("invalid", "Corrupt BMP"); }
   5602       for (i=0; i < psize; ++i) {
   5603          pal[i][2] = stbi__get8(s);
   5604          pal[i][1] = stbi__get8(s);
   5605          pal[i][0] = stbi__get8(s);
   5606          if (info.hsz != 12) stbi__get8(s);
   5607          pal[i][3] = 255;
   5608       }
   5609       stbi__skip(s, info.offset - info.extra_read - info.hsz - psize * (info.hsz == 12 ? 3 : 4));
   5610       if (info.bpp == 1) width = (s->img_x + 7) >> 3;
   5611       else if (info.bpp == 4) width = (s->img_x + 1) >> 1;
   5612       else if (info.bpp == 8) width = s->img_x;
   5613       else { STBI_FREE(out); return stbi__errpuc("bad bpp", "Corrupt BMP"); }
   5614       pad = (-width)&3;
   5615       if (info.bpp == 1) {
   5616          for (j=0; j < (int) s->img_y; ++j) {
   5617             int bit_offset = 7, v = stbi__get8(s);
   5618             for (i=0; i < (int) s->img_x; ++i) {
   5619                int color = (v>>bit_offset)&0x1;
   5620                out[z++] = pal[color][0];
   5621                out[z++] = pal[color][1];
   5622                out[z++] = pal[color][2];
   5623                if (target == 4) out[z++] = 255;
   5624                if (i+1 == (int) s->img_x) break;
   5625                if((--bit_offset) < 0) {
   5626                   bit_offset = 7;
   5627                   v = stbi__get8(s);
   5628                }
   5629             }
   5630             stbi__skip(s, pad);
   5631          }
   5632       } else {
   5633          for (j=0; j < (int) s->img_y; ++j) {
   5634             for (i=0; i < (int) s->img_x; i += 2) {
   5635                int v=stbi__get8(s),v2=0;
   5636                if (info.bpp == 4) {
   5637                   v2 = v & 15;
   5638                   v >>= 4;
   5639                }
   5640                out[z++] = pal[v][0];
   5641                out[z++] = pal[v][1];
   5642                out[z++] = pal[v][2];
   5643                if (target == 4) out[z++] = 255;
   5644                if (i+1 == (int) s->img_x) break;
   5645                v = (info.bpp == 8) ? stbi__get8(s) : v2;
   5646                out[z++] = pal[v][0];
   5647                out[z++] = pal[v][1];
   5648                out[z++] = pal[v][2];
   5649                if (target == 4) out[z++] = 255;
   5650             }
   5651             stbi__skip(s, pad);
   5652          }
   5653       }
   5654    } else {
   5655       int rshift=0,gshift=0,bshift=0,ashift=0,rcount=0,gcount=0,bcount=0,acount=0;
   5656       int z = 0;
   5657       int easy=0;
   5658       stbi__skip(s, info.offset - info.extra_read - info.hsz);
   5659       if (info.bpp == 24) width = 3 * s->img_x;
   5660       else if (info.bpp == 16) width = 2*s->img_x;
   5661       else /* bpp = 32 and pad = 0 */ width=0;
   5662       pad = (-width) & 3;
   5663       if (info.bpp == 24) {
   5664          easy = 1;
   5665       } else if (info.bpp == 32) {
   5666          if (mb == 0xff && mg == 0xff00 && mr == 0x00ff0000 && ma == 0xff000000)
   5667             easy = 2;
   5668       }
   5669       if (!easy) {
   5670          if (!mr || !mg || !mb) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); }
   5671          // right shift amt to put high bit in position #7
   5672          rshift = stbi__high_bit(mr)-7; rcount = stbi__bitcount(mr);
   5673          gshift = stbi__high_bit(mg)-7; gcount = stbi__bitcount(mg);
   5674          bshift = stbi__high_bit(mb)-7; bcount = stbi__bitcount(mb);
   5675          ashift = stbi__high_bit(ma)-7; acount = stbi__bitcount(ma);
   5676          if (rcount > 8 || gcount > 8 || bcount > 8 || acount > 8) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); }
   5677       }
   5678       for (j=0; j < (int) s->img_y; ++j) {
   5679          if (easy) {
   5680             for (i=0; i < (int) s->img_x; ++i) {
   5681                unsigned char a;
   5682                out[z+2] = stbi__get8(s);
   5683                out[z+1] = stbi__get8(s);
   5684                out[z+0] = stbi__get8(s);
   5685                z += 3;
   5686                a = (easy == 2 ? stbi__get8(s) : 255);
   5687                all_a |= a;
   5688                if (target == 4) out[z++] = a;
   5689             }
   5690          } else {
   5691             int bpp = info.bpp;
   5692             for (i=0; i < (int) s->img_x; ++i) {
   5693                stbi__uint32 v = (bpp == 16 ? (stbi__uint32) stbi__get16le(s) : stbi__get32le(s));
   5694                unsigned int a;
   5695                out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mr, rshift, rcount));
   5696                out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mg, gshift, gcount));
   5697                out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mb, bshift, bcount));
   5698                a = (ma ? stbi__shiftsigned(v & ma, ashift, acount) : 255);
   5699                all_a |= a;
   5700                if (target == 4) out[z++] = STBI__BYTECAST(a);
   5701             }
   5702          }
   5703          stbi__skip(s, pad);
   5704       }
   5705    }
   5706 
   5707    // if alpha channel is all 0s, replace with all 255s
   5708    if (target == 4 && all_a == 0)
   5709       for (i=4*s->img_x*s->img_y-1; i >= 0; i -= 4)
   5710          out[i] = 255;
   5711 
   5712    if (flip_vertically) {
   5713       stbi_uc t;
   5714       for (j=0; j < (int) s->img_y>>1; ++j) {
   5715          stbi_uc *p1 = out +      j     *s->img_x*target;
   5716          stbi_uc *p2 = out + (s->img_y-1-j)*s->img_x*target;
   5717          for (i=0; i < (int) s->img_x*target; ++i) {
   5718             t = p1[i]; p1[i] = p2[i]; p2[i] = t;
   5719          }
   5720       }
   5721    }
   5722 
   5723    if (req_comp && req_comp != target) {
   5724       out = stbi__convert_format(out, target, req_comp, s->img_x, s->img_y);
   5725       if (out == NULL) return out; // stbi__convert_format frees input on failure
   5726    }
   5727 
   5728    *x = s->img_x;
   5729    *y = s->img_y;
   5730    if (comp) *comp = s->img_n;
   5731    return out;
   5732 }
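// Illustrative sketch, not part of stb_image: the loader above pads each
// stored row to a multiple of 4 bytes with pad = (-width) & 3. The helpers
// below restate that arithmetic standalone; demo_bmp_row_bytes and
// demo_bmp_row_pad are hypothetical names.

```c
/* Bytes of pixel data in one row of width_px pixels at bpp bits per pixel. */
static int demo_bmp_row_bytes(int width_px, int bpp)
{
   return (width_px * bpp + 7) / 8;
}

/* Bytes to skip after a row so the next row starts on a 4-byte boundary. */
static int demo_bmp_row_pad(int row_bytes)
{
   return (-row_bytes) & 3;
}
```

// Example: a 24-bit row of 5 pixels holds 15 bytes, so 1 byte of padding
// follows it; a 32-bit row never needs padding.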
   5733 #endif
   5734 
   5735 // Targa Truevision - TGA
   5736 // by Jonathan Dummer
   5737 #ifndef STBI_NO_TGA
   5738 // returns the component count (STBI_grey, STBI_grey_alpha, STBI_rgb or STBI_rgb_alpha), 0 on error
   5739 static int stbi__tga_get_comp(int bits_per_pixel, int is_grey, int* is_rgb16)
   5740 {
   5741    // only RGB or RGBA (incl. 16bit) or grey allowed
   5742    if (is_rgb16) *is_rgb16 = 0;
   5743    switch(bits_per_pixel) {
   5744       case 8:  return STBI_grey;
   5745       case 16: if(is_grey) return STBI_grey_alpha;
   5746                // fallthrough
   5747       case 15: if(is_rgb16) *is_rgb16 = 1;
   5748                return STBI_rgb;
   5749       case 24: // fallthrough
   5750       case 32: return bits_per_pixel/8;
   5751       default: return 0;
   5752    }
   5753 }
   5754 
   5755 static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp)
   5756 {
   5757     int tga_w, tga_h, tga_comp, tga_image_type, tga_bits_per_pixel, tga_colormap_bpp;
   5758     int sz, tga_colormap_type;
   5759     stbi__get8(s);                   // discard Offset
   5760     tga_colormap_type = stbi__get8(s); // colormap type
   5761     if( tga_colormap_type > 1 ) {
   5762         stbi__rewind(s);
   5763         return 0;      // only RGB or indexed allowed
   5764     }
   5765     tga_image_type = stbi__get8(s); // image type
   5766     if ( tga_colormap_type == 1 ) { // colormapped (paletted) image
   5767         if (tga_image_type != 1 && tga_image_type != 9) {
   5768             stbi__rewind(s);
   5769             return 0;
   5770         }
   5771         stbi__skip(s,4);       // skip index of first colormap entry and number of entries
   5772         sz = stbi__get8(s);    //   check bits per palette color entry
   5773         if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) {
   5774             stbi__rewind(s);
   5775             return 0;
   5776         }
   5777         stbi__skip(s,4);       // skip image x and y origin
   5778         tga_colormap_bpp = sz;
   5779     } else { // "normal" image w/o colormap - only RGB or grey allowed, +/- RLE
   5780         if ( (tga_image_type != 2) && (tga_image_type != 3) && (tga_image_type != 10) && (tga_image_type != 11) ) {
   5781             stbi__rewind(s);
   5782             return 0; // only RGB or grey allowed, +/- RLE
   5783         }
   5784         stbi__skip(s,9); // skip colormap specification and image x/y origin
   5785         tga_colormap_bpp = 0;
   5786     }
   5787     tga_w = stbi__get16le(s);
   5788     if( tga_w < 1 ) {
   5789         stbi__rewind(s);
   5790         return 0;   // test width
   5791     }
   5792     tga_h = stbi__get16le(s);
   5793     if( tga_h < 1 ) {
   5794         stbi__rewind(s);
   5795         return 0;   // test height
   5796     }
   5797     tga_bits_per_pixel = stbi__get8(s); // bits per pixel
   5798     stbi__get8(s); // ignore alpha bits
   5799     if (tga_colormap_bpp != 0) {
   5800         if((tga_bits_per_pixel != 8) && (tga_bits_per_pixel != 16)) {
   5801             // when using a colormap, tga_bits_per_pixel is the size of the indexes
   5802             // I don't think anything but 8 or 16bit indexes makes sense
   5803             stbi__rewind(s);
   5804             return 0;
   5805         }
   5806         tga_comp = stbi__tga_get_comp(tga_colormap_bpp, 0, NULL);
   5807     } else {
   5808         tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3) || (tga_image_type == 11), NULL);
   5809     }
   5810     if(!tga_comp) {
   5811       stbi__rewind(s);
   5812       return 0;
   5813     }
   5814     if (x) *x = tga_w;
   5815     if (y) *y = tga_h;
   5816     if (comp) *comp = tga_comp;
   5817     return 1;                   // seems to have passed everything
   5818 }
   5819 
   5820 static int stbi__tga_test(stbi__context *s)
   5821 {
   5822    int res = 0;
   5823    int sz, tga_color_type;
   5824    stbi__get8(s);      //   discard Offset
   5825    tga_color_type = stbi__get8(s);   //   color type
   5826    if ( tga_color_type > 1 ) goto errorEnd;   //   only RGB or indexed allowed
   5827    sz = stbi__get8(s);   //   image type
   5828    if ( tga_color_type == 1 ) { // colormapped (paletted) image
   5829       if (sz != 1 && sz != 9) goto errorEnd; // colortype 1 demands image type 1 or 9
   5830       stbi__skip(s,4);       // skip index of first colormap entry and number of entries
   5831       sz = stbi__get8(s);    //   check bits per palette color entry
   5832       if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd;
   5833       stbi__skip(s,4);       // skip image x and y origin
   5834    } else { // "normal" image w/o colormap
   5835       if ( (sz != 2) && (sz != 3) && (sz != 10) && (sz != 11) ) goto errorEnd; // only RGB or grey allowed, +/- RLE
   5836       stbi__skip(s,9); // skip colormap specification and image x/y origin
   5837    }
   5838    if ( stbi__get16le(s) < 1 ) goto errorEnd;      //   test width
   5839    if ( stbi__get16le(s) < 1 ) goto errorEnd;      //   test height
   5840    sz = stbi__get8(s);   //   bits per pixel
   5841    if ( (tga_color_type == 1) && (sz != 8) && (sz != 16) ) goto errorEnd; // for colormapped images, bpp is size of an index
   5842    if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd;
   5843 
   5844    res = 1; // if we got this far, everything's good and we can return 1 instead of 0
   5845 
   5846 errorEnd:
   5847    stbi__rewind(s);
   5848    return res;
   5849 }
   5850 
   5851 // read 16bit value and convert to 24bit RGB
   5852 static void stbi__tga_read_rgb16(stbi__context *s, stbi_uc* out)
   5853 {
   5854    stbi__uint16 px = (stbi__uint16)stbi__get16le(s);
   5855    stbi__uint16 fiveBitMask = 31;
   5856    // we have 3 channels with 5bits each
   5857    int r = (px >> 10) & fiveBitMask;
   5858    int g = (px >> 5) & fiveBitMask;
   5859    int b = px & fiveBitMask;
   5860    // Note that this saves the data in RGB(A) order, so it doesn't need to be swapped later
   5861    out[0] = (stbi_uc)((r * 255)/31);
   5862    out[1] = (stbi_uc)((g * 255)/31);
   5863    out[2] = (stbi_uc)((b * 255)/31);
   5864 
   5865    // some people claim that the most significant bit might be used for alpha
   5866    // (possibly if an alpha-bit is set in the "image descriptor byte")
   5867    // but that only made 16bit test images completely translucent..
   5868    // so let's treat all 15 and 16bit TGAs as RGB with no alpha.
   5869 }
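
// Illustration only, not part of stb_image: a minimal standalone sketch of the
// 5-bit to 8-bit expansion used by stbi__tga_read_rgb16 above. The names
// demo_expand5to8 and demo_unpack_rgb555 are hypothetical, and the sketch
// assumes the same layout as the loader (top bit ignored, then 5 bits each of
// red, green and blue).

```c
#include <assert.h>
#include <stdint.h>

// Scale a 5-bit channel value (0..31) to the full 8-bit range (0..255).
static uint8_t demo_expand5to8(unsigned v5)
{
   assert(v5 <= 31);                       // only 5-bit inputs are meaningful
   return (uint8_t)((v5 * 255u) / 31u);
}

// Unpack one 16-bit pixel into out[0..2] in R, G, B order.
static void demo_unpack_rgb555(uint16_t px, uint8_t out[3])
{
   out[0] = demo_expand5to8((px >> 10) & 31u);  // red
   out[1] = demo_expand5to8((px >> 5) & 31u);   // green
   out[2] = demo_expand5to8(px & 31u);          // blue
}
```

// Multiplying by 255 and dividing by 31 maps 31 to exactly 255, which a plain
// left shift by 3 would not (it would top out at 248).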
   5870 
   5871 static void *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
   5872 {
   5873    //   read in the TGA header stuff
   5874    int tga_offset = stbi__get8(s);
   5875    int tga_indexed = stbi__get8(s);
   5876    int tga_image_type = stbi__get8(s);
   5877    int tga_is_RLE = 0;
   5878    int tga_palette_start = stbi__get16le(s);
   5879    int tga_palette_len = stbi__get16le(s);
   5880    int tga_palette_bits = stbi__get8(s);
   5881    int tga_x_origin = stbi__get16le(s);
   5882    int tga_y_origin = stbi__get16le(s);
   5883    int tga_width = stbi__get16le(s);
   5884    int tga_height = stbi__get16le(s);
   5885    int tga_bits_per_pixel = stbi__get8(s);
   5886    int tga_comp, tga_rgb16=0;
   5887    int tga_inverted = stbi__get8(s);
   5888    // int tga_alpha_bits = tga_inverted & 15; // the 4 lowest bits - unused (useless?)
   5889    //   image data
   5890    unsigned char *tga_data;
   5891    unsigned char *tga_palette = NULL;
   5892    int i, j;
   5893    unsigned char raw_data[4] = {0};
   5894    int RLE_count = 0;
   5895    int RLE_repeating = 0;
   5896    int read_next_pixel = 1;
   5897    STBI_NOTUSED(ri);
   5898    STBI_NOTUSED(tga_x_origin); // @TODO
   5899    STBI_NOTUSED(tga_y_origin); // @TODO
   5900 
   5901    if (tga_height > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
   5902    if (tga_width > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
   5903 
   5904    //   do a tiny bit of preprocessing
   5905    if ( tga_image_type >= 8 )
   5906    {
   5907       tga_image_type -= 8;
   5908       tga_is_RLE = 1;
   5909    }
   5910    tga_inverted = 1 - ((tga_inverted >> 5) & 1);
   5911 
   5912    //   If I'm paletted, then I'll use the number of bits from the palette
   5913    if ( tga_indexed ) tga_comp = stbi__tga_get_comp(tga_palette_bits, 0, &tga_rgb16);
   5914    else tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3), &tga_rgb16);
   5915 
   5916    if(!tga_comp) // shouldn't really happen, stbi__tga_test() should have ensured basic consistency
   5917       return stbi__errpuc("bad format", "Can't find out TGA pixelformat");
   5918 
   5919    //   tga info
   5920    *x = tga_width;
   5921    *y = tga_height;
   5922    if (comp) *comp = tga_comp;
   5923 
   5924    if (!stbi__mad3sizes_valid(tga_width, tga_height, tga_comp, 0))
   5925       return stbi__errpuc("too large", "Corrupt TGA");
   5926 
   5927    tga_data = (unsigned char*)stbi__malloc_mad3(tga_width, tga_height, tga_comp, 0);
   5928    if (!tga_data) return stbi__errpuc("outofmem", "Out of memory");
   5929 
   5930    // skip to the data's starting position (offset usually = 0)
   5931    stbi__skip(s, tga_offset );
   5932 
   5933    if ( !tga_indexed && !tga_is_RLE && !tga_rgb16 ) {
   5934       for (i=0; i < tga_height; ++i) {
   5935          int row = tga_inverted ? tga_height -i - 1 : i;
   5936          stbi_uc *tga_row = tga_data + row*tga_width*tga_comp;
   5937          stbi__getn(s, tga_row, tga_width * tga_comp);
   5938       }
   5939    } else  {
   5940       //   do I need to load a palette?
   5941       if ( tga_indexed)
   5942       {
   5943          if (tga_palette_len == 0) {  /* you have to have at least one entry! */
   5944             STBI_FREE(tga_data);
   5945             return stbi__errpuc("bad palette", "Corrupt TGA");
   5946          }
   5947 
   5948          //   any data to skip? (offset usually = 0)
   5949          stbi__skip(s, tga_palette_start );
   5950          //   load the palette
   5951          tga_palette = (unsigned char*)stbi__malloc_mad2(tga_palette_len, tga_comp, 0);
   5952          if (!tga_palette) {
   5953             STBI_FREE(tga_data);
   5954             return stbi__errpuc("outofmem", "Out of memory");
   5955          }
   5956          if (tga_rgb16) {
   5957             stbi_uc *pal_entry = tga_palette;
   5958             STBI_ASSERT(tga_comp == STBI_rgb);
   5959             for (i=0; i < tga_palette_len; ++i) {
   5960                stbi__tga_read_rgb16(s, pal_entry);
   5961                pal_entry += tga_comp;
   5962             }
   5963          } else if (!stbi__getn(s, tga_palette, tga_palette_len * tga_comp)) {
   5964                STBI_FREE(tga_data);
   5965                STBI_FREE(tga_palette);
   5966                return stbi__errpuc("bad palette", "Corrupt TGA");
   5967          }
   5968       }
   5969       //   load the data
   5970       for (i=0; i < tga_width * tga_height; ++i)
   5971       {
   5972          //   if I'm in RLE mode, do I need to get a RLE chunk?
   5973          if ( tga_is_RLE )
   5974          {
   5975             if ( RLE_count == 0 )
   5976             {
   5977                //   yep, get the next byte as a RLE command
   5978                int RLE_cmd = stbi__get8(s);
   5979                RLE_count = 1 + (RLE_cmd & 127);
   5980                RLE_repeating = RLE_cmd >> 7;
   5981                read_next_pixel = 1;
   5982             } else if ( !RLE_repeating )
   5983             {
   5984                read_next_pixel = 1;
   5985             }
   5986          } else
   5987          {
   5988             read_next_pixel = 1;
   5989          }
   5990          //   OK, if I need to read a pixel, do it now
   5991          if ( read_next_pixel )
   5992          {
   5993             //   load however much data we did have
   5994             if ( tga_indexed )
   5995             {
   5996                // read in index, then perform the lookup
   5997                int pal_idx = (tga_bits_per_pixel == 8) ? stbi__get8(s) : stbi__get16le(s);
   5998                if ( pal_idx >= tga_palette_len ) {
   5999                   // invalid index
   6000                   pal_idx = 0;
   6001                }
   6002                pal_idx *= tga_comp;
   6003                for (j = 0; j < tga_comp; ++j) {
   6004                   raw_data[j] = tga_palette[pal_idx+j];
   6005                }
   6006             } else if(tga_rgb16) {
   6007                STBI_ASSERT(tga_comp == STBI_rgb);
   6008                stbi__tga_read_rgb16(s, raw_data);
   6009             } else {
   6010                //   read in the data raw
   6011                for (j = 0; j < tga_comp; ++j) {
   6012                   raw_data[j] = stbi__get8(s);
   6013                }
   6014             }
   6015             //   clear the reading flag for the next pixel
   6016             read_next_pixel = 0;
   6017          } // end of reading a pixel
   6018 
   6019          // copy data
   6020          for (j = 0; j < tga_comp; ++j)
   6021            tga_data[i*tga_comp+j] = raw_data[j];
   6022 
   6023          //   in case we're in RLE mode, keep counting down
   6024          --RLE_count;
   6025       }
   6026       //   do I need to invert the image?
   6027       if ( tga_inverted )
   6028       {
   6029          for (j = 0; j*2 < tga_height; ++j)
   6030          {
   6031             int index1 = j * tga_width * tga_comp;
   6032             int index2 = (tga_height - 1 - j) * tga_width * tga_comp;
   6033             for (i = tga_width * tga_comp; i > 0; --i)
   6034             {
   6035                unsigned char temp = tga_data[index1];
   6036                tga_data[index1] = tga_data[index2];
   6037                tga_data[index2] = temp;
   6038                ++index1;
   6039                ++index2;
   6040             }
   6041          }
   6042       }
   6043       //   clear my palette, if I had one
   6044       if ( tga_palette != NULL )
   6045       {
   6046          STBI_FREE( tga_palette );
   6047       }
   6048    }
   6049 
   6050    // swap RGB - if the source data was RGB16, it already is in the right order
   6051    if (tga_comp >= 3 && !tga_rgb16)
   6052    {
   6053       unsigned char* tga_pixel = tga_data;
   6054       for (i=0; i < tga_width * tga_height; ++i)
   6055       {
   6056          unsigned char temp = tga_pixel[0];
   6057          tga_pixel[0] = tga_pixel[2];
   6058          tga_pixel[2] = temp;
   6059          tga_pixel += tga_comp;
   6060       }
   6061    }
   6062 
   6063    // convert to target component count
   6064    if (req_comp && req_comp != tga_comp)
   6065       tga_data = stbi__convert_format(tga_data, tga_comp, req_comp, tga_width, tga_height);
   6066 
   6067    //   the things I do to get rid of an error message, and yet keep
   6068    //   Microsoft's C compilers happy... [8^(
   6069    tga_palette_start = tga_palette_len = tga_palette_bits =
   6070          tga_x_origin = tga_y_origin = 0;
   6071    STBI_NOTUSED(tga_palette_start);
   6072    //   OK, done
   6073    return tga_data;
   6074 }
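
// Illustration only, not part of stb_image: a rough sketch of the RLE command
// byte handling in stbi__tga_load above, reduced to 1-byte pixels read from a
// memory buffer. demo_tga_rle_decode8 is a hypothetical name. The explicit
// truncation checks and the clamping of the last run are assumptions of this
// sketch, not the loader's behavior.

```c
#include <assert.h>
#include <stddef.h>

// Decode pixel_count 1-byte pixels from src into dst.
// Returns the number of pixels written, or 0 if the input is truncated.
static size_t demo_tga_rle_decode8(const unsigned char *src, size_t src_len,
                                   unsigned char *dst, size_t pixel_count)
{
   size_t in = 0, out = 0;
   assert(src != NULL && dst != NULL);
   while (out < pixel_count) {
      unsigned cmd, count, i;
      if (in >= src_len) return 0;             // ran out of input
      cmd = src[in++];
      count = 1u + (cmd & 127u);               // low 7 bits hold run length minus one
      if (count > pixel_count - out)
         count = (unsigned)(pixel_count - out);
      if (cmd >> 7) {                          // high bit set: one pixel, repeated
         if (in >= src_len) return 0;
         for (i = 0; i < count; ++i) dst[out++] = src[in];
         ++in;
      } else {                                 // high bit clear: count raw pixels follow
         if (src_len - in < count) return 0;
         for (i = 0; i < count; ++i) dst[out++] = src[in++];
      }
   }
   return out;
}
```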
   6075 #endif
   6076 
   6077 // *************************************************************************************************
   6078 // Photoshop PSD loader -- PD by Thatcher Ulrich, integration by Nicolas Schulz, tweaked by STB
   6079 
   6080 #ifndef STBI_NO_PSD
   6081 static int stbi__psd_test(stbi__context *s)
   6082 {
   6083    int r = (stbi__get32be(s) == 0x38425053);
   6084    stbi__rewind(s);
   6085    return r;
   6086 }
   6087 
   6088 static int stbi__psd_decode_rle(stbi__context *s, stbi_uc *p, int pixelCount)
   6089 {
   6090    int count, nleft, len;
   6091 
   6092    count = 0;
   6093    while ((nleft = pixelCount - count) > 0) {
   6094       len = stbi__get8(s);
   6095       if (len == 128) {
   6096          // No-op.
   6097       } else if (len < 128) {
   6098          // Copy next len+1 bytes literally.
   6099          len++;
   6100          if (len > nleft) return 0; // corrupt data
   6101          count += len;
   6102          while (len) {
   6103             *p = stbi__get8(s);
   6104             p += 4;
   6105             len--;
   6106          }
   6107       } else if (len > 128) {
   6108          stbi_uc   val;
   6109          // Next -len+1 bytes in the dest are replicated from next source byte.
   6110          // (Interpret len as a negative 8-bit int.)
   6111          len = 257 - len;
   6112          if (len > nleft) return 0; // corrupt data
   6113          val = stbi__get8(s);
   6114          count += len;
   6115          while (len) {
   6116             *p = val;
   6117             p += 4;
   6118             len--;
   6119          }
   6120       }
   6121    }
   6122 
   6123    return 1;
   6124 }
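
// Illustration only, not part of stb_image: a minimal sketch of the same RLE
// scheme as stbi__psd_decode_rle above, decoding from a memory buffer into a
// contiguous output (stride 1 instead of 4). demo_packbits_decode is a
// hypothetical name, and the source-length checks are additions of this sketch.

```c
#include <assert.h>
#include <stddef.h>

// Decode exactly dst_len bytes. Returns 1 on success, 0 on truncated input
// or on a run that would overflow the destination.
static int demo_packbits_decode(const unsigned char *src, size_t src_len,
                                unsigned char *dst, size_t dst_len)
{
   size_t in = 0, out = 0;
   assert(src != NULL && dst != NULL);
   while (out < dst_len) {
      size_t len, i;
      if (in >= src_len) return 0;
      len = src[in++];
      if (len == 128) {
         continue;                             // no-op byte
      } else if (len < 128) {
         len += 1;                             // literal run of len+1 bytes
         if (len > dst_len - out || len > src_len - in) return 0;
         for (i = 0; i < len; ++i) dst[out++] = src[in++];
      } else {
         len = 257 - len;                      // repeat run of 2..128 bytes
         if (len > dst_len - out || in >= src_len) return 0;
         for (i = 0; i < len; ++i) dst[out++] = src[in];
         ++in;
      }
   }
   return 1;
}
```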
   6125 
   6126 static void *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri, int bpc)
   6127 {
   6128    int pixelCount;
   6129    int channelCount, compression;
   6130    int channel, i;
   6131    int bitdepth;
   6132    int w,h;
   6133    stbi_uc *out;
   6134    STBI_NOTUSED(ri);
   6135 
   6136    // Check identifier
   6137    if (stbi__get32be(s) != 0x38425053)   // "8BPS"
   6138       return stbi__errpuc("not PSD", "Corrupt PSD image");
   6139 
   6140    // Check file type version.
   6141    if (stbi__get16be(s) != 1)
   6142       return stbi__errpuc("wrong version", "Unsupported version of PSD image");
   6143 
   6144    // Skip 6 reserved bytes.
   6145    stbi__skip(s, 6 );
   6146 
   6147    // Read the number of channels (R, G, B, A, etc).
   6148    channelCount = stbi__get16be(s);
   6149    if (channelCount < 0 || channelCount > 16)
   6150       return stbi__errpuc("wrong channel count", "Unsupported number of channels in PSD image");
   6151 
   6152    // Read the rows and columns of the image.
   6153    h = stbi__get32be(s);
   6154    w = stbi__get32be(s);
   6155 
   6156    if (h > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
   6157    if (w > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
   6158 
   6159    // Make sure the depth is 8 or 16 bits.
   6160    bitdepth = stbi__get16be(s);
   6161    if (bitdepth != 8 && bitdepth != 16)
   6162       return stbi__errpuc("unsupported bit depth", "PSD bit depth is not 8 or 16 bit");
   6163 
   6164    // Make sure the color mode is RGB.
   6165    // Valid options are:
   6166    //   0: Bitmap
   6167    //   1: Grayscale
   6168    //   2: Indexed color
   6169    //   3: RGB color
   6170    //   4: CMYK color
   6171    //   7: Multichannel
   6172    //   8: Duotone
   6173    //   9: Lab color
   6174    if (stbi__get16be(s) != 3)
   6175       return stbi__errpuc("wrong color format", "PSD is not in RGB color format");
   6176 
   6177    // Skip the Mode Data.  (It's the palette for indexed color; other info for other modes.)
   6178    stbi__skip(s,stbi__get32be(s) );
   6179 
   6180    // Skip the image resources.  (resolution, pen tool paths, etc)
   6181    stbi__skip(s, stbi__get32be(s) );
   6182 
   6183    // Skip the reserved data.
   6184    stbi__skip(s, stbi__get32be(s) );
   6185 
   6186    // Find out if the data is compressed.
   6187    // Known values:
   6188    //   0: no compression
   6189    //   1: RLE compressed
   6190    compression = stbi__get16be(s);
   6191    if (compression > 1)
   6192       return stbi__errpuc("bad compression", "PSD has an unknown compression format");
   6193 
   6194    // Check size
   6195    if (!stbi__mad3sizes_valid(4, w, h, 0))
   6196       return stbi__errpuc("too large", "Corrupt PSD");
   6197 
   6198    // Create the destination image.
   6199 
   6200    if (!compression && bitdepth == 16 && bpc == 16) {
   6201       out = (stbi_uc *) stbi__malloc_mad3(8, w, h, 0);
   6202       ri->bits_per_channel = 16;
   6203    } else
   6204       out = (stbi_uc *) stbi__malloc(4 * w*h);
   6205 
   6206    if (!out) return stbi__errpuc("outofmem", "Out of memory");
   6207    pixelCount = w*h;
   6208 
   6209    // Initialize the data to zero.
   6210    //memset( out, 0, pixelCount * 4 );
   6211 
   6212    // Finally, the image data.
   6213    if (compression) {
   6214       // RLE as used by .PSD and .TIFF
   6215       // Loop until you get the number of unpacked bytes you are expecting:
   6216       //     Read the next source byte into n.
   6217       //     If n is between 0 and 127 inclusive, copy the next n+1 bytes literally.
   6218       //     Else if n is between -127 and -1 inclusive, copy the next byte -n+1 times.
   6219       //     Else if n is -128, noop.
   6220       // Endloop
   6221 
   6222       // The RLE-compressed data is preceded by a 2-byte data count for each row in the data,
   6223       // which we're going to just skip.
   6224       stbi__skip(s, h * channelCount * 2 );
   6225 
   6226       // Read the RLE data by channel.
   6227       for (channel = 0; channel < 4; channel++) {
   6228          stbi_uc *p;
   6229 
   6230          p = out+channel;
   6231          if (channel >= channelCount) {
   6232             // Fill this channel with default data.
   6233             for (i = 0; i < pixelCount; i++, p += 4)
   6234                *p = (channel == 3 ? 255 : 0);
   6235          } else {
   6236             // Read the RLE data.
   6237             if (!stbi__psd_decode_rle(s, p, pixelCount)) {
   6238                STBI_FREE(out);
   6239                return stbi__errpuc("corrupt", "bad RLE data");
   6240             }
   6241          }
   6242       }
   6243 
   6244    } else {
   6245       // We're at the raw image data.  It's each channel in order (Red, Green, Blue, Alpha, ...)
   6246       // where each channel consists of an 8-bit (or 16-bit) value for each pixel in the image.
   6247 
   6248       // Read the data by channel.
   6249       for (channel = 0; channel < 4; channel++) {
   6250          if (channel >= channelCount) {
   6251             // Fill this channel with default data.
   6252             if (bitdepth == 16 && bpc == 16) {
   6253                stbi__uint16 *q = ((stbi__uint16 *) out) + channel;
   6254                stbi__uint16 val = channel == 3 ? 65535 : 0;
   6255                for (i = 0; i < pixelCount; i++, q += 4)
   6256                   *q = val;
   6257             } else {
   6258                stbi_uc *p = out+channel;
   6259                stbi_uc val = channel == 3 ? 255 : 0;
   6260                for (i = 0; i < pixelCount; i++, p += 4)
   6261                   *p = val;
   6262             }
   6263          } else {
   6264             if (ri->bits_per_channel == 16) {    // output bpc
   6265                stbi__uint16 *q = ((stbi__uint16 *) out) + channel;
   6266                for (i = 0; i < pixelCount; i++, q += 4)
   6267                   *q = (stbi__uint16) stbi__get16be(s);
   6268             } else {
   6269                stbi_uc *p = out+channel;
   6270                if (bitdepth == 16) {  // input bpc
   6271                   for (i = 0; i < pixelCount; i++, p += 4)
   6272                      *p = (stbi_uc) (stbi__get16be(s) >> 8);
   6273                } else {
   6274                   for (i = 0; i < pixelCount; i++, p += 4)
   6275                      *p = stbi__get8(s);
   6276                }
   6277             }
   6278          }
   6279       }
   6280    }
   6281 
   6282    // remove weird white matte from PSD
   6283    if (channelCount >= 4) {
   6284       if (ri->bits_per_channel == 16) {
   6285          for (i=0; i < w*h; ++i) {
   6286             stbi__uint16 *pixel = (stbi__uint16 *) out + 4*i;
   6287             if (pixel[3] != 0 && pixel[3] != 65535) {
   6288                float a = pixel[3] / 65535.0f;
   6289                float ra = 1.0f / a;
   6290                float inv_a = 65535.0f * (1 - ra);
   6291                pixel[0] = (stbi__uint16) (pixel[0]*ra + inv_a);
   6292                pixel[1] = (stbi__uint16) (pixel[1]*ra + inv_a);
   6293                pixel[2] = (stbi__uint16) (pixel[2]*ra + inv_a);
   6294             }
   6295          }
   6296       } else {
   6297          for (i=0; i < w*h; ++i) {
   6298             unsigned char *pixel = out + 4*i;
   6299             if (pixel[3] != 0 && pixel[3] != 255) {
   6300                float a = pixel[3] / 255.0f;
   6301                float ra = 1.0f / a;
   6302                float inv_a = 255.0f * (1 - ra);
   6303                pixel[0] = (unsigned char) (pixel[0]*ra + inv_a);
   6304                pixel[1] = (unsigned char) (pixel[1]*ra + inv_a);
   6305                pixel[2] = (unsigned char) (pixel[2]*ra + inv_a);
   6306             }
   6307          }
   6308       }
   6309    }
   6310 
   6311    // convert to desired output format
   6312    if (req_comp && req_comp != 4) {
   6313       if (ri->bits_per_channel == 16)
   6314          out = (stbi_uc *) stbi__convert_format16((stbi__uint16 *) out, 4, req_comp, w, h);
   6315       else
   6316          out = stbi__convert_format(out, 4, req_comp, w, h);
   6317       if (out == NULL) return out; // stbi__convert_format frees input on failure
   6318    }
   6319 
   6320    if (comp) *comp = 4;
   6321    *y = h;
   6322    *x = w;
   6323 
   6324    return out;
   6325 }
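
// Illustration only, not part of stb_image: a small sketch of the 8-bit
// "white matte" removal done near the end of stbi__psd_load above. The names
// demo_unmatte8 and demo_unmatte_rgba are hypothetical. The clamp to 0..255 is
// an addition of this sketch; the loader casts the float result directly.

```c
#include <assert.h>
#include <stddef.h>

// Undo blending of one color channel against white for a given alpha.
static unsigned char demo_unmatte8(unsigned char c, unsigned char alpha)
{
   float a, ra, v;
   if (alpha == 0 || alpha == 255) return c;   // same cases the loader leaves alone
   a  = alpha / 255.0f;
   ra = 1.0f / a;
   v  = c * ra + 255.0f * (1.0f - ra);         // c/a minus the white contribution
   if (v < 0.0f)   v = 0.0f;                   // clamp: assumption of this sketch
   if (v > 255.0f) v = 255.0f;
   return (unsigned char)v;
}

// Apply the correction to the three color channels of one RGBA pixel.
static void demo_unmatte_rgba(unsigned char *px)
{
   assert(px != NULL);
   px[0] = demo_unmatte8(px[0], px[3]);
   px[1] = demo_unmatte8(px[1], px[3]);
   px[2] = demo_unmatte8(px[2], px[3]);
}
```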
   6326 #endif
   6327 
   6328 // *************************************************************************************************
   6329 // Softimage PIC loader
   6330 // by Tom Seddon
   6331 //
   6332 // See http://softimage.wiki.softimage.com/index.php/INFO:_PIC_file_format
   6333 // See http://ozviz.wasp.uwa.edu.au/~pbourke/dataformats/softimagepic/
   6334 
   6335 #ifndef STBI_NO_PIC
   6336 static int stbi__pic_is4(stbi__context *s,const char *str)
   6337 {
   6338    int i;
   6339    for (i=0; i<4; ++i)
   6340       if (stbi__get8(s) != (stbi_uc)str[i])
   6341          return 0;
   6342 
   6343    return 1;
   6344 }
   6345 
   6346 static int stbi__pic_test_core(stbi__context *s)
   6347 {
   6348    int i;
   6349 
   6350    if (!stbi__pic_is4(s,"\x53\x80\xF6\x34"))
   6351       return 0;
   6352 
   6353    for(i=0;i<84;++i)
   6354       stbi__get8(s);
   6355 
   6356    if (!stbi__pic_is4(s,"PICT"))
   6357       return 0;
   6358 
   6359    return 1;
   6360 }
   6361 
   6362 typedef struct
   6363 {
   6364    stbi_uc size,type,channel;
   6365 } stbi__pic_packet;
   6366 
   6367 static stbi_uc *stbi__readval(stbi__context *s, int channel, stbi_uc *dest)
   6368 {
   6369    int mask=0x80, i;
   6370 
   6371    for (i=0; i<4; ++i, mask>>=1) {
   6372       if (channel & mask) {
   6373          if (stbi__at_eof(s)) return stbi__errpuc("bad file","PIC file too short");
   6374          dest[i]=stbi__get8(s);
   6375       }
   6376    }
   6377 
   6378    return dest;
   6379 }
   6380 
   6381 static void stbi__copyval(int channel,stbi_uc *dest,const stbi_uc *src)
   6382 {
   6383    int mask=0x80,i;
   6384 
   6385    for (i=0;i<4; ++i, mask>>=1)
   6386       if (channel&mask)
   6387          dest[i]=src[i];
   6388 }
   6389 
   6390 static stbi_uc *stbi__pic_load_core(stbi__context *s,int width,int height,int *comp, stbi_uc *result)
   6391 {
   6392    int act_comp=0,num_packets=0,y,chained;
   6393    stbi__pic_packet packets[10];
   6394 
   6395    // this will (should...) cater for even some bizarre stuff like having data
   6396    // for the same channel in multiple packets.
   6397    do {
   6398       stbi__pic_packet *packet;
   6399 
   6400       if (num_packets==sizeof(packets)/sizeof(packets[0]))
   6401          return stbi__errpuc("bad format","too many packets");
   6402 
   6403       packet = &packets[num_packets++];
   6404 
   6405       chained = stbi__get8(s);
   6406       packet->size    = stbi__get8(s);
   6407       packet->type    = stbi__get8(s);
   6408       packet->channel = stbi__get8(s);
   6409 
   6410       act_comp |= packet->channel;
   6411 
   6412       if (stbi__at_eof(s))          return stbi__errpuc("bad file","file too short (reading packets)");
   6413       if (packet->size != 8)  return stbi__errpuc("bad format","packet isn't 8bpp");
   6414    } while (chained);
   6415 
   6416    *comp = (act_comp & 0x10 ? 4 : 3); // has alpha channel?
   6417 
   6418    for(y=0; y<height; ++y) {
   6419       int packet_idx;
   6420 
   6421       for(packet_idx=0; packet_idx < num_packets; ++packet_idx) {
   6422          stbi__pic_packet *packet = &packets[packet_idx];
   6423          stbi_uc *dest = result+y*width*4;
   6424 
   6425          switch (packet->type) {
   6426             default:
   6427                return stbi__errpuc("bad format","packet has bad compression type");
   6428 
   6429             case 0: {//uncompressed
   6430                int x;
   6431 
   6432                for(x=0;x<width;++x, dest+=4)
   6433                   if (!stbi__readval(s,packet->channel,dest))
   6434                      return 0;
   6435                break;
   6436             }
   6437 
   6438             case 1://Pure RLE
   6439                {
   6440                   int left=width, i;
   6441 
   6442                   while (left>0) {
   6443                      stbi_uc count,value[4];
   6444 
   6445                      count=stbi__get8(s);
   6446                      if (stbi__at_eof(s))   return stbi__errpuc("bad file","file too short (pure read count)");
   6447 
   6448                      if (count > left)
   6449                         count = (stbi_uc) left;
   6450 
   6451                      if (!stbi__readval(s,packet->channel,value))  return 0;
   6452 
   6453                      for(i=0; i<count; ++i,dest+=4)
   6454                         stbi__copyval(packet->channel,dest,value);
   6455                      left -= count;
   6456                   }
   6457                }
   6458                break;
   6459 
   6460             case 2: {//Mixed RLE
   6461                int left=width;
   6462                while (left>0) {
   6463                   int count = stbi__get8(s), i;
   6464                   if (stbi__at_eof(s))  return stbi__errpuc("bad file","file too short (mixed read count)");
   6465 
   6466                   if (count >= 128) { // Repeated
   6467                      stbi_uc value[4];
   6468 
   6469                      if (count==128)
   6470                         count = stbi__get16be(s);
   6471                      else
   6472                         count -= 127;
   6473                      if (count > left)
   6474                         return stbi__errpuc("bad file","scanline overrun");
   6475 
   6476                      if (!stbi__readval(s,packet->channel,value))
   6477                         return 0;
   6478 
   6479                      for(i=0;i<count;++i, dest += 4)
   6480                         stbi__copyval(packet->channel,dest,value);
   6481                   } else { // Raw
   6482                      ++count;
   6483                      if (count>left) return stbi__errpuc("bad file","scanline overrun");
   6484 
   6485                      for(i=0;i<count;++i, dest+=4)
   6486                         if (!stbi__readval(s,packet->channel,dest))
   6487                            return 0;
   6488                   }
   6489                   left-=count;
   6490                }
   6491                break;
   6492             }
   6493          }
   6494       }
   6495    }
   6496 
   6497    return result;
   6498 }
   6499 
   6500 static void *stbi__pic_load(stbi__context *s,int *px,int *py,int *comp,int req_comp, stbi__result_info *ri)
   6501 {
   6502    stbi_uc *result;
   6503    int i, x,y, internal_comp;
   6504    STBI_NOTUSED(ri);
   6505 
   6506    if (!comp) comp = &internal_comp;
   6507 
   6508    for (i=0; i<92; ++i)
   6509       stbi__get8(s);
   6510 
   6511    x = stbi__get16be(s);
   6512    y = stbi__get16be(s);
   6513 
   6514    if (y > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
   6515    if (x > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
   6516 
   6517    if (stbi__at_eof(s))  return stbi__errpuc("bad file","file too short (pic header)");
   6518    if (!stbi__mad3sizes_valid(x, y, 4, 0)) return stbi__errpuc("too large", "PIC image too large to decode");
   6519 
   6520    stbi__get32be(s); //skip `ratio'
   6521    stbi__get16be(s); //skip `fields'
   6522    stbi__get16be(s); //skip `pad'
   6523 
   6524    // intermediate buffer is RGBA
   6525    result = (stbi_uc *) stbi__malloc_mad3(x, y, 4, 0);
   6526    if (!result) return stbi__errpuc("outofmem", "Out of memory");
   6527    memset(result, 0xff, x*y*4);
   6528 
   6529    if (!stbi__pic_load_core(s,x,y,comp, result)) {
   6530       STBI_FREE(result);
   6531       return 0; // bail out here: converting a NULL image below would read through a null pointer
   6532    }
   6533    *px = x;
   6534    *py = y;
   6535    if (req_comp == 0) req_comp = *comp;
   6536    result=stbi__convert_format(result,4,req_comp,x,y);
   6537 
   6538    return result;
   6539 }
   6540 
   6541 static int stbi__pic_test(stbi__context *s)
   6542 {
   6543    int r = stbi__pic_test_core(s);
   6544    stbi__rewind(s);
   6545    return r;
   6546 }
   6547 #endif
   6548 
   6549 // *************************************************************************************************
   6550 // GIF loader -- public domain by Jean-Marc Lienher -- simplified/shrunk by stb
   6551 
   6552 #ifndef STBI_NO_GIF
   6553 typedef struct
   6554 {
   6555    stbi__int16 prefix;
   6556    stbi_uc first;
   6557    stbi_uc suffix;
   6558 } stbi__gif_lzw;
   6559 
   6560 typedef struct
   6561 {
   6562    int w,h;
   6563    stbi_uc *out;                 // output buffer (always 4 components)
   6564    stbi_uc *background;          // The current "background" as far as a gif is concerned
   6565    stbi_uc *history;
   6566    int flags, bgindex, ratio, transparent, eflags;
   6567    stbi_uc  pal[256][4];
   6568    stbi_uc lpal[256][4];
   6569    stbi__gif_lzw codes[8192];
   6570    stbi_uc *color_table;
   6571    int parse, step;
   6572    int lflags;
   6573    int start_x, start_y;
   6574    int max_x, max_y;
   6575    int cur_x, cur_y;
   6576    int line_size;
   6577    int delay;
   6578 } stbi__gif;
   6579 
   6580 static int stbi__gif_test_raw(stbi__context *s)
   6581 {
   6582    int sz;
   6583    if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8') return 0;
   6584    sz = stbi__get8(s);
   6585    if (sz != '9' && sz != '7') return 0;
   6586    if (stbi__get8(s) != 'a') return 0;
   6587    return 1;
   6588 }
   6589 
   6590 static int stbi__gif_test(stbi__context *s)
   6591 {
   6592    int r = stbi__gif_test_raw(s);
   6593    stbi__rewind(s);
   6594    return r;
   6595 }
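// Illustrative sketch, not part of stb_image: the same signature test applied to a plain
// memory buffer instead of an stbi__context. The helper name is_gif_signature is
// hypothetical; it accepts "GIF87a" and "GIF89a" only, mirroring stbi__gif_test_raw above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if buf starts with "GIF87a" or "GIF89a", otherwise 0. */
static int is_gif_signature(const unsigned char *buf, size_t len)
{
   assert(buf != NULL);
   if (len < 6) return 0;                          /* too short to hold a signature */
   if (memcmp(buf, "GIF8", 4) != 0) return 0;      /* common prefix of both versions */
   if (buf[4] != '7' && buf[4] != '9') return 0;   /* version digit */
   return buf[5] == 'a';
}
```

// Because the sketch only inspects the buffer, it needs no rewind step, unlike the
// stream-based test, which consumes bytes from the context.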
   6596 
   6597 static void stbi__gif_parse_colortable(stbi__context *s, stbi_uc pal[256][4], int num_entries, int transp)
   6598 {
   6599    int i;
   6600    for (i=0; i < num_entries; ++i) {
   6601       pal[i][2] = stbi__get8(s);
   6602       pal[i][1] = stbi__get8(s);
   6603       pal[i][0] = stbi__get8(s);
   6604       pal[i][3] = transp == i ? 0 : 255;
   6605    }
   6606 }
   6607 
   6608 static int stbi__gif_header(stbi__context *s, stbi__gif *g, int *comp, int is_info)
   6609 {
   6610    stbi_uc version;
   6611    if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8')
   6612       return stbi__err("not GIF", "Corrupt GIF");
   6613 
   6614    version = stbi__get8(s);
   6615    if (version != '7' && version != '9')    return stbi__err("not GIF", "Corrupt GIF");
   6616    if (stbi__get8(s) != 'a')                return stbi__err("not GIF", "Corrupt GIF");
   6617 
   6618    stbi__g_failure_reason = "";
   6619    g->w = stbi__get16le(s);
   6620    g->h = stbi__get16le(s);
   6621    g->flags = stbi__get8(s);
   6622    g->bgindex = stbi__get8(s);
   6623    g->ratio = stbi__get8(s);
   6624    g->transparent = -1;
   6625 
   6626    if (g->w > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
   6627    if (g->h > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
   6628 
   6629    if (comp != 0) *comp = 4;  // can't actually tell whether it's 3 or 4 until we parse the extension blocks
   6630 
   6631    if (is_info) return 1;
   6632 
   6633    if (g->flags & 0x80)
   6634       stbi__gif_parse_colortable(s,g->pal, 2 << (g->flags & 7), -1);
   6635 
   6636    return 1;
   6637 }
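// Illustrative sketch, not part of stb_image: how the logical screen descriptor read by
// stbi__gif_header is laid out, decoded from a byte buffer. The names gif_screen_info and
// parse_gif_screen are hypothetical, and the buffer is assumed to start right after the
// 6-byte signature.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
   int width, height;      /* logical screen size in pixels */
   int has_global_table;   /* bit 7 of the packed flags byte */
   int table_entries;      /* 2 << (flags & 7); meaningful only if has_global_table */
   int bg_index;           /* background color index */
} gif_screen_info;

/* Needs 7 bytes: width, height (16-bit little-endian), flags, bgindex, aspect ratio. */
static int parse_gif_screen(const unsigned char *buf, size_t len, gif_screen_info *out)
{
   assert(buf != NULL && out != NULL);
   if (len < 7) return 0;
   out->width  = buf[0] | (buf[1] << 8);
   out->height = buf[2] | (buf[3] << 8);
   out->has_global_table = (buf[4] & 0x80) != 0;
   out->table_entries    = 2 << (buf[4] & 7);
   out->bg_index = buf[5];
   /* buf[6] is the pixel aspect ratio; it is read but otherwise unused above */
   return 1;
}
```

// The color table size follows the same 2 << (flags & 7) rule that the header code passes
// to stbi__gif_parse_colortable, so a flags byte of 0xF7 means a 256-entry global table.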
   6638 
   6639 static int stbi__gif_info_raw(stbi__context *s, int *x, int *y, int *comp)
   6640 {
   6641    stbi__gif* g = (stbi__gif*) stbi__malloc(sizeof(stbi__gif));
   6642    if (!g) return stbi__err("outofmem", "Out of memory");
   6643    if (!stbi__gif_header(s, g, comp, 1)) {
   6644       STBI_FREE(g);
   6645       stbi__rewind( s );
   6646       return 0;
   6647    }
   6648    if (x) *x = g->w;
   6649    if (y) *y = g->h;
   6650    STBI_FREE(g);
   6651    return 1;
   6652 }
   6653 
   6654 static void stbi__out_gif_code(stbi__gif *g, stbi__uint16 code)
   6655 {
   6656    stbi_uc *p, *c;
   6657    int idx;
   6658 
   6659    // recurse to decode the prefixes, since the linked-list is backwards,
   6660    // and working backwards through an interleaved image would be nasty
   6661    if (g->codes[code].prefix >= 0)
   6662       stbi__out_gif_code(g, g->codes[code].prefix);
   6663 
   6664    if (g->cur_y >= g->max_y) return;
   6665 
   6666    idx = g->cur_x + g->cur_y;
   6667    p = &g->out[idx];
   6668    g->history[idx / 4] = 1;
   6669 
   6670    c = &g->color_table[g->codes[code].suffix * 4];
   6671    if (c[3] > 128) { // don't render transparent pixels;
   6672       p[0] = c[2];
   6673       p[1] = c[1];
   6674       p[2] = c[0];
   6675       p[3] = c[3];
   6676    }
   6677    g->cur_x += 4;
   6678 
   6679    if (g->cur_x >= g->max_x) {
   6680       g->cur_x = g->start_x;
   6681       g->cur_y += g->step;
   6682 
   6683       while (g->cur_y >= g->max_y && g->parse > 0) {
   6684          g->step = (1 << g->parse) * g->line_size;
   6685          g->cur_y = g->start_y + (g->step >> 1);
   6686          --g->parse;
   6687       }
   6688    }
   6689 }
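// Illustrative sketch, not part of stb_image: the interlace stepping at the end of
// stbi__out_gif_code, reduced to row indices. The source works in byte offsets
// (multiples of line_size) relative to start_y; this sketch assumes start_y is 0 and
// counts rows. The name gif_interlace_order is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Writes the order in which an interlaced GIF stores its rows; returns the row count.
   rows must have room for height entries. */
static int gif_interlace_order(int height, int *rows)
{
   int n = 0, y = 0;
   int step = 8, parse = 3;          /* first pass: every 8th row, three passes left */
   assert(rows != NULL);
   while (y < height) {
      rows[n++] = y;
      y += step;
      while (y >= height && parse > 0) {
         step = 1 << parse;          /* 8, then 4, then 2 */
         y = step >> 1;              /* passes start at rows 4, 2, 1 */
         --parse;
      }
   }
   return n;
}
```

// For an image of height 8 this yields rows 0, 4, 2, 6, 1, 3, 5, 7.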
   6690 
   6691 static stbi_uc *stbi__process_gif_raster(stbi__context *s, stbi__gif *g)
   6692 {
   6693    stbi_uc lzw_cs;
   6694    stbi__int32 len, init_code;
   6695    stbi__uint32 first;
   6696    stbi__int32 codesize, codemask, avail, oldcode, bits, valid_bits, clear;
   6697    stbi__gif_lzw *p;
   6698 
   6699    lzw_cs = stbi__get8(s);
   6700    if (lzw_cs > 12) return NULL;
   6701    clear = 1 << lzw_cs;
   6702    first = 1;
   6703    codesize = lzw_cs + 1;
   6704    codemask = (1 << codesize) - 1;
   6705    bits = 0;
   6706    valid_bits = 0;
   6707    for (init_code = 0; init_code < clear; init_code++) {
   6708       g->codes[init_code].prefix = -1;
   6709       g->codes[init_code].first = (stbi_uc) init_code;
   6710       g->codes[init_code].suffix = (stbi_uc) init_code;
   6711    }
   6712 
   6713    // support no starting clear code
   6714    avail = clear+2;
   6715    oldcode = -1;
   6716 
   6717    len = 0;
   6718    for(;;) {
   6719       if (valid_bits < codesize) {
   6720          if (len == 0) {
   6721             len = stbi__get8(s); // start new block
   6722             if (len == 0)
   6723                return g->out;
   6724          }
   6725          --len;
   6726          bits |= (stbi__int32) stbi__get8(s) << valid_bits;
   6727          valid_bits += 8;
   6728       } else {
   6729          stbi__int32 code = bits & codemask;
   6730          bits >>= codesize;
   6731          valid_bits -= codesize;
   6732          // @OPTIMIZE: is there some way we can accelerate the non-clear path?
   6733          if (code == clear) {  // clear code
   6734             codesize = lzw_cs + 1;
   6735             codemask = (1 << codesize) - 1;
   6736             avail = clear + 2;
   6737             oldcode = -1;
   6738             first = 0;
   6739          } else if (code == clear + 1) { // end of stream code
   6740             stbi__skip(s, len);
   6741             while ((len = stbi__get8(s)) > 0)
   6742                stbi__skip(s,len);
   6743             return g->out;
   6744          } else if (code <= avail) {
   6745             if (first) {
   6746                return stbi__errpuc("no clear code", "Corrupt GIF");
   6747             }
   6748 
   6749             if (oldcode >= 0) {
   6750                p = &g->codes[avail++];
   6751                if (avail > 8192) {
   6752                   return stbi__errpuc("too many codes", "Corrupt GIF");
   6753                }
   6754 
   6755                p->prefix = (stbi__int16) oldcode;
   6756                p->first = g->codes[oldcode].first;
   6757                p->suffix = (code == avail) ? p->first : g->codes[code].first;
   6758             } else if (code == avail)
   6759                return stbi__errpuc("illegal code in raster", "Corrupt GIF");
   6760 
   6761             stbi__out_gif_code(g, (stbi__uint16) code);
   6762 
   6763             if ((avail & codemask) == 0 && avail <= 0x0FFF) {
   6764                codesize++;
   6765                codemask = (1 << codesize) - 1;
   6766             }
   6767 
   6768             oldcode = code;
   6769          } else {
   6770             return stbi__errpuc("illegal code in raster", "Corrupt GIF");
   6771          }
   6772       }
   6773    }
   6774 }
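// Illustrative sketch, not part of stb_image: the bit-accumulator pattern used by
// stbi__process_gif_raster, isolated from the LZW dictionary. Codes are packed
// least-significant-bit first. For simplicity the sketch assumes a fixed code width,
// whereas the real decoder grows codesize as the table fills. The name unpack_lsb_codes
// is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Unpacks up to max_out codes of `width` bits each from data; returns how many were read. */
static size_t unpack_lsb_codes(const unsigned char *data, size_t len,
                               int width, int *out, size_t max_out)
{
   unsigned int bits = 0;       /* accumulator; new bytes enter above the valid bits */
   int valid_bits = 0;
   size_t pos = 0, n = 0;
   assert(data != NULL && out != NULL);
   assert(width >= 1 && width <= 12);
   while (n < max_out) {
      if (valid_bits < width) {
         if (pos == len) break;                       /* input exhausted */
         bits |= (unsigned int) data[pos++] << valid_bits;
         valid_bits += 8;
      } else {
         out[n++] = (int) (bits & ((1u << width) - 1));
         bits >>= width;
         valid_bits -= width;
      }
   }
   return n;
}
```

// With 3-bit codes, the two bytes 0x8C 0x01 unpack to 4, 1, 6.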
   6775 
   6776 // this function decodes one frame per call, which is what lets stbi__load_gif_main support animated gifs;
   6777 // two_back is the image from two frames ago, used only for disposal method 3 (restore to previous)
   6778 static stbi_uc *stbi__gif_load_next(stbi__context *s, stbi__gif *g, int *comp, int req_comp, stbi_uc *two_back)
   6779 {
   6780    int dispose;
   6781    int first_frame;
   6782    int pi;
   6783    int pcount;
   6784    STBI_NOTUSED(req_comp);
   6785 
   6786    // on first frame, any non-written pixels get the background colour (non-transparent)
   6787    first_frame = 0;
   6788    if (g->out == 0) {
   6789       if (!stbi__gif_header(s, g, comp,0)) return 0; // stbi__g_failure_reason set by stbi__gif_header
   6790       if (!stbi__mad3sizes_valid(4, g->w, g->h, 0))
   6791          return stbi__errpuc("too large", "GIF image is too large");
   6792       pcount = g->w * g->h;
   6793       g->out = (stbi_uc *) stbi__malloc(4 * pcount);
   6794       g->background = (stbi_uc *) stbi__malloc(4 * pcount);
   6795       g->history = (stbi_uc *) stbi__malloc(pcount);
   6796       if (!g->out || !g->background || !g->history)
   6797          return stbi__errpuc("outofmem", "Out of memory");
   6798 
   6799       // image is treated as "transparent" at the start - ie, nothing overwrites the current background;
   6800       // background colour is only used for pixels that are not rendered first frame, after that "background"
   6801       // color refers to the color that was there the previous frame.
   6802       memset(g->out, 0x00, 4 * pcount);
   6803       memset(g->background, 0x00, 4 * pcount); // state of the background (starts transparent)
   6804       memset(g->history, 0x00, pcount);        // pixels that were affected previous frame
   6805       first_frame = 1;
   6806    } else {
   6807       // second or later frame - how do we dispose of the previous one?
   6808       dispose = (g->eflags & 0x1C) >> 2;
   6809       pcount = g->w * g->h;
   6810 
   6811       if ((dispose == 3) && (two_back == 0)) {
   6812          dispose = 2; // if I don't have an image to revert back to, default to the old background
   6813       }
   6814 
   6815       if (dispose == 3) { // use previous graphic
   6816          for (pi = 0; pi < pcount; ++pi) {
   6817             if (g->history[pi]) {
   6818                memcpy( &g->out[pi * 4], &two_back[pi * 4], 4 );
   6819             }
   6820          }
   6821       } else if (dispose == 2) {
   6822          // restore what was changed last frame to background before that frame;
   6823          for (pi = 0; pi < pcount; ++pi) {
   6824             if (g->history[pi]) {
   6825                memcpy( &g->out[pi * 4], &g->background[pi * 4], 4 );
   6826             }
   6827          }
   6828       } else {
   6829          // This is a non-disposal case either way, so just
   6830          // leave the pixels as is, and they will become the new background
   6831          // 1: do not dispose
   6832          // 0:  not specified.
   6833       }
   6834 
   6835       // background is what out is after the undoing of the previous frame;
   6836       memcpy( g->background, g->out, 4 * g->w * g->h );
   6837    }
   6838 
   6839    // clear my history;
   6840    memset( g->history, 0x00, g->w * g->h );        // pixels that were affected previous frame
   6841 
   6842    for (;;) {
   6843       int tag = stbi__get8(s);
   6844       switch (tag) {
   6845          case 0x2C: /* Image Descriptor */
   6846          {
   6847             stbi__int32 x, y, w, h;
   6848             stbi_uc *o;
   6849 
   6850             x = stbi__get16le(s);
   6851             y = stbi__get16le(s);
   6852             w = stbi__get16le(s);
   6853             h = stbi__get16le(s);
   6854             if (((x + w) > (g->w)) || ((y + h) > (g->h)))
   6855                return stbi__errpuc("bad Image Descriptor", "Corrupt GIF");
   6856 
   6857             g->line_size = g->w * 4;
   6858             g->start_x = x * 4;
   6859             g->start_y = y * g->line_size;
   6860             g->max_x   = g->start_x + w * 4;
   6861             g->max_y   = g->start_y + h * g->line_size;
   6862             g->cur_x   = g->start_x;
   6863             g->cur_y   = g->start_y;
   6864 
   6865             // if the width of the specified rectangle is 0, that means
   6866             // we may not see *any* pixels or the image is malformed;
   6867             // to make sure this is caught, move the current y down to
   6868             // max_y (which is what out_gif_code checks).
   6869             if (w == 0)
   6870                g->cur_y = g->max_y;
   6871 
   6872             g->lflags = stbi__get8(s);
   6873 
   6874             if (g->lflags & 0x40) {
   6875                g->step = 8 * g->line_size; // first interlaced spacing
   6876                g->parse = 3;
   6877             } else {
   6878                g->step = g->line_size;
   6879                g->parse = 0;
   6880             }
   6881 
   6882             if (g->lflags & 0x80) {
   6883                stbi__gif_parse_colortable(s,g->lpal, 2 << (g->lflags & 7), g->eflags & 0x01 ? g->transparent : -1);
   6884                g->color_table = (stbi_uc *) g->lpal;
   6885             } else if (g->flags & 0x80) {
   6886                g->color_table = (stbi_uc *) g->pal;
   6887             } else
   6888                return stbi__errpuc("missing color table", "Corrupt GIF");
   6889 
   6890             o = stbi__process_gif_raster(s, g);
   6891             if (!o) return NULL;
   6892 
   6893             // if this was the first frame,
   6894             pcount = g->w * g->h;
   6895             if (first_frame && (g->bgindex > 0)) {
   6896                // if first frame, any pixel not drawn to gets the background color
   6897                for (pi = 0; pi < pcount; ++pi) {
   6898                   if (g->history[pi] == 0) {
   6899                      g->pal[g->bgindex][3] = 255; // just in case it was made transparent, undo that; It will be reset next frame if need be;
   6900                      memcpy( &g->out[pi * 4], &g->pal[g->bgindex], 4 );
   6901                   }
   6902                }
   6903             }
   6904 
   6905             return o;
   6906          }
   6907 
   6908          case 0x21: // Extension Introducer (Graphic Control, Comment, Application, ...)
   6909          {
   6910             int len;
   6911             int ext = stbi__get8(s);
   6912             if (ext == 0xF9) { // Graphic Control Extension.
   6913                len = stbi__get8(s);
   6914                if (len == 4) {
   6915                   g->eflags = stbi__get8(s);
   6916                   g->delay = 10 * stbi__get16le(s); // delay - 1/100th of a second, saving as 1/1000ths.
   6917 
   6918                   // unset old transparent
   6919                   if (g->transparent >= 0) {
   6920                      g->pal[g->transparent][3] = 255;
   6921                   }
   6922                   if (g->eflags & 0x01) {
   6923                      g->transparent = stbi__get8(s);
   6924                      if (g->transparent >= 0) {
   6925                         g->pal[g->transparent][3] = 0;
   6926                      }
   6927                   } else {
   6928                      // don't need transparent
   6929                      stbi__skip(s, 1);
   6930                      g->transparent = -1;
   6931                   }
   6932                } else {
   6933                   stbi__skip(s, len);
   6934                   break;
   6935                }
   6936             }
   6937             while ((len = stbi__get8(s)) != 0) {
   6938                stbi__skip(s, len);
   6939             }
   6940             break;
   6941          }
   6942 
   6943          case 0x3B: // gif stream termination code
   6944             return (stbi_uc *) s; // using '1' causes warning on some compilers
   6945 
   6946          default:
   6947             return stbi__errpuc("unknown code", "Corrupt GIF");
   6948       }
   6949    }
   6950 }
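// Illustrative sketch, not part of stb_image: decoding the 4-byte payload of a Graphic
// Control Extension the way the 0xF9 branch above does. The names gif_gce and
// parse_gif_gce are hypothetical; the block is assumed to start right after the size
// byte (which must be 4).

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
   int disposal;          /* bits 2..4 of the packed byte: 0/1 keep, 2 background, 3 previous */
   int has_transparent;   /* bit 0 of the packed byte */
   int transparent;       /* palette index, or -1 if unused */
   int delay_ms;          /* file stores 1/100 s; converted to milliseconds */
} gif_gce;

static int parse_gif_gce(const unsigned char *block, size_t len, gif_gce *out)
{
   assert(block != NULL && out != NULL);
   if (len < 4) return 0;
   out->disposal        = (block[0] & 0x1C) >> 2;
   out->has_transparent = block[0] & 0x01;
   out->delay_ms        = 10 * (block[1] | (block[2] << 8));
   out->transparent     = out->has_transparent ? block[3] : -1;
   return 1;
}
```

// The disposal value extracted here is the same quantity that stbi__gif_load_next
// recomputes from g->eflags at the start of the following frame.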
   6951 
   6952 static void *stbi__load_gif_main_outofmem(stbi__gif *g, stbi_uc *out, int **delays)
   6953 {
   6954    STBI_FREE(g->out);
   6955    STBI_FREE(g->history);
   6956    STBI_FREE(g->background);
   6957 
   6958    if (out) STBI_FREE(out);
   6959    if (delays && *delays) STBI_FREE(*delays);
   6960    return stbi__errpuc("outofmem", "Out of memory");
   6961 }
   6962 
   6963 static void *stbi__load_gif_main(stbi__context *s, int **delays, int *x, int *y, int *z, int *comp, int req_comp)
   6964 {
   6965    if (stbi__gif_test(s)) {
   6966       int layers = 0;
   6967       stbi_uc *u = 0;
   6968       stbi_uc *out = 0;
   6969       stbi_uc *two_back = 0;
   6970       stbi__gif g;
   6971       int stride;
   6972       int out_size = 0;
   6973       int delays_size = 0;
   6974 
   6975       STBI_NOTUSED(out_size);
   6976       STBI_NOTUSED(delays_size);
   6977 
   6978       memset(&g, 0, sizeof(g));
   6979       if (delays) {
   6980          *delays = 0;
   6981       }
   6982 
   6983       do {
   6984          u = stbi__gif_load_next(s, &g, comp, req_comp, two_back);
   6985          if (u == (stbi_uc *) s) u = 0;  // end of animated gif marker
   6986 
   6987          if (u) {
   6988             *x = g.w;
   6989             *y = g.h;
   6990             ++layers;
   6991             stride = g.w * g.h * 4;
   6992 
   6993             if (out) {
   6994                void *tmp = (stbi_uc*) STBI_REALLOC_SIZED( out, out_size, layers * stride );
   6995                if (!tmp)
   6996                   return stbi__load_gif_main_outofmem(&g, out, delays);
   6997                else {
   6998                    out = (stbi_uc*) tmp;
   6999                    out_size = layers * stride;
   7000                }
   7001 
   7002                if (delays) {
   7003                   int *new_delays = (int*) STBI_REALLOC_SIZED( *delays, delays_size, sizeof(int) * layers );
   7004                   if (!new_delays)
   7005                      return stbi__load_gif_main_outofmem(&g, out, delays);
   7006                   *delays = new_delays;
   7007                   delays_size = layers * sizeof(int);
   7008                }
   7009             } else {
   7010                out = (stbi_uc*)stbi__malloc( layers * stride );
   7011                if (!out)
   7012                   return stbi__load_gif_main_outofmem(&g, out, delays);
   7013                out_size = layers * stride;
   7014                if (delays) {
   7015                   *delays = (int*) stbi__malloc( layers * sizeof(int) );
   7016                   if (!*delays)
   7017                      return stbi__load_gif_main_outofmem(&g, out, delays);
   7018                   delays_size = layers * sizeof(int);
   7019                }
   7020             }
   7021             memcpy( out + ((layers - 1) * stride), u, stride );
   7022             if (layers >= 2) {
   7023                two_back = out + (layers - 2) * stride; // the frame before the one just stored
   7024             }
   7025 
   7026             if (delays) {
   7027                (*delays)[layers - 1U] = g.delay;
   7028             }
   7029          }
   7030       } while (u != 0);
   7031 
   7032       // free temp buffer;
   7033       STBI_FREE(g.out);
   7034       STBI_FREE(g.history);
   7035       STBI_FREE(g.background);
   7036 
   7037       // do the final conversion after loading everything;
   7038       if (req_comp && req_comp != 4)
   7039          out = stbi__convert_format(out, 4, req_comp, layers * g.w, g.h);
   7040 
   7041       *z = layers;
   7042       return out;
   7043    } else {
   7044       return stbi__errpuc("not GIF", "Image was not a GIF type.");
   7045    }
   7046 }
   7047 
   7048 static void *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
   7049 {
   7050    stbi_uc *u = 0;
   7051    stbi__gif g;
   7052    memset(&g, 0, sizeof(g));
   7053    STBI_NOTUSED(ri);
   7054 
   7055    u = stbi__gif_load_next(s, &g, comp, req_comp, 0);
   7056    if (u == (stbi_uc *) s) u = 0;  // end of animated gif marker
   7057    if (u) {
   7058       *x = g.w;
   7059       *y = g.h;
   7060 
   7061       // moved conversion to after successful load so that the same
   7062       // can be done for multiple frames.
   7063       if (req_comp && req_comp != 4)
   7064          u = stbi__convert_format(u, 4, req_comp, g.w, g.h);
   7065    } else if (g.out) {
   7066       // if there was an error and we allocated an image buffer, free it!
   7067       STBI_FREE(g.out);
   7068    }
   7069 
   7070    // free buffers needed for multiple frame loading;
   7071    STBI_FREE(g.history);
   7072    STBI_FREE(g.background);
   7073 
   7074    return u;
   7075 }
   7076 
   7077 static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp)
   7078 {
   7079    return stbi__gif_info_raw(s,x,y,comp);
   7080 }
   7081 #endif
   7082 
   7083 // *************************************************************************************************
   7084 // Radiance RGBE HDR loader
   7085 // originally by Nicolas Schulz
   7086 #ifndef STBI_NO_HDR
   7087 static int stbi__hdr_test_core(stbi__context *s, const char *signature)
   7088 {
   7089    int i;
   7090    for (i=0; signature[i]; ++i)
   7091       if (stbi__get8(s) != signature[i])
   7092           return 0;
   7093    stbi__rewind(s);
   7094    return 1;
   7095 }
   7096 
   7097 static int stbi__hdr_test(stbi__context* s)
   7098 {
   7099    int r = stbi__hdr_test_core(s, "#?RADIANCE\n");
   7100    stbi__rewind(s);
   7101    if(!r) {
   7102        r = stbi__hdr_test_core(s, "#?RGBE\n");
   7103        stbi__rewind(s);
   7104    }
   7105    return r;
   7106 }
   7107 
   7108 #define STBI__HDR_BUFLEN  1024
   7109 static char *stbi__hdr_gettoken(stbi__context *z, char *buffer)
   7110 {
   7111    int len=0;
   7112    char c = '\0';
   7113 
   7114    c = (char) stbi__get8(z);
   7115 
   7116    while (!stbi__at_eof(z) && c != '\n') {
   7117       buffer[len++] = c;
   7118       if (len == STBI__HDR_BUFLEN-1) {
   7119          // flush to end of line
   7120          while (!stbi__at_eof(z) && stbi__get8(z) != '\n')
   7121             ;
   7122          break;
   7123       }
   7124       c = (char) stbi__get8(z);
   7125    }
   7126 
   7127    buffer[len] = 0;
   7128    return buffer;
   7129 }
   7130 
   7131 static void stbi__hdr_convert(float *output, stbi_uc *input, int req_comp)
   7132 {
   7133    if ( input[3] != 0 ) {
   7134       float f1;
   7135       // Exponent
   7136       f1 = (float) ldexp(1.0f, input[3] - (int)(128 + 8));
   7137       if (req_comp <= 2)
   7138          output[0] = (input[0] + input[1] + input[2]) * f1 / 3;
   7139       else {
   7140          output[0] = input[0] * f1;
   7141          output[1] = input[1] * f1;
   7142          output[2] = input[2] * f1;
   7143       }
   7144       if (req_comp == 2) output[1] = 1;
   7145       if (req_comp == 4) output[3] = 1;
   7146    } else {
   7147       switch (req_comp) {
   7148          case 4: output[3] = 1; /* fallthrough */
   7149          case 3: output[0] = output[1] = output[2] = 0;
   7150                  break;
   7151          case 2: output[1] = 1; /* fallthrough */
   7152          case 1: output[0] = 0;
   7153                  break;
   7154       }
   7155    }
   7156 }
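// Illustrative sketch, not part of stb_image: the shared-exponent arithmetic from
// stbi__hdr_convert for the three-channel case only. Each mantissa byte is scaled by
// 2^(E - 136), where 136 = 128 (exponent bias) + 8 (mantissa bits). The name rgbe_to_rgb
// is hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Converts one RGBE pixel to linear float RGB. An exponent byte of 0 means black. */
static void rgbe_to_rgb(const unsigned char rgbe[4], float rgb[3])
{
   assert(rgbe != NULL && rgb != NULL);
   if (rgbe[3] != 0) {
      float scale = (float) ldexp(1.0, (int) rgbe[3] - (128 + 8));
      rgb[0] = rgbe[0] * scale;
      rgb[1] = rgbe[1] * scale;
      rgb[2] = rgbe[2] * scale;
   } else {
      rgb[0] = rgb[1] = rgb[2] = 0.0f;
   }
}
```

// For example, the pixel (128, 64, 0, 129) has scale 2^-7, giving (1.0, 0.5, 0.0).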
   7157 
   7158 static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
   7159 {
   7160    char buffer[STBI__HDR_BUFLEN];
   7161    char *token;
   7162    int valid = 0;
   7163    int width, height;
   7164    stbi_uc *scanline;
   7165    float *hdr_data;
   7166    int len;
   7167    unsigned char count, value;
   7168    int i, j, k, c1,c2, z;
   7169    const char *headerToken;
   7170    STBI_NOTUSED(ri);
   7171 
   7172    // Check identifier
   7173    headerToken = stbi__hdr_gettoken(s,buffer);
   7174    if (strcmp(headerToken, "#?RADIANCE") != 0 && strcmp(headerToken, "#?RGBE") != 0)
   7175       return stbi__errpf("not HDR", "Corrupt HDR image");
   7176 
   7177    // Parse header
   7178    for(;;) {
   7179       token = stbi__hdr_gettoken(s,buffer);
   7180       if (token[0] == 0) break;
   7181       if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
   7182    }
   7183 
   7184    if (!valid)    return stbi__errpf("unsupported format", "Unsupported HDR format");
   7185 
   7186    // Parse width and height
   7187    // can't use sscanf() if we're not using stdio!
   7188    token = stbi__hdr_gettoken(s,buffer);
   7189    if (strncmp(token, "-Y ", 3))  return stbi__errpf("unsupported data layout", "Unsupported HDR format");
   7190    token += 3;
   7191    height = (int) strtol(token, &token, 10);
   7192    while (*token == ' ') ++token;
   7193    if (strncmp(token, "+X ", 3))  return stbi__errpf("unsupported data layout", "Unsupported HDR format");
   7194    token += 3;
   7195    width = (int) strtol(token, NULL, 10);
   7196 
   7197    if (height > STBI_MAX_DIMENSIONS) return stbi__errpf("too large","Very large image (corrupt?)");
   7198    if (width > STBI_MAX_DIMENSIONS) return stbi__errpf("too large","Very large image (corrupt?)");
   7199 
   7200    *x = width;
   7201    *y = height;
   7202 
   7203    if (comp) *comp = 3;
   7204    if (req_comp == 0) req_comp = 3;
   7205 
   7206    if (!stbi__mad4sizes_valid(width, height, req_comp, sizeof(float), 0))
   7207       return stbi__errpf("too large", "HDR image is too large");
   7208 
   7209    // Read data
   7210    hdr_data = (float *) stbi__malloc_mad4(width, height, req_comp, sizeof(float), 0);
   7211    if (!hdr_data)
   7212       return stbi__errpf("outofmem", "Out of memory");
   7213 
   7214    // Load image data
   7215    // image data is stored as some number of scanlines
   7216    if ( width < 8 || width >= 32768) {
   7217       // Read flat data
   7218       for (j=0; j < height; ++j) {
   7219          for (i=0; i < width; ++i) {
   7220             stbi_uc rgbe[4];
   7221            main_decode_loop:
   7222             stbi__getn(s, rgbe, 4);
   7223             stbi__hdr_convert(hdr_data + j * width * req_comp + i * req_comp, rgbe, req_comp);
   7224          }
   7225       }
   7226    } else {
   7227       // Read RLE-encoded data
   7228       scanline = NULL;
   7229 
   7230       for (j = 0; j < height; ++j) {
   7231          c1 = stbi__get8(s);
   7232          c2 = stbi__get8(s);
   7233          len = stbi__get8(s);
   7234          if (c1 != 2 || c2 != 2 || (len & 0x80)) {
   7235             // not run-length encoded, so we have to actually use THIS data as a decoded
   7236             // pixel (note this can't be a valid pixel--one of RGB must be >= 128)
   7237             stbi_uc rgbe[4];
   7238             rgbe[0] = (stbi_uc) c1;
   7239             rgbe[1] = (stbi_uc) c2;
   7240             rgbe[2] = (stbi_uc) len;
   7241             rgbe[3] = (stbi_uc) stbi__get8(s);
   7242             stbi__hdr_convert(hdr_data, rgbe, req_comp);
   7243             i = 1;
   7244             j = 0;
   7245             STBI_FREE(scanline);
   7246             goto main_decode_loop; // yes, this makes no sense
   7247          }
   7248          len <<= 8;
   7249          len |= stbi__get8(s);
   7250          if (len != width) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("invalid decoded scanline length", "corrupt HDR"); }
   7251          if (scanline == NULL) {
   7252             scanline = (stbi_uc *) stbi__malloc_mad2(width, 4, 0);
   7253             if (!scanline) {
   7254                STBI_FREE(hdr_data);
   7255                return stbi__errpf("outofmem", "Out of memory");
   7256             }
   7257          }
   7258 
   7259          for (k = 0; k < 4; ++k) {
   7260             int nleft;
   7261             i = 0;
   7262             while ((nleft = width - i) > 0) {
   7263                count = stbi__get8(s);
   7264                if (count > 128) {
   7265                   // Run
   7266                   value = stbi__get8(s);
   7267                   count -= 128;
   7268                   if ((count == 0) || (count > nleft)) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("corrupt", "bad RLE data in HDR"); }
   7269                   for (z = 0; z < count; ++z)
   7270                      scanline[i++ * 4 + k] = value;
   7271                } else {
   7272                   // Dump
   7273                   if ((count == 0) || (count > nleft)) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("corrupt", "bad RLE data in HDR"); }
   7274                   for (z = 0; z < count; ++z)
   7275                      scanline[i++ * 4 + k] = stbi__get8(s);
   7276                }
   7277             }
   7278          }
   7279          for (i=0; i < width; ++i)
   7280             stbi__hdr_convert(hdr_data+(j*width + i)*req_comp, scanline + i*4, req_comp);
   7281       }
   7282       if (scanline)
   7283          STBI_FREE(scanline);
   7284    }
   7285 
   7286    return hdr_data;
   7287 }
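// Illustrative sketch, not part of stb_image: the run/dump decoding applied to each of the
// four channels of an RLE scanline in stbi__hdr_load. A count byte above 128 means a run
// of (count - 128) copies of the next byte; otherwise it is followed by `count` literal
// bytes. Unlike the source, which writes interleaved into scanline[i*4 + k], this sketch
// assumes a contiguous single-channel output and reads from memory. The name
// hdr_rle_decode_channel is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Decodes `width` bytes of one channel. Returns 1 on success and stores the number of
   input bytes used in *consumed; returns 0 on malformed or truncated input. */
static int hdr_rle_decode_channel(const unsigned char *in, size_t in_len,
                                  unsigned char *out, int width, size_t *consumed)
{
   size_t pos = 0;
   int i = 0;
   assert(in != NULL && out != NULL && consumed != NULL);
   while (i < width) {
      int nleft = width - i;
      int count, z;
      if (pos == in_len) return 0;
      count = in[pos++];
      if (count > 128) {                                  /* run */
         unsigned char value;
         count -= 128;
         if (count > nleft || pos == in_len) return 0;
         value = in[pos++];
         for (z = 0; z < count; ++z) out[i++] = value;
      } else {                                            /* dump (literal bytes) */
         if (count == 0 || count > nleft) return 0;
         if ((size_t) count > in_len - pos) return 0;
         for (z = 0; z < count; ++z) out[i++] = in[pos++];
      }
   }
   *consumed = pos;
   return 1;
}
```

// The input 130, 7, 2, 1, 2 decodes to the four bytes 7, 7, 1, 2: a run of two 7s followed
// by two literals.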
   7288 
   7289 static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp)
   7290 {
   7291    char buffer[STBI__HDR_BUFLEN];
   7292    char *token;
   7293    int valid = 0;
   7294    int dummy;
   7295 
   7296    if (!x) x = &dummy;
   7297    if (!y) y = &dummy;
   7298    if (!comp) comp = &dummy;
   7299 
   7300    if (stbi__hdr_test(s) == 0) {
   7301        stbi__rewind( s );
   7302        return 0;
   7303    }
   7304 
   7305    for(;;) {
   7306       token = stbi__hdr_gettoken(s,buffer);
   7307       if (token[0] == 0) break;
   7308       if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
   7309    }
   7310 
   7311    if (!valid) {
   7312        stbi__rewind( s );
   7313        return 0;
   7314    }
   7315    token = stbi__hdr_gettoken(s,buffer);
   7316    if (strncmp(token, "-Y ", 3)) {
   7317        stbi__rewind( s );
   7318        return 0;
   7319    }
   7320    token += 3;
   7321    *y = (int) strtol(token, &token, 10);
   7322    while (*token == ' ') ++token;
   7323    if (strncmp(token, "+X ", 3)) {
   7324        stbi__rewind( s );
   7325        return 0;
   7326    }
   7327    token += 3;
   7328    *x = (int) strtol(token, NULL, 10);
   7329    *comp = 3;
   7330    return 1;
   7331 }
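// Illustrative sketch, not part of stb_image: the resolution-line parsing shared by
// stbi__hdr_load and stbi__hdr_info, applied to a NUL-terminated string. Only the
// "-Y <height> +X <width>" layout is accepted, as in the source. The name
// parse_hdr_resolution is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 and fills width/height for a line such as "-Y 480 +X 640", else 0. */
static int parse_hdr_resolution(const char *line, int *width, int *height)
{
   char *end;
   assert(line != NULL && width != NULL && height != NULL);
   if (strncmp(line, "-Y ", 3) != 0) return 0;
   *height = (int) strtol(line + 3, &end, 10);   /* end points past the digits */
   while (*end == ' ') ++end;
   if (strncmp(end, "+X ", 3) != 0) return 0;
   *width = (int) strtol(end + 3, NULL, 10);
   return 1;
}
```

// A caller would still want to range-check the results, as stbi__hdr_load does against
// STBI_MAX_DIMENSIONS.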
   7332 #endif // STBI_NO_HDR
   7333 
   7334 #ifndef STBI_NO_BMP
   7335 static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp)
   7336 {
   7337    void *p;
   7338    stbi__bmp_data info;
   7339 
   7340    info.all_a = 255;
   7341    p = stbi__bmp_parse_header(s, &info);
   7342    if (p == NULL) {
   7343       stbi__rewind( s );
   7344       return 0;
   7345    }
   7346    if (x) *x = s->img_x;
   7347    if (y) *y = s->img_y;
   7348    if (comp) {
   7349       if (info.bpp == 24 && info.ma == 0xff000000)
   7350          *comp = 3;
   7351       else
   7352          *comp = info.ma ? 4 : 3;
   7353    }
   7354    return 1;
   7355 }
   7356 #endif
   7357 
   7358 #ifndef STBI_NO_PSD
   7359 static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp)
   7360 {
   7361    int channelCount, dummy, depth;
   7362    if (!x) x = &dummy;
   7363    if (!y) y = &dummy;
   7364    if (!comp) comp = &dummy;
   7365    if (stbi__get32be(s) != 0x38425053) {
   7366        stbi__rewind( s );
   7367        return 0;
   7368    }
   7369    if (stbi__get16be(s) != 1) {
   7370        stbi__rewind( s );
   7371        return 0;
   7372    }
   7373    stbi__skip(s, 6);
   7374    channelCount = stbi__get16be(s);
   7375    if (channelCount < 0 || channelCount > 16) {
   7376        stbi__rewind( s );
   7377        return 0;
   7378    }
   7379    *y = stbi__get32be(s);
   7380    *x = stbi__get32be(s);
   7381    depth = stbi__get16be(s);
   7382    if (depth != 8 && depth != 16) {
   7383        stbi__rewind( s );
   7384        return 0;
   7385    }
   7386    if (stbi__get16be(s) != 3) {
   7387        stbi__rewind( s );
   7388        return 0;
   7389    }
   7390    *comp = 4;
   7391    return 1;
   7392 }
   7393 
   7394 static int stbi__psd_is16(stbi__context *s)
   7395 {
   7396    int channelCount, depth;
   7397    if (stbi__get32be(s) != 0x38425053) {
   7398        stbi__rewind( s );
   7399        return 0;
   7400    }
   7401    if (stbi__get16be(s) != 1) {
   7402        stbi__rewind( s );
   7403        return 0;
   7404    }
   7405    stbi__skip(s, 6);
   7406    channelCount = stbi__get16be(s);
   7407    if (channelCount < 0 || channelCount > 16) {
   7408        stbi__rewind( s );
   7409        return 0;
   7410    }
   7411    STBI_NOTUSED(stbi__get32be(s));
   7412    STBI_NOTUSED(stbi__get32be(s));
   7413    depth = stbi__get16be(s);
   7414    if (depth != 16) {
   7415        stbi__rewind( s );
   7416        return 0;
   7417    }
   7418    return 1;
   7419 }
   7420 #endif
   7421 
   7422 #ifndef STBI_NO_PIC
   7423 static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp)
   7424 {
   7425    int act_comp=0,num_packets=0,chained,dummy;
   7426    stbi__pic_packet packets[10];
   7427 
   7428    if (!x) x = &dummy;
   7429    if (!y) y = &dummy;
   7430    if (!comp) comp = &dummy;
   7431 
   7432    if (!stbi__pic_is4(s,"\x53\x80\xF6\x34")) {
   7433       stbi__rewind(s);
   7434       return 0;
   7435    }
   7436 
   7437    stbi__skip(s, 88);
   7438 
   7439    *x = stbi__get16be(s);
   7440    *y = stbi__get16be(s);
   7441    if (stbi__at_eof(s)) {
   7442       stbi__rewind( s);
   7443       return 0;
   7444    }
   7445    if ( (*x) != 0 && (1 << 28) / (*x) < (*y)) {
   7446       stbi__rewind( s );
   7447       return 0;
   7448    }
   7449 
   7450    stbi__skip(s, 8);
   7451 
   7452    do {
   7453       stbi__pic_packet *packet;
   7454 
   7455       if (num_packets==sizeof(packets)/sizeof(packets[0]))
   7456          return 0;
   7457 
   7458       packet = &packets[num_packets++];
   7459       chained = stbi__get8(s);
   7460       packet->size    = stbi__get8(s);
   7461       packet->type    = stbi__get8(s);
   7462       packet->channel = stbi__get8(s);
   7463       act_comp |= packet->channel;
   7464 
   7465       if (stbi__at_eof(s)) {
   7466           stbi__rewind( s );
   7467           return 0;
   7468       }
   7469       if (packet->size != 8) {
   7470           stbi__rewind( s );
   7471           return 0;
   7472       }
   7473    } while (chained);
   7474 
   7475    *comp = (act_comp & 0x10 ? 4 : 3);
   7476 
   7477    return 1;
   7478 }
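/*
   Illustrative sketch, not part of stb_image: the size test in
   stbi__pic_info above rejects an image when (1 << 28) / x < y, which
   bounds x * y without ever computing the product. The helper name
   dims_within_limit and its limit parameter are hypothetical; the sketch
   only isolates that division-based comparison.

```c
#include <assert.h>

// Returns 1 when width * height <= limit, using division so that the
// product itself is never formed and therefore cannot overflow.
static int dims_within_limit(int width, int height, int limit)
{
   assert(limit >= 0);
   if (width < 0 || height < 0) return 0;
   if (width == 0) return 1;          // no pixels, and avoids dividing by zero
   return limit / width >= height;    // floor(limit / width) >= height
}
```

   With limit set to 1 << 28, a 16384 x 16384 image is exactly at the bound
   and is accepted, while 16385 x 16384 is rejected.
*/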
   7479 #endif
   7480 
   7481 // *************************************************************************************************
   7482 // Portable Gray Map and Portable Pixel Map loader
   7483 // by Ken Miller
   7484 //
   7485 // PGM: http://netpbm.sourceforge.net/doc/pgm.html
   7486 // PPM: http://netpbm.sourceforge.net/doc/ppm.html
   7487 //
   7488 // Known limitations:
   7490 //    Does not support ASCII image data (formats P2 and P3)
   7491 
   7492 #ifndef STBI_NO_PNM
   7493 
   7494 static int      stbi__pnm_test(stbi__context *s)
   7495 {
   7496    char p, t;
   7497    p = (char) stbi__get8(s);
   7498    t = (char) stbi__get8(s);
   7499    if (p != 'P' || (t != '5' && t != '6')) {
   7500        stbi__rewind( s );
   7501        return 0;
   7502    }
   7503    return 1;
   7504 }
   7505 
   7506 static void *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
   7507 {
   7508    stbi_uc *out;
   7509    STBI_NOTUSED(ri);
   7510 
   7511    ri->bits_per_channel = stbi__pnm_info(s, (int *)&s->img_x, (int *)&s->img_y, (int *)&s->img_n);
   7512    if (ri->bits_per_channel == 0)
   7513       return 0;
   7514 
   7515    if (s->img_y > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
   7516    if (s->img_x > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
   7517 
   7518    *x = s->img_x;
   7519    *y = s->img_y;
   7520    if (comp) *comp = s->img_n;
   7521 
   7522    if (!stbi__mad4sizes_valid(s->img_n, s->img_x, s->img_y, ri->bits_per_channel / 8, 0))
   7523       return stbi__errpuc("too large", "PNM too large");
   7524 
   7525    out = (stbi_uc *) stbi__malloc_mad4(s->img_n, s->img_x, s->img_y, ri->bits_per_channel / 8, 0);
   7526    if (!out) return stbi__errpuc("outofmem", "Out of memory");
   7527    if (!stbi__getn(s, out, s->img_n * s->img_x * s->img_y * (ri->bits_per_channel / 8))) {
   7528       STBI_FREE(out);
   7529       return stbi__errpuc("bad PNM", "PNM file truncated");
   7530    }
   7531 
   7532    if (req_comp && req_comp != s->img_n) {
   7533       if (ri->bits_per_channel == 16) {
   7534          out = (stbi_uc *) stbi__convert_format16((stbi__uint16 *) out, s->img_n, req_comp, s->img_x, s->img_y);
   7535       } else {
   7536          out = stbi__convert_format(out, s->img_n, req_comp, s->img_x, s->img_y);
   7537       }
   7538       if (out == NULL) return out; // stbi__convert_format frees input on failure
   7539    }
   7540    return out;
   7541 }
   7542 
   7543 static int      stbi__pnm_isspace(char c)
   7544 {
   7545    return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';
   7546 }
   7547 
   7548 static void     stbi__pnm_skip_whitespace(stbi__context *s, char *c)
   7549 {
   7550    for (;;) {
   7551       while (!stbi__at_eof(s) && stbi__pnm_isspace(*c))
   7552          *c = (char) stbi__get8(s);
   7553 
   7554       if (stbi__at_eof(s) || *c != '#')
   7555          break;
   7556 
   7557       while (!stbi__at_eof(s) && *c != '\n' && *c != '\r' )
   7558          *c = (char) stbi__get8(s);
   7559    }
   7560 }
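/*
   Illustrative sketch, not part of stb_image: the loop in
   stbi__pnm_skip_whitespace above alternates between skipping whitespace
   and skipping a '#' comment up to the end of its line. The version below
   does the same over a memory buffer and an index instead of a stream and
   a current character; skip_space_and_comments and is_pnm_space_sketch
   are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

static int is_pnm_space_sketch(char c)
{
   return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';
}

// Returns the index of the first byte at or after pos that is neither
// whitespace nor part of a '#' comment, or len if there is no such byte.
static size_t skip_space_and_comments(const char *buf, size_t len, size_t pos)
{
   assert(buf != NULL);
   for (;;) {
      while (pos < len && is_pnm_space_sketch(buf[pos]))
         ++pos;
      if (pos >= len || buf[pos] != '#')
         return pos;
      // inside a comment: advance to the line terminator, then loop again
      while (pos < len && buf[pos] != '\n' && buf[pos] != '\r')
         ++pos;
   }
}
```

   The outer loop matters because a comment may be followed by more
   whitespace and then another comment before the next header field.
*/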
   7561 
   7562 static int      stbi__pnm_isdigit(char c)
   7563 {
   7564    return c >= '0' && c <= '9';
   7565 }
   7566 
   7567 static int      stbi__pnm_getinteger(stbi__context *s, char *c)
   7568 {
   7569    int value = 0;
   7570 
   7571    while (!stbi__at_eof(s) && stbi__pnm_isdigit(*c)) {
   7572       value = value*10 + (*c - '0');
   7573       *c = (char) stbi__get8(s);
   7574       if((value > 214748364) || (value == 214748364 && *c > '7'))
   7575           return stbi__err("integer parse overflow", "Parsing an integer in the PPM header overflowed a 32-bit int");
   7576    }
   7577 
   7578    return value;
   7579 }
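/*
   Illustrative sketch, not part of stb_image: stbi__pnm_getinteger above
   guards against overflow by comparing the accumulated value with
   214748364 and inspecting the next digit. The sketch below uses a
   different but equivalent formulation, testing against INT_MAX before
   each multiply so that signed overflow never occurs. The name
   parse_decimal_checked and its return convention are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

// Parses the leading decimal digits of text into *out.
// Returns 1 on success (no digits at all yields 0), or 0 if the value
// would exceed INT_MAX, in which case *out is left untouched.
static int parse_decimal_checked(const char *text, int *out)
{
   int value = 0;
   size_t i;
   assert(text != NULL && out != NULL);
   for (i = 0; text[i] >= '0' && text[i] <= '9'; ++i) {
      int digit = text[i] - '0';
      // value * 10 + digit <= INT_MAX  is the same as
      // value <= (INT_MAX - digit) / 10  in integer arithmetic
      if (value > (INT_MAX - digit) / 10)
         return 0;
      value = value * 10 + digit;
   }
   *out = value;
   return 1;
}
```

   Keeping the error signal separate from the parsed value lets a caller
   tell an overflow apart from a field that is literally zero.
*/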
   7580 
   7581 static int      stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp)
   7582 {
   7583    int maxv, dummy;
   7584    char c, p, t;
   7585 
   7586    if (!x) x = &dummy;
   7587    if (!y) y = &dummy;
   7588    if (!comp) comp = &dummy;
   7589 
   7590    stbi__rewind(s);
   7591 
   7592    // Get identifier
   7593    p = (char) stbi__get8(s);
   7594    t = (char) stbi__get8(s);
   7595    if (p != 'P' || (t != '5' && t != '6')) {
   7596        stbi__rewind(s);
   7597        return 0;
   7598    }
   7599 
   7600    *comp = (t == '6') ? 3 : 1;  // '5' is 1-component .pgm; '6' is 3-component .ppm
   7601 
   7602    c = (char) stbi__get8(s);
   7603    stbi__pnm_skip_whitespace(s, &c);
   7604 
   7605    *x = stbi__pnm_getinteger(s, &c); // read width
   7606    if(*x == 0)
   7607        return stbi__err("invalid width", "PPM image header had zero or overflowing width");
   7608    stbi__pnm_skip_whitespace(s, &c);
   7609 
   7610    *y = stbi__pnm_getinteger(s, &c); // read height
   7611    if (*y == 0)
   7612        return stbi__err("invalid height", "PPM image header had zero or overflowing height");
   7613    stbi__pnm_skip_whitespace(s, &c);
   7614 
   7615    maxv = stbi__pnm_getinteger(s, &c);  // read max value
   7616    if (maxv > 65535)
   7617       return stbi__err("max value > 65535", "PPM image supports only 8-bit and 16-bit images");
   7618    else if (maxv > 255)
   7619       return 16;
   7620    else
   7621       return 8;
   7622 }
   7623 
   7624 static int stbi__pnm_is16(stbi__context *s)
   7625 {
   7626    if (stbi__pnm_info(s, NULL, NULL, NULL) == 16)
   7627       return 1;
   7628    return 0;
   7629 }
   7630 #endif
   7631 
   7632 static int stbi__info_main(stbi__context *s, int *x, int *y, int *comp)
   7633 {
   7634    #ifndef STBI_NO_JPEG
   7635    if (stbi__jpeg_info(s, x, y, comp)) return 1;
   7636    #endif
   7637 
   7638    #ifndef STBI_NO_PNG
   7639    if (stbi__png_info(s, x, y, comp))  return 1;
   7640    #endif
   7641 
   7642    #ifndef STBI_NO_GIF
   7643    if (stbi__gif_info(s, x, y, comp))  return 1;
   7644    #endif
   7645 
   7646    #ifndef STBI_NO_BMP
   7647    if (stbi__bmp_info(s, x, y, comp))  return 1;
   7648    #endif
   7649 
   7650    #ifndef STBI_NO_PSD
   7651    if (stbi__psd_info(s, x, y, comp))  return 1;
   7652    #endif
   7653 
   7654    #ifndef STBI_NO_PIC
   7655    if (stbi__pic_info(s, x, y, comp))  return 1;
   7656    #endif
   7657 
   7658    #ifndef STBI_NO_PNM
   7659    if (stbi__pnm_info(s, x, y, comp))  return 1;
   7660    #endif
   7661 
   7662    #ifndef STBI_NO_HDR
   7663    if (stbi__hdr_info(s, x, y, comp))  return 1;
   7664    #endif
   7665 
   7666    // test tga last because it's a crappy test!
   7667    #ifndef STBI_NO_TGA
   7668    if (stbi__tga_info(s, x, y, comp))
   7669        return 1;
   7670    #endif
   7671    return stbi__err("unknown image type", "Image not of any known type, or corrupt");
   7672 }
   7673 
   7674 static int stbi__is_16_main(stbi__context *s)
   7675 {
   7676    #ifndef STBI_NO_PNG
   7677    if (stbi__png_is16(s))  return 1;
   7678    #endif
   7679 
   7680    #ifndef STBI_NO_PSD
   7681    if (stbi__psd_is16(s))  return 1;
   7682    #endif
   7683 
   7684    #ifndef STBI_NO_PNM
   7685    if (stbi__pnm_is16(s))  return 1;
   7686    #endif
   7687    return 0;
   7688 }
   7689 
   7690 #ifndef STBI_NO_STDIO
   7691 STBIDEF int stbi_info(char const *filename, int *x, int *y, int *comp)
   7692 {
   7693     FILE *f = stbi__fopen(filename, "rb");
   7694     int result;
   7695     if (!f) return stbi__err("can't fopen", "Unable to open file");
   7696     result = stbi_info_from_file(f, x, y, comp);
   7697     fclose(f);
   7698     return result;
   7699 }
   7700 
   7701 STBIDEF int stbi_info_from_file(FILE *f, int *x, int *y, int *comp)
   7702 {
   7703    int r;
   7704    stbi__context s;
   7705    long pos = ftell(f);
   7706    stbi__start_file(&s, f);
   7707    r = stbi__info_main(&s,x,y,comp);
   7708    fseek(f,pos,SEEK_SET);
   7709    return r;
   7710 }
   7711 
   7712 STBIDEF int stbi_is_16_bit(char const *filename)
   7713 {
   7714     FILE *f = stbi__fopen(filename, "rb");
   7715     int result;
   7716     if (!f) return stbi__err("can't fopen", "Unable to open file");
   7717     result = stbi_is_16_bit_from_file(f);
   7718     fclose(f);
   7719     return result;
   7720 }
   7721 
   7722 STBIDEF int stbi_is_16_bit_from_file(FILE *f)
   7723 {
   7724    int r;
   7725    stbi__context s;
   7726    long pos = ftell(f);
   7727    stbi__start_file(&s, f);
   7728    r = stbi__is_16_main(&s);
   7729    fseek(f,pos,SEEK_SET);
   7730    return r;
   7731 }
   7732 #endif // !STBI_NO_STDIO
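/*
   Illustrative sketch, not part of stb_image: stbi_info_from_file and
   stbi_is_16_bit_from_file above record the stream position with ftell,
   run the probe, and restore the position with fseek, so the caller can
   still load the image from the same FILE afterwards. The sketch shows
   that save/probe/restore pattern with only the C standard library;
   peek_is_binary_pnm is a hypothetical stand-in for the probe and checks
   nothing more than the two-byte "P5"/"P6" identifier.

```c
#include <assert.h>
#include <stdio.h>

// Reports whether the next two bytes are 'P' followed by '5' or '6',
// then puts the stream back where it was. Returns 0 if the position
// cannot be saved or restored.
static int peek_is_binary_pnm(FILE *f)
{
   long pos;
   int p, t, result;
   assert(f != NULL);
   pos = ftell(f);                    // remember where the caller was
   if (pos < 0) return 0;
   p = fgetc(f);
   t = fgetc(f);
   result = (p == 'P' && (t == '5' || t == '6'));
   if (fseek(f, pos, SEEK_SET) != 0)  // undo the reads
      return 0;
   return result;
}
```

   Unlike the functions above, the sketch checks the results of ftell and
   fseek, which matters for streams that cannot seek, such as pipes.
*/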
   7733 
   7734 STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp)
   7735 {
   7736    stbi__context s;
   7737    stbi__start_mem(&s,buffer,len);
   7738    return stbi__info_main(&s,x,y,comp);
   7739 }
   7740 
   7741 STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *c, void *user, int *x, int *y, int *comp)
   7742 {
   7743    stbi__context s;
   7744    stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user);
   7745    return stbi__info_main(&s,x,y,comp);
   7746 }
   7747 
   7748 STBIDEF int stbi_is_16_bit_from_memory(stbi_uc const *buffer, int len)
   7749 {
   7750    stbi__context s;
   7751    stbi__start_mem(&s,buffer,len);
   7752    return stbi__is_16_main(&s);
   7753 }
   7754 
   7755 STBIDEF int stbi_is_16_bit_from_callbacks(stbi_io_callbacks const *c, void *user)
   7756 {
   7757    stbi__context s;
   7758    stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user);
   7759    return stbi__is_16_main(&s);
   7760 }
   7761 
   7762 #endif // STB_IMAGE_IMPLEMENTATION
   7763 
   7764 /*
   7765    revision history:
   7766       2.20  (2019-02-07) support utf8 filenames in Windows; fix warnings and platform ifdefs
   7767       2.19  (2018-02-11) fix warning
   7768       2.18  (2018-01-30) fix warnings
   7769       2.17  (2018-01-29) change stbi__shiftsigned to avoid clang -O2 bug
   7770                          1-bit BMP
   7771                          *_is_16_bit api
   7772                          avoid warnings
   7773       2.16  (2017-07-23) all functions have 16-bit variants;
   7774                          STBI_NO_STDIO works again;
   7775                          compilation fixes;
   7776                          fix rounding in unpremultiply;
   7777                          optimize vertical flip;
   7778                          disable raw_len validation;
   7779                          documentation fixes
   7780       2.15  (2017-03-18) fix png-1,2,4 bug; now all Imagenet JPGs decode;
   7781                          warning fixes; disable run-time SSE detection on gcc;
   7782                          uniform handling of optional "return" values;
   7783                          thread-safe initialization of zlib tables
   7784       2.14  (2017-03-03) remove deprecated STBI_JPEG_OLD; fixes for Imagenet JPGs
   7785       2.13  (2016-11-29) add 16-bit API, only supported for PNG right now
   7786       2.12  (2016-04-02) fix typo in 2.11 PSD fix that caused crashes
   7787       2.11  (2016-04-02) allocate large structures on the stack
   7788                          remove white matting for transparent PSD
   7789                          fix reported channel count for PNG & BMP
   7790                          re-enable SSE2 in non-gcc 64-bit
   7791                          support RGB-formatted JPEG
   7792                          read 16-bit PNGs (only as 8-bit)
   7793       2.10  (2016-01-22) avoid warning introduced in 2.09 by STBI_REALLOC_SIZED
   7794       2.09  (2016-01-16) allow comments in PNM files
   7795                          16-bit-per-pixel TGA (not bit-per-component)
   7796                          info() for TGA could break due to .hdr handling
   7797                          info() for BMP now shares code instead of sloppy parse
   7798                          can use STBI_REALLOC_SIZED if allocator doesn't support realloc
   7799                          code cleanup
   7800       2.08  (2015-09-13) fix to 2.07 cleanup, reading RGB PSD as RGBA
   7801       2.07  (2015-09-13) fix compiler warnings
   7802                          partial animated GIF support
   7803                          limited 16-bpc PSD support
   7804                          #ifdef unused functions
   7805                          bug with < 92 byte PIC,PNM,HDR,TGA
   7806       2.06  (2015-04-19) fix bug where PSD returns wrong '*comp' value
   7807       2.05  (2015-04-19) fix bug in progressive JPEG handling, fix warning
   7808       2.04  (2015-04-15) try to re-enable SIMD on MinGW 64-bit
   7809       2.03  (2015-04-12) extra corruption checking (mmozeiko)
   7810                          stbi_set_flip_vertically_on_load (nguillemot)
   7811                          fix NEON support; fix mingw support
   7812       2.02  (2015-01-19) fix incorrect assert, fix warning
   7813       2.01  (2015-01-17) fix various warnings; suppress SIMD on gcc 32-bit without -msse2
   7814       2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
   7815       2.00  (2014-12-25) optimize JPG, including x86 SSE2 & NEON SIMD (ryg)
   7816                          progressive JPEG (stb)
   7817                          PGM/PPM support (Ken Miller)
   7818                          STBI_MALLOC,STBI_REALLOC,STBI_FREE
   7819                          GIF bugfix -- seemingly never worked
   7820                          STBI_NO_*, STBI_ONLY_*
   7821       1.48  (2014-12-14) fix incorrectly-named assert()
   7822       1.47  (2014-12-14) 1/2/4-bit PNG support, both direct and paletted (Omar Cornut & stb)
   7823                          optimize PNG (ryg)
   7824                          fix bug in interlaced PNG with user-specified channel count (stb)
   7825       1.46  (2014-08-26)
   7826               fix broken tRNS chunk (colorkey-style transparency) in non-paletted PNG
   7827       1.45  (2014-08-16)
   7828               fix MSVC-ARM internal compiler error by wrapping malloc
   7829       1.44  (2014-08-07)
   7830               various warning fixes from Ronny Chevalier
   7831       1.43  (2014-07-15)
   7832               fix MSVC-only compiler problem in code changed in 1.42
   7833       1.42  (2014-07-09)
   7834               don't define _CRT_SECURE_NO_WARNINGS (affects user code)
   7835               fixes to stbi__cleanup_jpeg path
   7836               added STBI_ASSERT to avoid requiring assert.h
   7837       1.41  (2014-06-25)
   7838               fix search&replace from 1.36 that messed up comments/error messages
   7839       1.40  (2014-06-22)
   7840               fix gcc struct-initialization warning
   7841       1.39  (2014-06-15)
   7842               fix to TGA optimization when req_comp != number of components in TGA;
   7843               fix to GIF loading because BMP wasn't rewinding (whoops, no GIFs in my test suite)
   7844               add support for BMP version 5 (more ignored fields)
   7845       1.38  (2014-06-06)
   7846               suppress MSVC warnings on integer casts truncating values
   7847               fix accidental rename of 'skip' field of I/O
   7848       1.37  (2014-06-04)
   7849               remove duplicate typedef
   7850       1.36  (2014-06-03)
   7851               convert to header file single-file library
   7852               if de-iphone isn't set, load iphone images color-swapped instead of returning NULL
   7853       1.35  (2014-05-27)
   7854               various warnings
   7855               fix broken STBI_SIMD path
   7856               fix bug where stbi_load_from_file no longer left file pointer in correct place
   7857               fix broken non-easy path for 32-bit BMP (possibly never used)
   7858               TGA optimization by Arseny Kapoulkine
   7859       1.34  (unknown)
   7860               use STBI_NOTUSED in stbi__resample_row_generic(), fix one more leak in tga failure case
   7861       1.33  (2011-07-14)
   7862               make stbi_is_hdr work in STBI_NO_HDR (as specified), minor compiler-friendly improvements
   7863       1.32  (2011-07-13)
   7864               support for "info" function for all supported filetypes (SpartanJ)
   7865       1.31  (2011-06-20)
   7866               a few more leak fixes, bug in PNG handling (SpartanJ)
   7867       1.30  (2011-06-11)
   7868               added ability to load files via callbacks to accommodate custom input streams (Ben Wenger)
   7869               removed deprecated format-specific test/load functions
   7870               removed support for installable file formats (stbi_loader) -- would have been broken for IO callbacks anyway
   7871               error cases in bmp and tga give messages and don't leak (Raymond Barbiero, grisha)
   7872               fix inefficiency in decoding 32-bit BMP (David Woo)
   7873       1.29  (2010-08-16)
   7874               various warning fixes from Aurelien Pocheville
   7875       1.28  (2010-08-01)
   7876               fix bug in GIF palette transparency (SpartanJ)
   7877       1.27  (2010-08-01)
   7878               cast-to-stbi_uc to fix warnings
   7879       1.26  (2010-07-24)
   7880               fix bug in file buffering for PNG reported by SpartanJ
   7881       1.25  (2010-07-17)
   7882               refix trans_data warning (Won Chun)
   7883       1.24  (2010-07-12)
   7884               perf improvements reading from files on platforms with lock-heavy fgetc()
   7885               minor perf improvements for jpeg
   7886               deprecated type-specific functions so we'll get feedback if they're needed
   7887               attempt to fix trans_data warning (Won Chun)
   7888       1.23    fixed bug in iPhone support
   7889       1.22  (2010-07-10)
   7890               removed image *writing* support
   7891               stbi_info support from Jetro Lauha
   7892               GIF support from Jean-Marc Lienher
   7893               iPhone PNG-extensions from James Brown
   7894               warning-fixes from Nicolas Schulz and Janez Zemva (i.e. Janez Žemva)
   7895       1.21    fix use of 'stbi_uc' in header (reported by jon blow)
   7896       1.20    added support for Softimage PIC, by Tom Seddon
   7897       1.19    bug in interlaced PNG corruption check (found by ryg)
   7898       1.18  (2008-08-02)
   7899               fix a threading bug (local mutable static)
   7900       1.17    support interlaced PNG
   7901       1.16    major bugfix - stbi__convert_format converted one too many pixels
   7902       1.15    initialize some fields for thread safety
   7903       1.14    fix threadsafe conversion bug
   7904               header-file-only version (#define STBI_HEADER_FILE_ONLY before including)
   7905       1.13    threadsafe
   7906       1.12    const qualifiers in the API
   7907       1.11    Support installable IDCT, colorspace conversion routines
   7908       1.10    Fixes for 64-bit (don't use "unsigned long")
   7909               optimized upsampling by Fabian "ryg" Giesen
   7910       1.09    Fix format-conversion for PSD code (bad global variables!)
   7911       1.08    Thatcher Ulrich's PSD code integrated by Nicolas Schulz
   7912       1.07    attempt to fix C++ warning/errors again
   7913       1.06    attempt to fix C++ warning/errors again
   7914       1.05    fix TGA loading to return correct *comp and use good luminance calc
   7915       1.04    default float alpha is 1, not 255; use 'void *' for stbi_image_free
   7916       1.03    bugfixes to STBI_NO_STDIO, STBI_NO_HDR
   7917       1.02    support for (subset of) HDR files, float interface for preferred access to them
   7918       1.01    fix bug: possible bug in handling right-side up bmps... not sure
   7919               fix bug: the stbi__bmp_load() and stbi__tga_load() functions didn't work at all
   7920       1.00    interface to zlib that skips zlib header
   7921       0.99    correct handling of alpha in palette
   7922       0.98    TGA loader by lonesock; dynamically add loaders (untested)
   7923       0.97    jpeg errors on too large a file; also catch another malloc failure
   7924       0.96    fix detection of invalid v value - particleman@mollyrocket forum
   7925       0.95    during header scan, seek to markers in case of padding
   7926       0.94    STBI_NO_STDIO to disable stdio usage; rename all #defines the same
   7927       0.93    handle jpegtran output; verbose errors
   7928       0.92    read 4,8,16,24,32-bit BMP files of several formats
   7929       0.91    output 24-bit Windows 3.0 BMP files
   7930       0.90    fix a few more warnings; bump version number to approach 1.0
   7931       0.61    bugfixes due to Marc LeBlanc, Christopher Lloyd
   7932       0.60    fix compiling as c++
   7933       0.59    fix warnings: merge Dave Moore's -Wall fixes
   7934       0.58    fix bug: zlib uncompressed mode len/nlen was wrong endian
   7935       0.57    fix bug: jpg last huffman symbol before marker was >9 bits but less than 16 available
   7936       0.56    fix bug: zlib uncompressed mode len vs. nlen
   7937       0.55    fix bug: restart_interval not initialized to 0
   7938       0.54    allow NULL for 'int *comp'
   7939       0.53    fix bug in png 3->4; speedup png decoding
   7940       0.52    png handles req_comp=3,4 directly; minor cleanup; jpeg comments
   7941       0.51    obey req_comp requests, 1-component jpegs return as 1-component,
   7942               on 'test' only check type, not whether we support this variant
   7943       0.50  (2006-11-19)
   7944               first released version
   7945 */
   7946 
   7947 
   7948 /*
   7949 ------------------------------------------------------------------------------
   7950 This software is available under 2 licenses -- choose whichever you prefer.
   7951 ------------------------------------------------------------------------------
   7952 ALTERNATIVE A - MIT License
   7953 Copyright (c) 2017 Sean Barrett
   7954 Permission is hereby granted, free of charge, to any person obtaining a copy of
   7955 this software and associated documentation files (the "Software"), to deal in
   7956 the Software without restriction, including without limitation the rights to
   7957 use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
   7958 of the Software, and to permit persons to whom the Software is furnished to do
   7959 so, subject to the following conditions:
   7960 The above copyright notice and this permission notice shall be included in all
   7961 copies or substantial portions of the Software.
   7962 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
   7963 IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
   7964 FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
   7965 AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
   7966 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
   7967 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
   7968 SOFTWARE.
   7969 ------------------------------------------------------------------------------
   7970 ALTERNATIVE B - Public Domain (www.unlicense.org)
   7971 This is free and unencumbered software released into the public domain.
   7972 Anyone is free to copy, modify, publish, use, compile, sell, or distribute this
   7973 software, either in source code form or as a compiled binary, for any purpose,
   7974 commercial or non-commercial, and by any means.
   7975 In jurisdictions that recognize copyright laws, the author or authors of this
   7976 software dedicate any and all copyright interest in the software to the public
   7977 domain. We make this dedication for the benefit of the public at large and to
   7978 the detriment of our heirs and successors. We intend this dedication to be an
   7979 overt act of relinquishment in perpetuity of all present and future rights to
   7980 this software under copyright law.
   7981 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
   7982 IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
   7983 FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
   7984 AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
   7985 ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
   7986 WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
   7987 ------------------------------------------------------------------------------
   7988 */