I've seen many potentially good lossless codecs ship with almost useless interfaces (or none at all). Here are some attributes the good ones share:
- If you want others to use your codec in their apps, don't just provide a single command-line executable with an awkward-as-hell interface. Support static libraries and shared libraries (.so/.dll); otherwise it won't be useful to a large number of potential customers, no matter how cool your codec is.
- Minimize the number of source files, preferably in plain C.
If you use C++, don't rely on a ton of 3rd party crap like boost, etc. It needs to compile out of the box.
Related: Programmers are generally a lazy bunch and hate mucking around with build systems, makefiles, etc. Make it easy for others to build your stuff, or to just copy & paste it into their project. Programmers will gladly sacrifice some things (such as raw perf, features, format compatibility, etc. - see stb_image.h) if it's fast and trivial to plop your code into their project. Don't rely on a pile of weird macros that must be configured by a custom build system.
- Even if the codec is C++, provide the interface in pure C so the codec can be trivially interfaced to other languages.
- Must provide heap alloc callbacks, so the caller can redirect all allocations to their own system.
- Support a "compile as ANSI C" mode, so it's easy to get your codec minimally working on new platforms. The user can fill in the platform specific stuff (atomics, threading, etc.) later, if needed.
Related: If you use threads, support pthreads and avoid pthread spinlocks (OS X doesn't support them). Basic pthreads is portable across many platforms (even Win32 via a wrapper library - but just support Win32 natively too, because it's trivial).
- Don't assume you can go allocate a single huge 256MB+ block on the heap. On mobile platforms this isn't a hot idea. Allocate smaller blocks, or ideally just 1 block and manage the heap yourself, or don't use heaps.
- Streaming support, to minimize memory consumption on small devices. Very important in the mobile world.
- Expose a brutally simple API for memory to memory compression.
- Support a zlib-compatible API. It's a standard, everybody knows it, and it just works. If you support this API, it becomes almost trivial to plop your codec into other existing programs. This allows you to also leverage the existing body of zlib docs/knowledge.
- Support in-place memory to memory decompression, if you can, for use in very memory constrained environments.
- Single-threaded performance is still important: codecs that need tons of cores (for either compression or decompression) to be practical aren't useful on many mobile devices.
- In many practical use cases, the user doesn't give a hoot about compression speed at all. They compress once, distribute the resulting compressed data many times, and only decompress in their app. So expose optional parameters that let the user tune your codec's internal models to their data, like LZMA does. Don't worry about the extra time needed to compress; we have the cloud and 40+ core boxes.
- Provide a "reinit()" API for your codec, so that on subsequent blocks the user can reuse all those expensive heap allocations made during the first init.
- Deal gracefully with already compressed, or incompressible data. Don't expand it, except by a tiny amount, and don't slow to a crawl. Related: don't fall over on very compressible data, or data containing repeated strings, etc.
- Communicate the intended use cases and assumptions up front:
Is it a super fast codec that heavily trades ratio for speed?
Is it a symmetrical codec, i.e. is compression throughput roughly equal to decompression throughput?
Is it an asymmetric codec, where (typically) compression takes longer than decompression?
Is the codec useful on tiny or small blocks, or is it intended to be used on large solid blocks of data?
Does your codec require a zillion cores or massive amounts of RAM to be practical at all?
- Test and tune your codec on mobile and console devices. You'll be surprised at the dismally low performance available vs. even mid-range x86 devices. These are also the platforms that benefit most from data compression systems. On some CPUs, things like integer divides, variable-length shifts, and L2 cache misses are surprisingly expensive. On some platforms, load-hit-store stalls can crush the performance of seemingly decent-looking code.
Also, think about your codec's strengths and weaknesses, and how it will be used in practice. It's doubtful that one codec will be good for all real-world use cases. Some example use cases I've seen from the video game world:
- If a game is displaying a static loading screen, the codec probably has access to almost all of the machine's CPU(s) and possibly a good chunk of temporary memory. The decompressor must keep up with the data provider's (DVD/Blu-ray/network) rate, otherwise it becomes the bottleneck. As long as the codec's consumption rate is greater than or equal to the provider's data rate, it can use a ton of CPU (because it won't be the pipeline's bottleneck). A high-ratio, CPU-heavy, potentially threaded codec is excellent in this case.
- If a game is streaming assets in the background during gameplay, the codec probably doesn't have a lot of CPU available. The decompressor should be optimized for low memory consumption, high performance, low CPU cache overhead, etc. It's fine if the ratio is lower than the best achievable, because streaming systems are tolerant of high latencies.
- In many games I've worked on or seen, the vast majority of distributed data falls into a few big buckets: Audio, textures, meshes, animations, executable, compiled shaders, video. The rest of the game's data (scripts, protodata, misc serialized objects) forms a long tail (lots of tiny files and a small percent of the total). It can pay off to support optimizations for these specific data types.