path: root/research/dictionary_generator.cc
Age | Commit message | Author | Files | Lines
2021-11-10 | Prepare for copybara (#939) | Eugene Kliuchnikov | 1 | -1/+1
Co-authored-by: Eugene Kliuchnikov <eustas@chromium.org>
2021-09-08 | Strip "./" in includes (#925) | Eugene Kliuchnikov | 1 | -3/+3
Co-authored-by: Eugene Kliuchnikov <eustas@chromium.org>
2021-08-31 | Migrate to github actions (#920) | Eugene Kliuchnikov | 1 | -3/+12
Not all combinations are migrated to the initial configuration; corresponding TODOs added.
Drive-by: additional combinations uncovered minor portability problems -> fixed
Drive-by: remove no-longer used "script" files.
Co-authored-by: Eugene Kliuchnikov <eustas@chromium.org>
2021-06-23 | Update (#908) | Eugene Kliuchnikov | 1 | -27/+44
* re-enable Js build/test
* improve decoder performance
* rewrite dictionary data in Java/Js to a shorter uncompressed form
* improve dictionary generation tool
2020-05-15 | Update (#807) | Eugene Kliuchnikov | 1 | -0/+0
- fix formatting
- fix type conversion
- fix no-op arithmetic with null-pointer
- improve performance of hash_longest_match64
- go: detect read after close
- java decoder: support compound dictionary
- remove executable flag on non-scripts
2019-04-12 | Update (#749) | Eugene Kliuchnikov | 1 | -25/+119
Update:
* Bazel: fix MSVC configuration
* C: common: extended documentation and helpers around distance codes
* C: common: enable BROTLI_DCHECK in "debug" builds
* C: common: fix implicit trailing zero in `kPrefixSuffix`
* C: dec: fix possible bit reader discharge for "large-window" mode
* C: dec: simplify distance decoding via lookup table
* C: dec: reuse decoder state members memory via union with lookup table
* C: dec: add decoder state diagram
* C: enc: clarify access to static dictionary
* C: enc: improve static dictionary hash
* C: enc: add "stream offset" parameter for parallel encoding
* C: enc: reorganize hasher; now Q2-Q3 require exactly 256KiB to avoid global TCMalloc lock
* C: enc: fix rare access to uninitialized data in ring-buffer
* C: enc: reorganize logging / checks in `write_bits.h`
* Java: dec: add "large-window" support
* Java: dec: improve speed
* Java: dec: debug and 32-bit mode are now activated via system properties
* Java: dec: demystify some state variables (use better names)
* Dictionary generator: add single input mode
* Java: dec: modernize tests
* Bazel: js: pick working commit for closure rules
2018-02-26 | New feature: "Large Window Brotli" (#640) | Eugene Kliuchnikov | 1 | -20/+99
* New feature: "Large Window Brotli"

By setting special encoder/decoder flag it is now possible to extend LZ-window up to 30 bits; though produced stream will not be RFC7932 compliant.

Added new dictionary generator - "DSH". It combines speed of "Sieve" and quality of "DM". Plus utilities to prepare train corpora (remove unique strings).

Improved compression ratio: now two sub-blocks could be stitched: the last copy command could be extended to span the next sub-block.

Fixed compression ineffectiveness caused by floating numbers rounding and wrong cost heuristic.

Other C changes:
- combined / moved `context.h` to `common`
- moved transforms to `common`
- unified some aspects of code formatting
- added an abstraction for encoder (static) dictionary
- moved default allocator/deallocator functions to `common`

brotli CLI:
- window size is auto-adjusted if not specified explicitly

Java:
- added "eager" decoding both to JNI wrapper and pure decoder
- huge speed-up of `DictionaryData` initialization

* Add dictionaryless compressed dictionary
* Fix `sources.lst`
* Fix `sources.lst` and add a note that `libtool` is also required.
* Update setup.py
* Fix `EagerStreamTest`
* Fix BUILD file
* Add missing `libdivsufsort` dependency
* Fix "unused parameter" warning.
2017-10-13 | Add new (fast) dictionary generator engine. (#616) | Eugene Kliuchnikov | 1 | -0/+153
Add CLI for dictionary generation.
Add BUILD file for research folder