path: root/research
Age | Commit message | Author | Files | Lines
2023-10-26 | fix wording | Evgenii Kliuchnikov | 1 | -1/+1
PiperOrigin-RevId: 576788685
2023-07-10 | simplify building of fuzzer | Evgenii Kliuchnikov | 2 | -0/+0
PiperOrigin-RevId: 545950923
2023-01-03 | brotlidump: fix dictionary file discovery (#997) | Eugene Kliuchnikov | 2 | -3/+9
2022-11-17 | Update | Evgenii Kliuchnikov | 1 | -2/+2
Documentation:
- add note that brotli is a "stream" format, not an archive-like one
- regenerate .1 with Pandoc
Build:
- drop legacy "BROTLI_BUILD_PORTABLE" option
- drop "BROTLI_SANITIZED" definition
Code:
- c: comb includes
- c/enc: extract encoder state into separate header
- c/enc: drop designated q10 codepath
- c/enc: dealing better with flushing of empty stream
- fix MSVC compilation
API:
- py: use library version instead of one in version.h
- c: add pluggable API to report consumed input / produced output
- c/java: support "lean" prepared dictionaries (without copy of source)
2021-11-10 | Prepare for copybara (#939) | Eugene Kliuchnikov | 3 | -3/+3
Co-authored-by: Eugene Kliuchnikov <eustas@chromium.org>
2021-09-08 | Strip "./" in includes (#925) | Eugene Kliuchnikov | 5 | -7/+7
Co-authored-by: Eugene Kliuchnikov <eustas@chromium.org>
2021-08-31 | Migrate to github actions (#920) | Eugene Kliuchnikov | 5 | -4/+34
Not all combinations are migrated to the initial configuration; corresponding TODOs added.
Drive-by: additional combinations uncovered minor portability problems -> fixed
Drive-by: remove no-longer used "script" files.
Co-authored-by: Eugene Kliuchnikov <eustas@chromium.org>
2021-08-18 | Update (#918) | Eugene Kliuchnikov | 10 | -56/+56
Prepare to use copybara workflow.
2021-06-23 | Update (#908) | Eugene Kliuchnikov | 1 | -27/+44
* re-enable Js build/test
* improve decoder performance
* rewrite dictionary data in Java/Js to a shorter uncompressed form
* improve dictionary generation tool
2020-09-27 | docs: Fix small typo: rougly -> roughly (#849) | Tim Gates | 1 | -1/+1
2020-08-26 | Update (#826) | Eugene Kliuchnikov | 2 | -11/+15
* IMPORTANT: decoder: fix potential overflow when input chunk is >2GiB
* simplify max Huffman table size calculation
* eliminate symbol duplicates (static arrays in .h files)
* minor combing in research/ code
2020-05-15 | Update (#807) | Eugene Kliuchnikov | 6 | -0/+0
- fix formatting
- fix type conversion
- fix no-op arithmetic with null-pointer
- improve performance of hash_longest_match64
- go: detect read after close
- java decoder: support compound dictionary
- remove executable flag on non-scripts
2019-04-12 | Update (#749) | Eugene Kliuchnikov | 1 | -25/+119
Update:
* Bazel: fix MSVC configuration
* C: common: extended documentation and helpers around distance codes
* C: common: enable BROTLI_DCHECK in "debug" builds
* C: common: fix implicit trailing zero in `kPrefixSuffix`
* C: dec: fix possible bit reader discharge for "large-window" mode
* C: dec: simplify distance decoding via lookup table
* C: dec: reuse decoder state members memory via union with lookup table
* C: dec: add decoder state diagram
* C: enc: clarify access to static dictionary
* C: enc: improve static dictionary hash
* C: enc: add "stream offset" parameter for parallel encoding
* C: enc: reorganize hasher; now Q2-Q3 require exactly 256KiB to avoid global TCMalloc lock
* C: enc: fix rare access to uninitialized data in ring-buffer
* C: enc: reorganize logging / checks in `write_bits.h`
* Java: dec: add "large-window" support
* Java: dec: improve speed
* Java: dec: debug and 32-bit mode are now activated via system properties
* Java: dec: demystify some state variables (use better names)
* Dictionary generator: add single input mode
* Java: dec: modernize tests
* Bazel: js: pick working commit for closure rules
2018-06-09 | Update (#680) | Eugene Kliuchnikov | 2 | -2/+2
* fix MSVC warnings
* cleanups
2018-06-04 | Inverse bazel project/workspace tree (#677) | Eugene Kliuchnikov | 2 | -1/+13
* Inverse bazel workspace tree.
Now each subproject directly depends on root (c) project. This helps to mitigate Bazel bug bazelbuild/bazel#2391; short summary: Bazel does not work if referenced subproject `WORKSPACE` uses any repositories that embedding project does not.
Bright side: building C project is much faster; no need to download closure, go and JDK...
2018-03-20 | Update (#651) | Eugene Kliuchnikov | 1 | -13/+25
* fix `bazel` build (ignore switch case fall-through)
* add `NPOSTFIX` / `NDIRECT` encoder parameters
* fix source file lists (add `params.h`)
* fix bug in `durchschlag`
* print clarifying messages when CLI argument parsing fails
2018-03-02 | Update (#643) [v1.0.3] | Eugene Kliuchnikov | 2 | -0/+100
Update
* make the zopflification aware of `NDIRECT`, `NPOSTFIX` (better compression in `font` mode)
* add small and simple decoder tool
* fix typo
* Java: wrapper: make decoder channel more async-friendly
Ramp up version to 1.0.3 / 1.0.3
2018-02-26 | New feature: "Large Window Brotli" (#640) | Eugene Kliuchnikov | 11 | -171/+1206
* New feature: "Large Window Brotli"
By setting special encoder/decoder flag it is now possible to extend LZ-window up to 30 bits; though produced stream will not be RFC7932 compliant.
Added new dictionary generator - "DSH". It combines speed of "Sieve" and quality of "DM". Plus utilities to prepare train corpora (remove unique strings).
Improved compression ratio: now two sub-blocks could be stitched: the last copy command could be extended to span the next sub-block.
Fixed compression ineffectiveness caused by floating numbers rounding and wrong cost heuristic.
Other C changes:
- combined / moved `context.h` to `common`
- moved transforms to `common`
- unified some aspects of code formatting
- added an abstraction for encoder (static) dictionary
- moved default allocator/deallocator functions to `common`
brotli CLI:
- window size is auto-adjusted if not specified explicitly
Java:
- added "eager" decoding both to JNI wrapper and pure decoder
- huge speed-up of `DictionaryData` initialization
* Add dictionaryless compressed dictionary
* Fix `sources.lst`
* Fix `sources.lst` and add a note that `libtool` is also required.
* Update setup.py
* Fix `EagerStreamTest`
* Fix BUILD file
* Add missing `libdivsufsort` dependency
* Fix "unused parameter" warning.
2018-02-08 | Fix brotlidump.py crashing when complex prefix code has exactly 1 non-zero code length (#635) | Daniel Chýlek | 1 | -3/+4
According to the format specification regarding complex prefix codes:
> If there are at least two non-zero code lengths, any trailing zero
> code lengths are omitted, i.e., the last code length in the
> sequence must be non-zero. In this case, the sum of (32 >> code
> length) over all the non-zero code lengths must equal to 32.
> If the lengths have been read for the entire code length alphabet
> and there was only one non-zero code length, then the prefix code
> has one symbol whose code has zero length.
The script does not handle a case where there is just 1 non-zero code length where the sum rule doesn't apply, which causes a StopIteration exception when it attempts to read past the list boundaries. An example of such file is tests/testdata/mapsdatazrh.compressed.
I made sure this change doesn't break anything by processing all *.compressed files from the testdata folder with no thrown exceptions.
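The rule quoted in this commit message can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code from brotlidump.py: the function name `read_complex_code_lengths`, its arguments, and the returned mapping are hypothetical, and the sketch ignores bit-level parsing and the order in which the format stores the lengths. It assumes the code length alphabet has 18 symbols, as in RFC 7932.

```python
CODE_LENGTH_ALPHABET_SIZE = 18  # assumed size of the code length alphabet


def read_complex_code_lengths(values, alphabet_size=CODE_LENGTH_ALPHABET_SIZE):
    """Consume code lengths until the code is complete.

    `values` is any iterable of already-decoded code lengths (hypothetical
    input; a real decoder reads them from the bit stream).
    Returns a dict mapping symbol index -> code length.
    """
    lengths = []
    total = 0  # running sum of (32 >> length) over non-zero lengths
    for value in values:
        lengths.append(value)
        if value:
            total += 32 >> value
        # Stop when the sum rule is satisfied or the alphabet is exhausted,
        # so we never read past the end of the available lengths.
        if total == 32 or len(lengths) == alphabet_size:
            break

    nonzero = [i for i, n in enumerate(lengths) if n]
    if len(nonzero) == 1:
        # Exactly one non-zero length: the sum rule does not apply and the
        # single symbol gets a zero-length code.
        return {nonzero[0]: 0}
    if total != 32:
        raise ValueError("code lengths do not form a complete prefix code")
    return {i: lengths[i] for i in nonzero}
```

With two or more non-zero lengths the loop stops as soon as the sum reaches 32; with a single non-zero length it reads the whole alphabet and returns that one symbol with length 0 instead of failing.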
2017-11-28 | Update (#620) [v1.0.2] | Eugene Kliuchnikov | 1 | -0/+1
* add autotools build
* separate semantic and ABI version
* extract sources.lst (used by CMake and Automake)
* share pkgconfig templates (used by CMake and Automake)
* decoder: always set `total_out`
* encoder: fix `BROTLI_ENSURE_CAPACITY` macro (no-op after preprocessor)
* decoder/encoder: refine `free_func` contract
2017-10-13 | Add new (fast) dictionary generator engine. (#616) | Eugene Kliuchnikov | 6 | -25/+436
Add CLI for dictionary generation.
Add BUILD file for research folder
2017-10-10 | Fix permissions of various files in project (#613) | Tomáš Popela | 2 | -0/+0
Move from 755 to 644.
2017-08-28 | Update (#590) | Eugene Kliuchnikov | 1 | -2/+5
* add transpiled JS decoder
* make PY wrapper accept memview
* fix dictionary generator
* speedup compression of RLEish data
2017-07-21 | Update (#574) [custom-dictionary] | Eugene Kliuchnikov | 2 | -0/+300
* Update
* decoder: better behavior after failure
* encoder: replace "len_x_code" with delta
* research: add experimental dictionary generator
* python: test combing
2016-12-22 | Research (#491) | Eugene Kliuchnikov | 2 | -32/+151
* add advanced mode for optimal references generator
* fix #489
Thanks to Ivan Nikulin for working on it.
2016-12-20 | Move brotlidump.py to research/ (#487) | Eugene Kliuchnikov | 1 | -0/+2361
2016-09-22 | Update research | Eugene Kliuchnikov | 5 | -36/+55
* don't use `assert` when side-effect is desired
* use `gflags` to pick options from args
Other changes:
* teach stub `Makefile` to do partial rebuild
* remove obsolete `tools/version.h`
2016-09-19 | Replace sais.hxx by submodule hillbig/esaxx. | Ivan Nikulin | 3 | -365/+1
2016-09-15 | Update research tools description. | Ivan Nikulin | 4 | -6/+21
2016-09-15 | Update variable naming. | Ivan Nikulin | 1 | -20/+16
2016-09-15 | Add description of research tools. | Ivan Nikulin | 2 | -7/+53
2016-09-15 | Add distance encoding research tools. | Ivan Nikulin | 6 | -0/+879