author Ivan Nikulin <vanickulin@google.com> 2016-09-15 11:34:19 +0200
committer Ivan Nikulin <vanickulin@google.com> 2016-09-15 11:34:19 +0200
commit 9589396e5dcca3fd16f568ff93d99124fd0eed2e
tree a1748ce452d6f7690b121a4263c195e7d4411110 /research
parent 58cecf1783c1a814f282f859609721fefebf74aa
Add description of research tools.
Diffstat (limited to 'research')
-rw-r--r-- research/README.md 52
-rw-r--r-- research/read_dist.h 8
2 files changed, 53 insertions, 7 deletions
diff --git a/research/README.md b/research/README.md
new file mode 100644
index 0000000..ce89dd6
--- /dev/null
+++ b/research/README.md
@@ -0,0 +1,52 @@
+## Introduction
+
+This directory contains several research tools that proved useful during research on LZ77 backward distance encoding.
+
+Note that all `FLAGS_*` variables were intended to be command-line flags.
+
+## Tools
+### find\_opt\_references
+
+This tool generates optimal (in terms of match length) backward references for every position in the input files and stores them in a `*.dist` file, described below.
+
+Example usage:
+
+    find_opt_references input.txt output.dist
+
+### draw\_histogram
+
+This tool generates a visualization of the distribution of backward references stored in a `*.dist` file. The output is a grayscale PGM (binary) image.
+
+Example usage:
+
+    draw_histogram input.dist 65536 output.pgm
+
+### draw\_diff
+
+This tool generates a diff PPM (binary) image between two input PGM (binary) images. The input images must be the same size and contain 255 colors. It is useful for comparing different backward reference distributions for the same input file.
+
+Example usage:
+
+    draw_diff image1.pgm image2.pgm diff.ppm
+
+
+## Backward distance file format
+
+The format of `*.dist` files is as follows:
+
+    [[     0|match length][     1|position|distance]...]
+     [1 byte|     4 bytes][1 byte| 4 bytes| 4 bytes]
+
+In more detail: each backward reference is stored as a position-distance pair, optionally preceded by a copy length. A copy length is prefixed with flag byte 0; a position-distance pair is prefixed with flag byte 1. Each number is a 32-bit integer. A copy length always comes before its position-distance pair. A standalone copy length is allowed; in that case it is ignored.
+
+Here's an example of how to read from a `*.dist` file:
+
+```c++
+#include "read_dist.h"
+
+FILE* fin = fopen("input.dist", "rb");  // illustrative file name
+int copy, pos, dist;
+while (ReadBackwardReference(fin, &copy, &pos, &dist)) {
+ ...
+}
+```
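Note: the `*.dist` layout described in the README above can be sketched as a small writer/reader pair. This is a minimal illustration, not the repository's `ReadBackwardReference` from `read_dist.h`. The helper names `WriteCopyLength`, `WriteReference`, and `ReadReference` are hypothetical. The sketch also assumes that integers are stored in host byte order and that a missing copy length is reported as 0; the README states neither.

```c++
#include <cassert>
#include <cstdint>
#include <cstdio>

// Hypothetical helpers, not part of read_dist.h.
// Assumption: 32-bit integers are stored in host byte order.

// Writes a copy-length record: flag byte 0, then a 32-bit length.
inline bool WriteCopyLength(FILE* out, int32_t copy) {
  assert(out != nullptr);
  return fputc(0, out) != EOF && fwrite(&copy, sizeof(copy), 1, out) == 1;
}

// Writes a position-distance record: flag byte 1, then two 32-bit integers.
inline bool WriteReference(FILE* out, int32_t pos, int32_t dist) {
  assert(out != nullptr);
  return fputc(1, out) != EOF &&
         fwrite(&pos, sizeof(pos), 1, out) == 1 &&
         fwrite(&dist, sizeof(dist), 1, out) == 1;
}

// Reads the next position-distance pair. Returns false at end of file or on
// a malformed record. *copy receives the copy length that immediately
// precedes the pair, or 0 if none was given (an assumption of this sketch).
// A copy length that is not followed by a pair is silently dropped.
inline bool ReadReference(FILE* in, int32_t* copy, int32_t* pos,
                          int32_t* dist) {
  assert(in != nullptr);
  *copy = 0;
  for (;;) {
    int flag = fgetc(in);
    if (flag == 0) {
      if (fread(copy, sizeof(*copy), 1, in) != 1) return false;
    } else if (flag == 1) {
      return fread(pos, sizeof(*pos), 1, in) == 1 &&
             fread(dist, sizeof(*dist), 1, in) == 1;
    } else {
      return false;  // End of file or unknown flag byte.
    }
  }
}
```

Writing a copy length with `WriteCopyLength` and then a pair with `WriteReference` produces one 14-byte reference (5 + 9 bytes); calling `ReadReference` in a loop until it returns false walks the file one reference at a time.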
diff --git a/research/read_dist.h b/research/read_dist.h
index e626ea1..dd5ade3 100644
--- a/research/read_dist.h
+++ b/research/read_dist.h
@@ -5,13 +5,7 @@
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
-/* API for reading distances from *.dist file.
- The format of *.dist file is as follows: for each backward reference there is
- a position-distance pair, also a copy length may be specified. Copy length is
- prefixed with flag byte 0, position-distance pair is prefixed with flag
- byte 1. Each number is a 32-bit integer. Copy length always comes before
- position-distance pair. Standalone copy length is allowed, in this case it is
- ignored. */
+/* API for reading distances from *.dist file. */
#include <cassert>
#include <cstdio>