# What is `salza`?

`salza` is a practical implementation of two universal algorithmic information measures on sequences based on LZ77[^1]<sup>,</sup>[^2] and relative LZ[^3]<sup>,</sup>[^4] coding. Its rationale is described in this [preprint](https://cloud.uvolante.org/index.php/s/6VDJhgCBFv2dUCM/download).
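
To give a feel for the kind of parsing these measures build on, here is a naive sketch of an LZ77-style factorization, in the longest-previous-factor formulation discussed by Crochemore and Ilie[^2]. It is only an illustration: the function name `lz77_factorize` is hypothetical, the quadratic search is far from an efficient implementation, and it should not be read as `salza`'s actual code or measure.

```python
from typing import List


def lz77_factorize(s: str) -> List[str]:
    """Greedy LZ77-style parse (illustrative, hypothetical helper).

    Each phrase is the longest prefix of the remaining text that also
    starts at some earlier position (overlaps allowed), or a single
    fresh symbol when no such prefix exists.
    """
    phrases = []
    i, n = 0, len(s)
    while i < n:
        best = 0
        for j in range(i):  # try every earlier starting position
            k = 0
            while i + k < n and s[j + k] == s[i + k]:
                k += 1
            best = max(best, k)
        step = max(best, 1)  # no earlier occurrence: emit one new symbol
        phrases.append(s[i:i + step])
        i += step
    return phrases


if __name__ == "__main__":
    for text in ("abababab", "abcdefgh"):
        parse = lz77_factorize(text)
        print(text, "->", parse, f"({len(parse)} phrases)")
```

Repetitive inputs parse into few phrases (`abababab` gives `a`, `b`, `ababab`), while inputs with no repetition need one phrase per symbol, which is the intuition behind using such factorizations to quantify information.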

`salza` comes with built-in computation of a universal, normalized semi-distance (much in the spirit of the work of Cilibrasi _et al._[^5]) and an implementation of causality inference using the (stable) PC algorithm[^6].
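
The semi-distance idea can be illustrated with the normalized compression distance (NCD) of Cilibrasi and Vitányi[^5], defined as `(C(xy) - min(C(x), C(y))) / max(C(x), C(y))` where `C` is a compressed length. The sketch below is a stand-in under stated assumptions: it uses Python's `zlib` as the compressor and the helper names are hypothetical. It is not `salza`'s own measure, which is the one defined in the preprint.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (assumed stand-in compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)  # compress the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    a = b"the quick brown fox jumps over the lazy dog " * 20
    b = b"the quick brown fox leaps over the lazy cat " * 20
    c = bytes(range(256)) * 4  # shares little structure with a
    print(f"ncd(a, b) = {ncd(a, b):.3f}")
    print(f"ncd(a, c) = {ncd(a, c):.3f}")
```

Values close to 0 indicate that one input is largely predictable from the other, and values close to 1 indicate that they share little structure.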

`salza` is multithreaded[^7], too.

# Licensing information

`salza` is released as is, without any warranty, under a dual licensing scheme.

By default, `salza` is distributed under the [GNU Affero General Public License, version 3](https://www.gnu.org/licenses/agpl-3.0.html).

If you cannot comply with the AGPLv3, please [contact us](mailto:cayre@uvolante.org?Subject=Alternative%20Software%20Licensing%20Inquiry%20for%20SALZA) for alternative licensing.

# Debian/Ubuntu repository

We provide pre-compiled binaries for Debian/Ubuntu on the `amd64` architecture.

Please follow [these instructions](https://www.uvolante.org/apt) to add the repository to your system.

Once the repository is available on your system, install the package:

```
sudo apt install salza
```

# Source code

## Requirements

`salza` makes use of the following software:

* `clang`, `make`, `cmake`, `doxygen`, `git`,
* the [`oops`](https://forge.uvolante.org/code/oops/wikis) library.

## Cloning the source repository

Once `oops` is compiled, clone the `git` tree:

```
git clone https://forge.uvolante.org/code/salza.git
```

# References

[^1]: Jacob Ziv and Abraham Lempel, _"A Universal Algorithm for Sequential Data Compression"_, IEEE Transactions on Information Theory, vol. 23, no. 3, pp. 337--343, May 1977.

[^2]: Maxime Crochemore and Lucian Ilie, _"Computing Longest Previous Factor in Linear Time and Applications"_, Information Processing Letters, vol. 106, no. 2, pp. 75--80, April 2008.

[^3]: Jacob Ziv and Neri Merhav, _"A Measure of Relative Entropy between Individual Sequences with Application to Universal Classification"_, IEEE Transactions on Information Theory, vol. 39, no. 4, pp. 1270--1279, July 1993.

[^4]: Christopher Hoobin, Simon J. Puglisi and Justin Zobel, _"Relative Lempel-Ziv Factorization for Efficient Storage and Retrieval of Web Collections"_, Proceedings of the VLDB Endowment, vol. 5, no. 3, pp. 265--273, November 2011.

[^5]: Rudi Cilibrasi and Paul M. B. Vitányi, _"Clustering by Compression"_, IEEE Transactions on Information Theory, vol. 51, no. 4, pp. 1523--1545, April 2005.

[^6]: Diego Colombo and Marloes H. Maathuis, _"Order-Independent Constraint-Based Causal Structure Learning"_, Journal of Machine Learning Research, vol. 15, pp. 3921--3962, November 2014 ([link](http://jmlr.org/papers/v15/colombo14a.html)).

[^7]: Bil Lewis and Daniel J. Berg, _"Threads Primer: A Guide to Multithreaded Programming"_, ISBN-13 978-0134436982, Prentice Hall, October 1995.