@@ -56,7 +56,7 @@ The last step can be replaced with building a Debian package (type: `cpack`) to

 `salza` eats a lot of memory. Launch it with the verbose option (`-v`) to monitor its behaviour.
 In case it is too slow, that may be because your machine had to free memory. A suggestion is to kill the process and launch it again.
-That proved useful in LXD containers where the guests do not have the same view of memory as the host.
+That proved useful in LXD containers where the guests do not have the same view of memory as the host (you /think/ you have plenty of free memory, but you don't).

 In any case, all strings have to fit in memory with their associated search structures.

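A rough pre-flight check for the memory note above, as a sketch only. It sums the bytes of the string files and compares them with the memory limit of the current cgroup, assuming the container's memory limit is enforced through cgroups. The function names (`dataset_bytes`, `cgroup_limit`, `fits_under`) are hypothetical helpers, not part of `salza`, and the cgroup paths are the usual v2 and v1 locations, which may differ on your system. The search structures add an overhead that is not quantified here, so passing this check is necessary but not sufficient.

```shell
#!/bin/sh
# Hypothetical helpers; none of these are provided by salza.

# Total size in bytes of all regular files below directory $1.
dataset_bytes() {
    find "$1" -type f -exec cat {} + | wc -c | tr -d ' '
}

# Memory limit of the current cgroup in bytes.
# Prints "max" (cgroup v2, no limit) or "unknown" when nothing is readable.
cgroup_limit() {
    for f in /sys/fs/cgroup/memory.max \
             /sys/fs/cgroup/memory/memory.limit_in_bytes; do
        if [ -r "$f" ]; then
            cat "$f"
            return 0
        fi
    done
    echo unknown
}

# Succeeds when $1 bytes are below limit $2.
# A non-numeric limit ("max", "unknown", empty) is treated as "no limit".
fits_under() {
    case "$2" in
        ''|*[!0-9]*) return 0 ;;
    esac
    [ "$1" -lt "$2" ]
}

# Usage, with the placeholder path used elsewhere in this README:
#   dir=/path/to/datasets/languages/strings
#   fits_under "$(dataset_bytes "$dir")" "$(cgroup_limit)" \
#       || echo "the strings alone exceed the cgroup memory limit" >&2
```

Comparing against the cgroup limit rather than the output of `free` is a deliberate choice here, since the note above is precisely that the guest's view of free memory can be misleading.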
@@ -91,7 +91,7 @@ The NCD distance matrices are provided with the datasets.
 To compute the NSD matrices:
 1. make sure you have installed [`drpt`](https://forge.uvolante.org/code/drpt/wikis) ;
 2. download the script [compute-nxd-matrices](https://cloud.uvolante.org/index.php/s/gqPCRCKwFKwf887/download) ;
-3. update the variable `DATASET` in `compute-nxd-matrices` with the path to the datasets ;
+3. update the `DATASET` variable in `compute-nxd-matrices` with the path to the datasets ;
 4. run the script on the four datasets:
 ```
 ./compute-nxd-matrices markov
@@ -106,8 +106,8 @@ This Jupyter [notebook](https://cloud.uvolante.org/index.php/s/jB4qgyMZx6PxttD/d

 First, compute the NSD semi-distance matrices:
 ```
-salza -d -i /path/to/dataset/languages/strings -v > ~/languages-nsd-sim.csv
-salza -d -i /path/to/dataset/languages/strings --cross -v > ~/languages-nsd.csv
+salza -d -i /path/to/datasets/languages/strings -v > ~/languages-nsd-sim.csv
+salza -d -i /path/to/datasets/languages/strings --cross -v > ~/languages-nsd.csv
 ```

 This Jupyter [notebook](https://cloud.uvolante.org/index.php/s/N7xNkTJsYF5sc5T/download) uses these utility [functions](https://cloud.uvolante.org/index.php/s/ERs88CFipPxZZaB/download) to plot the dendrograms.
@@ -118,12 +118,12 @@ Make sure you have installed `neato` (provided by `graphviz`).

 Computing a CPDAG from the Toussaint drafts:
 ```
-salza -g -i /path/to/dataset/toussaint/strings --skel stable --eta 0.008 --dag -v | neato -Tpdf > toussaint.pdf
+salza -g -i /path/to/datasets/toussaint/strings --skel stable --eta 0.008 --dag -v | neato -Tpdf > toussaint.pdf
 ```

 Computing a stable skeleton from the `languages` dataset:
 ```
-salza -g -i /path/to/dataset/dataset/strings --skel stable -v | neato -Tpdf > languages.pdf
+salza -g -i /path/to/datasets/languages/strings --skel stable -v | neato -Tpdf > languages.pdf
 ```

 # References