| -rw-r--r-- | README.md | 29 |
1 files changed, 29 insertions, 0 deletions
| @@ -66,6 +66,28 @@ Then; | |||
| 66 | This will create two directories; `dictionaries` and `wordnets`. | 66 | This will create two directories; `dictionaries` and `wordnets`. |
| 67 | Linewise aligned definition files are in `wordnets/ready`. | 67 | Linewise aligned definition files are in `wordnets/ready`. |
| 68 | 68 | ||
| 69 | <details><summary>Language pairs and number of available aligned glosses</summary> | ||
| 70 | <p> | ||
| 71 | Source Language | Target Language | # of Pairs | ||
| 72 | --- | --- | ---: | ||
| 73 | en | bg | 4959 | ||
| 74 | en | el | 18136 | ||
| 75 | en | it | 12688 | ||
| 76 | en | ro | 58754 | ||
| 77 | en | sl | 3144 | ||
| 78 | en | sq | 4681 | ||
| 79 | bg | el | 2817 | ||
| 80 | bg | it | 2115 | ||
| 81 | bg | ro | 4701 | ||
| 82 | el | it | 4801 | ||
| 83 | el | ro | 2144 | ||
| 84 | el | sq | 4681 | ||
| 85 | it | ro | 10353 | ||
| 86 | ro | sl | 2085 | ||
| 87 | ro | sq | 4646 | ||
| 88 | </p> | ||
| 89 | </details> | ||
| 90 | |||
| 69 | ## Acquiring The Embeddings | 91 | ## Acquiring The Embeddings |
| 70 | 92 | ||
| 71 | We use [VecMap](https://github.com/artetxem/vecmap) on [fastText](https://fasttext.cc/) embeddings. You can skip this step if you are providing your own polylingual embeddings. | 93 | We use [VecMap](https://github.com/artetxem/vecmap) on [fastText](https://fasttext.cc/) embeddings. You can skip this step if you are providing your own polylingual embeddings. |
| @@ -88,3 +110,10 @@ git submodule init && git submodule update | |||
| 88 | 110 | ||
| 89 | Bear in mind that this will require around 50 GB free space. | 111 | Bear in mind that this will require around 50 GB free space. |
| 90 | 112 | ||
| 113 | ## Quick Demo | ||
| 114 | |||
| 115 | `demo.sh` is included; it downloads data for 2 languages. | ||
| 116 | |||
| 117 | ```bash | ||
| 118 | ./demo.sh | ||
| 119 | ``` | ||
