-rw-r--r-- | README.md | 29 |
1 files changed, 29 insertions, 0 deletions
@@ -66,6 +66,28 @@ Then;
 This will create two directories; `dictionaries` and `wordnets`.
 Linewise aligned definition files are in `wordnets/ready`.
 
+<details><summary>Language pairs and number of available aligned glosses</summary>
+<p>
+Source Language | Target Language | # of Pairs
+--- | --- | ---:
+en | bg | 4959
+en | el | 18136
+en | it | 12688
+en | ro | 58754
+en | sl | 3144
+en | sq | 4681
+bg | el | 2817
+bg | it | 2115
+bg | ro | 4701
+el | it | 4801
+el | ro | 2144
+el | sq | 4681
+it | ro | 10353
+ro | sl | 2085
+ro | sq | 4646
+</p>
+</details>
+
 ## Acquiring The Embeddings
 
 We use [VecMap](https://github.com/artetxem/vecmap) on [fastText](https://fasttext.cc/) embeddings. You can skip this step if you are providing your own polylingual embeddings.
@@ -88,3 +110,10 @@ git submodule init && git submodule update
 
 Bear in mind that this will require around 50 GB free space.
 
+## Quick Demo
+
+`demo.sh` is included, downloads data for 2 languages.
+
+```bash
+./demo.sh
+```