UMR data sets
The following table shows the number of annotated sentences for each language in the UMR 1.0 dataset.
"Doc Level" refers to sentences annotated with document-level information, while
"Sentence Level" refers to sentences annotated with within-sentence relations only.
| Language | Sentence Level | Doc Level |
|----------|----------------|-----------|
| English  | 209            | 202       |
| Chinese  | 358            | 358       |
| Arapaho  | 406            | 109       |
| Navajo   | 522            | 168       |
| Sanapaná | 602            | 602       |
| Kukama   | 105            | 86        |
| Total    | 2022           | 1525      |
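
As a quick sanity check on the figures above, the following minimal sketch re-adds the per-language rows and computes, for each language, the share of sentence-level sentences that also carry document-level annotation. The names `counts`, `column_sums`, and `doc_coverage` are illustrative and not part of any UMR tooling. Note that the sentence-level rows as listed sum to 2202 rather than the 2022 shown in the Total row, so one of those figures may contain a typo; the document-level rows sum to 1525, matching the listed total.

```python
# Sentence counts copied from the table above: (sentence level, doc level).
# The dictionary and helper names are hypothetical, for illustration only.
counts = {
    "English": (209, 202),
    "Chinese": (358, 358),
    "Arapaho": (406, 109),
    "Navajo": (522, 168),
    "Sanapaná": (602, 602),
    "Kukama": (105, 86),
}


def column_sums(table):
    """Return (sentence-level total, doc-level total) over all languages."""
    sentence_total = sum(sent for sent, _ in table.values())
    doc_total = sum(doc for _, doc in table.values())
    return sentence_total, doc_total


def doc_coverage(table):
    """Return the doc-level / sentence-level ratio for each language."""
    return {lang: doc / sent for lang, (sent, doc) in table.items()}


sentence_total, doc_total = column_sums(counts)
print(f"Sentence-level sum: {sentence_total}")
print(f"Doc-level sum:      {doc_total}")

for lang, ratio in doc_coverage(counts).items():
    print(f"{lang}: {ratio:.1%} of sentences have doc-level annotation")
```

Under these figures, Chinese and Sanapaná are fully annotated at the document level, while Arapaho and Navajo have document-level annotation for only a minority of their sentences.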