UMR data sets

UMR 1.0 Dataset
The following table shows the number of annotated sentences for each language in the UMR 1.0 dataset. "Doc Level" counts sentences annotated with document-level information, while "Sentence Level" counts sentences annotated with within-sentence relations only.
Language    Sentence Level    Doc Level
English     209               202
Chinese     358               358
Arapaho     406               109
Navajo      522               168
Sanapaná    602               602
Kukama      105               86
Total       2202              1525
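
The per-language counts can be cross-checked with a few lines of Python. The following is a minimal sketch, not part of the UMR release: the names `UMR_1_0_COUNTS`, `column_totals`, and `doc_to_sentence_ratio` are hypothetical, and the ratio is only meaningful as "coverage" under the assumption that the document-level sentences are a subset of the sentence-level ones, which the description above does not state explicitly.

```python
# Per-language counts copied from the table rows: (sentence_level, doc_level).
# The dictionary name and layout are illustrative, not an official UMR format.
UMR_1_0_COUNTS = {
    "English": (209, 202),
    "Chinese": (358, 358),
    "Arapaho": (406, 109),
    "Navajo": (522, 168),
    "Sanapaná": (602, 602),
    "Kukama": (105, 86),
}


def column_totals(counts):
    """Return (sentence_level_total, doc_level_total) summed over all languages."""
    sentence_total = sum(sent for sent, _ in counts.values())
    doc_total = sum(doc for _, doc in counts.values())
    return sentence_total, doc_total


def doc_to_sentence_ratio(counts):
    """Map each language to doc_level / sentence_level.

    Assumption: doc-level sentences are a subset of sentence-level ones,
    so the ratio can be read as document-level coverage.
    """
    return {lang: doc / sent for lang, (sent, doc) in counts.items()}


if __name__ == "__main__":
    sentence_total, doc_total = column_totals(UMR_1_0_COUNTS)
    print(f"Sentence level total: {sentence_total}")
    print(f"Doc level total: {doc_total}")
    for lang, ratio in doc_to_sentence_ratio(UMR_1_0_COUNTS).items():
        print(f"{lang}: {ratio:.0%} of sentences have doc-level annotation")
```

Recomputing the totals from the rows in this way is a quick check that the "Total" line agrees with the per-language figures.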