ModelChorus

Copyright 2026 MeetKai Inc.

Benchmarks/Rnj 1 Instruct/Arabic (Saudi Arabia) tasks

Rnj 1 Instruct

5 tasks

Each row below is a single benchmark task this model was evaluated on. The Score column is the mean of the quality metrics the task reports (accuracy, F1, exact match, etc.); the sample_len value listed with the metrics is not part of that mean. Click a row to browse the individual questions and the model's responses.

Average: 52.4
| Score | Language | Task | Metrics |
| 77.0 | Arabic (Saudi Arabia) | arabic_sib200 (arabic classification) | f1_macro: 77.0, sample_len: 204.0 |
| 51.3 | Arabic (Saudi Arabia) | arabic_belebele (arabic mcq) | f1_macro: 51.3, sample_len: 900.0 |
| 51.2 | Arabic (Saudi Arabia) | arabic_tydiqa (arabic qa) | exact_match: 39.1, f1: 63.4, sample_len: 921.0 |
| 48.1 | Arabic (Saudi Arabia) | arabic_aratrust (arabic mcq) | f1_macro: 48.1, sample_len: 522.0 |
| 34.1 | Arabic (Saudi Arabia) | arabic_mmlu (arabic mcq) | f1_macro: 34.1, sample_len: 14316.0 |
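
As a rough check on how the Score and Average figures relate to the listed metrics, the sketch below recomputes them from the values shown on this page. It rests on two assumptions that the page does not state outright: that sample_len is metadata and is excluded from the per-task mean, and that the overall Average is an unweighted mean of the per-task scores. The names `TASK_METRICS` and `task_score` are hypothetical and are not part of any ModelChorus API.

```python
from statistics import mean

# Metric values copied from the table on this page.
# sample_len is deliberately left out (assumption: it is not a quality metric).
TASK_METRICS = {
    "arabic_sib200": {"f1_macro": 77.0},
    "arabic_belebele": {"f1_macro": 51.3},
    "arabic_tydiqa": {"exact_match": 39.1, "f1": 63.4},
    "arabic_aratrust": {"f1_macro": 48.1},
    "arabic_mmlu": {"f1_macro": 34.1},
}


def task_score(metrics):
    """Return the unweighted mean of a task's reported metrics."""
    return mean(list(metrics.values()))


# Per-task scores, then an unweighted mean across tasks
# (assumption: every task counts equally regardless of sample_len).
scores = {task: task_score(m) for task, m in TASK_METRICS.items()}
overall = mean(list(scores.values()))

for task, score in scores.items():
    print(f"{task}: {score:.2f}")
print(f"overall: {overall:.2f}")
```

Under these assumptions the arabic_tydiqa score comes out to about 51.25 and the overall mean to about 52.35, which is consistent with the displayed 51.2 and 52.4 once rounding of the underlying unrounded values is taken into account.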