ModelChorus

Copyright 2026 MeetKai Inc.

Rnj 1 Instruct

5 tasks

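Each row below is a single benchmark task this model was evaluated on. The Score column averages the metrics the task reports (accuracy, F1, exact-match, etc.). The sample_len value listed alongside the metrics is the number of evaluated samples and does not appear to enter the average: the ifeval score of 69.6 equals the mean of its four accuracy metrics alone.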

Average: 71.7

Score  Language      Task                                 Metrics
84.4   English (US)  english_mgsm (english math)          exact_match: 84.4, sample_len: 250.0
82.4   English (US)  english_gsm8k (english math)         exact_match: 82.4, sample_len: 1319.0
77.7   English (US)  english_belebele (english mcq)       f1_macro: 77.7, sample_len: 900.0
69.6   English (US)  ifeval (ifeval)                      inst_level_loose_acc: 76.3, inst_level_strict_acc: 72.3, prompt_level_loose_acc: 66.7, prompt_level_strict_acc: 63.0, sample_len: 541.0
44.2   English (US)  english_mmlu_pro (english mmlu pro)  exact_match: 44.2, sample_len: 2100.0
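
The way the Score column and the overall Average relate to the per-task metrics can be checked with a short sketch. This is not the site's own code: the names TASK_METRICS, task_score, and overall_average are hypothetical, and leaving sample_len out of the mean is an assumption, chosen because it reproduces the published ifeval score. Whether the site averages rounded or unrounded task scores is not stated; with these numbers both give the same result.

```python
from statistics import mean

# Metric values copied from the table above. sample_len is omitted on the
# assumption that it is a sample count, not a quality metric.
TASK_METRICS = {
    "english_mgsm": {"exact_match": 84.4},
    "english_gsm8k": {"exact_match": 82.4},
    "english_belebele": {"f1_macro": 77.7},
    "ifeval": {
        "inst_level_loose_acc": 76.3,
        "inst_level_strict_acc": 72.3,
        "prompt_level_loose_acc": 66.7,
        "prompt_level_strict_acc": 63.0,
    },
    "english_mmlu_pro": {"exact_match": 44.2},
}


def task_score(metrics: dict[str, float]) -> float:
    """Mean of a task's reported metrics, rounded to one decimal place."""
    return round(mean(metrics.values()), 1)


def overall_average(task_metrics: dict[str, dict[str, float]]) -> float:
    """Mean of the per-task scores, rounded to one decimal place."""
    return round(mean(task_score(m) for m in task_metrics.values()), 1)


if __name__ == "__main__":
    for name, metrics in TASK_METRICS.items():
        print(f"{name}: {task_score(metrics)}")
    print(f"average: {overall_average(TASK_METRICS)}")
```

Under these assumptions the computed per-task scores and the overall average agree with the Score column and the Average value shown on the page.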