GPT-5 Nano

5 tasks

Each row below is a single benchmark task this model was evaluated on. The Score column is the mean of the quality metrics the task reports (accuracy, F1, exact match, etc.). The sample_len value listed with each task appears to be the number of evaluated samples and is not included in that mean.

Average: 64.9

Score	Language	Task	Metrics
88.4	French (France)	french_sib200 (french classification)	f1_macro: 88.4; sample_len: 204.0
65.2	French (France)	french_mgsm (french math)	exact_match: 65.2; sample_len: 250.0
59.2	French (France)	french_belebele (french mcq)	f1_macro: 59.2; sample_len: 900.0
58.3	French (France)	french_fquad (french qa)	exact_match: 54.3; f1: 62.3; sample_len: 400.0
53.6	French (France)	french_mmmlu (french mcq)	f1_macro: 53.6; sample_len: 14042.0
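
The Score and Average values can be recomputed from the metrics in the table. The following is a minimal sketch, not the site's actual code. It assumes that sample_len is a sample count excluded from the mean, and that the Average is the unweighted mean of the per-task scores. The names TASK_METRICS, task_score, and overall_average are hypothetical.

```python
# Quality metrics copied from the table above. sample_len is left out on the
# assumption that it is a sample count rather than a score.
TASK_METRICS = {
    "french_sib200": {"f1_macro": 88.4},
    "french_mgsm": {"exact_match": 65.2},
    "french_belebele": {"f1_macro": 59.2},
    "french_fquad": {"exact_match": 54.3, "f1": 62.3},
    "french_mmmlu": {"f1_macro": 53.6},
}


def task_score(metrics):
    """Mean of all quality metrics reported for one task."""
    return sum(metrics.values()) / len(metrics)


def overall_average(task_metrics):
    """Unweighted mean of the per-task scores (assumed aggregation)."""
    scores = [task_score(m) for m in task_metrics.values()]
    return sum(scores) / len(scores)


# Round to one decimal place, matching the precision shown in the table.
scores = {name: round(task_score(m), 1) for name, m in TASK_METRICS.items()}
average = round(overall_average(TASK_METRICS), 1)

for name, score in scores.items():
    print(f"{score:5.1f}  {name}")
print(f"{average:5.1f}  average")
```

Under these assumptions the sketch gives 58.3 for french_fquad (the mean of 54.3 and 62.3) and an overall average of 64.9, which agrees with the reported figures. An average weighted by sample_len would be dominated by french_mmmlu and would give a different result.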