lcb_codegen_v5: by examples


Not solved by any model

56 examples were not solved by any model. Provided these are sound problems, solving some of them is a good signal that your model is indeed better than the leading models.
atcoder.abc301_f, atcoder.abc311_c, atcoder.abc314_e, atcoder.abc315_e, atcoder.abc315_f, atcoder.abc319_c, atcoder.abc324_f, atcoder.abc327_e, atcoder.abc333_e, atcoder.abc337_e, atcoder.abc338_f, atcoder.abc343_a, atcoder.abc343_e, atcoder.abc350_c, atcoder.abc350_e, atcoder.abc355_e, atcoder.abc359_c, atcoder.abc359_e, atcoder.abc362_c, atcoder.abc363_f, atcoder.abc371_f, atcoder.abc372_f, atcoder.abc373_g, atcoder.abc374_d, atcoder.abc374_g, atcoder.abc375_b, atcoder.abc375_f, atcoder.abc376_f, atcoder.abc376_g, atcoder.abc378_g, atcoder.abc382_g, atcoder.abc385_f, atcoder.arc181_a, atcoder.arc181_c, atcoder.arc181_d, atcoder.arc182_d, atcoder.arc182_e, atcoder.arc183_b, atcoder.arc183_c, atcoder.arc183_d, atcoder.arc184_c, atcoder.arc184_d, atcoder.arc185_c, atcoder.arc186_a, atcoder.arc186_b, atcoder.arc186_c, atcoder.arc186_d, atcoder.arc186_e, atcoder.arc187_b, atcoder.arc188_c, atcoder.arc189_a, atcoder.arc189_b, leetcode.3211, leetcode.3327, leetcode.3584, leetcode.3638
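A list like the one above can be derived from per-model results with a set difference. The following is a minimal sketch, not the report's actual code: the `solved_by_model` mapping, the model names, and the problem ids `p1` to `p3` are hypothetical placeholders.

```python
def unsolved_problems(all_problems, solved_by_model):
    """Return the sorted problem ids that no model solved.

    all_problems: iterable of problem ids (hypothetical input format).
    solved_by_model: dict mapping model name -> set of solved problem ids.
    """
    solved_by_anyone = set()
    for solved in solved_by_model.values():
        solved_by_anyone |= set(solved)
    return sorted(set(all_problems) - solved_by_anyone)


# Toy data for illustration only; these are not real benchmark results.
toy_problems = ["p1", "p2", "p3"]
toy_results = {
    "model_a": {"p1"},
    "model_b": {"p1", "p3"},
}
print(unsolved_problems(toy_problems, toy_results))  # ['p2']
```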

Problems solved by 1 model only

example_link      model                              min_elo
atcoder.arc183_a  Kimi-k1.6-IOI-high                1096.025
atcoder.arc182_a  O1-2024-12-17 (High)              1085.166
leetcode.3688     O1-2024-12-17 (High)              1085.166
atcoder.abc325_d  O1-2024-12-17 (High)              1085.166
atcoder.abc354_d  O1-2024-12-17 (High)              1085.166
atcoder.arc185_d  O1-2024-12-17 (High)              1085.166
atcoder.arc184_e  DeepSeek-R1-Preview               1066.811
leetcode.3551     DeepSeek-R1-Preview               1066.811
atcoder.abc364_f  Llama-3_1-Nemotron-Ultra-253B-v1  1064.699
atcoder.abc368_g  DeepCoder-14B-Preview             1048.360
leetcode.3478     DeepCoder-14B-Preview             1048.360
atcoder.abc366_g  DeepCoder-14B-Preview             1048.360
atcoder.abc373_f  DeepCoder-14B-Preview             1048.360
atcoder.abc373_e  DeepCoder-14B-Preview             1048.360
atcoder.abc372_g  DeepCoder-14B-Preview             1048.360
atcoder.abc370_f  DeepCoder-14B-Preview             1048.360
atcoder.abc370_g  DeepCoder-14B-Preview             1048.360
atcoder.abc367_g  DeepCoder-14B-Preview             1048.360
leetcode.3562     O1-Preview-2024-09-12              984.201
leetcode.3344     DeepSeek-V3 copy                   980.094
atcoder.arc188_d  DeepSeek-V3 copy                   980.094
leetcode.3233     Claude-3.5-Sonnet-20240620         957.799
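The table above could be produced along the following lines. This is a sketch under two assumptions that the page does not state explicitly: that `min_elo` is the lowest Elo rating among the models that solved a problem, and that rows are ordered by `min_elo` descending. The input mappings, model names, and ratings are hypothetical.

```python
def single_solver_rows(solved_by_model, elo):
    """Return (problem, model, min_elo) rows for problems solved by exactly one model.

    solved_by_model: dict mapping model name -> set of solved problem ids.
    elo: dict mapping model name -> Elo rating.
    """
    # Invert the mapping: problem id -> list of models that solved it.
    solvers = {}
    for model, problems in solved_by_model.items():
        for problem in problems:
            solvers.setdefault(problem, []).append(model)

    rows = []
    for problem, models in solvers.items():
        if len(models) == 1:
            # Assumed definition: minimum Elo over all solvers. With a single
            # solver this is simply that model's rating.
            min_elo = min(elo[m] for m in models)
            rows.append((problem, models[0], min_elo))

    # Highest min_elo first; ties broken by problem id (an arbitrary choice).
    rows.sort(key=lambda row: (-row[2], row[0]))
    return rows


# Toy data for illustration only; these are not real benchmark results.
toy_results = {"strong": {"p1", "p2"}, "mid": {"p3"}, "weak": {"p1"}}
toy_elo = {"strong": 1100.0, "mid": 1000.0, "weak": 900.0}
for row in single_solver_rows(toy_results, toy_elo):
    print(row)
```

Here `p1` is skipped because two models solved it, while `p2` and `p3` each have a single solver.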
leetcode.3233 Claude-3.5-Sonnet-20240620 957.799

Suspect problems

These are the 10 problems with the lowest correlation (tau) with the overall evaluation, i.e. better models tend to do worse on them.

example_link       acc    tau
codeforces.1883_B  0.417  -0.387
leetcode.2834      0.958  -0.264
atcoder.abc379_f   0.333  -0.255
leetcode.2857      0.792  -0.216
leetcode.2919      0.750  -0.174
leetcode.3230      0.792  -0.142
leetcode.3347      0.958  -0.138
leetcode.3233      0.042  -0.138
atcoder.abc303_a   0.917  -0.127
atcoder.abc384_f   0.667  -0.096
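The page only labels the column `tau`. One plausible reading, assumed here, is Kendall's tau between each model's overall score and whether that model solved the problem. The sketch below implements the tie-aware tau-b variant in plain Python, since a pass/fail outcome produces many ties; the report may use a different variant or library.

```python
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b between two equal-length numeric sequences."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both: contributes to neither term
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + tied_x_only) * (untied + tied_y_only))
    return (concordant - discordant) / denom if denom else 0.0


# Toy data: four hypothetical models, ordered from best to worst overall score.
overall_score = [0.9, 0.8, 0.7, 0.6]
solved = [0, 0, 1, 1]  # the two weaker models solve it, the stronger ones do not
print(kendall_tau_b(overall_score, solved))  # negative: a "suspect" pattern
```

A strongly negative value means stronger models fail where weaker ones succeed, which is what flags a problem as suspect.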

Histogram of accuracies

Histogram of problems by the accuracy on each problem.
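As a rough sketch of the binning behind such a histogram, assume that a problem's accuracy is the fraction of models that solve it (the `acc` values on this page, such as 0.042 and 0.958, are consistent with fractions of 24 models, but that is an inference). The bin count of 10 is an arbitrary choice for illustration.

```python
def accuracy_histogram(accuracies, num_bins=10):
    """Count problems per equal-width accuracy bin over [0, 1]."""
    counts = [0] * num_bins
    for acc in accuracies:
        # An accuracy of exactly 1.0 is placed in the last bin.
        index = min(int(acc * num_bins), num_bins - 1)
        counts[index] += 1
    return counts


# The acc values of the ten suspect problems listed on this page.
suspect_acc = [0.417, 0.958, 0.333, 0.792, 0.750, 0.792, 0.958, 0.042, 0.917, 0.667]
print(accuracy_histogram(suspect_acc))  # [1, 0, 0, 1, 1, 0, 1, 3, 0, 3]
```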

Histogram of difficulties

Histogram of problems by the minimum Elo to solve each problem.