tqa: by examples


Not solved by any model

1338 examples are not solved by any model. If these are well-posed problems, solving some of them is a good signal that your model really is better than the leading models.
tqa/100, tqa/10005, tqa/10008, tqa/10013, tqa/10017, tqa/10042, tqa/10045, tqa/10050, tqa/10053, tqa/10055, tqa/10057, tqa/10080, tqa/10083, tqa/10085, tqa/10086, tqa/101, tqa/1010, tqa/10103, tqa/10104, tqa/10126, tqa/10127, tqa/10129, tqa/10133, tqa/10150, tqa/10151, tqa/10167, tqa/10168, tqa/10170, tqa/10171, tqa/10172, tqa/10183, tqa/10196, tqa/10197, tqa/10202, tqa/10216, tqa/10220, tqa/10222, tqa/10228, tqa/10232, tqa/10235, tqa/10236, tqa/10243, tqa/10254, tqa/10257, tqa/10259, tqa/10260, tqa/10263, tqa/10267, tqa/10268, tqa/10285, tqa/10288, tqa/10293, tqa/10295, tqa/10297, tqa/10301, tqa/10304, tqa/10308, tqa/10309, tqa/10314, tqa/10316, tqa/10320, tqa/10325, tqa/10333, tqa/10341, tqa/10349, tqa/1035, tqa/10356, tqa/10365, tqa/10373, tqa/10375, tqa/10390, tqa/10393, tqa/10400, tqa/10401, tqa/10418, tqa/10422, tqa/10427, tqa/10429, tqa/10432, tqa/10447, tqa/10450, tqa/10456, tqa/10459, tqa/10460, tqa/10467, tqa/10472, tqa/10473, tqa/10475, tqa/10484, tqa/10486, tqa/10497, tqa/10506, tqa/10512, tqa/10521, tqa/10526, tqa/10527, tqa/10532, tqa/10533, tqa/10538, tqa/10539, tqa/10544, tqa/10545, tqa/10546, tqa/10547, tqa/10548, tqa/10556, tqa/10560, tqa/10561, tqa/10566, tqa/10569, tqa/10570, tqa/10571, tqa/10573, tqa/10575, tqa/10577, tqa/10601, tqa/10602, tqa/10618, tqa/10621, tqa/10630, tqa/10635, tqa/10636, tqa/10647, tqa/10649, tqa/10675, tqa/10676, tqa/10695, tqa/10699, tqa/10701, tqa/10717, tqa/10724, tqa/10731, tqa/10736, tqa/10737, tqa/10738, tqa/1074, tqa/10742, tqa/10748, tqa/10750, tqa/10751, tqa/10759, tqa/10774, tqa/10778, tqa/10784, tqa/10785, tqa/10786, tqa/10794, tqa/10795, tqa/10798, tqa/10799, tqa/10805, tqa/10808, tqa/10816, tqa/10823, tqa/10824, tqa/10828, tqa/10830, tqa/10846, tqa/10847, tqa/10878, tqa/10879, tqa/10880, tqa/10890, tqa/10893, tqa/10912, tqa/10913, tqa/10917, tqa/10924, tqa/1093, tqa/10933, tqa/10944, tqa/10946, tqa/10947, tqa/10955, tqa/10969, tqa/10970, tqa/10985, tqa/10991, tqa/11023, tqa/11032, tqa/11035, tqa/11075, 
tqa/11081, tqa/11087, tqa/11089, tqa/11100, tqa/11105, tqa/11113, tqa/11120, tqa/11130, tqa/11142, tqa/11148, tqa/11157, tqa/11181, tqa/11203, tqa/11213, tqa/11222, tqa/11223, tqa/11229, tqa/11234, tqa/11235, tqa/11248, tqa/1129, tqa/11291, tqa/11296, tqa/11298, tqa/11299, tqa/113, tqa/11304, tqa/11310, tqa/1135, tqa/114, tqa/1154, tqa/1159, tqa/1170, tqa/1179, tqa/1198, tqa/1222, tqa/1224, tqa/1235, tqa/1267, tqa/1311, tqa/1350, tqa/1352, tqa/1361, tqa/1365, tqa/1381, tqa/139, tqa/14, tqa/1402, tqa/1404, tqa/1428, tqa/1439, tqa/147, tqa/1470, tqa/1478, tqa/1490, tqa/1492, tqa/15, tqa/1500, tqa/151, tqa/1518, tqa/1530, tqa/1546, tqa/155, tqa/1550, tqa/1570, tqa/1573, tqa/158, tqa/1584, tqa/1596, tqa/1598, tqa/16, tqa/162, tqa/1620, tqa/1621, tqa/1625, tqa/1630, tqa/1638, tqa/164, tqa/165, tqa/1651, tqa/1661, tqa/1666, tqa/1673, tqa/168, tqa/1689, tqa/1691, tqa/17, tqa/1701, tqa/1709, tqa/1711, tqa/1725, tqa/1729, tqa/1747, tqa/1772, tqa/1778, tqa/1797, tqa/1803, tqa/1822, tqa/1833, tqa/1841, tqa/1842, tqa/1850, tqa/1853, tqa/1856, tqa/1865, tqa/1870, tqa/1874, tqa/1882, tqa/1888, tqa/1896, tqa/1915, tqa/1923, tqa/1925, tqa/1928, tqa/1933, tqa/1938, tqa/1959, tqa/1963, tqa/1966, tqa/1972, tqa/1981, tqa/1983, tqa/1998, tqa/2000, tqa/2009, tqa/2010, tqa/2046, tqa/2047, tqa/2050, tqa/2051, tqa/2094, tqa/2098, tqa/2103, tqa/2106, tqa/2112, tqa/2113, tqa/2115, tqa/2133, tqa/2135, tqa/2164, tqa/2167, tqa/2176, tqa/2186, tqa/2187, tqa/2194, tqa/2197, tqa/2205, tqa/2221, tqa/2227, tqa/2250, tqa/2252, tqa/2254, tqa/2255, tqa/226, tqa/2282, tqa/2290, tqa/2296, tqa/2300, tqa/231, tqa/2311, tqa/2314, tqa/2316, tqa/2323, tqa/2328, tqa/2369, tqa/2379, tqa/2387, tqa/2410, tqa/2414, tqa/2419, tqa/2420, tqa/2432, tqa/2436, tqa/245, tqa/2451, tqa/2453, tqa/2464, tqa/2467, tqa/2474, tqa/2475, tqa/2483, tqa/2486, tqa/2489, tqa/2492, tqa/2513, tqa/2531, tqa/2535, tqa/2542, tqa/2543, tqa/2579, tqa/2580, tqa/2589, tqa/259, tqa/2596, tqa/2597, tqa/2630, tqa/2636, tqa/2648, tqa/2649, 
tqa/2652, tqa/2677, tqa/2684, tqa/2691, tqa/27, tqa/2722, tqa/2733, tqa/2734, tqa/2738, tqa/2741, tqa/2757, tqa/2768, tqa/2775, tqa/2787, tqa/2791, tqa/280, tqa/2802, tqa/2806, tqa/2843, tqa/2864, tqa/2883, tqa/2889, tqa/2913, tqa/2972, tqa/2978, tqa/3001, tqa/301, tqa/302, tqa/3038, tqa/3042, tqa/3049, tqa/305, tqa/3056, tqa/3071, tqa/3072, tqa/3083, tqa/3117, tqa/312, tqa/3124, tqa/3172, tqa/3174, tqa/32, tqa/320, tqa/3202, tqa/3205, tqa/3207, tqa/3217, tqa/3219, tqa/3233, tqa/3242, tqa/3243, tqa/3264, tqa/3284, tqa/329, tqa/3290, tqa/3291, tqa/3297, tqa/33, tqa/3305, tqa/3306, tqa/3312, tqa/3314, tqa/3318, tqa/3320, tqa/3323, tqa/336, tqa/3362, tqa/3384, tqa/3395, tqa/3398, tqa/34, tqa/3406, tqa/3408, tqa/3409, tqa/3418, tqa/3425, tqa/3444, tqa/345, tqa/3455, tqa/3461, tqa/3469, tqa/348, tqa/3483, tqa/3485, tqa/3491, tqa/3492, tqa/3494, tqa/350, tqa/3504, tqa/3508, tqa/3510, tqa/3513, tqa/3516, tqa/352, tqa/3523, tqa/353, tqa/3536, tqa/3542, tqa/3543, tqa/3547, tqa/3567, tqa/3579, tqa/3582, tqa/3591, tqa/3599, tqa/36, tqa/3600, tqa/3614, tqa/3616, tqa/3618, tqa/3621, tqa/3628, tqa/3632, tqa/3635, tqa/3648, tqa/3649, tqa/3650, tqa/3659, tqa/3668, tqa/3672, tqa/3673, tqa/3676, tqa/3680, tqa/3683, tqa/3684, tqa/3685, tqa/3689, tqa/3698, tqa/3699, tqa/3702, tqa/3707, tqa/372, tqa/373, tqa/3736, tqa/3737, tqa/3742, tqa/3743, tqa/3745, tqa/3758, tqa/3768, tqa/3771, tqa/3778, tqa/3791, tqa/38, tqa/3813, tqa/3839, tqa/3856, tqa/3867, tqa/3871, tqa/3890, tqa/39, tqa/3914, tqa/3917, tqa/3934, tqa/3937, tqa/3940, tqa/3949, tqa/3951, tqa/3958, tqa/3971, tqa/3981, tqa/4002, tqa/4006, tqa/4034, tqa/4041, tqa/4044, tqa/4048, tqa/4049, tqa/4062, tqa/4067, tqa/4071, tqa/4077, tqa/4086, tqa/4088, tqa/4089, tqa/4095, tqa/4108, tqa/4128, tqa/4133, tqa/4139, tqa/414, tqa/4140, tqa/4143, tqa/4150, tqa/4157, tqa/4162, tqa/4171, tqa/4179, tqa/4197, tqa/42, tqa/4218, tqa/422, tqa/4226, tqa/4227, tqa/4228, tqa/4231, tqa/4242, tqa/4250, tqa/4258, tqa/4263, tqa/4267, tqa/427, tqa/4279, 
tqa/4283, tqa/4287, tqa/4289, tqa/4290, tqa/4291, tqa/4296, tqa/43, tqa/4300, tqa/4309, tqa/432, tqa/4324, tqa/4331, tqa/4338, tqa/4339, tqa/434, tqa/4366, tqa/4381, tqa/4396, tqa/4398, tqa/4405, tqa/4407, tqa/4415, tqa/4418, tqa/4423, tqa/4435, tqa/4436, tqa/4442, tqa/4446, tqa/4456, tqa/4459, tqa/447, tqa/4472, tqa/4473, tqa/4480, tqa/4484, tqa/4485, tqa/4498, tqa/45, tqa/450, tqa/4509, tqa/451, tqa/4511, tqa/4516, tqa/4521, tqa/4535, tqa/4536, tqa/4539, tqa/4557, tqa/4579, tqa/4580, tqa/4585, tqa/460, tqa/4605, tqa/4620, tqa/4634, tqa/4640, tqa/4641, tqa/4644, tqa/4668, tqa/4681, tqa/4688, tqa/4711, tqa/4723, tqa/4725, tqa/473, tqa/4734, tqa/4737, tqa/4749, tqa/4758, tqa/4762, tqa/4764, tqa/4774, tqa/4782, tqa/4785, tqa/4821, tqa/4824, tqa/4836, tqa/4838, tqa/4839, tqa/4850, tqa/4852, tqa/4858, tqa/486, tqa/4864, tqa/4878, tqa/4884, tqa/4885, tqa/4892, tqa/4899, tqa/491, tqa/4912, tqa/4916, tqa/4920, tqa/4921, tqa/4923, tqa/4930, tqa/4941, tqa/4946, tqa/4948, tqa/4952, tqa/4966, tqa/4970, tqa/4973, tqa/4981, tqa/4984, tqa/4989, tqa/5004, tqa/5005, tqa/5009, tqa/501, tqa/5010, tqa/5020, tqa/5022, tqa/5023, tqa/5025, tqa/5027, tqa/5037, tqa/5046, tqa/5052, tqa/5057, tqa/5060, tqa/5065, tqa/5066, tqa/5067, tqa/5069, tqa/507, tqa/5071, tqa/5072, tqa/5078, tqa/5084, tqa/5086, tqa/5089, tqa/509, tqa/5095, tqa/5096, tqa/5104, tqa/5106, tqa/5109, tqa/511, tqa/5111, tqa/5112, tqa/5119, tqa/5120, tqa/5122, tqa/5132, tqa/5145, tqa/5150, tqa/5151, tqa/5154, tqa/5166, tqa/5172, tqa/5174, tqa/5175, tqa/5176, tqa/5179, tqa/518, tqa/5183, tqa/5188, tqa/5189, tqa/519, tqa/5191, tqa/5199, tqa/52, tqa/5201, tqa/5205, tqa/5206, tqa/5209, tqa/521, tqa/5220, tqa/5225, tqa/5226, tqa/5228, tqa/5235, tqa/5238, tqa/5241, tqa/5247, tqa/5249, tqa/5250, tqa/5251, tqa/5254, tqa/5258, tqa/5259, tqa/5262, tqa/5264, tqa/5272, tqa/5277, tqa/5278, tqa/5279, tqa/5280, tqa/5284, tqa/5285, tqa/5286, tqa/5287, tqa/5290, tqa/5291, tqa/5294, tqa/5295, tqa/5296, tqa/5297, tqa/5299, tqa/5307, tqa/5308, 
tqa/5309, tqa/5310, tqa/5314, tqa/5316, tqa/5319, tqa/5324, tqa/5325, tqa/5326, tqa/5328, tqa/5332, tqa/5337, tqa/5338, tqa/5389, tqa/539, tqa/5390, tqa/5428, tqa/5445, tqa/546, tqa/5472, tqa/5484, tqa/565, tqa/57, tqa/5749, tqa/5768, tqa/5770, tqa/5808, tqa/583, tqa/5836, tqa/5837, tqa/5841, tqa/5844, tqa/5848, tqa/5851, tqa/5852, tqa/5876, tqa/5897, tqa/5905, tqa/5923, tqa/5933, tqa/5937, tqa/5943, tqa/5952, tqa/5959, tqa/60, tqa/6001, tqa/601, tqa/6013, tqa/602, tqa/6053, tqa/6093, tqa/6098, tqa/610, tqa/6100, tqa/6104, tqa/6107, tqa/6114, tqa/6146, tqa/6162, tqa/6184, tqa/626, tqa/6267, tqa/6277, tqa/6282, tqa/6287, tqa/6306, tqa/6320, tqa/6325, tqa/6326, tqa/6339, tqa/6366, tqa/6368, tqa/6369, tqa/6376, tqa/6387, tqa/6392, tqa/6395, tqa/6414, tqa/6443, tqa/6447, tqa/6448, tqa/6449, tqa/6462, tqa/648, tqa/6486, tqa/6505, tqa/651, tqa/6516, tqa/6521, tqa/654, tqa/656, tqa/6567, tqa/657, tqa/6576, tqa/6611, tqa/6620, tqa/663, tqa/6639, tqa/6648, tqa/6649, tqa/665, tqa/6660, tqa/6662, tqa/6673, tqa/6674, tqa/668, tqa/6685, tqa/669, tqa/6733, tqa/674, tqa/6749, tqa/675, tqa/6757, tqa/6765, tqa/6766, tqa/6769, tqa/6773, tqa/6791, tqa/6796, tqa/6801, tqa/681, tqa/6823, tqa/683, tqa/6833, tqa/6838, tqa/6839, tqa/6855, tqa/6874, tqa/6884, tqa/6886, tqa/6897, tqa/6899, tqa/6915, tqa/6918, tqa/6925, tqa/6948, tqa/6954, tqa/6992, tqa/6996, tqa/6997, tqa/6999, tqa/7001, tqa/7004, tqa/7012, tqa/705, tqa/7052, tqa/7063, tqa/7064, tqa/7071, tqa/7103, tqa/7104, tqa/711, tqa/7113, tqa/7117, tqa/7119, tqa/7130, tqa/7144, tqa/7154, tqa/7155, tqa/7156, tqa/7179, tqa/72, tqa/7208, tqa/721, tqa/7210, tqa/722, tqa/7223, tqa/7224, tqa/7236, tqa/725, tqa/7252, tqa/7283, tqa/731, tqa/7354, tqa/7356, tqa/737, tqa/7390, tqa/7403, tqa/7410, tqa/7422, tqa/7427, tqa/7436, tqa/7469, tqa/7474, tqa/7477, tqa/7490, tqa/7492, tqa/7511, tqa/7518, tqa/7523, tqa/7531, tqa/7535, tqa/7540, tqa/7565, tqa/7578, tqa/7587, tqa/761, tqa/7610, tqa/7622, tqa/7624, tqa/7634, tqa/7640, tqa/7645, tqa/765, 
tqa/766, tqa/7668, tqa/7675, tqa/768, tqa/772, tqa/773, tqa/7737, tqa/774, tqa/7740, tqa/7744, tqa/7745, tqa/7746, tqa/775, tqa/7751, tqa/7754, tqa/7772, tqa/7803, tqa/7808, tqa/781, tqa/7814, tqa/7825, tqa/7827, tqa/7830, tqa/7831, tqa/7853, tqa/7864, tqa/7876, tqa/7886, tqa/789, tqa/7909, tqa/7914, tqa/7920, tqa/7927, tqa/7933, tqa/7942, tqa/7949, tqa/795, tqa/7955, tqa/7965, tqa/7997, tqa/80, tqa/8001, tqa/8018, tqa/8023, tqa/8029, tqa/8032, tqa/8058, tqa/8089, tqa/8148, tqa/8157, tqa/8178, tqa/8191, tqa/8201, tqa/8226, tqa/8228, tqa/8240, tqa/8243, tqa/8257, tqa/8258, tqa/8279, tqa/8297, tqa/8304, tqa/8310, tqa/8315, tqa/8321, tqa/8323, tqa/8324, tqa/8326, tqa/8329, tqa/8337, tqa/8338, tqa/8340, tqa/8343, tqa/8344, tqa/8346, tqa/8347, tqa/8350, tqa/8354, tqa/8359, tqa/8363, tqa/8364, tqa/8365, tqa/8366, tqa/8369, tqa/8370, tqa/8371, tqa/8372, tqa/8374, tqa/8376, tqa/8377, tqa/8380, tqa/8387, tqa/8390, tqa/8393, tqa/8396, tqa/8399, tqa/8415, tqa/8417, tqa/8418, tqa/8421, tqa/8422, tqa/8423, tqa/8424, tqa/8427, tqa/8437, tqa/8439, tqa/8440, tqa/8442, tqa/8450, tqa/8451, tqa/8452, tqa/8453, tqa/8454, tqa/8455, tqa/8459, tqa/8460, tqa/8463, tqa/8464, tqa/8465, tqa/8466, tqa/8468, tqa/8469, tqa/8472, tqa/8473, tqa/8484, tqa/8485, tqa/8487, tqa/8490, tqa/8496, tqa/8500, tqa/8502, tqa/8504, tqa/8506, tqa/8508, tqa/8510, tqa/8512, tqa/8513, tqa/8515, tqa/8517, tqa/8520, tqa/8525, tqa/8526, tqa/8528, tqa/8531, tqa/8532, tqa/8533, tqa/8538, tqa/8565, tqa/8568, tqa/8579, tqa/8589, tqa/8590, tqa/8599, tqa/8618, tqa/8619, tqa/8620, tqa/8623, tqa/8642, tqa/865, tqa/8656, tqa/8659, tqa/8668, tqa/8673, tqa/8687, tqa/87, tqa/8703, tqa/8705, tqa/8718, tqa/8724, tqa/8725, tqa/8732, tqa/8744, tqa/8745, tqa/8755, tqa/8758, tqa/8767, tqa/8789, tqa/8797, tqa/8798, tqa/8804, tqa/882, tqa/8820, tqa/8821, tqa/8822, tqa/8825, tqa/8837, tqa/8838, tqa/8839, tqa/8849, tqa/8854, tqa/8872, tqa/8875, tqa/8876, tqa/8878, tqa/8883, tqa/8893, tqa/8922, tqa/8930, tqa/8932, tqa/8937, tqa/8951, 
tqa/8973, tqa/8974, tqa/8976, tqa/8979, tqa/8995, tqa/900, tqa/9003, tqa/9010, tqa/9012, tqa/9013, tqa/9023, tqa/9031, tqa/9032, tqa/9033, tqa/9035, tqa/9041, tqa/9046, tqa/9047, tqa/9049, tqa/9063, tqa/907, tqa/9070, tqa/9079, tqa/908, tqa/9082, tqa/9094, tqa/9097, tqa/9103, tqa/9107, tqa/9110, tqa/9122, tqa/9123, tqa/9125, tqa/9127, tqa/9129, tqa/9131, tqa/9140, tqa/9142, tqa/9147, tqa/9156, tqa/9159, tqa/9161, tqa/9182, tqa/9183, tqa/9188, tqa/919, tqa/9193, tqa/9194, tqa/9195, tqa/9196, tqa/9199, tqa/9202, tqa/9208, tqa/9218, tqa/9219, tqa/9220, tqa/9236, tqa/9259, tqa/9262, tqa/9267, tqa/9269, tqa/9273, tqa/928, tqa/9315, tqa/9334, tqa/9338, tqa/9342, tqa/9344, tqa/9352, tqa/9356, tqa/9357, tqa/9365, tqa/9378, tqa/9380, tqa/9383, tqa/9388, tqa/9391, tqa/9394, tqa/94, tqa/9404, tqa/9410, tqa/9416, tqa/9418, tqa/9439, tqa/9440, tqa/9468, tqa/9469, tqa/9470, tqa/9494, tqa/9496, tqa/9508, tqa/9509, tqa/9520, tqa/9521, tqa/9533, tqa/9550, tqa/9553, tqa/9573, tqa/9577, tqa/9606, tqa/9607, tqa/9608, tqa/9609, tqa/9620, tqa/9641, tqa/9643, tqa/9678, tqa/9684, tqa/9688, tqa/9714, tqa/9718, tqa/9719, tqa/9720, tqa/9723, tqa/9725, tqa/973, tqa/9732, tqa/9733, tqa/974, tqa/9748, tqa/9763, tqa/9790, tqa/9803, tqa/9806, tqa/981, tqa/9816, tqa/9827, tqa/9831, tqa/9846, tqa/9852, tqa/9868, tqa/9872, tqa/988, tqa/9884, tqa/9886, tqa/9894, tqa/9906, tqa/9910, tqa/9913, tqa/9931, tqa/994, tqa/9949, tqa/9959, tqa/9965, tqa/9978, tqa/9981, tqa/9989, tqa/999
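
A minimal sketch of how a list like this could be derived, assuming per-model results are available as a mapping from model name to the set of example IDs that model solved. The function name, the `solved_by_model` structure, and the toy IDs are hypothetical and not taken from the page's code.

```python
def unsolved_examples(all_ids, solved_by_model):
    """Return the example IDs that no model solved, sorted as strings."""
    solved = set()
    for ids in solved_by_model.values():
        solved |= set(ids)
    return sorted(set(all_ids) - solved)


# Toy data (hypothetical): four examples, two models.
all_ids = ["ex/1", "ex/2", "ex/10", "ex/3"]
solved_by_model = {
    "model-a": {"ex/3"},
    "model-b": set(),
}
unsolved = unsolved_examples(all_ids, solved_by_model)
print(", ".join(unsolved))
```

The plain string sort puts `ex/10` before `ex/2`, which matches the ordering of the list above (`tqa/100, tqa/10005, ..., tqa/101`).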

Problems solved by 1 model only

example_link model min_elo
tqa/678 dbrx-base 1442.643
tqa/2438 dbrx-base 1442.643
tqa/1571 dbrx-base 1442.643
tqa/3375 dbrx-base 1442.643
tqa/6378 dbrx-base 1442.643
tqa/5116 dbrx-base 1442.643
tqa/7881 dbrx-base 1442.643
tqa/9715 dbrx-base 1442.643
tqa/597 dbrx-base 1442.643
tqa/9600 dbrx-base 1442.643
tqa/1768 dbrx-base 1442.643
tqa/1678 dbrx-base 1442.643
tqa/4873 dbrx-base 1442.643
tqa/5996 dbrx-base 1442.643
tqa/5953 dbrx-base 1442.643
tqa/86 dbrx-base 1442.643
tqa/4882 dbrx-base 1442.643
tqa/2026 dbrx-base 1442.643
tqa/6689 dbrx-base 1442.643
tqa/175 dbrx-base 1442.643
tqa/10230 dbrx-base 1442.643
tqa/5115 dbrx-base 1442.643
tqa/9426 dbrx-base 1442.643
tqa/4615 dbrx-base 1442.643
tqa/2778 dbrx-base 1442.643
tqa/10060 dbrx-base 1442.643
tqa/10361 dbrx-base 1442.643
tqa/8787 dbrx-base 1442.643
tqa/3555 dbrx-base 1442.643
tqa/6981 dbrx-base 1442.643
tqa/9750 dbrx-base 1442.643
tqa/6072 dbrx-base 1442.643
tqa/2072 dbrx-base 1442.643
tqa/10374 dbrx-base 1442.643
tqa/1952 dbrx-base 1442.643
tqa/2756 dbrx-base 1442.643
tqa/1935 dbrx-base 1442.643
tqa/10161 dbrx-base 1442.643
tqa/137 dbrx-base 1442.643
tqa/8850 dbrx-base 1442.643
tqa/2034 dbrx-base 1442.643
tqa/10318 dbrx-base 1442.643
tqa/6222 dbrx-base 1442.643
tqa/6828 dbrx-base 1442.643
tqa/6014 dbrx-base 1442.643
tqa/10117 dbrx-base 1442.643
tqa/9883 dbrx-base 1442.643
tqa/3340 dbrx-base 1442.643
tqa/1287 dbrx-base 1442.643
tqa/10044 dbrx-base 1442.643
tqa/9417 dbrx-base 1442.643
tqa/8352 dbrx-base 1442.643
tqa/7689 dbrx-base 1442.643
tqa/10074 dbrx-base 1442.643
tqa/9847 dbrx-base 1442.643
tqa/7081 dbrx-base 1442.643
tqa/1454 dbrx-base 1442.643
tqa/2689 dbrx-base 1442.643
tqa/6911 dbrx-base 1442.643
tqa/169 dbrx-base 1442.643
tqa/6870 Meta-Llama-3-70B 1442.582
tqa/6725 Meta-Llama-3-70B 1442.582
tqa/6721 Meta-Llama-3-70B 1442.582
tqa/10049 Meta-Llama-3-70B 1442.582
tqa/121 Meta-Llama-3-70B 1442.582
tqa/2428 Meta-Llama-3-70B 1442.582
tqa/6541 Meta-Llama-3-70B 1442.582
tqa/7582 Meta-Llama-3-70B 1442.582
tqa/6268 Meta-Llama-3-70B 1442.582
tqa/7683 Meta-Llama-3-70B 1442.582
tqa/9028 Meta-Llama-3-70B 1442.582
tqa/10452 Meta-Llama-3-70B 1442.582
tqa/1011 Meta-Llama-3-70B 1442.582
tqa/5447 Meta-Llama-3-70B 1442.582
tqa/6477 Meta-Llama-3-70B 1442.582
tqa/6111 Meta-Llama-3-70B 1442.582
tqa/6063 Meta-Llama-3-70B 1442.582
tqa/924 Meta-Llama-3-70B 1442.582
tqa/7724 Meta-Llama-3-70B 1442.582
tqa/10643 Meta-Llama-3-70B 1442.582
tqa/9039 Meta-Llama-3-70B 1442.582
tqa/1280 Meta-Llama-3-70B 1442.582
tqa/4028 Meta-Llama-3-70B 1442.582
tqa/3130 Meta-Llama-3-70B 1442.582
tqa/9983 Meta-Llama-3-70B 1442.582
tqa/10357 Meta-Llama-3-70B 1442.582
tqa/6837 Meta-Llama-3-70B 1442.582
tqa/10358 Meta-Llama-3-70B 1442.582
tqa/5865 Meta-Llama-3-70B 1442.582
tqa/6352 Meta-Llama-3-70B 1442.582
tqa/3932 Meta-Llama-3-70B 1442.582
tqa/7698 Meta-Llama-3-70B 1442.582
tqa/9804 Meta-Llama-3-70B 1442.582
tqa/2891 Meta-Llama-3-70B 1442.582
tqa/9450 Mixtral-8x22B-v0.1 1423.797
tqa/3502 Mixtral-8x22B-v0.1 1423.797
tqa/8543 Mixtral-8x22B-v0.1 1423.797
tqa/9805 Mixtral-8x22B-v0.1 1423.797
tqa/410 Mixtral-8x22B-v0.1 1423.797
tqa/3198 Mixtral-8x22B-v0.1 1423.797
tqa/3247 Mixtral-8x22B-v0.1 1423.797
tqa/6565 Mixtral-8x22B-v0.1 1423.797
tqa/3891 Mixtral-8x22B-v0.1 1423.797
tqa/5263 Mixtral-8x22B-v0.1 1423.797
tqa/8495 Mixtral-8x22B-v0.1 1423.797
tqa/7856 Mixtral-8x22B-v0.1 1423.797
tqa/9511 Mixtral-8x22B-v0.1 1423.797
tqa/5611 Mixtral-8x22B-v0.1 1423.797
tqa/1940 Mixtral-8x22B-v0.1 1423.797
tqa/8692 Mixtral-8x22B-v0.1 1423.797
tqa/4657 Mixtral-8x22B-v0.1 1423.797
tqa/8198 Mixtral-8x22B-v0.1 1423.797
tqa/10199 Mixtral-8x22B-v0.1 1423.797
tqa/8103 Mixtral-8x22B-v0.1 1423.797
tqa/8982 Mixtral-8x22B-v0.1 1423.797
tqa/5999 Mixtral-8x22B-v0.1 1423.797
tqa/3363 Qwen1.5-110B 1371.169
tqa/1861 Qwen1.5-110B 1371.169
tqa/2326 Qwen1.5-110B 1371.169
tqa/8955 Qwen1.5-110B 1371.169
tqa/709 Qwen1.5-110B 1371.169
tqa/2968 Qwen1.5-110B 1371.169
tqa/9224 Qwen1.5-110B 1371.169
tqa/7007 Qwen1.5-110B 1371.169
tqa/10274 Qwen1.5-110B 1371.169
tqa/10073 Qwen1.5-110B 1371.169
tqa/1521 Qwen1.5-110B 1371.169
tqa/9104 Qwen1.5-110B 1371.169
tqa/454 Mixtral-8x7B-v0.1 1332.796
tqa/10388 Mixtral-8x7B-v0.1 1332.796
tqa/1145 Mixtral-8x7B-v0.1 1332.796
tqa/5092 Mixtral-8x7B-v0.1 1332.796
tqa/514 Mixtral-8x7B-v0.1 1332.796
tqa/2714 Mixtral-8x7B-v0.1 1332.796
tqa/6092 Mixtral-8x7B-v0.1 1332.796
tqa/1498 Mixtral-8x7B-v0.1 1332.796
tqa/3593 deepseek-llm-67b-base 1331.692
tqa/7269 deepseek-llm-67b-base 1331.692
tqa/4529 deepseek-llm-67b-base 1331.692
tqa/1909 deepseek-llm-67b-base 1331.692
tqa/9712 deepseek-llm-67b-base 1331.692
tqa/212 deepseek-llm-67b-base 1331.692
tqa/636 llama_65B 1328.570
tqa/3241 llama_65B 1328.570
tqa/4572 llama_65B 1328.570
tqa/2762 llama_65B 1328.570
tqa/816 llama_65B 1328.570
tqa/1716 llama_65B 1328.570
tqa/4479 llama_65B 1328.570
tqa/1305 llama_65B 1328.570
tqa/1834 llama_65B 1328.570
tqa/2782 llama_65B 1328.570
tqa/5156 llama_65B 1328.570
tqa/7823 llama_65B 1328.570
tqa/5868 llama_65B 1328.570
tqa/8846 llama_65B 1328.570
tqa/9765 llama_65B 1328.570
tqa/10273 llama_33B 1272.152
tqa/5059 llama_33B 1272.152
tqa/609 llama_33B 1272.152
tqa/9061 Qwen1.5-72B 1264.306
tqa/4789 Qwen1.5-72B 1264.306
tqa/184 Qwen1.5-72B 1264.306
tqa/3440 Qwen1.5-72B 1264.306
tqa/785 Qwen1.5-72B 1264.306
tqa/230 Qwen1.5-72B 1264.306
tqa/2401 Qwen1.5-72B 1264.306
tqa/10052 Qwen1.5-72B 1264.306
tqa/2501 Qwen1.5-72B 1264.306
tqa/3413 Qwen1.5-72B 1264.306
tqa/10700 Qwen1.5-72B 1264.306
tqa/166 Qwen1.5-72B 1264.306
tqa/10448 Qwen1.5-72B 1264.306
tqa/2517 Qwen1.5-72B 1264.306
tqa/3239 Qwen1.5-72B 1264.306
tqa/6313 llama2_70B 1217.775
tqa/8124 llama2_70B 1217.775
tqa/391 llama2_70B 1217.775
tqa/9464 llama2_70B 1217.775
tqa/6311 llama2_70B 1217.775
tqa/4958 llama2_70B 1217.775
tqa/3258 llama2_70B 1217.775
tqa/8368 llama2_70B 1217.775
tqa/2193 llama2_70B 1217.775
tqa/9744 llama2_70B 1217.775
tqa/5168 llama2_70B 1217.775
tqa/5393 llama2_70B 1217.775
tqa/4617 llama2_70B 1217.775
tqa/202 llama2_70B 1217.775
tqa/5322 llama2_70B 1217.775
tqa/6463 llama2_70B 1217.775
tqa/2228 llama2_70B 1217.775
tqa/4201 llama2_70B 1217.775
tqa/7848 falcon-40b 1204.478
tqa/8969 falcon-40b 1204.478
tqa/8611 falcon-40b 1204.478
tqa/2111 falcon-40b 1204.478
tqa/6571 falcon-40b 1204.478
tqa/9257 falcon-40b 1204.478
tqa/10147 falcon-40b 1204.478
tqa/6201 falcon-40b 1204.478
tqa/10515 falcon-40b 1204.478
tqa/10666 falcon-40b 1204.478
tqa/2626 Qwen1.5-32B 1164.379
tqa/3131 Qwen1.5-32B 1164.379
tqa/7221 Qwen1.5-32B 1164.379
tqa/2380 Qwen1.5-32B 1164.379
tqa/3670 Qwen1.5-32B 1164.379
tqa/3926 Qwen1.5-32B 1164.379
tqa/987 Qwen1.5-32B 1164.379
tqa/179 Qwen1.5-32B 1164.379
tqa/7432 Qwen1.5-32B 1164.379
tqa/3213 Qwen1.5-32B 1164.379
tqa/11092 Qwen1.5-32B 1164.379
tqa/1494 Meta-Llama-3-8B 1161.458
tqa/6202 Meta-Llama-3-8B 1161.458
tqa/3916 Meta-Llama-3-8B 1161.458
tqa/9578 Mistral-7B-v0.1 1142.941
tqa/3225 Mistral-7B-v0.1 1142.941
tqa/6863 Mistral-7B-v0.1 1142.941
tqa/1991 Mistral-7B-v0.1 1142.941
tqa/8293 llama_13B 1121.234
tqa/7620 llama_13B 1121.234
tqa/10449 mpt-30b 1076.349
tqa/4510 mpt-30b 1076.349
tqa/11283 gemma-7b 1063.710
tqa/356 gemma-7b 1063.710
tqa/6390 gemma-7b 1063.710
tqa/10678 gemma-7b 1063.710
tqa/3179 gemma-7b 1063.710
tqa/6220 gemma-7b 1063.710
tqa/9328 gemma-7b 1063.710
tqa/1489 gemma-7b 1063.710
tqa/55 llama2_13B 1062.786
tqa/3322 llama2_13B 1062.786
tqa/2248 llama2_13B 1062.786
tqa/5105 llama2_13B 1062.786
tqa/8426 llama2_13B 1062.786
tqa/3026 llama2_13B 1062.786
tqa/3350 deepseek-moe-16b-base 1049.500
tqa/8926 deepseek-moe-16b-base 1049.500
tqa/176 deepseek-moe-16b-base 1049.500
tqa/10787 deepseek-moe-16b-base 1049.500
tqa/3998 llama_07B 994.897
tqa/5985 llama_07B 994.897
tqa/9106 llama_07B 994.897
tqa/10464 llama_07B 994.897
tqa/807 llama_07B 994.897
tqa/7245 deepseek-llm-7b-base 970.463
tqa/4940 deepseek-llm-7b-base 970.463
tqa/5246 deepseek-llm-7b-base 970.463
tqa/3187 Qwen1.5-14B 961.687
tqa/4708 Qwen1.5-14B 961.687
tqa/5830 Qwen1.5-14B 961.687
tqa/8509 llama2_07B 935.546
tqa/8646 llama2_07B 935.546
tqa/2610 llama2_07B 935.546
tqa/5516 llama2_07B 935.546
tqa/8457 llama2_07B 935.546
tqa/8438 llama2_07B 935.546
tqa/3872 llama2_07B 935.546
tqa/1016 llama2_07B 935.546
tqa/10564 llama2_07B 935.546
tqa/8432 llama2_07B 935.546
tqa/8708 llama2_07B 935.546
tqa/11172 falcon-7b 928.328
tqa/2472 falcon-7b 928.328
tqa/7978 stablelm-base-alpha-7b-v2 888.145
tqa/10903 stablelm-base-alpha-7b-v2 888.145
tqa/6887 stablelm-base-alpha-7b-v2 888.145
tqa/4865 stablelm-base-alpha-7b-v2 888.145
tqa/2463 stablelm-3b-4e1t 877.346
tqa/11263 stablelm-3b-4e1t 877.346
tqa/4545 stablelm-3b-4e1t 877.346
tqa/3289 stablelm-3b-4e1t 877.346
tqa/10399 stablelm-3b-4e1t 877.346
tqa/10518 Qwen1.5-7B 873.345
tqa/10918 Qwen1.5-7B 873.345
tqa/82 Qwen1.5-7B 873.345
tqa/10562 Qwen1.5-7B 873.345
tqa/2560 Qwen1.5-7B 873.345
tqa/8713 gemma-2b 794.306
tqa/10853 Qwen1.5-4B 748.934
tqa/5791 Qwen1.5-4B 748.934
tqa/2383 pythia-12b-deduped-v0 724.038
tqa/2767 pythia-12b-deduped-v0 724.038
tqa/11241 pythia-12b-deduped-v0 724.038
tqa/3265 pythia-12b-deduped-v0 724.038
tqa/2286 pythia-12b-deduped-v0 724.038
tqa/1502 pythia-12b-deduped-v0 724.038
tqa/3307 pythia-6.9b-deduped-v0 658.285
tqa/1452 Qwen1.5-1.8B 557.009
tqa/5016 Qwen1.5-1.8B 557.009
tqa/3214 pythia-2.8b-deduped 531.823
tqa/1434 pythia-1b-deduped 391.147
tqa/1829 pythia-1b-deduped 391.147
tqa/1446 pythia-1b-deduped 391.147
tqa/309 pythia-1b-deduped 391.147
tqa/5253 Qwen1.5-0.5B 359.450
tqa/4606 Qwen1.5-0.5B 359.450
tqa/7747 Qwen1.5-0.5B 359.450
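
A table of this shape could be built roughly as follows. This is a sketch under stated assumptions: `solved_by_model` maps each model to the set of examples it solved, `elo` maps each model to its rating, and both names are hypothetical. For an example with a single solver, `min_elo` is simply that model's rating.

```python
def solved_by_one(solved_by_model, elo):
    """Return (example_id, model, min_elo) rows for examples with exactly one solver."""
    solvers = {}
    for model, ids in solved_by_model.items():
        for ex in ids:
            solvers.setdefault(ex, []).append(model)
    rows = [
        (ex, models[0], elo[models[0]])
        for ex, models in solvers.items()
        if len(models) == 1
    ]
    # Highest-rated sole solver first, as in the table above.
    rows.sort(key=lambda r: (-r[2], r[0]))
    return rows


# Toy data (hypothetical ratings and IDs).
elo = {"model-a": 1400.0, "model-b": 900.0}
solved_by_model = {
    "model-a": {"ex/1", "ex/2"},
    "model-b": {"ex/2", "ex/3"},
}
rows = solved_by_one(solved_by_model, elo)
for ex, model, rating in rows:
    print(f"{ex} {model} {rating:.3f}")
```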

Suspect problems

These are the 10 problems with the lowest correlation (tau) with the overall evaluation, i.e. better models tend to do worse on them.

example_link acc tau
tqa/5708 0.722 -0.593
tqa/10462 0.306 -0.550
tqa/5360 0.667 -0.507
tqa/9602 0.333 -0.507
tqa/10904 0.167 -0.487
tqa/8873 0.278 -0.450
tqa/10718 0.306 -0.425
tqa/5245 0.111 -0.423
tqa/649 0.278 -0.420
tqa/10849 0.361 -0.417
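
The page does not say which statistic `tau` is. The sketch below assumes Kendall's tau-b, computed between each model's overall rating and whether that model solved the problem. The function name and the toy numbers are hypothetical.

```python
import math
from itertools import combinations


def kendall_tau_b(x, y):
    """Kendall's tau-b between two equal-length sequences (handles ties)."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0:
            ties_x += 1
        if dy == 0:
            ties_y += 1
        if dx != 0 and dy != 0:
            if dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    n = len(x)
    n0 = n * (n - 1) // 2
    denom = math.sqrt((n0 - ties_x) * (n0 - ties_y))
    # Undefined when one sequence is constant (e.g. every model fails).
    return (concordant - discordant) / denom if denom else float("nan")


# Toy data (hypothetical): the two weakest models solve the problem,
# the two strongest do not, so tau comes out negative.
ratings = [1400.0, 1200.0, 1000.0, 800.0]
correct = [0, 0, 1, 1]
acc = sum(correct) / len(correct)
tau = kendall_tau_b(ratings, correct)
print(f"acc={acc:.3f} tau={tau:.3f}")
```

A strongly negative value flags a problem whose labelled answer may be wrong or ambiguous, which is why these examples are worth inspecting by hand.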

Histogram of accuracies

Histogram of problems by per-problem accuracy, i.e. the fraction of models that solve each problem.
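
The bin counts behind such a histogram could be computed as below. This is a sketch that assumes accuracy means the fraction of models solving a problem. The `correct_by_example` mapping (example ID to a list of 0/1 flags, one per model) and the bin count are hypothetical choices.

```python
from collections import Counter


def accuracy_histogram(correct_by_example, n_bins=10):
    """Count examples per equal-width accuracy bin over [0, 1]."""
    counts = Counter()
    for flags in correct_by_example.values():
        acc = sum(flags) / len(flags)
        # An accuracy of exactly 1.0 goes into the last bin.
        counts[min(int(acc * n_bins), n_bins - 1)] += 1
    return [counts[b] for b in range(n_bins)]


# Toy data (hypothetical): four examples scored by four models.
correct_by_example = {
    "ex/1": [1, 1, 1, 1],
    "ex/2": [0, 0, 0, 0],
    "ex/3": [1, 0, 0, 0],
    "ex/4": [1, 1, 0, 0],
}
hist = accuracy_histogram(correct_by_example, n_bins=4)
print(hist)
```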

Histogram of difficulties

Histogram of problems by the minimum Elo needed to solve each problem, i.e. the rating of the lowest-rated model that solves it.
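
One way to obtain the per-problem difficulty is to take the lowest rating among the models that solve it. The sketch below assumes the same hypothetical `solved_by_model` and `elo` mappings as a per-model results table would provide. Problems that no model solves have no defined minimum and are left out.

```python
def min_elo_to_solve(solved_by_model, elo):
    """Map each solved example to the lowest rating among its solvers."""
    best = {}
    for model, ids in solved_by_model.items():
        for ex in ids:
            best[ex] = min(best.get(ex, float("inf")), elo[model])
    return best


# Toy data (hypothetical ratings and IDs).
elo = {"model-a": 1400.0, "model-b": 900.0}
solved_by_model = {
    "model-a": {"ex/1", "ex/2"},
    "model-b": {"ex/2"},
}
difficulty = min_elo_to_solve(solved_by_model, elo)
for ex in sorted(difficulty):
    print(ex, difficulty[ex])
```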