swebench-test: by examples

Not solved by any model

There are 1075 examples not solved by any model. If these are well-posed problems, solving even a few of them is a good signal that your model really is better than the leading models.
astropy__astropy-12057, astropy__astropy-12318, astropy__astropy-12544, astropy__astropy-12825, astropy__astropy-12842, astropy__astropy-12880, astropy__astropy-12891, astropy__astropy-12962, astropy__astropy-13033, astropy__astropy-13068, astropy__astropy-13073, astropy__astropy-13075, astropy__astropy-13132, astropy__astropy-13158, astropy__astropy-13162, astropy__astropy-13234, astropy__astropy-13236, astropy__astropy-13398, astropy__astropy-13417, astropy__astropy-13438, astropy__astropy-13462, astropy__astropy-13465, astropy__astropy-13477, astropy__astropy-13638, astropy__astropy-13731, astropy__astropy-13734, astropy__astropy-13842, astropy__astropy-13933, astropy__astropy-13977, astropy__astropy-14042, astropy__astropy-14182, astropy__astropy-14213, astropy__astropy-14253, astropy__astropy-14295, astropy__astropy-14365, astropy__astropy-14379, astropy__astropy-14413, astropy__astropy-14439, astropy__astropy-14566, astropy__astropy-14578, astropy__astropy-14590, astropy__astropy-14701, astropy__astropy-14966, astropy__astropy-7008, astropy__astropy-7441, astropy__astropy-7737, astropy__astropy-7746, astropy__astropy-7973, astropy__astropy-8251, astropy__astropy-8519, astropy__astropy-8715, astropy__astropy-8747, django__django-10287, django__django-10531, django__django-10554, django__django-10643, django__django-10680, django__django-10737, django__django-10853, django__django-10904, django__django-10910, django__django-10939, django__django-10957, django__django-10999, django__django-11003, django__django-11019, django__django-11030, django__django-11057, django__django-11062, django__django-11087, django__django-11088, django__django-11115, django__django-11129, django__django-11138, django__django-11169, django__django-11177, django__django-11205, django__django-11270, django__django-11278, django__django-11281, django__django-11283, django__django-11323, django__django-11354, django__django-11359, django__django-11396, django__django-11400, 
django__django-11417, django__django-11433, django__django-11446, django__django-11457, django__django-11477, django__django-11517, django__django-11525, django__django-11527, django__django-11539, django__django-11560, django__django-11564, django__django-11584, django__django-11591, django__django-11622, django__django-11630, django__django-11669, django__django-11688, django__django-11692, django__django-11701, django__django-11728, django__django-11734, django__django-11742, django__django-11751, django__django-11754, django__django-11772, django__django-11797, django__django-11808, django__django-11820, django__django-11829, django__django-11885, django__django-11891, django__django-11893, django__django-11894, django__django-11905, django__django-11910, django__django-11916, django__django-11991, django__django-12009, django__django-12073, django__django-12122, django__django-12148, django__django-12212, django__django-12273, django__django-12281, django__django-12313, django__django-12343, django__django-12360, django__django-12396, django__django-12406, django__django-12407, django__django-12431, django__django-12469, django__django-12477, django__django-12484, django__django-12504, django__django-12508, django__django-12513, django__django-12518, django__django-12532, django__django-12553, django__django-12556, django__django-12630, django__django-12669, django__django-12733, django__django-12734, django__django-12748, django__django-12771, django__django-12796, django__django-12830, django__django-12851, django__django-12869, django__django-12908, django__django-12910, django__django-12928, django__django-12953, django__django-12957, django__django-12961, django__django-12973, django__django-13030, django__django-13077, django__django-13097, django__django-13111, django__django-13118, django__django-13145, django__django-13162, django__django-13170, django__django-13192, django__django-13195, django__django-13199, django__django-13207, 
django__django-13212, django__django-13220, django__django-13250, django__django-13267, django__django-13287, django__django-13295, django__django-13300, django__django-13321, django__django-13344, django__django-13350, django__django-13355, django__django-13431, django__django-13448, django__django-13454, django__django-13458, django__django-13460, django__django-13466, django__django-13490, django__django-13495, django__django-13513, django__django-13528, django__django-13530, django__django-13560, django__django-13578, django__django-13585, django__django-13589, django__django-13592, django__django-13606, django__django-13607, django__django-13660, django__django-13671, django__django-13682, django__django-13684, django__django-13708, django__django-13722, django__django-13744, django__django-13768, django__django-13794, django__django-13797, django__django-13800, django__django-13808, django__django-13924, django__django-13992, django__django-14011, django__django-14019, django__django-14026, django__django-14030, django__django-14031, django__django-14034, django__django-14056, django__django-14059, django__django-14071, django__django-14109, django__django-14124, django__django-14149, django__django-14155, django__django-14170, django__django-14182, django__django-14271, django__django-14282, django__django-14313, django__django-14315, django__django-14324, django__django-14336, django__django-14372, django__django-14374, django__django-14385, django__django-14395, django__django-14399, django__django-14404, django__django-14407, django__django-14430, django__django-14453, django__django-14463, django__django-14480, django__django-14495, django__django-14508, django__django-14513, django__django-14518, django__django-14534, django__django-14631, django__django-14634, django__django-14664, django__django-14667, django__django-14681, django__django-14722, django__django-14727, django__django-14730, django__django-14762, django__django-14785, 
django__django-14792, django__django-14805, django__django-14832, django__django-14871, django__django-14880, django__django-14890, django__django-14894, django__django-14916, django__django-14935, django__django-14969, django__django-14983, django__django-14996, django__django-14997, django__django-15018, django__django-15031, django__django-15038, django__django-15061, django__django-15087, django__django-15098, django__django-15108, django__django-15135, django__django-15139, django__django-15154, django__django-15199, django__django-15202, django__django-15240, django__django-15248, django__django-15252, django__django-15272, django__django-15280, django__django-15316, django__django-15318, django__django-15320, django__django-15324, django__django-15334, django__django-15388, django__django-15401, django__django-15413, django__django-15421, django__django-15433, django__django-15438, django__django-15481, django__django-15483, django__django-15492, django__django-15563, django__django-15620, django__django-15629, django__django-15648, django__django-15651, django__django-15666, django__django-15669, django__django-15671, django__django-15678, django__django-15682, django__django-15695, django__django-15703, django__django-15732, django__django-15738, django__django-15747, django__django-15752, django__django-15766, django__django-15774, django__django-15781, django__django-15819, django__django-15869, django__django-15957, django__django-15969, django__django-15973, django__django-15993, django__django-16027, django__django-16072, django__django-16076, django__django-16092, django__django-16117, django__django-16120, django__django-16142, django__django-16143, django__django-16229, django__django-16256, django__django-16263, django__django-16302, django__django-16311, django__django-16322, django__django-16343, django__django-16369, django__django-16408, django__django-16411, django__django-16502, django__django-16514, django__django-16532, 
django__django-16578, django__django-16597, django__django-16599, django__django-16603, django__django-16614, django__django-16629, django__django-16631, django__django-16635, django__django-16649, django__django-16667, django__django-16735, django__django-16745, django__django-16746, django__django-16757, django__django-16759, django__django-16786, django__django-16810, django__django-16816, django__django-16820, django__django-16830, django__django-16879, django__django-16883, django__django-16903, django__django-16920, django__django-16948, django__django-16952, django__django-16983, django__django-17045, django__django-17046, django__django-17058, django__django-17066, django__django-5158, django__django-8119, django__django-8630, django__django-9703, matplotlib__matplotlib-13859, matplotlib__matplotlib-13908, matplotlib__matplotlib-13959, matplotlib__matplotlib-13980, matplotlib__matplotlib-13983, matplotlib__matplotlib-14471, matplotlib__matplotlib-17810, matplotlib__matplotlib-18869, matplotlib__matplotlib-19553, matplotlib__matplotlib-19743, matplotlib__matplotlib-19763, matplotlib__matplotlib-20374, matplotlib__matplotlib-20470, matplotlib__matplotlib-20518, matplotlib__matplotlib-20676, matplotlib__matplotlib-20679, matplotlib__matplotlib-20693, matplotlib__matplotlib-20788, matplotlib__matplotlib-20816, matplotlib__matplotlib-21238, matplotlib__matplotlib-21318, matplotlib__matplotlib-21443, matplotlib__matplotlib-21490, matplotlib__matplotlib-21550, matplotlib__matplotlib-21559, matplotlib__matplotlib-21568, matplotlib__matplotlib-21570, matplotlib__matplotlib-21617, matplotlib__matplotlib-22711, matplotlib__matplotlib-22767, matplotlib__matplotlib-22835, matplotlib__matplotlib-22883, matplotlib__matplotlib-22929, matplotlib__matplotlib-22945, matplotlib__matplotlib-22991, matplotlib__matplotlib-23057, matplotlib__matplotlib-23088, matplotlib__matplotlib-23140, matplotlib__matplotlib-23266, matplotlib__matplotlib-23288, matplotlib__matplotlib-23476, 
matplotlib__matplotlib-23516, matplotlib__matplotlib-23573, matplotlib__matplotlib-23740, matplotlib__matplotlib-23742, matplotlib__matplotlib-24013, matplotlib__matplotlib-24088, matplotlib__matplotlib-24111, matplotlib__matplotlib-24177, matplotlib__matplotlib-24224, matplotlib__matplotlib-24257, matplotlib__matplotlib-24538, matplotlib__matplotlib-24604, matplotlib__matplotlib-24619, matplotlib__matplotlib-24691, matplotlib__matplotlib-24749, matplotlib__matplotlib-24849, matplotlib__matplotlib-24870, matplotlib__matplotlib-24912, matplotlib__matplotlib-24924, matplotlib__matplotlib-25027, matplotlib__matplotlib-25079, matplotlib__matplotlib-25129, matplotlib__matplotlib-25238, matplotlib__matplotlib-25281, matplotlib__matplotlib-25334, matplotlib__matplotlib-25405, matplotlib__matplotlib-25430, matplotlib__matplotlib-25479, matplotlib__matplotlib-25515, matplotlib__matplotlib-25565, matplotlib__matplotlib-25624, matplotlib__matplotlib-25631, matplotlib__matplotlib-25651, matplotlib__matplotlib-25772, matplotlib__matplotlib-25779, matplotlib__matplotlib-25785, matplotlib__matplotlib-25794, matplotlib__matplotlib-25859, matplotlib__matplotlib-25960, matplotlib__matplotlib-26024, matplotlib__matplotlib-26101, matplotlib__matplotlib-26160, matplotlib__matplotlib-26208, matplotlib__matplotlib-26285, matplotlib__matplotlib-26399, matplotlib__matplotlib-26466, matplotlib__matplotlib-26469, matplotlib__matplotlib-26472, matplotlib__matplotlib-26479, mwaskom__seaborn-2576, mwaskom__seaborn-2766, mwaskom__seaborn-2846, mwaskom__seaborn-2848, mwaskom__seaborn-2979, mwaskom__seaborn-3180, mwaskom__seaborn-3187, mwaskom__seaborn-3202, mwaskom__seaborn-3216, mwaskom__seaborn-3217, mwaskom__seaborn-3394, pallets__flask-4045, pallets__flask-4074, pallets__flask-4544, pallets__flask-4992, pallets__flask-5063, psf__requests-1339, psf__requests-1376, psf__requests-1657, psf__requests-2678, psf__requests-2754, psf__requests-2873, psf__requests-4718, psf__requests-6028, 
pydata__xarray-2922, pydata__xarray-3114, pydata__xarray-3159, pydata__xarray-3239, pydata__xarray-3302, pydata__xarray-3338, pydata__xarray-3364, pydata__xarray-3527, pydata__xarray-3631, pydata__xarray-3637, pydata__xarray-3649, pydata__xarray-3733, pydata__xarray-3976, pydata__xarray-3979, pydata__xarray-4184, pydata__xarray-4248, pydata__xarray-4339, pydata__xarray-4419, pydata__xarray-4423, pydata__xarray-4493, pydata__xarray-4510, pydata__xarray-4684, pydata__xarray-4750, pydata__xarray-4758, pydata__xarray-4759, pydata__xarray-4767, pydata__xarray-4819, pydata__xarray-4879, pydata__xarray-4911, pydata__xarray-4940, pydata__xarray-5126, pydata__xarray-5187, pydata__xarray-5233, pydata__xarray-5362, pydata__xarray-5365, pydata__xarray-5455, pydata__xarray-5580, pydata__xarray-5662, pydata__xarray-6135, pydata__xarray-6400, pydata__xarray-6548, pydata__xarray-6798, pydata__xarray-6804, pydata__xarray-6823, pydata__xarray-6857, pydata__xarray-6938, pydata__xarray-6971, pydata__xarray-6992, pydata__xarray-6999, pydata__xarray-7003, pydata__xarray-7019, pydata__xarray-7052, pydata__xarray-7089, pydata__xarray-7101, pydata__xarray-7105, pydata__xarray-7112, pydata__xarray-7120, pydata__xarray-7147, pydata__xarray-7179, pydata__xarray-7229, pydata__xarray-7400, pydata__xarray-7444, pylint-dev__pylint-4175, pylint-dev__pylint-4330, pylint-dev__pylint-4339, pylint-dev__pylint-4398, pylint-dev__pylint-4421, pylint-dev__pylint-4492, pylint-dev__pylint-4516, pylint-dev__pylint-4551, pylint-dev__pylint-4604, pylint-dev__pylint-4661, pylint-dev__pylint-5175, pylint-dev__pylint-5201, pylint-dev__pylint-5446, pylint-dev__pylint-5613, pylint-dev__pylint-5730, pylint-dev__pylint-5743, pylint-dev__pylint-5839, pylint-dev__pylint-5951, pylint-dev__pylint-6059, pylint-dev__pylint-6196, pylint-dev__pylint-6412, pylint-dev__pylint-6506, pylint-dev__pylint-6526, pylint-dev__pylint-6556, pylint-dev__pylint-6820, pylint-dev__pylint-7097, pylint-dev__pylint-7228, 
pylint-dev__pylint-8124, pylint-dev__pylint-8169, pylint-dev__pylint-8683, pylint-dev__pylint-8757, pylint-dev__pylint-8799, pylint-dev__pylint-8819, pylint-dev__pylint-8929, pytest-dev__pytest-10115, pytest-dev__pytest-10343, pytest-dev__pytest-10356, pytest-dev__pytest-10371, pytest-dev__pytest-10442, pytest-dev__pytest-10482, pytest-dev__pytest-10758, pytest-dev__pytest-10893, pytest-dev__pytest-10988, pytest-dev__pytest-11044, pytest-dev__pytest-11125, pytest-dev__pytest-11160, pytest-dev__pytest-5205, pytest-dev__pytest-5221, pytest-dev__pytest-5281, pytest-dev__pytest-5356, pytest-dev__pytest-5404, pytest-dev__pytest-5413, pytest-dev__pytest-5479, pytest-dev__pytest-5559, pytest-dev__pytest-5840, pytest-dev__pytest-5980, pytest-dev__pytest-6116, pytest-dev__pytest-6186, pytest-dev__pytest-6323, pytest-dev__pytest-6368, pytest-dev__pytest-7046, pytest-dev__pytest-7122, pytest-dev__pytest-7186, pytest-dev__pytest-7231, pytest-dev__pytest-7314, pytest-dev__pytest-7481, pytest-dev__pytest-7499, pytest-dev__pytest-7500, pytest-dev__pytest-7648, pytest-dev__pytest-8055, pytest-dev__pytest-8124, pytest-dev__pytest-8365, pytest-dev__pytest-8428, pytest-dev__pytest-8447, pytest-dev__pytest-8463, pytest-dev__pytest-8906, pytest-dev__pytest-8950, pytest-dev__pytest-9064, pytest-dev__pytest-9249, pytest-dev__pytest-9279, pytest-dev__pytest-9709, pytest-dev__pytest-9780, pytest-dev__pytest-9911, pytest-dev__pytest-9956, scikit-learn__scikit-learn-10306, scikit-learn__scikit-learn-10331, scikit-learn__scikit-learn-10397, scikit-learn__scikit-learn-10427, scikit-learn__scikit-learn-10428, scikit-learn__scikit-learn-10443, scikit-learn__scikit-learn-10452, scikit-learn__scikit-learn-10471, scikit-learn__scikit-learn-10483, scikit-learn__scikit-learn-10495, scikit-learn__scikit-learn-10508, scikit-learn__scikit-learn-10558, scikit-learn__scikit-learn-10577, scikit-learn__scikit-learn-10777, scikit-learn__scikit-learn-10899, scikit-learn__scikit-learn-10913, 
scikit-learn__scikit-learn-10949, scikit-learn__scikit-learn-10982, scikit-learn__scikit-learn-11040, scikit-learn__scikit-learn-11042, scikit-learn__scikit-learn-11043, scikit-learn__scikit-learn-11151, scikit-learn__scikit-learn-11206, scikit-learn__scikit-learn-11264, scikit-learn__scikit-learn-11315, scikit-learn__scikit-learn-11391, scikit-learn__scikit-learn-11496, scikit-learn__scikit-learn-11542, scikit-learn__scikit-learn-11574, scikit-learn__scikit-learn-11585, scikit-learn__scikit-learn-11596, scikit-learn__scikit-learn-11635, scikit-learn__scikit-learn-12258, scikit-learn__scikit-learn-12421, scikit-learn__scikit-learn-12443, scikit-learn__scikit-learn-12462, scikit-learn__scikit-learn-12557, scikit-learn__scikit-learn-12626, scikit-learn__scikit-learn-12656, scikit-learn__scikit-learn-12733, scikit-learn__scikit-learn-12758, scikit-learn__scikit-learn-12784, scikit-learn__scikit-learn-12827, scikit-learn__scikit-learn-12860, scikit-learn__scikit-learn-12908, scikit-learn__scikit-learn-12961, scikit-learn__scikit-learn-12983, scikit-learn__scikit-learn-12989, scikit-learn__scikit-learn-13010, scikit-learn__scikit-learn-13013, scikit-learn__scikit-learn-13046, scikit-learn__scikit-learn-13087, scikit-learn__scikit-learn-13143, scikit-learn__scikit-learn-13157, scikit-learn__scikit-learn-13165, scikit-learn__scikit-learn-13174, scikit-learn__scikit-learn-13283, scikit-learn__scikit-learn-13313, scikit-learn__scikit-learn-13333, scikit-learn__scikit-learn-13363, scikit-learn__scikit-learn-13436, scikit-learn__scikit-learn-13618, scikit-learn__scikit-learn-13628, scikit-learn__scikit-learn-13641, scikit-learn__scikit-learn-13780, scikit-learn__scikit-learn-13828, scikit-learn__scikit-learn-13877, scikit-learn__scikit-learn-13910, scikit-learn__scikit-learn-13915, scikit-learn__scikit-learn-13960, scikit-learn__scikit-learn-14012, scikit-learn__scikit-learn-14125, scikit-learn__scikit-learn-14237, scikit-learn__scikit-learn-14464, 
scikit-learn__scikit-learn-14520, scikit-learn__scikit-learn-14544, scikit-learn__scikit-learn-14629, scikit-learn__scikit-learn-14704, scikit-learn__scikit-learn-14706, scikit-learn__scikit-learn-14806, scikit-learn__scikit-learn-14878, scikit-learn__scikit-learn-14898, scikit-learn__scikit-learn-14999, scikit-learn__scikit-learn-15028, scikit-learn__scikit-learn-15084, scikit-learn__scikit-learn-15086, scikit-learn__scikit-learn-15094, scikit-learn__scikit-learn-15120, scikit-learn__scikit-learn-15138, scikit-learn__scikit-learn-23099, scikit-learn__scikit-learn-24145, scikit-learn__scikit-learn-24677, scikit-learn__scikit-learn-24769, scikit-learn__scikit-learn-25299, scikit-learn__scikit-learn-25308, scikit-learn__scikit-learn-25363, scikit-learn__scikit-learn-25500, scikit-learn__scikit-learn-25589, scikit-learn__scikit-learn-25638, scikit-learn__scikit-learn-25672, scikit-learn__scikit-learn-25694, scikit-learn__scikit-learn-25697, scikit-learn__scikit-learn-25744, scikit-learn__scikit-learn-25774, scikit-learn__scikit-learn-25969, scikit-learn__scikit-learn-26194, scikit-learn__scikit-learn-26242, scikit-learn__scikit-learn-26289, scikit-learn__scikit-learn-26318, scikit-learn__scikit-learn-26634, scikit-learn__scikit-learn-3840, scikit-learn__scikit-learn-7760, scikit-learn__scikit-learn-9274, scikit-learn__scikit-learn-9775, scikit-learn__scikit-learn-9939, sphinx-doc__sphinx-10021, sphinx-doc__sphinx-10067, sphinx-doc__sphinx-10097, sphinx-doc__sphinx-10191, sphinx-doc__sphinx-10207, sphinx-doc__sphinx-10320, sphinx-doc__sphinx-10353, sphinx-doc__sphinx-10360, sphinx-doc__sphinx-10481, sphinx-doc__sphinx-10551, sphinx-doc__sphinx-10614, sphinx-doc__sphinx-10757, sphinx-doc__sphinx-10807, sphinx-doc__sphinx-10819, sphinx-doc__sphinx-11109, sphinx-doc__sphinx-11266, sphinx-doc__sphinx-11311, sphinx-doc__sphinx-11312, sphinx-doc__sphinx-11489, sphinx-doc__sphinx-11510, sphinx-doc__sphinx-11550, sphinx-doc__sphinx-7234, sphinx-doc__sphinx-7305, 
sphinx-doc__sphinx-7350, sphinx-doc__sphinx-7351, sphinx-doc__sphinx-7356, sphinx-doc__sphinx-7374, sphinx-doc__sphinx-7380, sphinx-doc__sphinx-7395, sphinx-doc__sphinx-7462, sphinx-doc__sphinx-7501, sphinx-doc__sphinx-7557, sphinx-doc__sphinx-7578, sphinx-doc__sphinx-7590, sphinx-doc__sphinx-7593, sphinx-doc__sphinx-7670, sphinx-doc__sphinx-7686, sphinx-doc__sphinx-7738, sphinx-doc__sphinx-7748, sphinx-doc__sphinx-7760, sphinx-doc__sphinx-7762, sphinx-doc__sphinx-7831, sphinx-doc__sphinx-7906, sphinx-doc__sphinx-7923, sphinx-doc__sphinx-7930, sphinx-doc__sphinx-7985, sphinx-doc__sphinx-8007, sphinx-doc__sphinx-8020, sphinx-doc__sphinx-8026, sphinx-doc__sphinx-8037, sphinx-doc__sphinx-8058, sphinx-doc__sphinx-8075, sphinx-doc__sphinx-8095, sphinx-doc__sphinx-8117, sphinx-doc__sphinx-8202, sphinx-doc__sphinx-8273, sphinx-doc__sphinx-8278, sphinx-doc__sphinx-8282, sphinx-doc__sphinx-8284, sphinx-doc__sphinx-8362, sphinx-doc__sphinx-8474, sphinx-doc__sphinx-8539, sphinx-doc__sphinx-8548, sphinx-doc__sphinx-8552, sphinx-doc__sphinx-8579, sphinx-doc__sphinx-8599, sphinx-doc__sphinx-8611, sphinx-doc__sphinx-8633, sphinx-doc__sphinx-8638, sphinx-doc__sphinx-8658, sphinx-doc__sphinx-8674, sphinx-doc__sphinx-8707, sphinx-doc__sphinx-8731, sphinx-doc__sphinx-8771, sphinx-doc__sphinx-8863, sphinx-doc__sphinx-8951, sphinx-doc__sphinx-9053, sphinx-doc__sphinx-9128, sphinx-doc__sphinx-9155, sphinx-doc__sphinx-9171, sphinx-doc__sphinx-9229, sphinx-doc__sphinx-9233, sphinx-doc__sphinx-9234, sphinx-doc__sphinx-9260, sphinx-doc__sphinx-9261, sphinx-doc__sphinx-9289, sphinx-doc__sphinx-9386, sphinx-doc__sphinx-9459, sphinx-doc__sphinx-9461, sphinx-doc__sphinx-9547, sphinx-doc__sphinx-9602, sphinx-doc__sphinx-9654, sphinx-doc__sphinx-9665, sphinx-doc__sphinx-9798, sphinx-doc__sphinx-9828, sphinx-doc__sphinx-9902, sphinx-doc__sphinx-9931, sphinx-doc__sphinx-9982, sphinx-doc__sphinx-9997, sphinx-doc__sphinx-9999, sympy__sympy-11232, sympy__sympy-11788, sympy__sympy-11818, 
sympy__sympy-11862, sympy__sympy-11870, sympy__sympy-11897, sympy__sympy-11919, sympy__sympy-12088, sympy__sympy-12108, sympy__sympy-12144, sympy__sympy-12171, sympy__sympy-12194, sympy__sympy-12214, sympy__sympy-12286, sympy__sympy-12307, sympy__sympy-12428, sympy__sympy-12454, sympy__sympy-12472, sympy__sympy-12798, sympy__sympy-12812, sympy__sympy-12881, sympy__sympy-12945, sympy__sympy-13018, sympy__sympy-13043, sympy__sympy-13091, sympy__sympy-13146, sympy__sympy-13173, sympy__sympy-13177, sympy__sympy-13198, sympy__sympy-13236, sympy__sympy-13259, sympy__sympy-13264, sympy__sympy-13265, sympy__sympy-13286, sympy__sympy-13309, sympy__sympy-13364, sympy__sympy-13429, sympy__sympy-13437, sympy__sympy-13441, sympy__sympy-13581, sympy__sympy-13619, sympy__sympy-13682, sympy__sympy-13744, sympy__sympy-13768, sympy__sympy-13773, sympy__sympy-13806, sympy__sympy-13808, sympy__sympy-13840, sympy__sympy-13852, sympy__sympy-13895, sympy__sympy-13903, sympy__sympy-13915, sympy__sympy-13962, sympy__sympy-13988, sympy__sympy-14024, sympy__sympy-14031, sympy__sympy-14070, sympy__sympy-14082, sympy__sympy-14085, sympy__sympy-14180, sympy__sympy-14308, sympy__sympy-14317, sympy__sympy-14564, sympy__sympy-14575, sympy__sympy-14627, sympy__sympy-14821, sympy__sympy-15085, sympy__sympy-15151, sympy__sympy-15198, sympy__sympy-15222, sympy__sympy-15231, sympy__sympy-15241, sympy__sympy-15273, sympy__sympy-15286, sympy__sympy-15304, sympy__sympy-15308, sympy__sympy-15320, sympy__sympy-15446, sympy__sympy-15555, sympy__sympy-15586, sympy__sympy-15596, sympy__sympy-15625, sympy__sympy-15635, sympy__sympy-15933, sympy__sympy-15948, sympy__sympy-15970, sympy__sympy-16003, sympy__sympy-16052, sympy__sympy-16056, sympy__sympy-16085, sympy__sympy-16088, sympy__sympy-16106, sympy__sympy-16221, sympy__sympy-16281, sympy__sympy-16331, sympy__sympy-16334, sympy__sympy-16422, sympy__sympy-16437, sympy__sympy-16449, sympy__sympy-16474, sympy__sympy-16503, sympy__sympy-16527, sympy__sympy-16597, 
sympy__sympy-16632, sympy__sympy-16637, sympy__sympy-16781, sympy__sympy-16840, sympy__sympy-16858, sympy__sympy-16862, sympy__sympy-16864, sympy__sympy-16901, sympy__sympy-16906, sympy__sympy-16963, sympy__sympy-17010, sympy__sympy-17038, sympy__sympy-17103, sympy__sympy-17194, sympy__sympy-17223, sympy__sympy-17271, sympy__sympy-17273, sympy__sympy-17288, sympy__sympy-17313, sympy__sympy-17394, sympy__sympy-17512, sympy__sympy-17630, sympy__sympy-17653, sympy__sympy-17696, sympy__sympy-17809, sympy__sympy-18030, sympy__sympy-18033, sympy__sympy-18062, sympy__sympy-18116, sympy__sympy-18130, sympy__sympy-18137, sympy__sympy-18168, sympy__sympy-18191, sympy__sympy-18199, sympy__sympy-18200, sympy__sympy-18256, sympy__sympy-18351, sympy__sympy-18477, sympy__sympy-18587, sympy__sympy-18605, sympy__sympy-18630, sympy__sympy-18667, sympy__sympy-18728, sympy__sympy-18922, sympy__sympy-18961, sympy__sympy-19007, sympy__sympy-19091, sympy__sympy-19093, sympy__sympy-19182, sympy__sympy-19201, sympy__sympy-19254, sympy__sympy-19487, sympy__sympy-19601, sympy__sympy-19713, sympy__sympy-19885, sympy__sympy-20049, sympy__sympy-20115, sympy__sympy-20131, sympy__sympy-20134, sympy__sympy-20169, sympy__sympy-20264, sympy__sympy-20322, sympy__sympy-20428, sympy__sympy-20438, sympy__sympy-20639, sympy__sympy-20691, sympy__sympy-20741, sympy__sympy-20916, sympy__sympy-21171, sympy__sympy-21259, sympy__sympy-21260, sympy__sympy-21271, sympy__sympy-21286, sympy__sympy-21370, sympy__sympy-21432, sympy__sympy-21436, sympy__sympy-21476, sympy__sympy-21527, sympy__sympy-21567, sympy__sympy-21586, sympy__sympy-21596, sympy__sympy-21612, sympy__sympy-21627, sympy__sympy-21769, sympy__sympy-21849, sympy__sympy-21864, sympy__sympy-21930, sympy__sympy-21931, sympy__sympy-21932, sympy__sympy-21952, sympy__sympy-22080, sympy__sympy-22098, sympy__sympy-22236, sympy__sympy-22383, sympy__sympy-22402, sympy__sympy-22740, sympy__sympy-22773, sympy__sympy-23021, sympy__sympy-23141, sympy__sympy-23191, 
sympy__sympy-23413, sympy__sympy-23560, sympy__sympy-23729, sympy__sympy-24102, sympy__sympy-24353, sympy__sympy-24909
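
A list like the one above can be derived by subtracting every model's resolved set from the full set of example IDs. The following is a minimal sketch, not the page's actual code: the function name `unsolved_by_all`, the `resolved_by_model` mapping, and the model names `model_a` and `model_b` are hypothetical, and the toy data only reuses a few IDs from this page.

```python
from typing import Dict, Set


def unsolved_by_all(all_ids: Set[str], resolved_by_model: Dict[str, Set[str]]) -> Set[str]:
    """Return the example IDs that no model resolved."""
    # Union of everything any model resolved (empty set if there are no models).
    solved = set().union(*resolved_by_model.values())
    return all_ids - solved


# Toy data: IDs come from this page, model names are hypothetical.
all_ids = {
    "astropy__astropy-12057",
    "django__django-16037",
    "django__django-11605",
}
resolved_by_model = {
    "model_a": {"django__django-16037"},
    "model_b": {"django__django-11605"},
}

unsolved = sorted(unsolved_by_all(all_ids, resolved_by_model))
print(unsolved)  # ['astropy__astropy-12057']
```

Sorting the result gives the same alphabetical grouping by repository that the list above uses.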

Problems solved by only one model

example_link model min_elo
django__django-16037 20250605_atlassian-rovo-dev 1620.462
django__django-12113 20250605_atlassian-rovo-dev 1620.462
django__django-12589 20250605_atlassian-rovo-dev 1620.462
pytest-dev__pytest-9646 20250605_atlassian-rovo-dev 1620.462
astropy__astropy-11693 20250605_atlassian-rovo-dev 1620.462
mwaskom__seaborn-3407 20250605_atlassian-rovo-dev 1620.462
django__django-12906 20250605_atlassian-rovo-dev 1620.462
sympy__sympy-20476 20250605_atlassian-rovo-dev 1620.462
sympy__sympy-15976 20250605_atlassian-rovo-dev 1620.462
pytest-dev__pytest-6197 20250605_atlassian-rovo-dev 1620.462
pylint-dev__pylint-4858 20250605_atlassian-rovo-dev 1620.462
sympy__sympy-18087 20250605_atlassian-rovo-dev 1620.462
django__django-11239 20250605_atlassian-rovo-dev 1620.462
sphinx-doc__sphinx-8719 20250605_atlassian-rovo-dev 1620.462
pytest-dev__pytest-7985 20250605_atlassian-rovo-dev 1620.462
sympy__sympy-18650 20250605_atlassian-rovo-dev 1620.462
pytest-dev__pytest-9359 20250605_atlassian-rovo-dev 1620.462
django__django-12441 20250605_atlassian-rovo-dev 1620.462
sympy__sympy-12489 20250605_atlassian-rovo-dev 1620.462
django__django-12588 20250605_atlassian-rovo-dev 1620.462
sphinx-doc__sphinx-9015 20250605_atlassian-rovo-dev 1620.462
sphinx-doc__sphinx-10435 20250605_atlassian-rovo-dev 1620.462
django__django-15503 20250605_atlassian-rovo-dev 1620.462
django__django-14717 20250605_atlassian-rovo-dev 1620.462
django__django-15995 20250605_atlassian-rovo-dev 1620.462
matplotlib__matplotlib-22871 20250605_atlassian-rovo-dev 1620.462
sympy__sympy-13978 20250605_atlassian-rovo-dev 1620.462
django__django-15136 20250605_atlassian-rovo-dev 1620.462
sympy__sympy-13551 20250605_atlassian-rovo-dev 1620.462
pytest-dev__pytest-5103 20250605_atlassian-rovo-dev 1620.462
django__django-13915 20250605_atlassian-rovo-dev 1620.462
sympy__sympy-18198 20250605_atlassian-rovo-dev 1620.462
sympy__sympy-11831 20250605_atlassian-rovo-dev 1620.462
pydata__xarray-3520 20250605_atlassian-rovo-dev 1620.462
astropy__astropy-13306 20250605_atlassian-rovo-dev 1620.462
scikit-learn__scikit-learn-14430 20250605_atlassian-rovo-dev 1620.462
django__django-12912 20250605_atlassian-rovo-dev 1620.462
sympy__sympy-11400 20250605_atlassian-rovo-dev 1620.462
pydata__xarray-6889 20250605_atlassian-rovo-dev 1620.462
sphinx-doc__sphinx-9797 20250605_atlassian-rovo-dev 1620.462
scikit-learn__scikit-learn-15524 20250605_atlassian-rovo-dev 1620.462
django__django-14387 20250605_atlassian-rovo-dev 1620.462
django__django-12821 20250605_atlassian-rovo-dev 1620.462
django__django-12519 20250605_atlassian-rovo-dev 1620.462
sympy__sympy-16601 20250605_atlassian-rovo-dev 1620.462
sympy__sympy-17067 20250605_atlassian-rovo-dev 1620.462
sympy__sympy-12236 20250605_atlassian-rovo-dev 1620.462
sphinx-doc__sphinx-7615 20250605_atlassian-rovo-dev 1620.462
matplotlib__matplotlib-14623 20250605_atlassian-rovo-dev 1620.462
django__django-15689 20250605_atlassian-rovo-dev 1620.462
sphinx-doc__sphinx-11316 20250605_atlassian-rovo-dev 1620.462
django__django-12187 20250605_atlassian-rovo-dev 1620.462
scikit-learn__scikit-learn-13974 20250605_atlassian-rovo-dev 1620.462
pytest-dev__pytest-7324 20250605_atlassian-rovo-dev 1620.462
pytest-dev__pytest-5254 20250605_atlassian-rovo-dev 1620.462
django__django-14751 20250605_atlassian-rovo-dev 1620.462
django__django-14396 20250605_atlassian-rovo-dev 1620.462
matplotlib__matplotlib-23332 20250605_atlassian-rovo-dev 1620.462
scikit-learn__scikit-learn-14591 20250605_atlassian-rovo-dev 1620.462
sphinx-doc__sphinx-7859 20250605_atlassian-rovo-dev 1620.462
astropy__astropy-7858 20250605_atlassian-rovo-dev 1620.462
sphinx-doc__sphinx-7597 20250605_atlassian-rovo-dev 1620.462
sphinx-doc__sphinx-8125 20250605_atlassian-rovo-dev 1620.462
django__django-15375 20250605_atlassian-rovo-dev 1620.462
django__django-14919 20250605_atlassian-rovo-dev 1620.462
sphinx-doc__sphinx-8265 20250605_atlassian-rovo-dev 1620.462
scikit-learn__scikit-learn-13392 20250605_atlassian-rovo-dev 1620.462
django__django-16938 20250605_atlassian-rovo-dev 1620.462
django__django-11260 20250605_atlassian-rovo-dev 1620.462
django__django-13886 20250605_atlassian-rovo-dev 1620.462
sphinx-doc__sphinx-8551 20250605_atlassian-rovo-dev 1620.462
pytest-dev__pytest-5787 20250605_atlassian-rovo-dev 1620.462
sympy__sympy-18903 20250605_atlassian-rovo-dev 1620.462
mwaskom__seaborn-3069 20250605_atlassian-rovo-dev 1620.462
pallets__flask-4575 20250605_atlassian-rovo-dev 1620.462
pylint-dev__pylint-5231 20250605_atlassian-rovo-dev 1620.462
django__django-14725 20250605_atlassian-rovo-dev 1620.462
scikit-learn__scikit-learn-25805 20250605_atlassian-rovo-dev 1620.462
sympy__sympy-18478 20250605_atlassian-rovo-dev 1620.462
sphinx-doc__sphinx-9104 20250605_atlassian-rovo-dev 1620.462
django__django-12196 20250605_atlassian-rovo-dev 1620.462
sphinx-doc__sphinx-8621 20250605_atlassian-rovo-dev 1620.462
django__django-16950 20250522_amazon-q-developer-agent-20250405-dev 1548.698
django__django-13413 20250522_amazon-q-developer-agent-20250405-dev 1548.698
django__django-15352 20250522_amazon-q-developer-agent-20250405-dev 1548.698
matplotlib__matplotlib-25346 20250522_amazon-q-developer-agent-20250405-dev 1548.698
sympy__sympy-14248 20250522_amazon-q-developer-agent-20250405-dev 1548.698
sympy__sympy-19040 20250522_amazon-q-developer-agent-20250405-dev 1548.698
scikit-learn__scikit-learn-10382 20250522_amazon-q-developer-agent-20250405-dev 1548.698
matplotlib__matplotlib-25433 20250522_amazon-q-developer-agent-20250405-dev 1548.698
scikit-learn__scikit-learn-14024 20250522_amazon-q-developer-agent-20250405-dev 1548.698
django__django-13237 20250522_amazon-q-developer-agent-20250405-dev 1548.698
sphinx-doc__sphinx-10451 20250522_amazon-q-developer-agent-20250405-dev 1548.698
django__django-15423 20250522_amazon-q-developer-agent-20250405-dev 1548.698
scikit-learn__scikit-learn-13549 20250522_amazon-q-developer-agent-20250405-dev 1548.698
sympy__sympy-17720 20250522_amazon-q-developer-agent-20250405-dev 1548.698
sympy__sympy-18698 20250522_amazon-q-developer-agent-20250405-dev 1548.698
matplotlib__matplotlib-25126 20250522_amazon-q-developer-agent-20250405-dev 1548.698
sphinx-doc__sphinx-8028 20250522_amazon-q-developer-agent-20250405-dev 1548.698
pytest-dev__pytest-7168 20250522_amazon-q-developer-agent-20250405-dev 1548.698
sympy__sympy-18109 20250227_sweagent-claude-3-7-20250219 1464.129
sympy__sympy-14166 20250227_sweagent-claude-3-7-20250219 1464.129
sympy__sympy-13369 20250227_sweagent-claude-3-7-20250219 1464.129
django__django-16517 20250227_sweagent-claude-3-7-20250219 1464.129
django__django-13791 20250227_sweagent-claude-3-7-20250219 1464.129
sphinx-doc__sphinx-9350 20250227_sweagent-claude-3-7-20250219 1464.129
django__django-15127 20250227_sweagent-claude-3-7-20250219 1464.129
sympy__sympy-18633 20250227_sweagent-claude-3-7-20250219 1464.129
django__django-15180 20250227_sweagent-claude-3-7-20250219 1464.129
django__django-13115 20250227_sweagent-claude-3-7-20250219 1464.129
astropy__astropy-13469 20250227_sweagent-claude-3-7-20250219 1464.129
matplotlib__matplotlib-25547 20250227_sweagent-claude-3-7-20250219 1464.129
sympy__sympy-17251 20250227_sweagent-claude-3-7-20250219 1464.129
django__django-15996 20250227_sweagent-claude-3-7-20250219 1464.129
django__django-12951 20250227_sweagent-claude-3-7-20250219 1464.129
pydata__xarray-7347 20250227_sweagent-claude-3-7-20250219 1464.129
sympy__sympy-18835 20250227_sweagent-claude-3-7-20250219 1464.129
scikit-learn__scikit-learn-11235 20250227_sweagent-claude-3-7-20250219 1464.129
matplotlib__matplotlib-25551 20250227_sweagent-claude-3-7-20250219 1464.129
django__django-11605 20250131_amazon-q-developer-agent-20241202-dev 1392.039
django__django-11677 20250131_amazon-q-developer-agent-20241202-dev 1392.039
django__django-12936 20250131_amazon-q-developer-agent-20241202-dev 1392.039
scikit-learn__scikit-learn-26400 20250131_amazon-q-developer-agent-20241202-dev 1392.039
psf__requests-2466 20250131_amazon-q-developer-agent-20241202-dev 1392.039
django__django-11559 20250131_amazon-q-developer-agent-20241202-dev 1392.039
django__django-13484 20250131_amazon-q-developer-agent-20241202-dev 1392.039
sympy__sympy-16943 20250131_amazon-q-developer-agent-20241202-dev 1392.039
sympy__sympy-15346 20250131_amazon-q-developer-agent-20241202-dev 1392.039
astropy__astropy-14598 20241103_OpenHands-CodeAct-2.1-sonnet-20241022 1372.596
pytest-dev__pytest-7158 20241103_OpenHands-CodeAct-2.1-sonnet-20241022 1372.596
django__django-12470 20241103_OpenHands-CodeAct-2.1-sonnet-20241022 1372.596
matplotlib__matplotlib-25498 20241103_OpenHands-CodeAct-2.1-sonnet-20241022 1372.596
sympy__sympy-15971 20241103_OpenHands-CodeAct-2.1-sonnet-20241022 1372.596
django__django-12747 20241103_OpenHands-CodeAct-2.1-sonnet-20241022 1372.596
django__django-13265 20241103_OpenHands-CodeAct-2.1-sonnet-20241022 1372.596
matplotlib__matplotlib-23267 20241103_OpenHands-CodeAct-2.1-sonnet-20241022 1372.596
scikit-learn__scikit-learn-25747 20241103_OpenHands-CodeAct-2.1-sonnet-20241022 1372.596
psf__requests-1944 20241103_OpenHands-CodeAct-2.1-sonnet-20241022 1372.596
astropy__astropy-14369 20241103_OpenHands-CodeAct-2.1-sonnet-20241022 1372.596
sympy__sympy-17770 20241121_autocoderover-v2.0-claude-3-5-sonnet-20241022 1277.289
scikit-learn__scikit-learn-12486 20241121_autocoderover-v2.0-claude-3-5-sonnet-20241022 1277.289
sphinx-doc__sphinx-8056 20241121_autocoderover-v2.0-claude-3-5-sonnet-20241022 1277.289
pylint-dev__pylint-8898 20241121_autocoderover-v2.0-claude-3-5-sonnet-20241022 1277.289
django__django-11356 20241121_autocoderover-v2.0-claude-3-5-sonnet-20241022 1277.289
sphinx-doc__sphinx-8729 20241121_autocoderover-v2.0-claude-3-5-sonnet-20241022 1277.289
django__django-8326 20241121_autocoderover-v2.0-claude-3-5-sonnet-20241022 1277.289
scikit-learn__scikit-learn-10198 20240820_honeycomb 1217.911
django__django-11011 20240820_honeycomb 1217.911
django__django-13743 20240820_honeycomb 1217.911
django__django-13218 20240820_honeycomb 1217.911
sympy__sympy-13361 20240820_honeycomb 1217.911
pytest-dev__pytest-7352 20240820_honeycomb 1217.911
sympy__sympy-18744 20240820_honeycomb 1217.911
matplotlib__matplotlib-23563 20240820_honeycomb 1217.911
django__django-16693 20240721_amazon-q-developer-agent-20240719-dev 1172.720
pallets__flask-4935 20240721_amazon-q-developer-agent-20240719-dev 1172.720
astropy__astropy-8263 20240721_amazon-q-developer-agent-20240719-dev 1172.720
django__django-13822 20240721_amazon-q-developer-agent-20240719-dev 1172.720
matplotlib__matplotlib-23299 20240721_amazon-q-developer-agent-20240719-dev 1172.720
scikit-learn__scikit-learn-13704 20240617_factory_code_droid 1158.668
sphinx-doc__sphinx-8506 20240617_factory_code_droid 1158.668
matplotlib__matplotlib-20488 20240617_factory_code_droid 1158.668
sympy__sympy-12529 20240617_factory_code_droid 1158.668
pytest-dev__pytest-6680 20240617_factory_code_droid 1158.668
scikit-learn__scikit-learn-14450 20240628_autocoderover-v20240620 1149.708
psf__requests-1327 20240628_autocoderover-v20240620 1149.708
astropy__astropy-14484 20240628_autocoderover-v20240620 1149.708
astropy__astropy-13390 20240628_autocoderover-v20240620 1149.708
django__django-13620 20240628_autocoderover-v20240620 1149.708
scikit-learn__scikit-learn-13472 20240628_autocoderover-v20240620 1149.708
django__django-11964 20240628_autocoderover-v20240620 1149.708
django__django-13952 20240620_sweagent_claude3.5sonnet 1128.167
pytest-dev__pytest-8033 20240620_sweagent_claude3.5sonnet 1128.167
scikit-learn__scikit-learn-15096 20240620_sweagent_claude3.5sonnet 1128.167
django__django-12613 20240620_sweagent_claude3.5sonnet 1128.167
matplotlib__matplotlib-14043 20240620_sweagent_claude3.5sonnet 1128.167
django__django-11053 20240615_appmap-navie_gpt4o 1041.313
psf__requests-2393 20240509_amazon-q-developer-agent-20240430-dev 1031.237
pylint-dev__pylint-4970 20240509_amazon-q-developer-agent-20240430-dev 1031.237
sympy__sympy-20565 20240509_amazon-q-developer-agent-20240430-dev 1031.237
django__django-12049 20240509_amazon-q-developer-agent-20240430-dev 1031.237
django__django-12394 20240509_amazon-q-developer-agent-20240430-dev 1031.237
django__django-14599 20240402_sweagent_gpt4 983.429
sympy__sympy-13798 20240402_sweagent_gpt4 983.429
django__django-16306 20240402_sweagent_gpt4 983.429
django__django-12091 20240728_sweagent_gpt4o 974.412
django__django-13556 20240402_sweagent_claude3opus 888.815
sphinx-doc__sphinx-7961 20240402_rag_claude3opus 686.755
pydata__xarray-4994 20231010_rag_claude2 568.661
pytest-dev__pytest-5550 20231010_rag_claude2 568.661

Suspect problems

These are the 10 problems whose outcomes correlate least with the overall evaluation (i.e., better models tend to do worse on them).

example_link acc tau
pydata__xarray-5731 0.091 -0.323
scikit-learn__scikit-learn-13454 0.182 -0.264
django__django-15206 0.136 -0.236
django__django-13341 0.136 -0.218
pytest-dev__pytest-5550 0.045 -0.187
pydata__xarray-4994 0.045 -0.187
sphinx-doc__sphinx-7961 0.045 -0.158
django__django-11695 0.273 -0.148
django__django-13556 0.045 -0.129
django__django-14960 0.091 -0.125
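
The tau column is not defined on this page. One plausible reading is a Kendall rank correlation between whether each model solved the problem and that model's overall score, so a negative value means weaker models solved it more often than stronger ones. A minimal sketch under that assumption follows; the function name, the choice of the tau-b variant, and the toy data are all illustrative and not taken from the benchmark's own code.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equal-length sequences (handles ties)."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both: excluded from both denominator factors
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + tied_x_only) * (untied + tied_y_only))
    return (concordant - discordant) / denom if denom else 0.0


# Hypothetical data: five models ordered by overall score, and whether
# each one solved a single problem (1 = solved, 0 = not solved).
overall = [0.10, 0.20, 0.30, 0.40, 0.50]
solved = [1, 1, 0, 0, 0]  # only the two weakest models solved it

tau = kendall_tau_b(overall, solved)
print(f"tau = {tau:.3f}")  # negative: weaker models did better here
```

Because the solved/unsolved outcome is binary, most model pairs are tied on it, which is why a tie-aware variant is used in this sketch.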

Histogram of accuracies

Histogram of problems by the accuracy on each problem.
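
As a rough illustration of what this histogram summarizes, the sketch below computes a per-problem accuracy as the fraction of models that solved the problem, then counts problems per accuracy bin. The data layout (`solved_by` mapping a model name to the set of problem IDs it solved), the model and problem names, and the bin count are assumptions made for the example.

```python
def problem_accuracies(solved_by, problem_ids):
    """Fraction of models that solved each problem."""
    n_models = len(solved_by)
    return {
        pid: sum(pid in solved for solved in solved_by.values()) / n_models
        for pid in problem_ids
    }


def histogram(values, n_bins=10):
    """Count values in [0, 1] per equal-width bin; 1.0 goes in the last bin."""
    counts = [0] * n_bins
    for v in values:
        counts[min(int(v * n_bins), n_bins - 1)] += 1
    return counts


# Hypothetical results for four models on four problems.
solved_by = {
    "m1": {"p2", "p3", "p4"},
    "m2": {"p3", "p4"},
    "m3": {"p3"},
    "m4": {"p3"},
}
acc = problem_accuracies(solved_by, ["p1", "p2", "p3", "p4"])
print(acc)
print(histogram(acc.values(), n_bins=4))
```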

Histogram of difficulties

Histogram of problems by the minimum Elo to solve each problem.
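
The "minimum Elo to solve" for a problem can be read as the rating of the lowest-rated model that solved it. The sketch below derives that value per problem and groups problems into fixed-width Elo buckets. The model names, ratings, problem IDs, and the 100-point bucket width are hypothetical; problems that no model solved have no minimum Elo and are simply absent from the result.

```python
from collections import Counter


def min_elo_to_solve(model_elo, solved_by):
    """Map each solved problem to the lowest Elo among models that solved it."""
    best = {}
    for model, solved in solved_by.items():
        elo = model_elo[model]
        for pid in solved:
            if pid not in best or elo < best[pid]:
                best[pid] = elo
    return best


def bucket_counts(values, width=100):
    """Count values per bucket, keyed by the bucket's lower edge."""
    counts = Counter(int(v // width) * width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical ratings and results.
model_elo = {"model_a": 900.0, "model_b": 1200.0, "model_c": 1500.0}
solved_by = {
    "model_a": {"p1"},
    "model_b": {"p2", "p4"},
    "model_c": {"p1", "p2", "p3"},
}

difficulty = min_elo_to_solve(model_elo, solved_by)
print(difficulty)
print(bucket_counts(difficulty.values()))
```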