NULL ptr deref in _PyCode_ConstantKey when compiling code #128632

Open
alex opened this issue Jan 8, 2025 · 5 comments
Labels: 3.12 (bugs and security fixes), 3.13 (bugs and security fixes), 3.14 (new features, bugs and security fixes), type-crash (a hard crash of the interpreter, possibly with a core dump)

Comments

@alex
Member

alex commented Jan 8, 2025

Crash report

What happened?

Unfortunately, the minimized reproducer is still fairly large. You can use xxd -r to convert the hexdump back into the actual binary.
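If xxd is not available, the dump can also be rebuilt in Python. This is only a minimal sketch that assumes the default xxd layout shown in this report (offset, colon, hex groups, two spaces, ASCII column); the helper name `xxd_to_bytes` is hypothetical and not part of any tool.

```python
import re

# Matches a default-format xxd line: hex offset, colon, space, then the rest.
_XXD_LINE = re.compile(r"^[0-9a-fA-F]+: (.*)$")


def xxd_to_bytes(dump: str) -> bytes:
    """Rebuild raw bytes from default-format xxd output (hypothetical helper)."""
    out = bytearray()
    for line in dump.splitlines():
        match = _XXD_LINE.match(line)
        if match is None:
            continue  # not a dump line (shell prompt, blank line, ...)
        # The hex column ends at the first double space; the rest is ASCII.
        hex_part = match.group(1).split("  ", 1)[0]
        out += bytes.fromhex(hex_part)  # fromhex() skips the single spaces
    return bytes(out)
```

Pasting the dump lines into a string and writing `xxd_to_bytes(text)` to a file opened in "wb" mode should give the same file that xxd -r produces.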

~/p/cpython ❯❯❯ xxd ~/Downloads/clusterfuzz-testcase-minimized-fuzz_pycompile-5092056728403968 
00000000: 5c62 2320 2323 2063 6f64 696e 673a 206c  \b# ## coding: l
00000010: 6174 696e 332f 30ff ffff ffff ffff ff6c  atin3/0........l
00000020: 6174 696e 37ff ffff 6463 6173 6564 6464  atin7...dcaseddd
00000030: 6464 6479 2e62 6b0a 0a0a 0a63 6c61 7373  dddy.bk....class
00000040: 2069 6e32 2829 3a0a 2020 2364 6464 6464   in2():.  #ddddd
00000050: 6762 6b0a 0a20 2064 6464 6464 640a 0a0a  gbk..  dddddd...
00000060: 476c 6174 762e 5f5f 7274 0a63 6c61 7373  Glatv.__rt.class
00000070: 2069 6e32 28ba 293a 0a20 2064 6464 6464   in2(.):.  ddddd
00000080: 6467 626c 0a0a 2020 636c 6173 7320 47ed  dgbl..  class G.
00000090: 5b5f 7072 7765 616e 7065 725d 3a61 7464  [_prweanper]:atd
000000a0: 6464 640a 0a0a 0a0a 0a0a 0a30 6f37 300a  ddd........0o70.
000000b0: 0a0a 0a0a 0a0a 0a0a 7476 6147 6c2e 5f5f  ........tvaGl.__
000000c0: 7274 0a63 6c61 7373 2069 6e32 28cf 293a  rt.class in2(.):
000000d0: 0a20 2064 6464 6464 6467 626c 0a0a 2020  .  ddddddgbl..  
000000e0: 636c 6173 7320 47ed 5b5f 7072 7765 616e  class G.[_prwean
000000f0: 7065 725d 3a61 2564 6462 2320 6762 6c0a  per]:a%ddb# gbl.
00000100: 0a20 2063 6c61 7373 2047 ed5b 5f70 7277  .  class G.[_prw
00000110: 6561 6e70 6572 5d3a 6174 6464 6464 0a0a  eanper]:atdddd..
00000120: 2320 636f 6469 6e67 3d6c 6174 696e 2d31  # coding=latin-1
00000130: 0a0a 0a47 6c61 7476 2e5f 5f72 740a 636c  ...Glatv.__rt.cl
00000140: 6173 7320 696e 3228 ba29 3a0a 2020 6464  ass in2(.):.  dd
00000150: 6464 6464 6762 6c0a 0a20 2063 6c61 7373  ddddgbl..  class
00000160: 2047 ed5b 5f5f 636c 6173 7364 6963 745f   G.[__classdict_
00000170: 5f5d 3a61 7464 6464 640a 0a0a 0a0a 0a0a  _]:atdddd.......
00000180: 0a30 6f37 300a 0a0a 0a0a 0a0a 0a0a 7476  .0o70.........tv
00000190: 6147 6c2e 5f5f 7274 0a63 6c61 7373 2069  aGl.__rt.class i
000001a0: 6e32 28cf 293a 0a20 2064 6464 6464 6467  n2(.):.  ddddddg
000001b0: 626c 0a0a 2020 636c 6173 7320 47ed 5b5f  bl..  class G.[_
000001c0: 7072 7765 616e 7065 725d 3a61 2564 6462  prweanper]:a%ddb
000001d0: 2320 6762 6c0a 0a20 2063 6c61 7373 2047  # gbl..  class G
000001e0: ed5b 5f70 7277 6561 6e70 6572 5d3a 6174  .[_prweanper]:at
000001f0: 6464 6464 0a0a 2320 636f 2600 0000 0000  dddd..# co&.....
00000200: 0000 6469 6c67 3d6c 6174 696e 2d31 0a0a  ..dilg=latin-1..
00000210: 0a47 6c61 7476 2e5f 5f72 740a 636c 6173  .Glatv.__rt.clas
00000220: 7320 696e 3228 ba29 3a0a 2020 6464 6464  s in2(.):.  dddd
00000230: 6464 6762 6c0a 0a20 2063 6c61 7373 2047  ddgbl..  class G
00000240: ed5b 5f70 7277 6561 6e70 6572 5d3a 6174  .[_prweanper]:at
00000250: 6464 6464 0aee 0a0a 0a0a 0a0a 306f 3730  dddd........0o70
00000260: 0a0a 0a0a 0a0a 0a0a 0a47 6c61 7476 2e5f  .........Glatv._
00000270: 5f72 740a 636c 6173 7320 696e 3228 cf29  _rt.class in2(.)
00000280: 3a0a 2020 6464 6464 6464 6762 6c0a 0a20  :.  ddddddgbl.. 
00000290: 2063 6c61 7373 2047 ed5b 5f70 7277 6561   class G.[_prwea
000002a0: 6e70 6572 5d3a 6125 6464 6223 2023 2320  nper]:a%ddb# ## 
000002b0: 636f 64ff ffff ff64 6464 6464 989b 86d1  cod....ddddd....
000002c0: 9d94 f5f5 0a0a 636c 6173 7320 696e 3228  ......class in2(
000002d0: 293a 6f37 300a 0a0a 0a0a 0a0a 0a40 476c  ):o70........@Gl
000002e0: 6174 3a61 7464 6464 640a 0a0a 0a0a 0a0a  at:atdddd.......
000002f0: 0a30 6f37 300a 0a0a 0a0a 0a0a 0a0a 476c  .0o70.........Gl
00000300: 6174 762e 6223 2023 2320 606f 6469 6e67  atv.b# ## `oding
00000310: 3a20 6c61 7469 6e33 2f30 ffff ffff ffff  : latin3/0......
00000320: ffff 6c61 7469 6e37 ffff ff64 6361 7365  ..latin7...dcase
00000330: 6464 6464 6464 792e 2e5f 5f72 740a 636c  ddddddy..__rt.cl
00000340: 6173 7320 696e 3228 ba29 3a0a 2020 6464  ass in2(.):.  dd
00000350: 6464 6464 6762 6c0a 0a20 2063 6c61 7373  ddddgbl..  class
00000360: 2047 ed0a 306f 3730 0a0a 0a0a 0a0a 0a0a   G..0o70........
00000370: 0a74 7661 476c 2e5f 5f72 740a 636c 6173  .tvaGl.__rt.clas
00000380: 7320 696e 3228 cf29 3a0a 2020 6464 6464  s in2(.):.  dddd
00000390: 6464 6762 6c0a 0a20 2043 6c61 7373 2047  ddgbl..  Class G
000003a0: ed5b 5f70 7277 6561 6e70 6572 5d3a 6125  .[_prweanper]:a%
000003b0: 6464 6223 2067 6237 3531 3734 3631 3034  ddb# gb751746104
000003c0: 3530 3935 3634 3039 3431 3731 3531 2320  50956409417151# 
000003d0: 636f 6464 6464 640a 0a23 2063 6f64 696e  coddddd..# codin
000003e0: 673d 6c61 7469 6e2d 310a 0a0a 476c 6174  g=latin-1...Glat
000003f0: 762e 5f5f 7274 0a63 6c61 7373 2069 6e32  v.__rt.class in2
00000400: 28ba 293a 0a20 2064 6464 6464 6467 626c  (.):.  ddddddgbl
00000410: 0a0a 0a0a 0a0a 0a0a 0a47 6c61 7476 2e5f  .........Glatv._
00000420: 5f72 740a 636c 6173 7320 696e 3228 cf29  _rt.class in2(.)
00000430: 3a0a 2020 6464 6464 6464 6762 6c0a 0a20  :.  ddddddgbl.. 
00000440: 2063 6c61 7373 2047 ed5b 5f70 7277 6561   class G.[_prwea
00000450: 6e70 6572 5d3a 6125 6464 6264 6464 6464  nper]:a%ddbddddd
00000460: 6762 6b0a 0a20 2064 6464 6464 640a 0a0a  gbk..  dddddd...
00000470: 476c 6174 762e 5f5f 7274 0a63 6c61 7373  Glatv.__rt.class
00000480: 2069 6e32 28ba 293a 0a20 2064 6464 6464   in2(.):.  ddddd
00000490: 6467 626c 0a0a 2020 636c 6173 7320 47ed  dgbl..  class G.
000004a0: 5b5f 7072 7765 616e 7065 725d 3a61 7464  [_prweanper]:atd
000004b0: 6464 640a 0a0a 0100 000d 0a0a 0a0a 0a30  ddd............0
000004c0: 6f37 300a 0a0a 0a0a 0a0a 0a74 7279 3a20  o70........try: 
000004d0: 0a47 6c61 0a0a 0a0a 0a0a 0a47 6c61 7476  .Gla.......Glatv
000004e0: 2e5f 5f72 740a 636c 6173 7320 696e 3228  .__rt.class in2(
000004f0: cf29 3a0a 2020 6464 6464 6464 6762 6c0a  .):.  ddddddgbl.
00000500: 0a20 2063 6cff ffff ffff ffff ffff ffff  .  cl...........
00000510: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00000520: 6173 7320 47ed 5b5f 7072 7765 616e 7065  ass G.[_prweanpe
00000530: 725d 3a61 2564 6462 6464 6464 6467 626b  r]:a%ddbdddddgbk
00000540: 0a0a 2020 6464 6464 6464 0a0a 0a43 6c61  ..  dddddd...Cla
00000550: 7476 2e5f 5f72 740a 636c 6173 7320 6965  tv.__rt.class ie
00000560: 3228 ba29 3a0a 2020 6464 6464 6464 6762  2(.):.  ddddddgb
00000570: 6c0a 0a20 2063 6c61 7373 2047 ed5b 5f70  l..  class G.[_p
00000580: 7277 6561 6e70 6572 5d3a 6174 640a ee0a  rweanper]:atd...
00000590: 0a0a 0a0a 0a30 6f37 300a 0a0a 0a0a 0a0a  .....0o70.......
000005a0: 0a0a 2f3d 2074 762e 5f5f 7274 0a63 6c61  ../= tv.__rt.cla
000005b0: 7373 2069 6e32 28cf 293a 0a20 2064 6464  ss in2(.):.  ddd
000005c0: 6464 6467 626c 0a0a 5f70 7277 6561 6e70  dddgbl.._prweanp
000005d0: 6572 5d3a 6174 6464 6464 0a0a 2320 636f  er]:atdddd..# co
000005e0: 6469 6e67 3d74 6820 6223 2023 2320 636f  ding=th b# ## co
000005f0: 64ff ffff 7479 7065 ff64 6464 6464 6464  d...type.ddddddd
00000600: 792e 62                                  y.b
~/p/cpython ❯❯❯ ./python.exe -c '
                data = open("/Users/alex_gaynor/Downloads/clusterfuzz-testcase-minimized-fuzz_pycompile-5092056728403968", "rb").read()
                start = ["eval", "single", "exec"][data[0] % 3]
                opt = data[1] % 4
                compile(data[2:].split(b"\x00")[0], "<fuzz>", start, optimize=opt)'
python.exe(19196,0x1f3918240) malloc: nano zone abandoned due to inability to reserve vm space.
<string>:2: ResourceWarning: unclosed file <_io.BufferedReader name='/Users/alex_gaynor/Downloads/clusterfuzz-testcase-minimized-fuzz_pycompile-5092056728403968'>
  data = open("/Users/alex_gaynor/Downloads/clusterfuzz-testcase-minimized-fuzz_pycompile-5092056728403968", "rb").read()
ResourceWarning: Enable tracemalloc to get the object allocation traceback
Include/object.h:268:20: runtime error: member access within null pointer of type 'PyObject' (aka 'struct _object')
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior Include/object.h:268:20 in 
AddressSanitizer:DEADLYSIGNAL
=================================================================
==19196==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000008 (pc 0x000104329198 bp 0x00016bddd130 sp 0x00016bddd000 T0)
==19196==The signal is caused by a READ memory access.
==19196==Hint: address points to the zero page.
    #0 0x104329198 in _PyCode_ConstantKey codeobject.c:2417
    #1 0x104329430 in _PyCode_ConstantKey codeobject.c:2479
    #2 0x104859bec in const_cache_insert compile.c:315
    #3 0x104859794 in _PyCompile_ConstCacheMergeOne compile.c:1233
    #4 0x104722d20 in _PyAssemble_MakeCodeObject assemble.c:754
    #5 0x10485ad24 in _PyCompile_OptimizeAndAssemble compile.c:1369
    #6 0x104812d58 in codegen_visit_stmt codegen.c:2897
    #7 0x10480ba24 in _PyCodegen_Body codegen.c:828
    #8 0x104824604 in codegen_class_body codegen.c:1483
    #9 0x104812740 in codegen_visit_stmt codegen.c:2897
    #10 0x10480ba24 in _PyCodegen_Body codegen.c:828
    #11 0x10485d27c in compiler_codegen compile.c
    #12 0x10485b518 in _PyAST_Compile compile.c:1382
    #13 0x1049c48f0 in Py_CompileStringObject pythonrun.c:1497
    #14 0x10475cb34 in builtin_compile bltinmodule.c.h:363
    #15 0x1044669fc in cfunction_vectorcall_FASTCALL_KEYWORDS methodobject.c:452
    #16 0x104304afc in _PyObject_VectorcallTstate pycore_call.h:167
    #17 0x1047994f8 in _PyEval_EvalFrameDefault generated_cases.c.h:2013
    #18 0x1047698b4 in PyEval_EvalCode ceval.c:658
    #19 0x1049c66b8 in run_eval_code_obj pythonrun.c:1338
    #20 0x1049c6204 in run_mod pythonrun.c:1423
    #21 0x1049c21c4 in _PyRun_StringFlagsWithName pythonrun.c:1222
    #22 0x1049c2004 in _PyRun_SimpleStringFlagsWithName pythonrun.c:548
    #23 0x104a57cc4 in Py_RunMain main.c:776
    #24 0x104a591b8 in pymain_main main.c:806
    #25 0x104a59554 in Py_BytesMain main.c:830
    #26 0x189cb8270  (<unknown module>)

==19196==Register values:
 x[0] = 0x000000016bddcf18   x[1] = 0x0000000000000000   x[2] = 0x0000000000000000   x[3] = 0x00000001084007a0  
 x[4] = 0x0000000063000000   x[5] = 0x0000000000000000   x[6] = 0x0000000000000000   x[7] = 0x0000000000000000  
 x[8] = 0x0000000000000000   x[9] = 0x00000001064be5e8  x[10] = 0x0000000000000000  x[11] = 0x0000000000000084  
x[12] = 0x0000000105c50000  x[13] = 0x00000001064c06e8  x[14] = 0x0000000000000000  x[15] = 0x0000000000000000  
x[16] = 0x000000030a47dd90  x[17] = 0x00000001064180a0  x[18] = 0x0000000000000000  x[19] = 0x000000016bddd080  
x[20] = 0x000000016bddd000  x[21] = 0x0000000000000000  x[22] = 0x0000000000000008  x[23] = 0x000000702d7dba00  
x[24] = 0x0000000000000000  x[25] = 0x0000007000020000  x[26] = 0x0000000000000000  x[27] = 0x0000000000000000  
x[28] = 0x0000000000000001     fp = 0x000000016bddd130     lr = 0x0000000104329834     sp = 0x000000016bddd000  
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV codeobject.c:2417 in _PyCode_ConstantKey
==19196==ABORTING
fish: Job 1, './python.exe -c '' terminated by signal data = open("/Users/alex_gaynor… (start = ["eval", "single", "exe…)
fish: Job opt = data[1] % 4, 'compile(data[2:].split(b"\x00")…' terminated by signal SIGABRT (Abort)

Found by OSS-Fuzz.
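
For readers following along, the one-liner in the session above decodes the test case as follows: the first byte picks the compile mode, the second byte picks the optimize level, and the remainder up to the first NUL byte is the source. A small sketch of that decoding step, where the function name `decode_fuzz_input` is hypothetical:

```python
def decode_fuzz_input(data: bytes):
    """Split a test case into (mode, optimize, source), as the one-liner does."""
    modes = ["eval", "single", "exec"]
    start = modes[data[0] % 3]            # byte 0 selects the compile mode
    opt = data[1] % 4                     # byte 1 selects the optimize level
    source = data[2:].split(b"\x00")[0]   # source stops at the first NUL byte
    return start, opt, source
```

For this test case the first two bytes are 0x5c and 0x62, so the source is compiled in "exec" mode with optimize=2, and everything after the first NUL byte (offset 0x1fb in the dump) is ignored.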

CPython versions tested on:

CPython main branch

Operating systems tested on:

No response

Output from running 'python -VV' on the command line:

No response

@alex alex added the type-crash A hard crash of the interpreter, possibly with a core dump label Jan 8, 2025
@tom-pytel

tom-pytel commented Jan 8, 2025

Hi there. I'm not familiar with this code, so I can't say why this happens, but the immediate cause seems to be the offset calculation in compute_localsplus_info() in Python/assemble.c, line 535. The last loop, which handles freevars, does not account for a cellvar placed immediately before it, so it overwrites that entry instead of writing to the next slot in the tuple:

https://github.com/python/cpython/blob/main/Python/assemble.c#L535

With that corrected, the following error appears instead of a segfault:

Traceback (most recent call last):
  File "<python-input-1>", line 1, in <module>
    compile(src, '<fuzz>', start, optimize=opt)
    ~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
SystemError: compiler_lookup_arg(name='__classdict__') with reftype=5 failed in in2; freevars of code <generic parameters of >: ('__classdict__',)

EDIT: Segfault reproducible with just this script:

class A:
  class B[__classdict__]: pass
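
To try the short script without taking down the current interpreter, the compile() call can be run in a child process. This is a sketch only: the helper name `compile_exit_code` and the `REPRO` constant are hypothetical, and what the child does with this source depends on the Python build being used.

```python
import subprocess
import sys

# The two-line reproducer from this comment.
REPRO = "class A:\n  class B[__classdict__]: pass\n"


def compile_exit_code(source: str, mode: str = "exec") -> int:
    """Compile `source` in a child interpreter and return the child's exit code."""
    child = "import sys; compile(sys.stdin.read(), '<repro>', sys.argv[1])"
    proc = subprocess.run(
        [sys.executable, "-c", child, mode],
        input=source,
        text=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return proc.returncode
```

A return code of 0 means compile() succeeded and 1 means it raised an exception. On POSIX, a negative return code means the child was killed by a signal, which is what a crash would look like from `compile_exit_code(REPRO)`.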

@ZeroIntensity ZeroIntensity added 3.12 bugs and security fixes 3.13 bugs and security fixes 3.14 new features, bugs and security fixes labels Jan 8, 2025
@ZeroIntensity
Member

EDIT: Segfault reproducible with just this script:

Thanks for that! Confirmed on current main and back to 3.12.8. This is possibly a security problem, because the interpreter can be crashed with just compile(). I'm speculating, though.

cc @Eclips4 as an ast expert (you might like this issue!)

@tom-pytel

tom-pytel commented Jan 8, 2025

I tracked the problem down: in this particular case a cell var and a free var get the same offset in locals. As a result, the free var pointer for the last __classdict__ overwrites the cell var pointer for .type_params, and the slot in the tuple where the free var was meant to be written is left NULL. At the point of the error in compute_localsplus_info(), the var dicts are as follows:

umd->u_varnames = {'__classdict__': 0, '.generic_base': 1}
umd->u_cellvars = {'.type_params': 0}
umd->u_freevars = {'__classdict__': 0}  # assuming the value should be a 1?
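
The collision can be reproduced with a small Python model of the slot arithmetic. This is a simplified sketch, not the real C code: the function name `layout_localsplus` is hypothetical, and the total slot count is assumed to be locals plus non-dropped cells plus frees.

```python
def layout_localsplus(varnames, cellvars, freevars):
    """Model of the name-slot layout: locals first, then cells, then frees."""
    nlocals = len(varnames)
    dropped = sum(1 for name in cellvars if name in varnames)
    # Assumption: one slot per local, per non-dropped cell, and per free var.
    names = [None] * (nlocals + len(cellvars) - dropped + len(freevars))

    for name, index in varnames.items():
        names[index] = name

    numdropped = 0
    for name, index in cellvars.items():
        if name in varnames:
            numdropped += 1  # cell already covered by a local slot
            continue
        names[index + nlocals - numdropped] = name

    for name, index in freevars.items():
        # Same base offset as the cells, so index 0 lands on the first cell.
        names[index + nlocals - numdropped] = name

    return names


varnames = {"__classdict__": 0, ".generic_base": 1}
cellvars = {".type_params": 0}
print(layout_localsplus(varnames, cellvars, {"__classdict__": 0}))
# ['__classdict__', '.generic_base', '__classdict__', None]
print(layout_localsplus(varnames, cellvars, {"__classdict__": 1}))
# ['__classdict__', '.generic_base', '.type_params', '__classdict__']
```

With the reported index of 0, the free var overwrites the .type_params slot and the last slot stays empty (None here, NULL in C). With an index of 1, every slot is filled.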

Below is a snippet that detects the condition and raises a SystemError (which would be raised anyway if this were handled correctly, since this is a very degenerate condition). I'm including the "fix" here rather than in a PR because it is a band-aid that only avoids the crash. The proper fix would be to figure out why the __classdict__ freevar gets an index of 0 and correct that; the __classdict__ added in codegen_load_classdict_freevar() doesn't seem to take into account variables already present in u_metadata.u_cellvars when computing its index.

"Fix": Replace Python/assemble.c lines 506-538 with the following:

    // This counter mirrors the fix done in fix_cell_offsets().
    int numdropped = 0, maxcelloffset = -1;
    pos = 0;
    while (PyDict_Next(umd->u_cellvars, &pos, &k, &v)) {
        int has_name = PyDict_Contains(umd->u_varnames, k);
        RETURN_IF_ERROR(has_name);
        if (has_name) {
            // Skip cells that are already covered by locals.
            numdropped += 1;
            continue;
        }

        int offset = PyLong_AsInt(v);
        if (offset == -1 && PyErr_Occurred()) {
            return ERROR;
        }
        assert(offset >= 0);
        offset += nlocals - numdropped;
        maxcelloffset = Py_MAX(maxcelloffset, offset);
        assert(offset < nlocalsplus);
        _Py_set_localsplus_info(offset, k, CO_FAST_CELL, names, kinds);
    }

    pos = 0;
    while (PyDict_Next(umd->u_freevars, &pos, &k, &v)) {
        int offset = PyLong_AsInt(v);
        if (offset == -1 && PyErr_Occurred()) {
            return ERROR;
        }
        assert(offset >= 0);
        offset += nlocals - numdropped;
        assert(offset < nlocalsplus);

        // TODO: remove once gh-128632 is fixed properly, or keep to prevent future unforeseen segfaults?
        if (offset <= maxcelloffset) {
            PyErr_SetString(PyExc_SystemError,
                            "overlapping cell and free variable offsets detected (see gh-128632)");
            return ERROR;
        }

        _Py_set_localsplus_info(offset, k, CO_FAST_FREE, names, kinds);
    }
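
The guard itself is easy to check in isolation. The following Python sketch mirrors only the overlap test from the C snippet above; the function name `check_cell_free_overlap` is hypothetical and the dicts stand in for u_varnames, u_cellvars and u_freevars.

```python
def check_cell_free_overlap(varnames, cellvars, freevars):
    """Raise SystemError if a free var offset does not lie above every cell offset."""
    nlocals = len(varnames)
    numdropped = 0
    max_cell_offset = -1

    for name, index in cellvars.items():
        if name in varnames:
            numdropped += 1  # skip cells already covered by locals
            continue
        max_cell_offset = max(max_cell_offset, index + nlocals - numdropped)

    for name, index in freevars.items():
        offset = index + nlocals - numdropped
        if offset <= max_cell_offset:
            raise SystemError(
                "overlapping cell and free variable offsets detected "
                "(see gh-128632)"
            )
```

Called with the dicts from this comment it raises SystemError, and with the free var index changed from 0 to 1 it returns without error.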

P.S. If this niche case is not worth fixing properly, let me know and I will send a PR with this "fix" and a test case, to at least avoid the crash.

@alex
Member Author

alex commented Jan 8, 2025

Thanks for minimizing this!

@iritkatriel
Member

Thank you @tom-pytel for the analysis.

CC @JelleZijlstra .

@JelleZijlstra JelleZijlstra self-assigned this Jan 9, 2025
5 participants