I've been segfaulting CPython quite a bit with stack underflows while developing a Befunge-to-Python-bytecode JIT that uses the Python stack as the Befunge stack. It has to include instrumentation to track the stack depth so that it substitutes 0 when the user pops a value from an empty stack. The latest issue was this weekend: I was reducing the bytecode size of `p` by converting the while loop that moves the stack into an array for recompilation to use FOR_ITER, which didn't like being called on non-iterables.
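For anyone unfamiliar with the pop-on-empty rule, here is a minimal sketch of the semantics in plain Python. It is not how the JIT does it (that works on the real interpreter stack with depth tracking); `BefungeStack` and `befunge_add` are hypothetical names used only for illustration.

```python
class BefungeStack:
    """Illustrative stack with Befunge semantics: popping when empty yields 0."""

    def __init__(self):
        self._items = []

    def push(self, value):
        self._items.append(value)

    def pop(self):
        # Substitute 0 instead of underflowing when nothing is on the stack.
        return self._items.pop() if self._items else 0

    def depth(self):
        return len(self._items)


def befunge_add(stack):
    """Hypothetical '+' instruction: pop two values, push their sum."""
    b = stack.pop()
    a = stack.pop()
    stack.push(a + b)


stack = BefungeStack()
stack.push(3)
befunge_add(stack)  # only one value present, so the missing operand reads as 0
```

A JIT that maps this onto the interpreter's own value stack has to know the depth at each pop, because the interpreter itself does no such check.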
It'd be neat to see how PyPy handles fuzzing. It uses CPython's bytecode, and I was able to get it to run beer6.bf (it was pretty slow, since that's a benchmark that mostly tests recompile speed), but it locked up when testing mandel.bf (odd, since mandel.bf doesn't trigger recompilation).
I think the article is a tutorial. It doesn't appear to present a new result.
It's moderately well known that .pyc files execute arbitrary code (not just arbitrary Python code) inside the Python process. (See e.g. https://docs.python.org/2/library/marshal.html#module-marsha..., "Warning: The marshal module [i.e. loading .pyc files] is not intended to be secure against erroneous or maliciously constructed data. Never unmarshal data received from an untrusted or unauthenticated source.") It is nice to see that afl can re-discover this issue, but I'm pretty sure it's not new.
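To make the connection concrete, here is a small sketch of the benign path: a .pyc is essentially a header followed by a marshalled code object, and whatever gets unmarshalled is handed straight to the interpreter. The source string and the `<demo>` filename are made up for illustration.

```python
import marshal

# Compile a harmless snippet to a code object, as the import system would.
source = "result = 6 * 7"
code = compile(source, "<demo>", "exec")

# Serialize it; this is roughly the payload that follows a .pyc header.
payload = marshal.dumps(code)

# Loading gives back a code object, which is then executed without any
# further validation of the bytecode inside it.
loaded = marshal.loads(payload)
namespace = {}
exec(loaded, namespace)
```

Nothing between `marshal.loads` and `exec` checks that the bytecode is well formed, which is why a mutated payload can crash the process rather than raise an exception.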
Oh my, I finished this quite late last night and forgot to add a conclusion. I thought the post was getting a bit lengthy, and the next step is to use gdb to dive into the crashes + make a patch, which I feel is too much for one post.
Would the core team really be interested in that? The bytecode interpreter relies on implicit invariants from the codegen; re-checking these invariants on the bytecode means slowing down the interpreter for very little value.
Nope, they wouldn't; making the patch is just for completeness' sake, I guess. Also, the afl tool to narrow down segfaults seems to always result in the same fault, so maybe if I patch it, it will narrow down some more interesting ones.
https://github.com/serprex/Befunge/blob/master/funge.py