Thursday, March 10, 2016

PyPy 5.0 released

PyPy 5.0

We have released PyPy 5.0, about three months after PyPy 4.0.1. We encourage all users of PyPy to update to this version.

You can download the PyPy 5.0 release here:
We would like to thank our donors for the continued support of the PyPy project.
We would also like to thank our contributors and encourage new people to join the project. PyPy has many layers and we need help with all of them: PyPy and RPython documentation improvements, tweaking popular modules to run on PyPy, or general help with making RPython’s JIT even better.

 

Faster and Leaner

We continue to improve the warmup time and memory usage of JIT-related metadata. The exact effects depend heavily on the program you’re running and can range from insignificant to warmup being up to 30% faster and memory dropping by about 30%.
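
One rough way to see warmup in your own program is to time successive runs of the same hot function and compare the early rounds with the later ones. The sketch below is only an illustration: workload() and time_rounds() are hypothetical names, the loop is not one of our benchmarks, and the numbers you get will depend on your code and machine.

```python
import timeit


def workload(n):
    """A small pure-Python loop of the kind a tracing JIT can compile."""
    total = 0
    for i in range(n):
        total += i * i % 7
    return total


def time_rounds(rounds=10, n=200000):
    """Return the wall-clock seconds taken by each successive call to workload()."""
    timings = []
    for _ in range(rounds):
        start = timeit.default_timer()
        workload(n)
        timings.append(timeit.default_timer() - start)
    return timings


timings = time_rounds()
print("first round: %.4fs, last round: %.4fs" % (timings[0], timings[-1]))
```

On a JIT-based interpreter the first rounds include tracing and compilation, so the gap between the first and last round gives a crude picture of warmup cost.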

 

C-API Upgrade

We also merged a major upgrade to our C-API layer (cpyext), simplifying the interaction between C-level objects and PyPy interpreter-level objects. As a result, lxml (prerelease) with its Cython-compiled component passes all tests on PyPy. The new cpyext is also much faster. This major refactoring will soon be followed by an expansion of our C-API compatibility.

 

Profiling with vmprof supported on more platforms


vmprof has been a go-to profiler for PyPy on Linux for a few releases, and we’re happy to announce that, thanks to cooperation with JetBrains, vmprof now works on Linux, OS X and Windows on both PyPy and CPython.

 

CFFI

While not specific to PyPy, cffi is arguably our most significant contribution to the Python ecosystem. PyPy 5.0 ships with cffi-1.5.2, which now allows embedding PyPy (or CPython) in a C program.
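
As a minimal sketch of what the embedding mode looks like, a build script declares the C functions the library should export and supplies the Python code that implements them. The module name my_plugin, the target name and the do_stuff() function below are hypothetical; the cffi calls used (embedding_api, set_source, embedding_init_code, compile) are the ones described in the cffi embedding documentation.

```python
# Build script sketch. The module name, target name and do_stuff() are
# hypothetical and chosen only for illustration.

# C declarations of the functions that the resulting library exports.
EMBEDDING_API = """
    int do_stuff(int, int);
"""

# Python code executed once, when the library is first used from C.
INIT_CODE = """
from my_plugin import ffi

@ffi.def_extern()
def do_stuff(x, y):
    return x + y
"""


def build(target="plugin-1.5.*"):
    """Compile a shared library that starts an interpreter on first call."""
    import cffi  # imported lazily so the declarations can be read without cffi

    ffibuilder = cffi.FFI()
    ffibuilder.embedding_api(EMBEDDING_API)
    ffibuilder.set_source("my_plugin", "")
    ffibuilder.embedding_init_code(INIT_CODE)
    return ffibuilder.compile(target=target, verbose=True)
```

Calling build() needs cffi and a working C compiler. A C program can then link against the resulting library and call do_stuff() like any other C function.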

 

What is PyPy?


PyPy is a very compliant Python interpreter, almost a drop-in replacement for CPython 2.7. It’s fast (PyPy and CPython 2.7.x performance comparison) due to its integrated tracing JIT compiler.
We also welcome developers of other dynamic languages to see what RPython can do for them.
This release supports x86 machines on most common operating systems (Linux 32/64, Mac OS X 64, Windows 32, OpenBSD, FreeBSD), newer ARM hardware (ARMv6 or ARMv7, with VFPv3) running Linux, and 64-bit PowerPC hardware, specifically Linux running the big- and little-endian variants of ppc64.

 

Other Highlights (since 4.0.1 released in November 2015)

  • New features:
    • Support embedding PyPy in a C-program via cffi and static callbacks in cffi.
      This deprecates the old method of embedding PyPy
    • Refactor vmprof to work cross-operating-system, deprecate using buggy
      libunwind on Linux platforms. Vmprof even works on Windows now.
    • Support more of the C-API type slots, like tp_getattro, and fix C-API
      macros, functions, and structs such as _PyLong_FromByteArray(),
      PyString_GET_SIZE, f_locals in PyFrameObject, Py_NAN, co_filename in
      PyCodeObject
    • Use a more stable approach for allocating PyObjects in cpyext (see
      blog post). Once the PyObject corresponding to a PyPy object is created,
      it stays around at the same location until the death of the PyPy object.
      Done with a little bit of custom GC support. It allows us to kill the
      notion of “borrowing” inside cpyext, reduces 4 dictionaries down to 1, and
      significantly simplifies the whole approach (which is why it is a new
      feature while technically a refactoring) and allows PyPy to support the
      popular lxml module (as of the next release) with no PyPy specific
      patches needed
    • Make the default filesystem encoding ASCII, like CPython
    • Use hypothesis in test creation, which is great for randomizing tests
     
  • Bug Fixes
    • Backport always using os.urandom for uuid4 from cpython and fix the JIT as well
      (issue #2202)
    • More completely support datetime, optimize timedelta creation
    • Fix for issue #2185 which caused an inconsistent list of operations to be
      generated by the unroller, appeared in a complicated Django app
    • Fix an elusive issue with stacklets on shadowstack which showed up when
      forgetting stacklets without resuming them
    • Fix entrypoint() which now acquires the GIL
    • Fix direct_ffi_call() so failure does not bail out before setting CALL_MAY_FORCE
    • Fix (de)pickling long values by simplifying the implementation
    • Fix RPython rthread so that objects stored as threadlocal do not force minor
      GC collection and are kept alive automatically. This improves performance of
      short-running Python callbacks and prevents resetting such objects between
      calls
    • Support floats as parameters to itertools.islice()
    • Check for the existence of CODESET; ignoring it should have prevented PyPy
      from working on FreeBSD
    • Fix for corner case (likely shown by Krakatau) for consecutive guards with
      interdependencies
    • Fix applevel bare class method comparisons which should fix pretty printing
      in IPython
    • Issues reported with our previous release were resolved after reports from users on our issue tracker at https://bitbucket.org/pypy/pypy/issues or on IRC at #pypy
     
  • Numpy:
    • Updates to numpy 1.10.2 (incompatibilities and not-implemented features
      still exist)
    • Support dtype=(('O', spec)) union while disallowing record arrays with
      mixed object, non-object values
    • Remove all traces of micronumpy from cpyext if --withoutmod-micronumpy option used
    • Support indexing filtering with a boolean ndarray
    • Support partition() as an app-level function, together with a cffi wrapper
      in pypy/numpy, this now provides partial support for partition()
     
  • Performance improvements:
    • Optimize global lookups
    • Improve the memory signature of numbering instances in the JIT. This should
      massively decrease the amount of memory consumed by the JIT, which is
      significant for most programs. Also compress the numberings using variable-
      size encoding
    • Optimize string concatenation
    • Use INT_LSHIFT instead of INT_MUL when possible
    • Improve struct.unpack by casting directly from the underlying buffer.
      Unpacking floats and doubles is about 15 times faster, and integer types
      about 50% faster (on 64 bit integers). This was then subsequently
      improved further in optimizeopt.py.
    • Optimize two-tuple lookups in mapdict, which improves warmup of instance
      variable access somewhat
    • Reduce all guards from int_floordiv_ovf if one of the arguments is constant
    • Identify permutations of attributes at instance creation, reducing the
      number of bridges created
    • Greatly improve re.sub() performance
     
  • Internal refactorings:
    • Refactor and improve exception analysis in the annotator
    • Remove unnecessary special handling of space.wrap().
    • Support list-resizing setslice operations in RPython
    • Tweak the trace-too-long heuristic for multiple jit drivers
    • Refactor bookkeeping (such a cool word - three double letters) in the
      annotator
    • Refactor wrappers for OS functions from rtyper to rlib and simplify them
    • Simplify backend loading instructions to only use four variants
    • Simplify GIL handling in non-jitted code
    • Refactor naming in optimizeopt
    • Change GraphAnalyzer to use a more precise way to recognize external
      functions and fix null pointer handling, generally clean up external
      function handling
    • Remove pure variants of getfield_gc_* operations from the JIT by
      determining purity while tracing
    • Refactor databasing
    • Simplify bootstrapping in cpyext
    • Refactor rtyper debug code into python.rtyper.debug
    • Separate structmember.h from Python.h. Also enhance creating api functions
      to specify which header file they appear in (previously only pypy_decl.h)
    • Fix tokenizer to enforce universal newlines, needed for Python 3 support
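
The struct.unpack item under performance improvements is easy to check against your own workload. The following is a minimal sketch, not the benchmark behind the figures quoted above: the payload of 1000 doubles and the unpack_all() helper are assumptions made for illustration.

```python
import struct
import timeit

# 1000 little-endian doubles; multiples of 0.5 are exactly representable,
# so the round trip can be compared with ==.
values = [0.5 * i for i in range(1000)]
fmt = "<%dd" % len(values)
packed = struct.pack(fmt, *values)


def unpack_all():
    """Unpack the whole buffer into a tuple of Python floats."""
    return struct.unpack(fmt, packed)


assert list(unpack_all()) == values
elapsed = timeit.timeit(unpack_all, number=1000)
print("1000 unpacks of 1000 doubles: %.4fs" % elapsed)
```

Running the same script under the previous and the new release gives a direct comparison for this one pattern.
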
Please try it out and let us know what you think. We welcome feedback: we know you are using PyPy, so please tell us about it!
Cheers
The PyPy Team

12 comments:

HelpingHand said...

What is the status on finally getting a functional x64 build for windows? I am mainly interested in embedding PyPy and unless there is support for it, I will continue to avoid it.

mathgl said...

does new cpyext help for supporting numpy?

mattip said...

HelpingHand: work on x64 for windows [0] is awaiting a champion, with either the skill to do it or with the deep pockets to sponsor it. If you are interested, please come to #pypy on IRC to discuss it

[0] http://doc.pypy.org/en/latest/windows.html#what-is-missing-for-a-full-64-bit-translation

mattip said...

mathgl: yes, we are cautiously optimistic that if we now flesh out cpyext to support enough of the C-API that vanilla numpy might just work. Stay tuned for further developments

Martin Gfeller said...

I've asked Brett Cannon, a well-known Pythonista working at Microsoft, whether they could sponsor or undertake the Windows 64-bit work.

If you have a substantial use case requiring the speed of PyPy, large address spaces and Windows, it might help.

Unknown said...

What happened to the speed graph on speed.pypy.org? The speedups for earlier versions of PyPy before 5.0 suddenly are much higher than they used to be. Compare for example against the graph of a couple of weeks ago (http://web.archive.org/web/20160228102615/http://speed.pypy.org/)

Version   28/2    11/3
1.5       3.18x   4.86x
2.1       6.12x   7.50x
2.4.0     6.22x   7.61x
2.6.1     7.05x   8.58x

Has the benchmark been changed, the timing method, the speed computation, hardware used, etc? More importantly, which version is "correct"?

Maciej Fijalkowski said...

Hi Paul.

We reran all benchmarks on old Pythons and the site now shows a different subset of benchmarks. I must admit I don't know why the main site chooses some benchmarks and not others; it's certainly not deliberate. Any single number you use is not correct, a bit by definition. We suggest you look in detail at what the benchmarks do or, even better, benchmark yourself. We'll look into why it's showing a different subset.

Unknown said...

Great news! Awesome!

mattip said...

Paul Melis, Maciej Fijalkowski - indeed there was a bug; I reran the old benchmarks but only ~half ran to completion. I reverted the bad run, so the results are now like they used to be. Thanks for pointing it out

Unknown said...

When will pypy3 5.0 be released?
I would also like to benefit from the PyPy 5.0 improvements while keeping support for Python 3.2.5.

Armin Rigo said...

lxml 3.6.0 released with support for PyPy 5.x.

Armin Rigo said...

Before trying out lxml 3.6.0, upgrade to PyPy 5.0.1: the release 5.0.0 does not reliably work with it.