• 2012 Language Summit Report
    This year's Language Summit took place on Wednesday March 7 in Santa Clara, CA before the start of PyCon 2012. As with previous years, in attendance were members of the various Python VMs, packagers from various Linux distributions, and members of several community projects.

    The Namespace PEPs

    The summit began with a discussion of PEPs 382 and 402, led largely by Barry Warsaw. The decision was ultimately deferred, with the group appearing to want parts of both PEPs.

    As of Monday at the PyCon sprints, both PEPs have been rejected (see the Rejection Notice at the top of each PEP). Martin von Loewis posted to the import-sig list that a resolution has been found and Eric Smith will draft a new PEP on the ideas agreed upon there. Effectively, PEP 382 has been outright rejected, while portions of PEP 402 will be accepted.

    importlib Status

    Brett Cannon announced that there is a completed and available branch of CPython using importlib at http://hg.python.org/sandbox/bcannon/. See the bootstrap_importlib named branch.

    Discussion began by outlining the only real existing issue, which lies in the stat'ing of directories. There's a minor backwards incompatibility with time granularity. However, everyone agreed that it's so unlikely to be an issue that it's not a showstopper and the work can move forward. Additionally, there was an optimization made around the stat calls, which was arrived at independently by each of Brett, Antoine Pitrou, and P.J. Eby.

    The topic of performance came up and Brett explained that the current pure-Python implementation is around 5% slower. Thomas Wouters exclaimed that 5% slower is actually really good, especially given some recent benchmark work he was doing showing that changing compilers sometimes shows a 5% difference in startup time. There was a shared feeling that 5% slower was not something to hold up integration of the code, which pushed discussion happily along.

    Brett went on to explain what the bootstrapping actually looks like, even asserting that the implementation finds what could be the first real use of frozen modules! Guido's first response was, "you mean to tell me that after 20 years we finally found a use for freezing code?"

    importlib._bootstrap is a frozen module containing the necessary builtins to operate, along with some re-implementations of a small number of functions. Some of the libraries included in the frozen module are warnings, _os (select code from posix), and marshal.

    Another compatibility issue was brought up but, again, it was decided that it wasn't worth halting progress over. importlib does not support a negative level count, which is used for implicit relative imports, and it was agreed that it's acceptable to continue not supporting it.
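
    To illustrate the negative level count, here is a minimal sketch, assuming a Python 3 interpreter whose __import__ is backed by importlib. The helper name try_import is made up for this example; level 0 requests an absolute import, while the legacy value -1 used to mean "try a relative import first, then an absolute one".

```python
def try_import(name, level):
    """Import `name` with the given level, returning the module or the error."""
    try:
        return __import__(name, globals(), locals(), [], level)
    except ValueError as exc:
        # Python 3 rejects negative levels rather than trying an implicit relative import.
        return exc


absolute = try_import("json", 0)   # absolute import
legacy = try_import("json", -1)    # Python 2 style implicit relative import
print(type(absolute).__name__, type(legacy).__name__)
```

    Code that still relies on implicit relative imports would need to switch to explicit ones (from . import name) or to absolute imports.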

    The future will likely bring a stripped-down import.c, along with the exposure of numerous hooks and of much of the importlib API.

    As for merging with the default branch, it was pretty universally agreed upon that this should happen for 3.3 and it should happen soon in order to get mileage on the implementation throughout the alpha and beta cycles. Since this will be happening shortly, Brett is going to follow-up to python-dev with some cleanup details and look for reviews.

    Release Schedule PEPs

    Discussion on PEPs 407 and 413 followed the importlib talk. Like the namespace PEP discussion, several ideas were tossed around but the group didn't arrive at any conclusion on acceptability of the PEPs.

    Immediately, the idea of splitting the standard library out on its own was resurrected, which could lend itself to both PEPs. Some questions remain, namely where the test suite would live. Additionally, there may need to be some distinction between the tests which cover the standard library and the tests which cover language features.

    The topic of versioning came up, with three distinctions needing to be made. We would seem to need a version of the language spec, a version of the implementation, and a version of the standard library.

    Many commenters mentioned that these PEPs make things too complicated. Additionally, there was a question about whether there are enough users who care about either of these changes being made. Several of us stated that we could use the quicker releases, but with so many users being stuck on old versions for one reason or another, it was unclear who would take the releases.

    Thomas Wouters mentioned a good point about the difficulty in lining up the so-called Python "LTS" releases with other Python consumers who do similar LTS-style releases. Ubuntu and their LTS schedule was a prime example, as well as the organizations who plan releases atop something like Ubuntu. Many of the Linux distribution packagers in attendance seemed to agree.

    One thing that seemed to have broad agreement was that shortening the standard library turnaround time would be a good thing in terms of new contributors. Few people are interested in writing new features that might not be released for over a year -- it's just not fun. Even with bug fixes, sometimes the duration can be seen as too long, to the point where users may end up just fixing our problems from within their own code if possible.

    Guido went on to make a comment about how we hope to avoid the mindset some have of "my package isn't accepted until it's in the standard library". The focus continues to be on projects being hosted on PyPI, being successful out in the wild, then vetted for acceptance in the standard library after maturity of the project and its APIs.

    It was suggested that perhaps speeding up bug fix releases could be a good move, but we would need to check with release managers to ensure they're on board and willing to expend the effort to produce more frequent releases. As with the new feature releases, we need to be sure there's an audience to take the new bug fixes.

    There was also some discussion about what have previously been called "sumo" releases. Given that some similar releases are already made by third-party vendors, the idea didn't seem to gain much traction.

    Funding from the Python Software Foundation

    PSF Chairman Steve Holden joined the group after lunch to mention that the foundation has resources available to assist development efforts, especially given the sponsorship success of this year's conference. While the foundation can't and won't dictate what should be coded up, they're open to proposals about the types of work to be funded.

    Steve and Jesse Noller were adamant about the support not only being for all Python implementations, but also for third-party projects. What's needed to begin funding for a project is a concrete proposal on what will be accomplished. They stressed that the money is ready and waiting -- proposals are the way to unlock it.

    Some ideas for how to use the funding came from Steve but also from around the room. One idea which started off the discussion was the idea of funding one-month sabbaticals. Then comes the issue of who might be available. Some suggested that freelance consultants in the development community might be the ones we should try to engage. Those with full-time employment may find it harder to acquire such a sabbatical, but the possibility is open to anyone.

    Another thought was potential funding of someone to do spurts of full-time effort on the bug tracker, ideally someone already involved in the triage effort. This type of funding would hope to put an end to the times when it takes three days to fix a bug and three years for the patch to be accepted. Some thought this might be a nice idea in the short term, but it could be tough work and burn out the individual(s) involved. If anyone is up for it, they're encouraged to propose the idea to the foundation.

    Along similar lines of tracker maintenance, Glyph Lefkowitz of the Twisted project had an idea to fund code reviews over code-writing efforts. Some thought this might be a good way to push forward the regex/re situation, given that the regex is very large and most felt that the only thing holding it back from some form of inclusion is an in-depth review. The cdecimal module was mentioned as another project that could use some review assistance.

    The code review funding is also an idea to push forward some third-party project's ports to Python 3, specifically including Twisted, which the group felt was an effort which should receive some of this funding.

    Along the way it was remarked that the core-mentors group has been a success in involving new contributors. Kudos to those involved with that list.

    virtualenv Inclusion

    In about two minutes, discussion on PEP 405 came and went. Carl Meyer mentioned that a reference implementation is available and is working pretty well. A look from the OS X maintainers would be beneficial, and both Ned Deily and Ronald Oussoren were in attendance. It seemed like one of the only things left in terms of the PEP was to find someone to make a declaration on it, and Thomas Wouters put his name out there if Nick Coghlan wasn't going to do it (update: Nick will be the PEP czar).

    PEP 397 Inclusion

    Without much of a Windows representation at the summit, discussion was fairly quick, but it was pretty much agreed that PEP 397 was something we should accept. Brian Curtin spoke in favor of the PEP, as well as mentioning ongoing work on the Windows installer to optionally add the executable's directory to the Path.

    After discussion outside of the summit, it was additionally agreed upon that the launcher should be installed via the 3.3 Windows installer, while it can also live as a standalone installer for those not taking 3.3. Additionally, there needs to be some work done on the PEP to remove much of the low-level detail that is coupled too tightly with the implementation, e.g., explaining of the location of the py.ini file.

    speed.python.org

    After generous hardware donations, the http://speed.python.org site has gone live and is currently running PyPy benchmarks. We need to make a decision on what benchmarks can be used as well as what benchmarks should be used when it comes to creating a Python 3 suite. As we get implementations on Python 3 we'll want to scale back 2.7 testing and push forward with 3.x.

    The project suffers not from a technological problem but from a personnel problem, which was thought to be another area that funding could be used for. However, even if money is on the table, we still need to find someone with the time, the know-how, and the drive to complete the task. Ideally the starting task would be to get PyPy and CPython implementations running and comparing. After that, there are a number of infrastructure tasks in line.

    PEP 411 Inclusion

    PEP 411 proposes the inclusion of provisional packages into the standard library. The recently discussed regex and ipaddr modules were used as examples of libraries to include under this PEP. How this inclusion should be implemented and denoted to users was the major discussion point.

    It was first suggested that documentation notes don't work -- we can't rely only on documentation to be the single notification point, especially for this type of code inclusion. Other thoughts were some type of flag on the library to specify its experimental status. Another thought was to emit a warning on import of a provisional library, but it's another thing that we'd likely want to silence by default in order to not affect user code, in the hopes that developers are running their test suite with warnings enabled. However, as with other times we've gone down this path, we run the risk of developers just disabling warnings altogether if they become annoying.
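
    The warn-on-import idea can be sketched with the standard warnings module. This is only an illustration of the trade-off discussed, not anything the PEP specifies: the ProvisionalWarning category and the load_provisional helper are hypothetical names. Subclassing PendingDeprecationWarning is an assumption chosen because that category is ignored by the default filters.

```python
import warnings


class ProvisionalWarning(PendingDeprecationWarning):
    """Hypothetical category for provisional standard library modules."""


def load_provisional(name):
    """Stand-in for importing a provisional module: warn, then return its name."""
    warnings.warn("%s is provisional; its API may change" % name,
                  ProvisionalWarning, stacklevel=2)
    return name


# What a test suite running with warnings enabled would see.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    load_provisional("regex")

# What happens once a developer gets annoyed and silences the category.
with warnings.catch_warnings(record=True) as ignored:
    warnings.simplefilter("ignore", ProvisionalWarning)
    load_provisional("regex")

print(len(caught), len(ignored))
```

    The second block is the risk raised at the summit: once the filter is set to ignore, the notification disappears entirely.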

    As has been suggested on python-dev, importing a provisional library from a special package, e.g., from __experimental__ import foo, was pretty strongly discouraged. If the library gains a consistent API, it penalizes users once it moves from provisional status to being officially accepted. Aliasing just exacerbates the problem.

    The PEP boils down to being about process, and we need to be sure that libraries being included use the ability to change APIs very carefully. We also need to make people, especially the library author, aware of the need to be responsive to feedback and open to change as the code reaches a wider audience.

    Looking back, Jesse Noller suggested multiprocessing would have been a good candidate for what this PEP is suggesting. Around this time, it was suggested that Michael Foord's mock could gain some provisional inclusion within unittest, perhaps as unittest.mock. Instead, given mock's stable API and wide use among us, along with the need for a mocking library within our own test suite, it was agreed to just accept it directly into the standard library without any provisional status.
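
    For readers who haven't used mock, here is a small sketch of the kind of test helper being discussed, assuming an interpreter where the library is importable as unittest.mock (as it is from Python 3.3 onward). The fetch_status function and the client object are invented for the example.

```python
from unittest import mock


def fetch_status(client):
    """Code under test: asks a client object for a page and reports its status."""
    response = client.get("/status")
    return response.code


# A Mock stands in for the real client, so no network access is needed.
fake_client = mock.Mock()
fake_client.get.return_value.code = 200

status = fetch_status(fake_client)

# The mock records how it was used, so the call can be verified afterwards.
fake_client.get.assert_called_once_with("/status")
print(status)
```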

    While on the topic of regex's role within the PEP, Thomas Wouters suggested that regex be introduced into the standard library, bypassing any provisional status. From there, the previously known re module could be moved to the sre name, and there didn't appear to be any dissenting opinion there.

    It should also be noted to users of provisional libraries that the library maintainers would need to exercise extreme care and be very conservative in changing of the APIs. The last thing we want to do is introduce a good library but as a moving target to its users.

    Keyword Arguments on all builtin functions

    As recently came up on the tracker, it was suggested that wider use of keyword arguments in our APIs would likely be a good thing. Gregory P. Smith suggested that we leave single-argument APIs alone, which was agreed upon. However, the overall change got some push back as "change for change's sake".

    In order to support this, the PyArg_ParseTuple function would need to do more work, and it's already known to be somewhat slow. Alternatively, PyArg_Parse is much faster, and the tuple version could take a thing or two from it regardless of any wide scale change to builtins.

    There does exist some potential break in compatibility when replacing a builtin function with a Python one, where positional-only arguments suddenly get a potentially conflicting name.
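
    A short sketch of that compatibility concern, assuming CPython, where the builtin len() accepts its argument by position only. The py_len replacement and the call_with_keyword helper are hypothetical names used only for the comparison.

```python
def py_len(obj):
    """Pure-Python stand-in for the builtin len(); `obj` is now a public name."""
    return len(obj)


def call_with_keyword(func):
    """Call `func` with a keyword argument, returning the result or the error."""
    try:
        return func(obj=[1, 2, 3])
    except TypeError as exc:
        return exc


builtin_result = call_with_keyword(len)     # C builtin: positional-only
python_result = call_with_keyword(py_len)   # Python function: keyword accepted
print(type(builtin_result).__name__, python_result)
```

    Once callers start passing obj= to the Python version, the parameter name becomes part of the API and can no longer be renamed freely.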

    It was widely agreed upon that we should avoid any blanket rules and keep changes to places where it makes sense rather than make wholesale changes. We also need to be mindful of documentation and doc strings being kept to match the actual keyword argument names as well as keep them in sync.

    OrderedDict was suggested as the container for keyword arguments, but Guido and Gregory were unsure of use-cases for that. Whether or not we use a traditional or ordered dictionary, it was suggested that we could possibly use a decorator to handle some of this. We could even go as far as exposing something like PyArg_ParseTuple as a Python-level function.

    PEP 362, a proposal for a function signature object, would help here and with decorators in general. It seems that all that's left with that PEP is another look and someone to declare on it.
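
    As a rough idea of what a signature object offers, the sketch below uses inspect.signature, which is how PEP 362 eventually became available in the standard library (Python 3.3 and later). The connect function is a made-up example.

```python
import inspect


def connect(host, port=8080, *, timeout=None):
    """Example function whose parameters we want to introspect."""
    return (host, port, timeout)


sig = inspect.signature(connect)

# Parameter names in declaration order.
names = list(sig.parameters)

# Parameters that can only be passed by keyword.
keyword_only = [name for name, param in sig.parameters.items()
                if param.kind is inspect.Parameter.KEYWORD_ONLY]

# Bind concrete arguments to parameter names without calling the function.
bound = sig.bind("example.org", timeout=5)
print(names, keyword_only, bound.arguments["host"])
```

    A decorator can use bind() in the same way to validate or log arguments by name before delegating to the wrapped function.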

    Porting to Python 3

    We moved on to talk about Python 3 porting, starting with the current strategies and how they're working out. Single-codebase porting is working better than expected for most of us, although except handling is a bit messy when supporting versions like 2.4. Having a lot of options, from 3to2 to 2to3, then the single codebase through parallel trees, is a really good thing. However, it's hard for us to choose a strategy for projects, so we don't, which is why most documentation tries to lay numerous strategies out there.
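
    The messy except handling comes from the two syntaxes not overlapping: "except ValueError, exc" only parses on Python 2, and "except ValueError as exc" needs 2.6 or later. One common workaround in single-codebase projects, sketched here with a hypothetical parse_port function, is to fetch the exception through sys.exc_info().

```python
import sys


def parse_port(text):
    """Return the port as an int, or an error string if `text` is not numeric."""
    try:
        return int(text)
    except ValueError:
        # Works on both old Python 2 releases and Python 3, since no
        # version-specific syntax is used to name the exception.
        exc = sys.exc_info()[1]
        return "invalid port: %s" % (exc,)


print(parse_port("8080"))
print(parse_port("http"))
```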

    It was suggested that documentation could stand to gain more examples of real-world porting examples, ideally pointing to changesets of these projects. The thought of our porting documentation gaining a cookbook-style approach seemed to get some agreement as a good idea.

    Hash Randomization

    Release candidates are available to all branches receiving security fixes, and in the meantime, David Malcolm found and reported a security issue in the upstream expat project. However, since the upstream fix includes many other fixes at the same time, we should pick up only the security fix at this time and leave the bug fixes for the next bug fix release of the relevant branches.

    New dict Implementation

    Since the implementation makes sense and the tests pass, it was quickly agreed upon that Mark Shannon's PEP 412 should be accepted. As with other changes agreed upon in this summit, we'd like for the change to be pushed soon in order to get mileage on it throughout the alpha and beta cycles. With this acceptance comes commit access for Mark so that he can maintain the code.

    It was also remarked that the only user-visible difference that this implementation brings is a difference in sort ordering, but the recent hash randomization work makes this a moot point.

    New pickle Protocol

    PEP 3154, mentioned by Lukasz Langa, specifies a new pickle protocol -- version 4. Lukasz mentioned exception pickling in multiprocessing as being an issue, and Antoine solved it with this PEP. While qualified names provide some help, it was agreed upon that this PEP needs more attention.
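
    For orientation, here is a minimal sketch of selecting the new protocol explicitly, assuming an interpreter in which protocol 4 is available (Python 3.4 and later). It round-trips a simple built-in exception and does not attempt to reproduce the specific multiprocessing problem mentioned above.

```python
import pickle

original = ValueError("worker failed", 42)

# Ask for protocol 4 explicitly; the stream starts with a protocol marker.
payload = pickle.dumps(original, protocol=4)
restored = pickle.loads(payload)

print(payload[:2], type(restored).__name__, restored.args)
```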


    If you have any questions or comments, please post to python-dev.

    Thanks to Eric Snow and Senthil Kumaran for contributing to this post.
  • Meet the Team: Brett Cannon

    This post is part of the "Meet the Team" series of posts, which is meant to give a brief introduction to the Python core development team.

    Name: Brett Cannon
    Location: San Francisco, CA, USA
    Home Page: https://profiles.google.com/bcannon
    Blog: http://sayspy.blogspot.com

    How long have you been using Python?

    Since the fall of 2000

    How long have you been a core committer?

    Since April 2003 (shortly after PyCon 2003).

    How did you get started as a core developer? Do you remember your first commit?

    I became a core developer thanks to incessantly bugging people to commit patches for me (a trick that doesn't quite work as well as it used to; perk of getting in before Python's popularity spike in 2003/2004). Starting in August 2002 I revitalized the Python-Dev Summaries (which lasted for about 2.5 years). While writing the Summaries I would fairly regularly pick up on little issues that needed fixing. Since I was already talking on python-dev fairly regularly I simply asked folks to check my patches and commit them for me. One day Guido just asked why I didn't commit myself, I said I didn't have commit rights, and then he more or less said "you do now".

    As for my first commit (changeset 28686), it was fixing some string escapement in time.strptime() (which happens to be my first contribution to Python itself).

    Which parts of Python are you working on now?

    I typically focus on the import machinery and making the Python language work well across all VMs.

    What do you do with Python when you aren't doing core development work?

    I managed to use Python a little bit in my PhD thesis by implementing some server-side stuff in Python. Otherwise all of my personal projects use Python as much as possible. And my future job at Google is going to be mostly in Python.

    What do you do when you aren't programming?

    I'm somewhat of a movie junkie with selective bits of TV tossed in (losing my television in the summer of 2000 to a heat wave was one of the best things that ever accidentally happened to me; marrying my wife has been the best thing I did on purpose =). Otherwise I read a lot; mostly magazines and websites, but with some book always in progress.

  • Meet the Team: Michael Foord

    This post is part of the "Meet the Team" series of posts, which is meant to give a brief introduction to the Python core development team.

    Name: Michael Foord
    Location: Northampton, UK
    Home Page: http://www.voidspace.org.uk/

    How long have you been using Python?

    I first started using Python as a hobby in 2002. I started using Python full time for work in 2006. When I started programming with Python it was with a group of guys who wanted to write a program to aggregate information from a Play By Email game. None of us had done any programming for a while and we had just decided on using Smalltalk when someone suggested we try Python. I quickly fell in love with Python.

    How long have you been a core committer?

    I became a core-committer at PyCon in 2009. It was originally because of my involvement with IronPython.

    How did you get started as a core developer? Do you remember your first commit?

    During the PyCon 2009 sprints I worked with Gregory Smith, another core developer, to incorporate some improvements to unittest contributed by Google.

    Which parts of Python are you working on now?

    After the initial work on unittest at the PyCon sprint I took on fixing other issues and making improvements to unittest, which was without a maintainer. I became the maintainer of unittest but also contribute to other parts of the standard library.

    I'm involved in supporting Python in various other minor ways, such as looking after Planet Python, being a PSF member, helping out on the python.org webmaster alias and so on.

    What do you do with Python when you aren't doing core development work?

    For my day job I do web development for Canonical. I work on some of the web services infrastructure around the Canonical websites and also some of the services that integrate with Ubuntu itself. That's good fun and it's a great team.

    In my spare time I work on projects like unittest2 (a backport of the improvements of unittest for other platforms), mock (a testing library that provides mock objects and support for monkey patching in tests) and a whole bunch of other smaller stuff.

    I'd like to write more, but having devoted the best part of two years to writing IronPython in Action I doubt I'll take on any large writing projects soon.

    What do you do when you aren't programming?

    I'm very involved in a church in Northampton (UK), which takes a lot of my time and I help with administration for a charity we run. This is one reason why working for Canonical is good - I can work from home and having put my roots down here I won't move anywhere else (I certainly don't stay for the weather). Needless to say there isn't much Python programming happening in Northampton. My first full time programming gig was with an amazing team in London, which was a two hour door to door commute each way. I managed four years of that, and really enjoyed the job, but having escaped the commute I'm not likely to ever go back.

    I also enjoy gaming on the Xbox. Unfortunately if I find a game I like I can get sucked into it for weeks so I have to be careful. I've avoided World of Warcraft and EVE Online for this reason... I also organise a monthly geek meet in Northampton. There aren't enough Python programmers for a Python user group but we have a good collection of geeks of all sorts. We normally just get together in a pub and chew the fat or show off our latest gadgets.

  • A Python Launcher For Windows

    Mark Hammond (author of pywin32 and long-time supporter of Python on Windows) has written PEP 397, which describes a new launcher for Python on Windows. Vinay Sajip (author of the standard library logging module) has recently created an implementation of the launcher, downloadable from https://bitbucket.org/vinay.sajip/pylauncher/downloads

    The launcher allows Python scripts (.py and .pyw files) on Windows to specify the version of Python which should be used, allowing simultaneous use of Python 2 and 3.

    Windows users should consider downloading the launcher and testing it, to help the Python developers iron out any remaining issues. The launcher is packaged as a standalone application, and will support currently available versions of Python. The intention is that once the launcher is finalised, it will be included as part of Python 3.3 (although it will remain available as a standalone download for users of earlier versions).

    Two versions of the launcher are available - launcher.msi which installs in the Program Files directory, and launchsys.msi which installs in Windows' System32 directory. (There are also 64-bit versions for 64-bit versions of Windows).

    Some Details About the Launcher

    The full specification of the behaviour of the launcher is given in PEP 397. To summarise the basic principles:

    • The launcher supplies two executables - py.exe (the console version) and pyw.exe (the GUI version).
    • The launcher is registered as the handler for .py (console) and .pyw (GUI) file extensions.
    • When executing a script, the launcher looks for a Unix-style #! (shebang) line in the script. It recognises executable names python (system default python), python2 (default Python 2 release) and python3 (default Python 3 release). The precise details can easily be customised on a per-user or per-machine basis.
    • When used standalone, the py.exe command launches the Python interactive interpreter. Command line switches are supported, so that py -2 launches Python 2, py -3 launches Python 3, and py launches the default version.
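
    To make the shebang rule above concrete, here is a deliberately simplified sketch of how a first line could be mapped to a Python version. It is not the launcher's actual algorithm (PEP 397 has the full rules, including customisation), and the requested_python function and its return values are invented for this illustration.

```python
import re

# Hypothetical, simplified pattern: an optional Unix-style prefix followed by
# one of the three command names described above.
_SHEBANG = re.compile(
    r"^#!\s*(?:/usr/bin/env\s+|/usr/bin/|/usr/local/bin/)?(python[23]?)\b"
)


def requested_python(first_line):
    """Return 'default', '2', '3', or None when no recognised shebang is present."""
    match = _SHEBANG.match(first_line)
    if match is None:
        return None
    command = match.group(1)
    return {"python": "default", "python2": "2", "python3": "3"}[command]


for line in ("#!/usr/bin/env python2", "#!python3", "#!/usr/bin/python", "print('hi')"):
    print(repr(line), "->", requested_python(line))
```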

    Simple Usage Instructions

    When it is installed, the launcher associates itself with .py and .pyw scripts. Unless you do anything else, scripts will be run using the default Python on the machine, so you will see no change. One thing you might like to do, if you use the console a lot, is to add .py to your PATHEXT variable so that scripts don't get executed in a separate console.

    To specify that a script must use Python 2, simply add:

    #!/usr/bin/env python2
    

    as the first line of the script. (This is a Unix-compatible form. If you don't need Unix compatibility, #!python2 will do).

    If on the other hand, you want to specify that a script must use Python 3, add:

    #!/usr/bin/env python3
    

    as the first line.

    You can also start the Python interpreter using any of the following commands:

    # Default version of Python
    py
    # Python 2
    py -2
    # Python 3
    py -3
    

    For this to work, the py.exe executable must be on your path. This is automatic with the launchsys version of the installer, but the install directory (C:\Program Files\Python Launcher) must be added manually to PATH with launcher.msi.

    Further Reading

    The following email threads on python-dev cover some of the key discussions:

  • CPython 3.2.1 Released

    On behalf of the python-dev team, release manager Georg Brandl has announced the final release of CPython 3.2.1. Windows installers and tarballs are available as of July 10, so please consider upgrading to this release.

    The What's New document lists all of the new features in 3.2, and the Misc/NEWS file in the source lists each bug fixed.

    If you find any issues with this release or any other, please report them to http://bugs.python.org/.

  • 3.2.1 Release Candidate 2 Released

    Following up a big month of releases in June, the second release candidate of the 3.2.1 line is now ready. Since the first release candidate on May 15, over 40 issues have been fixed. We encourage everyone to test their projects with this candidate to get one last look before the final release of 3.2.1.

    What's fixed?

    I/O

    #1195 spent a few years without a fix, but a simple addition to clear errors before calling fgets solves the problem of interrupting sys.stdin.read() with CTRL-D inside of input(). The io system saw a cleanup in #12175: the readall method now returns None when the underlying read() returns None, and a ValueError is now raised when a file can't be opened.

    Although this isn't new for RC2, #11272 is an important 3.2.1 fix to input() on Windows - the fixing of a trailing \r. The issue has been reported many times over and affects many people (distutils upload command anyone?), so hopefully 3.2.1 does the trick for you.

    Windows

    3.2.0 brought a new feature for Windows: os.symlink support. With that feature came #12084: os.stat was improperly evaluating Windows symlinks, so the inner workings of the various stat functions were corrected.

    A user noticed that os.path.isdir was slow, and the fact that it relied on os.stat contributed to that, especially when evaluating symlinks (which are generally twice as slow as regular files). While os.path.isdir isn't anyone's performance bottleneck, it's called numerous times on interpreter startup so changing it in #11583 to use GetFileAttributes gives a tiny speedup to build on.

    subprocess

    Creating a Popen object with unexpected arguments was causing an AttributeError, but that was reported in #12085 and was fixed by the reporter. Due to a change in 3.2.0, Popen wasn't correctly handling empty environment variables, specifically the env argument. #12383 was created for the issue and was promptly fixed.

    ...and more!

    For a full list of changes through 3.2.1 RC2, check out the change log and download it now!

    As always, please report any issues you find to http://bugs.python.org. We appreciate your help in making great Python releases.

  • June Releases - 2.6.7, 2.7.2, 3.1.4

    June is a big month for Python releases, with an update coming out of all active branches.

    2.6.7

    A new source-only release of Python 2.6.7 is available, providing fixes to three security issues. Now that the 2.6 line is in security-mode, these releases will happen on an as-needed basis until October 2013 in source-only form. If you require binary installers, you should consider an upgrade to 2.7 or 3.2.

    2.6.7 is the first release to contain a fix to the previously covered urllib vulnerability. Additionally, an smtpd DoS vulnerability (Issue #9129) and SimpleHTTPServer.list_directory XSS vulnerability (Issue #11442) are fixed.

    2.7.2

    The last minor version of the 2.x line, 2.7, received over 150 bug fixes since 2.7.1 in November 2010. 2.7.2 source and binary installers are available as of June 12, which include the security fixes mentioned in 2.6.7.

    A number of crashes are fixed: a situation when Python incorrectly used non-Python managed memory while it was being modified by another thread, when deleting __abstractmethods__ from a class, accessing a memory-mapped file past its length, and several others.

    A fix to getpass corrects a regression with regard to CTRL-C and CTRL-Z handling. multiprocessing received a number of fixes, including treating Windows services like frozen executables and a correction to a race condition when terminating multiprocessing.Pool workers. mmap was fixed to work with file sizes and offsets larger than 4 GB even on 32-bit builds, and a TypeError is now raised rather than segfaulting when trying to write to a non-writeable map.
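
    The mmap behaviour can be seen with a small sketch. It assumes a Python 3 interpreter, where writing to a read-only map is likewise reported as a TypeError, and uses a throwaway temporary file rather than anything from the release itself.

```python
import mmap
import tempfile

with tempfile.TemporaryFile() as handle:
    handle.write(b"read-only data")
    handle.flush()

    # Map the whole file without write access.
    with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as view:
        first_bytes = view[:9]
        try:
            view.write(b"X")
            outcome = "written"
        except TypeError as exc:
            # The interpreter reports the misuse instead of crashing.
            outcome = "TypeError: %s" % (exc,)

print(first_bytes, outcome)
```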

    For a full list of changes, see the 2.7.2 news file.

    3.1.4

    3.1.4 is the last bug-fix release of the 3.1.x line, sending 3.1 into security-mode as the 3.2 line carries on. 3.1.4 contains over 100 bug fixes since the 3.1.3 release in November 2010. As with 2.7.2, binary installers are available as of June 12, and 3.1.4 is the first 3.x release to contain the security fixes listed in 2.6.7.

    3.1.4 corrects some problems with __dir__ lookups on objects, dates past 2038 in the Windows implementation of os.stat and os.utime, and a number of 64-bit cleanups. The io library saw a number of changes in returning None when nothing was read and raising appropriate exceptions in other spots. ctypes callback arguments were fixed on 64-bit Windows and a crash was also remedied.

    For a full list of changes, see the 3.1.4 news file.

    3.2.1

    3.2.1 is currently in the release candidate phase, with one round already completed and a second release candidate expected soon. We would greatly appreciate 3.2 users trying out the release candidates to ensure we cover any issues you may be seeing. If you have any bugs to report, please file them on bugs.python.org.

  • New faulthandler module in Python 3.3 helps debugging

    When a user reports that your program crashes or hangs, sometimes all you can do is try to collect more information and outline a scenario to reproduce the situation. Even with a reliable user scenario, as a developer you are often unable to reproduce the situation due to environment differences, e.g., operating system and compiler. If you are lucky, the user will be able to install debug tools, but most of the time you will have to wait until another person is able to obtain more information from the same situation.

    Fatal Errors

    A new module introduced in Python 3.3 should help with this problem: faulthandler. faulthandler provides the ability to dump the Python traceback on a fatal error such as a segmentation fault, division by zero, abort, or bus error. You can enable it inside your application using faulthandler.enable(), by providing the -X faulthandler option to the Python executable, or with the PYTHONFAULTHANDLER=1 environment variable. Output example:

    Fatal Python error: Segmentation fault
    
    Current thread 0x00007f7babc6b700:
      File "Lib/test/crashers/gc_inspection.py", line 29 in g
      File "Lib/test/crashers/gc_inspection.py", line 32 in <module>
    Segmentation fault
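
    A minimal sketch of the in-application route, assuming sys.stderr is a real file with a file descriptor (the default when a script is run from a terminal):

```python
import faulthandler

# Install handlers for fatal errors; tracebacks go to sys.stderr by default.
faulthandler.enable()

# is_enabled() reports whether the fatal-error handlers are installed.
print("faulthandler enabled:", faulthandler.is_enabled())
```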

    Timeout

    faulthandler can also dump the traceback after a timeout using faulthandler.dump_tracebacks_later(timeout). Call it again to restart the timer or call faulthandler.cancel_dump_tracebacks_later() to stop the timer. Output example:

    Timeout (0:01:00)!
    Current thread 0x00007f987d459700:
      File "Lib/test/crashers/infinite_loop_re.py", line 20 in <module>
    

    Use the repeat=True option to dump the traceback every timeout seconds, or exit=True to immediately exit the program in an unsafe fashion, e.g. without flushing files.
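
    The timeout mechanism can be sketched as a small watchdog wrapper. Note that released versions of Python 3.3 and later spell the functions dump_traceback_later() and cancel_dump_traceback_later(), without the plural used in this post, and the sketch uses those names. run_with_watchdog is a hypothetical helper, not part of faulthandler.

```python
import faulthandler

def run_with_watchdog(work, timeout):
    """Run work(); dump tracebacks to stderr if it is still running after timeout seconds.

    Hypothetical helper for illustration only.
    """
    faulthandler.dump_traceback_later(timeout, repeat=False, exit=False)
    try:
        return work()
    finally:
        # Stop the timer whether work() finished or raised.
        faulthandler.cancel_dump_traceback_later()

result = run_with_watchdog(lambda: sum(range(1000)), timeout=60)
print(result)
```

    Wrapping the call in try/finally keeps the timer from firing after the guarded work has already completed.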

    User Signal

    If you have access to the host on which the program is running, you can use faulthandler.register(signal) to install a signal handler to dump the traceback when signal is received. On UNIX, for example, you can use the SIGUSR1 signal: kill -USR1 <pid> will dump the current traceback. This feature is not available on Windows. Output example:

    Current thread 0x00007fdc3da74700:
      File "Lib/test/crashers/infinite_loop_re.py", line 19 in <module>
    

    Another possibility is to explicitly call faulthandler.dump_traceback() in your program.
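
    Both options can be sketched together. The guard around register() is there because the function does not exist on Windows; the temporary file is only an assumption made so the dump can be read back and printed.

```python
import faulthandler
import signal
import tempfile

# register() is unavailable on Windows, so check before calling it.
if hasattr(faulthandler, "register") and hasattr(signal, "SIGUSR1"):
    # After this call, `kill -USR1 <pid>` dumps the traceback of every thread.
    faulthandler.register(signal.SIGUSR1, all_threads=True)

# Explicit dump of the current thread, written to a temporary file.
with tempfile.TemporaryFile(mode="w+") as dump_file:
    faulthandler.dump_traceback(file=dump_file, all_threads=False)
    dump_file.seek(0)
    dump_text = dump_file.read()

print(dump_text)
```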

    Security Issues and the Output File

    faulthandler is disabled by default for security reasons, mainly because it stores the file descriptor of sys.stderr and writes the tracebacks into this file descriptor. If sys.stderr is closed and the file descriptor is reused, the file descriptor may be a socket, a pipe, a critical file or something else. By default, faulthandler writes the tracebacks to sys.stderr, but you can specify another file. For more information, see the faulthandler documentation.
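
    A minimal sketch of directing the output to a dedicated file instead of sys.stderr. The log path is a made-up example; the important point is that the file object stays open for as long as faulthandler is enabled, since faulthandler holds on to its file descriptor.

```python
import faulthandler
import os
import tempfile

# Hypothetical log location, chosen only for this example.
log_path = os.path.join(tempfile.gettempdir(), "crash_traceback.log")

crash_log = open(log_path, "a")
faulthandler.enable(file=crash_log)   # fatal-error tracebacks now go to crash_log
enabled_with_log = faulthandler.is_enabled()

# ... application code would run here ...

# Disable the handler before closing the file, so the descriptor
# cannot be reused by something else while faulthandler still holds it.
faulthandler.disable()
crash_log.close()
```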

    Third-party Module for Older Python Versions

    faulthandler is also maintained as a third-party module for Python 2.5 through 3.2 on PyPI. The major difference between the Python 3.3 module and the third-party module is the implementation of dump_tracebacks_later(): Python 3.3 uses a thread with a timeout on a lock, whereas the third-party module uses SIGALRM and alarm().

    The lock timeout, which is a new feature of Python 3.3, has a microsecond resolution. The alarm() timer used on older versions has a resolution of one second, and the SIGALRM signal may interrupt the current system call which will fail with an EINTR error.

    Early Success

    The new faulthandler module has already helped with tracking down race conditions in our buildbots. We hope that it will also help you in your programs.

  • The Python Core Mentorship Program

    Jesse Noller recently announced the formation of the Python Core Mentorship program. The idea behind the program is to help programmers, including students and developers from other projects, connect with experienced contributors who serve as mentors to ease them into Python Core development.

    Contributors Wanted

    The mentors will help people regardless of experience level by bringing them up to speed, answering questions, and giving guidance as needed in a non-confrontational and welcoming way. The contributors will receive guidance through the entire contribution process, including discussions on the related mailing lists, the bug tracker, Mercurial, code reviews, and much more.

    Early Success

    The program has already been successful, and the participants have actively committed a number of patches. There have also been several constructive discussions on the mailing list, helping guide people in the right direction for a variety of issues.

    Code of Conduct

    The program has a code of conduct, explained on the website, that aims to assuage the concerns many new contributors have about interacting with experienced developers and mailing lists, and about contributing in general. Jesse and the other mentors hope that this program can act as a model for other projects long-term, not just benefiting Python-Core. They also want the program to help increase the overall diversity of the contributors to Python.

    Signing Up

    The program is run via the mailing list and has a clear, concise website devoted to it. If you would like to join to ask questions and begin on the path of core contribution, or even if you are an experienced developer (even experienced in Python-Core) looking to ask questions you're worried about asking on other lists, this is an excellent opportunity to jump in, ask and get your feet wet!

  • Portuguese, German, Korean, and Traditional Chinese Translations

    The Python Insider translation project is continuing to grow! Today we are launching Portuguese, German, Korean, and Traditional Chinese versions of the blog. The translators have already started publishing the backlog of posts. As with the other translations, these parallel editions may lag slightly behind the original posts on Python Insider.

  • Romanian and Simplified Chinese Translations

    The Python Insider team is very excited to announce two new blogs today. Translators for Romanian and Simplified Chinese have joined the Translation Project, and have already started publishing the backlog of posts. As with the other translations, these parallel editions may lag slightly behind the original posts on Python Insider.

  • Jython Migrates to Mercurial

    Jython has finally migrated from Subversion to Mercurial. This has been a long time coming: unfortunately we had a difficult Subversion repo that took some effort to cleanly convert to a different revision control system.

    The new official Jython repo is now hosted at

    http://hg.python.org/jython

    with a BitBucket Mirror for easy forking.

    There's also a larger read-only repo with ongoing feature branches (converted to Mercurial Bookmarks) hosted at http://hg.python.org/jython-fullhistory

    Mercurial makes it even easier to contribute to Jython: pull up a fork and come help us build Jython 2.6!

  • Python 3.3 to Drop Support for OS/2, Windows 2000, and VMS

    Every so often there comes a time to prune the list of supported operating systems to match the usage landscape. On top of that, the pool of contributing developers on a platform also holds significance, as there needs to be someone around to complete development tasks in order to have a quality release. Other factors, such as the age of an operating system and its hindrance to future development work, also weigh on the list.

    Victor Stinner recently proposed dropping OS/2 and VMS support for CPython, a year after his original question on OS/2 support. Victor's original inquiry came around the time of his seemingly non-stop Unicode efforts, specifically for an issue with os.execvpe() supporting environment variables via the PEP 383 surrogateescape handler. OS/2 and VMS currently have no representation on the development team and receive no testing during the release process.

    The process of writing this post got me thinking about a previous discussion about removing Windows 2000, which seemed to fall by the wayside. Systems setting COMSPEC to command.com were also supposed to be on the chopping block back then. As of now, both have joined OS/2 and VMS. Windows 2000 is up for removal in order to make development work easier, removing the need to account for legacy APIs on an operating system which hit end-of-life in 2010.

    In order to begin removing support for those systems, Victor and I started by updating PEP 11.

    PEP 11

    This PEP outlines the operating systems that are no longer supported and explains the process of adding a system to that list.

    Once it is decided that an operating system can start the process of removal, it is formally announced as unsupported. This announcement traditionally goes for the in-development version, so dropping support of OS/2, Windows 2000, and VMS begins with Python 3.3.

    The first stage is fairly hands off, more of a raising of the white flag. It's a signal that there's no one around to maintain the code and ensure a quality release. Changes to compilation and installation may be made to alert users on those platforms that the platform is unsupported. A note will go into the "What's New" document listing the newly unsupported platforms.

    After a release cycle of being unsupported, the version afterwards becomes fair game for removal of code. In this case, code can be removed in 3.4. There probably won't be a wholesale removal of that code, but developers that come across it in their normal work may remove any #ifdef blocks, configure sections, or out-of-date code.

    What You Can Do

    If you are a user of OS/2 or VMS, there are a few things you can do to save your platform.

    Become a Maintainer

    Nothing says support better than an active developer. Andrew MacIntyre has been the OS/2 maintainer for some time now, and he stated during Victor's first OS/2 query that OS/2 is behind on Unicode support, so that's certainly an area that needs focus. VMS appears to have some amount of external support via http://www.vmspython.org, but as discussed in issue 11918, someone needs to step up to allow the continued VMS support upstream.

    If you are interested in taking over for either platform, see the developer's guide for the current development processes.

    Contribute a Build Slave

    With an active developer, a platform stands a better chance of survival. With a build slave, a platform stands an even better chance, not only at survival but also at quality.

    Python uses Buildbot for continuous integration, and build slaves are currently provided for Linux, Mac, Windows, and Open Indiana (Solaris), for various versions, architectures, and configurations. Being able to donate a machine to the build fleet for OS/2 or VMS would allow those platforms to receive the same attention that more mainstream platforms receive.

    If you can donate either time or hardware to help keep OS/2 and VMS alive, contact the python-dev mailing list to coordinate your efforts.

  • Python Insider Translation Project

    We think the content of this blog is useful for the whole Python community, so reaching as many people as we can is one of our priorities. To expand our reach, we have assembled a team of translators to create parallel editions of the blog in other languages. We are launching two translations today: Japanese and Spanish.

    The translations will lag a little behind the posts on Python Insider, but try to keep more or less up to date.

    Help Wanted

    The translation team is still very small, so we are looking for more people to join. We need people able to work on the existing languages, or to help us expand to other languages. If you can help in either way, contact Doug Hellmann (doug dot hellmann at gmail).

  • Meet the Team: Brian Curtin

    This post is part of the "Meet the Team" series of posts, which is meant to give a brief introduction to the Python core development team.

    Name: Brian Curtin
    Location: Chicago, IL
    Home Page: http://blog.briancurtin.com/

    How long have you been using Python?

    On a day to day basis going on 6 years. Prior to that I used it occasionally for a class in college and also at a summer internship.

    How long have you been a core committer?

    Just over a year. March 24 marked my first year with the group.

    How did you get started as a core developer? Do you remember your first commit?

    I got started after noticing a documentation bug while writing an extension module at work, then I submitted a simple patch and Georg Brandl committed it almost immediately. After having that quick success and a fresh source checkout, I wanted to dive in and learn more about the modules I was using and ended up writing a patch to add context manager support to zipfile.

    The first few commits I made were documentation fixes in order to keep it simple early on. My first code commit was to add a few features and expand test coverage in the winreg module.

    Which parts of Python are you working on now?

    As one of the few Windows users involved in CPython development, I try to keep an eye on whatever issues Windows users are having. Due to that, I've had a chance to work on a bunch of the standard library, including modules I hadn't used. I haven't done much with the interpreter itself, but I'm looking to change that.

    What do you do with Python when you aren't doing core development work?

    I build a variety of test tools for a trading database which is written in C++. There's an extension module for the data API so we can easily write regression tests, performance tests, and we're always trying to build more.

    What do you do when you aren't programming?

    I'm a huge baseball fan. I umpire college baseball in the spring, various leagues in the summer, and mix in watching and going to Chicago Cubs games.

  • Meet the Team: Nick Coghlan

    This post is part of the "Meet the Team" series of posts, which is meant to give a brief introduction to the Python core development team.

    Name: Nick Coghlan
    Location: Brisbane, Australia
    Home Page: http://www.boredomandlaziness.org

    How long have you been using Python?

    First encountered 1.5.2 around 1999 when our lecturer used it for a networking course. Started using 2.2 professionally for automated testing around 2002 and never looked back.

    How long have you been a core committer?

    Guido gave me access in 2005 to update PEP 343 (primarily ditching the context method).

    How did you get started as a core developer? Do you remember your first commit?

    As far as contributing patches goes, I had 3 months off in 2004 and spent a bunch of it working with Raymond and Facundo on the decimal module, primarily running the telco benchmarks and finding ways to speed up the code. A few of the stranger hacks in the decimal module (like the fast path for checking for special cases and the use of strings when converting tuples of digits to integers) stem from that time.

    My actual first commit would have been to PEP 343, and then after that probably to the AST compiler branch as we finished it up for inclusion in 2.5.

    Which parts of Python are you working on now?

    runpy, functools and contextlib are the main things that tend to end up in my inbox. I also keep an eye on what Brett and Victor are doing with import, what Raymond is doing with collections and itertools, and anything that happens to the compiler. I'm also fascinated by the cultural side of things.

    What do you do with Python when you aren't doing core development work?

    Not a great deal, actually. The Python stuff at work generally just ticks away doing its thing, so there isn't a lot of call to hack on it at the moment. I do want to do something to tidy up my digital music library, but the scripts for that are just a hack job at the moment.

    What do you do when you aren't programming?

    Tae kwon do, computer gaming, soccer, reading, etc, etc...

  • New Blog Design

    If you read Python Insider through a feed reader, you may not have seen the new page design Marcin Wojtczuk created for us. It looks great while maintaining a lightweight feel, and we couldn't be happier with the results.

    Thank you for your time and efforts, Marcin!

  • urllib Security Vulnerability Fixed

    Guido van Rossum recently pushed a fix for CVE-2011-1521, a security issue in Python's URL libraries. While security issues are rare, it's a good opportunity to let the community in on the process behind reporting, handling, and fixing these issues as they arise.

    Reporting an Issue

    If you've found a security issue within CPython, the first thing we ask is that you keep the details of the issue private. After determining that you have found a legitimate security issue, generating a succinct but detailed report is key to transferring your knowledge to the core developers.

    A good report clearly explains how the relevant parts of the system are affected by the issue. If the issue occurs on a specific platform or due to a dependency, that's helpful to know as well. The affected versions are useful to know, and it's likely that the vulnerability will be tested for all active versions as well. Lastly, if you have a test case that shows the issue, be sure to include it. Your report should be sent to the security@python.org group.

    Niels Heinen of the Google Security Team recently submitted a good report. He discovered an issue with HTTP 302 redirection handling in the standard library urllib and urllib2 modules. What he found was that a server could redirect requests to inappropriate schemes, leading to situations which could compromise data or systems. In his initial report, Niels explains two scenarios where these redirections could expose problems.

    First, since urllib/urllib2 supplies a handler for the file:// URL scheme, a redirection to file:///etc/passwd could expose password data. Niels also explained that redirection to a system device like file:///dev/zero could lead to exhaustion of resources leading to a denial of service.

    Handling a Report

    Due to the sensitive nature of security reports, the security@python.org list is maintained by a small group of trusted developers who analyze and act on reports as soon as possible. If you wish to keep your transmissions to the list encrypted, see the security news page for OpenPGP details.

    If the group determines that there is in fact a security issue, a public bug report may be made with an accompanying patch. In this case, Guido van Rossum made the issue public in issue #11662, complete with an initial patch.

    Fixing the Issue

    What Guido's patch does is restrict redirection to http://, https://, and ftp:// URL schemes. FTP redirection was deemed acceptable, and it's actually a common redirection: download mirroring systems sometimes redirect requests to geographically convenient FTP servers.

    For Python 2.x, FancyURLopener's redirect_internal method now raises an IOError when redirection to an inappropriate scheme is requested. HTTPRedirectHandler's http_error_302 does the same, only raising HTTPError. In Python 3, urllib.request received the same fixes. Included with the patch are two tests which exercise redirection to both valid and invalid schemes.
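
    The idea behind the fix can be sketched as a scheme allow-list. This is not the code from Guido's patch: check_redirect_target is a hypothetical helper, it raises ValueError rather than the IOError or HTTPError mentioned above, and it leaves out details such as relative redirects.

```python
from urllib.parse import urlsplit

# Schemes the post lists as acceptable redirect targets.
ALLOWED_REDIRECT_SCHEMES = {"http", "https", "ftp"}

def check_redirect_target(url):
    """Return url if its scheme is allowed, otherwise raise ValueError.

    Hypothetical helper for illustration; not part of urllib.
    """
    scheme = urlsplit(url).scheme.lower()
    if scheme not in ALLOWED_REDIRECT_SCHEMES:
        raise ValueError("redirect to %r scheme refused: %s" % (scheme, url))
    return url

print(check_redirect_target("ftp://mirror.example.org/file.tar.gz"))
try:
    check_redirect_target("file:///etc/passwd")
except ValueError as exc:
    print(exc)
```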

    As for users receiving the fix, the final security release of Python 2.5 will be occurring soon. While there are no scheduled dates for the next patch releases of the maintenance branches - 2.6, 2.7, 3.1, and 3.2 - all received the code to fix the vulnerability.

  • Meet the Team: Tarek Ziadé

    This post is part of the "Meet the Team" series of posts, which is meant to give a brief introduction to the Python core development team.

    Name: Tarek Ziadé
    Location: Turcey near Dijon, Burgundy, France
    Home Page: http://ziade.org

    How long have you been using Python?

    Around ten years

    How long have you been a core committer?

    Since December 21 2008

    How did you get started as a core developer? Do you remember your first commit?

    I started as a core developer in order to maintain Distutils and make it evolve.

    My first commit as a core developer was a fix for a small bug in a distutils feature I proposed before I became a committer. That feature was added the week before in Python. It's the ability to configure Distutils' register and upload commands to work with several pypi-like servers.

    I committed with my brand new rights on Wed, 24 Dec 2008, which happens to be my birthday, and also the 17th anniversary of the 0.9.4 release of Python.

    Which parts of Python are you working on now?

    In the stdlib: sysconfig, distutils, packaging (to be added in 3.3), shutil, pkgutil, and occasionally in other modules

    What do you do with Python when you aren't doing core development work?

    I work at Mozilla in the Service team, where I build web services using Python

    What do you do when you aren't programming?

    I read comics/graphic novels, write books, play with my kids, drink wines with my wife, and try to renovate my 1848's house.

  • Formalizing the AST Change Control Policy

    Python exposes an abstract syntax tree (AST) representing the compiled form of Python source code in the ast module. The ast module allows user code to inspect and manipulate the AST representation, in between parsing of source and compilation of bytecode.
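
    A short sketch of that parse, inspect, manipulate, compile cycle follows. The AddToMult transformer and the sample source are made up for the example.

```python
import ast

source = "result = 2 + 3"
tree = ast.parse(source)

# Inspect: collect the names of the node types found in the tree.
node_types = [type(node).__name__ for node in ast.walk(tree)]

class AddToMult(ast.NodeTransformer):
    """Illustrative transformer that turns every `+` into `*`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)          # transform nested expressions first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node

# Manipulate, then compile the modified tree to bytecode and run it.
new_tree = ast.fix_missing_locations(AddToMult().visit(tree))
namespace = {}
exec(compile(new_tree, "<ast>", "exec"), namespace)

print(node_types)
print(namespace["result"])
```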

    Although the meaning of Python code is defined by the language reference, the AST module is a CPython implementation detail, and is not required to be implemented in other Python implementations.

    Compatibility of the AST

    As part of work to rewrite the CPython peephole optimizer to work on the AST (rather than on the raw bytecode, as is currently the case), Eugene Toder needed to make some changes to the structure of the AST. As a CPython implementation detail, it wasn't immediately clear what backward compatibility policies applied to the AST. So, Eugene asked the question on python-dev. Was it necessary, when changing the AST, to ensure backward compatibility?

    The general consensus was that compatibility is not required. The AST module exposes a constant, ast.__version__, which provides a means for user code to vary its behaviour depending on the version of the AST it encounters. This was viewed as sufficient compatibility for an implementation-specific module.
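
    User code that wants to key off the AST version might look roughly like the sketch below. Since ast.__version__ is not present in every Python release, the lookup uses getattr and falls back to the interpreter version; ast_flavour is a hypothetical helper.

```python
import ast
import sys

def ast_flavour():
    """Return a string identifying the AST variant in use (hypothetical helper)."""
    version = getattr(ast, "__version__", None)
    if version is not None:
        return "ast-%s" % version
    # Fallback when the constant is absent: use the interpreter version.
    return "python-%d.%d" % sys.version_info[:2]

flavour = ast_flavour()
print(flavour)
```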

    Other Python Implementations

    In actual fact, both Jython and IronPython maintainers pointed out that their respective implementations either had a compatible AST module, or intended to provide one. Even so, they did not feel that this meant that the AST should be frozen, and were happy that as long as the ast.__version__ constant changed, the AST could be modified in incompatible ways.

    One point that was raised is that a full suite of tests in test_ast.py would help other implementations to ensure that their AST representations were compatible with CPython. Increasing the coverage of test_ast.py would make a good project for someone who wanted to get involved with Python internals!

    What Will Happen Next?

    The patch which started the discussion is not yet included in CPython. So possibly, nothing will happen. However, if it does get committed, the AST will change in an incompatible way. The ast.__version__ constant will change to reflect this, so user code will know, but changes will be needed. More generally, this will be the way AST changes will be handled in future.

    The Python developers are interested in how widely the AST is used, and how much impact this policy will have. If any readers have code that will be affected by the change, they are encouraged to participate in the discussion on python-dev.

  • Thomas Heller Steps Down as ctypes Maintainer

    The Python development community owes a big thanks to long-time ctypes maintainer Thomas Heller. Earlier this month, Thomas announced his departure from the CPython project, the home of his ctypes library since Python 2.5.

    I had a chance to talk with Thomas and he filled me in on his history with Python and his ctypes and py2exe projects.

    Python

    Back in 1999, Thomas came across Mark Lutz's Programming Python while looking for a resource to learn Python and became fascinated with the language right away. He was in the process of replacing Scheme as the extension language for a large C program he had written for Windows.

    As for how he got involved in the development team, his first contribution to CPython (and open source in general), was a small Windows-related patch to distutils. His interest in distutils ultimately led him to the creation of the bdist_wininst command to create point-and-click Windows installers. From there, Greg Ward invited him to the python-dev group where he eventually received commit access.

    py2exe

    Like many Windows users, he had the need to deploy shrink-wrapped Python applications as a single executable file. Early approaches to the problem included squeeze by Fredrik Lundh and sqfreeze by Christian Tismer, both Python luminaries, and Thomas contributed several patches to Gordon McMillan's Installer project.

    His interest in distutils led Thomas to consider porting Installer to an extension to the packaging library. However, he ended up rewriting the source in order to make use of the existing distutils framework. In the end, he chose the simple yet descriptive name py2exe for the project.

    ctypes

    The idea for ctypes came from a need to go beyond what pywin32 provided at the time. Additionally, his work with Scheme required an interface to Windows APIs much like his Python work did, so he wanted to keep his project going.

    ctypes saw its first public release in 2003 around the release of Python 2.3, after Thomas received numerous requests to publish the project. He mentioned what used to be his small personal project on his Starship page, but it grew into a widely used library in no time.

    He originally started the project on Windows but quickly heard calls for a Linux port, which the community helped him complete. With the Linux port came the introduction of libffi to the project, which he also began using on Windows to replace its lower-level implementation.

    2006 marked a 1.0 release for ctypes, which corresponded with the library's acceptance into the standard library in Python 2.5. After years of hard work and numerous releases per year, ctypes was now bundled with Python and available by default to a much wider audience.

    It took a lot of people to get ctypes to where it is today, and Thomas wants to thank everyone involved, especially Robin Becker. Robin was instrumental in the early phases of the project and contributed both knowledge and encouragement.

    A New ctypes Maintainer

    After all of the hard work Thomas put in over the years, we would hate to see the project come to a standstill. If you have C experience and time to help out the Python project, the community would greatly appreciate your effort. Check out the new developer guide and search the bug tracker for more information.

    Updated: Fixed some links.

  • Meet the Team: Benjamin Peterson
    This post is part of the "Meet the Team" series of posts, which is meant to give a brief introduction to the Python core development team.
    Name: Benjamin Peterson
    Location: Minnesota, USA
    Home Page: http://benjamin-peterson.org
    Blog: http://pybites.blogspot.com

    How long have you been using Python?

    3.5 years.

    How long have you been a core committer?

    Exactly 3 years this March 25th.

    How did you get started as a core developer? Do you remember your first commit?

    My first proposal was personally rejected by Guido himself. Luckily, I persisted and got some patches accepted. I believe my first commit was reordering the Misc/ACKS file.

    Which parts of Python are you working on now?

    I like the parser, compiler, and interpreter core, but I've been known to dabble in just about every part of core Python development... except Windows!

    What do you do with Python when you aren't doing core development work?

    I use it to implement a Python interpreter (http://pypy.org)! Truly, I'm a Python implementor at heart. :) I am the creator of six (http://pypi.python.org/pypi/six), a Python 2 and 3 compatibility library.

    What do you do when you aren't programming?

    Compose music, play clarinet, and read math books. I do a little hiking now and then, too.

  • Deprecations between Python 2.7 and 3.x

    Recent discussion on python-dev highlighted an issue with Python's current deprecation policy facing developers moving from Python 2.7 to current versions of Python 3.x. As a result of this issue, the development team has modified the current deprecation policy to take into account the fact that Python users will normally migrate directly from Python 2.7 to the latest version of 3.x without ever seeing older versions.

    Background

    Python has a strong commitment to backward compatibility. No change is allowed unless it conforms to compatibility guidelines, which in essence say that correct programs should not be broken by new versions of Python. However, this is not always possible, e.g., where an API is clearly broken and needs to be replaced by something else. In this case, Python follows a deprecation policy based on a one-year transition period where features to be removed are formally deprecated. In the intermediate period, a deprecation warning must be issued to allow developers time to update their code. Full details of Python's deprecation policy are documented in PEP 5. As changes are only made in new Python releases, and there is normally an 18-month gap between releases, this means that a one-release deprecation period is the norm.
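
    In Python code, issuing such a warning during the transition period typically looks like the sketch below. old_api and new_api are hypothetical names, not functions from the standard library.

```python
import warnings

def new_api(value):
    """Replacement function (hypothetical)."""
    return value * 2

def old_api(value):
    """Deprecated function kept during the transition period (hypothetical)."""
    warnings.warn(
        "old_api() is deprecated; use new_api() instead",
        DeprecationWarning,
        stacklevel=2,   # attribute the warning to the caller, not to old_api itself
    )
    return new_api(value)

# Record warnings so the deprecation can be observed.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_api(21)

print(result, [str(w.message) for w in caught])
```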

    The one exception to this policy was Python 3. The major version change from Python 2 to Python 3 was specifically intended to allow changes which broke backward compatibility, to allow the Python developers the chance to correct issues which simply couldn't be fixed within the existing policy. Examples include making strings Unicode by default and returning iterators instead of lists.
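
    Both examples are easy to observe in Python 3. The snippet below is only an illustration of the resulting behaviour, with arbitrary sample values:

```python
# In Python 3, a plain string literal is Unicode text.
text = "caf\u00e9"
text_length = len(text)            # counts characters, not bytes

# Built-ins such as map() and zip() return lazy iterators rather than lists.
doubled = map(lambda x: x * 2, [1, 2, 3])
is_list = isinstance(doubled, list)
doubled_values = list(doubled)     # materialise the iterator explicitly

print(text_length, is_list, doubled_values)
```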

    Parallel Lines of Development

    It was known that the transition to Python 3 would take time, 5 years by many estimates, so there was going to be some amount of parallel development on 2 and 3.

    With Python 2.7 being the final release of Python 2, it was agreed upon that the maintenance period would be extended for a substantial period. In the end, developers who want to move to a newer version of Python will need to make the jump to Python 3.

    Here lies one of the problems...

    Surprise Deprecations

    In a thread on python-dev, a poster pointed out that one specific function in the C API, PyCObject_AsVoidPtr, was removed with what appeared to be insufficient warning. And yet, this is what the deprecation policy was supposed to protect against! What happened?

    The change was part of a larger migration from an older API (PyCObject) to a newer, improved one (PyCapsule). The problem is that PyCObject is the default, and indeed, only API available in Python 2.6. It went on to be deprecated in Python 2.7. In Python 3.2, that API doesn't exist and the new PyCapsule should be used. That gives a deprecation period from the release of Python 2.7 (July 2010) to the release of Python 3.2 (February 2011) - about 7 months. That is a lot less than the minimum 12-month period, and makes it difficult for developers to support a reasonable range of Python releases.

    For someone moving from 3.0 to 3.1 then 3.2, the deprecation path is fine. Python 3.1 was released in March 2010 with the deprecation, and so in the 3.x release series, a deprecation period of almost 12 months was available. However, that's not what people really do: they go from 2.7 straight to the latest version of 3.x, in this case 3.2, resulting in this problem. This was never the intention of python-dev, but PEP 5 had not been written with parallel versions of Python, both of which were under active development, in mind.

    So what do we do?

    The PyCObject/PyCapsule API break is a definite problem. It's not impossible to work around, but at least one poster on python-dev ran into real difficulties dealing with it. Overall, this shouldn't have happened.

    For the specific case of PyCObject/PyCapsule, the problem already exists and there is not much that can be done. Reinstating PyCObject was not really an option, as that would only add further incompatibilities. However, the general view was that it is possible, albeit tedious, to write code to adapt to whichever API is available. In fact, in Python 3.1, the PyCObject API was written as a wrapper over the PyCapsule API. There was a suggestion that should anyone need it, the Python 3.1 implementation could be extracted for use in 3rd party code. Additionally, it was agreed that a "retroactive" PEP covering the change would be written, to describe the reasons behind the change and document resources which can help developers migrate.

    On a more general note, the Python development team is now aware of the problem and will work to avoid it reoccurring. Guido posted a review of the situation and suggested that Python 3 should be conservative in the use of deprecations for the moment. At a minimum, deprecated APIs will be retained substantially longer before being removed, to give developers moving from 2.7 a migration path.

    More indirectly, the thread raised the issue of how to more effectively communicate changes in Python to a wider audience, in a more timely manner - an issue that this blog was formed precisely to address.

    What does all this mean?

    First and foremost, it means that the Python developers don't always get everything right. Nobody meant to make life harder for developers; it just wasn't something that was spotted in time.

    Secondly, fixing the problem can do more harm than good, so the PyCObject API is not being reinstated. While reinstatement might help developers who were bitten by the change, overall it would make compatibility issues more complex. In the meantime, we have to put up with the issue and move on. Lessons were learned, and we won't make the same mistake next time.

    One thing this shows is that the Python development team wants to hear from the users. Compatibility is very important, and every effort is made to make the transition to new versions as painless as possible. In particular, library developers should be able to support multiple Python versions with a reasonable level of effort.

    Finally, the developers haven't abandoned 2.7. While it won't be getting new features and there will be no 2.8, the views of people using 2.7 are still important. Making sure users can move to 3.x when they are ready is vital for the whole Python community.

  • Of polling, futures and parallel execution

    One of the big concerns in modern computing is saving power. It matters a lot in portable devices (laptops, tablets, handhelds). Your modern CPU is able to enter a number of different low-power states when it is idle. The longer it stays idle, the deeper the low-power state, the lower the energy consumed and, therefore, the longer the battery life of your device on a single charge.

    Low-power states have an enemy: polling. When a task periodically wakes up the CPU, even for something as trivial as reading a memory location to check for potential changes, the CPU leaves the low-power state, wakes up all its internal structures, and will only re-enter a low-power state long after your menial periodic wakeup has finished its intended work. This kills battery life. Intel itself feels concerned.

    Python 3.2 comes with a new standard module to launch concurrent tasks and wait for them to end: the concurrent.futures module. While perusing its code, I noticed that it used polling in some of its worker threads and processes. I'm saying "some of", as the implementation differs between the ThreadPoolExecutor and the ProcessPoolExecutor. The former did polling in each of its worker threads, while the latter only did so in a single thread named the queue management thread, which is used to communicate with the worker processes.
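
    For readers who have not used the module yet, here is a minimal sketch of launching tasks and waiting for them with ThreadPoolExecutor. The square function and the pool size are arbitrary placeholders.

```python
from concurrent.futures import ThreadPoolExecutor


def square(n):
    """Stand-in for real work (hypothetical task)."""
    return n * n


# Leaving the with-block shuts the executor down and waits for pending tasks.
with ThreadPoolExecutor(max_workers=2) as executor:
    futures = [executor.submit(square, n) for n in range(5)]
    results = [f.result() for f in futures]  # blocks until each task is done

print(results)
```

    Swapping in ProcessPoolExecutor keeps the same interface, which is why the shutdown behaviour of both implementations matters.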

    Polling here was only used for one thing: detecting when the shutdown procedure should be started. Other tasks such as queueing callables or fetching results from previously queued callables use synchronized queue objects. These queue objects come from either the threading or multiprocessing module depending on which executor implementation you are using.

    So, I came up with a simple solution: I replaced this polling with a sentinel, the built-in sentinel named None. When a queue receives None, one waiting worker is naturally woken up and checks whether it should shut down or not. In the ProcessPoolExecutor, there is a small complication as we need to wake up N worker processes in addition to the single queue management thread.
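
    The general sentinel technique can be sketched with plain threads and queues. This is not the concurrent.futures code itself; the worker function, the doubling task, and the choice of sending one None per worker are simplifying assumptions made for the example.

```python
import queue
import threading


def worker(tasks, results):
    """Block on the queue; a None item is the signal to shut down."""
    while True:
        item = tasks.get()  # no timeout: the thread sleeps until an item arrives
        if item is None:
            break
        results.put(item * 2)


tasks = queue.Queue()
results = queue.Queue()
threads = [threading.Thread(target=worker, args=(tasks, results)) for _ in range(3)]
for t in threads:
    t.start()

for n in range(5):
    tasks.put(n)

# One sentinel per worker, since each None wakes up exactly one waiting thread.
for _ in threads:
    tasks.put(None)
for t in threads:
    t.join()

doubled = sorted(results.get() for _ in range(5))
print(doubled)
```

    Because the workers block in get() without a timeout, they cause no periodic wakeups while the queue is empty.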

    In my initial patch, I still had a polling timeout, a very large one (10 minutes), so that the workers would wake up at some point. The large timeout existed in case the code was buggy and the workers didn't get a shutdown notification through the aforementioned sentinel when they should have. Out of curiosity, I dove into the multiprocessing source code and came to another interesting observation: under Windows, multiprocessing.Queue.get() with a non-zero, non-infinite timeout uses...polling (for which I opened issue 11668). It uses an interesting high-frequency kind of polling, since it starts with a one-millisecond timeout which is incremented at every loop iteration.

    Needless to say, still using a timeout, however huge, would render my patch useless under Windows, since the way that timeout is implemented would involve wakeups every millisecond. So I bit the bullet and removed the huge polling timeout. My latest patch doesn't use a timeout at all, and therefore should cause no periodic wakeups, regardless of the platform.

    Historically speaking, before Python 3.2, every timeout facility in the threading module, and therefore in much of multiprocessing since multiprocessing itself uses worker threads for various tasks, used polling. This was fixed in issue 7316.

  • 2011 Language Summit Report

    This year's Language Summit took place on Thursday March 10 in Atlanta, the day before the conference portion of PyCon began. In attendance were members of the CPython, PyPy, Jython, IronPython, and Parrot VMs; packaging developers from Fedora, Ubuntu, and Debian; developers of the Twisted project, and several others.

    Development Blog

    One of the first orders of business was discussion of this very blog, initiated by PSF Communications Officer Doug Hellmann. Due to the high-traffic and often intense nature of the python-dev mailing-list, the blog hopes to be an easier way for users to get development news. We plan to cover PEPs, any major decisions, new features, and critical bug fixes, and will include informal coverage of what's going on in the development process.

    Posting to the blog is open to all implementations of Python. For example, while PyPy already has their own active blog, they are welcome to have news posted here as well. A related side discussion led to the alternative implementations also being mentioned on the python.org download page. Their releases will also be listed as news items on the python.org front page.

    Compatibility Warnings

    With 3.2, we introduced ResourceWarning to allow users to find areas of code that depend on CPython's reference counting. The warning not only helps users write better code, but allows them to write safer cross-VM code. To further cross-VM compatibility, a new warning type was suggested: CompatibilityWarning.
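
    A small sketch of how ResourceWarning surfaces code that leans on reference counting. The temporary file is just a prop, and the exact moment the warning fires depends on the implementation: on CPython it is emitted as soon as the last reference goes away, while other VMs may emit it later or not at all.

```python
import gc
import os
import tempfile
import warnings

# Create a throwaway file so there is something to open.
fd, path = tempfile.mkstemp()
os.close(fd)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always", ResourceWarning)

    leaked = open(path)
    del leaked     # CPython's reference counting finalizes the file right here
    gc.collect()   # other VMs may only notice once their collector runs

    with open(path) as managed:  # portable: closed explicitly, no warning
        managed.read()

os.remove(path)
unclosed = [w for w in caught if issubclass(w.category, ResourceWarning)]
print(len(unclosed))
```

    Outside of a test harness, the same warnings can be switched on for a whole run with the interpreter's -W option, for example -W default::ResourceWarning.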

    The idea came up due to a recently filed CPython bug found by the PyPy developers. Issue #11455 explains a problem where CPython allows a user to create a type with non-string keys in its __dict__, which at least PyPy and Jython do not support. Ideally, users could enable a warning to detect such cases, just as they do with ResourceWarning.

    Standalone Standard Library

    Now that the transition of CPython's source from Subversion to Mercurial has been completed, the idea of breaking out the standard library into its own repository was resurrected. The developers of alternative implementations are very interested in this conversion, as it would greatly simplify their development processes. They currently take a snapshot from CPython and apply any implementation specific patches, replace some C extensions with pure Python versions, etc.

    The conversion will need to be laid out in an upcoming PEP, and one of the discussion points will be how versioning will be worked out. Since the library will live outside of any of the implementations, it would likely be versioned by itself, and the tests will need version considerations as well.

    Another topic for the standard library breakout was pure Python implementations and their C extension counterparts. Maciej Fijalkowski of the PyPy project mentioned that over time, some modules have had minor feature differences between their C and Python versions. As discussion of the breakout goes on, the group suggested a stricter approach to changing such modules, so as not to penalize the use of one or the other. Additionally, a preference for pure Python implementations was agreed on, with C implementations being created only in the event that a performance gain is achieved.

    Performance Benchmark Site

    The PyPy Speed Center has done a great job of showing PyPy's performance results, and some discussion was had about hosting a similar site on python.org, possibly as performance.python.org for all VMs to take part in. In addition to performance benchmarks, others such as memory usage, test success, and language compatibility should be considered. Some effort will be needed to adapt the infrastructure to work with multiple Python implementations, as it currently tests PyPy vs. CPython.

    Talk of putting some high-performance machines in the Open Source Lab at Oregon State University, where Allison Randal is on the board, came up as a target for where the new Speed Center could live. Jesse Noller mentioned efforts to obtain hardware to put in the lab -- donations welcome!

    If you or your organization are interested in donating for this cause or others, please contact the Python Software Foundation and check out our donations page.

    Moratorium Lifted

    With the start of development on CPython 3.3, the moratorium on language changes has been lifted. While the floodgates are open, language changes are expected to be conservative while we try to slow the rate of change and continue to allow alternative implementations to catch up. Although no one caught up to the 3.x line during the moratorium, PyPy and IronPython recently reached 2.7 compatibility, and IronPython is beginning down the road to 3.x.

    As for what language changes are expected in 3.3, look forward to seeing PEP 380 accepted. The PEP introduces a new yield from <expr> syntax, allowing a generator to yield to another generator. Other than this, no other language changes are expected in the near future.
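
    A minimal sketch of the syntax described by PEP 380, assuming an interpreter that supports it (Python 3.3 or later). The generator names inner and outer are illustrative only.

```python
def inner():
    """Sub-generator: yields two values, then returns a final result."""
    yield 1
    yield 2
    return "inner done"


def outer():
    """Delegating generator: 'yield from' re-yields everything inner() yields."""
    outcome = yield from inner()  # the sub-generator's return value lands here
    yield outcome


values = list(outer())
print(values)
```

    Without the new syntax, outer() would need an explicit loop over inner() and would have no direct way to receive its return value.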

    Exception Attributes

    The next topic was a quick discussion on exceptions providing better attributes, rather than forcing users to rely on string messages. For example, on an ImportError, it would be useful to have easy access to the import which failed, rather than parsing to find it.

    The implementation will likely rely on a keyword-only argument when initializing an exception object, and a patch currently exists for the ImportError case.
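
    As a sketch of the idea, the following uses the name and path attributes that ImportError carries in Python 3.3 and later. Whether this matches the patch discussed at the summit in every detail is an assumption; the module and plugin names are made up for the example.

```python
try:
    import module_that_does_not_exist  # hypothetical, deliberately missing
except ImportError as exc:
    failed_name = exc.name  # set by the import system, no message parsing needed

print(failed_name)

# The attributes are supplied as keyword-only arguments when raising by hand.
manual = ImportError("could not load plugin", name="plugin", path="/tmp/plugin.py")
print(manual.name, manual.path)
```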

    Contributor Agreements

    Contributor agreements were also mentioned, and some form of electronic agreement is underway. Google's individual contributor agreement was one of several inspirations for what the new system should work like. The topic has been long discussed, and many people are looking forward to a resolution in this area. Additionally, research is being done to ensure that any move to an electronic agreement remains valid in non-US jurisdictions.

    Google Summer of Code

    Martin von Löwis took a minute to introduce another year of Google Summer of Code under the PSF umbrella. Developers are encouraged not only to act as mentors, but also to propose projects for students to work on -- and remember that suggesting a project does not imply that you will mentor it. If you are interested in helping in any way, see the PSF's Call for Projects and Mentors.

    Distutils

    Distutils2 came up and Tarek Ziadé mentioned that their sprint goal was to finish the port to Python 3 and prepare for the eventual merger back into the Python standard library. Additionally, with the merge comes a new name: packaging. The packaging team also plans to provide a standalone package, still called Distutils2, supporting Python 2.4 through 3.2.

    The packaging sprint, which was one of the larger groups at the PyCon sprints, was very successful. Their current results are on Bitbucket, awaiting the standard library merge.

    The Future of Alternative VMs

    IronPython mentioned their future plans, and a 3.x release is next on their plate. They announced their 2.7.0 release at PyCon, their first community-based release since the project was handed off from Microsoft, and will be starting towards 3.x over the next few months.

    Jython recently came out with a 2.5.2 release and have begun planning on a 2.6 release. Some suggested that they jump to 2.7, as the differences between 2.6 and 2.7 aren't all that great, but it may take longer to get a first release if they jump. "Release early, release often" was one of the quotes coming out of the talk, and they might be able to get away with going 2.6 to 3.x and considering any 2.6 to 2.7 differences after the fact.

    Development Funding

    Coming out of the 3.x planning talks was the topic of funding for development work and how it might be able to speed up some of the alternative implementations getting to 3.x. While funds are available, a proposal to the PSF has to be made before anything can be discussed. Those interested in receiving grants for these efforts should contact the PSF board.

    Baseline Python

    Jim Fulton began a discussion on what he called "baseline" Python. In his experience deploying Python applications, he has found the system Python to be unpredictable and difficult to target. With Fedora and Ubuntu/Debian packaging experts on-hand, we were able to get a look into why things are the way they are.

    For Fedora, the base Python install has the Live CD in mind, so it's a very minimal installation with few dependencies, basically the bare minimum to allow the system to run. Additional differences show up in directory layouts, in the removal of standard library modules like distutils, and in distributions providing out-of-date libraries.

    There didn't appear to be a clear-cut solution right away, but the relevant parties will continue to work on the problem.

    3.3 Features

    Some thoughts for 3.3 features came up, including two PEPs. PEP 382, covering Namespace Packages, should appear at some point in the cycle. It was also mentioned during the distutils merger topic.

    PEP 393, defining a flexible string representation, was also up for discussion and has some interested students as a GSoC project. Along with the implementation, some effort will need to be placed on the performance and memory characteristics of the new internals in order to see if they can be accepted.

    Unladen Swallow

    Unladen Swallow is currently in a "resting" state and will not be included in CPython 3.3 as-is. To make further progress, we would need to identify several champions, as the domain experts are unavailable to do the work. During the discussion, it was again mentioned that if funding is what it would take to push Unladen Swallow to the next level, interested parties should apply to the PSF.

    While Unladen Swallow is in its resting state and has an uncertain future, the project provided a large benefit to the Python and general open source community. The benchmark suite used by Unladen Swallow is very useful for testing alternative implementations, for example. Additionally, contributions to LLVM and Clang from the Unladen Swallow developers helped out those projects as well.

    Two other performance ideas were also briefly discussed, including Dave Malcolm's function inlining proposal. Martin von Löwis mentioned a JIT extension module he has in the works, although the PyPy developers expressed skepticism of the effectiveness of a JIT of this kind.

    Paving a Path to Asynchronous Frameworks

    Ending the day was a discussion of some level of integration of Twisted into the standard library. The main idea is to provide an alternative to asyncore which allows for an easier transition to Twisted or other asynchronous programming frameworks.

    The process will be laid out in an upcoming PEP, which some suggested would serve a purpose similar to the WSGI reference but for asynchronous event loops. Along with the PEP author(s), the Twisted project and others will need to put in effort to ensure everyone is on the same page.

    More Information

    For more information, see CPython developer Nick Coghlan's rough notes and highlights.