Nuitka - The Python Compiler

by Kay Hayen for EuroPython 2012

With Nuitka, for the first time, there is a consistently executed approach to statically translating the full extent of the Python language, with all its special cases, without introducing a new or reduced version of Python.

It is compiled, but with practically 100% compatibility. Function dictionaries work, code objects exist, the frame stack works, and so do exception tracebacks, eval, exec, closures, nested functions, metaclasses, etc. It’s all there, and it behaves identically.
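To make that list concrete, here is a minimal sketch of the kind of introspection a compatible implementation has to keep working. The names (`outer`, `inner`, `checks`) are made up for illustration, and this is not taken from Nuitka's test suite; it only shows standard CPython behaviour that a compiled program would be expected to reproduce.

```python
import sys


def outer(factor):
    def inner(value):
        # 'factor' is captured from the enclosing scope (a closure).
        return value * factor

    return inner


def current_function_name():
    # Frame objects expose the code object of the running function.
    return sys._getframe().f_code.co_name


double = outer(2)
double.note = "stored in the function dictionary"

# Each entry is True when the feature behaves as it does in CPython.
checks = {
    "function dictionary": double.__dict__ == {"note": "stored in the function dictionary"},
    "code object": double.__code__.co_name == "inner",
    "closure": double.__closure__[0].cell_contents == 2,
    "frame stack": current_function_name() == "current_function_name",
    "eval": eval("double(21)", {"double": double}) == 42,
}

for feature, ok in checks.items():
    print(feature, "works" if ok else "differs")
```

Running the same file interpreted and compiled, and comparing the output, is one simple way to see what "behaves identically" is meant to cover.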

First, I would like to start out and explain how I came to write a Python compiler, why I want it to be 100% compatible, and why I find deviations from Python unacceptable and out of scope.

Then I would like to describe where the difficulties in the implementation were, which Python constructs surprised me, and where the mapping from Python to C++ left something to be desired.

In this project, I learned a lot about Python; it wasn’t easy to get the full CPython test suite to run. In doing that, I have learned anecdotes and fine details of Python that are normally hidden in daily programming, but are still useful to know.

In particular, the work on re-formulating “with” statements, “assert”, and “while”/“for” as generic loops gives an interesting view of Python itself. I would like to present it for the insight it gives into Python.
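As an illustration of what such re-formulations look like, the sketch below rewrites a “with” statement, a “for” loop, and an “assert” in terms of simpler constructs. The “with” expansion follows the one given in PEP 343; whether Nuitka uses exactly these forms internally is an assumption, and the helper names (`Resource`, `with_reformulated`, and so on) are hypothetical.

```python
import sys


class Resource:
    """Tiny context manager that records what happens to it."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append("exit")
        return False  # do not swallow exceptions


def with_original():
    res = Resource()
    with res as r:
        r.events.append("body")
    return res.events


def with_reformulated():
    # "with" expressed through try/except/finally, as in PEP 343.
    res = Resource()
    manager = res
    exit_method = type(manager).__exit__
    r = type(manager).__enter__(manager)
    completed_normally = True
    try:
        try:
            r.events.append("body")
        except BaseException:
            completed_normally = False
            if not exit_method(manager, *sys.exc_info()):
                raise
    finally:
        if completed_normally:
            exit_method(manager, None, None, None)
    return res.events


def for_original(items):
    total = 0
    for item in items:
        total += item
    return total


def for_reformulated(items):
    # "for" expressed as a generic loop over an explicit iterator.
    total = 0
    iterator = iter(items)
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            break
        total += item
    return total


def assert_reformulated(value):
    # "assert value > 0, msg" expressed as a conditional raise.
    if __debug__:
        if not value > 0:
            raise AssertionError("value must be positive")
```

Once everything is reduced to a few generic forms like these, an optimizer only has to understand the generic forms, which is the appeal of the approach.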

I will give an overview of the newly developed infrastructure, which aims for type inference at compile time, and show what exists already. I will try to explain why I hope to have picked the right approach in this domain.

An interesting side note is the approach of using XML representations of Nuitka’s internal node tree to discover regressions and changes in the optimizer.
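The idea can be sketched with Python's own `ast` module standing in for Nuitka's internal node tree. The XML layout and the function names (`node_to_xml`, `dump_xml`, `tree_diff`) are invented for this example and are not Nuitka's actual format; the point is only that a textual tree dump can be diffed to spot changes.

```python
import ast
import difflib
import xml.etree.ElementTree as ET
from xml.dom import minidom


def node_to_xml(node):
    """Convert an ast node into an XML element, recursively."""
    element = ET.Element(type(node).__name__)
    for name, value in ast.iter_fields(node):
        if isinstance(value, ast.AST):
            ET.SubElement(element, name).append(node_to_xml(value))
        elif isinstance(value, list):
            child = ET.SubElement(element, name)
            for item in value:
                if isinstance(item, ast.AST):
                    child.append(node_to_xml(item))
                else:
                    ET.SubElement(child, "value").text = repr(item)
        elif value is not None:
            # Plain values (names, constants) become attributes.
            element.set(name, repr(value))
    return element


def dump_xml(source):
    """Parse source code and return its tree as indented XML text."""
    raw = ET.tostring(node_to_xml(ast.parse(source)))
    return minidom.parseString(raw).toprettyxml(indent="  ")


def tree_diff(before, after):
    """Return unified diff lines between the XML dumps of two sources."""
    return list(
        difflib.unified_diff(
            dump_xml(before).splitlines(),
            dump_xml(after).splitlines(),
            lineterm="",
        )
    )


# Formatting changes leave the tree untouched; structural changes show up.
print(tree_diff("x = 3", "x   =   3"))
print("\n".join(tree_diff("x = 1 + 2", "x = 3")))
```

Applied to the tree before and after an optimization pass, the same kind of diff would show exactly which nodes the optimizer touched.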

Then I will also present a project road map, with the milestones for Nuitka, and why I believe this is the right plan, and how Nuitka is different from projects like “Cython” and “PyPy”.

To round things off, I would like to give a demonstration of Nuitka, and an example of how easy it should be to contribute.

This will be the first time Nuitka is introduced at PyCON EU (so far it was only shown at PyCON DE 2011). To celebrate that, the current GPLv3 license will be lifted and replaced with the Apache 2.0 license (ASF), which is entirely liberal.

Video

Comments

This project looks completely misguided. The talk focused on the trivialities of mapping Python to C++ rather than on the interesting problems to be encountered when trying to optimize Python while maintaining its extremely dynamic semantics. Also the benchmarking effort is laughable; pystone is not to be taken seriously (only exercises a tiny part of the language) and pybench does microbenchmarks, which are optimized away. You should try the "real-world" benchmarks from the PyPy and Unladen Swallow projects. And what is the size of the generated code? (E.g. how big would the binary for the entire standard library be?) In your blog, please use less boring subjects than "version x.y.z released".
Guido van Rossum, 04 July 2012 #
I didn't want to get into premature optimization, so I didn't focus a lot on benchmarks. I did it for pystone because it's quite easy, and because some more limited projects could also handle it.

Actually, even now, I am still doing structural changes, e.g. the re-formulation work, which will make optimizations easier, and then I discover that I need more structural changes.

Answering the whole stdlib question is a bit tough. But "hg.exe" on Linux is 24MB, so sizes are acceptable.

As for interesting blog subjects, well yes, but you have to be aware, that this is a spare time effort.

If you are interested in my plans, you can find them in the Developer Manual, at http://www.nuitka.net/doc/developer-manual.html

There I detail how I intend to approach type inference.

Talking about that, though, would have seemed too premature. I believe my plan is half-sound and it appears to work out, but I don't want to talk about unfinished things.

In my first English talk, it seemed more appropriate to get out some kind of manifesto and explain the basic choices made by the project.

I hope, when the project advances, you will have a chance to see more beef to it.
Kay Hayen, 04 July 2012 #
This comment is totally misguided-- especially when you consider the first keynote’s topic.

I'm impressed with the amount of progress made so far. Keep up the good work.

(I will agree that the PyPy benchmarks would be a good addition though. I look forward to seeing that)

E Marley, 07 July 2012 #
Well, thanks so much for the kind words.

The http://speedcenter.nuitka.net site was set up quite shortly before the conference, and it will surely see more benchmarks added as I find the time.

As for benchmarks, the PyPy ones, I will probably do them too. But I consider the comparison to ShedSkin with its demos to be far more interesting for my development.

Why? Because I will look at these programs as something where type inference absolutely can be done. Shedskin won't compile, if it's not possible.

And there I will be able to compare Shedskin and Nuitka and learn something from that.

Benchmarks of all sorts are clearly a must once Nuitka is to be declared ready for users. Until then, benchmarks should rather aid development.

One thing Guido wasn't taking into account at all is that I have a day job and a family, and due to that, I can't just do everything that would be nice to have. I have to make choices, and typically I would rather just improve Nuitka.

But now that my blog is based on ReST, I expect to be able to post stuff from the developer manual, or generally make things public, once I write it up.

I have now made that developer manual available online (and not only as a PDF, as it was just before the conference). It became possible thanks to Nikola.

So I realize that while I have lots of interesting things to say, the actual work was happening in ways where nobody could see it. I have worked to address that by changing to Nikola, which will make publishing much easier.

So well yeah. Don't forget, this was the first time you heard of Nuitka. Not the last time.

Check out that "ctypes" plan in the developer manual. It is the current state of my thoughts on how to design the type inference.
Kay Hayen, 07 July 2012 #
I use your Nuitka a lot. Just to hide code (and if there are performance improvements that come with it, that is a bonus for me.....) That alone has great value for me. It might be a side effect of what your goal is..... but you can bring it more to the attention of others. Lots of people are struggling to hide the code of their commercial closed source Python apps.

Protecting source code is something G. van Rossum apparently refuses to build into Python..... while so many want to have such a feature.

I agree with you about the size of the produced object code...... nowadays, with such amounts of CPU power and memory, what you/Nuitka produce is very good and much more than just acceptable :-)

TD, 06 November 2013 #
Hello,

those are kind words, and surely, I am not into Nuitka just to make Python another language that allows keeping the source code hidden. There are others that can do that already. It's rather a side effect of the larger quest to make Python a fast language.

However, as for "portable" mode, I am making good progress that you will like to hear about. There is now a "hello world" program on Linux that doesn't need a system standard library installation, and I expect that to work on Windows too.

Then the issue of shared library modules remains, and after that it's done. A lot of people will then start to recommend Nuitka, which will hopefully expand its user and developer base.

As for GvR, in fairness, one must say that I am basing my "portable" work in part on ideas from the "freeze" tool, which seems to have been written by Guido (because of or despite the ugly code, I cannot say), and which allowed the same thing. It seems not to be used anymore, though.

So I don't believe he is totally opposed to the idea, but that is just my guess.

The good thing he did to Python is to make his opinion be just that. I can do Nuitka without and beyond his control.

Kay Hayen, 06 November 2013 #