by Enrico Franchi for
In recent years, Python has greatly increased in popularity as a tool for number crunching and computationally intensive tasks. Although such tasks are traditionally associated with scientific computing, they also arise in several other scenarios, such as business and financial applications, or in the rapidly growing domain of social media.
A large part of Python's success stems from its suitability both as a general-purpose language for application development and as a language for performing these more computationally intensive tasks, albeit typically by gluing together specialized libraries.
In this talk, I introduce some of the tools and libraries that are typically used for such tasks, and I show how they fit together to produce an effective computational environment that (i) is easy to use, (ii) is reasonably efficient, and (iii) interoperates effortlessly with the outside world. My main focus is to provide a brief introduction to the fundamental components that we use for crunching numbers.
Another important issue in most computationally intensive tasks is the efficient use of main memory. Part of this talk is devoted to understanding how to represent data effectively, considering the trade-offs between Python built-ins and external libraries. Moreover, I introduce some profiling tools for measuring CPU and RAM usage.
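As a rough illustration of the representation trade-off mentioned above, the following minimal sketch compares a built-in list of floats with a contiguous typed array, and takes one memory measurement with a profiling tool. It is not taken from the talk: the standard-library `array` module stands in for the contiguous storage that external libraries provide, `tracemalloc` stands in for whichever profilers are presented, and the names `N` and `list_footprint` are hypothetical. The exact byte counts depend on the interpreter and platform.

```python
import sys
import tracemalloc
from array import array

N = 100_000  # hypothetical problem size, chosen only for illustration


def list_footprint(values):
    """Approximate bytes used by a list of floats.

    Counts the list object itself (its table of pointers) plus every
    float object it refers to. Shared objects would be counted twice,
    so this is an estimate, not an exact measurement.
    """
    return sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)


# Built-in representation: a list of individually boxed float objects.
as_list = [float(i) for i in range(N)]

# Typed representation: one contiguous buffer of C doubles.
as_array = array("d", as_list)

list_bytes = list_footprint(as_list)
array_bytes = sys.getsizeof(as_array)

# Measure the allocations made while building another list of floats.
tracemalloc.start()
scratch = [float(i) for i in range(N)]
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"list of floats : {list_bytes} bytes")
print(f"array of 'd'   : {array_bytes} bytes")
print(f"traced build   : current={current} bytes, peak={peak} bytes")
```

On CPython the typed array is expected to be several times smaller than the list, because it stores raw doubles instead of pointers to separate float objects.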
Finally, I examine how the problems highlighted by profiling can be solved by (i) changing high-level strategies, (ii) using some of the previously discussed libraries (numpy, scipy, pandas), or (iii) using other tools, such as cython, scipy.weave, or F2Py.