Parallel Programming with Python
Schedule
Tue May 17 2022 at 09:30 am to Wed May 18 2022 at 04:30 pm
Location
Netherlands eScience Center | Amsterdam, NH
About this Event
This workshop will be delivered in person, unless new COVID-19 restrictions are put in place. If taught on location, the eScience Center will provide you with lunch during the workshop, and drinks at the end of the workshop.
Python is one of the most widely used languages for scientific data analysis, visualization, and even modelling and simulation. Its popularity rests on two pillars: a friendly syntax and the availability of many high-quality libraries. The flexibility that Python offers comes with a few downsides though: code typically doesn’t run as fast as lower-level implementations in C/C++ or Fortran, and it is not trivial to parallelize Python code to work efficiently on many-core architectures. This workshop addresses both issues, with an emphasis on running Python code efficiently (in parallel) on multiple cores.
We’ll start by learning to recognize problems that are suitable for parallel processing, looking at dependency diagrams and kitchen recipes. From then on, the workshop is highly interactive, diving straight into the first parallel programs. The workshop follows the teaching style of the Carpentries (such as Software Carpentry): participants code along while the instructors write the code on screen. It teaches the principles of parallel programming in Python using Dask, Numba and Snakemake. More importantly, we try to give insight into how these different methods perform and when they should be used.
More information can be found on the workshop website (will be activated once registration is live).
Who
The workshop is open and free to all researchers in the Netherlands. The workshop is aimed at PhD candidates and other researchers or research software engineers.
Prerequisite knowledge
The participant should be:
- familiar with basic Python: control flow, functions, NumPy
- comfortable working in Jupyter
Recommended
- an understanding of how NumPy and/or pandas work
Requirements
- A programming editor. When in doubt, we recommend Microsoft VS Code.
- Python version 3.9. We recommend Anaconda, or Miniconda if you only use the command-line interface. If you insist on using vanilla Python, see the instructions below.
- Git. If you’re on Windows, follow these instructions: Git for Windows.
To follow along with the workshop, you need to prepare an environment. Clone the workshop repository that we prepared:
git clone https://github.com/esciencecenter-digital-skills/parallel-python-workshop.git
cd parallel-python-workshop
You may prepare the environment either with conda or with vanilla Python and Poetry.
Conda (recommended)
For most users we recommend that you use conda to install the requirements for the workshop.
conda env create -f environment.yml
conda activate parallel-python
pytest
If the tests pass, you’re all good! Otherwise, please contact us before the workshop.
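If the tests fail, a quick check of which tools are visible to Python can help narrow the problem down before you contact us. The snippet below is a minimal sketch, not part of the workshop repository; it assumes that the import names of the tools are `dask`, `numba`, `snakemake` and `pytest`, and the helper `find_missing` is a hypothetical name chosen for illustration.

```python
import sys
from importlib.util import find_spec

# Assumption: these are the top-level import names of the workshop tools.
WORKSHOP_PACKAGES = ["dask", "numba", "snakemake", "pytest"]


def find_missing(packages):
    """Return the names in `packages` that Python cannot locate for import."""
    return [name for name in packages if find_spec(name) is None]


# Report the interpreter version (the workshop asks for Python 3.9).
print("Python", ".".join(map(str, sys.version_info[:3])))

missing = find_missing(WORKSHOP_PACKAGES)
print("missing packages:", ", ".join(missing) if missing else "none")
```

Run it inside the activated environment. A package listed as missing usually means the environment was not created or activated correctly.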
Poetry
Only follow these instructions if you’re on Linux or Mac and don’t have conda installed. Make sure that you have Python 3.9 installed.
If you’ve never used poetry before, check it out!
Syllabus
- Recognizing potential for parallelism
- Dependency diagrams
- Measuring performance
- Working with Dask arrays
- Working with Numba
- Parallel design patterns
- Delayed evaluation
- Dependency-based programming using Snakemake
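To give a flavour of the first two syllabus items, here is a minimal sketch of a dependency diagram for a kitchen recipe, written with the standard-library `graphlib` module (available since Python 3.9). The recipe and the helper `parallel_stages` are made up for illustration and are not taken from the workshop material. Tasks that end up in the same stage do not depend on each other, so they could in principle run in parallel.

```python
from graphlib import TopologicalSorter  # standard library since Python 3.9

# Hypothetical recipe: each task maps to the set of tasks it depends on.
recipe = {
    "chop vegetables": set(),
    "boil water": set(),
    "cook pasta": {"boil water"},
    "make sauce": {"chop vegetables"},
    "serve": {"cook pasta", "make sauce"},
}


def parallel_stages(dependencies):
    """Group tasks into stages; tasks within one stage are independent."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the diagram has a cycle
    stages = []
    while sorter.is_active():
        # Everything whose dependencies are already done is ready to run.
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        sorter.done(*ready)
    return stages


stages = parallel_stages(recipe)
for number, stage in enumerate(stages, start=1):
    print(f"stage {number}: {', '.join(stage)}")
```

In this sketch, boiling water and chopping vegetables share the first stage, cooking the pasta and making the sauce share the second, and serving has to wait for both.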
Where
This training will take place in person at the eScience Center, Science Park 402, Amsterdam.
EUR 0.00