## Multi-core parallel processing in Python with multiple arguments

I recently needed to use parallel processing in Python. Parallel processing is very useful when:

• you have a large set of data that you want to (or are able to) process as separate ‘chunks’.
• you want to perform an identical process on each individual chunk (i.e. the basic code running on each chunk is the same). Of course, each chunk may have its own corresponding parameter requirements.
• the order in which each chunk is processed is not important, i.e. the output result from one chunk does not affect the processing of a subsequent chunk.

Under these conditions, if you are working on a multi-core computer (which I think is true for virtually all of us), you can set up your code to run in parallel on several or all of your computer’s cores. Using multiple cores is essential for gaining any improvement in computation time on this kind of compute-heavy work. If you attempt such parallel processing on a single core, the computer simply switches between the separate tasks on that one core, and the total computation time stays about the same (more likely it increases, because of the overhead of constantly switching between tasks).

Anyhow, there are several methods of achieving multi-core parallel processing in Python. In this post, I will describe what I think is the simplest method to implement. This is the method I chose, and with whose results I am quite happy.

Additionally, most examples online that go over implementing parallel processing never mention how to handle multiple input arguments separate from the iteration parameter. There are several methods of including that too, and I will also describe what I think is the simplest method to implement and maintain.

Say, you have the following code setup:

arg1 = val1
arg2 = [val2, val3]
arg3 = ['val4', 'val5']
fileslist = ['list', 'of', 'files', 'that', 'are', 'to', 'be', 'processed']

for file in fileslist:
    print('Start: {}'.format(file))
    # perform a task with arg1
    # perform a task with arg2
    # print something with arg3
    # save some data to disk
    print('Status Update based on {}'.format(file))


Now, for parallel processing, the target is to convert the for loop into a parallel process controller, which will ‘assign’ file values from fileslist to available cores.

To achieve this, there are two steps we need to perform. First, convert the contents of your for loop into a separate function that can be called. With the approach used here, executor.map() hands the function one item per call, so the function takes exactly one argument. Set up your function accordingly, planning for this single argument to be a tuple of variables. One of these variables will be the iteration variable, in our case file, and the rest will be the remaining variables required.

def loopfunc(argstuple):
    file = argstuple[0]
    arg1 = argstuple[1]
    arg2 = argstuple[2]
    arg3 = argstuple[3]
    print('Start: {}'.format(file))
    # perform a task with arg1
    # perform a task with arg2
    # print something with arg3
    # save some data to disk
    return 'Status Update based on {}'.format(file)


Second, update the main code structure to enable multi-core processing. We will be using the module concurrent.futures. Let’s see the updated code first, before I explain what is happening.

import concurrent.futures

arg1 = val1
arg2 = [val2, val3]
arg3 = ['val4', 'val5']
fileslist = ['list', 'of', 'files', 'that', 'are', 'to', 'be', 'processed']

argslist = ((file, arg1, arg2, arg3) for file in fileslist)
with concurrent.futures.ProcessPoolExecutor() as executor:
    results = executor.map(loopfunc, argslist)

    # Loop inside the with block so each result is printed as soon as it is ready
    for rs in results:
        print(rs)


OK, now let’s go over it. The with ... line invokes the parallel processing tool which creates the executor object. In the next line, executor.map() is used to provide two pieces of information: (a) what function is to be repeatedly executed, and (b) an iterable that supplies the arguments for each function execution. Notice that when calling executor.map(), we are providing loopfunc as an object, and are not attempting to execute the function itself via loopfunc().

Now, argslist is meant to supply the arguments for all iterations of loopfunc, one item per call, so it yields as many items as there are entries in fileslist. However, in our case, only the fileslist variable is iterated over, while other arguments are provided ‘as-is’. The workaround for this is to use a generator expression (it looks like a list comprehension, but with parentheses) to generate a new variable (in our case argslist) that contains all relevant arguments for each function iteration.

In this way, the first process is created with loopfunc( (fileslist[0], arg1, arg2, arg3) ), the second process is created with loopfunc( (fileslist[1], arg1, arg2, arg3) ), and so on. Of course, within loopfunc(), we have already converted the input single argument into multiple arguments as we need.
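For reference, here is a minimal, self-contained sketch of the approach described above, next to one alternative. The function names process_tuple and process_args, the file names, and the argument values are all placeholders I made up for illustration. The alternative relies on the fact that executor.map() accepts several iterables and passes one item from each per call, so itertools.repeat() can supply the arguments that stay constant.

```python
import concurrent.futures
import itertools


def process_args(file, arg1, arg2, arg3):
    """Multi-argument worker (placeholder task: build a status string)."""
    return 'Status Update based on {} ({}, {}, {})'.format(
        file, arg1, len(arg2), arg3[0])


def process_tuple(argstuple):
    """Single-argument worker: unpack the tuple, then do the same work."""
    file, arg1, arg2, arg3 = argstuple
    return process_args(file, arg1, arg2, arg3)


if __name__ == '__main__':
    # Placeholder values, assumed for illustration only.
    arg1 = 10
    arg2 = [1.5, 2.5]
    arg3 = ['val4', 'val5']
    fileslist = ['a.dat', 'b.dat', 'c.dat']

    with concurrent.futures.ProcessPoolExecutor() as executor:
        # Approach from this post: one tuple of arguments per call.
        argslist = ((file, arg1, arg2, arg3) for file in fileslist)
        from_tuples = list(executor.map(process_tuple, argslist))

        # Alternative: one iterable per parameter; repeat() supplies constants.
        from_repeat = list(executor.map(
            process_args,
            fileslist,
            itertools.repeat(arg1),
            itertools.repeat(arg2),
            itertools.repeat(arg3),
        ))

    # Both approaches should produce the same list of status strings.
    print(from_tuples == from_repeat)
```

The if __name__ == '__main__': guard is there because, on platforms that start worker processes by re-importing the script (Windows, and macOS by default), unguarded pool code would otherwise run again in every worker.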

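Values returned from loopfunc() are delivered through the variable results, which is looped over to print out each value. The fun behavior here is that the loop does not wait for every process to finish: each rs item is printed as soon as its value becomes available. Note that executor.map() hands results back in the same order as fileslist, so a slow early file holds back the printing of later ones even if they have already finished. For example, if you’re running on a 4-core machine, output from the code can look like the following, depending upon the speed of execution of each iteration: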

Start: fileslist[0]
Start: fileslist[1]
Start: fileslist[2]
Start: fileslist[3]
Status Update based on fileslist[0]
Status Update based on fileslist[1]
Start: fileslist[4]
Start: fileslist[5]
Status Update based on fileslist[2]
Start: fileslist[6]
Status Update based on fileslist[3]
Start: fileslist[7]
...
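If you would rather handle each result in the order in which the processes actually finish, one option is to submit the tasks individually and loop over concurrent.futures.as_completed(). The sketch below is only an illustration under made-up assumptions: slow_task, the file names, and the sleep delays are hypothetical stand-ins for real work.

```python
import concurrent.futures
import time


def slow_task(argstuple):
    """Placeholder worker: sleep for `delay` seconds, then report."""
    name, delay = argstuple
    time.sleep(delay)
    return 'Status Update based on {}'.format(name)


def run_in_completion_order(argslist, max_workers=2):
    """Submit every task, then collect results as each one finishes."""
    finished = []
    with concurrent.futures.ProcessPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(slow_task, args) for args in argslist]
        # as_completed() yields each future once its result is ready,
        # regardless of the order in which the tasks were submitted.
        for future in concurrent.futures.as_completed(futures):
            finished.append(future.result())
    return finished


if __name__ == '__main__':
    # Hypothetical workload: the first file takes the longest.
    argslist = [('slow.dat', 0.5), ('fast.dat', 0.0)]
    for status in run_in_completion_order(argslist):
        print(status)
```

With executor.map() the same workload would report slow.dat first, because map keeps the input order.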


Without any arguments, ProcessPoolExecutor() creates up to as many worker processes as there are processors (cores) reported by your computer. This is great if you want to run your code and walk away for a few hours, letting your Python script take over your whole computational capability. However, if you only want to allow a specific number of processes, you can use ProcessPoolExecutor(max_workers=nproc), where nproc is the maximum number of processes you want to allow simultaneously.
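As a small sketch of how nproc could be chosen, the snippet below leaves one core free for everything else running on the machine. The helper choose_nproc and its reserve parameter are my own hypothetical names, and square is just a stand-in task.

```python
import concurrent.futures
import os


def choose_nproc(reserve=1):
    """Return a worker count that leaves `reserve` cores free (at least 1)."""
    total = os.cpu_count() or 1  # cpu_count() can return None
    return max(1, total - reserve)


def square(x):
    """Placeholder task."""
    return x * x


if __name__ == '__main__':
    nproc = choose_nproc(reserve=1)
    with concurrent.futures.ProcessPoolExecutor(max_workers=nproc) as executor:
        print(nproc, list(executor.map(square, range(5))))
```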

## To-do

In my current implementation I have used the above method to work on ‘chunks’ of data and then saved the resultant output with appropriate markers to disk. However, another way to implement parallel processing would be to take the output from each iteration, and save it as an element in an array, at the correct array index.

This should not be hard to do; all I should need is to return both the output data and the correct marker for the array index. I just haven’t done it (nor needed to do it) yet. I actually prefer saving the output from each chunk to disk separately, if possible, so that even if something crashes (or the power goes out, or whatever) and the process is interrupted, I won’t lose all progress made until then.
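A rough sketch of that idea, which I have not used in practice, could look like the following. Everything here is assumed for illustration: process_chunk, collect, the sample chunks, and the scale argument are hypothetical names and values.

```python
import concurrent.futures


def process_chunk(argstuple):
    """Return the chunk's index along with its output (placeholder: a sum)."""
    index, chunk, scale = argstuple
    return index, scale * sum(chunk)


def collect(indexed_results, n):
    """Place each output at its own index, whatever order results arrive in."""
    output = [None] * n
    for index, value in indexed_results:
        output[index] = value
    return output


if __name__ == '__main__':
    chunks = [[1, 2], [3, 4], [5, 6]]  # hypothetical data
    scale = 10
    argslist = [(i, chunk, scale) for i, chunk in enumerate(chunks)]
    with concurrent.futures.ProcessPoolExecutor() as executor:
        futures = [executor.submit(process_chunk, args) for args in argslist]
        # Results arrive in completion order; the index restores the layout.
        done = (f.result() for f in concurrent.futures.as_completed(futures))
        print(collect(done, len(chunks)))
```

Because every result carries its own index, the final list comes out in the original chunk order even when the processes finish out of order.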

Right from the beginning, I’ve assigned broad categories to every post I’ve written here. (For example, this is my—very lacking—Health Monitoring series of posts.) However, Octopress does not include these category tags by default into the RSS feed. So if a reader is using an RSS feed-reader app or website, they cannot make use of the assigned categories even if the app or website was capable of doing so.

I’ve now added some code necessary to add the categories to the RSS feed, and this is what I did.

At the outset, here is the code that I added:

{% for post in site.posts limit: 20 %}
<entry>
  <!-- Other items that are included in the feed -->

  {% capture catnum %}{{ post.categories | category_links | size }}{% endcapture %}
  {% unless catnum == '0' %}
  <categories>
    {% for cct in post.categories %}
    {% assign idx=forloop.index0 %}<category>{{ post.categories[idx] }}</category>
    {% endfor %}
  </categories>
  {% endunless %}

  <!-- Other items that are included in the feed -->
  <content type="html"><![CDATA[{{ post.content | expand_urls: site.url | cdata_escape }}]]></content>
</entry>
{% endfor %}

This code works great, but allow me to confess that I am not sure that this is the optimum implementation. To me this seems inelegant, but until I have a better solution, this performs the function appropriately and perfectly adequately.

I’ve only included the relevant portion and the context in which it must be inserted. (See the comment tags <!-- Other items that are included in the feed -->.)

The meat of the algorithm is from lines 7 through 11.

• A <categories> tag is defined, and a for loop is executed over post.categories, which contains the list of categories for the post.
• Within the for loop, each post category is enclosed in a <category></category> tag.

Now I had initially thought that the loop variable (cct here) would inherit sequentially the value of each category in post.categories, but apparently that does not work properly. Therefore, the workaround is to

• identify the loop index (assign idx=forloop.index0) and
• use individual values of the categories (post.categories[idx]).

We must use forloop.index0 and NOT forloop.index (both are valid commands; the index key starts numbering from 1) because the array numbering starts from 0, not 1.

OK, now that the meat of the algorithm is done, we must put in some code to handle the “unusual” cases—what happens if a post does not have any categories assigned? Such a scenario is handled by the capture command (line 5) and the unless segment that encloses our actual algorithm. The capture command simply captures a value, in our case the number of categories that exist. We only want to include the categories when they exist, therefore our algorithm is run only unless catnum=='0' i.e. when the number of categories is not 0.

Well, that’s it! I have added the code segment before the actual content of each post, but I don’t think it makes any difference if the segment appears after the <content> tag. It should work fine anywhere within the <entry> environment.

## Using MathJax with Octopress

I’ve been meaning to try and implement MathJax on this website for a while now. For including math equations on a website, MathJax is probably one of the more elegant ways to do it. I can write equations in TeX format, and MathJax renders the equations properly for you!

Finally, in the last couple of days I’ve been forced to get around to it, thanks to a new post that I’m writing that includes a little bit of math. So anyway, I just wanted to jot down that process.

The thing with MathJax is that it’s meant to, and does, work with HTML. But since I’m working with Octopress and Markdown, I have to ensure that the conversion from Markdown to HTML produces no unwanted syntactical errors for MathJax. To get around this problem, Zac Harmany (I hope I got the name right) suggests tweaking the Markdown rendering engine to Pandoc, so that the conversion works as desired. I’m sure that works great, but I had no intention of tweaking my Markdown conversion engine. Instead, I discovered a nifty ruby bundle (here) that serves a great purpose.

Next, the typical MathJax “installation” involves adding a line in the <head> section of your Octopress theme (add it to %octopress_root%/source/_includes/custom/head.html), so that when every page loads, the necessary Javascript files are also loaded, ready to render your equations. I did not want the Javascript to load and execute along with all my pages, given that I don’t expect to have equations in all my posts. Instead, I’ve included the call to the script in the <body> of the post, i.e. in the meat of the post itself. Remember that the declaration needs to come before your first equation.

I also contemplated downloading the MathJax distribution and hosting it locally on my own server, but that did not work at all. Each file is small and the total package is only ~50MB, but there are far too many files to upload, and the upload to my server took so long that I eventually gave up. I’ll revisit that option if I think using MathJax’s own servers is not working well, which I doubt will happen.

With those details, here’s how I set things up:

• Install the verbatim.rb plugin from this Github repository. To do this, simply download the file (‘Gist’ in Github parlance) and place it in %octopress_root%/plugins. When inserting equations, there’s a syntax to using this plugin; I’ll demo it below.

• Create a MathJaxLocal.js file at %octopress_root%/source/javascripts/ to add local configurations for MathJax. Note that the last line of the code must point to the full path of the local file, in my case http://arnabocean.com/javascripts/MathJaxLocal.js

Here’s what my MathJaxLocal.js looks like (I started with Zac Harmany’s file and modified to suit my needs.):

MathJax.Hub.Config({
  jax: ["input/TeX", "output/HTML-CSS"],
  tex2jax: {
    inlineMath: [ ['$','$'], ['\\(','\\)'] ],
    displayMath: [ ['$$','$$'], ['\\[','\\]'] ],
    skipTags: ["script","noscript","style","textarea","pre","code"],
    processEscapes: true
  },
  TeX: {
    equationNumbers: { autoNumber: "AMS" },
    TagSide: "left"
  },
  "HTML-CSS": { availableFonts: ["TeX"] }
});

// Last line: tell MathJax this config file has loaded, using the file's full URL
MathJax.Ajax.loadComplete("http://arnabocean.com/javascripts/MathJaxLocal.js");

• Declare the location of the MathJax files. The easiest thing to do is to use MathJax’s own servers. However, in addition to just their servers, you’ll have to link to your own local config file as well, so we’ll add both of these at the same time. In the main body of your markdown post (preferably after the “Read More” fold), add the following:

<script type="text/javascript"
src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML,http://arnabocean.com/javascripts/MathJaxLocal.js">
</script>


In the above, the first link points to MathJax servers, the second points to my own config file.

And that’s basically it! Now you’re all set to write beautiful equations. Here’s a demo:

<!-- MathJax configuration -->
<script type="text/javascript" src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML,http://arnabocean.com/javascripts/MathJaxLocal.js">
</script>
<!-- End MathJax Configuration -->

{% raw %}{% verbatim tag:p %}{% endraw %}
$f(x)= a_0 + a_1\sin(x) + a_2\sin(2x) + ...$

$+b_1\cos(x) + b_2\cos(2x) + ...$

$f(x)=a_0+\sum_{k=1}^\infty\big( a_k\cos(kx)+b_k\sin(kx) \big)$
{% raw %}{% endverbatim %}{% endraw %}


And here’s what that would look like:

$f(x)= a_0 + a_1\sin(x) + a_2\sin(2x) + …$

$+b_1\cos(x) + b_2\cos(2x) + …$

$f(x)=a_0+\sum_{k=1}^\infty\big( a_k\cos(kx)+b_k\sin(kx) \big)$

There’s still a lot more that I need to find out, and from the looks of it, Zac’s website is a great resource. I’ll add more posts if I find anything useful that I end up using.

## Creating Bandpass Bessel Filter with MATLAB

Bessel filters are incredibly useful in numerical analysis, especially for acoustic-type waveforms. This is because analog Bessel filters have an almost constant group delay across the passband, which means that the shape of a wave is largely preserved when it passes through such a filter.

Well, MATLAB provides some of the building blocks required to create a bandpass analog filter, but does not actually combine the pieces to make a usable filter function.

I created a function for my own research (sourced from pieces I found elsewhere, but it’s been too long and I don’t remember where I found each piece, sorry!). It can be found at my MATLAB repository, specifically, here.

Here’s the documentation that I included with the function:

besselfilter. Function to implement a bandpass Bessel Filter.

[filtData, b, a] = besselfilter(order,low,high,sampling,data)

Inputs:

- order:      Number of poles in the filter. Scalar numeric value.
              Eg.: 4
- low:        Lower frequency bound (Hz). Scalar numeric value.
              Eg.: 50000 (= 50kHz)
- high:       Upper frequency bound (Hz). Scalar numeric value.
              Eg.: 1000000 (= 1MHz)
- sampling:   Sampling frequency (Hz). Scalar numeric value.
              Eg.: 25000000 (= 25MHz)
- data:       Input data. Numeric vector.
              Eg.: data vector of size (n x 1)

Output:

- filtData:   Output filtered data. Numeric vector.
              Eg.: data vector of size (n x 1)
- b, a:       Transfer function values for the filter. Scalar numeric.


## MATLAB repository

I’ve been using MATLAB for quite a few years now, both for my own research and for work at VTTI. Well, I decided to share some of the code that I’ve been writing, which may come in handy to others in the same field. I’ve long appreciated the help I’ve received from the larger MATLAB community, and I thought I should start contributing as well. :)