CentOS 6.8 image with Qt5.7, Python 3.5, LLVM 3.8

While bringing my setup for packaging KDevelop standalone for Linux into a shape where there is a nonzero probability of me picking it up again in half a year and actually understanding how to use it, I created a Docker base image which I think might be useful to other people trying to package Linux software as well. It is based on CentOS 6.8 and includes Qt 5.7 (with QtWebKit), Python 3.5 and LLVM 3.8, all built against the old CentOS libraries (and thus compatible with most glibc versions out there). If you want to use it, simply install docker, and

systemctl start docker
docker pull scummos/centos6.8-qt5.7
docker run -i -t scummos/centos6.8-qt5.7

Good luck with it!

If docker fails to pull the image, you might have to lower the MTU of your network device to 900 (sic).

Randa sprint wrap-up and considerations on packaging complex applications with AppImage

On Sunday, this year’s KDE sprint in Randa, Switzerland came to an end and I am now back home. It was a productive and fun week, as always when meeting the other KDE folks.

Navigation widget in KDevelop with proper spacing and margins

I spent a lot of time on polishing things and fixing small issues mostly in KDevelop, such as:

  • reduce flickering of the completion widget in KDevelop
  • fix various cases where KDevelop’s navigation widget (see above) would have weirdly large spacing in between or at the end
  • fix kdev-python’s completion widget being absurdly wide in full mode for long type names
  • fix some scoping issues in kdev-python
  • fix colors in KDevelop’s Find in Files widget
  • use a nicely formatted file name in the window title when no project is open
  • fix a few crashes

Open project dialog

One mentionable major task was making the “Open project” dialog work properly on non-Linux platforms. That required a bit of thinking: before, you could open either a project manager config file (e.g. CMakeLists.txt) or a directory from the dialog. However, when using the native file dialog, you have to select in advance whether you want to pick a file or a directory (you can’t have both). The solution is to always select only the directory, and offer a choice of possible project managers (and the associated config files) later in the process.

Remove assistant overlay in favour of navigation widget

Another major thing Kevin Funk and I worked on was rethinking KDevelop’s assistant popup; especially in the current 5.0 betas, it tends to be a bit annoying and gets in the way a lot. We are thus trying to remove the assistant overlay in favour of offering executions of assistants from the navigation widget:

“Adjust signature” assistant invoked from the navigation widget.

It’s not working perfectly yet, but it is certainly much less aggressive than the previous solution, while behaving very similarly in terms of usage (Alt+1 still executes the first fix, etc.). It’s currently only in the assistantpopup-ng branch, but we hope to be able to merge that soon.

Packaging KDevelop with AppImage: advanced thoughts

For users stuck on old systems and for testing purposes, we’d like to be able to ship single-file executables containing KDevelop to users (click here to get one). One solution for creating such executables is AppImage. An AppImage is basically a file system hierarchy containing all the things which cannot reasonably be assumed to be on the user’s system (it is not a container or chroot though, just a set of libraries and assets). In our case, that is most notably a recent version of Qt, some Qt plugins, various KDE Frameworks, LLVM, KDevelop itself, and libpython plus the kdevelop-python plugin.

In order to start up KDevelop in this environment, various environment variables have to be set to make it pick up the pieces from the AppImage, most notably $PATH, $LD_LIBRARY_PATH, $XDG_DATA_DIRS and $QT_PLUGIN_PATH. Sounds simple enough, right? But now consider this: the user wants to run cmake from inside KDevelop. cmake is not shipped with the AppImage, but it inherits this environment, and crashes because it picks up a wrong version of some library from the bundle. The solution seems simple enough: undo the changes to LD_LIBRARY_PATH before launching cmake.
However, how do you do that in practice? We cannot undo the changes “globally”, because executables shipped with the AppImage, such as kioexec (used to open [remote] files), still need them even after KDevelop is running. One solution is to go over the code base and manually change the environment back everywhere it’s required. But even that is not sufficient: library calls (like KRun::run) might start processes as well, and the calling application cannot directly influence those processes’ environment.

KDevelop with kdev-python running from an AppImage. The image was built on CentOS6 from 2011 (and runs there), and runs on a bleeding-edge Arch system. It should run on most things in between as well.

The solution I finally came up with, and which seems to work okay, is to use LD_PRELOAD. By setting this environment variable, you tell the dynamic linker to load your own versions of library functions instead of the ones it would otherwise load. What use is that to us? Well, when a program wants to start another program, it has to ask the operating system. That usually happens through a function from the glibc library; the most common function to launch a process is execve(). We can now provide a library in LD_PRELOAD which overrides the system’s execve() and does this:

  • Look at the path of the process which is launched.
    • If the path is inside the AppImage, do nothing.
    • Otherwise, modify the environment such that the changes we did for start-up of the application are rolled back (remove added components from $PATH, etc.).
  • Call the original system’s execve() to execute the program in the modified environment.
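The decision logic can be sketched in Python for illustration (the real wrapper is a small C library overriding execve() and preloaded via LD_PRELOAD; the APPDIR path and the exact list of variables here are assumptions):

```python
import os

# Hypothetical mount point of the AppImage (set up by the startup header).
APPDIR = "/tmp/.mount_kdevelop"

def cleaned_environment(program_path, env):
    """Return the environment to hand to execve() for program_path."""
    if program_path.startswith(APPDIR):
        # The binary ships with the AppImage: keep the modified environment.
        return dict(env)
    # Outside binary: roll back the start-up changes, i.e. strip the
    # AppImage components that were prepended to the path-like variables.
    cleaned = dict(env)
    for var in ("LD_LIBRARY_PATH", "PATH", "XDG_DATA_DIRS", "QT_PLUGIN_PATH"):
        parts = cleaned.get(var, "").split(os.pathsep)
        cleaned[var] = os.pathsep.join(
            p for p in parts if not p.startswith(APPDIR))
    return cleaned
```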

It is quite a hack, but it does seem to work well, and the beauty is that you don’t have to scatter “change the environment back” code all over your application code.

AppImage: Next steps

I will clean up my exec-wrapper and publish it. I will also clean up my KDevelop build script, and try to create a docker image I can run it in, so that people can do that easily (not requiring them to set up a CentOS6 VM themselves).

If you like what we’re doing in those sprints, consider supporting us so we can do another one next year 🙂

KDevelop 5.0 standalone executable for Linux

I am currently at the KDE sprint in Randa, Switzerland. There are about 40 nice people here, working on making KDE software awesome. This year’s focus is on deploying our software more nicely — on Windows and OS X, but on Linux as well.

The landscape around our sprint location

Especially for user-facing applications like KDevelop, it really makes a big difference whether you use today’s version or the one from two years ago. Many people run quite old Linux distributions which do not provide up-to-date packages for KDevelop. Another situation where it can be hard to use an up-to-date version of our software is when you are at your workplace or in a university computer pool and do not have the privileges to install software. Thus, this week, besides enjoying the landscape, I spent some time building a stand-alone KDevelop executable. It is available from here: http://files.svenbrauch.de/kdevelop-linux/. The version from 16th June 2016 contains the beta version of KDevelop 5 from the master branch, including the new clang-based C++ plugin and the Python 3 language support plugin.

After downloading, simply make the file executable (either by running chmod +x file or using your file manager’s properties dialog) and run it. If you try, please let me know if it worked and what distribution (and version!) you used — that would be interesting to know.

KDevelop standalone running. Very interesting to look at.

The executable is built by compiling all our dependencies (most notably Qt, a list of KDE Frameworks, and LLVM) on a very old system, in this case CentOS 6. Then all required libraries are installed into a directory, and the libraries they link against are copied there as well by a shell script. What is stripped out is the base system: glibc, libgl, libX, libxcb, things like that. The idea is that these libraries are forward binary-compatible, and by building on an old system and linking against old versions of those libraries, you can run the resulting programs on any more recent system. You could then simply tar up this tree and include a script which updates a few environment variables and launches KDevelop. A somewhat nicer variant of that, which was used for the above installer, is to use AppImage, which puts all the files into an ISO image and adds a startup header to make it look like an ELF binary. The startup header mounts the ISO using fuse, sets up a few environment variables, and runs the program.

This reminds me of something else I would be interested in: On your computer, do you have fuse installed by default? If that is not the case for a lot of systems, it might be worth publishing a tarball instead.

I did lots of other things at Randa too, but I’ll report on those in followup posts.

If you like what we’re doing here, consider supporting the meeting with a donation!

Processing scientific data in Python and numpy, but doing it fast

Over the years I spent a lot of time writing simulation or data processing code for some kind of data. While some data sets are very small (just a few hundred values), it is not uncommon today to have a few dozen, a few hundred or even a few thousand megabytes of data to process in some way. In fact, Python is in many cases well capable of processing such data with performance comparable to e.g. C, but with vastly more readable and especially writeable code.

This article writes down some thoughts and findings on how to do this more efficiently with Python code, and focuses specifically on the situation I find myself in most commonly: what counts is the overall amount of time you need to obtain a certain result, i.e. development time plus runtime of the program. I do not want to discuss Python vs C / Fortran / C++ / whatever: it is clear you can always be faster with a compiled language. But if you can get to 40% of the performance of a C program with Python by writing just a hundredth as much code, it’s probably much faster (and much less painful) overall in practice. I will write down a few basics first, but mostly focus on a few more hidden features and tricks later on.

So, basics first:

Vectorize everything!

The basis of all scientific data processing in Python today is the numpy module. It provides an array class, numpy.array, which can be used to store numbers, basically. Whenever you have some numbers, you most probably want to store them in an np.array. You can select a data type (e.g. float32, float64, int), but the default (float64 for most input) is usually fine.

Anyways: Do not loop over np.array instances. In 99% of cases where people do that, it’s the wrong thing to do. Why? Because it is super slow. In a loop, each individual number is converted to a Python object; type checks occur, type casts may happen, null checks happen, checks for exceptions in the operator happen, all kinds of things. If you write 3+5 in Python, it will execute hundreds, if not thousands, of lines of C code in the background. And that happens for each number in your array if you loop over it. (The same goes for things like a = np.array([x**2 for x in range(50)]); that list comprehension is a loop as well.) So, what to do instead?
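To make the difference concrete, here is a small sketch (sizes made up):

```python
import numpy as np

x = np.arange(50)

# Slow: a Python loop; every element passes through the full Python machinery.
squares_loop = np.array([v**2 for v in x])

# Fast: one vectorized expression; the loop runs in machine code inside numpy.
squares_vec = x**2

assert (squares_loop == squares_vec).all()
```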

Operate on the full array without using a loop

For many operations such as adding, dividing etc, this is trivial, just write a+b where a and b are np.array instances. This just adds the Python overhead one single time — for maybe ten million numbers to add in a and b, which makes it completely negligible. The actual iteration happens in a very fast machine-code loop.

All numpy mathematical functions operate on arrays. If you want to calculate the sine of a thousand values, put them into an array and write np.sin(a). Even complicated functions such as np.fft.rfft take an array of input vectors. For some functions, you need to specify the axis they should operate on.

This should also affect how you write your functions and classes: If possible at all, make your functions such that they can operate on a whole array of values, not just a number. And trust me: it is almost always possible, and for most operations it’s as simple as “use np.sqrt instead of math.sqrt”. Do not store your “objects” (e.g. simulated particles) in individual Python objects; instead, use one Python object for the whole collection of simulated particles, and have e.g. the positions and velocities as np.array instances in that object.
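For example, a particle collection might look like this (a sketch; class and attribute names are made up):

```python
import numpy as np

class Particles:
    """All particles in one object: positions and velocities are (N, 3) arrays."""
    def __init__(self, n):
        self.pos = np.zeros((n, 3))
        self.vel = np.random.normal(size=(n, 3))

    def advance(self, dt):
        # One vectorized statement moves every particle at once.
        self.pos += self.vel * dt

p = Particles(10000)
p.advance(0.1)
```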

This “use the full array” rule is also true if your array has more than one dimension. Things often require a bit more thinking for multi-dimensional arrays, but in almost all cases it’s still possible to avoid a Python loop. Avoid Python loops at all costs; whatever you do with numpy’s builtin functions will be faster, and probably much faster.

Use numpy’s data reduction functions

If you want to know the sum of all values in an array, use np.sum. If you want to know the absolute value of an array of position vectors, use np.sqrt(np.sum(x**2, axis=1)). Use np.reshape to compute block averages: np.average(np.reshape(x, (-1, average_count)), axis=1). The most useful thing about reshape I learned one day is that you can give -1 for any one dimension to have the function calculate it automatically.
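A short sketch of these reductions (values chosen so the results are easy to check):

```python
import numpy as np

x = np.array([[3.0, 4.0], [6.0, 8.0], [0.0, 5.0], [5.0, 0.0]])

# Absolute value (norm) of each position vector: sum the squares along axis 1.
norms = np.sqrt(np.sum(x**2, axis=1))

# Block average: reshape into rows of 2 values and average each row.
# The -1 tells reshape to infer that dimension automatically.
block_means = np.average(np.reshape(norms, (-1, 2)), axis=1)
```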

If you want to categorize/bin values of an array, use np.histogram (note also np.histogramdd which computes multi-dimensional histograms).
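For example, binning six values into two bins:

```python
import numpy as np

values = np.array([0.1, 0.4, 0.5, 0.9, 1.4, 1.7])

# Two bins covering [0, 1) and [1, 2]: np.histogram returns the counts
# per bin and the bin edges.
counts, edges = np.histogram(values, bins=2, range=(0.0, 2.0))
```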

Use numpy’s data generation functions

numpy can generate random numbers in various distributions, as well as arrays initialized with ones, zeros, or ranges of numbers in all sizes. Make sure to use those functions, instead of initializing an array in a loop. A few functions to look at are np.random.normal, np.random.uniform, np.zeros, np.zeros_like, np.arange and np.linspace. Make sure you know how those work; all of them are worth it.

Although np.arange feels more “at home” in Python because it is basically the built-in range(), I find np.linspace to be more useful in many situations, since it allows you to adjust the range while keeping the number of values. And the number of values is often more important than their exact positions.
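The difference in a nutshell:

```python
import numpy as np

# np.arange: fixed step; the number of values follows from the range.
a = np.arange(0.0, 1.0, 0.25)     # 0.0, 0.25, 0.5, 0.75 (end excluded)

# np.linspace: fixed number of values; the step follows from the range.
b = np.linspace(0.0, 1.0, 5)      # 0.0, 0.25, 0.5, 0.75, 1.0
```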

Cleverly use slicing

If you want to know the difference between one element and the next, use slicing: x[1:] - x[:-1]. Make sure you know what np.newaxis is for. Consider using slices with ellipses.
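For example:

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 7.0])

# Difference between each element and the next, no loop needed.
dx = x[1:] - x[:-1]

# np.newaxis inserts a size-1 dimension, e.g. to build an outer
# product of two 1d arrays via broadcasting.
outer = x[:, np.newaxis] * x[np.newaxis, :]   # shape (4, 4)
```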

This was just a short overview of a few basic concepts. There are enough introductions to numpy out there, I don’t want to replicate those. The rest of the article is on some more specialized findings.

Use np.einsum

np.einsum is not exactly the simplest function in the world to understand, but it is very powerful and allows efficient vectorization of almost any operation that requires multiplying elements from two arrays and adding them up subsequently. The most obvious use case is matrix multiplication; for example, you can multiply an array of rotation matrices with an array of vectors.
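A sketch of that use case with 2x2 rotation matrices (shapes and names made up; the same pattern works for 3x3):

```python
import numpy as np

n = 1000
theta = np.random.uniform(0, 2 * np.pi, n)

# An array of n 2x2 rotation matrices, shape (n, 2, 2).
R = np.empty((n, 2, 2))
R[:, 0, 0] = np.cos(theta); R[:, 0, 1] = -np.sin(theta)
R[:, 1, 0] = np.sin(theta); R[:, 1, 1] = np.cos(theta)

v = np.random.normal(size=(n, 2))   # n vectors

# Rotate every vector by its matrix in one call:
# rotated[i, k] = sum over j of R[i, k, j] * v[i, j].
rotated = np.einsum('ikj,ij->ik', R, v)
```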

Use np.interp and np.cumsum

This is not so much about performance but instead usability. The interp (“interpolate”) and cumsum (“cumulative sum”) functions are quite often what you want, but it is sometimes hard to realize this. In particular, you can use them for numeric integration and for inverting functions. interp is of course also very useful in its most obvious application, which is re-sampling data to a grid. This makes it, for example, very simple to calculate inverse cumulative distribution functions, as required for drawing weighted random numbers.
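A sketch of both tricks, numeric integration via cumsum and drawing weighted random numbers by inverting the cumulative distribution via interp (the density here is just an example):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 1001)
pdf = 2 * x                       # an example density (normalization optional)

# Numeric integration: the cumulative sum approximates the integral.
cdf = np.cumsum(pdf)
cdf /= cdf[-1]                    # normalize the CDF to end at 1

# Inverse CDF sampling: map uniform random numbers through the inverted
# cumulative distribution by swapping the roles of x and y in interp.
u = np.random.uniform(0, 1, 100000)
samples = np.interp(u, cdf, x)
```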

And remember, all this has basically the same performance as C code. Try doing that in C. For example, compare this numpy-vectorized implementation of 4th order Runge-Kutta to the C one on wikibooks: comparably fast, and vastly more readable.

Sacrifice your soul, errr, memory for performance

Sometimes you can only vectorize things when you sacrifice a bit of memory. A typical case is a system of interacting particles. The interaction will usually depend on the distance between the two particles; so for an array x of 3d coordinates, you need the distance between each pair of particles. One way to write this down without using a Python loop is to use broadcasting.
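A sketch of that approach (particle count made up):

```python
import numpy as np

x = np.random.normal(size=(500, 3))   # 500 particles in 3d

# Insert size-1 axes so broadcasting builds all pairwise differences:
# shape (500, 1, 3) - (1, 500, 3) -> (500, 500, 3).
diff = x[:, np.newaxis, :] - x[np.newaxis, :, :]

# Pairwise distances, shape (500, 500); r[i][j] is the distance
# between particle i and particle j.
r = np.sqrt(np.sum(diff**2, axis=2))
```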

The resulting entries r[i][j] will contain the distance between particles i and j. The downside is that you need to allocate twice as much memory as you would actually need to store all the distances, and you also need twice the computation time, because the result is symmetric but you cannot use that knowledge. Nevertheless, this is vastly faster than the Python loop version, and it allows you to use the r array in a vectorized version of your energy (or force) function.

Use masks

Another thing which is easy to miss is the use of masks. The idea of masks is that you can index an array with an array of compatible size containing just True and False (such arrays have data type np.bool). The elements where the corresponding mask entry is True are selected; the others are not. The result of such a mask access is writeable. Additionally, using operators such as > or < on arrays gives you such a mask. The next thing you need to know about masks is that the bitwise operators | and & work on them to take the union or intersection of two masks. A third useful one is ~, which inverts a mask.
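A short sketch:

```python
import numpy as np

x = np.array([-3.0, 1.0, 4.0, -1.0, 6.0])

mask = x > 0                      # a comparison yields a boolean mask
assert x[mask].tolist() == [1.0, 4.0, 6.0]

# Combine masks with & (intersection), | (union) and ~ (inversion).
small_positive = (x > 0) & (x < 5)

# Mask access is writeable: clamp the negative entries to zero.
x[~mask] = 0.0
```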

And there is one last thing to know: you can get the indices where a mask is True by using np.where(). The performance trick here is that if you only want to operate on a few indices, you can first find those by combining a few masks, then get the indices with np.where, and then access only those indices, which is much faster than using the full mask every time. Compare for yourself:
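A sketch of the comparison (the point being that indexing with a short index array touches only the selected elements, while a boolean mask is scanned in full every time):

```python
import numpy as np

x = np.random.normal(size=1_000_000)
mask = (x > 3.0) & (x < 3.1)      # selects only a tiny fraction of entries

# Repeatedly indexing with the full boolean mask scans all entries
# every time ...
for _ in range(3):
    x[mask] += 0.001

# ... while converting it to indices once makes each access touch only
# the few selected elements.
idx = np.where(mask)[0]
for _ in range(3):
    x[idx] += 0.001
```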


You can count the number of True entries in a mask with np.count_nonzero() (or simply mask.sum()), and find out whether any or all entries are selected with .any() and .all().

Avoid copying arrays in critical sections

In general, copying data is cheap. But if your program simulates 25 million particles, each having a float64 location in 3d, you already have 8*3*25e6 = 600 MB of data. Thus, if you write r = r + v*dt, you will copy 1.2 GB of data around in memory: once 600 MB to calculate v*dt, and again to calculate r + (v*dt), and only then is the result written back to r. This can really become a major bottleneck if you aren’t careful. Fortunately, it is usually easy to circumvent: instead of writing r = r + v*dt, write r += v*dt. Instead of a = 3*a + b, write a *= 3; a += b. This avoids the copying completely. For calculating v*dt and adding it to r, the situation is a bit more tricky; one good idea is to choose the unit of v such that you don’t need to multiply by dt. If that is not possible, it might even be worth keeping a copy of v which is already multiplied by dt, and updating it whenever you update v. This is advantageous if only few v values change per step of your simulation.
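In code (sizes made up):

```python
import numpy as np

r = np.zeros((1000, 3))
v = np.random.normal(size=(1000, 3))
dt = 0.01

# r = r + v * dt would allocate two temporaries (v*dt, then r + ...)
# before writing the result back. The in-place version allocates only
# the v*dt temporary:
r += v * dt

# Keeping a pre-multiplied copy avoids even that temporary, provided
# v rarely changes:
v_dt = v * dt        # update this whenever v changes
r += v_dt
```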

I would not recommend writing it like this everywhere though, it’s often not worth the loss in readability; just for really large arrays and when the code is executed frequently.

Note that this consideration does not apply to indexing usually, as indexing does not copy an array but just creates a view of it.

And, as always, be sure to profile your application: python -m cProfile -s tottime yourscript.py. It’s very often surprising what shows up, and the ten minutes you spend running the profiler and fixing the top three pointless bottlenecks will pay off in simulation time in less than a day, I guarantee it.

Use binary data files

Numpy provides very easy, very fast ways to read and write arrays from files. To load some CSV, use np.loadtxt. However, as soon as you have more than a few MB of CSV, this starts to get slow. I really recommend using np.load and np.save instead, which load from and save to a binary format. This is so much faster that it is frequently worth converting data that initially comes to you as CSV into a binary file (just load it with np.loadtxt and save it again with np.save) before doing anything else.
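A sketch of the conversion, using a temporary directory so it is self-contained:

```python
import os
import tempfile

import numpy as np

data = np.random.normal(size=(10000, 4))

with tempfile.TemporaryDirectory() as d:
    # One-time conversion: parse the slow text format once ...
    csv_path = os.path.join(d, "data.csv")
    np.savetxt(csv_path, data)
    parsed = np.loadtxt(csv_path)

    # ... then cache it in numpy's fast binary format for later runs.
    npy_path = os.path.join(d, "data.npy")
    np.save(npy_path, parsed)
    cached = np.load(npy_path)
```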

There are also versions with compression if you need that.

np.load and np.save are also a very good way to cache intermediate results of your computation. If there’s something your program does every time, just add a command line flag to actually do it and write it to a cache file, and if that flag is not given, load it from there instead. It’s two lines of code, and if what you do takes more than a second, it’s really worth it in the long run.


numpy includes some utility methods for floating point handling. If you want to test that something is normalized to 1, you can e.g. use np.allclose(x, 1). Note that you shouldn’t use (x == 1).all(); floating point and exact comparison don’t mix well. Those methods also include checks for nan values, which is sometimes useful; in particular, you can do x[np.isfinite(x)] to get only the valid values out of x (i.e. not nan or inf).
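For example:

```python
import numpy as np

x = np.array([0.1 + 0.2, 0.3])

# Exact comparison fails due to floating point rounding ...
assert not (x[0] == x[1])
# ... but allclose compares within a small tolerance.
assert np.allclose(x[0], x[1])

# Filter out nan and inf entries:
y = np.array([1.0, np.nan, 2.0, np.inf])
valid = y[np.isfinite(y)]
```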

I hope you found a new thing or two while reading this article 🙂


P.S.: The right way to import numpy is to write “import numpy as np” or “import numpy”. Not “from numpy import *”. The latter clutters your namespace, and makes it very hard to know which function(s) are actually from numpy and which ones are not.

kdev-python on Windows: try it!

I spent the last two or three days playing around with KDE on Windows, with the aim of getting my Python language plugin for KDevelop to run there. In the end, it wasn’t that hard to get this to work — not as hard as I would have expected it to be, anyways. Several things needed to be fixed in kdev-python, the build system required a bit of tickling, but finally I produced an installer for KDevelop 4.90.92 which contains kdev-python:

http://files.svenbrauch.de/kdevelop-x86-setup-4.90.92.exe (SHA256: aa12f8b695c391b5b51fbba112d1ea3d10e9f032262cb8181be634387cd75fcc)

(Update: I’ll put future updates to the installer into http://files.svenbrauch.de/kdevelop-windows, look there for news; I won’t update the link in the post every time)

Don’t consider this a beta release or something. It’s just the current state of things.

You can just run this installer, launch KDevelop, and it should all work. To get completion for the Python standard library, you need to have a Python interpreter (e.g. from python.org) installed and in your PATH before you start KDevelop. The kdev-python debugger currently does not work (or actually it does work, but the toolbar to control it is missing *grin*). Also it’s probably all a bit slow, because I built some components in debug mode … I think.

Note that KDevelop on windows still has a few quirks, and kdev-python on windows probably has a few extra quirks on its own, so don’t expect this to be too stable. That said, I loaded large amounts of code into it and it didn’t crash and seemed to work just fine.

Here’s a few screenshots:

Code completion
Outline view
Run program (make sure to have python.exe in your PATH or provide the full path to the python interpreter in the “Configure launches” dialog)
variable tooltips

Let me know if this works for you, or what problems you encounter.

Kudos to Kevin Funk for doing lots of work on KDevelop on Windows and helping me set this up!

kate / KDevelop sprint in Berlin

The last few days, we had a joint kate and KDevelop sprint in Berlin. I finally found the time to fix up the Python language support plugin for KDevelop, which was in a quite bad shape in the last few months. Most importantly, the 1.7-py3 branch now supports Python 3.4.3, and the master branch supports KF5 (this mostly thanks to Laurent, who did most of the porting) and Python 3.5, is free of kde4support stuff, and has had most of its more obscure features tested and fixed. Both branches will see a release in the not-too-distant future.

Python 3.5 support

Python 3.5 support includes support for the “async def” and “await” keywords, as well as for the matrix multiplication operator “@”. kdev-python doesn’t really do anything special for those, but they don’t cause syntax errors.

Python 3.5
Python 3.5

PyQt5 support

The master branch (for KF5 / Python 3.5) now has support for PyQt5. It should be quite complete; I can really recommend using kdev-python for developing PyQt applications, that is something it really shines at.

Working with PyQt in kdev-python

New features

There are a few minor new features as well: bytes (b"Foo") are now finally handled properly (and are distinct from str). You can now call class instances, which invokes the __call__ operator:

Calling a class instance gives the return type of the __call__ operator, if that exists.

To some extent (I want to improve this further), the plugin now tries to group type guesses by common properties if there are too many distinct types. In the following screenshot, you can see how it deduces, from the fact that the function is called with a list of strings, a list of ints, and a tuple, that the argument is probably some kind of iterable object.

To some extent, similar types are summarized in the description.

I added or improved support for some modules in the Python standard library which are written in C or which heavily rely on metaprogramming and can’t be parsed so easily:

Code completion for the “re” module was missing for long enough.


kdev-python for Python 3 and KF5 is now again available on the AUR.

The few parts of code completion which rely on it (format string completion inside strings) have been ported to kate’s new template API.

Thanks to Christoph, kate once again has a (now built-in) autobrace feature. I spent some time on making it a bit more clever.

I also spent quite some time to go through the bugtracker and fix various small issues pointed out there. Not counting wishes, kdev-python currently has < 10 open bug reports, only one of them severe (I plan to tend to that tomorrow).

It was a fun experience being here once again; I’m looking forward to meeting you all again soon!

21cm HI telescope: first results

During the last days, I had the chance to record some radio astronomy data with my 1.4 GHz telescope. The receiver system is more or less unchanged compared to what I described in the last post, except that I have pretty versions of the amplifier and filter boards now:

1.4 GHz Low-Noise preamplifier, pretty version. The amplifier has a gain of about 19 dB and a noise figure of something below 1 dB (I can’t measure it exactly).
1.4 GHz bandpass filter, with a bandwidth of something around 100 MHz. Insertion loss is around 4 dB. The stubs were manufactured a bit longer than I suspected what would be the right length, and then trimmed with a knife to give the best bandwidth center.

For the measurements I performed until now, I only used a small biquad antenna without a parabolic reflector. The collecting area is thus very small, but nevertheless the signal is already recognizable after about a second of integration time under good conditions.

The antenna points (well, “points”: it has an opening angle of 60 degrees) in the same direction during the measurement, which might last a whole day or so. The earth’s rotation moves different parts of the milky way through the field of view over time. By performing this measurement, one obtains a plot of intensity over frequency for each point in time, as can be seen below.

A single line of data (after on-off-calibration, see text below). x is frequency, y is amplitude (both in arbitrary units). The broad peak around the center is the HI emission. This plot contains about 200 seconds of data. The narrow peaks are RFI (Radio Frequency Interference) from human-made sources nearby.

This is not the raw data; in the raw data, the actual signal is nearly unrecognizable because of the bandpass shape (which is basically a frequency-dependent noise floor). To remove this frequency-dependent noise floor, one can perform frequency switching: for each time, two measurements are made, one at a frequency where the signal is expected (“on”) and one where there’s no signal nearby (“off”).

The on (green) and off (blue) curves from which the above plot was obtained by calculating (on – off)/off.

By pointwise subtracting the two signals and dividing by the “off” curve, one obtains a much clearer view on the signal.

The next step in data processing is the removal of an artifact caused by the frequency switching procedure: the “off” curve will have a slightly different base signal level from the “on” curve, because it is recorded at a different frequency. This change in the baseline is approximated by a first-order polynomial and then subtracted.
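This baseline fit can be sketched with np.polyfit (a sketch with synthetic data, not the exact code used):

```python
import numpy as np

channels = np.arange(4096)

# A fake spectrum: a linear baseline offset plus a Gaussian emission line.
baseline = 0.002 * channels + 1.5
line = 4.0 * np.exp(-((channels - 2048.0) / 80.0)**2)
spectrum = baseline + line

# Fit a first-order polynomial to the line-free channels only,
# then subtract the fitted baseline from the whole spectrum.
line_free = (channels < 1500) | (channels > 2600)
coeffs = np.polyfit(channels[line_free], spectrum[line_free], 1)
corrected = spectrum - np.polyval(coeffs, channels)
```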

Original data (blue), fit of first-order polynomial to remove the frequency switching baseline (green), and result of the procedure (red), for different times.

Through this algorithm, one gets from this plot:

Recorded data without baseline subtracted. x is frequency (arbitrary units), y is time (arbitrary units too; the whole plot spans a few hours).

… to this:

Recorded data with baseline subtracted.

which looks of course much closer to reality.

The last step which I perform is an RFI (Radio Frequency Interference) removal algorithm, which is so unscientific that I don’t want to describe it here. But look, the plot gets much prettier by doing that:

Resulting recorded intensity with most of the RFI removed. x is in km/s now, and the black line is v=0 km/s. No LSR correction was performed yet.

Here are two more data sets obtained in the same way. Each picture contains information from about 1000 GByte of raw sample data.

A full day of data when looking westwards. Unfortunately, later on some strong RFI makes the data look not so pretty (vertical blue stripes). You can nicely see how the same signal as in the beginning appears again at the bottom of the plot, when the same part of the milky way transits through the field of view.
About a day of data when looking towards zenith. In this plot, one can nicely identify at least two different spiral arms of the milky way. Also observe the weak, but broad background signal which extends almost up to +200 km/s.

If you want to take a look at the data yourself, here’s the record file from which the first plot was produced (data format is timestamp – on/off – cal/uncal – 4096 spectral channels of data, in arbitrary units, all per line).

Just for the record, here’s some more stuff I found interesting along the way:

System temperature in units of calibration temperature over time (x is time in arbitrary units, but it’s several hours over the full plot, y is the factor by which TSys is larger than TCal). The whole calibration temperature thing is not working terribly well yet, so take this with a grain of salt.
This is my data recording tool, previewing data it is currently recording. x axis is frequency, y axis is time (downwards is later time). Color encodes intensity. In this picture, you can see the milky way set; the broad green stripe around the x=2100 label is the HI line from the milky way, which disappears towards the top. The horizontal blue stripe is somebody accidentally pulling the plug of the preamplifier for a few minutes 🙂

21cm hydrogen line telescope, version 2

Much has happened since my last status report on my attempts to observe the galactic hydrogen emission at the famous 1420.4 MHz hyperfine structure line. Now, the new setup is finally proven to be basically working, and in this post I want to describe some of the changes I did.

New data acquisition module

Most importantly, I replaced the spectrum analyzer with a self-made data acquisition unit which is basically a Software-defined radio (SDR). Its purpose is to take the 1.4 GHz signal, digitize it, and send it to a computer.

Revision 2 of the new data-acquisition board. Signal input is at the top right port, then there is a lot of analog stuff (details below) and the signal is finally fed to the microcontroller (the big black IC), which passes it to the computer via USB. The plug in the upper left corner is an additional power supply.

Functionality description

The board contains various submodules, most of which are labelled in the following image.

Some of the components of the acquisition board.

The first thing which happens is down-conversion through a so-called mixer [data sheet of part used here]: it takes the portion of the 1.42 GHz signal we are interested in and moves it down to a lower frequency of approx. 150 MHz. This has various advantages; especially it makes further filtering, amplifying and digitizing the signal much easier. Since we are only interested in a small part of the signal, this conversion can happen in a way such that no information is lost.
Mixing works by multiplying the actual signal with a reference signal, the so-called Local Oscillator (LO) [data sheet of part used here]. You can imagine the LO as a component which generates a sine wave at a specific, but programmable, frequency (in this case around 1.27 GHz); then the mixer basically subtracts this frequency from the frequency of the original signal (so in this case, with 1.42 GHz and 1.27 GHz you end up at 0.15 GHz = 150 MHz). The lower-frequency signal (the 150 MHz in this case) is called the Intermediate Frequency (IF) signal.
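The sum/difference behaviour of an ideal mixer is easy to verify numerically. The toy model below multiplies two sine waves (frequencies scaled down so the arrays stay small; this is an illustration, not the board's firmware) and looks for the two strongest components in the spectrum of the product:

```python
import numpy as np

fs = 10.0                 # sample rate in GHz (scaled toy units)
n = 2000
t = np.arange(n) / fs
rf = np.sin(2 * np.pi * 1.42 * t)   # "antenna" signal at 1.42 GHz
lo = np.sin(2 * np.pi * 1.27 * t)   # local oscillator at 1.27 GHz
mixed = rf * lo                     # what an ideal mixer outputs

spectrum = np.abs(np.fft.rfft(mixed))
freqs = np.fft.rfftfreq(n, d=1 / fs)
peaks = sorted(round(float(f), 2) for f in freqs[np.argsort(spectrum)[-2:]])
print(peaks)  # [0.15, 2.69] -- the difference and sum frequencies
```

In the real circuit, the filters after the mixer keep only the 0.15 GHz difference component.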

Next, there are various filters and amplifiers which are very simple to understand; they just make the signal amplitude large enough so that it can comfortably be detected by the analog-digital-converter later on.
This part of the circuit ends with a so-called Surface Acoustic Wave (SAW) filter [data sheet of part used here], which is a quite modern kind of filter with the special property that it can select a relatively small part of the signal very accurately while blocking everything else almost completely. It only passes frequency components between about 148 and 151 MHz. Which part of the original signal ends up in this range (remember, we move the original signal to a different frequency) is determined by the mixing step: by programming the local oscillator (LO) to a different reference frequency, one can look at the input signal’s spectrum one part at a time.

The bridge between the analog and the digital world is built by the Analog-Digital-Converter (ADC) [data sheet of part used here]. It gets an input clock from the microcontroller, which is just a signal which toggles up and down exactly 11.25 million times per second; on each tick of that clock, it measures the voltage at its input pin, and converts it into an 8-bit digital value. Each of the bits of that value is then provided as a voltage level (either +3.3V for “1” or 0V for “0”) on one of the output pins. They are read in by the microcontroller [data sheet of microcontroller], packed into chunks of a few kB in size in the controller’s RAM, and then sent out via USB.
Getting the microcontroller firmware to do this as fast as required was surprisingly nontrivial, but that’s such a broad topic that I do not want to get started on it here.
The computer receives the data, calculates the Fourier transform to get the frequency spectrum, and averages the resulting spectra (see end of post for software screenshots).
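In code, the PC-side processing boils down to something like the following sketch (not the actual software; the offset-binary interpretation of the 8-bit samples is an assumption):

```python
import numpy as np

def averaged_spectrum(samples, fft_len=4096):
    # Split the 8-bit ADC sample stream into blocks, Fourier-transform
    # each block, and average the power spectra; averaging many spectra
    # is what beats the noise down.
    x = samples.astype(float) - 128.0          # assumed offset-binary coding
    n_blocks = len(x) // fft_len
    blocks = x[:n_blocks * fft_len].reshape(n_blocks, fft_len)
    power = np.abs(np.fft.rfft(blocks, axis=1)) ** 2
    return power.mean(axis=0)
```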

(If you have ever heard of the Nyquist frequency, you might wonder how it is possible to digitize a 150 MHz signal with just 11.25 MSa/s sample rate. This is why we need the narrow SAW filter in front of the ADC: it limits the bandwidth of the signal to 3 MHz and thus allows the use of the undersampling technique. In this case, the ADC samples the 26th alias of the original signal.)
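The alias arithmetic can be checked in a few lines:

```python
# Where does a 150 MHz signal land when sampled at 11.25 MSa/s?
fs = 11.25e6
f_if = 150e6
zone = int(f_if // (fs / 2))       # Nyquist-zone boundaries below f_if
f_apparent = f_if % fs
if f_apparent > fs / 2:            # odd zones appear spectrally mirrored
    f_apparent = fs - f_apparent
print(zone, f_apparent / 1e6)      # 26 3.75 -> the 26th alias, visible at 3.75 MHz
```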

Below is an illustration of the paths the signals described above take through the board. Note that almost all of the blue path is at 150 MHz; just the first centimeter up to the small black mixer IC is at 1.4 GHz.

Signal paths in the data acquisition board.

Manufacturing the board

This is a 4-layer board; that means, it has tracks on the top and bottom side, and two metal layers in between (you cannot see them, they are glued inside the board) which also have tracks. Connections between the layers are established by metal-filled drill holes (in this case about 1000 of them, each 0.3mm in diameter).
Having 4 layers makes it much easier (or possible at all) to arrange tracks in a way that they do not cross, and also has some important advantages for high-frequency signals. Unfortunately, it also makes fabrication of the prototypes more expensive. There are various Chinese vendors which will manufacture a small number of such boards for an affordable price; including customs and shipping it ends up costing about 100 € for 5 pieces (ordering fewer pieces doesn’t get any cheaper, though). Components are extra, and you have to put them on the board yourself.
There is specific software to design such circuit boards; first, you usually draw a schematic, and then you use the layout editor to place components and tracks in accordance with the schematic. The layout itself quickly becomes hard to keep an overview of; the software guarantees electrical equivalence between the easy-to-read schematic and the layout (e.g. it guarantees you don’t have wrong connections, missing connections, or tracks which cross or are too close together — of course only if you got the schematic right).
The software exports vector-graphics-like files for each layer plus a list of where holes must be drilled, which is then sent to the manufacturer.

This board’s layout loaded in kicad. Only the top (red) and bottom (green) layers are shown.
A raw board without components (but with soldering paste on it) and the stencil for the board. Also an example for how not to use a stencil.

For this board, I decided to use a stencil, which is a rather thin piece of metal with cutouts everywhere a component is supposed to be soldered onto the board. The stencil is put on top of the board, then you smear some solder paste on it and force it through with a credit card; then you lift off the stencil so that the solder paste remains only in the places where it belongs. You can then put the components into their spots and heat the whole thing; the solder paste melts and, after cooling down again, forms a firm, electrically conductive bond between the component and the pad on the board. After a few attempts, I managed to get a good result for the paste print: it is important to firmly fix the stencil in the right position, then press the paste through with a single swipe, and then remove the stencil again by lifting it straight up carefully.

Board with solder paste, ready for component placement.
Board with components placed, after heating it up and letting it cool down again (and fixing a few things by hand). Most of the solder joints look pretty good.

With all the software and tooling stuff already in place from a first version of the prototype, it was quite easy to get this board to work as expected. Only very few minor fixes were needed.

Mirror filter

When mixing a signal to a different frequency as described above, there is one problem I did not mention so far: if I mix 1.27 GHz with a broadband signal (as it comes from the antenna, it doesn’t only contain the 1.42 GHz part but lots of uninteresting noise at basically all other frequencies as well) of which I am interested in the 1.42 GHz part only, there are actually two frequencies of the broadband signal which will end up at 150 MHz in the intermediate frequency (IF) signal. This is the 1.42 GHz = 1.27 GHz + 0.15 GHz part I want, but also the 1.12 GHz = 1.27 – 0.15 GHz part, which I have no interest in. After the mixing step, those two signals are added together and cannot be separated again. It is thus necessary to remove the 1.12 GHz part of the signal before it reaches the mixer.
To do this, a bandpass filter is needed. I did some simulation with Sonnet (give it a try if you want to do stuff like this, it has a quite functional free version), which basically allows you to simulate electromagnetic waves in planar (i.e. “mostly 2D”) arrangements of metal, and ended up with this magic-looking design:

Geometry for a microstrip bandpass filter at 1.42 GHz.

The green stuff is copper, white is just board material, and the labels 1 and 2 mark the input and output ports. This is the simulated filter response:

Simulated filter response for the optimized design.

Looks good! It passes the 1.42 GHz signal nicely, and blocks the unwanted mirror signal at 1.12 GHz by 40 dB (a factor of 10000 in power).
For a prototype, I then proceeded to cut this geometry into a copper board by hand …

Hand-made prototype (with a cutter knife).

… and measured it to compare with the simulation. As expected, it has more loss in the passband than in the simulation (reality always has more loss than expected) but not much, and the shape of the response is much more irregular, probably mostly because of manufacturing inaccuracies. Still, the important parameters agree really well with the simulation, which I find quite impressive.

Measured filter response. Note that the field of view is much wider to the left than in the simulation image shown above (it goes down to f = 0 Hz here); the two markers M1 and M2 are in the same spots though. Everything below the y = -40 dB line can be completely neglected in practice for this application.


Now, finally all the important components are in place! I set up my 1.2m dish and pointed it at the sky, and tadaa — the familiar HI signature of our galaxy showed up:

The two wide peaks probably are two spiral arms of our galaxy. The slope in the background signal (more signal to the right, less to the left) is wideband interference from some human-made source. About 21 gigabytes of data were acquired and averaged (to reduce noise) by the board described above to create this image. Integration time is 12 minutes, total measurement time 24 minutes (12 minutes on-source observation and 12 minutes background noise measurement).

I really need to repeat this test outside of a city full of people with interfering electrical devices (that really makes a huge difference), but as a proof-of-concept, this is completely sufficient. As such, this is the first spectrum I acquired with all significant components in the reception chain made by myself 🙂
The signal is calibrated by a method called frequency switching: one periodically shifts the observation frequency by a few MHz, and thus observes just the background noise. This background noise is then subtracted from the actual measurement. This might sound like it reduces noise (it does not, but that’s not trivial to understand), but the real gain is that the frequency dependence of the IF part of the acquisition board is removed from the signal. Why this is important may become evident from the following image:

Screenshot of the PC software (written specifically for this board). The curve on the right is the acquired spectrum, and contains information from about a million ADC samples Fourier-transformed and averaged together. One can nicely see the bandpass; the shape is almost entirely determined by the Surface Acoustic Wave (SAW) filter mentioned above. On the left, there are controls for adjusting the amplitude and for looking at different parts of the spectrum by changing the local oscillator frequency.

None of the peaks shown there is an actual astronomical signal; the signal shown in the plot is far too weak to see it here. The non-flatness of the band is created by the bandpass filter (the sharp spikes are interference); this non-flatness should be calibrated away in order to reliably distinguish real signals from the bandpass shape.

Spectrum obtained by frequency switching. The structure of the bandpass is mostly removed. (You might wonder why the edges left and right are still visible; this is due to wideband interference signals with non-constant amplitude. In a more noise-free environment, those should not be visible either and the baseline should be completely flat)

As you can see, the bandpass shape is removed, and the actual astronomical signal is visible in the middle (this is already averaged for a while, though).
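As a toy model (the shapes and numbers are made up for illustration), the frequency-switching subtraction looks like this:

```python
import numpy as np

# Both spectra share the same instrumental bandpass; only the "on"
# spectrum contains the (weak) line. Subtracting the "off" spectrum,
# taken with the LO shifted by a few MHz, flattens the baseline.
x = np.linspace(-1, 1, 512)
bandpass = 1.0 - 0.5 * x**2                      # fake SAW-filter shape
line = 0.01 * np.exp(-((x - 0.1) / 0.05) ** 2)   # fake astronomical line
on_spec = bandpass + line
off_spec = bandpass                               # LO shifted: no line
flat = on_spec - off_spec
print(round(float(flat.max()), 3))  # 0.01 -- the line survives on a flat baseline
```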

Another interesting thing is to look at the waterfall plot of the record, i.e. encode signal intensity by color, and use the y-axis for time instead:

Waterfall diagram of the acquired signal.

The relatively faint green vertical stripe around x=1000 is the signal we are looking for. Other vertical stripes are narrow-band interference; horizontal stripes are temporary but broad-band interference.

Next steps

The next thing I will try to do is build another of those boards, and do a bit of interferometry. Stay tuned!

Snippets in Kate 5

Recently I spent some time to port and clean up the Snippets plugin and the underlying template interface for Kate 5.  It’s now fully working again and more powerful than ever. The template code was originally written by Joseph Wenniger and most of what I show here is still working like originally implemented by him. Still, there were some improvements I would like to show; also, I’m sure many readers might not be aware of this great feature at all.

Classical snippets use case: insert a for loop without having to type the iterator variable three times.

The template interface, which is part of the long-time stable KTextEditor API, was heavily cleaned up and now just consists of a single function

    bool insertTemplate(const KTextEditor::Cursor& insertPosition,
                        const QString& templateString,
                        const QString& script = QString());

which inserts a template into a view at the given position. It’s very easy to use and still powerful — if you write an application which uses KTextEditor, it might be worth to spend a moment thinking about how you might be able to make use of it.
I also heavily refactored the implementation of the interface. More than 1000 lines of code were removed while effectively enhancing functionality. 

Core functionality changes

I changed the language of the snippets a bit to make it more clear and easy to use. In the following, I want to give a short overview of how it works now.

The heart of the templates (or snippets) are editable fields (shown in green). They are created in the template string by writing ${fieldname}. They can have a default value, which can be any JavaScript expression. Pressing Tab jumps between the fields of a template. Whenever such a field is changed, all so-called dependent fields are updated. Those can simply be mirror fields (created by having a second field with the same name), or can do something which depends on the contents of the other fields in the template, such as perform replacements or concatenations. Again, you can have arbitrary JavaScript expressions doing that.
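To make this concrete, the for-loop snippet from the screenshot above might be written roughly like this (a sketch showing only field mirroring and ${cursor}; default values are omitted):

```
for (int ${i} = 0; ${i} < ${end}; ${i}++) {
    ${cursor}
}
```

The first ${i} is the editable field, the later occurrences with the same name are mirror fields which update as you type, and ${cursor} is where the cursor ends up when the handler exits.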

An example snippet (not very useful in practice) which has three editable fields (find, replace and sample_text) with a default value for each. Changing the values will update the result in the red “dependent” field in real-time.

Noticeable improvements over the previous functionality (from KDE 4 times) is that you can have fields with arbitrarily complicated default values which are still editable, and that the dependent fields can use all other fields as input (not just one like in KDE 4). It is now also possible to have inline JavaScript doing simple operations in the template.

The Shortcuts feature for the snippets now actually works in Kate.

Snippets now also have proper undo; in KDE 4, only a single character typed could be undone at once while editing a snippet. Now, undo grouping works like it always does.

User interface improvements

For easy testing of your snippets, the “Edit Snippet” dialog now has a “Test snippet” button, which lets you test your snippet on the fly.
The user interface was simplified by removing unneeded options, and an inline quick-help feature was added which introduces the user to the most important features of the snippet language. Just click the “More” button.

Inline documentation on how snippets work

An example: C++ Header guards

As an example for how this feature works, let’s look at how to create a snippet to generate a C++ header guard. First, create a repository for your C++ snippets:

Open the Snippets toolview and click “Add Repository”.

Then, enter a name and specify that you want this only for C++ files:

Create your new repository.

Then, add a snippet:

Add a snippet. Easy.

You can retrieve the document’s file name from the editor, make it upper-case and replace dots by underscores automatically to get a nice header-guard-suitable format by using code like this:

Example code for how you can create C++ header guards fully automatically.

If you do not want the guard field to be editable, just create a function which does the upper(fileName…) stuff, and have three fields which call the function (like ${func()}) instead of the two mirror fields and one default-valued editable field. If you do that, the template handler will immediately exit and not present any editable fields.
The ${cursor} variable can be used to place the cursor after all fields were filled. When you type something there, the handler will exit.

Click Ok. Now, to use your snippet, either press the shortcut you defined (if any), click it in the snippets toolview, or use code completion:

Snippets appear in code completion.
Result after executing our new header guard script. A sensible default value was selected automatically. Pressing Escape or Alt+Enter will exit the template handler and place the cursor at the point marked with ${cursor} in the template.

That should hopefully equip you with most of the knowledge you need to write your own snippets. If you like, you can use the full Kate scripting API to write snippet code — it for example allows you to retrieve the text in the current selection and similar useful things.

Some more examples on what you can do

Here’s a few snippets demonstrating the features of the engine while partly being of debatable practical relevance. I’m sure you can come up with better use cases for some of those things though.

Write a clean regular expression in a comment and have the snippet mirror it with added extra-backslashes and removed spaces in a QRegularExpression variable. Makes regular expressions even more write-only than they already are.
Get the file encoding from the editor and use it as the coding of a python file header.

Some base64 in the selection …

… decoded by a snippet which takes the selection and inserts the base64-decoded result.

Next steps

My next step will be to make this plugin loadable in KDevelop as well — which should be quite easily possible due to the awesome work done in Kate to make the plugin infrastructure more generic. If you have further ideas on how to improve the snippets, let me know 🙂

1420.4 MHz Hydrogen line: There it is!

With a reasonably simple setup, I finally succeeded in detecting the 1420.4 MHz galactic hydrogen hyperfine structure line:

One of the first successful measurements. The little bulge in the center of the image is the hydrogen line. The sharp, high peaks are man-made interference and not part of the astronomical signal. The yellow line is with the telescope pointed towards the milky way; the pink line is for a different location far off (the bulge is gone here, as it should be). The different base levels are probably mostly caused by different elevations leading to different amounts of radiation from the ground reaching the antenna.


The setup consists of a 1.2m dish with a feed and two homemade 1.4 GHz low-noise preamplifiers. The amplifiers have about 19dB of gain each at 1.42 GHz and a noise figure too small for my equipment to measure (certainly below 2dB, I would claim). The simulated noise figure is about 0.4dB, but that is without taking the Q factor of the matching components into account. A spectrum analyzer (Rigol DSA 815) is used as the detector for now.

Preamplifier made by hand: traces are cut in a copper-coated epoxy board and components are soldered onto the board (linear voltage regulator as example). For small boards with low component count I found this technique to be significantly simpler than toner-transfer etching — and no less effective.
Dish with mounted feed (for the latter, see text and pictures below)

Since you obviously cannot simply attach a coaxial cable to a parabolic dish reflector, a so-called feed is required to absorb the radiation collected by the dish reflector. I adapted a feed design commonly used for wireless LAN for the HI frequency (which is done by just multiplying all lengths by the ratio of the two frequencies). This design consists of a biquad antenna over a ground plane with two reflectors on the side (those are to make the radiation pattern more symmetric and fit the dish better). It provides an excellent match at the design frequency and seems to work well.

Biquad feed. A piece of wire forming two squares is placed above a ground plane. The coaxial cable is attached at the back side and is connected to the wire. The feed can be moved back and forth to focus. Theoretically. If I had a criterion for when it is in focus. (Seriously though, I moved it towards the dish a bit and saw the spillover decrease (less unwanted radiation from the ground reaching the feed, i.e. base signal level drops) thus I picked a point roughly where I saw no further significant decrease in spillover)

Biquad feed and the two amplifiers from the back side. The preamplifier is connected directly to the feed with a SMA connector. Note the paper towel roll wrapped in tape which is hot-glued to the feed for mechanical mounting (hey, it works!)

S11 of the biquad feed shown above. After a few adjustments, it provides an excellent match of more than 18dB return loss at the design frequency. That number means that less than two percent (10^(-18.21/10)) of the signal is lost due to feed mismatch.

For the more serious measurements I conducted, the detector (digital spectrum analyzer, gray box below — the thing which makes the black screenshots with the yellow curves in them) is controlled by a computer. It records 4 sweeps with 30 seconds duration each and about 1 MHz bandwidth and averages them; the computer fetches the result and stores it on disk. The camping mat is metallic and serves as interference protection (yes, it actually works … a bit).

Metallic camping mats designed to insulate the user from low temperatures can also be used as Faraday cages — with moderate success.
The complete setup. With chairs to prevent people from tripping over the wires. Left: Parabolic dish antenna with feed; Center: Laboratory power supply to power the amplifiers; right: Notebook and spectrum analyzer for recording the data.


With this recording technique, I was able to take some nice data sets. In all of them, the antenna points into a fixed direction of the sky and the sky moves across the picture. The x (right) axis is frequency; the y axis is time in the upper graph, and intensity in the lower one. Intensity is encoded as color in the upper graphs. Thus, if you look in the upper graph from top to bottom, you actually see different regions of the sky moving across the antenna. In the lower graph, you see the intensity of radiation of all those regions added together (the lower graph is the upper one integrated along the top-to-bottom axis, so to speak). All intensity scales are logarithmic.

Started quite late in the night, with about 80° elevation (almost straight up); different parts (roughly Cyg, Cas, Perseus) of the milky way move through the picture.

Unfortunately, my aiming is quite inaccurate and the antenna beam size is 10 degrees, so I can only very roughly tell what is currently visible in the picture. Still, in the picture above, you can clearly see three wide peaks which are HI emission, and three narrow peaks which are interference. We can remove the interference to get a nicer picture:

Same picture as above after interference removal algorithm.

This works by manually flagging the locations of the interference peaks, and then fitting a curve of the form a*x^2+b*x+c+d*exp(-f*(x-e)^2) to those locations (second-order polynomial for the “real”, wide spectrum and a gaussian-shaped peak for the interference).

Small section of the spectrum shown above with interference peak (center). The blue line is data, the green line is a fit of a*x^2+b*x+c+d*exp(-f*(x-e)^2) to the data.

The d*exp(-f*(x-e)^2) term is then simply subtracted from the data, which leaves only the baseline, without the interference peak. This method even partially “recovers” data hidden by the interference peak (how reliable that information is is a different question, but in this case it looks fine).
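A sketch of this fit-and-subtract step using scipy (the model matches the formula above; the initial guesses and the function names are my own):

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b, c, d, e, f):
    # second-order polynomial baseline plus a Gaussian interference peak
    return a * x**2 + b * x + c + d * np.exp(-f * (x - e) ** 2)

def remove_rfi_peak(x, y, peak_location):
    # Fit the combined model around a manually flagged peak location,
    # then subtract only the Gaussian term so the baseline survives.
    p0 = [0.0, 0.0, float(np.median(y)),
          float(y.max() - np.median(y)),   # rough peak amplitude
          peak_location,
          50.0]                            # initial width guess (narrow spike)
    popt, _ = curve_fit(model, x, y, p0=p0, maxfev=10000)
    a, b, c, d, e, f = popt
    return y - d * np.exp(-f * (x - e) ** 2)
```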
After this treatment, you can clearly see three wide bulges in the curve.
You might have noticed that the x axis is not labeled with frequency, but velocity. This is because the actual emission of the kind of radiation observed here only happens at one specific frequency (1.4204 GHz — where the 0 in the graph is). Still, we see it at different frequencies because of doppler shift. Thus, the observed frequency of the radiation translates directly to the velocity with which the matter which emits the radiation moves towards us — or away from us.
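The frequency-to-velocity conversion is just the classical Doppler formula applied to the HI rest frequency. A sketch (I use the radio convention where a lower observed frequency gives a positive, receding velocity; sign conventions differ, so treat the sign as an assumption):

```python
C_KMS = 299792.458        # speed of light in km/s
F0_HZ = 1420.405751e6     # HI rest frequency

def freq_to_velocity(f_obs_hz):
    # Radio-convention Doppler velocity: positive = redshifted (receding).
    return C_KMS * (F0_HZ - f_obs_hz) / F0_HZ

print(round(freq_to_velocity(1420.0e6), 1))  # 85.6 km/s for a ~406 kHz redshift
```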
I would carefully claim that the three observed peaks originate from three different spiral arms of the milky way which rotate with different velocities. I’m not sure if that is accurate though.

With the same method, I took several more similar pictures of different regions of the milky way, some shown below.

Transit of (roughly) Perseus through the beam.
Transit of (again, roughly) Sagittarius / Aquila through the beam. Note that — different from the pictures above — all the matter observed here moves towards us (positive velocity), not away from us.

Future plans

My long-term plan is to attempt creating a survey of a significant part of the sky — basically a map which tells how much HI radiation at what doppler shift is observed at which point of the sky. For that, I need two things: A reliable way to determine the antenna position; and a consistent way to compare signal amplitudes.
For the former, I’m currently trying to build a tilt-compensated compass with elevation sensor (basically a three-axis magnetic field sensor and a three-axis acceleration sensor with software). It’s working a bit, but not really.
For the latter: in professional radio astronomy, one tool which is used is a noise diode. That is a small device which injects noise into the receiver system at the very start of the signal path (in my case, it’s a small device inside the biquad feed). It is periodically switched on and off and adds a constant offset to the amplitude of the observed signal. The trick is that this offset is reliably constant, over long periods of time. When we see it change in the recorded data (and we will), we can be fairly sure this change is caused by the receiver system (amplifier, detector) changing — for example because of temperature drift. Thus, by dividing by the observed amplitude of this constant offset, signal amplitudes can be evaluated even if the receiver and detector system drifts.
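A sketch of how such a calibration could be applied to recorded spectra (the names and the on/off bookkeeping are my own, not the actual pipeline):

```python
import numpy as np

def gain_estimate(spectra, diode_on):
    # 'spectra' has shape (n_records, n_channels); 'diode_on' is a boolean
    # mask marking records taken with the diode switched on. The on/off
    # difference equals the diode's constant noise power times the current
    # system gain, so it tracks gain drift over time.
    on = spectra[diode_on].mean(axis=0)
    off = spectra[~diode_on].mean(axis=0)
    return on - off

# usage sketch: calibrated = spectra / gain_estimate(spectra, diode_on)
```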
I built a zener-diode based noise generator which seems to work fine for my purpose. It can be switched on and off using a transistor and the Raspberry Pi GPIOs, and is powered by battery to get as little fluctuation as possible in the noise signal itself. The generator is attached to a small piece of wire as an antenna and is glued inside the feed.

Noise diode test. Pink curve: noise diode powered on; yellow curve: noise diode off. This is without the feed and amplifiers, but it looks similar with them.
Noise diode test with actual sky data. Unfortunately, no real astronomical signal visible this time 🙁 The upper graph is as explained above (x-axis frequency, y-axis time), the lower graph has time on the x-axis and integrated intensity on the y-axis. The spikes in the lower graph and the stripe pattern in the upper graph is the switching of the noise diode. Performing a calibration would bring all the spikes to an equal height, and then remove the spikes.

Caveats of using a spectrum analyzer as a detector for radio astronomy

It took me a while to figure out how to best configure the spectrum analyzer for this purpose. One thing which is easy to overlook (because it is nearly irrelevant in most applications) is that a spectrum analyzer does sweeping measurements, i.e. it measures intensity at one frequency span at a time, then goes to the next, etc. This means the resolution bandwidth (RBW), which basically controls the spectral resolution of the analyzer, also controls how much signal power is detected at once during the sweep. If it is set to 100 Hz, the analyzer will walk through the whole frequency span in 100 Hz chunks, detecting only 100 Hz of the spectrum’s power at once. If set to 10 kHz, it will detect a hundred times as much power at once — which gives an effective signal-to-noise ratio which is a hundred times better in the end! This is very unintuitive, because the noise level displayed by the analyzer actually increases for higher RBW values (which makes sense of course if you think about it: if you accumulate a larger part of the spectrum into a single channel, that channel will have more noise power overall).

Thus, when using a spectrum analyzer for this purpose, you have a trade-off between S/N and spectral resolution (you always have a trade-off between S/N and spectral resolution, but in this case it’s far more grave than usual — exponent 3/2 instead of the usual 1/2 for the channel count, if I’m not mistaken). I selected 10 kHz spectral resolution; more resolution (lower RBW) certainly makes no sense for this kind of signal. Probably 30 kHz would be fine as well — but that makes interference detection and removal harder again, because the narrow interference peaks get quite smeared out.
This also means a spectrum analyzer is not a very good detector for this kind of telescope — with 10 kHz RBW and 1 MHz bandwidth (about what I used above), 99% of the signal power is lost simply because it is not detected at any given time.
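The numbers behind this statement (a back-of-the-envelope check, not a measurement):

```python
# With a sweeping analyzer, only one RBW-wide slice of the span is
# detected at any instant, so the detected fraction is RBW / span.
span = 1e6        # Hz
rbw = 10e3        # Hz
duty = rbw / span
print(f"{(1 - duty) * 100:.0f}% of the signal power goes undetected")  # 99%
```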


It is very nice to finally see some results come out of this project. I’m looking forward to improving the receiver system (eventually I want to replace the analyzer by an A/D converter + mixer) and the calibration process.