I recently rewrote most of kdev-python’s code completion, as the old code was a huge mess (it relied largely on regular expression matching, which just isn’t powerful enough to do this properly). The result is less buggy, easier to maintain, and has unit tests (yay!). In the process, I also implemented quite a few features, and I want to show a few screenshots of them here.
There’s a second issue I want to talk about in this post: a major problem which remains for kdev-python is the documentation data, that is, the way the program gets information about external libraries, for example the functions and classes in those libraries. I’d love to have some help here; if you want to help improve kdev-python, this is a great place to start! No knowledge of kdev-python’s code base is required, or even of C++ (everything that needs to be done here can be done in Python). Read more below in case you’re interested.
Another note: kdev-python now has its own component on bugs.kde.org. Please report any issues you might encounter there!
Code completion features
Besides the increase in maintainability, a number of features have been added or improved in code completion.
The primary reason for the rewrite was that I wanted to support multi-level function calltips, which just wasn’t easily possible with the old code. That feature works now:
As you can see, if the cursor is inside a function call which is passed as an argument to another function, you’ll get information about both functions (this works with an arbitrary depth, of course, not just with two functions). The parameter which you’re currently editing is highlighted in blue for both functions.
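As a minimal illustration, this is the kind of nested call the calltip covers. The functions are just Python built-ins picked for the example, not anything specific to kdev-python. With the cursor inside abs(...), both abs and the surrounding pow are open calls:

```python
# The cursor is imagined to be inside abs(...): the inner call is abs,
# and that call is itself the first argument of the outer call pow.
result = pow(abs(-3), 2)
print(result)
```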
Another improvement in function call completion is support for the “named parameters” feature: If you have a function which defines parameters with default values, you can specify their name followed by =, and then their value. This is now supported by kdev-python:
|Named parameter autocompletion
It’s even smart enough to not offer those items if you haven’t passed all non-default parameters already:
|Named parameters are only suggested if all non-default ones have already been specified
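The rule described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not kdev-python’s actual implementation (which works on source code, not on live objects); connect and named_parameter_candidates are hypothetical names made up for the example.

```python
import inspect


def connect(host, port, timeout=10, retries=3):
    """Hypothetical function: two required parameters, two with defaults."""
    return (host, port, timeout, retries)


def named_parameter_candidates(func, positional_given):
    """Names worth offering as "name=" items after positional_given arguments."""
    params = [p for p in inspect.signature(func).parameters.values()
              if p.kind == inspect.Parameter.POSITIONAL_OR_KEYWORD]
    required = [p for p in params if p.default is inspect.Parameter.empty]
    # Offer nothing while a parameter without a default is still missing.
    if positional_given < len(required):
        return []
    # Parameters already filled positionally are no longer candidates.
    return [p.name for p in params[positional_given:]
            if p.default is not inspect.Parameter.empty]


print(named_parameter_candidates(connect, 1))
print(named_parameter_candidates(connect, 2))

# The call the completion helps you write: defaults passed by name.
print(connect("localhost", 8080, retries=5))
```

With one argument typed, port is still missing, so the list is empty; once both required arguments are there, timeout and retries are offered.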
Note that all of the above screenshots use the “full completion” mode, which is only displayed if you explicitly request autocompletion with Ctrl+Space (by default). If you just type, minimal completion will be used instead, which is far less intrusive:
|Minimal completion widget for named parameters
The art of doing nothing
There are many cases where an IDE just cannot offer useful (semantic) auto-completion, for example if you need to pick a name for a variable, class, or parameter. kdev-python now has more sensible rules for not offering semantic code completion where it doesn’t make sense (the widget you see is the default kate text completion widget, which you could disable separately, but which is actually useful here):
|No semantic completion if you need to name a parameter
|No semantic completion if you need to name a variable
There’s now proper support for auto-completing a variable in a generator statement:
|Generator variable auto-completion
Notice the algorithm which guesses the variable you actually want to iterate over, which is not obvious just from the code the line contains.
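For reference, this is the shape of code in question; the names are made up for the example, and kdev-python’s guessing itself is not reproduced here. The loop variable is used at the start of the expression, before the part that introduces it, so by the time you type “for” the name you want is already on the line:

```python
words = ["kdevelop", "python", "completion"]

# "word" is used first and only introduced by "for word in words" afterwards.
lengths = [len(word) for word in words]

# The same holds for a generator expression passed to a function.
total = sum(len(word) for word in words)

print(lengths, total)
```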
Another new feature here is that if the last word is “in”, kdev-python will sort variables you can iterate over first in the list (note the green background colour):
|Sorting variables by usefulness
This is probably pretty useless, but it was easy to implement, so I did.
Completion for importing single declarations from a module is not quite finished and needs a bit of polishing (it currently completes stuff Python does not allow, and does weird things like adding parentheses if you import a function), but it’s definitely better than it was before:
|Code completion when importing single declarations from a module
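To make the parentheses remark concrete: in an import statement only names are listed, even when they refer to functions. A small example using the standard library (the path components are just sample strings):

```python
# Importing single declarations: names only, no call parentheses,
# even though join and basename are functions.
from os.path import join, basename

path = join("documentation_files", "PyQt4", "QtGui.py")
print(basename(path))
```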
Shebang line completion
A ridiculously trivial but nevertheless useful feature is completion for the Coding and Shebang lines:
|Code completion for the “Coding” line
Listing multiple encodings could be discussed, but doesn’t everyone use utf-8 anyways?
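For readers who haven’t seen them, this is what the two lines typically look like at the top of a script. The interpreter path is the common convention, not something kdev-python prescribes. Python only honours the coding declaration if it is on the first or second line of the file:

```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-

# With the encoding declared, non-ASCII characters can appear in the source.
greeting = u"Grüße"
print(greeting)
```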
Call for help: Documentation data
Note: I will use the word “documentation” in the following sense here: it means not only the docstring or help text, but also basic information like “which classes are there in this library”, “which methods does this class have”, “which parameters does that function take”, and “which type of object does that function return” — in other words, all important information about a library’s public API. This information is necessary for various things, most importantly semantic highlighting and code completion.
Description of the problem
In C++, providing documentation for libraries (like Qt), is… well, not easy, but it’s at least obvious how to do it: You parse the header file the user code “#include”s, and that will tell you everything you need to know (well, given that your analysis code is intelligent enough to fully understand the file’s contents, which kdevelop’s C++ code is). In Python, the situation is much more difficult:
- The first type of problem is libraries which are written in Python, but rely heavily on fancy Python magic to set themselves up (for example, modifying __builtins__, or iterating over lists of functions to do something with them); it will probably never be possible to determine the correct outcome of those setup scripts without actually executing them. Modules which currently suffer from this are, for example, os and parts of django (especially the ORM, but that will need some special-case handling in any case, which I have yet to think about). kdev-python is not intelligent enough to understand what those do internally.
- Even worse are libraries which are written in pure C. By default, kdev-python cannot do anything about those.
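To give an idea of the first kind of problem, here is a small made-up example of such setup code (the deg_ names are invented for this post and not taken from any real library). The functions only come into existence when the loop runs, so there is no “def” statement in the source for a static analyzer to find:

```python
import math

# Hypothetical module-level setup code: names are created in a loop,
# so no "def deg_sin" ever appears in the source text.
for _name in ("sin", "cos", "tan"):
    globals()["deg_" + _name] = (
        lambda x, _f=getattr(math, _name): _f(math.radians(x))
    )

# Works at run-time, but the name is invisible without executing the loop.
print(deg_sin(90))
```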
Python has the “pydoc” tool, which can be used to obtain parts of the required information. However, pydoc will (partially) execute the code it analyzes, which is a no-go. If you wanted to use pydoc in an IDE, you’d have to heavily sandbox it, which I don’t want to do. Thus, I dismissed using pydoc, at least at run-time.
The current solution is simple: Files similar to the header files C/C++ has are shipped with kdev-python in case the information kdev-python can extract from the files that are shipped with the library is not sufficient. An example where this works really well can be seen in the QtGui / QtCore modules; the documentation for those is provided in kdev-python/documentation_files/PyQt4/QtGui.py. Victor Varvariuc wrote a script which generates those files from the Qt documentation a while ago, and it works great.
So, where’s the problem, then? The problem is that PyQt is well-documented in a systematic way, and other libraries aren’t. Look at the Python standard library, for example: there’s no machine-readable documentation available which would even list classes and functions; function return types are documented only in prose, if at all (yes, of course, strictly speaking, a function in Python has no return type; but in most cases the type a function will return is actually fixed, and in those cases it would be useful to know it). The situation is similar for other important projects like numpy.
What needs to be done to improve the current situation
The following pieces of information are at least required in order to support a library in a remotely useful way:
- All classes and functions, and their members must be listed.
- For functions, at least the names of the parameters must be included; the types are nice to have but not as important.
- For functions, a proper return type is very important (if you do i = button.icon(), and then want to access the properties of i, and icon() doesn’t have a return type, no completion or highlighting can be done; basically, even a single function with a wrong return type can break highlighting and completion for large parts of a program’s code).
- Constants and their types must be listed.
That information then needs to be encoded in a python script (create empty functions / classes / etc. which just contain “return int()” or similar statements — see kdev-python/documentation_files for plenty of examples).
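As a rough sketch of what such a file looks like, here is one for an imaginary library (the module contents, Point, centroid and MAX_SIDES are all invented for illustration; the real files in kdev-python/documentation_files are the authoritative examples). The bodies do nothing useful; they only carry names, parameters, and return types:

```python
# Hypothetical documentation file for an imaginary C extension.
# Nothing here does real work; the code only tells a static analyzer
# which names exist, which parameters they take, and what they return.

MAX_SIDES = int()  # a constant and its type


class Point:
    def __init__(self, x, y):
        self.x = float()
        self.y = float()

    def distance_to(self, other):
        """Return the distance to another Point."""
        return float()


def centroid(points):
    """Return the centroid of a list of Point objects."""
    return Point(0.0, 0.0)
```

With this in place, writing c = centroid(points) lets the analyzer know that c is a Point, so c.x and c.distance_to can be completed and highlighted.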
Some effort has been made in the past to support various libraries (for numpy there’s something that parses the online docs and converts that to such header files), but the overall situation is rather miserable. The worst thing is probably that not even the standard library is supported properly.
First of all, I don’t really know a good solution for this whole problem. If you have an idea, or know how other IDEs solve it, please tell me! I think Eclipse uses pydoc or something similar and collects the information when you configure the interpreter, but the result is not good enough that I’d want to copy its behaviour.
There are two important packages I have in mind for which the issue could probably be solved well in the “header files” way described above: the whole Python standard library (os, sys, random, math, …), and numpy. If you’re interested in helping, please contact me, for example on IRC in #kdevelop on irc.freenode.net. Or just write an email.