I See Dead Code

… as sounding brass, or a tinkling cymbal.


Saarbrücken Lecturers Define Multiword Expressions

May 6th, 2008 · No Comments

Today: The Bag of Words

For those of you who don’t know what a bag of words [model] is, just imagine a bag of words, because it’s ultimately the same thing.

→ No Comments · Tags: saarbrücken · uni

Look at me, I know Functional Programming!

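May 5th, 2008 · 1 Comment

Sometimes I wonder what the hell I might have been thinking while writing code like this [1]:

from itertools import takewhile
from operator import eq

def uncurry(f):
    # Turn a function of n arguments into a function of one n-tuple.
    return lambda t: f(*t)

def longest_common_prefix(Sa, Sb):
    # Count the leading positions at which Sa and Sb agree.
    # (Python 3: zip is lazy; the 2008 original used itertools.izip.)
    return len(list(
        takewhile(uncurry(eq),
                  zip(Sa, Sb))))

  1. I bet the tenses are all wrong again.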

→ 1 Comment · Tags: lang:en · programming · python

FAST, A Microsoft Subsidiary

May 4th, 2008 · 2 Comments

I don’t know whether the author is trying to be funny or whether he is just being himself (from the Microsoft Enterprise Search Blog):

Speaking of Linux and UNIX, some people may be (mis)interpreting our continued support and investment in these platforms as a broader change for Microsoft – so here’s some color. We’re making a pragmatic decision to continue to delight a core part of FAST’s customer base that has chosen the Linux/UNIX OS. […]

Net, our approach doesn’t imply any kind of broader change for our company in its strategy (so conspiracy theorists can stand down :-) ) and you shouldn’t expect to see SharePoint running on UNIX. You can bet that we’ll innovate on Windows, too, and over time we hope customers will see .NET as a preferred platform choice.

Maybe I’m just a bit touchy when Microsoft people try to speak about Linux, but still. It’s laudable at least that they lower themselves to the continued support of ESP on RHEL, AIX, HP-UX and others, especially when there are quite a few support contracts still running with important customers, who would be only too delighted to find out one of these days that they might have to switch their whole setup to Windows servers.

I’m not sure they can continue to make a buck from the non-Windows offerings of ESP. Honestly, who wants to spend considerable money and time on a new ESP installation while reading about .NET as the preferred platform at the same time?

The second paragraph is even more hilarious, especially the comment about SharePoint on UNIX. I’d give the little orange „:::fast” rubber ball on my desk to see that happen, or at least to meet the conspiracy theorists who are able to come up with such an idea.

→ 2 Comments · Tags: rant · work

Gems of Machine Translation

April 22nd, 2008 · 4 Comments

Evaluating the translation of sentence fragments, given: “Sporty Spice (Melanie Chisholm)”

  • Sportliches Gewürz (Melanie Chisholm)
  • sportliches Spice (Melanie Chisholm)

The errors above may be funny, but they are to be expected (“Gewürz” is the German word for the seasoning kind of spice). What is actually astonishing is that one of the systems produces the correct translation. Then again, that system was probably just trained on text from gossip sites.

→ 4 Comments · Tags: coli · lang:de

All work and no play: Psychonauts

March 14th, 2008 · No Comments

Last week, after passing the exam in “Pattern and Speech Recognition”, I followed Zero Punctuation’s recommendation and bought Psychonauts via Steam (digital delivery is the future!) over my poor little connection. Because Steam prices are in US dollars and the dollar is so incredibly cheap at the moment, I also picked up Shadowgrounds Survivor and Portal, though those are more in the short-game category.

Story

To keep it short: you play a psychically gifted boy at a summer camp for psychically gifted children. He has run away from the circus, has the acrobatic skills to match, and answers to the lovely name Rasputin. The summer camp has all the typical facilities, such as cabins with bunk beds, campfires, trampolines, canoes, a bathysphere, and sensory deprivation chambers.

So you could devote yourself without restraint to exploring the lovingly designed surroundings, hunting for scavenger-hunt items and other hidden objects, and taking the occasional trip into other people’s mental worlds (the wonderful “Basic Braining” unfortunately becomes “Hirnanfängertraining” in the otherwise brilliant German translation), if it weren’t for a mad dentist who, on behalf of a much greater intelligence, is stealing the brains of the psi cadets.

What begins after that is a journey through minds: those of teachers, of lunatics, of animals, and your own. In these worlds, besides completing missions, you again have to find a lot of items in order to improve your own abilities. The terms emotional baggage, mental cobwebs and figments of the mind readily take on physical form along the way. At the end there is, of course, the great reconciliation, and the summer romance gets its due as well.

A Firework Display of Ideas

To put it briefly, it is almost embarrassing how lavishly the developers have spent their ideas. Where others make fifteen games out of half an idea (everyone can think of their own example), Tim Schafer has made every mental world unique. Some of them are ordinary places (a theater with an oversized critic who speaks with Reich-Ranicki’s accent, the war memories of a veteran, a never-ending party).

And then you come to other places and think that you could spend an entire game here. The Waterloo world alone could have been developed into a series of Baldur’s Gate proportions, and in Japan there is surely a market for Godzilla simulators too (although the little lungfish whose city you get to lay waste to would be a bit disconcerting). The inner world of a paranoiac, with its constant struggle between good (er… the rainbow dwarves) and evil (the trench coat brigade), is wonderfully designed as well.

But reality too, or at least what is shown of it, leaves nothing to be desired in terms of absurdity. The jumping and climbing through the somewhat different madhouse gave me vertigo through sheer imagination alone, and at the end little Shegor (here too, unfortunately, the English original shows through) was waiting with her sweet pet turtle.

Ending

Toward the end, as is so often the case, everything feels a bit rushed. The sequence in which you are nothing but a brain, for one, could have been a little longer. The final boss fights boil down to being told where each opponent’s single weak spot is so that you can exploit it three times in a row, and the climb through the circus is BRUTALLY hard, to a degree that makes the fight at the end of Half-Life, the one the G-Man sends you into if you decline his offer, seem almost fair again. The lovely ending sequence makes up for it, though.

Verdict

Psychonauts is extremely rich in ideas and absurdly funny. The levels invite exploration, which is always rewarded. There are often several ways to solve a given problem. The story is driven forward well, there are many pre-rendered and in-game cutscenes, the characters are lovingly designed; everything is just right. The only drop of bitterness is that the game cannot quite deny its console birth, and the controls on the PC are accordingly terrible (terrible in a Deus Ex: Invisible War sense). So if you like a bit of platform hopping and are not averse to a well-told story, then, as Yahtzee says, there is no reason not to play the game, especially since the price of $19.95 is really not too high and the game is available for every platform except the Wii.

→ No Comments · Tags: games · lang:de · review

GMM Code

March 13th, 2008 · No Comments

For one of the exercises, we had to implement the EM algorithm for Gaussian Mixture Models. I spent a considerable amount of time on my solutions, either because I wanted to learn a new language (Scala version) or because I did not want to forget an old one (C++ version), so I don’t want the code to simply rot on my hard drive.

The C++ version isn’t that much faster than the Scala version if I remember my experiments correctly (about 4x). Judging from the call graph of the C++ version, most of the time is spent in the exp function anyway, which is as fast as it gets.

Callgraph of the C++ version

The input file simply lists one float value per line, and the initial parameters for the Gaussians can be specified in the source files. Be advised that the number of Gaussians used to approximate the data needs to be known before the algorithm is run.
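For readers who have not seen the algorithm, here is a minimal sketch of EM for a one-dimensional mixture, written in Java rather than Scala or C++. It is not the code from the exercise: the class name GmmEmSketch, the method names, the fixed iteration count and the variance floor are all assumptions made for illustration, and the data is hard-coded instead of being read from a file. It does show the two points made above: the number of Gaussians is fixed by the length of the initial parameter arrays, and the inner loop is dominated by calls to exp.

```java
import java.util.Arrays;

/** Minimal EM for a one-dimensional Gaussian mixture (illustrative sketch only). */
public class GmmEmSketch {
    // Density of N(mean, var) at x; this is where the exp calls come from.
    static double gaussian(double x, double mean, double var) {
        double d = x - mean;
        return Math.exp(-d * d / (2 * var)) / Math.sqrt(2 * Math.PI * var);
    }

    /**
     * Runs a fixed number of EM iterations, updating weights, means and
     * variances in place. All three arrays have length k, the number of
     * Gaussians, which has to be chosen before the algorithm starts.
     */
    static void fit(double[] xs, double[] w, double[] mean, double[] var, int iterations) {
        int n = xs.length, k = w.length;
        double[][] resp = new double[n][k];
        for (int it = 0; it < iterations; it++) {
            // E-step: responsibility of component j for data point i.
            for (int i = 0; i < n; i++) {
                double total = 0;
                for (int j = 0; j < k; j++) {
                    resp[i][j] = w[j] * gaussian(xs[i], mean[j], var[j]);
                    total += resp[i][j];
                }
                for (int j = 0; j < k; j++) resp[i][j] /= total;
            }
            // M-step: re-estimate the parameters from the soft assignments.
            for (int j = 0; j < k; j++) {
                double nj = 0, sum = 0, sq = 0;
                for (int i = 0; i < n; i++) {
                    nj += resp[i][j];
                    sum += resp[i][j] * xs[i];
                }
                mean[j] = sum / nj;
                for (int i = 0; i < n; i++) {
                    double d = xs[i] - mean[j];
                    sq += resp[i][j] * d * d;
                }
                var[j] = Math.max(sq / nj, 1e-6); // assumed floor, keeps the density finite
                w[j] = nj / n;
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical data: two clusters, around 1 and around 5.
        double[] xs = {0.9, 1.0, 1.1, 1.2, 0.8, 4.9, 5.0, 5.1, 5.2, 4.8};
        double[] w = {0.5, 0.5}, mean = {0.0, 6.0}, var = {1.0, 1.0};
        fit(xs, w, mean, var, 50);
        System.out.println("means:   " + Arrays.toString(mean));
        System.out.println("weights: " + Arrays.toString(w));
    }
}
```

A more careful version would stop when the log-likelihood no longer improves and would guard against a component that ends up with no responsibility at all; both are left out here to keep the sketch short.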

→ No Comments · Tags: coli · lang:en · programming

Pattern & Speech Recognition Leftovers

March 13th, 2008 · No Comments

While studying for the exam in this semester’s Pattern & Speech Recognition course by Prof. Klakow (highly, highly recommended), we (a couple of people; look for the names in the document itself) put together a summary covering a couple of topics from the course.

Topics
  1. Feature Extraction from Sound
  2. Bayesian Decision Theory
  3. Maximum Likelihood Estimation
  4. Nonparametric Techniques
  5. Gaussian Mixture Models
  6. Decision Trees

The text is put together from various sources, but is mostly based on slides and notes from the lectures. Some sections are pending (Hidden Markov Models, Bayesian Networks, Markov Random Fields), and other topics from the lecture are simply missing (HMMs in Speech Recognition, Acoustic Modeling, Speaker Adaptation, Normal Distributions). However, to my knowledge, nobody has been examined on any of the missing speech-recognition-related sections.

Get the PDF version of the summary.

LaTeX Sources

The LaTeX source file, along with pictures, is kept in a Mercurial repository. To get the source files, do:
$ hg clone static-http://diotavelli.net/files/psr0708-summary/repository psr

The source file is named summ.tex and should build on most LaTeX installations without requiring additional packages.

Please notify me of any bugs or errors!

→ No Comments · Tags: lang:en · uni

Need Some Music for Your Romantic Valentine Dinner?

February 14th, 2008 · No Comments

Go to Jason Eisner’s homepage, but please don’t come back to complain.

Just to make it more geeky by nitpicking: if you are OOM, or have a segfault, you most likely won’t be able to finish your song.

→ No Comments · Tags: coli · lang:en

Saarland University…

January 28th, 2008 · 1 Comment

… charges general tuition fees in order to improve the quality of studies and teaching.

Take my money, but don’t lie to me on top of it. The last €629 didn’t help either. Just be glad it isn’t enough money to actually demand really good teaching.

→ 1 Comment · Tags: lang:de · rant · uni

Rescuing Generics

January 12th, 2008 · 1 Comment

This is the first part in what is planned to be a loosely-coupled series of articles on current developments in mainstream programming languages.

Topics include:

  • Evolution of Java
  • New abstractions in programming languages
  • The functional turn
  • Scala: „The next programming language™”

Generics in Java

When I started to program Java 5 professionally after some years of blissful absence from the Java world, I thought myself to be well-prepared for generics. After all, I had done metaprogramming in both C++ and Python for several years.

Of course, experience never saves you from the perils of learning. It took some time until I finally got generics, including the common misunderstandings about covariance and the like. Fortunately, in the project I was working on at the time, we were allowed to go wild and try out all the new features in Java 5 at length. We were the first ones to work with the new version and also carried out the internal training, so we really had to understand what generics were about, and why all tutorials usually contain more don’ts than dos.

When I finally understood them, I was really disappointed. The type system wasn’t generic at all! Type annotations are just some sugary coating stripped out by the compiler after the program passes the type checks. Still, generics proved to be useful from time to time. Some problems just kept coming back, and I will briefly outline them here.

The ugly cast

There is going to be an ugly cast at some point, where you (the programmer) know more about the static or runtime type of an object than the compiler. Our strategy was to isolate ugly casts in minimally small methods with @SuppressWarnings("unchecked") annotations. One prominent example is serialization:

@SuppressWarnings("unchecked")
public List<String> foo(ObjectInputStream i)
    throws IOException, ClassNotFoundException {
  // readObject() returns Object; only we know it really is a List<String>.
  return (List<String>) i.readObject();
}

Generic types + class objects

A lot of generics pain is remedied if you hand around Class<V> objects whenever you create instances of classes with generic type arguments. This is often cumbersome, as it tends to make your APIs larger, but it at least provides some kind of runtime type safety.

We used this often enough to call it a pattern, though I think we never gave it a proper name.
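A minimal sketch of what such a class could look like, assuming a hypothetical container named TypedBox (neither the class nor its methods come from the project described here): the constructor takes the Class<V> object, and everything that enters through an untyped path is checked against it with Class.cast.

```java
import java.util.ArrayList;
import java.util.List;

/** A container that keeps its element type around as a Class<V> object. */
public class TypedBox<V> {
    private final Class<V> type;                      // runtime witness for V
    private final List<V> items = new ArrayList<V>();

    public TypedBox(Class<V> type) {
        this.type = type;
    }

    /** Accepts an arbitrary object and fails fast if it is not a V. */
    public void addUnchecked(Object o) {
        items.add(type.cast(o));                      // ClassCastException on mismatch
    }

    public V get(int index) {
        return items.get(index);
    }

    public static void main(String[] args) {
        TypedBox<String> box = new TypedBox<String>(String.class);
        box.addUnchecked("hello");
        try {
            box.addUnchecked(Integer.valueOf(42));    // wrong type, rejected at once
        } catch (ClassCastException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println(box.get(0));
    }
}
```

The cost is visible in the constructor call: every creation site has to repeat the type as String.class, which is the API growth mentioned above. In exchange, no unchecked cast is needed anywhere in the class.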

Sun’s Java compiler

We were in for some hard lessons when we found out that Eclipse’s Java compiler was much better at type inference than Sun’s javac. These kinds of errors were especially hard to track down, and some of them were unfixed at least up to 1.5.0 Update 12. [1]

The backlash

Now that Java generics have been in the wild for a little more than three years and have presumably seen wider adoption, a backlash is forming. While early coverage mostly made excuses for all the oddities that had to be introduced to keep bytecode compatibility [2], a lot of complaints about the (perceived) complexity of generics are now being heard.

Before I dive into an example, let me state the following:

  • Yes, Java code does get uglier and less readable with generics,
    …but a lot of that could be addressed with typedefs.
  • Yes, generics have a lot of gotchas,
    …but most of them are due to backwards compatibility. I would have liked to hear the millions of IDE monkeys cry in horror if BC had been broken.
  • Yes, generics are difficult to grasp.
    Get over it. Seriously.

Killing Wildcards

In his recent article „Simplifying Java Generics by Eliminating Wildcards”, Howard Lovatt argues that Java generics could be simplified by removing wildcards altogether and making covariant generic types the default behavior, similar to arrays.

Please note that the following code examples assume that you have read the article.

In arrays, we observe the following behavior:

List[] listArray = new List[10];
Collection[] collArray = listArray;
collArray[0] = new HashSet(); // will result in an ArrayStoreException

His argument is that this behavior could simply be adopted for generics, making this code compile:

List<List> listList = new ArrayList<List>();
List<Collection> collList = listList;
collList.add(new HashSet());

Now, suppose the compiler accepted this piece of code (which it doesn’t): how should an exception similar to ArrayStoreException be thrown? In contrast to arrays [3], the generic types are not known at runtime, since they couldn’t be added without breaking backwards compatibility. The only way to ensure type safety is to keep the class object inside the List and check whether newly added objects have the correct type, which lays the burden of type checking on the class designer. While this may be acceptable for the standard library, it is certainly not acceptable for general usage.
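Both halves of that argument can be tried out with the standard library. The sketch below (the class ErasureSketch and its method names are made up for illustration) first shows that a List<List> and a List<Collection> are indistinguishable at runtime, and then uses Collections.checkedList, which is one existing case of a list that stores the class object and checks every insertion. The raw alias stands in for the assignment the compiler would have to permit under the proposal.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;

public class ErasureSketch {
    /** True if both lists have the same runtime class, i.e. the type argument left no trace. */
    static boolean sameRuntimeClass(List<?> a, List<?> b) {
        return a.getClass() == b.getClass();
    }

    /** Attempts the forbidden store through a raw alias of a runtime-checked list. */
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean storeIsRejected() {
        // checkedList keeps List.class and tests every element that is added.
        List<List> listList = Collections.checkedList(new ArrayList<List>(), List.class);
        List alias = listList;           // raw type: sidesteps the compile-time check
        try {
            alias.add(new HashSet());    // a HashSet is a Collection, but not a List
            return false;
        } catch (ClassCastException e) {
            return true;                 // the rough analogue of ArrayStoreException
        }
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass(new ArrayList<List>(), new ArrayList<Collection>()));
        System.out.println(storeIsRejected());
    }
}
```

Without the checkedList wrapper the add would succeed silently and the error would only surface later, when some caller reads the element back as a List.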

An example taken from Lovatt’s article to illustrate the limited power of generics in Java (and Scala) is:

 ListScalaStyle<Integer> iList = new ListScalaStyle<Integer>();
 ListScalaStyle<Number> nList = iList.prepend( (Number)2.0 ); // OK
 ListScalaStyle<Integer> iList2 = nList.tail(); // Error, still a Number list

This is exactly the pathological case of the ugly cast. You, the programmer, know the static type of something and expect the compiler to be able to infer it as well.

To show why this cannot work in general, I’ll use a trick I’ve found to be very helpful: adding a little bit of randomization.

public ListScalaStyle<Number> getListWithTail() {
  if(Math.random() > .5) {
    ListScalaStyle<Integer> iList = new ListScalaStyle<Integer>();
    return iList.prepend((Number)2.0);
  } else {
    ListScalaStyle<Double> iList = new ListScalaStyle<Double>();
    return iList.prepend((Number)2.0);
  }
}
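// ...
ListScalaStyle<Number> nList = getListWithTail();
// can never work: what the tail really holds is only decided at runtime
ListScalaStyle<Integer> iList2 = nList.tail(); // Error, still a Number list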

This little example is of course not a total refutation: a compiler that can infer more type information statically might always be useful. However, such inference will always be limited to small pieces of code. It also forces the compiler to actually examine the bytecode of functions in order to see the flow of objects, because prepend might do something wildly different. This removes many advantages of polymorphism, a technique at the very heart of Java.

Conclusion and Outlook

With generics, Java gets more complicated. They allow programmers to make interesting abstractions, but also freely hand out all kinds of guns for shooting yourself in the foot. This is definitely a deviation from Java’s original design principles and, ironically, makes it a bit more like C++, something the designers tried as hard as possible to avoid.

In the upcoming articles, we will examine the question why having generics this way still might be a good idea (though for totally different reasons), why the Java designers did it in the first place, and what the generics disaster (Tim Bray) teaches us about the design and evolution of programming languages.

  1. most of them are fixed in Java 6
  2. see Generics gotchas for a good example
  3. see the documentation of Class.getComponentType()

→ 1 Comment · Tags: java · lang:en · programming