<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>I See Dead Code &#187; programming</title>
	<atom:link href="http://shlomme.diotavelli.net/category/programming/feed/" rel="self" type="application/rss+xml" />
	<link>http://shlomme.diotavelli.net</link>
	<description>… as sounding brass, or a tinkling cymbal.</description>
	<lastBuildDate>Sun, 11 Dec 2011 00:53:12 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.5</generator>
		<item>
		<title>Oh my god—it&#8217;s full of cores!</title>
		<link>http://shlomme.diotavelli.net/2009/05/24/gpu-computing/</link>
		<comments>http://shlomme.diotavelli.net/2009/05/24/gpu-computing/#comments</comments>
		<pubDate>Sun, 24 May 2009 16:17:48 +0000</pubDate>
		<dc:creator>shlomme</dc:creator>
				<category><![CDATA[hardware]]></category>
		<category><![CDATA[lang:en]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://shlomme.diotavelli.net/?p=204</guid>
		<description><![CDATA[Beginnings After the bleak, joyless work of releasing software, plugging holes to prevent users from clicking buttons they should not have clicked in the first place, writing unctuous documentation and release notes and wrapping code into neat little installers with tiny bows and bells that gently tingle when touched, there are few things as profoundly [...]]]></description>
			<content:encoded><![CDATA[<h5>Beginnings</h5>
<p>After the bleak, joyless work of releasing software, plugging holes to prevent users from clicking buttons they should not have clicked in the first place, writing unctuous documentation and release notes and wrapping code into neat little installers with tiny bows and bells that gently tingle when touched, there are few things as profoundly satisfying as taking code that looked like line noise in the first place, applying some non-trivial transformation to it and still having it look like line noise, only now it&#8217;s twice as fast, or half as long or of some other inaccessible quality that can, on some technical level, be referred to as &#8220;cool&#8221;.</p>
<p>Quite some time ago, I decided that this time, &#8220;cool&#8221; would mean &#8220;runs on graphics hardware, possibly faster&#8221;. To this end, I took my trusty T61 to work, downloaded the latest <a href="http://www.nvidia.com/object/cuda_home.html">CUDA toolkit</a>, found out that nearly all examples crashed when compiled, installed the latest beta drivers from NVIDIA, noticed that everything worked now and started playing around with the examples. </p>
<p>After some time marveling at ball pit simulations and similarly telling examples, I printed out the <a href="http://developer.download.nvidia.com/compute/cuda/2_2/toolkit/docs/NVIDIA_CUDA_Programming_Guide_2.2.pdf">CUDA programming guide</a> and spent a considerable amount of time marveling at how well the NVIDIA green looked when printed with our institute&#8217;s color laser printer, before reading its more important parts.<br />
<span id="more-204"></span></p>
<h5>Technical Details</h5>
<p>Roughly speaking, modern graphics hardware from NVIDIA consists of an array of multiprocessors. Each multiprocessor has eight cores, some shared memory, a controller unit and units for transcendental functions. Kernels (methods that run on the device, as opposed to host code) are automatically distributed to the available multiprocessors (their number depending on the hardware, with as many as 120 on the Tesla cards). </p>
<p>Kernel invocations are organized in 3-dimensional thread blocks and 2-dimensional grids. A single thread block can have at most 512 threads (i.e. x × y × z ⩽ 512) and is always executed by a single multiprocessor. Thread blocks are organized in a grid (with at most 65,535 blocks in each dimension) and are distributed to the available multiprocessors, which makes the scheme inherently scalable. </p>
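<p>To make these limits concrete, here is a minimal sketch in plain Python (not CUDA, and not part of the original program) that checks a launch configuration before it would be handed to the device. The function name <code>check_launch</code> and the two constants are hypothetical; the values are assumptions for this hardware generation, with the grid limit taken to be 65,535 blocks per dimension.</p>

```python
# Assumed limits for this hardware generation (see the lead-in above).
MAX_THREADS_PER_BLOCK = 512
MAX_GRID_DIM = 65535


def check_launch(grid, block):
    """Return the total thread count of a launch, or raise ValueError.

    grid  -- (gx, gy): number of thread blocks per grid dimension
    block -- (bx, by, bz): number of threads per block dimension
    """
    gx, gy = grid
    bx, by, bz = block
    if min(gx, gy, bx, by, bz) < 1:
        raise ValueError("all dimensions must be at least 1")
    threads_per_block = bx * by * bz
    if threads_per_block > MAX_THREADS_PER_BLOCK:
        raise ValueError("too many threads per block: %d" % threads_per_block)
    if max(gx, gy) > MAX_GRID_DIM:
        raise ValueError("grid dimension too large: %d" % max(gx, gy))
    return gx * gy * threads_per_block
```

<p>For three Gaussians and 90,000 points in blocks of 16 threads, <code>check_launch((3, 5625), (1, 16, 1))</code> passes, while a 32 × 32 block is rejected because it would need 1,024 threads.</p>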
<p>A multiprocessor divides the threads of a block into <em>warps</em>, and all threads of a single warp are executed in parallel. Optimal performance is reached when the control flow in threads of the same warp does not diverge based on the input data, because the cores are optimized to execute the same code in several threads on different data (hence the architecture&#8217;s name, SIMT: single instruction, multiple thread). </p>
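<p>How a block falls apart into warps can be sketched with a bit of arithmetic. The following plain Python is only an illustration: the helper names are made up, and it assumes a warp size of 32 threads and that warps are formed from consecutive linear thread IDs.</p>

```python
WARP_SIZE = 32  # assumption: warp size on NVIDIA hardware of this generation


def linear_thread_id(x, y, z, dim_x, dim_y):
    """Flatten a 3-D thread index; warps are formed from consecutive IDs."""
    return x + y * dim_x + z * dim_x * dim_y


def warps_in_block(dim_x, dim_y, dim_z):
    """Number of warps needed for one block (the last one may be partly empty)."""
    threads = dim_x * dim_y * dim_z
    return (threads + WARP_SIZE - 1) // WARP_SIZE


def warp_of(x, y, z, dim_x, dim_y):
    """Index of the warp that a given thread belongs to."""
    return linear_thread_id(x, y, z, dim_x, dim_y) // WARP_SIZE
```

<p>Under these assumptions a block of 16 threads occupies a single, half-empty warp, while a full 512-thread block is split into 16 warps.</p>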
<p>There is also a memory hierarchy, from registers to on-die shared memory to global device memory. Data has to be explicitly copied from the system RAM to device memory, which is costly and should be minimized.</p>
<p>In other words, if you have a computation that has to be executed many, many times on different input data, move it to the GPU instead and use the freed CPU cycles for shuffling around the data. In other words, ZOOOOOOM.</p>
<h5>Some Experiments</h5>
<p>Back in the day, we had to write code for training <a href="http://shlomme.diotavelli.net/2008/03/13/gmm-code/">Gaussian Mixture Models</a>. At the core of the algorithm, an auxiliary vector containing each point in the training set has to be created for each Gaussian. The creation of the auxiliary vector includes exponentiation, which proved to be quite costly, since it accounted for 50% of the running time of the program<sup><a href="http://shlomme.diotavelli.net/2009/05/24/gpu-computing/#footnote_0_204" id="identifier_0_204" class="footnote-link footnote-identifier-link" title="There might be a smart way of getting rid of the exp() call, though I haven&amp;#8217;t found it yet">1</a></sup>. </p>
<p>Instead of doing comparatively boring optimization work to speed up that algorithm on a single processor, and even more boring work to have it run on two processors, I partially ported it so that the auxiliary vectors are computed on the GPU. Moving only this one part of the computation led to a three-fold speed increase for the whole program. </p>
<p>On my machine (2 GiB RAM, 2.2 GHz Core 2 Duo, NVS 140M with 2 multiprocessors), running the C++-only version<sup><a href="http://shlomme.diotavelli.net/2009/05/24/gpu-computing/#footnote_1_204" id="identifier_1_204" class="footnote-link footnote-identifier-link" title="compiled using g++ 4.4.0 -O3">2</a></sup> takes 33s for 90,000 points and three Gaussians. Running the computation with the same data set on the GPU takes 11s. </p>
<h5>How is it done</h5>
<p>In order to invoke a kernel running on the GPU, the code has to be compiled with <tt>nvcc</tt>, NVIDIA&#8217;s compiler for CUDA. It separates device code, which is compiled to some binary code, from host code, which is transformed and handed over to the platform compiler. It is therefore necessary to write some intermediate methods which can be called from &#8220;normal&#8221; C++ and that take care of shuffling data to the device and invoking the kernel methods<sup><a href="http://shlomme.diotavelli.net/2009/05/24/gpu-computing/#footnote_2_204" id="identifier_2_204" class="footnote-link footnote-identifier-link" title="looking at the SDK examples was quite helpful here">3</a></sup>:</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;"><span style="color: #993333;">typedef</span> <span style="color: #993333;">float</span> PROB<span style="color: #339933;">;</span>
&nbsp;
<span style="color: #339933;">#define BLOCK_SIZE 16</span>
&nbsp;
Matrix d_in<span style="color: #339933;">;</span>
Matrix d_out<span style="color: #339933;">;</span>
&nbsp;
__device__ __constant__ KernelGaussian d_params<span style="color: #009900;">&#91;</span><span style="color: #0000dd;">20</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">extern</span> <span style="color: #ff0000;">&quot;C&quot;</span>
<span style="color: #993333;">void</span> LoadPoints<span style="color: #009900;">&#40;</span>Matrix in<span style="color: #339933;">,</span> size_t params<span style="color: #009900;">&#41;</span> 
<span style="color: #009900;">&#123;</span>
    d_in.<span style="color: #202020;">height</span> <span style="color: #339933;">=</span> in.<span style="color: #202020;">height</span><span style="color: #339933;">;</span>
    d_in.<span style="color: #202020;">width</span> <span style="color: #339933;">=</span> in.<span style="color: #202020;">width</span><span style="color: #339933;">;</span>    
    size_t size_in <span style="color: #339933;">=</span> in.<span style="color: #202020;">width</span> <span style="color: #339933;">*</span> in.<span style="color: #202020;">height</span> <span style="color: #339933;">*</span> <span style="color: #993333;">sizeof</span><span style="color: #009900;">&#40;</span>PROB<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
    cutilSafeCall<span style="color: #009900;">&#40;</span>cudaMalloc<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span><span style="color: #993333;">void</span><span style="color: #339933;">**</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">&amp;</span>d_in.<span style="color: #202020;">elements</span><span style="color: #339933;">,</span> size_in<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    cutilSafeCall<span style="color: #009900;">&#40;</span>cudaMemcpy<span style="color: #009900;">&#40;</span>d_in.<span style="color: #202020;">elements</span><span style="color: #339933;">,</span> in.<span style="color: #202020;">elements</span><span style="color: #339933;">,</span> size_in<span style="color: #339933;">,</span> cudaMemcpyHostToDevice<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
    d_out.<span style="color: #202020;">width</span> <span style="color: #339933;">=</span> in.<span style="color: #202020;">width</span><span style="color: #339933;">;</span>
    d_out.<span style="color: #202020;">height</span> <span style="color: #339933;">=</span> params<span style="color: #339933;">;</span>
&nbsp;
    size_t size_out <span style="color: #339933;">=</span> d_out.<span style="color: #202020;">width</span> <span style="color: #339933;">*</span> d_out.<span style="color: #202020;">height</span> <span style="color: #339933;">*</span> <span style="color: #993333;">sizeof</span><span style="color: #009900;">&#40;</span>PROB<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    cutilSafeCall<span style="color: #009900;">&#40;</span>cudaMalloc<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span><span style="color: #993333;">void</span><span style="color: #339933;">**</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">&amp;</span>d_out.<span style="color: #202020;">elements</span><span style="color: #339933;">,</span> size_out<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
__global__ <span style="color: #993333;">void</span> computeAuxVectors<span style="color: #009900;">&#40;</span>Matrix in<span style="color: #339933;">,</span> Matrix out<span style="color: #009900;">&#41;</span> 
<span style="color: #009900;">&#123;</span>
    <span style="color: #993333;">int</span> p <span style="color: #339933;">=</span> blockIdx.<span style="color: #202020;">y</span> <span style="color: #339933;">*</span> BLOCK_SIZE <span style="color: #339933;">+</span> threadIdx.<span style="color: #202020;">y</span><span style="color: #339933;">;</span>
    KernelGaussian g <span style="color: #339933;">=</span> d_params<span style="color: #009900;">&#91;</span>blockIdx.<span style="color: #202020;">x</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
    PROB d <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span>in.<span style="color: #202020;">elements</span><span style="color: #009900;">&#91;</span>p<span style="color: #009900;">&#93;</span> <span style="color: #339933;">-</span> g.<span style="color: #202020;">mean</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">/</span> g.<span style="color: #202020;">stddev</span><span style="color: #339933;">;</span>
    out.<span style="color: #202020;">elements</span><span style="color: #009900;">&#91;</span>blockIdx.<span style="color: #202020;">x</span> <span style="color: #339933;">*</span> out.<span style="color: #202020;">width</span> <span style="color: #339933;">+</span> p<span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> g.<span style="color: #202020;">factor</span> <span style="color: #339933;">*</span> expf<span style="color: #009900;">&#40;</span><span style="color: #339933;">-</span><span style="color:#800080;">0.5</span><span style="color: #339933;">*</span>d<span style="color: #339933;">*</span>d<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
<span style="color: #000000; font-weight: bold;">extern</span> <span style="color: #ff0000;">&quot;C&quot;</span>
<span style="color: #993333;">void</span> RunComputeAuxVectors<span style="color: #009900;">&#40;</span>
        KernelGaussian<span style="color: #339933;">*</span> g<span style="color: #339933;">,</span> 
        std<span style="color: #339933;">::</span><span style="color: #202020;">vector</span><span style="color: #339933;">&lt;</span>std<span style="color: #339933;">::</span><span style="color: #202020;">vector</span><span style="color: #339933;">&lt;</span>PROB<span style="color: #339933;">&gt;</span> <span style="color: #339933;">&gt;&amp;</span> out<span style="color: #339933;">,</span> 
        size_t points<span style="color: #339933;">,</span> size_t params<span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
    cutilSafeCall<span style="color: #009900;">&#40;</span>
        cudaMemcpyToSymbol<span style="color: #009900;">&#40;</span>
          d_params<span style="color: #339933;">,</span> g<span style="color: #339933;">,</span> 
          params<span style="color: #339933;">*</span><span style="color: #993333;">sizeof</span><span style="color: #009900;">&#40;</span>KernelGaussian<span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> 
          <span style="color: #0000dd;">0</span><span style="color: #339933;">,</span> cudaMemcpyHostToDevice<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
    dim3 dimGrid<span style="color: #009900;">&#40;</span>params<span style="color: #339933;">,</span> points <span style="color: #339933;">/</span> BLOCK_SIZE<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    dim3 dimBlock<span style="color: #009900;">&#40;</span><span style="color: #0000dd;">1</span><span style="color: #339933;">,</span> BLOCK_SIZE<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    computeAuxVectors<span style="color: #339933;">&lt;&lt;&lt;</span>dimGrid<span style="color: #339933;">,</span> dimBlock<span style="color: #339933;">&gt;&gt;&gt;</span><span style="color: #009900;">&#40;</span>d_in<span style="color: #339933;">,</span> d_out<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    cutilCheckMsg<span style="color: #009900;">&#40;</span><span style="color: #ff0000;">&quot;Kernel execution failed&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
    <span style="color: #b1b100;">for</span><span style="color: #009900;">&#40;</span>size_t i <span style="color: #339933;">=</span> <span style="color: #0000dd;">0</span><span style="color: #339933;">;</span> i <span style="color: #339933;">&lt;</span> params<span style="color: #339933;">;</span> <span style="color: #339933;">++</span>i<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        cutilSafeCall<span style="color: #009900;">&#40;</span>
            cudaMemcpy<span style="color: #009900;">&#40;</span>
              <span style="color: #339933;">&amp;</span>out<span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span><span style="color: #009900;">&#91;</span><span style="color: #0000dd;">0</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">,</span> <span style="color: #339933;">&amp;</span>d_out.<span style="color: #202020;">elements</span><span style="color: #009900;">&#91;</span>i <span style="color: #339933;">*</span> points<span style="color: #009900;">&#93;</span><span style="color: #339933;">,</span> 
              <span style="color: #993333;">sizeof</span><span style="color: #009900;">&#40;</span>PROB<span style="color: #009900;">&#41;</span> <span style="color: #339933;">*</span> points<span style="color: #339933;">,</span> 
              cudaMemcpyDeviceToHost<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>The matrix <tt>d_in</tt> holds the points, which are constant over the whole computation and therefore copied over to the device exactly once. <tt>d_out</tt> is the matrix that holds the auxiliary values computed for each point and each Gaussian. For faster retrieval, the Gaussians are stored in constant device memory (<tt>d_params</tt>), which is set to contain 20 elements at most, i.e. the algorithm is limited to learning mixture models with at most 20 Gaussians. </p>
<p>In the method <tt>LoadPoints</tt>, device memory is allocated for the points and the output vectors and the points are copied from host to device memory using <tt>cudaMemcpy</tt>. </p>
<p>The actual invocation of the kernel is done in the host method <tt>RunComputeAuxVectors</tt>, which first copies the parameters into the constant storage and then invokes the kernel <tt>computeAuxVectors</tt> using the new <tt>&lt;&lt;&lt;grid, block&gt;&gt;&gt;</tt> syntax. <tt>grid</tt> holds the dimensions of the grid, which is <em>number of Gaussians</em> × <em>points / BLOCK_SIZE</em>. The block size is the number of threads in a thread block and is somewhat arbitrarily chosen to be 16<sup><a href="http://shlomme.diotavelli.net/2009/05/24/gpu-computing/#footnote_3_204" id="identifier_3_204" class="footnote-link footnote-identifier-link" title="I didn&amp;#8217;t try out other block sizes">4</a></sup>; currently this also requires the number of points to be a multiple of 16<sup><a href="http://shlomme.diotavelli.net/2009/05/24/gpu-computing/#footnote_4_204" id="identifier_4_204" class="footnote-link footnote-identifier-link" title="this is a restriction of the implementation, one could simply pad the points with however many zeros as needed and compute a little more">5</a></sup>. After the kernel invocation, the results are copied back into host memory; the kernel launch itself returns immediately, but <tt>cudaMemcpy</tt> waits for the kernel to finish before copying.</p>
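<p>To see which thread writes which output element, it can help to replay the index arithmetic of <tt>computeAuxVectors</tt> on the CPU. The following is a plain Python sketch, not the code that actually ran: the function name <code>compute_aux_vectors</code> and the tuple layout <code>(mean, stddev, factor)</code> are assumptions made for illustration, and the three nested loops stand in for <tt>blockIdx.x</tt>, <tt>blockIdx.y</tt> and <tt>threadIdx.y</tt>.</p>

```python
import math

BLOCK_SIZE = 16


def compute_aux_vectors(points, gaussians):
    """Emulate the grid/block loops of the kernel on the CPU.

    points    -- list of floats, length must be a multiple of BLOCK_SIZE
    gaussians -- list of (mean, stddev, factor) tuples (hypothetical layout)
    Returns a flat row-major list: one row of len(points) values per Gaussian.
    """
    if len(points) % BLOCK_SIZE != 0:
        raise ValueError("number of points must be a multiple of BLOCK_SIZE")
    width = len(points)
    out = [0.0] * (len(gaussians) * width)
    grid_x, grid_y = len(gaussians), width // BLOCK_SIZE
    for block_x in range(grid_x):                # blockIdx.x: which Gaussian
        mean, stddev, factor = gaussians[block_x]
        for block_y in range(grid_y):            # blockIdx.y: which chunk of points
            for thread_y in range(BLOCK_SIZE):   # threadIdx.y: point within chunk
                p = block_y * BLOCK_SIZE + thread_y
                d = (points[p] - mean) / stddev
                out[block_x * width + p] = factor * math.exp(-0.5 * d * d)
    return out
```

<p>On the device the three loops do not exist, of course: every combination of block and thread index is its own thread, and the order in which they run is up to the hardware.</p>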
<p>Important:</p>
<ul>
<li>My card only supports single-precision floats; newer ones also support double precision</li>
<li><tt>blockIdx</tt> and <tt>threadIdx</tt> are built-in variables provided by the CUDA compiler</li>
<li><tt>__global__</tt> methods are run on the device and are invoked from host code</li>
<li>Currently, the device memory for <tt>d_in</tt> and <tt>d_out</tt> is never freed; the program simply exits</li>
</ul>
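<p>The multiple-of-16 restriction could be lifted by padding the input on the host before copying it to the device and cutting the padding off again afterwards. A minimal sketch of that idea in plain Python follows; the helper names <code>pad_points</code> and <code>unpad_rows</code> are hypothetical, and none of this is in the program described here.</p>

```python
BLOCK_SIZE = 16


def pad_points(points, block_size=BLOCK_SIZE, fill=0.0):
    """Return (padded_points, original_length), padded to a multiple of block_size."""
    remainder = len(points) % block_size
    padding = 0 if remainder == 0 else block_size - remainder
    return list(points) + [fill] * padding, len(points)


def unpad_rows(flat_out, padded_width, original_length):
    """Cut the padding columns off each row of a flat row-major result."""
    rows = len(flat_out) // padded_width
    return [flat_out[r * padded_width : r * padded_width + original_length]
            for r in range(rows)]
```

<p>The price is at most 15 wasted threads per Gaussian, which computes a little more than strictly needed.</p>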
<h5>Results</h5>
<p>Working with CUDA is definitely fun (and rewarding), but the system itself is quite unforgiving. Wrapping each invocation of a CUDA method into <tt>cutilSafeCall</tt> is therefore strongly recommended: in case of an error, it will print a (not always helpful) error message and exit. Though in this case the code running on the GPU itself is quite trivial, it can be hard to figure out what is going wrong, because the GPU itself is a black box, and it&#8217;s not possible to simply print out a message<sup><a href="http://shlomme.diotavelli.net/2009/05/24/gpu-computing/#footnote_5_204" id="identifier_5_204" class="footnote-link footnote-identifier-link" title="or maybe it is, and I don&amp;#8217;t know how to do it&amp;#8230;">6</a></sup>. It&#8217;s possible to emulate the code on the CPU, and there are other tools available which I haven&#8217;t tried out. </p>
<p>I&#8217;ve also started toying around with my original idea, having the constraint checks for the TIGER query evaluation run on the graphics hardware, and initial results are mixed. Still, if nothing else, this gives me justification to spend a lot of money on expensive graphics hardware for my next computer: it&#8217;s all for science, this harsh and demanding mistress.</p>
<ol class="footnotes"><li id="footnote_0_204" class="footnote">There might be a smart way of getting rid of the <tt>exp()</tt> call, though I haven&#8217;t found it yet</li><li id="footnote_1_204" class="footnote">compiled using g++ 4.4.0 -O3</li><li id="footnote_2_204" class="footnote">looking at the SDK examples was quite helpful here</li><li id="footnote_3_204" class="footnote">I didn&#8217;t try out other block sizes</li><li id="footnote_4_204" class="footnote">this is a restriction of the implementation, one could simply pad the points with however many zeros as needed and compute a little more</li><li id="footnote_5_204" class="footnote">or maybe it is, and I don&#8217;t know how to do it&#8230;</li></ol>]]></content:encoded>
			<wfw:commentRss>http://shlomme.diotavelli.net/2009/05/24/gpu-computing/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Scrollable Widgets with PyGTK</title>
		<link>http://shlomme.diotavelli.net/2009/05/17/scrollable-widgets-with-pygtk/</link>
		<comments>http://shlomme.diotavelli.net/2009/05/17/scrollable-widgets-with-pygtk/#comments</comments>
		<pubDate>Sat, 16 May 2009 23:49:17 +0000</pubDate>
		<dc:creator>shlomme</dc:creator>
				<category><![CDATA[lang:en]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://shlomme.diotavelli.net/?p=178</guid>
		<description><![CDATA[It is possible to write custom GTK widgets that have &#8220;native&#8221; scrolling support, as opposed to just shoving them into a GtkViewPort and forgetting about them. Apart from having mastered a small coding challenge, as it turned out to be, this also gives you greater control over the scrolling itself, like making sure that certain [...]]]></description>
			<content:encoded><![CDATA[<p>It is possible to write custom GTK widgets that have &#8220;native&#8221; scrolling support, as opposed to just shoving them into a <a href="http://library.gnome.org/devel/gtk/unstable/GtkViewport.html">GtkViewport</a> and forgetting about them. </p>
<p>Apart from having mastered a small coding challenge, as it turned out to be, this also gives you greater control over the scrolling itself, like making sure that certain elements are visible, viewport panning etc.</p>
<p>Anyway, especially when using PyGTK, it&#8217;s a bit unclear how to proceed. From the documentation, it becomes somewhat clear that it has to do with the signal <code>set_scroll_adjustments_signal</code>:</p>
<blockquote><p>
This signal is emitted when a widget of this class is added to a scrolling aware parent, gtk_widget_set_scroll_adjustments() handles the emission. Implementation of this signal is optional.
</p></blockquote>
<p>This is not a signal name, but a signal ID<sup><a href="http://shlomme.diotavelli.net/2009/05/17/scrollable-widgets-with-pygtk/#footnote_0_178" id="identifier_0_178" class="footnote-link footnote-identifier-link" title="which you usually don&amp;#8217;t see when coding with PyGTK">1</a></sup> that has to be set in <a href="http://developer.gnome.org/doc/GGAD/z144.html">GtkWidgetClass</a><sup><a href="http://shlomme.diotavelli.net/2009/05/17/scrollable-widgets-with-pygtk/#footnote_1_178" id="identifier_1_178" class="footnote-link footnote-identifier-link" title="ditto">2</a></sup>.</p>
<p>Some more documentation reading reveals that you can set this signal by using the <code>set_set_scroll_adjustments_signal</code> method on a widget:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">class</span> ScrollableWidget<span style="color: black;">&#40;</span>gtk.<span style="color: black;">DrawingArea</span><span style="color: black;">&#41;</span>:
    __gsignals__ = <span style="color: black;">&#123;</span>
        <span style="color: #483d8b;">&quot;set-scroll-adjustments&quot;</span>: <span style="color: black;">&#40;</span>
            gobject.<span style="color: black;">SIGNAL_RUN_LAST</span>,
            gobject.<span style="color: black;">TYPE_NONE</span>, <span style="color: black;">&#40;</span>gtk.<span style="color: black;">Adjustment</span>, gtk.<span style="color: black;">Adjustment</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>,
    <span style="color: black;">&#125;</span>
&nbsp;
    <span style="color: #ff7700;font-weight:bold;">def</span> <span style="color: #0000cd;">__init__</span><span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>:
        gtk.<span style="color: black;">DrawingArea</span>.<span style="color: #0000cd;">__init__</span><span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>
        <span style="color: #008000;">self</span>.<span style="color: black;">set_set_scroll_adjustments_signal</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;set-scroll-adjustments&quot;</span><span style="color: black;">&#41;</span></pre></div></div>

<p>It doesn&#8217;t really matter <em>what</em> you call the signal as long as it takes two arguments (the horizontal and vertical adjustment). This will make the method <a href="http://library.gnome.org/devel/pygtk/stable/class-gtkwidget.html#method-gtkwidget--set-scroll-adjustments"><code>set_scroll_adjustments</code></a> (which you can&#8217;t override from within Python) return <em>True</em> when it is called and signal that the widget supports scrolling.</p>
<p>This, however, is only half the story, because the scrollable widget still needs the adjustments handed in via said method. It&#8217;s of course possible to connect to the signal explicitly, but there&#8217;s an even more direct way: action signals. </p>
<p>Action signals are the C programmer&#8217;s idea of &#8220;generic methods&#8221;. Such a signal has to carry the flag <code>gobject.SIGNAL_ACTION</code> and is directly connected to a function which is then called on each signal emission. While in C you have to provide a function pointer, in Python you can just implement a method with a compounded magic name<sup><a href="http://shlomme.diotavelli.net/2009/05/17/scrollable-widgets-with-pygtk/#footnote_2_178" id="identifier_2_178" class="footnote-link footnote-identifier-link" title="a technique which I thoroughly dislike and should be converted to be used with decorators">3</a></sup> and have it called automatically. I haven&#8217;t found any documentation on that in the PyGObject or PyGTK docs, only some examples on the web:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">class</span> ScrollableWidget<span style="color: black;">&#40;</span>gtk.<span style="color: black;">DrawingArea</span><span style="color: black;">&#41;</span>:
    __gsignals__ = <span style="color: black;">&#123;</span>
        <span style="color: #483d8b;">&quot;set-scroll-adjustments&quot;</span>: <span style="color: black;">&#40;</span>
            gobject.<span style="color: black;">SIGNAL_RUN_LAST</span> | gobject.<span style="color: black;">SIGNAL_ACTION</span>, 
            gobject.<span style="color: black;">TYPE_NONE</span>, <span style="color: black;">&#40;</span>gtk.<span style="color: black;">Adjustment</span>, gtk.<span style="color: black;">Adjustment</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>,
    <span style="color: black;">&#125;</span>
&nbsp;
    <span style="color: #ff7700;font-weight:bold;">def</span> <span style="color: #0000cd;">__init__</span><span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>:
        gtk.<span style="color: black;">DrawingArea</span>.<span style="color: #0000cd;">__init__</span><span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>
        <span style="color: #008000;">self</span>.<span style="color: black;">set_set_scroll_adjustments_signal</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;set-scroll-adjustments&quot;</span><span style="color: black;">&#41;</span>
&nbsp;
    <span style="color: #ff7700;font-weight:bold;">def</span> do_set_scroll_adjustments<span style="color: black;">&#40;</span><span style="color: #008000;">self</span>, h_adjustment, v_adjustment<span style="color: black;">&#41;</span>:
         <span style="color: #808080; font-style: italic;"># do some useful stuff here, like saving them</span>
         ...</pre></div></div>

<p>The method being called on emission has to start with <code>do_</code>, followed by the signal name with hyphens replaced by underscores.</p>
<p>The adjustment objects can then be configured to one&#8217;s own liking to have scroll bars show up or not. However, to know when the user did some scrolling, it&#8217;s necessary to listen to some signals:</p>
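<p>The naming rule is simple enough to write down as a one-liner. The helper below is purely illustrative (the name <code>default_handler_name</code> is made up and is not part of PyGTK or PyGObject); it only spells out the string transformation described above.</p>

```python
def default_handler_name(signal_name):
    """Derive the method name looked up for a signal's default handler."""
    return "do_" + signal_name.replace("-", "_")
```

<p>For the signal used here, <code>default_handler_name("set-scroll-adjustments")</code> gives <code>do_set_scroll_adjustments</code>, which is exactly the method implemented in the class.</p>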

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">    <span style="color: #ff7700;font-weight:bold;">def</span> do_set_scroll_adjustments<span style="color: black;">&#40;</span><span style="color: #008000;">self</span>, h_adjustment, v_adjustment<span style="color: black;">&#41;</span>:
        h_adjustment.<span style="color: black;">connect</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;value-changed&quot;</span>, <span style="color: #008000;">self</span>._scroll_value_changed<span style="color: black;">&#41;</span>
        <span style="color: #008000;">self</span>._hadj = h_adjustment
        v_adjustment.<span style="color: black;">connect</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;value-changed&quot;</span>, <span style="color: #008000;">self</span>._scroll_value_changed<span style="color: black;">&#41;</span>
        <span style="color: #008000;">self</span>._vadj = v_adjustment</pre></div></div>

<p>To make the scroll bar show up, modify <code>upper</code>, <code>lower</code> and <code>page_size</code> on the adjustments.</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #008000;">self</span>._hadj.<span style="color: black;">lower</span> = <span style="color: #ff4500;">0</span>
<span style="color: #008000;">self</span>._hadj.<span style="color: black;">upper</span> = <span style="color: #ff4500;">50</span>
<span style="color: #008000;">self</span>._hadj.<span style="color: black;">page_size</span> = <span style="color: #ff4500;">10</span></pre></div></div>

<p>This tells the scrollbar that the size of the underlying picture (<code>upper - lower</code>) is 50, while the visible size (<code>page_size</code>) is 10. </p>
<p>The page size obviously depends on the current size of the widget, which can be retrieved from the underlying <code>gtk.gdk.Window</code>:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">width, height = <span style="color: #008000;">self</span>.<span style="color: black;">window</span>.<span style="color: black;">get_size</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></pre></div></div>

<p>The current position of the scroll bar is controlled by the property <code>value</code> of the adjustment object and should be in the range <code>[lower .. upper - page_size]</code>. Whenever the property is changed, the <code>value-changed</code> signal is emitted, which we&#8217;ve connected to previously, and the widget can be repainted.</p>
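<p>As a rough illustration of that range (a minimal sketch, not taken from the widget code; <code>clamp_scroll_value</code> is a hypothetical helper), a requested position can be clamped before it is assigned to <code>value</code>:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">def clamp_scroll_value(value, lower, upper, page_size):
    """Clamp value into [lower, upper - page_size] (hypothetical helper)."""
    # If the page is larger than the picture, the only valid position is lower.
    highest = max(lower, upper - page_size)
    return max(lower, min(value, highest))

# With the numbers used above (lower=0, upper=50, page_size=10),
# the valid positions run from 0 to 40.
assert clamp_scroll_value(45, 0, 50, 10) == 40
assert clamp_scroll_value(-3, 0, 50, 10) == 0
assert clamp_scroll_value(12, 0, 50, 10) == 12</pre></div></div>
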
<p>If you&#8217;re curious, you can also see the whole gloriousness in <a href="http://www.cl.uzh.ch/kitt/hg/sta/torsten/file/dc2c113ef300/STA/app/ui/gtktreeview.py#l199">actual working code</a>.</p>
<ol class="footnotes"><li id="footnote_0_178" class="footnote">which you usually don&#8217;t see when coding with PyGTK</li><li id="footnote_1_178" class="footnote">ditto</li><li id="footnote_2_178" class="footnote">a technique which I thoroughly dislike and which should be converted to use decorators</li></ol>]]></content:encoded>
			<wfw:commentRss>http://shlomme.diotavelli.net/2009/05/17/scrollable-widgets-with-pygtk/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The big wheel of commits</title>
		<link>http://shlomme.diotavelli.net/2009/01/31/the-big-wheel-of-commits/</link>
		<comments>http://shlomme.diotavelli.net/2009/01/31/the-big-wheel-of-commits/#comments</comments>
		<pubDate>Sat, 31 Jan 2009 22:28:51 +0000</pubDate>
		<dc:creator>shlomme</dc:creator>
				<category><![CDATA[lang:en]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[treealigner]]></category>

		<guid isPermaLink="false">http://shlomme.diotavelli.net/?p=99</guid>
		<description><![CDATA[Yesterday, I merged the frame semantics branch, which I have been working on for my MSc thesis, into my personal repository. Since such a grave step always demands introspection, I looked at all 513 commits I ever made to the TreeAligner repository and created a little statistic on commit times. The picture contains a 24h [...]]]></description>
			<content:encoded><![CDATA[<p>Yesterday, I merged the frame semantics branch, which I have been working on for my MSc thesis, into my <a href="http://hg.diotavelli.net/sta/shlomme">personal repository</a>. Since such a grave step always demands introspection, I looked at all 513 commits I ever made to the TreeAligner repository and created a little statistic on commit times. </p>
<p>The picture contains a 24h clock and shows the number of commits which were done in this hour of the day, scaled by the highest commit count (that being 2 in the afternoon).</p>
<div style="text-align: center">
<img src="http://diotavelli.net/files/img/commitwheel.png" alt="Commit statistics" /></div>
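<p>The scaling itself can be sketched in a few lines of Python. This is not the script that produced the picture; the function name and the sample timestamps are made up for illustration:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">from collections import Counter
from datetime import datetime

def scaled_hour_histogram(timestamps):
    """Return 24 values in [0, 1]: commits per hour, divided by the busiest hour's count."""
    counts = Counter(ts.hour for ts in timestamps)
    peak = max(counts.values()) if counts else 1
    return [counts.get(hour, 0) / float(peak) for hour in range(24)]

# Hypothetical sample data: two commits at 14h, one at 22h.
sample = [
    datetime(2009, 1, 30, 14, 5),
    datetime(2009, 1, 30, 14, 40),
    datetime(2009, 1, 31, 22, 10),
]
wheel = scaled_hour_histogram(sample)
assert wheel[14] == 1.0
assert wheel[22] == 0.5</pre></div></div>
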
<p>What does all this tell us, apart from the fact that I had a free hour today? Well, I never code at 5 or 6 in the morning. For the rest of that, it&#8217;s more worthwhile to split the statistics in two parts:</p>
<div style="text-align: center">
<img src="http://diotavelli.net/files/img/before_master.png" alt="Commit statistics, before 10/2008" /><br />
08/2007 – 10/2008, 234 commits
</div>
<div style="text-align: center">
<img src="http://diotavelli.net/files/img/during_master.png" alt="Commit statistics, starting 10/2008" /><br />
10/2008 – now, 279 commits
</div>
<p>Wow, I used to be cool. Hacking late in the evening. Now, it&#8217;s just a day job.</p>
]]></content:encoded>
			<wfw:commentRss>http://shlomme.diotavelli.net/2009/01/31/the-big-wheel-of-commits/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>KONVENS &#8217;08</title>
		<link>http://shlomme.diotavelli.net/2008/06/22/konvens-08/</link>
		<comments>http://shlomme.diotavelli.net/2008/06/22/konvens-08/#comments</comments>
		<pubDate>Sun, 22 Jun 2008 14:59:29 +0000</pubDate>
		<dc:creator>shlomme</dc:creator>
				<category><![CDATA[programming]]></category>
		<category><![CDATA[treealigner]]></category>

		<guid isPermaLink="false">http://shlomme.diotavelli.net/?p=93</guid>
		<description><![CDATA[Just this week, our paper „Extending the TIGER Query Language with Universal Quantification” for KONVENS &#8217;08 has been accepted. The abstract: The query language in TIGERSearch is limited due to its lack of universal quantification. This restriction makes it impossible to ask simple queries like „Find sentences that do not include a certain word”. We [...]]]></description>
			<content:encoded><![CDATA[<p>Just this week, our paper „Extending the TIGER Query Language with Universal Quantification” for <a href="http://konvens.dwds.de">KONVENS &#8217;08</a> has been accepted. The abstract: </p>
<blockquote><p>
The query language in TIGERSearch is limited due to its lack of universal quantification. This restriction makes it impossible to ask simple queries like „Find sentences that do not include a certain word”.  We propose an easy way to formulate such queries. We have implemented this extension to the query language in a tool that allows querying parallel treebanks including their alignment constraints. Our implementation of universal quantification relies on the view of node sets rather than single node unification. Our query tool is freely available.
</p></blockquote>
<p>This means that I&#8217;ll likely be in Berlin for the conference early October.</p>
<p>I haven&#8217;t really managed to do a lot of interesting things for the TreeAligner lately. The query evaluation for complex queries should be a little faster due to a smarter algorithm/data structure, we now have Undo/Redo support (shamelessly copying over all the frameworks from the Eclipse people, only with less code) and the tree renderer is a little faster.</p>
<p>The next step now should be the web-based query server so that interested researchers can try out the query evaluator without having to download the TreeAligner package and go through the hassle of installing GTK+ on Windows.</p>
]]></content:encoded>
			<wfw:commentRss>http://shlomme.diotavelli.net/2008/06/22/konvens-08/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>I have no life and I must program</title>
		<link>http://shlomme.diotavelli.net/2008/05/31/i-have-no-life-and-i-must-program/</link>
		<comments>http://shlomme.diotavelli.net/2008/05/31/i-have-no-life-and-i-must-program/#comments</comments>
		<pubDate>Sat, 31 May 2008 10:44:53 +0000</pubDate>
		<dc:creator>shlomme</dc:creator>
				<category><![CDATA[coli]]></category>
		<category><![CDATA[lang:en]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://shlomme.diotavelli.net/?p=91</guid>
		<description><![CDATA[Some participants of this semester&#8217;s „Computational Linguistics” course (which is a code word for „10 different lecturers guide you through the wonderful world of algorithms and theoretical foundations of CoLi”) obviously lack a life and willfully extended their own homework assignment, writing small toolkits for finite state automata. Surprisingly, all those toolkits were written in [...]]]></description>
			<content:encoded><![CDATA[<p>Some participants of this semester&#8217;s „Computational Linguistics” course (which is a code word for „10 different lecturers guide you through the wonderful world of algorithms and theoretical foundations of CoLi”) obviously lack a life and willfully extended their own homework assignment, writing small toolkits for finite state automata.</p>
<p>Surprisingly, all those toolkits were written in Python and made our C++-affine lecturer wonder if he probably should look into this language some more, apart from suggesting that we „maybe […] should get a life.”</p>
<p>So what does my toolkit do?</p>
<ul>
<li>Determinization of NFSAs to DFSAs</li>
<li>Creation of DFSAs from simple regular expressions</li>
<li>Application of FSTs</li>
<li>dot graph output for *FSAs</li>
</ul>
<p>All in all, not very impressive, and not very hard. Still, if you like <a href="http://pyparsing.wikispaces.com/">PyParsing</a> and generator-expression-prone Python code, you might want to have a look at <a href="http://diotavelli.net/files/tinyfst.tar.bz2">the TinyFST code</a>.</p>
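<p>For the curious, determinization boils down to the classic subset construction. The following is a minimal sketch for automata without epsilon transitions; it is not the TinyFST implementation, and the function name and the dictionary-based transition table are assumptions made for this example:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">def determinize(start, accepting, transitions):
    """Subset construction for an NFSA without epsilon moves.

    transitions maps (state, symbol) to a set of successor states.
    Returns (dfa_start, dfa_accepting, dfa_transitions), where every
    DFSA state is a frozenset of NFSA states.
    """
    dfa_start = frozenset([start])
    dfa_accepting = set()
    dfa_transitions = {}
    seen = set([dfa_start])
    todo = [dfa_start]
    while todo:
        current = todo.pop()
        # A subset is accepting if it contains any accepting NFSA state.
        if not current.isdisjoint(accepting):
            dfa_accepting.add(current)
        # Group the successors of all member states by input symbol.
        successors = {}
        for (state, symbol), targets in transitions.items():
            if state in current:
                successors.setdefault(symbol, set()).update(targets)
        for symbol, targets in successors.items():
            target = frozenset(targets)
            dfa_transitions[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return dfa_start, dfa_accepting, dfa_transitions

# Hypothetical NFSA for "strings over {a, b} that end in ab".
nfsa = {(0, "a"): set([0, 1]), (0, "b"): set([0]), (1, "b"): set([2])}
dfa_start, dfa_accepting, dfa_transitions = determinize(0, set([2]), nfsa)
assert dfa_accepting == set([frozenset([0, 2])])</pre></div></div>

<p>Only subsets reachable from the start state are created, so the result stays small for typical inputs even though the worst case is exponential.</p>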
]]></content:encoded>
			<wfw:commentRss>http://shlomme.diotavelli.net/2008/05/31/i-have-no-life-and-i-must-program/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Look at me, I know Functional Programming!</title>
		<link>http://shlomme.diotavelli.net/2008/05/05/look-at-me-i-know-functional-programming/</link>
		<comments>http://shlomme.diotavelli.net/2008/05/05/look-at-me-i-know-functional-programming/#comments</comments>
		<pubDate>Mon, 05 May 2008 20:22:14 +0000</pubDate>
		<dc:creator>shlomme</dc:creator>
				<category><![CDATA[lang:en]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://shlomme.diotavelli.net/2008/05/05/look-at-me-i-know-functional-programming/</guid>
		<description><![CDATA[Sometimes I wonder what the hell I might have been thinking while writing code like this1 : def uncurry&#40;f&#41;: return lambda t: f&#40;*t&#41; &#160; def longest_common_prefix&#40;Sa, Sb&#41;: return len&#40;list&#40; takewhile&#40;uncurry&#40;eq&#41;, izip&#40;Sa, Sb&#41;&#41;&#41;&#41; I bet the tenses are all wrong again.]]></description>
			<content:encoded><![CDATA[<p>Sometimes I wonder what the hell I might have been thinking while writing code like this<sup><a href="http://shlomme.diotavelli.net/2008/05/05/look-at-me-i-know-functional-programming/#footnote_0_89" id="identifier_0_89" class="footnote-link footnote-identifier-link" title="I bet the tenses are all wrong again.">1</a></sup> :</p>

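<div class="wp_syntax"><table><tr><td class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">from</span> itertools <span style="color: #ff7700;font-weight:bold;">import</span> izip, takewhile
<span style="color: #ff7700;font-weight:bold;">from</span> operator <span style="color: #ff7700;font-weight:bold;">import</span> eq
&nbsp;
<span style="color: #ff7700;font-weight:bold;">def</span> uncurry<span style="color: black;">&#40;</span>f<span style="color: black;">&#41;</span>: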
    <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #ff7700;font-weight:bold;">lambda</span> t: f<span style="color: black;">&#40;</span><span style="color: #66cc66;">*</span>t<span style="color: black;">&#41;</span>
&nbsp;
<span style="color: #ff7700;font-weight:bold;">def</span> longest_common_prefix<span style="color: black;">&#40;</span>Sa, Sb<span style="color: black;">&#41;</span>:
    <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #008000;">len</span><span style="color: black;">&#40;</span><span style="color: #008000;">list</span><span style="color: black;">&#40;</span>
        takewhile<span style="color: black;">&#40;</span>uncurry<span style="color: black;">&#40;</span>eq<span style="color: black;">&#41;</span>,
                  izip<span style="color: black;">&#40;</span>Sa, Sb<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span></pre></td></tr></table></div>
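<p>In case the intent is not obvious: <code>takewhile</code> keeps pairs as long as both elements are equal, so the result is the length of the common prefix. Below is a sketch of the same idea for Python 3, where <code>izip</code> no longer exists and the built-in <code>zip</code> is lazy; the sample strings are made up:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">from itertools import takewhile
from operator import eq

def longest_common_prefix(sa, sb):
    """Length of the longest common prefix of two sequences."""
    # zip stops at the shorter sequence; takewhile stops at the first mismatch.
    matching = takewhile(lambda pair: eq(*pair), zip(sa, sb))
    return sum(1 for _ in matching)

assert longest_common_prefix("banana", "bandana") == 3
assert longest_common_prefix("abc", "xyz") == 0
assert longest_common_prefix("abc", "abcdef") == 3</pre></div></div>
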

<ol class="footnotes"><li id="footnote_0_89" class="footnote">I bet the tenses are all wrong again.</li></ol>]]></content:encoded>
			<wfw:commentRss>http://shlomme.diotavelli.net/2008/05/05/look-at-me-i-know-functional-programming/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>GMM Code</title>
		<link>http://shlomme.diotavelli.net/2008/03/13/gmm-code/</link>
		<comments>http://shlomme.diotavelli.net/2008/03/13/gmm-code/#comments</comments>
		<pubDate>Thu, 13 Mar 2008 21:35:34 +0000</pubDate>
		<dc:creator>shlomme</dc:creator>
				<category><![CDATA[coli]]></category>
		<category><![CDATA[lang:en]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://shlomme.diotavelli.net/2008/03/13/gmm-code/</guid>
		<description><![CDATA[For one of the exercises, we had to implement the EM algorithm for Gaussian Mixture Models. I&#8217;ve spent a considerable amount of time on my solutions, either because I wanted to learn a new language (Scala version) or I wanted to not forget an old one (C++ version), so I don&#8217;t want the code simply [...]]]></description>
			<content:encoded><![CDATA[<p>For one of the exercises, we had to implement the EM algorithm for <a href="http://en.wikipedia.org/wiki/Mixture_model">Gaussian Mixture Models</a>. I&#8217;ve spent a considerable amount of time on my solutions, either because I wanted to learn a new language (<a href="http://diotavelli.net/files/code/gmm.scala">Scala version</a>) or I wanted to not forget an old one (<a href="http://diotavelli.net/files/code/gmm.cpp">C++ version</a>), so I don&#8217;t want the code simply rotting on my hard drive.</p>
<p>The C++ version isn&#8217;t that much faster than the Scala version if I remember my experiments correctly (about 4x). Judging from the call graph of the C++ version, most of the time is spent in the <i>exp</i> function anyway, which is as fast as it gets.</p>
<p><img style="margin-left:auto; margin-right:auto; display:block" src="http://diotavelli.net/files/img/gmm-callgraph.png" alt="Callgraph of the C++ version" /></p>
<p>The input file simply lists one float value per line, and the initial parameters for the Gaussians can be specified in the source files. Be advised that the number of Gaussians used to approximate the data needs to be known before the algorithm is run.</p>
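<p>For readers who do not want to dig through the Scala or C++ sources, one EM iteration for one-dimensional data can be sketched as follows. This is a simplified illustration and not a port of either version; the function names, the variance floor and the sample data are assumptions:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_step(data, weights, means, variances):
    """One EM iteration for a 1-D mixture with a fixed number of Gaussians."""
    k = len(weights)
    # E-step: responsibility of each Gaussian for each data point.
    resp = []
    for x in data:
        joint = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
        total = sum(joint)
        resp.append([p / total for p in joint])
    # M-step: re-estimate weight, mean and variance of each Gaussian.
    new_weights, new_means, new_variances = [], [], []
    for j in range(k):
        nj = sum(r[j] for r in resp)
        mean = sum(r[j] * x for r, x in zip(resp, data)) / nj
        var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, data)) / nj
        new_weights.append(nj / len(data))
        new_means.append(mean)
        new_variances.append(max(var, 1e-6))  # assumed floor against collapse
    return new_weights, new_means, new_variances

# Hypothetical data with two obvious clusters and two Gaussians to fit.
data = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
weights, means, variances = [0.5, 0.5], [0.0, 6.0], [1.0, 1.0]
for _ in range(20):
    weights, means, variances = em_step(data, weights, means, variances)</pre></div></div>

<p>The number of Gaussians is fixed by the length of the initial parameter lists, which matches the restriction mentioned above.</p>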
]]></content:encoded>
			<wfw:commentRss>http://shlomme.diotavelli.net/2008/03/13/gmm-code/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Rescuing Generics</title>
		<link>http://shlomme.diotavelli.net/2008/01/12/rescuing-generics/</link>
		<comments>http://shlomme.diotavelli.net/2008/01/12/rescuing-generics/#comments</comments>
		<pubDate>Sat, 12 Jan 2008 15:54:04 +0000</pubDate>
		<dc:creator>shlomme</dc:creator>
				<category><![CDATA[java]]></category>
		<category><![CDATA[lang:en]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://shlomme.diotavelli.net/2008/01/12/rescuing-generics/</guid>
		<description><![CDATA[This is the first part in what is planned to be a loosely-coupled series of articles on current developments in mainstream programming languages. Topics include: Evolution of Java New abstractions in programming languages The functional turn Scala: „The next programming language™” Generics in Java When I started to program Java 5 professionally after some years [...]]]></description>
			<content:encoded><![CDATA[<p>This is the first part in what is planned to be a loosely-coupled series of articles on current developments in mainstream programming languages.</p>
<p>Topics include:</p>
<ul>
<li>Evolution of Java</li>
<li>New abstractions in programming languages</li>
<li>The functional turn</li>
<li>Scala: „The next programming language™”</li>
</ul>
<h2>Generics in Java</h2>
<p>When I started to program Java 5 professionally after some years of blissful absence from the Java world, I thought myself to be well-prepared for generics. After all, I had done metaprogramming in both C++ and Python for several years.</p>
<p>Of course, experience never saves you from the perils of learning. It took some time until I finally <i>got</i> generics, including the common misunderstandings about covariance and the like. Fortunately, in the project I was working on at the time, we were allowed to go wild and try out all new features in Java 5 at length. We were the first ones to work with the new version and also carried out the internal training, so we really had to understand what generics were about, and why all tutorials usually contain more <i>don&#8217;t</i>s than <i>do</i>s.</p>
<p>When I finally understood them, I was really disappointed. The type system wasn&#8217;t generic at all! Type annotations are just some sugary coating stripped out by the compiler after the program passes the type checks. Still, generics proved to be useful from time to time. Some problems just kept coming back, and I will briefly outline them here.</p>
<h5>The <i>ugly cast</i></h5>
<p>There is going to be an <i>ugly cast</i> at some point, where you (the programmer) know more about the static or runtime type of an object than the compiler. Our strategy was to isolate <i>ugly casts</i> in minimally small methods with <tt>@SuppressWarnings("unchecked")</tt> annotations. One prominent example is serialization:</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">@SuppressWarnings<span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;unchecked&quot;</span><span style="color: #009900;">&#41;</span>
<span style="color: #000000; font-weight: bold;">public</span> List<span style="color: #339933;">&lt;</span>String<span style="color: #339933;">&gt;</span> foo<span style="color: #009900;">&#40;</span><span style="color: #003399;">ObjectInputStream</span> i<span style="color: #009900;">&#41;</span> <span style="color: #000000; font-weight: bold;">throws</span> <span style="color: #003399;">IOException</span><span style="color: #339933;">,</span> <span style="color: #003399;">ClassNotFoundException</span> <span style="color: #009900;">&#123;</span>
  <span style="color: #000000; font-weight: bold;">return</span> <span style="color: #009900;">&#40;</span>List<span style="color: #339933;">&lt;</span>String<span style="color: #339933;">&gt;</span><span style="color: #009900;">&#41;</span> i.<span style="color: #006633;">readObject</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<h5>Generic types + class objects</h5>
<p>A lot of generics pain is remedied when you always hand around <tt>Class&lt;V&gt;</tt> objects whenever you create instances of classes with generic type arguments. This is often cumbersome, as it tends to make your APIs larger, but at least provides some kind of runtime type safety.</p>
<p>We used this often enough to call it a pattern, though I think we never gave it a proper name.</p>
<h5>Sun&#8217;s Java compiler</h5>
<p>We were in for some hard lessons when we found out that Eclipse&#8217;s Java compiler was much better at type inference than Sun&#8217;s <tt>javac</tt>. These kinds of errors were especially hard to track down, and some of them were unfixed at least up to 1.5.0 Update 12.<sup><a href="http://shlomme.diotavelli.net/2008/01/12/rescuing-generics/#footnote_0_80" id="identifier_0_80" class="footnote-link footnote-identifier-link" title="most of them are fixed in Java 6">1</a></sup> </p>
<h2>The backlash</h2>
<p>Now, after Java generics have been in the wild for a little more than three years and presumably have seen a wider adoption, a backlash is forming. While early coverage was mostly apologetic of all the oddities that had to be introduced to keep bytecode compatibility<sup><a href="http://shlomme.diotavelli.net/2008/01/12/rescuing-generics/#footnote_1_80" id="identifier_1_80" class="footnote-link footnote-identifier-link" title="see Generics gotchas for a good example">2</a></sup>, a lot of complaints about the (perceived) complexity of generics are now being heard.</p>
<p>Before I&#8217;m going to dive into an example, let me state the following:</p>
<ul>
<li>Yes, Java code does get uglier and less readable with generics,<br /> <i>…but a lot of that could be addressed with typedefs.</i></li>
<li>Yes, generics have a lot of gotchas,<br /> <i>…but most of them are due to backwards compatibility. I would have liked to hear the millions of IDE monkeys cry in horror if BC had been broken.</i></li>
<li>Yes, generics are difficult to grasp.<br /> <i>Get over it. Seriously.</i></li>
</ul>
<h2>Killing Wildcards</h2>
<p>In his recent article <a href="http://www.artima.com/weblogs/viewpost.jsp?thread=222021">„Simplifying Java Generics by Eliminating Wildcards”</a>, Howard Lovatt argues that Java generics could be simplified by removing wildcards altogether and making covariant generic types the default behavior, similar to arrays.</p>
<p>Please note that the following code examples assume that you have read the article.</p>
<p>In arrays, we observe the following behavior:</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #003399;">List</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> listArray <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">List</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">10</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
<span style="color: #003399;">Collection</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> collArray <span style="color: #339933;">=</span> listArray<span style="color: #339933;">;</span>
collArray<span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">0</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">HashSet</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span> <span style="color: #666666; font-style: italic;">// will result in an ArrayStoreException</span></pre></div></div>

<p>His argument is that this behavior could simply be adopted for generics, making this code compile:</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">List<span style="color: #339933;">&lt;</span>List<span style="color: #339933;">&gt;</span> listList <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> ArrayList<span style="color: #339933;">&lt;</span>List<span style="color: #339933;">&gt;</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
List<span style="color: #339933;">&lt;</span>Collection<span style="color: #339933;">&gt;</span> collList <span style="color: #339933;">=</span> listList<span style="color: #339933;">;</span>
collList.<span style="color: #006633;">add</span><span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">HashSet</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>Now, suppose the compiler accepted this piece of code (which it doesn&#8217;t), how should an exception similar to <tt>ArrayStoreException</tt> be thrown? The generic types are not known at runtime, in contrast to arrays<sup><a href="http://shlomme.diotavelli.net/2008/01/12/rescuing-generics/#footnote_2_80" id="identifier_2_80" class="footnote-link footnote-identifier-link" title="see the documentation of Class.getComponentType()">3</a></sup> , since they couldn&#8217;t be added without breaking backwards compatibility. The only way to ensure the type safety is to have the class object inside <tt>List</tt> and check if newly added objects have the correct type, laying the burden of type checking on the class designer. While this may be acceptable for the standard library, it is certainly not acceptable for general usage.</p>
<p>An example taken from Lovatt&#8217;s article to demonstrate the lack of power of generics in Java (and Scala) is:</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"> ListScalaStyle<span style="color: #339933;">&lt;</span>Integer<span style="color: #339933;">&gt;</span> iList <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> ListScalaStyle<span style="color: #339933;">&lt;</span>Integer<span style="color: #339933;">&gt;</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
 ListScalaStyle<span style="color: #339933;">&lt;</span>Number<span style="color: #339933;">&gt;</span> nList <span style="color: #339933;">=</span> iList.<span style="color: #006633;">prepend</span><span style="color: #009900;">&#40;</span> <span style="color: #009900;">&#40;</span><span style="color: #003399;">Number</span><span style="color: #009900;">&#41;</span><span style="color: #cc66cc;">2.0</span> <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span> <span style="color: #666666; font-style: italic;">// OK</span>
 ListScalaStyle<span style="color: #339933;">&lt;</span>Integer<span style="color: #339933;">&gt;</span> iList2 <span style="color: #339933;">=</span> nList.<span style="color: #006633;">tail</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span> <span style="color: #666666; font-style: italic;">// Error, still a Number list</span></pre></div></div>

<p>This is exactly the pathological case of the <i>ugly cast</i>. You, the programmer, know the static type of something and expect the compiler to be able to infer it as well.</p>
<p>To show why this cannot work in general, I&#8217;ll use a trick I&#8217;ve found to be very helpful: adding a little bit of randomization.</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">public</span> ListScalaStyle<span style="color: #339933;">&lt;</span>Number<span style="color: #339933;">&gt;</span> getListWithTail<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
  <span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">&#40;</span><span style="color: #003399;">Math</span>.<span style="color: #006633;">random</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">&gt;</span> .5<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    ListScalaStyle<span style="color: #339933;">&lt;</span>Integer<span style="color: #339933;">&gt;</span> iList <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> ListScalaStyle<span style="color: #339933;">&lt;</span>Integer<span style="color: #339933;">&gt;</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">return</span> iList.<span style="color: #006633;">prepend</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span><span style="color: #003399;">Number</span><span style="color: #009900;">&#41;</span><span style="color: #cc66cc;">2.0</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">else</span> <span style="color: #009900;">&#123;</span>
    ListScalaStyle<span style="color: #339933;">&lt;</span>Double<span style="color: #339933;">&gt;</span> iList <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> ListScalaStyle<span style="color: #339933;">&lt;</span>Double<span style="color: #339933;">&gt;</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">return</span> iList.<span style="color: #006633;">prepend</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span><span style="color: #003399;">Number</span><span style="color: #009900;">&#41;</span><span style="color: #cc66cc;">2.0</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span>
<span style="color: #666666; font-style: italic;">// ...</span>
<span style="color: #666666; font-style: italic;">// can never work</span>
ListScalaStyle<span style="color: #339933;">&lt;</span>Integer<span style="color: #339933;">&gt;</span> iList2 <span style="color: #339933;">=</span> nList.<span style="color: #006633;">tail</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span> <span style="color: #666666; font-style: italic;">// Error, still a Number list</span></pre></div></div>

<p>This little example is of course not a total refutation &#8211; a compiler that can infer more type information statically might always be useful. However, such inference will always be limited to small pieces of code. It also forces the compiler to actually examine the bytecode of functions in order to see the flow of objects, because <tt>prepend</tt> might do something wildly different. This removes many advantages of polymorphism, a technique at the very heart of Java.</p>
<h2>Conclusion and Outlook</h2>
<p>With generics, Java gets more complicated. It allows programmers to make interesting abstractions, but also freely hands out all kinds of guns for shooting yourself in the foot. This is definitely a deviation from Java&#8217;s original design principles and, ironically, makes it a bit more like C++ &#8211; something which the designers tried to avoid as hard as possible.</p>
<p>In the upcoming articles, we will examine the question why having generics this way still might be a good idea (though for totally different reasons), why the Java designers did it in the first place, and what the generics <i>disaster</i> (Tim Bray) teaches us about the design and evolution of programming languages.</p>
<ol class="footnotes"><li id="footnote_0_80" class="footnote">most of them are fixed in Java 6</li><li id="footnote_1_80" class="footnote">see <a href="http://www.ibm.com/developerworks/java/library/j-jtp01255.html">Generics gotchas</a> for a good example</li><li id="footnote_2_80" class="footnote">see the documentation of <a href="http://java.sun.com/j2se/1.4.2/docs/api/java/lang/Class.html#getComponentType()"><tt>Class.getComponentType()</tt></a></li></ol>]]></content:encoded>
			<wfw:commentRss>http://shlomme.diotavelli.net/2008/01/12/rescuing-generics/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Random Python unittesting snippet #1</title>
		<link>http://shlomme.diotavelli.net/2007/12/18/random-python-unittesting-snippet-1/</link>
		<comments>http://shlomme.diotavelli.net/2007/12/18/random-python-unittesting-snippet-1/#comments</comments>
		<pubDate>Tue, 18 Dec 2007 00:24:01 +0000</pubDate>
		<dc:creator>shlomme</dc:creator>
				<category><![CDATA[lang:en]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://shlomme.diotavelli.net/2007/12/18/random-python-unittesting-snippet-1/</guid>
		<description><![CDATA[A small context manager for stating assertions about exceptions in Python unit tests: @contextmanager def assert_raises&#40;*exc_types&#41;: &#34;&#34;&#34;A context to ensure that an exception of a given type is thrown. &#160; Instead of &#160; @nose.tools.raises(ValueError) def test_that_raises(): # ... lengthy setup raise ValueError &#160; you can write &#160; def test_that_raises_at_the_end(): # ... lengthy setup with assert_raises(ValueError): [...]]]></description>
			<content:encoded><![CDATA[<p>A small context manager for stating assertions about exceptions in Python unit tests:</p>

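<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">from</span> contextlib <span style="color: #ff7700;font-weight:bold;">import</span> contextmanager
&nbsp;
@contextmanager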
<span style="color: #ff7700;font-weight:bold;">def</span> assert_raises<span style="color: black;">&#40;</span><span style="color: #66cc66;">*</span>exc_types<span style="color: black;">&#41;</span>:
    <span style="color: #483d8b;">&quot;&quot;&quot;A context to ensure that an exception of a given type is thrown.
&nbsp;
    Instead of
&nbsp;
    @nose.tools.raises(ValueError)
    def test_that_raises():
        # ... lengthy setup
        raise ValueError
&nbsp;
    you can write
&nbsp;
    def test_that_raises_at_the_end():
        # ... lengthy setup
        with assert_raises(ValueError):
            raise ValueError
&nbsp;
    to make the scope for catching exceptions as small as possible.
    &quot;&quot;&quot;</span>
    <span style="color: #ff7700;font-weight:bold;">try</span>:
        <span style="color: #ff7700;font-weight:bold;">yield</span>
    <span style="color: #ff7700;font-weight:bold;">except</span> exc_types:
        <span style="color: #ff7700;font-weight:bold;">pass</span>
    <span style="color: #ff7700;font-weight:bold;">else</span>:
        <span style="color: #ff7700;font-weight:bold;">raise</span> <span style="color: #008000;">AssertionError</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;Failed to throw exception of type(s) %s.&quot;</span> <span style="color: #66cc66;">%</span> <span style="color: black;">&#40;</span>
            <span style="color: #483d8b;">&quot;, &quot;</span>.<span style="color: black;">join</span><span style="color: black;">&#40;</span>exc_type.__name__ <span style="color: #ff7700;font-weight:bold;">for</span> exc_type <span style="color: #ff7700;font-weight:bold;">in</span> exc_types<span style="color: black;">&#41;</span>,<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span></pre></div></div>
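
<p>To make the three possible outcomes concrete, here is a minimal, self-contained sketch. It repeats the context manager so that it runs on its own; the helper <tt>outcome</tt> and the two small action functions are hypothetical names introduced only for this illustration, not part of the snippet above.</p>

```python
from contextlib import contextmanager


@contextmanager
def assert_raises(*exc_types):
    """Same logic as the snippet above, repeated so this sketch runs on its own."""
    try:
        yield
    except exc_types:
        pass  # an expected exception was raised: the assertion holds
    else:
        raise AssertionError("Failed to throw exception of type(s) %s." % (
            ", ".join(exc_type.__name__ for exc_type in exc_types),))


def outcome(action, *exc_types):
    """Hypothetical helper: run action inside assert_raises and name the result."""
    try:
        with assert_raises(*exc_types):
            action()
    except AssertionError:
        return "failed"  # nothing was raised inside the with block
    except Exception as exc:
        return "propagated " + type(exc).__name__  # an unexpected type escaped
    return "passed"


def raise_value_error():
    raise ValueError("boom")


def do_nothing():
    pass


results = [
    outcome(raise_value_error, ValueError),            # expected type is raised
    outcome(do_nothing, ValueError),                   # nothing is raised
    outcome(raise_value_error, KeyError, TypeError),   # a different type is raised
]
```

<p>An expected exception is swallowed, a block that raises nothing fails with an <tt>AssertionError</tt>, and an exception of any other type passes through untouched, so the test runner reports it as an error rather than a failure.</p>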

]]></content:encoded>
			<wfw:commentRss>http://shlomme.diotavelli.net/2007/12/18/random-python-unittesting-snippet-1/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Stockholm TreeAligner 0.8 „Gamla Stan” released</title>
		<link>http://shlomme.diotavelli.net/2007/12/13/stockholm-treealigner-08-%e2%80%9egamla-stan%e2%80%9d-released/</link>
		<comments>http://shlomme.diotavelli.net/2007/12/13/stockholm-treealigner-08-%e2%80%9egamla-stan%e2%80%9d-released/#comments</comments>
		<pubDate>Thu, 13 Dec 2007 00:14:10 +0000</pubDate>
		<dc:creator>shlomme</dc:creator>
				<category><![CDATA[coli]]></category>
		<category><![CDATA[lang:en]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[treealigner]]></category>

		<guid isPermaLink="false">http://shlomme.diotavelli.net/2007/12/13/stockholm-treealigner-08-%e2%80%9egamla-stan%e2%80%9d-released/</guid>
		<description><![CDATA[It&#8217;s only a couple of months late, but we&#8217;ve just released a new version of the Stockholm TreeAligner to an awed audience. This release features the prototype implementations of TIGERSearch and alignment queries, which will be perfected in the next release, due in March 2008. For those who are wondering what kind of code name [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s only a couple of months late, but we&#8217;ve just released a new version of the <a href="http://dev.ling.su.se/treealigner">Stockholm TreeAligner</a> to an awed audience. This release features the prototype implementations of TIGERSearch and alignment queries, which will be perfected in the next release, due in March 2008.</p>
<p>For those who are wondering what kind of code name <a href="http://en.wikipedia.org/wiki/Gamla_Stan">Gamla Stan</a> is: STA releases are named after Stockholm&#8217;s subway stations.</p>
<p>Align your trees while the release is still hot!</p>
]]></content:encoded>
			<wfw:commentRss>http://shlomme.diotavelli.net/2007/12/13/stockholm-treealigner-08-%e2%80%9egamla-stan%e2%80%9d-released/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

