<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>I See Dead Code</title>
	<atom:link href="http://shlomme.diotavelli.net/feed/" rel="self" type="application/rss+xml" />
	<link>http://shlomme.diotavelli.net</link>
	<description>… as sounding brass, or a tinkling cymbal.</description>
	<lastBuildDate>Sun, 12 Jul 2009 19:38:04 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Breaking Event Cycles</title>
		<link>http://shlomme.diotavelli.net/2009/07/12/breaking-event-cycles/</link>
		<comments>http://shlomme.diotavelli.net/2009/07/12/breaking-event-cycles/#comments</comments>
		<pubDate>Sun, 12 Jul 2009 19:38:04 +0000</pubDate>
		<dc:creator>shlomme</dc:creator>
				<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://shlomme.diotavelli.net/?p=242</guid>
		<description><![CDATA[A recurring problem in GUI programming (and elsewhere) is event cycles, i.e. receiving events that one has triggered oneself. Change then triggers change triggers change, until something gives out, usually the stack.
A particularly cheap way of breaking such cycles is a simple boolean flag:

class Foo&#40;object&#41;:
   listen = True
  [...]]]></description>
			<content:encoded><![CDATA[<p>A recurring problem in GUI programming (and elsewhere) is event cycles, i.e. receiving events that one has triggered oneself. Change then triggers change triggers change, until something gives out, usually the stack.</p>
<p>A particularly cheap way of breaking such cycles is a simple boolean flag:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">class</span> Foo<span style="color: black;">&#40;</span><span style="color: #008000;">object</span><span style="color: black;">&#41;</span>:
    listen = <span style="color: #008000;">True</span>
&nbsp;
    <span style="color: #ff7700;font-weight:bold;">def</span> some_method<span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>:
        <span style="color: #008000;">self</span>.<span style="color: black;">listen</span> = <span style="color: #008000;">False</span>
        <span style="color: #ff7700;font-weight:bold;">try</span>:
            <span style="color: #808080; font-style: italic;"># do stuff that triggers some_event</span>
            <span style="color: #ff7700;font-weight:bold;">pass</span>
        <span style="color: #ff7700;font-weight:bold;">finally</span>:
            <span style="color: #008000;">self</span>.<span style="color: black;">listen</span> = <span style="color: #008000;">True</span>
&nbsp;
    <span style="color: #ff7700;font-weight:bold;">def</span> on_some_event<span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>:
        <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #ff7700;font-weight:bold;">not</span> <span style="color: #008000;">self</span>.<span style="color: black;">listen</span>:
            <span style="color: #ff7700;font-weight:bold;">return</span>
        <span style="color: #808080; font-style: italic;"># handle event normally</span></pre></div></div>

<p>Boolean flags are great. They&#8217;re simple, they&#8217;ll make your code harder to follow, and half of the time you&#8217;ll forget to set them back to <tt>True</tt> after you&#8217;re done. Or you&#8217;ll forget the <tt>try...finally</tt> block. Or you&#8217;ll forget to check them in the event handler. What could possibly go wrong?<br />
<!-- more --></p>
<p>Fortunately, some GUI toolkits provide ways to temporarily disable event handling for specific events, like GObject&#8217;s <a href="http://library.gnome.org/devel/pygobject/stable/class-gobject.html#method-gobject--handler-block-by-func"><tt>handler_block_by_func</tt></a>. This approach has two problems:</p>
<ul>
<li>You have to know the object (or objects) that emits the event.</li>
<li>It only works for GObject events (signals).</li>
</ul>
<p>Since I have classes with their own event handling mechanism, want to stay independent of GObject, and the whole thing isn&#8217;t really difficult to implement, I wanted a way of temporarily blocking event handling that works across event frameworks. Or maybe I just wanted to write another decorator/context manager very badly. Maybe a little bit of both.</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">class</span> Foo<span style="color: black;">&#40;</span><span style="color: #008000;">object</span><span style="color: black;">&#41;</span>:
    <span style="color: #ff7700;font-weight:bold;">def</span> some_method<span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>:
        <span style="color: #ff7700;font-weight:bold;">with</span> <span style="color: #008000;">self</span>.<span style="color: black;">on_some_event</span>.<span style="color: black;">suspend</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>:
            <span style="color: #808080; font-style: italic;"># do stuff that triggers some_event</span>
            <span style="color: #ff7700;font-weight:bold;">pass</span>
&nbsp;
    @suspendable
    <span style="color: #ff7700;font-weight:bold;">def</span> on_some_event<span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>:
        <span style="color: #808080; font-style: italic;"># handle event normally</span>
        <span style="color: #ff7700;font-weight:bold;">pass</span></pre></div></div>

<p>The event handlers are wrapped inside another function which swallows the event while the context manager is active. The code remains blissfully ignorant of threading issues for now and also breaks down if the event handlers have meaningful return values, because the wrapper discards whatever the handler returns. I&#8217;ve never needed this so far, but adding default return values to the decorator is a simple extension. The methods are still proper bound methods, and docstrings etc. are all preserved. </p>
<p>Without further ado, the code:</p>

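<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">from</span> contextlib <span style="color: #ff7700;font-weight:bold;">import</span> contextmanager
<span style="color: #ff7700;font-weight:bold;">from</span> functools <span style="color: #ff7700;font-weight:bold;">import</span> wraps
&nbsp;
<span style="color: #ff7700;font-weight:bold;">class</span> Suspender<span style="color: black;">&#40;</span><span style="color: #008000;">object</span><span style="color: black;">&#41;</span>: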
    <span style="color: #ff7700;font-weight:bold;">def</span> <span style="color: #0000cd;">__init__</span><span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>:
        <span style="color: #008000;">self</span>.<span style="color: black;">is_suspended</span> = <span style="color: #ff4500;">0</span>
&nbsp;
    @contextmanager
    <span style="color: #ff7700;font-weight:bold;">def</span> <span style="color: #0000cd;">__call__</span><span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>:
        <span style="color: #008000;">self</span>.<span style="color: black;">is_suspended</span> += <span style="color: #ff4500;">1</span>
        <span style="color: #ff7700;font-weight:bold;">try</span>:
            <span style="color: #ff7700;font-weight:bold;">yield</span>
        <span style="color: #ff7700;font-weight:bold;">finally</span>:
            <span style="color: #008000;">self</span>.<span style="color: black;">is_suspended</span> -= <span style="color: #ff4500;">1</span>
&nbsp;
    @<span style="color: #008000;">classmethod</span>
    <span style="color: #ff7700;font-weight:bold;">def</span> suspendable<span style="color: black;">&#40;</span>cls, meth<span style="color: black;">&#41;</span>:
        suspend_manager = cls<span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
&nbsp;
        wrapper = suspend_manager.<span style="color: black;">add_suspendable</span><span style="color: black;">&#40;</span>meth<span style="color: black;">&#41;</span>
        wrapper.<span style="color: black;">suspend</span> = suspend_manager
        wrapper.<span style="color: black;">add_suspendable</span> = suspend_manager.<span style="color: black;">add_suspendable</span>
        <span style="color: #ff7700;font-weight:bold;">return</span> wrapper
&nbsp;
&nbsp;
    <span style="color: #ff7700;font-weight:bold;">def</span> add_suspendable<span style="color: black;">&#40;</span><span style="color: #008000;">self</span>, meth<span style="color: black;">&#41;</span>:
        @wraps<span style="color: black;">&#40;</span>meth<span style="color: black;">&#41;</span>
        <span style="color: #ff7700;font-weight:bold;">def</span> suspended_wrapper<span style="color: black;">&#40;</span><span style="color: #66cc66;">*</span>args<span style="color: black;">&#41;</span>:
            <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #008000;">self</span>.<span style="color: black;">is_suspended</span> == <span style="color: #ff4500;">0</span>:
                meth<span style="color: black;">&#40;</span><span style="color: #66cc66;">*</span>args<span style="color: black;">&#41;</span>
&nbsp;
        <span style="color: #ff7700;font-weight:bold;">return</span> suspended_wrapper
&nbsp;
suspendable = Suspender.<span style="color: black;">suspendable</span></pre></div></div>
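<p>The default return value extension mentioned above could look roughly like this. It is a minimal, self-contained sketch rather than part of the original code: <tt>suspendable_with_default</tt>, the <tt>default</tt> parameter and the <tt>Validator</tt> class are hypothetical names, and unlike the original wrapper this one also forwards keyword arguments and the return value of the handler.</p>

```python
from contextlib import contextmanager
from functools import wraps


class Suspender(object):
    """Counts active suspensions; handlers are skipped while the count is non-zero."""

    def __init__(self, default=None):
        self.is_suspended = 0
        self.default = default  # hypothetical: value returned while suspended

    @contextmanager
    def __call__(self):
        self.is_suspended += 1
        try:
            yield
        finally:
            self.is_suspended -= 1

    def add_suspendable(self, meth):
        @wraps(meth)
        def suspended_wrapper(*args, **kwargs):
            if self.is_suspended:
                return self.default
            return meth(*args, **kwargs)
        return suspended_wrapper


def suspendable_with_default(default=None):
    """Hypothetical decorator factory: like @suspendable, plus a fallback value."""
    def decorate(meth):
        manager = Suspender(default)
        wrapper = manager.add_suspendable(meth)
        wrapper.suspend = manager
        wrapper.add_suspendable = manager.add_suspendable
        return wrapper
    return decorate


class Validator(object):
    @suspendable_with_default(default=True)
    def on_validate(self, value):
        return value >= 0


v = Validator()
normal = v.on_validate(-1)          # handler runs and reports the real result
with v.on_validate.suspend():
    suspended = v.on_validate(-1)   # handler skipped, default returned
```

<p>While suspended, <tt>on_validate</tt> hands back the configured default; outside the <tt>with</tt> block the real result of the handler comes through.</p>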

<p>It&#8217;s possible to have several event handlers blocked by a single context manager using the <tt>add_suspendable</tt> decorator that is attached to every suspendable method:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">@suspendable
<span style="color: #ff7700;font-weight:bold;">def</span> on_some_event<span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>: ...
&nbsp;
@on_some_event.<span style="color: black;">add_suspendable</span>
<span style="color: #ff7700;font-weight:bold;">def</span> on_misc_event<span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>: ...</pre></div></div>

<p>Calling the <tt>suspend()</tt> context manager for <tt>on_some_event</tt> will also block <tt>on_misc_event</tt>.</p>
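<p>To see the sharing in action, here is a small runnable sketch. The <tt>Suspender</tt> class is a condensed copy of the one above so that the snippet runs on its own; the <tt>Recorder</tt> class and its log are made up for illustration. It also shows two consequences of the design: suspensions nest, because <tt>is_suspended</tt> is a counter, and the state is shared by all instances, because the decorator runs once when the class is defined.</p>

```python
from contextlib import contextmanager
from functools import wraps


class Suspender(object):
    # Condensed copy of the class above, so that this snippet runs standalone.
    def __init__(self):
        self.is_suspended = 0

    @contextmanager
    def __call__(self):
        self.is_suspended += 1
        try:
            yield
        finally:
            self.is_suspended -= 1

    @classmethod
    def suspendable(cls, meth):
        manager = cls()
        wrapper = manager.add_suspendable(meth)
        wrapper.suspend = manager
        wrapper.add_suspendable = manager.add_suspendable
        return wrapper

    def add_suspendable(self, meth):
        @wraps(meth)
        def suspended_wrapper(*args):
            if self.is_suspended == 0:
                meth(*args)
        return suspended_wrapper


suspendable = Suspender.suspendable


class Recorder(object):
    # Hypothetical example class: each handler just logs its name.
    def __init__(self):
        self.log = []

    @suspendable
    def on_some_event(self):
        self.log.append("some")

    @on_some_event.add_suspendable
    def on_misc_event(self):
        self.log.append("misc")


a, b = Recorder(), Recorder()

with a.on_some_event.suspend():
    a.on_some_event()          # swallowed
    a.on_misc_event()          # swallowed: shares the same Suspender
    b.on_some_event()          # also swallowed: the Suspender lives on the class
    with a.on_some_event.suspend():
        pass                   # nesting only bumps the counter
    a.on_misc_event()          # still swallowed after the inner block exits

a.on_some_event()              # handled normally again
a.on_misc_event()
b.on_some_event()
```

<p>If per-instance suspension is needed, the counter would have to be stored on the instance instead; the code above does not attempt that.</p>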
<p>The context manager does not prevent the event from being propagated, so it&#8217;s not a speed optimization; due to the added check and call indirection, event handling actually becomes slightly slower. It likely won&#8217;t make a difference. If it does, event cycles are the least of your problems.</p>
]]></content:encoded>
			<wfw:commentRss>http://shlomme.diotavelli.net/2009/07/12/breaking-event-cycles/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>New Talks</title>
		<link>http://shlomme.diotavelli.net/2009/07/05/new-talks/</link>
		<comments>http://shlomme.diotavelli.net/2009/07/05/new-talks/#comments</comments>
		<pubDate>Sun, 05 Jul 2009 12:52:45 +0000</pubDate>
		<dc:creator>shlomme</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://shlomme.diotavelli.net/?p=236</guid>
		<description><![CDATA[Over the last few months, I gave two talks: one at TaCoS in Heidelberg on Text+Berg digital, the other at our local Python user group, swiss.py, on Python metaclasses. 
The slides can be found on the Talks page.
]]></description>
			<content:encoded><![CDATA[<p>Over the last few months, I gave two talks: one at TaCoS in Heidelberg on <a href="http://www.textberg.ch/">Text+Berg digital</a>, the other at our local Python user group, <a href="http://swisspy.ch/">swiss.py</a>, on Python metaclasses. </p>
<p>The slides can be found on the <a href="http://shlomme.diotavelli.net/my-talks/">Talks</a> page.</p>
]]></content:encoded>
			<wfw:commentRss>http://shlomme.diotavelli.net/2009/07/05/new-talks/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Quantitative Linguistics, visualized.</title>
		<link>http://shlomme.diotavelli.net/2009/06/29/quantitative-linguistics-visualized/</link>
		<comments>http://shlomme.diotavelli.net/2009/06/29/quantitative-linguistics-visualized/#comments</comments>
		<pubDate>Mon, 29 Jun 2009 15:12:50 +0000</pubDate>
		<dc:creator>shlomme</dc:creator>
				<category><![CDATA[shnasel]]></category>

		<guid isPermaLink="false">http://shlomme.diotavelli.net/?p=223</guid>
		<description><![CDATA[
]]></description>
			<content:encoded><![CDATA[<p><a href="http://diotavelli.net/files/img/linguistics.png"><img src="http://diotavelli.net/files/img/linguistics.png" width="500" alt="Quantitative Linguistics" title="No, you can't have that tool." /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://shlomme.diotavelli.net/2009/06/29/quantitative-linguistics-visualized/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Oh my god—it&#8217;s full of cores!</title>
		<link>http://shlomme.diotavelli.net/2009/05/24/gpu-computing/</link>
		<comments>http://shlomme.diotavelli.net/2009/05/24/gpu-computing/#comments</comments>
		<pubDate>Sun, 24 May 2009 16:17:48 +0000</pubDate>
		<dc:creator>shlomme</dc:creator>
				<category><![CDATA[hardware]]></category>
		<category><![CDATA[lang:en]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://shlomme.diotavelli.net/?p=204</guid>
		<description><![CDATA[Beginnings
After the bleak, joyless work of releasing software, plugging holes, preventing users from clicking buttons they should not have clicked in the first place, writing unctuous documentation and release notes and wrapping code into neat little installers with tiny bows and bells that gently tingle when touched, there are few things as profoundly satisfying [...]]]></description>
			<content:encoded><![CDATA[<h5>Beginnings</h5>
<p>After the bleak, joyless work of releasing software, plugging holes, preventing users from clicking buttons they should not have clicked in the first place, writing unctuous documentation and release notes and wrapping code into neat little installers with tiny bows and bells that gently tingle when touched, there are few things as profoundly satisfying as taking code that looked like line noise in the first place, applying some non-trivial transformation to it and still having it look like line noise, only now it&#8217;s twice as fast, or half as long or of some other inaccessible quality that can, on some technical level, be referred to as &#8220;cool&#8221;.</p>
<p>Quite some time ago, I decided that this time, &#8220;cool&#8221; will mean &#8220;runs on graphics hardware, possibly faster&#8221;. To this end, I took my trusty T61 to work, downloaded the latest <a href="http://www.nvidia.com/object/cuda_home.html">CUDA toolkit</a>, found out that nearly all examples crashed when compiled, installed the latest beta drivers from NVIDIA, noticed that everything worked now and started playing around with the examples. </p>
<p>After some time marveling at ball pit simulations and similarly telling examples, I printed out the <a href="http://developer.download.nvidia.com/compute/cuda/2_2/toolkit/docs/NVIDIA_CUDA_Programming_Guide_2.2.pdf">CUDA programming guide</a> and spent a considerable amount of time marveling at how well the NVIDIA green looked when printed with our institute&#8217;s color laser printer, before reading its more important parts.<br />
<span id="more-204"></span></p>
<h5>Technical Details</h5>
<p>Roughly speaking, modern graphics hardware from NVIDIA consists of an array of multiprocessors. Each multiprocessor has eight cores, some shared memory, a control unit and units for transcendental functions. Kernels (functions that run on the device, as opposed to host code) are automatically distributed to the available multiprocessors (their number depends on the hardware, with as many as 120 on the Tesla cards). </p>
<p>Kernel invocations are organized in 3-dimensional thread blocks and 2-dimensional grids. A single thread block can have at most 512 threads (i.e. x × y × z ⩽ 512) and is always executed by a single multiprocessor. Thread blocks are organized in a grid (with at most 65,535 blocks in each dimension); the blocks of a grid are distributed to the available multiprocessors, which makes the model inherently scalable. </p>
<p>A multiprocessor divides the threads of a block into <em>warps</em>, and all threads of a single warp are executed in parallel. Optimal performance is reached when the control flow in threads of the same warp does not diverge based on the input data: the cores are optimized to execute the same code in several threads on different data (hence the name SIMT, for single instruction, multiple threads). </p>
<p>There is also a memory hierarchy, from registers to on-die shared memory to global device memory. Data has to be explicitly copied from the system RAM to device memory, which is costly and should be minimized.</p>
<p>In other words, if you have a computation that has to be executed many, many times on different input data, move it to the GPU and use the freed CPU cycles for shuffling around the data. In short: ZOOOOOOM.</p>
<h5>Some Experiments</h5>
<p>Back in the day, we had to write code for training <a href="http://shlomme.diotavelli.net/2008/03/13/gmm-code/">Gaussian Mixture Models</a>. At the core of the algorithm, an auxiliary vector containing each point in the training set has to be created for each Gaussian. The creation of the auxiliary vector includes exponentiation, which proved to be quite costly, since it accounted for 50% of the running time of the program<sup>1</sup>. </p>
<p>Instead of doing comparatively boring optimization work for running that algorithm on a single processor, and even more boring work to have it run on two processors, I partially ported it so that the auxiliary vectors are computed on the GPU. Moving only this one part of the computation led to a three-fold speed increase for the whole program. </p>
<p>On my machine (2 GiB RAM, 2.2 GHz Core 2 Duo, NVS 140M with 2 multiprocessors), running the C++-only version<sup>2</sup> takes 33s for 90,000 points and three Gaussians. Running the computation with the same data set on the GPU takes 11s. </p>
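<p>As a back-of-the-envelope check on these numbers (my own arithmetic, not part of the original measurements), Amdahl&#8217;s law relates the overall speedup to the fraction of the runtime that was moved. The sketch below assumes the two timings are directly comparable and that everything not offloaded takes as long as before; the function and variable names are made up.</p>

```python
def overall_speedup(offloaded_fraction, part_speedup):
    """Amdahl's law: speedup of the whole program when only a fraction gets faster."""
    return 1.0 / ((1.0 - offloaded_fraction) + offloaded_fraction / part_speedup)


cpu_seconds = 33.0   # C++-only run, from the text
gpu_seconds = 11.0   # GPU-assisted run, from the text
observed = cpu_seconds / gpu_seconds

# Smallest fraction of the original runtime that must have been offloaded for
# the observed speedup to be possible (limit of an infinitely fast GPU part).
min_fraction = 1.0 - 1.0 / observed

# Upper bound on the overall speedup if exactly half of the runtime were offloaded.
bound_for_half = overall_speedup(0.5, float("inf"))
```

<p>Under these assumptions, a three-fold gain means that at least two thirds of the original runtime went into the part that now runs on the GPU, so the auxiliary vector computation as a whole presumably weighed more than the 50% attributed to exponentiation alone.</p>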
<h5>How it is done</h5>
<p>In order to invoke a kernel running on the GPU, the code has to be compiled with <tt>nvcc</tt>, NVIDIA&#8217;s compiler for CUDA. It separates device code, which is compiled to binary code for the GPU, from host code, which is transformed and handed over to the platform compiler. Therefore, one has to write some intermediate functions which can be called from &#8220;normal&#8221; C++ and which take care of shuffling data to the device and invoking the kernels<sup>3</sup>:</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;"><span style="color: #993333;">typedef</span> <span style="color: #993333;">float</span> PROB<span style="color: #339933;">;</span>
&nbsp;
<span style="color: #339933;">#define BLOCK_SIZE 16</span>
&nbsp;
Matrix d_in<span style="color: #339933;">;</span>
Matrix d_out<span style="color: #339933;">;</span>
&nbsp;
__device__ __constant__ KernelGaussian d_params<span style="color: #009900;">&#91;</span><span style="color: #0000dd;">20</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">extern</span> <span style="color: #ff0000;">&quot;C&quot;</span>
<span style="color: #993333;">void</span> LoadPoints<span style="color: #009900;">&#40;</span>Matrix in<span style="color: #339933;">,</span> size_t params<span style="color: #009900;">&#41;</span> 
<span style="color: #009900;">&#123;</span>
    d_in.<span style="color: #202020;">height</span> <span style="color: #339933;">=</span> in.<span style="color: #202020;">height</span><span style="color: #339933;">;</span>
    d_in.<span style="color: #202020;">width</span> <span style="color: #339933;">=</span> in.<span style="color: #202020;">width</span><span style="color: #339933;">;</span>    
    size_t size_in <span style="color: #339933;">=</span> in.<span style="color: #202020;">width</span> <span style="color: #339933;">*</span> in.<span style="color: #202020;">height</span> <span style="color: #339933;">*</span> <span style="color: #993333;">sizeof</span><span style="color: #009900;">&#40;</span>PROB<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
    cutilSafeCall<span style="color: #009900;">&#40;</span>cudaMalloc<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span><span style="color: #993333;">void</span><span style="color: #339933;">**</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">&amp;</span>d_in.<span style="color: #202020;">elements</span><span style="color: #339933;">,</span> size_in<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    cutilSafeCall<span style="color: #009900;">&#40;</span>cudaMemcpy<span style="color: #009900;">&#40;</span>d_in.<span style="color: #202020;">elements</span><span style="color: #339933;">,</span> in.<span style="color: #202020;">elements</span><span style="color: #339933;">,</span> size_in<span style="color: #339933;">,</span> cudaMemcpyHostToDevice<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
    d_out.<span style="color: #202020;">width</span> <span style="color: #339933;">=</span> in.<span style="color: #202020;">width</span><span style="color: #339933;">;</span>
    d_out.<span style="color: #202020;">height</span> <span style="color: #339933;">=</span> params<span style="color: #339933;">;</span>
&nbsp;
    size_t size_out <span style="color: #339933;">=</span> d_out.<span style="color: #202020;">width</span> <span style="color: #339933;">*</span> d_out.<span style="color: #202020;">height</span> <span style="color: #339933;">*</span> <span style="color: #993333;">sizeof</span><span style="color: #009900;">&#40;</span>PROB<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    cutilSafeCall<span style="color: #009900;">&#40;</span>cudaMalloc<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span><span style="color: #993333;">void</span><span style="color: #339933;">**</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">&amp;</span>d_out.<span style="color: #202020;">elements</span><span style="color: #339933;">,</span> size_out<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
__global__ <span style="color: #993333;">void</span> computeAuxVectors<span style="color: #009900;">&#40;</span>Matrix in<span style="color: #339933;">,</span> Matrix out<span style="color: #009900;">&#41;</span> 
<span style="color: #009900;">&#123;</span>
    <span style="color: #993333;">int</span> p <span style="color: #339933;">=</span> blockIdx.<span style="color: #202020;">y</span> <span style="color: #339933;">*</span> BLOCK_SIZE <span style="color: #339933;">+</span> threadIdx.<span style="color: #202020;">y</span><span style="color: #339933;">;</span>
    KernelGaussian g <span style="color: #339933;">=</span> d_params<span style="color: #009900;">&#91;</span>blockIdx.<span style="color: #202020;">x</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
    PROB d <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span>in.<span style="color: #202020;">elements</span><span style="color: #009900;">&#91;</span>p<span style="color: #009900;">&#93;</span> <span style="color: #339933;">-</span> g.<span style="color: #202020;">mean</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">/</span> g.<span style="color: #202020;">stddev</span><span style="color: #339933;">;</span>
    out.<span style="color: #202020;">elements</span><span style="color: #009900;">&#91;</span>blockIdx.<span style="color: #202020;">x</span> <span style="color: #339933;">*</span> out.<span style="color: #202020;">width</span> <span style="color: #339933;">+</span> p<span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> g.<span style="color: #202020;">factor</span> <span style="color: #339933;">*</span> expf<span style="color: #009900;">&#40;</span><span style="color: #339933;">-</span><span style="color:#800080;">0.5</span><span style="color: #339933;">*</span>d<span style="color: #339933;">*</span>d<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
<span style="color: #000000; font-weight: bold;">extern</span> <span style="color: #ff0000;">&quot;C&quot;</span>
<span style="color: #993333;">void</span> RunComputeAuxVectors<span style="color: #009900;">&#40;</span>
        KernelGaussian<span style="color: #339933;">*</span> g<span style="color: #339933;">,</span> 
        std<span style="color: #339933;">::</span><span style="color: #202020;">vector</span><span style="color: #339933;">&lt;</span>std<span style="color: #339933;">::</span><span style="color: #202020;">vector</span><span style="color: #339933;">&lt;</span>PROB<span style="color: #339933;">&gt;</span> <span style="color: #339933;">&gt;&amp;</span> out<span style="color: #339933;">,</span> 
        size_t points<span style="color: #339933;">,</span> size_t params<span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
    cutilSafeCall<span style="color: #009900;">&#40;</span>
        cudaMemcpyToSymbol<span style="color: #009900;">&#40;</span>
          d_params<span style="color: #339933;">,</span> g<span style="color: #339933;">,</span> 
          params<span style="color: #339933;">*</span><span style="color: #993333;">sizeof</span><span style="color: #009900;">&#40;</span>KernelGaussian<span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> 
          <span style="color: #0000dd;">0</span><span style="color: #339933;">,</span> cudaMemcpyHostToDevice<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
    dim3 dimGrid<span style="color: #009900;">&#40;</span>params<span style="color: #339933;">,</span> points <span style="color: #339933;">/</span> BLOCK_SIZE<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    dim3 dimBlock<span style="color: #009900;">&#40;</span><span style="color: #0000dd;">1</span><span style="color: #339933;">,</span> BLOCK_SIZE<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    computeAuxVectors<span style="color: #339933;">&lt;&lt;&lt;</span>dimGrid<span style="color: #339933;">,</span> dimBlock<span style="color: #339933;">&gt;&gt;&gt;</span><span style="color: #009900;">&#40;</span>d_in<span style="color: #339933;">,</span> d_out<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    cutilCheckMsg<span style="color: #009900;">&#40;</span><span style="color: #ff0000;">&quot;Kernel execution failed&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
    <span style="color: #b1b100;">for</span><span style="color: #009900;">&#40;</span>size_t i <span style="color: #339933;">=</span> <span style="color: #0000dd;">0</span><span style="color: #339933;">;</span> i <span style="color: #339933;">&lt;</span> params<span style="color: #339933;">;</span> <span style="color: #339933;">++</span>i<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        cutilSafeCall<span style="color: #009900;">&#40;</span>
            cudaMemcpy<span style="color: #009900;">&#40;</span>
              <span style="color: #339933;">&amp;</span>out<span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span><span style="color: #009900;">&#91;</span><span style="color: #0000dd;">0</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">,</span> <span style="color: #339933;">&amp;</span>d_out.<span style="color: #202020;">elements</span><span style="color: #009900;">&#91;</span>i <span style="color: #339933;">*</span> points<span style="color: #009900;">&#93;</span><span style="color: #339933;">,</span> 
              <span style="color: #993333;">sizeof</span><span style="color: #009900;">&#40;</span>PROB<span style="color: #009900;">&#41;</span> <span style="color: #339933;">*</span> points<span style="color: #339933;">,</span> 
              cudaMemcpyDeviceToHost<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>The matrix <tt>d_in</tt> holds the points, which are constant over the whole computation and therefore copied over to the device exactly once. <tt>d_out</tt> is the matrix that holds the auxiliary values computed for each point and each Gaussian. For faster retrieval, the Gaussians are stored in constant device memory (<tt>d_params</tt>), which is set to contain 20 elements at most, i.e. the algorithm is limited to learning mixture models with at most 20 Gaussians. </p>
<p>In the method <tt>LoadPoints</tt>, device memory is allocated for the points and the output vectors and the points are copied from host to device memory using <tt>cudaMemcpy</tt>. </p>
<p>The actual invocation of the kernel is done in the host function <tt>RunComputeAuxVectors</tt>, which first copies the parameters into the constant storage and then invokes the kernel <tt>computeAuxVectors</tt> using the new <tt>&lt;&lt;&lt;grid, block&gt;&gt;&gt;</tt> syntax. <tt>grid</tt> holds the dimensions of the grid, which is <em>number of Gaussians</em> × <em>points / BLOCK_SIZE</em>. The block size is the number of threads in a thread block and was somewhat arbitrarily chosen to be 16<sup>4</sup>; currently this also requires the number of points to be a multiple of 16<sup>5</sup>. After the kernel has finished, the results are copied back into host memory.</p>
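<p>The launch dimensions described above can be worked through on the host side. The following Python sketch is only an illustration: <tt>launch_config</tt> is a made-up helper, the warp size of 32 and the per-dimension grid limit of 65,535 are properties of the hardware that the post does not state, and rounding the number of blocks up corresponds to the zero padding suggested in the footnote rather than to what the CUDA code above does.</p>

```python
import math

BLOCK_SIZE = 16        # threads per block, as in the #define above
WARP_SIZE = 32         # assumption: NVIDIA warp size, not stated in the text
MAX_GRID_DIM = 65535   # assumption: per-dimension grid limit of this hardware generation


def launch_config(points, gaussians, block_size=BLOCK_SIZE):
    """Return (grid, block, padded_points) for one thread per (Gaussian, point)."""
    blocks_y = math.ceil(points / block_size)   # round up instead of requiring a multiple
    padded_points = blocks_y * block_size       # points plus zero padding
    if gaussians > MAX_GRID_DIM or blocks_y > MAX_GRID_DIM:
        raise ValueError("grid dimension exceeds the hardware limit")
    return (gaussians, blocks_y), (1, block_size), padded_points


# The data set from the experiment: 90,000 points and three Gaussians.
grid, block, padded = launch_config(points=90000, gaussians=3)

# One extra point forces one extra block and 15 padding elements.
grid_odd, _, padded_odd = launch_config(points=90001, gaussians=3)

# A block of 1 x 16 threads occupies a single warp and fills only half of it.
warps_per_block = math.ceil(block[0] * block[1] / WARP_SIZE)
```

<p>For the experiment this gives a grid of 3 × 5625 blocks, comfortably inside the limit. Since a block of 16 threads fills only half a warp, a larger block size might be worth trying.</p>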
<p>Important:</p>
<ul>
<li>My card only supports single-precision floats; newer ones also have double precision</li>
<li><tt>blockIdx</tt> and <tt>threadIdx</tt> are built-in variables provided by the CUDA compiler</li>
<li><tt>__global__</tt> functions are run on the device and are launched from host code</li>
<li>Currently, the device memory for <tt>d_in</tt> and <tt>d_out</tt> is never freed; the program simply exits</li>
</ul>
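<p>To make the index arithmetic of the kernel easier to follow, here is a plain Python emulation in which every iteration of the inner loop plays the role of one GPU thread. It is a sketch for tracing the indices, not a replacement for the CUDA code: the <tt>KernelGaussian</tt> tuple is a stand-in for the struct used above (its field names follow the kernel, its values are made up), and the loops run sequentially instead of in parallel.</p>

```python
import math
from collections import namedtuple

# Hypothetical stand-in for the KernelGaussian struct; the field names follow
# the kernel code, the values used below are made up.
KernelGaussian = namedtuple("KernelGaussian", "mean stddev factor")

BLOCK_SIZE = 16


def compute_aux_vectors(points, params):
    """CPU emulation of computeAuxVectors: one loop iteration per GPU thread."""
    width = len(points)
    out = [0.0] * (len(params) * width)              # flat, row-major like d_out
    for block_x in range(len(params)):               # blockIdx.x selects the Gaussian
        g = params[block_x]
        for block_y in range(width // BLOCK_SIZE):   # blockIdx.y
            for thread_y in range(BLOCK_SIZE):       # threadIdx.y
                p = block_y * BLOCK_SIZE + thread_y
                d = (points[p] - g.mean) / g.stddev
                out[block_x * width + p] = g.factor * math.exp(-0.5 * d * d)
    return out


points = [float(i) for i in range(32)]               # 32 points = 2 blocks of 16
params = [KernelGaussian(0.0, 1.0, 1.0), KernelGaussian(16.0, 4.0, 0.5)]
aux = compute_aux_vectors(points, params)
```

<p>Each Gaussian gets its own row of <tt>out</tt>, which is why the host code can copy the results back one row at a time.</p>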
<h5>Results</h5>
<p>Working with CUDA is definitely fun (and rewarding), though the system itself is quite unforgiving. Wrapping each CUDA call in <tt>cutilSafeCall</tt> is therefore strongly recommended: in case of an error, it prints a (not always helpful) error message and exits. Though in this case the code running on the GPU is quite trivial, it can be hard to figure out what is going wrong, because the GPU is a black box and it&#8217;s not possible to simply print out a message<sup>6</sup>. It&#8217;s possible to emulate the code on the CPU, and there are other tools available which I haven&#8217;t tried out. </p>
<p>I&#8217;ve also started toying around with my original idea, having the constraint checks for the TIGER query evaluation run on the graphics hardware; initial results are mixed. Still, if nothing else, this gives me a justification to spend a lot of money on expensive graphics hardware for my next computer. It&#8217;s all for science, this harsh and demanding mistress.</p>
<ol class="footnotes"><li id="footnote_0_204" class="footnote">There might be a smart way of getting rid of the <tt>exp()</tt> call, though I haven&#8217;t found it yet</li><li id="footnote_1_204" class="footnote">compiled using g++ 4.4.0 -O3</li><li id="footnote_2_204" class="footnote">looking at the SDK examples was quite helpful here</li><li id="footnote_3_204" class="footnote">I didn&#8217;t try out other block sizes</li><li id="footnote_4_204" class="footnote">this is a restriction of the implementation, one could simply pad the points with as many zeros as needed and compute a little more</li><li id="footnote_5_204" class="footnote">or maybe it is, and I don&#8217;t know how to do it&#8230;</li></ol>]]></content:encoded>
			<wfw:commentRss>http://shlomme.diotavelli.net/2009/05/24/gpu-computing/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Yes, Master.</title>
		<link>http://shlomme.diotavelli.net/2009/05/24/yes-master/</link>
		<comments>http://shlomme.diotavelli.net/2009/05/24/yes-master/#comments</comments>
		<pubDate>Sun, 24 May 2009 13:50:29 +0000</pubDate>
		<dc:creator>shlomme</dc:creator>
				<category><![CDATA[coli]]></category>
		<category><![CDATA[studies]]></category>

		<guid isPermaLink="false">http://shlomme.diotavelli.net/?p=191</guid>
		<description><![CDATA[My Master&#8217;s thesis &#8220;Integration of Light-weight Semantics into a Syntax Query Formalism&#8221; is available from the SALSA project pages.
Abstract
In the Computational Linguistics community, much work is put into the creation of large, high-quality linguistic resources, often with complex annotation. In order to make these resources accessible to nontechnical audiences, formalisms for searching and filtering are [...]]]></description>
			<content:encoded><![CDATA[<p>My Master&#8217;s thesis &#8220;Integration of Light-weight Semantics into a Syntax Query Formalism&#8221; is available from the <a href="http://www.coli.uni-saarland.de/projects/salsa/page.php?id=theses">SALSA project pages</a>.</p>
<h3>Abstract</h3>
<blockquote><p>In the Computational Linguistics community, much work is put into the creation of large, high-quality linguistic resources, often with complex annotation. In order to make these resources accessible to nontechnical audiences, formalisms for searching and filtering are needed. </p>
<p>The TIGER query language can, by describing partial structures, be used to search treebanks with syntactic annotation. Recently, augmented treebanks have been published, including the SALSA corpus which features frame semantic annotation on top of syntactic structure. Query languages, however, need to keep up with newly introduced annotation, allowing it to be searchable and easy to access.</p>
<p>We design an extension for the TIGER language which allows searching for frame structures along with syntactic annotation. To achieve this, the TIGER object model is expanded to include frame semantics, while remaining fully backwards-compatible.</p>
<p>Finally, these extensions have been added to our own implementation of TIGER, which includes novel indexing features not found in the original work of Lezius (2002a).</p></blockquote>
<p><span id="more-191"></span></p>
<h3>What does it all mean?</h3>
<p>In the most basic sense, the TIGER query language allows the specification of nodes (which are flat feature structures) and relations between these nodes. So far, only syntactic nodes (words and phrases) and syntactic relations (dominance, precedence and structure sharing) were supported in the query language, while the underlying annotation formalism had been extended to include frame semantics as well. My conservative extension of the query language introduces types and relations for frame semantics. This makes it possible to express linguistic queries such as <em>Find all sentences where the role TOPIC in the frame STATEMENT is realized by a PP with the preposition &#8220;über&#8221;</em>, which was not possible previously:</p>
<pre>
{frame="Statement"} > #r:{role="Topic"} &#038;
#pp:[cat="PP"] >AC [word="über"] &#038;
#r > #pp &#038; arity(#r, 1)
</pre>
<p>What is cryptically referred to as &#8220;novel indexing features&#8221; are improvements to the candidate selection for relation checks. The selection now exploits some graph-theoretic notions as rough filters that run before the actual relation checks, which can be quite expensive. All in all, the implementation is generally faster than TIGERSearch (the original implementation by Lezius) for complex queries; for simple queries, it is slower, because our node index is slower.</p>
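<p>To illustrate the idea of a rough filter in front of an expensive relation check, here is a minimal Python sketch for the dominance relation. It is not the index from the thesis: the depth comparison is just one assumed example of a cheap graph-theoretic test, and the names <code>dominates</code> and <code>candidate_pairs</code> as well as the toy tree are made up.</p>

```python
def dominates(parent, ancestor, node):
    """Exact (and comparatively expensive) check: walk up the parent chain."""
    current = parent.get(node)
    while current is not None:
        if current == ancestor:
            return True
        current = parent.get(current)
    return False


def candidate_pairs(parent, depth, uppers, lowers):
    """Yield dominance pairs, skipping pairs the depth filter rules out."""
    for u in uppers:
        for v in lowers:
            # Rough filter: a dominated node must sit strictly deeper in the tree.
            if depth[v] > depth[u] and dominates(parent, u, v):
                yield u, v


# Toy tree: S -> NP, VP; VP -> V, PP
parent = {"NP": "S", "VP": "S", "V": "VP", "PP": "VP"}
depth = {"S": 0, "NP": 1, "VP": 1, "V": 2, "PP": 2}
print(list(candidate_pairs(parent, depth, ["VP", "NP"], ["PP", "S"])))
# [('VP', 'PP')]
```

<p>The filter never rejects a true pair; it only saves the exact check for pairs that cannot possibly match.</p>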
<h3>Can I try it out?</h3>
<p>A demo system for the original and extended query language is online on the <a href="http://fnps.coli.uni-saarland.de:8080/">CoLi webservers</a> in Saarbrücken. With regard to features, this is the latest version; since then, I have committed one bugfix to the query evaluator.</p>
<h3>Will you continue work on it?</h3>
<p>Hopefully, yes. Current directions of ongoing development include:</p>
<ul>
<li>In Progress:
<ul>
<li>Client-side rendering of trees in the query front-end (using the <a href="https://developer.mozilla.org/en/Canvas_tutorial">HTML5 canvas</a>)</li>
</ul>
</li>
<li>Planned:
<ul>
<li>Custom-written node index</li>
<li>Relations between graphs and nodes in different graphs</li>
</ul>
</li>
</ul>
<p>I&#8217;m also running some experiments on massively parallel constraint evaluation using GPUs, but that might not lead anywhere and depends on the availability of special hardware.</p>
<h3>Thanks</h3>
<p>Again, special thanks go to Martin Lazarov and Armin Schmidt, who both read the full draft version and provided many comments and corrections.</p>
]]></content:encoded>
			<wfw:commentRss>http://shlomme.diotavelli.net/2009/05/24/yes-master/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Scrollable Widgets with PyGTK</title>
		<link>http://shlomme.diotavelli.net/2009/05/17/scrollable-widgets-with-pygtk/</link>
		<comments>http://shlomme.diotavelli.net/2009/05/17/scrollable-widgets-with-pygtk/#comments</comments>
		<pubDate>Sat, 16 May 2009 23:49:17 +0000</pubDate>
		<dc:creator>shlomme</dc:creator>
				<category><![CDATA[lang:en]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://shlomme.diotavelli.net/?p=178</guid>
		<description><![CDATA[It is possible to write custom GTK widgets that have &#8220;native&#8221; scrolling support, as opposed to just shoving them into a GtkViewport and forgetting about them. 
Apart from having mastered a small coding challenge, as it turned out to be, this also gives you greater control over the scrolling itself, like making sure that certain [...]]]></description>
			<content:encoded><![CDATA[<p>It is possible to write custom GTK widgets that have &#8220;native&#8221; scrolling support, as opposed to just shoving them into a <a href="http://library.gnome.org/devel/gtk/unstable/GtkViewport.html">GtkViewport</a> and forgetting about them. </p>
<p>Apart from having mastered a small coding challenge, as it turned out to be, this also gives you greater control over the scrolling itself, like making sure that certain elements are visible, viewport panning etc.</p>
<p>Anyway, especially when using PyGTK, it&#8217;s a bit unclear how to proceed. From the documentation, it gradually becomes clear that it has to do with the signal <code>set_scroll_adjustments_signal</code>:</p>
<blockquote><p>
This signal is emitted when a widget of this class is added to a scrolling aware parent, gtk_widget_set_scroll_adjustments() handles the emission. Implementation of this signal is optional.
</p></blockquote>
<p>This is not a signal name, but a signal ID<sup>1</sup> that has to be set in <a href="http://developer.gnome.org/doc/GGAD/z144.html">GtkWidgetClass</a><sup>2</sup>.</p>
<p>Some more documentation reading reveals that you can set this signal by using the <code>set_set_scroll_adjustments_signal</code> method on a widget:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">class</span> ScrollableWidget<span style="color: black;">&#40;</span>gtk.<span style="color: black;">DrawingArea</span><span style="color: black;">&#41;</span>:
    __gsignals__ = <span style="color: black;">&#123;</span>
        <span style="color: #483d8b;">&quot;set-scroll-adjustments&quot;</span>: <span style="color: black;">&#40;</span>
            gobject.<span style="color: black;">SIGNAL_RUN_LAST</span>,
            gobject.<span style="color: black;">TYPE_NONE</span>, <span style="color: black;">&#40;</span>gtk.<span style="color: black;">Adjustment</span>, gtk.<span style="color: black;">Adjustment</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>,
    <span style="color: black;">&#125;</span>
&nbsp;
    <span style="color: #ff7700;font-weight:bold;">def</span> <span style="color: #0000cd;">__init__</span><span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>:
        gtk.<span style="color: black;">DrawingArea</span>.<span style="color: #0000cd;">__init__</span><span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>
        <span style="color: #008000;">self</span>.<span style="color: black;">set_set_scroll_adjustments_signal</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;set-scroll-adjustments&quot;</span><span style="color: black;">&#41;</span></pre></div></div>

<p>It doesn&#8217;t really matter <em>what</em> you call the signal, as long as it takes two arguments (the horizontal and vertical adjustment). This will make the method <a href="http://library.gnome.org/devel/pygtk/stable/class-gtkwidget.html#method-gtkwidget--set-scroll-adjustments"><code>set_scroll_adjustments</code></a> (which you can&#8217;t override from within Python) return <em>True</em> when it is called and signal that the widget supports scrolling.</p>
<p>This, however, is only half the job, because the scrollable widget still needs the adjustments handed in via said method. It&#8217;s of course possible to connect to the signal explicitly, but there&#8217;s an even more direct way by using action signals. </p>
<p>Action signals are the C programmer&#8217;s idea of &#8220;generic methods&#8221;. Such a signal has to carry the flag <code>gobject.SIGNAL_ACTION</code> and is directly connected to a function which is then called on each signal emission. While in C you have to provide a function pointer, in Python you can just implement a method with a compounded magic name<sup>3</sup> and have it called automatically. I haven&#8217;t found any documentation on that in the PyGObject or PyGTK docs, only some examples on the web:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">class</span> ScrollableWidget<span style="color: black;">&#40;</span>gtk.<span style="color: black;">DrawingArea</span><span style="color: black;">&#41;</span>:
    __gsignals__ = <span style="color: black;">&#123;</span>
        <span style="color: #483d8b;">&quot;set-scroll-adjustments&quot;</span>: <span style="color: black;">&#40;</span>
            gobject.<span style="color: black;">SIGNAL_RUN_LAST</span> | gobject.<span style="color: black;">SIGNAL_ACTION</span>, 
            gobject.<span style="color: black;">TYPE_NONE</span>, <span style="color: black;">&#40;</span>gtk.<span style="color: black;">Adjustment</span>, gtk.<span style="color: black;">Adjustment</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>,
    <span style="color: black;">&#125;</span>
&nbsp;
    <span style="color: #ff7700;font-weight:bold;">def</span> <span style="color: #0000cd;">__init__</span><span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>:
        gtk.<span style="color: black;">DrawingArea</span>.<span style="color: #0000cd;">__init__</span><span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>
        <span style="color: #008000;">self</span>.<span style="color: black;">set_set_scroll_adjustments_signal</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;set-scroll-adjustments&quot;</span><span style="color: black;">&#41;</span>
&nbsp;
    <span style="color: #ff7700;font-weight:bold;">def</span> do_set_scroll_adjustments<span style="color: black;">&#40;</span><span style="color: #008000;">self</span>, h_adjustment, v_adjustment<span style="color: black;">&#41;</span>:
        <span style="color: #808080; font-style: italic;"># do some useful stuff here, like saving them</span>
        <span style="color: #ff7700;font-weight:bold;">pass</span></pre></div></div>

<p>The name of the method being called on emission has to start with <code>do_</code>, followed by the signal name with hyphens replaced by underscores.</p>
<p>The adjustment objects can then be configured to one&#8217;s own liking to have scroll bars show up or not. However, to know when the user has scrolled, it&#8217;s necessary to listen for some signals:</p>
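<p>The naming rule can be written down as a tiny pure-Python helper. <code>default_handler_name</code> is a made-up name for illustration and not part of PyGTK or PyGObject:</p>

```python
def default_handler_name(signal_name):
    """Map a signal name to the name of the method called on emission."""
    return "do_" + signal_name.replace("-", "_")


print(default_handler_name("set-scroll-adjustments"))
# do_set_scroll_adjustments
```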

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">    <span style="color: #ff7700;font-weight:bold;">def</span> do_set_scroll_adjustments<span style="color: black;">&#40;</span><span style="color: #008000;">self</span>, h_adjustment, v_adjustment<span style="color: black;">&#41;</span>:
        h_adjustment.<span style="color: black;">connect</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;value-changed&quot;</span>, <span style="color: #008000;">self</span>._scroll_value_changed<span style="color: black;">&#41;</span>
        <span style="color: #008000;">self</span>._hadj = h_adjustment
        v_adjustment.<span style="color: black;">connect</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;value-changed&quot;</span>, <span style="color: #008000;">self</span>._scroll_value_changed<span style="color: black;">&#41;</span>
        <span style="color: #008000;">self</span>._vadj = v_adjustment</pre></div></div>

<p>To make the scroll bar show up, modify <code>upper</code>, <code>lower</code> and <code>page_size</code> on the adjustments.</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #008000;">self</span>._hadj.<span style="color: black;">lower</span> = <span style="color: #ff4500;">0</span>
<span style="color: #008000;">self</span>._hadj.<span style="color: black;">upper</span> = <span style="color: #ff4500;">50</span>
<span style="color: #008000;">self</span>._hadj.<span style="color: black;">page_size</span> = <span style="color: #ff4500;">10</span></pre></div></div>

<p>This tells the scrollbar that the size of the underlying picture (<code>upper - lower</code>) is 50, while the visible size (<code>page_size</code>) is 10. </p>
<p>The page size obviously depends on the current size of the widget, which can be retrieved from the underlying <code>gtk.gdk.Window</code>:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">width, height = <span style="color: #008000;">self</span>.<span style="color: black;">window</span>.<span style="color: black;">get_size</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></pre></div></div>

<p>The current position of the scroll bar is controlled by the property <code>value</code> of the adjustment object and should be in the range <code>[lower .. upper - page_size]</code>. Whenever the property is changed, the <code>value-changed</code> signal is emitted, which we&#8217;ve connected to previously, and the widget can be repainted.</p>
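<p>Keeping a scroll position inside that range is simple arithmetic. The following is a minimal sketch in plain Python, with no GTK involved; <code>clamp_scroll_value</code> is a hypothetical helper whose result would then be assigned to the adjustment&#8217;s <code>value</code>:</p>

```python
def clamp_scroll_value(value, lower, upper, page_size):
    """Clamp value into the range [lower, upper - page_size]."""
    # If the page is larger than the content, the only valid position is lower.
    highest = max(lower, upper - page_size)
    return max(lower, min(value, highest))


# With the numbers from above: lower = 0, upper = 50, page_size = 10
print(clamp_scroll_value(45, 0, 50, 10))  # 40
print(clamp_scroll_value(-3, 0, 50, 10))  # 0
print(clamp_scroll_value(25, 0, 50, 10))  # 25
```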
<p>If you&#8217;re curious, you can also see the whole gloriousness in <a href="http://www.cl.uzh.ch/kitt/hg/sta/torsten/file/dc2c113ef300/STA/app/ui/gtktreeview.py#l199">actual working code</a>.</p>
<ol class="footnotes"><li id="footnote_0_178" class="footnote">which you usually don&#8217;t see when coding with PyGTK</li><li id="footnote_1_178" class="footnote">ditto</li><li id="footnote_2_178" class="footnote">a technique which I thoroughly dislike and should be converted to be used with decorators</li></ol>]]></content:encoded>
			<wfw:commentRss>http://shlomme.diotavelli.net/2009/05/17/scrollable-widgets-with-pygtk/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>New Hardware</title>
		<link>http://shlomme.diotavelli.net/2009/05/08/new-hardware/</link>
		<comments>http://shlomme.diotavelli.net/2009/05/08/new-hardware/#comments</comments>
		<pubDate>Fri, 08 May 2009 00:07:48 +0000</pubDate>
		<dc:creator>shlomme</dc:creator>
				<category><![CDATA[hardware]]></category>
		<category><![CDATA[lang:en]]></category>

		<guid isPermaLink="false">http://shlomme.diotavelli.net/?p=176</guid>
		<description><![CDATA[Even before I started working at UZH, I got my new hardware. Back in February, when I visited Zürich to look for rooms, I was offered the choice of a new notebook. Since I already have a large notebook (at least that&#8217;s what I consider my 14.1&#8243; T61) and toyed around with the idea [...]]]></description>
			<content:encoded><![CDATA[<p>Even before I started working at UZH, I got my new hardware. Back in February, when I visited Zürich to look for rooms, I was offered the choice of a new notebook. Since I already have a large notebook (at least that&#8217;s what I consider my 14.1&#8243; T61) and toyed around with the idea of getting a desktop again (after 4 years of exclusive notebook use!), I chose an <a href="http://www.thinkwiki.org/wiki/Category:X301">X301</a>. I added some hardware upgrades that weren&#8217;t part of the basic offer:</p>
<ul>
<li>+2 GB RAM (4 GB overall)</li>
<li>3G card</li>
<li>USB Port Replicator</li>
<li>DisplayPort->DVI converter</li>
</ul>
<p>Being in Switzerland, I had the choice between the CH and US keyboard layouts. Since CH is physically the same as the German layout (105 keys), I took it and haven&#8217;t really noticed that most of the special-character keys have different symbols printed on them than what appears on the screen when I hit them. I should really get <a href="http://www.daskeyboard.com/">Das Keyboard</a> after all.</p>
<p>The X301 is not quite the workhorse the T61 is, which is especially noticeable in graphics speed, although that might be due to the Intel driver undergoing several transitions at once right now. The SSD compensates for that: booting and starting up is a breeze. Fortunately, I don&#8217;t have to do much booting these days, because unlike other major graphics hardware makers, Intel&#8217;s developers are able to support power-saving modes on Linux, both Suspend-to-RAM and Suspend-to-disk.</p>
<p>Everything else works more or less, even the DisplayPort, dutifully serving my 24&#8243; screen at the university. The pièce de résistance is definitely the 3G card. I had to install it myself, which was quite a bit more involved than I would have liked, but it worked on the second try. I got myself a prepaid data contract which allows me to surf the net for 3 CHF/h, a good complement to the free wireless access at most public hotspots in Switzerland, which is available thanks to <a href="http://www.switch.ch/">SWITCH</a>. The wireless became especially handy today, when I was shopping for a new router without having a clue about what&#8217;s good. Standing in the network hardware aisles of the Media Markt (yes, this very pillar of the German retail market has also metastasised into Switzerland) and browsing the web with a notebook probably didn&#8217;t look as superior as doing the same thing with a smartphone, but it helped me find a router.</p>
<p>What really sets the X301 apart is the weight. I&#8217;m quite used to it by now, since I&#8217;ve had it for over a month, but it&#8217;s usually the first response I get when I hand it over to somebody else. It is, however, very noticeable in direct comparison, and the T61 feels clunky and hugely oversized when I carry it around or use it on my lap. The battery lifetime is okay; I usually get 3h under Linux. There&#8217;s probably some more running time to be gotten, but I&#8217;m more or less done tweaking the system for now and am using it for actual work, or for blogging, emailing and chatting on my nice, new, comfy chair.</p>
]]></content:encoded>
			<wfw:commentRss>http://shlomme.diotavelli.net/2009/05/08/new-hardware/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>From the Series…</title>
		<link>http://shlomme.diotavelli.net/2009/05/01/aus-der-reihe%e2%80%a6/</link>
		<comments>http://shlomme.diotavelli.net/2009/05/01/aus-der-reihe%e2%80%a6/#comments</comments>
		<pubDate>Fri, 01 May 2009 15:16:13 +0000</pubDate>
		<dc:creator>shlomme</dc:creator>
				<category><![CDATA[lang:de]]></category>
		<category><![CDATA[shnasel]]></category>

		<guid isPermaLink="false">http://shlomme.diotavelli.net/?p=154</guid>
		<description><![CDATA[… famous &#38; topical political quotes in parallel corpora.





]]></description>
			<content:encoded><![CDATA[<p>… famous &amp; <a href="http://de.wikipedia.org/wiki/Erster_Mai">topical</a> political quotes in parallel corpora.</p>
<div style="text-align: center">
<a href="http://de.wikipedia.org/wiki/Karl_Marx"><br />
<img src="http://diotavelli.net/files/img/laborday.png" alt="Parallel aligned version of the famous Marx quote" /><br />
</a>
</div>
]]></content:encoded>
			<wfw:commentRss>http://shlomme.diotavelli.net/2009/05/01/aus-der-reihe%e2%80%a6/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Ever Onward</title>
		<link>http://shlomme.diotavelli.net/2009/04/29/immer-weiter/</link>
		<comments>http://shlomme.diotavelli.net/2009/04/29/immer-weiter/#comments</comments>
		<pubDate>Wed, 29 Apr 2009 22:49:55 +0000</pubDate>
		<dc:creator>shlomme</dc:creator>
				<category><![CDATA[lang:de]]></category>
		<category><![CDATA[treealigner]]></category>
		<category><![CDATA[uni]]></category>
		<category><![CDATA[zürich]]></category>

		<guid isPermaLink="false">http://shlomme.diotavelli.net/?p=152</guid>
		<description><![CDATA[If you take the trouble to connect all the places I have lived in over the last few years on a map, the result is unfortunately not a particularly exciting shape; in terms of an interesting underlying 2D structure (circle, pentagram, Cassiopeia, dependency tree), my moving activity so far has been a failure. The only pattern is a slight alternation between moves north and south (south-north-south-north-south), so the next move [...]]]></description>
			<content:encoded><![CDATA[<p>If you take the trouble to connect all the places I have lived in over the last few years on a map, the result is unfortunately not a particularly exciting shape; in terms of an interesting underlying 2D structure (circle, pentagram, Cassiopeia, dependency tree), my moving activity so far has been a failure. The only pattern is a slight alternation between moves north and south (south-north-south-north-south), so the next move will most likely take me to Murmansk.</p>
<p>Until then, I am in Zürich, which is certainly a step up in terms of urbanity, although it now takes me considerably longer than 5 minutes to walk to the nearest Karstadt (or rather, the city centre). The centre closest to me, however, offers not only my workplace but also plenty of other shops, which makes such losses easier to bear.</p>
<p>After what once again turned out to be a rather hectic move, I have officially been an assistant at the University of Zürich since 15 April (number 53 in the <a href="http://en.wikipedia.org/wiki/Academic_Ranking_of_World_Universities">Academic Ranking of World Universities</a>, after all, and thus ahead of all German universities). Besides teaching (starting in the autumn semester), this includes work on the <a href="http://www.cl.uzh.ch/kitt/treealigner">TreeAligner</a> and anything else that comes to my mind. So far, that has mostly been release polishing for the grandiose 1.1, there never having been a version 1.0, which is a release trick quite a few of us could learn from. As soon as that is finished, which according to plan is around mid-May, work on the query module continues at full steam, but more on that tomorrow.</p>
]]></content:encoded>
			<wfw:commentRss>http://shlomme.diotavelli.net/2009/04/29/immer-weiter/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Stack Overflow Statistics</title>
		<link>http://shlomme.diotavelli.net/2009/02/08/stack-overflow-statistics/</link>
		<comments>http://shlomme.diotavelli.net/2009/02/08/stack-overflow-statistics/#comments</comments>
		<pubDate>Sun, 08 Feb 2009 14:38:23 +0000</pubDate>
		<dc:creator>shlomme</dc:creator>
				<category><![CDATA[lang:en]]></category>
		<category><![CDATA[other]]></category>

		<guid isPermaLink="false">http://shlomme.diotavelli.net/?p=129</guid>
		<description><![CDATA[Going through a computational linguistics program will bring you in touch with Zipf&#8217;s Law. Its core claim:
In a corpus, the frequency of any word is inversely proportional to its rank.
Translated into less-wordy terms, it means that some words (events) occur very often and many words only occur a few times, or only once.
Zipf&#8217;s Law also [...]]]></description>
			<content:encoded><![CDATA[<p>Going through a computational linguistics program will bring you in touch with <a href="http://en.wikipedia.org/wiki/Zipf%27s_law">Zipf&#8217;s Law</a>. Its core claim:</p>
<blockquote><p>In a corpus, the frequency of any word is inversely proportional to its rank.</p></blockquote>
<p>Translated into less-wordy terms, it means that some words (events) occur very often and many words only occur a few times, or only once.</p>
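<p>As a quick worked example of the claim, the following Python sketch generates ideal Zipf frequencies, where the word at rank <em>r</em> occurs <em>C / r</em> times. The constant 1200 and the helper name <code>zipf_frequencies</code> are made up for illustration; real corpora only follow the law approximately.</p>

```python
def zipf_frequencies(top_frequency, num_ranks):
    """Ideal Zipf frequencies: the word at rank r occurs top_frequency / r times."""
    return [top_frequency / rank for rank in range(1, num_ranks + 1)]


freqs = zipf_frequencies(1200, 6)
print(freqs)
# [1200.0, 600.0, 400.0, 300.0, 240.0, 200.0]

# "Inversely proportional" means that frequency times rank stays constant.
print([round(f * r) for r, f in enumerate(freqs, start=1)])
# [1200, 1200, 1200, 1200, 1200, 1200]
```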
<p>Zipf&#8217;s Law also holds for similar structures like DNA, but the distribution can also be observed in the user reputation of <a href="http://stackoverflow.com">Stack Overflow</a>. The following three graphs contain the reputation (X) and the frequency of this particular reputation value (Y) on log-scaled axes. With increasing normalization, the plot gets more Zipf-like, with the typical long &#8220;tails&#8221; at the lower end.</p>
<p><img width="500" src="http://diotavelli.net/files/img/zipf-overflow.png" alt="Distribution of Reputation, no normalization" /></p>
<p><img width="500" src="http://diotavelli.net/files/img/zipf-overflow-10.png" alt="Distribution of Reputation, 10r normalization" /></p>
<p><img width="500" src="http://diotavelli.net/files/img/zipf-overflow-100.png" alt="Distribution of Reputation, 100r normalization" /></p>
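<p>A minimal sketch of how such frequency counts might be produced, assuming that &#8220;10r&#8221; and &#8220;100r&#8221; normalization means rounding each reputation value down to a multiple of 10 or 100 before counting. That reading, the helper name <code>reputation_histogram</code> and the sample values are assumptions, not the script used for the plots.</p>

```python
from collections import Counter


def reputation_histogram(reputations, bucket=1):
    """Count how many users fall into each reputation bucket of the given width."""
    return Counter((rep // bucket) * bucket for rep in reputations)


sample = [1, 1, 1, 6, 11, 15, 101, 250]  # made-up reputation values
print(sorted(reputation_histogram(sample, bucket=10).items()))
# [(0, 4), (10, 2), (100, 1), (250, 1)]
print(sorted(reputation_histogram(sample, bucket=100).items()))
# [(0, 6), (100, 1), (200, 1)]
```

<p>Wider buckets merge sparsely populated reputation values, which smooths the plot.</p>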
<p>If we plot the mass distribution of reputation ordered by decreasing reputation on log-log axes, we get something that looks like the cumulative distribution function of an exponential distribution:</p>
<p><img width="500" src="http://diotavelli.net/files/img/rmd.png" alt="Reputation Mass Distribution" /></p>
<p>On 2009-02-05, the total amount of reputation on Stack Overflow was 8,491,989, and around 15% of the users make up 85% of the reputation (not completely <a href="http://en.wikipedia.org/wiki/Pareto_distribution">Pareto&#8217;s 80-20</a>), with the top user (of 41,082) owning 0.39% of the overall reputation.</p>
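<p>The arithmetic behind such statements can be sketched in a few lines of Python. The helper <code>top_share</code> and the sample list are made up for illustration; only the total and the 0.39% figure come from the numbers above, and since the percentage is rounded, the top user&#8217;s reputation derived from it is approximate.</p>

```python
def top_share(reputations, fraction):
    """Share of the total reputation held by the top `fraction` of users."""
    ordered = sorted(reputations, reverse=True)
    count = max(1, round(len(ordered) * fraction))
    return sum(ordered[:count]) / sum(ordered)


# Figures quoted in the text: 0.39% of 8,491,989 reputation points
print(round(8491989 * 0.0039))  # 33119

sample = [1000, 500, 200, 100, 50, 50, 25, 25, 25, 25]  # made-up values
print(top_share(sample, 0.2))  # 0.75
```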
<p>For these graphs, I&#8217;ve scraped the user overview pages. Scraping every single user page would allow for more interesting (and more accurate, since inactive users could be removed) statistics, but I&#8217;d rather wait for a proper API. </p>
]]></content:encoded>
			<wfw:commentRss>http://shlomme.diotavelli.net/2009/02/08/stack-overflow-statistics/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
