panthema (page 4 of 1 2 3 4 5 6 7 8 9 10)
Animation showing binary search and linear scan

STX B+ Tree Revisiting Binary Search

Posted on 2013-05-04 12:44 by Timo Bingmann at Permlink with 0 Comments. Tags: stx-btree

While developing another piece of software, I happened to require a customizable binary search implementation, which lead me to revisit the binary search function of my quite popular STX B+ Tree implementation.

The binary search is a central component in the container, both for performance and correctness, as it is used when traversing the tree in search for a key or an insertion point. It is implemented in the find_lower() (and find_upper()) functions.

In a first step, I cleaned the implementation to use an exclusive right boundary. In this binary search variant, hi points to the first element beyond the end (with the same meaning as usual end() C++ iterator). This got rid of the special case handled after the while loop. The exclusive right boundary is also a more natural implementation variant (even though most computer science textbooks contain the inclusive version).

Having thus changed the binary search, I reran the speed tests. However, to my surprise, the performance of the library decreased slightly, but consistently. See the code diff 39580c19 and resulting speed test PDF, where solid lines are after the patch and dashed ones before.

After some research, I found a good, independent study of binary search variants by Stephan Brumme. His summary is that a linear scan is more efficient than binary search, if the keys are located in only one cache line. This (of course) explained the performance decrease I measured, as my "special case" after the search was in fact a very short linear scan of two element.

This blog entry continues on the next page ...

Thumbnail of a pie chart filling to 100%

Released disk-filltest 0.7 - Simple Tool to Detect Bad Disks by Filling with Random Data

Posted on 2013-03-27 21:32 by Timo Bingmann at Permlink with 0 Comments. Tags: c++ utilities

This post announces the first version of disk-filltest, a very simple tool to test for bad blocks on a disk by filling it with random data. The function of disk-filltest is simple:

See the disk-filltest project page for more information about version 0.7.

Memory profile plot as generated by example in malloc_count tarball

Released malloc_count 0.7 - Tools for Runtime Memory Usage Analysis and Profiling

Posted on 2013-03-16 22:17 by Timo Bingmann at Permlink with 1 Comments. Tags: c++ coding tricks

This post announces the first version of malloc_count, a very useful tool that I have been fine-tuning in the past months. The code library provides facilities to

The code tool works by intercepting the standard malloc(), free(), etc functions. Thus no changes are necessary to the inspected source code.

See the malloc_count project page for more information about version 0.7.

Instacode coloring of assembler code

Coding Tricks 101: How to Save the Assembler Code Generated by GCC

Posted on 2013-01-24 18:07 by Timo Bingmann at Permlink with 2 Comments. Tags: c++ coding tricks frontpage

This is the first issue of a series of blog posts about some Linux coding tricks I have collected in the last few years.

Folklore says that compilers are among the most complex computer programs written today. They incorporate many optimization algorithms, inline functions and fold constant expressions; all without changing output, correctness or side effects of the code. If you think about it, the work gcc, llvm and other compilers do is really amazing and mostly works just great.

Sometimes, however, you want to know exactly what a compiler does with your C/C++ code. Most straight-forward questions can be answered using a debugger. However, if you want to verify whether the compiler really applies those optimizations to your program, that your intuition expects it to do, then a debugger is usually not useful, because optimized programs can look very different from the original. Some example questions are:

These questions can be answered definitely by investigating the compiler's output. On the Net, there are multiple "online compilers," which can visualize the assembler output of popular compilers for small pieces of code: see the "GCC Explorer" or "C/C++ to Assembly v2". However, for inspecting parts of a larger project, these tools are unusable, because the interesting pieces are embedded in much larger source files.

Luckily, gcc does not output binary machine code directly. Instead, it internally writes assembler code, which then is translated by as into binary machine code (actually, gcc creates more intermediate structures). This internal assembler code can be outputted to a file, with some annotation to make it easier to read.

This blog entry continues on the next page ...

Example of the Inducing Process

eSAIS - Inducing Suffix and LCP Arrays in External Memory

Posted on 2012-11-19 15:49 by Timo Bingmann at Permlink with 2 Comments. Tags: research stringology stxxl c++

This web page accompanies our conference paper "Inducing Suffix and LCP Arrays in External Memory", which we presented at the Workshop on Algorithm Engineering and Experiments (ALENEX 2013). A PDF of the publication is available from this site as alenex13esais.pdf alenex13esais.pdf or from the online proceedings. The paper was joint work with my colleagues Johannes Fischer and Vitaly Osipov.

Download alenex13esais.pdf

The slides to my presentation of the paper on January 7th, 2013 in New Orleans, LA, USA is available: alenex13esais-slides.pdf alenex13esais-slides.pdf. They contain little text and an example of the eSAIS algorithm with a simplified PQ.

Download alenex13esais-slides.pdf

We have also submitted a full version of the eSAIS paper to a journal. Due to long publication cycles, we make a pre-print of the journal article is available here: esais-preprint.pdf esais-preprint.pdf. The full paper contains more details on the inducing algorithm for the LCP array and additional experimental details.

Download esais-preprint.pdf

Our implementations of eSAIS, the eSAIS-LCP variants, DC3 and DC3-LCP algorithms as described in the paper are available below under the GNU General Public License v3 (GPL).

eSAIS and DC3 with LCP version 0.5.4 (current) updated 2013-12-13
Source code archive:
(includes STXXL 1.4.0)
eSAIS-DC3-LCP-0.5.4.tar.bz2 eSAIS-DC3-LCP-0.5.4.tar.bz2 (1.37 MiB) Browse online
Git repositories Suffix and LCP construction algorithms
git clone https://github.com/bingmann/eSAIS
cd eSAIS; git submodule init; git submodule update
STXXL 1.4.0
git clone https://github.com/stxxl/stxxl

For more information about compiling and testing the implementation, please refer to the README included in the source.

This blog entry continues on the next page ...

The QuadClip Algorithm

Finding Roots of Polynomials by Clipping - Report and Implementation from my Lab Course in Numerical Mathematics

Posted on 2012-03-20 22:29 by Timo Bingmann at Permlink with 0 Comments. Tags: maths university c++

This semester I had the pleasure to take part in a lab exercise course supervised by Prof. Thomas Linß at the FernUniversity of Hagen. The objective was to comprehend, implement and evaluate a particular recent advancement in the field of numerical mathematics. My topic was finding the roots of a polynomial by clipping in Bézier representation using two new methods, one devised by Michael Bartoň and Bert Jüttler [1], the other extended from the first by Ligang Liu, Lei Zhang, Binbin Lin and Guojin Wang [2].

My implementation of this topic was done for the lab course in C++ and contains many in themselves interesting sub-algorithms, which are combined into the clipping algorithms for finding roots. These sub-algorithms may prove useful for other purposes, which is the main reason for publishing this website. Among these are:

For the lab course I wrote two documents, both in German: one is an abstract Kurzfassung.pdf Kurzfassung.pdf (1 page), which is translated into English below, and the other a short report Ausarbeitung.pdf Ausarbeitung.pdf (6 pages). The report contains a short description of the algorithms together with execution and convergence speed measurements, which verify the original authors experiments. For presenting the lab work I created these Slides.pdf Slides.pdf, which however are not self-explanatory due to my minimum-text presentation style.

This blog entry continues on the next page ...

Drawing of PG(2,3)

Vervollständigung meiner Seminararbeit in Diskreter Mathematik

Posted on 2011-08-31 20:12 by Timo Bingmann at Permlink with 0 Comments. Tags: maths university

Heute habe ich meine Seminararbeit in Diskreter Mathematik an der FernUni Hagen vervollständigt. Ich möchte mich bei Prof. Hochstättler bedanken für die erfahrene Leitung und planvolle Einführung in die interessanten Gebiete der Design Theorie und endlicher Geometrien. Die Arbeit ist auf deutsch und unter folgendem Link downloadbar.

Download PDF: Seminararbeit-Singer-Differenzenmengen.pdf Seminararbeit-Singer-Differenzenmengen.pdf (2.79 MiB).

Download Seminararbeit-Singer-Differenzenmengen.pdf

Zusammenfassung

Die Entwicklung von Differenzenmengen in der Design Theorie begann 1938 mit einer algebraischen Untersuchung endlicher projektiver Geometrien von James Singer. Nach Einführung der wichtigsten Grundbegriffe der Design Theorie und projektiver Geometrien werden in dieser Ausarbeitung die beiden Resultate von Singer entwickelt. Die Existenz einer Kollineation mit Periode \frac{n^{d+1}-1}{n-1} führt zur Definition von Differenzenmengen und beide Ergebnisse liefern elegante Konstruktionsverfahren für PG(d,p^k).


Small drawing of a B+ tree

Update Release of STX B+ Tree 0.8.6

Posted on 2011-05-18 12:44 by Timo Bingmann at Permlink with 1 Comments. Tags: c++ stx-btree

After four years I have decided to release an updated version 0.8.6 of the STX B+ Tree C++ Template Classes package. The updated release contains all patches that have accumulated in my inbox over the years. So yes, please send me patches for this project, it is not in vain! Below the highlights on the changes in this release:

I also reran the speed test done back in 2007 on my new hardware and compared the results with the old data. Due to the larger L2 cache sizes in my new Intel Core i7, the B-tree speed-up first starts to show at about 100,000 integer items, rather than 16,000 items with my older Pentium 4. This might also have something to do with the new CPU using 64-bit pointers and thus requiring larger nodes for child references. Read the complete speed test here.

result plot from new speedtest

The updated source code package is available for download from this webpage.


Yet Another Release of digup 0.6.40 - A Digest Updating Tool

Posted on 2011-01-31 20:25 by Timo Bingmann at Permlink with 0 Comments. Tags: c++ utilities

This is yet another release entry of digup. This time, however, it is a major release with lots of new improvements and some old fixes:

For more information and the new version see the digup web page.


Bugfix Release: digup 0.6.30 - A Digest Updating Tool

Posted on 2010-10-03 16:12 by Timo Bingmann at Permlink with 0 Comments. Tags: c++ utilities

Fixed another severe bug in the digup tool: on the amd64 architecture the tool crashed when writing the digest file, thanks goes to Daniel D. for reporting and fixing this bug.

The bug was caused by the variable arguments lists va_list used twice in the fprintfcrc() function. Apparently, on the amd64 platform va_start() and va_end() must be called twice even when passed the list to vsprintf().

For more information and the new version see the digup web page.


Show Page: 1 2 3 4 5 6 7 8 9 10 Next >
RSS 2.0 Weblog Feed Atom 1.0 Weblog Feed Valid XHTML 1.1 Valid CSS (2.1)
Copyright 2005-2018 Timo Bingmann - Impressum