EEF Summer School on Massive Data Sets

June 27-July 1, 2002, BRICS, University of Aarhus, Denmark



Data sets in large applications are often too massive to fit completely inside the computer's internal memory. The resulting input/output communication (or I/O) between fast internal memory and slower external memory (such as disks) can be a major performance bottleneck. During the last decade a major body of research has been devoted to the development of efficient external memory algorithms, where the goal is to exploit locality in order to reduce the I/O costs. This summer school will survey the state of the art in the design and analysis of external memory algorithms and data structures.

The lecturers and the main topics for the summer school are:

Lars Arge, Duke University - External geometric data structures
Erik D. Demaine, MIT - Cache oblivious algorithms and data structures
Paolo Ferragina, University of Pisa - String algorithms and data structures
Jeff Vitter, Duke University - Fundamental external memory algorithms and use of parallel disks
Norbert Zeh, Duke University - Graph algorithms


Lars Arge (external geometric data structures): geometric data structures (interval trees, priority search trees), range searching (range trees, kd-B trees, O-trees, cross-trees, R-trees), point location, proximity queries. B-trees, weight-balanced B-trees, persistent B-trees, buffer trees, priority queues Erik Demaine (cache oblivious algorithms and data structures): Cache-oblivious model, basic algorithms, sorting, data structures (search trees, priority queues), graph and geometry algorithms. Paolo Ferragina (string algorithms and data structures): String sorting, the indexing problem (word vs full-text indexes, time vs space trade-off), word indexes (inverted lists, text and index compression, block-addressing indexes, complex queries), full-text indexes (suffix array, suffix tree, string B-tree, opportunistic data structures, experiments). Jeff Vitter (fundamental external memory algorithms and use of parallel disks): I/O model, sorting upper and lower bounds, hashing, algorithms for parallel disks, scientific computing (matrix computations, FFT), hierarchical memory models, batched computational geometry/geographic information systems (distribution sweeping, spatial join, line segment intersection, convex hull), Norbert Zeh (graph algorithms): Data structures useful for solving graph problems I/O-efficiently (time-forward processing, list ranking, Euler tour, graph contraction), fundamental graph algorithms for general graphs (connectivity problems, breadth first search (BFS), depth first search (DFS), single source shortest path (SSSP)), algorithms for sparse graphs (general algorithms that happen to work well on sparse graphs, overview of graph classes where things can be done I/O-efficiently, planar graphs).


Lecture Notes

Preliminary proceedings for the summer school will be published locally by BRICS. The final proceedings are here:

Who Should Attend?

The school is open to anyone interested. A primary target group is PhD and graduate students. A substantial number of grants for students to attend the school, covering registration, accommodation and local costs are available. Applications via the electronic registration form on the homepage of the summer school. Deadline for applications is May 15, 2002.

European Educational Forum

The European Educational Forum (EEF) is a joint initiative of European interuniversitary research schools in computer science, and involves 32 universities in Denmark, Finland, France, Germany, Italy, The Netherlands, and the United Kingdom. The summer school on Massive Data Sets is organised as a part of the EEF foundations series of summer schools, and is supported by the European Commission Research DG (Human Potential Programme, High-Level Scientific Conferences, HPCF-CT-1999-00186) and BRICS.

Organizing Committee

Gerth Stølting Brodal
Uffe Engberg
Rolf Fagerberg
Karen Kjær Møller
Erik Meineche Schmidt


Questions concerning the summer school should be directed to

Summer School on Massive Data Sets
Att.: Gerth S. Brodal
BRICS, Department of Computer Science
University of Aarhus
DK - 8000 Aarhus C

Phone: (+45) 8942 3200
Fax: (+45) 8942 3255

Page maintained by Gerth Stølting Brodal