Nope, the pub evening wasn’t very good, I’m afraid. Loud, crowded, hot and filled with only a couple of familiar faces. So retired early and woke even earlier. Not a perfect start for a long day.
Divided the sessions among the four of us (there are four concurrent tracks), and I drew the short straw and had to sit in on some very desktop-y presentations.
Linux 2.6 Roadmap by Jonathan Corbet (Linux Weekly News).
Like last year, the conference kicks off with a presentation about what was decided in the preceding kernel summit. Slides of the presentation are available at the LWN site.
This time around there were fewer major bombshells (i.e. the development model had not been thoroughly changed, once again).
Presentation was divided into two parts – a recap of where the kernel is right now, and where it is expected to move in the near future (though calling it a real roadmap is an insult to the cartographers of the world).
All in all the new development model seems to be in pretty good shape: Andrew Morton maintains the development-flavored branch (the -mm tree), and Linus maintains the stable branch. It is not perfect by any means: patches can, of course, bypass the -mm branch altogether, and there’s no formal bug tracking going on. The adoption of “sucker” trees (2.6.x.y, maintained by Greg Kroah-Hartman and Chris Wright), where only bugfixes are rolled in, has been beneficial as well, since it removes the need for wild backporting.
Another big development model-related thing is the sudden move from BitKeeper to git. Git is not at “1.0” quality yet, and known issues exist (disk space is wasted). Competitors exist as well (Mercurial was rated the best of the bunch).
The roadmap portion of the presentation turned vague as soon as a feature was not feasible for the next upcoming release. But it was still a very useful talk, since a lot of interesting, previously strenuously resisted features are now making it in. Some of them are already in the just-around-the-corner next release.
First up, 2.6.13, expected in August: kdump, inotify, execute-in-place, voluntary pre-emption, selectable timer frequency.
The roadmap was not fully devoted to technical issues; process-related things were discussed as well (as shown on slide 24).
Latency, i.e. the time required to respond to an event, is a huge and hard problem to resolve. Ingo Molnar’s patches for “deterministic scheduling response time, always” are invasive and very hard to sell to the community as a whole (there are many beneficiaries, but unfortunately everyday desktop use does not stand to gain much, whereas audio/video, data acquisition and all kinds of [pseudo]-realtime control do). So each and every aspect must be argued individually; some of the improvements have already been rolled in, to a positive reception.
A controversial change is the move from spinlocks to priority-inheriting mutexes, where processes unable to take the lock sleep instead of spinning constantly. This brings with it the possibility of even wider pre-emption of processes in the kernel, as well as prevention of core stalling between entirely unrelated processes due to contention.
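To make the priority-inheritance part concrete, here is a toy sketch of my own (not from the slides, and nothing like the real kernel code). It only models the bookkeeping: a waiter goes to sleep on the lock, and the current owner temporarily inherits the waiter's priority. The names Task and PIMutex are made up for illustration, and the sketch assumes a task holds at most one lock at a time.

```python
class Task:
    def __init__(self, name, priority):
        self.name = name
        self.base_priority = priority        # higher number = more urgent
        self.effective_priority = priority   # may be boosted while holding a lock


class PIMutex:
    """Toy priority-inheriting mutex: bookkeeping only, no real blocking."""

    def __init__(self):
        self.owner = None
        self.waiters = []

    def lock(self, task):
        """Return True if acquired; otherwise queue the task as a sleeper."""
        if self.owner is None:
            self.owner = task
            return True
        self.waiters.append(task)
        # Inheritance step: boost the owner to the waiter's priority so that
        # an unrelated medium-priority task cannot keep the owner off the CPU.
        if task.effective_priority > self.owner.effective_priority:
            self.owner.effective_priority = task.effective_priority
        return False

    def unlock(self):
        """Drop the boost and hand the lock to the most urgent waiter."""
        # Simplification: assumes the owner holds no other contended lock.
        self.owner.effective_priority = self.owner.base_priority
        if not self.waiters:
            self.owner = None
            return None
        nxt = max(self.waiters, key=lambda t: t.effective_priority)
        self.waiters.remove(nxt)
        self.owner = nxt
        return nxt


low = Task("logger", 1)
high = Task("audio", 10)
mutex = PIMutex()
mutex.lock(low)    # low-priority task takes the lock
mutex.lock(high)   # high-priority task has to wait, so the owner is boosted
```

With a spinlock the waiter would burn CPU until the lock is free; here it is parked in the waiters list and the owner runs at the boosted priority until it unlocks.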
Another long-discussed change proposal is the implementation of all interrupt handlers as kernel threads. This would mean that everything is scheduled in a similar fashion. However, the additional locking primitives needed are not trivially determinable.
An “interrupt pipe” is another implementation of an improved handler. Both ADEOS and RTAI use it as the default mechanism.
Virtualization is a big and popular topic. So popular that a full day’s track (Friday) is devoted to the subject. Xen is by far the leader in this niche. But it is far from complete, with open issues in the scalability domain (i.e. not ready for SMP or PAE).
On the filesystem front the score is pretty simple: Reiser4 will go in “when it’s ready”, as will FUSE. Cluster filesystems have a long road ahead, as there’s no 100% clear vision of which parts a) can be shared between implementations (especially the distributed locking manager, if at all possible) and b) need to be in the kernel. GFS and OCFS2 are the two clear leaders here.
Security and resource management are moving along on expected routes, SELinux and CKRM, respectively. The support for TPM has already been added for the former.
Timekeeping has been changed to be more dynamic, especially for the benefit of virtualization; there’s a separate presentation devoted to the issue.
Memory management has also been a target for tweaks, some of them large indeed (the 4-level page tables).
All in all the conclusion is that the kernel remains very much a “work in progress”, but steps are being taken into many interesting directions.
Building Linux Software with Conary by Michael K. Johnson (Rpath).
The slides are not available; the company site appears to be undergoing major reconstruction.
The rpath guys have implemented an RPM replacement called Conary, which does away with one of the most menial tasks of software configuration management (maintenance of the spec file of a package).
Definitely an interesting addition to the currently available solutions is the availability of “shadows”, basically copy-on-write branches, that allow for trivially easy inheritance.
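As an analogy only (I have no idea how Conary actually stores shadows), the copy-on-write behaviour can be sketched with Python's collections.ChainMap: reads fall through to the parent branch, writes land in the shadow. The package fields below are invented for the example.

```python
from collections import ChainMap

# Hypothetical upstream branch of a package definition.
upstream = {
    "version": "1.0",
    "configure_flags": "--prefix=/usr",
    "patches": (),
}

# The "shadow" starts out empty and reads through to upstream.
shadow = ChainMap({}, upstream)

# A local change is written only to the shadow layer (copy-on-write).
shadow["patches"] = ("local-fix.patch",)

# An upstream update is visible in the shadow without any merge step.
upstream["version"] = "1.1"

print(shadow["version"])   # read falls through to upstream
print(shadow["patches"])   # local override wins
print(shadow.maps[0])      # only the local delta is stored in the shadow
```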
Python (the language) is used as the definition language for building “recipes”, which are further simplified by the availability of superclasses (with which e.g. most KDE apps can share pretty much the same scripting).
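Since no slides are available, here is my own guess at what the superclass idea buys you. The class names, attributes and build steps below are hypothetical and are not Conary's real API; the point is just that a shared base class carries the build logic and a concrete recipe shrinks to a couple of lines.

```python
class PackageRecipe:
    """Hypothetical base class, not Conary's actual API."""
    name = None
    version = None

    def build_steps(self):
        raise NotImplementedError


class KDEAppRecipe(PackageRecipe):
    """Shared build logic that many KDE applications could inherit."""
    configure_args = ["--prefix=/usr"]

    def build_steps(self):
        return [
            f"unpack {self.name}-{self.version}.tar.bz2",
            "./configure " + " ".join(self.configure_args),
            "make",
            "make install",
        ]


class ExampleAppRecipe(KDEAppRecipe):
    # A concrete recipe only states what differs from the superclass.
    name = "exampleapp"   # made-up package name
    version = "1.0"       # made-up version


for step in ExampleAppRecipe().build_steps():
    print(step)
```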
The technologies involved are still evolving, and especially large scale SCM issues (releases etc.) are still incomplete.
[ And yeah, it's indeed the same Michael Johnson, who wrote a lot of the early "getting started with Linux"-type guides. Partially responsible for what I do these days, for sure. ]
Had lunch in the Teriyaki-place at the food court of Rideau Centre. Simple, fast and good, not to mention way healthier than the usual BK/McD/whatever-fare. Helsinki ought to have one as well.
SNAP Computing and the X Window System by Jim Gettys (ex-HP, laid off during the summit).
Slides available at freedesktop.org. Good summary at LWN.
Very much centered on the idea that current computers do not really scale beyond single users. Single screen, single mouse etc. Approaching ubiquitous computing, where applications follow users and users do not have to lug around any hardware. Also, the management of the environment is way too expensive – currently up to 3/4 of the cost of any non-trivial installation goes into maintenance.
Quite academic and necessarily provocative. Bluetooth is supposedly “useless”, whereas Zigbee is seen as a potential solution to networking issues.
The big idea is to make the “plumbing” of the network (e.g. discovery and authentication of services) as seamless as possible. X must go truly multi-user, and that requires lots of design and implementation. For example, ssh is not seen as an adequate connectivity tool, since it has legal issues (export control) and is never truly ad-hoc (it needs an account on the target system), but it could be leveraged as the underlying transport. IPsec is not usable either – it is not end-to-end, and does not handle user-level authentication.
Input devices must become true network services, otherwise the plumbing’s only half-done – the control must be able to migrate between devices.
And services must be able to be seamlessly used, without regard to network topology. HAL and DBUS technologies are seen as big enablers in this area.
Clearly the problem domain is much wider than just graphics, and there are really no places to copy good ideas from – so it’s really time to innovate.
TWIN: An Even Smaller Window System for Even Smaller Devices by Keith Packard (HP).
Slides available at freedesktop.org.
Easily the best speaker thus far, and pretty much in the top 5 ever (among technical folks, that is). Keith Packard’s an old X Window System guru and it shows – the presentation is peppered with war stories and anecdotes from the old days.
The basic idea behind the presentation is the need for a new truly lightweight window system – lightweight in the sense of memory footprint, not in CPU consumption. And with severe wizardry (more than adequately explained in the material) the entire system, including scalable fonts, translucency support and a PostScript-based geometry engine, fits in 100 kilobytes. Yes, one hundred kilobytes. That definitely qualifies as lightweight. There are a lot of interesting shortcuts taken, while hugging the requirement set at a height of only several molecules – showing that truly impressive results can be had even when the initial set of reqs seems mindbogglingly hard.
Can You handle the Pressure? Making Linux bulletproof under load by Martin Bligh (IBM).
No presentation available as far as I googled.
Struggling to make the virtual memory system of Linux truly bulletproof is a painful task.
The presentation described the current state of affairs and included ideas on how to improve the situation.
The current state is that the behavior is easily explained, but contains lots of issues. Basically all memory is used – and some of it is easily reclaimable when needed. Clean pages can be reclaimed immediately, whereas dirty ones must be dumped onto backing store first. The current page selection algorithm is LRU (with lots of extra spicing on top), based on HW-level information (pages). Complicated by the need to balance pagecache and slabcache (kernel’s pages) with the users’ paging needs.
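The clean/dirty distinction is easy to show with a toy LRU list. This is my own simplification (plain LRU, none of the “extra spicing” the kernel adds), and the class and method names are made up: clean pages are dropped on the spot, dirty ones are queued for writeback to backing store first.

```python
from collections import OrderedDict


class PageLRU:
    """Toy LRU page list; the least recently used page sits at the front."""

    def __init__(self):
        self.pages = OrderedDict()   # page id -> dirty flag
        self.writebacks = []         # pages that had to be written out first

    def touch(self, page, dirty=False):
        """Record an access; a page stays dirty until it is reclaimed."""
        was_dirty = self.pages.pop(page, False)
        self.pages[page] = was_dirty or dirty   # re-insert at the recent end

    def reclaim(self, count):
        """Free up to `count` pages, oldest first."""
        freed = []
        while self.pages and len(freed) < count:
            page, dirty = self.pages.popitem(last=False)
            if dirty:
                # Dirty pages cost an I/O before the memory can be reused.
                self.writebacks.append(page)
            freed.append(page)
        return freed


lru = PageLRU()
lru.touch("a")
lru.touch("b", dirty=True)
lru.touch("c")
lru.touch("a")           # "a" becomes the most recently used page
freed = lru.reclaim(2)   # evicts "b" (after writeback) and "c"
```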
By far the most problematic structure at the moment is the buffer-head, which has a lot of dependencies and is thus extremely hard to reclaim (eg. filesystem metadata caching, differences between filesystem blocksize and HW page size, mappings between data in memory and on actual disk surface, ordering guarantees for transaction-based entities).
Some pages are just plain unreclaimable – e.g. kernel, locked pages, RDMA. But a big omission in the current system is the lack of differentiation between physical and virtual pinning of pages; they are after all two separate address spaces and should be treated as such. Fragmentation of memory complicates reclamation further (especially when most kernel structures are definitely sub-page sized). Dependencies are not managed well – a full tree to map e.g. inodes/dentries/pagecache entries is needed.
The OOM killer is usually a good first sign that there’s either a) a suboptimal workload or b) bugs. Usually the former. But the diagnostics provided by the killer are not as good as they could be. Usually there’s just an indication of how much memory is not available, and that’s not enough in most cases.
Clearly there’s a need for better tools here to monitor memory consumption, both on a live system (i.e. instrumentation) and in postmortem fashion; the current state of the art, vmtop and meminfo/slabinfo, does not reach very far. Some ideas to create “event receptors” with priority sockets exist, but no complete implementation is available. Recreation of the fault is usually between hard and impossible, and kprobes/tap cannot be hooked in after the fault has manifested itself. The dirty page evaluator is slow, and usually not available – heuristics to switch between two modes at a certain threshold would be beneficial.
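For reference, this is roughly all the information meminfo gives you. A small sketch of my own that parses /proc/meminfo-style text; the sample numbers are invented, and the “easily reclaimable” sum (free + buffers + page cache) is a crude assumption of mine rather than anything from the talk.

```python
# Invented sample in the /proc/meminfo "Name:   value kB" layout.
SAMPLE = """\
MemTotal:        1024000 kB
MemFree:           51200 kB
Buffers:           20480 kB
Cached:           307200 kB
Dirty:               128 kB
"""


def parse_meminfo(text):
    """Parse meminfo-style text into {field: number}.

    Most values are in kB; fields printed without a unit are plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue                     # skip lines without a "Name:" part
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def reclaimable_hint(fields):
    """Crude estimate (an assumption): free memory plus buffers and page cache."""
    return (fields.get("MemFree", 0)
            + fields.get("Buffers", 0)
            + fields.get("Cached", 0))


info = parse_meminfo(SAMPLE)
print(reclaimable_hint(info), "kB look cheap to get back")
```

On a Linux box the same parser can be fed the contents of /proc/meminfo. It says nothing about why memory is pinned or who holds it, which is exactly the gap the talk was complaining about.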
Instrumentation would indeed be a very good addition, but both space and performance criteria apply – adding per-page tags is not easy to accomplish.
In swapless systems the code short-circuits and avoids very lengthy page management methods.
This time it seems that no free sodas are offered, only water.
Ate dinner in Fishmarket restaurant, monkfish is still a good selection.
Visited the Intel/IBM-hosted evening reception in the conference center. And quite something else it was. The initial presenter from Intel was slick and harmless (especially considering how much longer he could have spoken). Performance was marred by the revelation that he was actually showing slides from an XP machine (to loud booing). Which was probably a wise choice, since the next presenter took a sweet fifteen minutes to tune his X Window System settings before commencing the show. The less said about the quality of the content, the better. The evening was capped off by a traditional lottery, but someone had obviously screwed up gravely, since among the twenty-odd tickets drawn were no winning matches whatsoever. So, the speaker switched over to trivia questions, the answers to which had to be shouted out – kinda putting anyone sitting more than five meters distant at a severe disadvantage. Drinks were available, otherwise I (and I guess most of the others as well) would’ve left before the official program had reached its halfway mark.