H8QG6 labels for edac-utils

We’re using edac-ctl, from edac-utils, to get information about the configuration of our servers. I recently added the labels for our H8QG6-based models to the database; if you’re lucky enough to own one of these beautiful machines, the following patch might be of interest to you.

This has been sent upstream so hopefully this will appear in a future release of edac-utils.

From 997fbda58cbf5eadb426b0d5f169d70c6d796afb Mon Sep 17 00:00:00 2001
From: Jonas Bonn <jonas@southpole.se>
Date: Wed, 9 Nov 2011 15:05:41 +0100
Subject: [PATCH 1/1] Add labels for Supermicro H8QG6

---
src/etc/labels.db |   18 ++++++++++++++++++
1 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/src/etc/labels.db b/src/etc/labels.db
index 5b01c7f..bc68777 100644
--- a/src/etc/labels.db
+++ b/src/etc/labels.db
@@ -58,6 +58,24 @@ Vendor: Supermicro
DIMMA3: 0.2.0, 0.3.0; DIMMB3: 1.0.1, 1.1.1;
DIMMA4: 0.2.1, 0.3.1; DIMMB4: 1.0.0, 1.1.0;

+ Model: H8QG6
+    P1-DIMM1B: 0.0.0, 0.1.0; P1-DIMM2B: 0.0.1, 0.1.1;
+    P1-DIMM3B: 1.0.0, 1.1.0; P1-DIMM4B: 1.0.1, 1.1.1;
+    P1-DIMM1A: 0.2.0, 0.3.0; P1-DIMM2A: 0.2.1, 0.3.1;
+    P1-DIMM3A: 1.2.0, 1.3.0; P1-DIMM4A: 1.2.1, 1.3.1;
+    P2-DIMM1B: 2.0.0, 2.1.0; P2-DIMM2B: 2.0.1, 2.1.1;
+    P2-DIMM3B: 3.0.0, 3.1.0; P2-DIMM4B: 3.0.1, 3.1.1;
+    P2-DIMM1A: 2.2.0, 2.3.0; P2-DIMM2A: 2.2.1, 2.3.1;
+    P2-DIMM3A: 3.2.0, 3.3.0; P2-DIMM4A: 3.2.1, 3.3.1;
+    P3-DIMM1B: 4.0.0, 4.1.0; P3-DIMM2B: 4.0.1, 4.1.1;
+    P3-DIMM3B: 5.0.0, 5.1.0; P3-DIMM4B: 5.0.1, 5.1.1;
+    P3-DIMM1A: 4.2.0, 4.3.0; P3-DIMM2A: 4.2.1, 4.3.1;
+    P3-DIMM3A: 5.2.0, 5.3.0; P3-DIMM4A: 5.2.1, 5.3.1;
+    P4-DIMM1B: 6.0.0, 6.1.0; P4-DIMM2B: 6.0.1, 6.1.1;
+    P4-DIMM3B: 7.0.0, 7.1.0; P4-DIMM4B: 7.0.1, 7.1.1;
+    P4-DIMM1A: 6.2.0, 6.3.0; P4-DIMM2A: 6.2.1, 6.3.1;
+    P4-DIMM3A: 7.2.0, 7.3.0; P4-DIMM4A: 7.2.1, 7.3.1;
+
Model: H8QM8
DIMMA 2A: 0.0.0, 0.1.0; DIMMA 2B: 0.0.1, 0.1.1;
DIMMA 1A: 0.2.0, 0.3.0; DIMMA 1B: 0.2.1, 0.3.1;
--
1.7.5.4
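For readers unfamiliar with the database format, each entry in the patch maps a silk-screen label to one or more locations. My understanding is that each location is a memory controller, csrow, and channel triple as reported by the EDAC driver; treat that reading as an assumption rather than something the patch itself states. Under that assumption, a minimal sketch of parsing one location could look like this (the struct and function names are made up for illustration and are not part of edac-utils):

```c
#include <assert.h>
#include <stdio.h>

/* One DIMM location as written in labels.db, assumed here to mean
 * <memory controller>.<csrow>.<channel>. Hypothetical type. */
struct dimm_loc {
    unsigned mc;
    unsigned csrow;
    unsigned channel;
};

/* Parse a location such as "7.3.1".
 * Returns 0 on success, -1 on malformed input. */
static int parse_dimm_loc(const char *s, struct dimm_loc *out)
{
    int consumed = 0;

    assert(s != NULL && out != NULL);
    if (sscanf(s, "%u.%u.%u%n",
               &out->mc, &out->csrow, &out->channel, &consumed) != 3)
        return -1;
    /* Reject trailing garbage such as "7.3.1x". */
    return s[consumed] == '\0' ? 0 : -1;
}
```

Read this way, the entry "P4-DIMM4A: 7.2.1, 7.3.1" says that the slot labelled P4-DIMM4A backs two csrows (2 and 3) on channel 1 of memory controller 7.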

How to Choose a Linux Distribution for Embedded Development

There is a plethora of Linux distributions to choose from today. For the uninitiated, the process of selecting a “distro” can become a nightmare of trying to disentangle the distinguishing features from the web of names and market-speak. This article takes a look at the process of selecting a distribution for an embedded device; presents that process in the context of some of the more mainstream distributions; and considers the benefits and drawbacks of each given a typical set of project constraints.

Important to keep in mind when selecting a distribution is that, despite what the marketing material may say, they are all more similar than they are different. They all run Linux as the operating system kernel and, thus, all require the same skill set to get the software running on a new device. If your device requires a new hardware driver, that driver will be the same no matter which distribution you choose. If you are running on top of hardware that is already well-supported in the Linux kernel, then any distribution using a sufficiently up-to-date kernel will be equally capable of running on your device.

On top of the Linux kernel, distributions provide a system layer in which reside the usual requisite system libraries and the necessary utilities for configuration of system features such as network interfaces. Most distributions provide here the familiar set of utilities from the GNU stack. Exceptions, however, do exist; the Android distribution, for example, provides a system layer sufficient for the requirements of its Java runtime environment and chooses not to target generic applications.

Finally, the application layer of each distribution is where you will find the most variation. Even here, variation is not something to fear, as an embedded device will most probably be fitted with custom application software in any case. The variation at this level is more easily overcome than that of the lower-level distribution plumbing.

Selecting a distribution
When selecting a distribution, it is important to begin with a clear idea of what is to be achieved therewith. After all, the distribution is just a small piece of the puzzle that is your product and will in most cases not even be visible to the user when the finished product is released. A set of fundamental questions should be put to yourself and your development team and the answers kept in mind when weighing the pros and cons of each distribution against each other. These answers constitute the fundamental criteria upon which your decision should be based.

Is your hardware already supported by the Linux kernel? Is standard hardware being used? New and non-standard hardware presents a significant set of challenges that require a very specific skill-set to overcome. The contracting of professional assistance for the bring-up stage of your project should be taken into consideration as this can save you man-months of work and future headaches. Wind River offers such assistance for their own distribution, and consultancies such as South Pole can provide assistance for the distribution of your choice.

What are the strengths of your developers? An application programmer may not feel comfortable debugging a kernel or packaging system libraries. A Java programmer may not feel comfortable working within the differing set of constraints presented by lower-level languages such as C and C++. Select a distribution that allows your development team to play to its strengths, while leaving the dirty work to the distribution developers.

Do you have an existing application that will require certain platform characteristics in order to run? A typical POSIX application, for example, will expect a set of standard libraries to be present on the system; more specialized environments, like that offered by the Android distribution, do not provide all the components of a generic distribution and may fail to meet the requirements of your application.

What level of long-term support will you require from the distribution provider? Some distributions move quickly and provide short periods of support; some distributions provide no support at all. Support always comes at a cost, so the benefits of paid-for support need to be weighed against the costs of possibly having to create and provide security updates yourself.

Given this set of criteria, one can begin evaluating the set of available distributions. This article will present three mainstream Linux distributions for embedded devices, considered from the point of view of the aforementioned criteria.

Android
Android is quickly becoming the distribution of choice for smartphone-type devices. This distribution has been designed from the ground up to appeal to mobile telephone manufacturers; packaging, security model, and licensing decisions have all been based on the needs of this market segment. As such, Android is by no means a standard Linux distribution. Apart from the Linux kernel that it runs atop, Android bears little resemblance to other Linux distributions.

Android’s strength lies primarily in two key attributes: the components of the underlying system are well defined and do not vary from device to device; the Java-based application layer presents a well-defined and limited view of the platform, presenting the application developer with only one way of doing things. The distribution is supported with security fixes from Google and the Android community.

On the downside, Android requires a set of patches to be applied to the Linux kernel. If your device already uses a non-standard kernel, these patches may not cleanly apply. Furthermore, applications are intended to run in a virtual machine and many common system libraries are not included, largely limiting application development to being done in Java (at least for the time being). If you have an existing application, it likely will require a lot of work to make it run on Android, including possibly requiring a rewrite in Java.

Why choose Android? You want the ease of application development that comes with Android’s Java API; you want access to the multitude of Android applications already available today; you are using hardware that is already supported by the Android Linux kernel.

Wind River Linux
Wind River Linux (WRL) is a commercial distribution for embedded development. This is a fully-fledged distribution that presents few constraints to the application developer: a large number of architectures and platforms are supported; a complete set of system libraries is provided; and the distribution is fully modularized, allowing the slimmest possible build for minimal hardware requirements.

WRL comes with all the advantages of a fully paid-for development environment. Security audits are continuously performed by competent staffers and patches are promptly made available when vulnerabilities are discovered. Full integration with the Eclipse IDE is provided, complete with second-to-none profiling and debugging tools that should make the developer more productive. When all else fails, professional support is available in several forms ranging from email assistance to taking full charge of the development of your device.

Why choose WRL? You value the professional support and assistance in bringing up your device that a paid-for distribution can provide. Your customers put a premium on the security that comes with running a brand-name software stack. You are comfortable working with Eclipse and covet high-quality development tools for profiling and debugging.

OpenEmbedded and MontaVista Linux
OpenEmbedded is in many ways comparable to Wind River’s offering: it targets a multitude of devices on a multitude of architectures. What sets OpenEmbedded’s offering apart is that it is community developed and aligns closely with Open Source software development ideals. For the experienced developer, OpenEmbedded’s unofficial support via mailing lists and IRC is fully sufficient; the strength of Open Source software lies in the community and OpenEmbedded’s aim is to leverage it fully.

OpenEmbedded is a full Linux distribution, providing the standard set of system libraries that one expects in a Linux system. Components ranging from low-level utilities to full-blown desktop technologies are readily available, and components can easily be added for the case where you want to include your own proprietary offerings. Hardware support is extensive, especially for popular commodity hardware; however, be prepared to bring in a local kernel developer to help you get your specialized device running.

For those wary of “community” projects, comfort can be had in the fact that MontaVista Linux 6 is a derivative of OpenEmbedded. Added value is provided in the form of “recipes” for building typical configurations; pre-built binaries and toolchains; and mirrored source code repositories to ensure extended availability. MontaVista Linux fills a niche for when your customers want “brand-name” and you want “community.”

Why choose OpenEmbedded? You want a “free” distribution, both in terms of license and cost; you enjoy working with the community; and you value having a full Linux distribution at your disposal. OpenEmbedded can give you the feeling of “do-it-yourself” without having to sacrifice support in the form of a community to fall back on.

The selection of a distribution will not make or break your project; yet, the decision is one with which you will have to live for the lifetime of the product. The happiness of your developers, the support requirements of your organization, and the extensibility of your product are the parameters at stake in the decision. A professional assessment of your project may assist you in making the best decision that takes into account the interests and values of all implicated parties. This is where the services of consultancies such as South Pole find their niche in the distribution ecosystem!

Asterisk and Record-Route

We recently ran into problems with our Asterisk-based telephony: a forwarded connection through our SIP provider would suddenly be disconnected after 20 seconds. Traces showed that our Asterisk proxy was not respecting the Record-Route header and was sending its ACKs to the endpoint and not to the proxy.
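For context, this is roughly what respecting Record-Route means for the side that sends the ACK. As I read RFC 3261, the caller builds its route set from the Record-Route headers of the response (in reverse order), and in-dialog requests such as the ACK go to the first entry of that route set instead of straight to the Contact address. The sketch below assumes every recorded proxy is a loose router (";lr"); the struct and function names are hypothetical and have nothing to do with Asterisk's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal dialog state for the calling side; field names are hypothetical. */
struct sip_dialog {
    const char *remote_target;  /* URI from the Contact header of the 200 OK */
    const char *route_set[8];   /* Record-Route URIs, already reversed */
    size_t route_count;
};

/* Return the URI to which the next in-dialog request (e.g. the ACK)
 * should be sent, assuming loose routing throughout. */
static const char *next_hop(const struct sip_dialog *d)
{
    assert(d != NULL);
    if (d->route_count > 0)
        return d->route_set[0];  /* go through the recorded proxy */
    return d->remote_target;     /* no proxy asked to stay in the path */
}
```

The behaviour we saw in the traces corresponds to always taking the second branch, even though the provider's proxy had added itself with Record-Route.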

It seems that this has been a known issue in Asterisk for a while and a fix (if I am interpreting the mailing lists correctly) was merged in January 2008. As such, we upgraded our Asterisk installation to version 1.4.18, and presto, problem solved.

There’s still no explanation as to why this problem suddenly appeared or why this ever worked at all, and our SIP provider swears they’ve changed nothing; nonetheless, this little issue aside, Asterisk has provided us with a robust telephony solution which, running on a WL-500g, has a small overall footprint and which we’d recommend to anyone. Low cost, robust, and fully-featured; hats off to the Asterisk community who are doing their part in making free software ubiquitous.

FOSDEM 2008

At the last minute, I made it down to FOSDEM in Brussels. This was my first developer conference and I can only say that it exceeded my expectations in every way. It was wonderful to be in a place with so many like-minded people and to validate what I guess I already knew anyway: free software is in good hands and the future looks bright.

Kudos to the organizers and I hope to see you again in 2009.

Debugging a Kernel Oops

After an Al Viro post to LKML, I made an effort to check out kerneloops.org. Inspired by Al, I picked an oops and gave it a go.

With four reports, __lock_acquire looked interesting: unable to handle kernel paging request at virtual address fffffffa. Additionally, it appeared at first glance to follow the same pattern as the problem outlined by Al, so it seemed to be a reasonable starting point. The call trace leading up to __lock_acquire is as follows:

[] lock_acquire+0x78/0xa0
[] journal_start+0xcd/0x100
[] journal_force_commit+0xd/0x30

The requested address fffffffa (-10) is, as in the case outlined by Al, close to zero and would seem to indicate an ERR_PTR() pointer. Checking errno-base.h, we see that this would seem to indicate ECHILD; however, there does not seem to be any way to get ECHILD from the functions in the call trace.

So if we are not dereferencing an ERR_PTR(), what is happening here? Consider the following:

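/* some_t stands in for any type; it is not a kernel type. */
typedef int some_t;

struct ex {
    some_t *a;
    some_t *b;
};

struct ex *handle;
some_t **member;    /* &handle->b is the address of a some_t* */

handle = NULL;
member = &handle->b;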

Even though handle is set to NULL (an invalid memory reference), taking the address of one of its members does not dereference the pointer; it is just pointer arithmetic. In this example, the struct member b is offset sizeof(void*) from the start of the struct, so member will be set to NULL + sizeof(void*).
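The same point can be checked with offsetof, which gives the distance of a member from the start of its struct. This is only a sketch: it uses void* members in place of some_t*, and the helper name is made up. The address is computed with integer arithmetic so that nothing is ever dereferenced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ex {
    void *a;
    void *b;
};

/* b sits one pointer-width past the start of the struct. */
static_assert(offsetof(struct ex, b) == sizeof(void *),
              "unexpected padding between a and b");

/* Address that &p->b evaluates to for a given base address. */
static uintptr_t address_of_b(uintptr_t base)
{
    return base + offsetof(struct ex, b);
}
```

With a base of 0, the result is sizeof(void*), that is, 4 on a 32-bit machine and 8 on a 64-bit one.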

This is what is happening in this kernel oops.

err = start_this_handle(journal, handle);
if (err < 0) {
    jbd_free_handle(handle);
    current->journal_info = NULL;
    handle = ERR_PTR(err);
}

lock_acquire(&handle->h_lockdep_map, 0, 0, 0, 2, _THIS_IP_);

Looking in jbd.h, we can see that the member h_lockdep_map has offset 20 from the start of the handle_t struct that it is encapsulated in. Thus, the value of handle is -10 - 20 = -30, which is -EROFS; this error can, in fact, be returned by start_this_handle, so this is a reasonable source for the oops in question. The fix, of course, is trivial: don’t call lock_acquire after setting handle with ERR_PTR.
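To make the arithmetic concrete, here is a userspace sketch of the same situation. The err_ptr and is_err helpers are stand-ins modelled on the kernel's ERR_PTR() and IS_ERR(); struct fake_handle is a hypothetical layout chosen only so that the lockdep map lands at offset 20, and it is not the real handle_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EROFS_VALUE 30      /* EROFS on Linux */
#define MAX_ERRNO   4095

/* Userspace stand-ins for the kernel's ERR_PTR()/IS_ERR() helpers. */
static void *err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

static int is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical layout: 20 bytes of other members precede the lockdep map. */
struct fake_handle {
    char other_members[20];
    int  h_lockdep_map;
};

static_assert(offsetof(struct fake_handle, h_lockdep_map) == 20,
              "layout assumption does not hold");

/* Address that &handle->h_lockdep_map evaluates to, without dereferencing. */
static intptr_t lockdep_map_address(const void *handle)
{
    return (intptr_t)handle +
           (intptr_t)offsetof(struct fake_handle, h_lockdep_map);
}
```

Calling lockdep_map_address(err_ptr(-EROFS_VALUE)) yields -30 + 20 = -10, the same arithmetic as above run in the other direction. The fix then amounts to testing the handle with an is_err-style check before touching h_lockdep_map at all.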