Real-time Embedded Linux and POSIX RTOSs For Microcontrollers (MCUs)

Monday, July 23, 2007

gdb and multicore are mutually exclusive

I've spent the last few days looking at debugging technology for multicore chips, with a view towards high-volume real-time embedded applications. In this mix I'm including FPGA solutions, which will soon make it possible to have many cores on one chip, particularly given the Altera C to H compiler and Nios, as well as the offerings of their main competitor, Xilinx.

My starting requirement for a debugging solution is clear: a debugger which supports many processors and has a simple user interface. After the last Multicore Expo it was clear that heterogeneous processors were necessary, and that it was highly likely that collections of networked and shared-memory processors would exist as part of a complex multicore processor.

First, I think the case where many cores are on a chip requires some hardware assist on the chip to support debugging. Multicore register access and a stop-all-cores capability will be added to most chips, but this is largely unsuitable for many applications. Emulator people will be happy, but real applications will have to keep part of the system running, with some ability to control the rest of the application, to support debugging. Emulator-type solutions could do this, but with a severe performance penalty that gets worse as the number of cores grows (using traditional approaches).

To do this type of debugging, a kernel must be running on the chip and there has to be some common debugger interface that talks to the system under test. The connection could be multiple connections or a single multiplexed connection - it makes little difference in most applications. The debugger, on the other hand, must understand the processor type that it is debugging and talk to the common graphical user interface to provide debugging support across all the cores.
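To make the single multiplexed connection concrete, here is a minimal sketch in C of how debug traffic for several cores might share one link. The frame layout (one byte of core id, one byte of length, then the payload) and the names mux_encode and mux_decode are my own assumptions for illustration; they are not taken from gdb, remedy, or any real protocol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical frame layout (an assumption for illustration only):
 *   byte 0     core id the message belongs to
 *   byte 1     payload length in bytes (0..255)
 *   bytes 2..  payload
 */
enum { MUX_HEADER_LEN = 2, MUX_MAX_PAYLOAD = 255 };

/* Pack one debug message for core_id into out.
 * Returns the number of bytes written, or 0 if the payload is too
 * long or the output buffer is too small. */
size_t mux_encode(uint8_t core_id, const uint8_t *payload, size_t len,
                  uint8_t *out, size_t out_cap)
{
    assert(out != NULL);
    if (len > MUX_MAX_PAYLOAD || out_cap < MUX_HEADER_LEN + len)
        return 0;
    out[0] = core_id;
    out[1] = (uint8_t)len;
    if (len > 0)
        memcpy(out + MUX_HEADER_LEN, payload, len);
    return MUX_HEADER_LEN + len;
}

/* Unpack the frame at the front of in.
 * Returns the number of bytes consumed, or 0 if the buffer does not
 * yet hold a complete frame. On success *payload points into in. */
size_t mux_decode(const uint8_t *in, size_t in_len,
                  uint8_t *core_id, const uint8_t **payload, size_t *len)
{
    if (in_len < MUX_HEADER_LEN)
        return 0;
    size_t n = in[1];
    if (in_len < MUX_HEADER_LEN + n)
        return 0;
    *core_id = in[0];
    *payload = in + MUX_HEADER_LEN;
    *len = n;
    return MUX_HEADER_LEN + n;
}
```

With something like this, the host side reads frames off the one link and hands each payload to whatever per-core debugger state matches the core id. Using one connection per core instead would simply drop the core id byte, which is why the choice makes little difference to the rest of the design.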

I was looking at gdb to solve some of these problems, and it seems that it might do half a job in the case where all the cores run some huge kernel, but it is completely unsuitable in the case where many independent cores, each running a small kernel, are communicating to solve a problem.

For a start, it is highly complex for no reason that I can see. The remedy debugger meets almost all of the criteria for this job with 10% of the bulk of gdb and far less complexity. What went wrong here? How did the debugger get so large and complex with no apparent value added? It is not that remedy is much less functional. As a matter of fact, it offers support for 8 processor types, and the user can select any one of them for any core on the fly. Doing a port for a new processor is much simpler too: the disassembler is separate, the register understanding and memory/stack unravelling are done with a few simple routines which are easily understood, and the symbol table info is relatively standard.
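As a rough illustration of why a port can be small when it is structured this way, here is a sketch in C of a per-processor table of routines that can be attached to any core on the fly. This is not remedy's actual code or interface. The struct cpu_port, the two made-up processor types ("fixed32" and "var8"), and all function names are hypothetical, and a real disassembler is reduced here to a routine that only reports instruction length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one processor port: everything the
 * generic debugger core needs to know about a CPU type. */
typedef struct cpu_port {
    const char *name;      /* processor type name                      */
    size_t      num_regs;  /* number of registers in the register file */
    size_t      pc_index;  /* which register is the program counter    */
    /* Stand-in for a separate disassembler: length in bytes of the
     * instruction that starts at code. */
    size_t    (*insn_len)(const uint8_t *code);
} cpu_port;

/* Made-up CPU with fixed 4-byte instructions. */
static size_t fixed32_insn_len(const uint8_t *code)
{
    (void)code;
    return 4;
}

/* Made-up CPU whose low two opcode bits encode length minus one. */
static size_t var8_insn_len(const uint8_t *code)
{
    return (size_t)(code[0] & 0x03u) + 1;
}

const cpu_port PORT_FIXED32 = { "fixed32", 16, 15, fixed32_insn_len };
const cpu_port PORT_VAR8    = { "var8",     8,  7, var8_insn_len };

enum { MAX_REGS = 32 };

/* Debugger-side state for one core of the system under test. */
typedef struct core_session {
    const cpu_port *port;            /* processor type currently selected */
    uint32_t        regs[MAX_REGS];  /* last register values read         */
} core_session;

/* Select (or change) the processor type for a core on the fly. */
void core_select_port(core_session *c, const cpu_port *p)
{
    assert(c != NULL && p != NULL && p->pc_index < MAX_REGS);
    c->port = p;
}

/* Program counter of the core, located via the selected port. */
uint32_t core_pc(const core_session *c)
{
    return c->regs[c->port->pc_index];
}

/* Address of the instruction following the one at the current PC,
 * given the bytes of the current instruction. */
uint32_t core_next_pc(const core_session *c, const uint8_t *code)
{
    return core_pc(c) + (uint32_t)c->port->insn_len(code);
}
```

The generic code (core_pc, core_next_pc) never mentions a specific processor, so supporting a new CPU type in this sketch means writing one more table and its handful of routines. Whether a real debugger can stay this simple once symbol handling and stack unwinding are included is exactly the question I'm raising about gdb.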

It would be great if someone could provide some insight to me and others on this issue. I don't relish porting remedy to a raft of new processors that gdb already supports but it seems like it will be less work and better functionality in the long run.

I should add here that it is only in this specific multicore case that gdb seems less than attractive. For the case of debugging a single core, it does the job. The downside is that it likely takes months to do a new processor type instead of a few weeks because of all the extra complexity.