Multi-Thread Debugging
This is a list of multithreaded debugging projects that run on AIX.
Publications:
Abstracts from a Special Issue of: The Journal on Testing and Debugging of Distributed and Parallel Software
High Performance Debugging Forum Archive (by Thread)
General Programming Concepts: Writing and Debugging Programs - for AIX
Developing Multi-Threaded Program Debuggers.
Writings on Posix by Ulrich Drepper
This page mainly provides access to some of the documents I have written.
Description of the POSIX signal model
The POSIX signal model isn't simple. To get the kernel people to look at the issues I wrote this summary.
Requirements of POSIX thread on the kernel
Written to answer the frequently asked question about what the kernel must provide to support a POSIX-compliant thread implementation. The main requirement, that an M-on-N implementation is needed, fortunately turned out not to be true. Therefore, all the extra complexity involved with scheduler activations is not needed.
Native POSIX Thread Library (PDF)
This is the white paper describing the actual implementation of the thread library for which the requirements are spelled out in the previous document. Several things are very different, including the most fundamental ones. This means we learned something during the implementation.
Update 2002-1-29: I've updated some of the text and added measurements.
POSIX is actually a pretty modular standard. A basic set of interfaces always has to be present but the rest is optional. These optional features are organized in option groups, which are listed and briefly described in this document.
Quite an old document as well, this describes the design of a debug environment which is useful in very distributed and uptime-critical environments. If an error has to be analyzed the first time it happens, this might be a solution.
Parallel Code
The Parallel Tools Consortium
The Parallel Tools Consortium (Ptools) is a special-interest group that
brings together tool users, developers, and researchers with the goal
of improving the
usability and availability of parallel tools.
Ptools has three primary roles:
1. Ptools provides a forum for interactions involving tool users, developers, and researchers. Creates opportunities for dialog between tool users and tool developers to identify user needs and how tools can be made more responsive to them. Promotes discussion and technical exchanges among tool researchers and developers from different organizations.
2. Ptools promotes the development and dissemination of usable tools. Encourages and facilitates projects to develop parallel tools that respond to particular user needs and can be made freely available on multiple computer platforms. Assists the dissemination of parallel tools by publicizing information on their availability.
3. Ptools serves as a liaison with other special-interest groups and standards efforts. Provides input on behalf of tool users and developers to groups defining standards that relate directly or indirectly to parallel tools. Communicates information about standards and other developments of interest to tool users, developers, or researchers.
Welcome to the home page of the High Performance Debugging Forum (HPDF). The goal of HPDF is to define a useful and appropriate set of standards relevant to debugging tools for high-performance computers. HPDF efforts focus on parallel systems being used for research and production in the HPC arena. I am sorry to say the HPD reference implementation is a dead project due to lack of funding. However, I believe Etnus (www.etnus.com) has incorporated as much of the standard as they could into the new command-line interface to TotalView 4.0.
The Consortium was established in November 1993.
Etnus provides commercial debugging software. They provide free trial software at their website.
University of Illinois Parallel Programming Laboratory Downloads
Libraries and Algorithms
Standard Library for Parallel Programming
Reusable Libraries and Software Engineering for Parallel Computing
What is Converse ?
Converse is an interoperable, message-driven parallel runtime
system. New parallel programming paradigms often face the difficulty of
user acceptance. Converse mitigates this difficulty by allowing
multiple paradigms to coexist in a single parallel application.
Converse is a component-based, portable run-time system that allows
easy implementation of
run-time systems of novel parallel languages. It provides support for
user-level threads, communication, shared memory and a collection of
useful run-time libraries. Several languages/parallel paradigms have
been implemented using Converse, and demonstrated to interoperate in
parallel applications.
What is Charm++ ?
Charm++ is a parallel extension to C++ that has been developed at the Parallel Programming Laboratory over the past several years. Charm++ is a data-driven (actor-like) language. It supports three kinds of parallel objects: chares (individual actors), chare groups, and chare arrays. The execution of objects' methods is triggered by the availability of messages (method invocations), under the control of a prioritized scheduler. The system supports automatic load balancing. This is a mature and robust programming system with which several major applications have been developed.
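The message-driven execution model described above can be illustrated with a small sketch. This is not Charm++ code and does not use any Charm++ API: the names `Scheduler`, `Accumulator`, `send`, and `run` are hypothetical, and the convention that a lower number means higher priority is an assumption of this example. It only shows the idea that method invocations are queued as messages and delivered in priority order.

```python
import heapq
import itertools


class Scheduler:
    """Toy prioritized, message-driven scheduler (illustrative only)."""

    def __init__(self):
        self._queue = []               # heap of (priority, seq, target, method, args)
        self._seq = itertools.count()  # tie-breaker: FIFO order within one priority

    def send(self, target, method, *args, priority=0):
        # Assumption of this sketch: a lower number means a higher priority.
        heapq.heappush(
            self._queue, (priority, next(self._seq), target, method, args)
        )

    def run(self):
        # Deliver queued messages until none remain.
        # Each delivery is a method invocation on the target object.
        while self._queue:
            _, _, target, method, args = heapq.heappop(self._queue)
            getattr(target, method)(*args)


class Accumulator:
    """Stand-in for a single actor-like object; hypothetical example class."""

    def __init__(self, log):
        self.log = log
        self.total = 0

    def add(self, value):
        self.total += value
        self.log.append(("add", value))

    def report(self):
        self.log.append(("report", self.total))


def demo():
    log = []
    sched = Scheduler()
    acc = Accumulator(log)
    # The report is sent first but has lower priority, so it runs last.
    sched.send(acc, "report", priority=5)
    sched.send(acc, "add", 2, priority=1)
    sched.send(acc, "add", 3, priority=1)
    sched.run()
    return log
```

Calling `demo()` shows that delivery order follows priority rather than send order. A real system distributes objects over many processors and adds load balancing, which this single-process sketch leaves out.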
Charm++ offers a variety of parallel debugging options, from the extremely basic to the extremely sophisticated. The traditional debugging methods, such as logging (via the CkPrintf routine) and interactive debugging (via the "++debug" command line option) are supported under Charm++.
Charm++ also offers several features designed to
simplify application development. Linking with "-memory
paranoid" checks all dynamic heap allocation calls for common
errors, such as double-delete, random-delete, read-after-delete, and buffer
over- and under-write errors. Charm++, when compiled without
"-DCMK_OPTIMIZE", contains hundreds of assertions to catch
invalid parameters and uninitialized data passed to API routines.
We are working on a sophisticated parallel debugger, with the ability to set breakpoints, examine variables, objects, and messages across the entire machine. The design of this debugger is described in the paper below.
Debuggers
A Thread Debug Interface (TDI) for Implementations of the POSIX Threads
TDI - A Thread Debug Interface for Pthreads Implementations is an enhancement of the popular gdb debugger for extensive thread debugging. It handles FSU Pthreads, MIT Threads and LinuxThreads.
GDB, the GNU Project debugger
GDB, the GNU
Project debugger, allows you to see what is going on `inside' another
program while it executes -- or
what another program was doing at the moment it crashed.
GDB can do four main kinds of things (plus other things in support of these) to help you catch bugs in the act:
- Start your program, specifying anything that might affect its behavior.
- Make your program stop on specified conditions.
- Examine what has happened, when your program has stopped.
- Change things in your program, so you can experiment with correcting the effects of one bug and go on to learn about another.
The program being debugged can be written in C, C++, Pascal, Objective-C (and many other languages). Those programs might be executing on the same machine as GDB (native) or on another machine (remote). GDB can run on most popular UNIX and Microsoft Windows variants.
The GNU Project was launched in 1984 to develop a complete Unix-like operating system which is free software: the GNU system. (GNU is a recursive acronym for "GNU's Not Unix"; it is pronounced "guh-noo".) Variants of the GNU operating system, which use the kernel Linux, are now widely used; though these systems are often referred to as "Linux", they are more accurately called GNU/Linux systems.
Detail: Debugging with GDB
------------------------------------------------------------------------
trivial gdb
- Trivial GDB (tgdb) is a library for making front ends to GDB using a simple API. Included with tgdb is a lightweight but fully functional curses front end called cgdb.
------------------------------------------------------------------------
- Thread-GDB - Extend the multi-thread capabilities of the gdb debugger.
This targets POSIX threads implementations, although it attempts to remain neutral with respect to the threads implementation. Creating a robust environment for thread debugging is the goal.
Development Status: 1 - Planning
------------------------------------------------------------------------
DDD - Data Display Debugger
GNU DDD is a graphical front-end for command-line debuggers such as GDB, DBX, WDB, Ladebug, JDB, XDB, the Perl debugger, or the Python debugger.
Besides ``usual'' front-end features such as viewing source texts, DDD
has become famous through its interactive graphical data display, where
data structures are displayed as graphs.
BASH with debug (rebash) and more
- These contain patched sources to bash that enable better debugging support as well as improved error reporting.
One additional feature is a customizable timestamped history.
In addition, this project contains the most comprehensive source-code debug.
Ganglia
- Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids.
It is based on a hierarchical design targeted at federations of clusters. Supports clusters up to 2000 nodes in size.
Development Status: 5 - Production/Stable
AIX Performance Monitor
AIX system performance monitor tools.
Displays or records the AIX kernel performance parameters. This is the 'monitor' program by Jussy Mäki and Marcel Mol with new features (support for RRDtool) added.
libCapsiNetwork
- libCapsiNetwork is a C++ network library to allow fast development of server daemon processes.
Development Status: 4 - Beta
Recycle Logs
- A bane of systems administration work is managing log files: system log messages (from syslogd), mail log messages (from sendmail), printer spooling error logs ...
This is a configuration-based program to help manage and recycle files of this sort.
Development Status: - Production/Stable
Programming Language: Perl
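As a rough illustration of what recycling a log file involves, here is a minimal sketch. It is not the Recycle Logs program (which is written in Perl and driven by a configuration file). The function name `recycle`, the `keep` parameter, and the numbered-suffix naming scheme (`app.log.1`, `app.log.2`, ...) are assumptions made for this example.

```python
import os
import tempfile


def recycle(path, keep=3):
    """Rotate path -> path.1 -> path.2 ..., keeping at most `keep` old copies."""
    # Drop the oldest generation so the shift below stays within `keep` copies.
    oldest = f"{path}.{keep}"
    if os.path.exists(oldest):
        os.remove(oldest)
    # Shift the remaining generations up by one, oldest first.
    for n in range(keep - 1, 0, -1):
        src = f"{path}.{n}"
        if os.path.exists(src):
            os.replace(src, f"{path}.{n + 1}")
    # The live log becomes generation 1.
    if os.path.exists(path):
        os.replace(path, f"{path}.1")
    # Recreate an empty live log so writers that reopen by name find it.
    open(path, "w").close()


def demo():
    # Write two successive logs in a scratch directory and recycle after each.
    with tempfile.TemporaryDirectory() as d:
        log = os.path.join(d, "app.log")
        for text in ("first", "second"):
            with open(log, "w") as f:
                f.write(text)
            recycle(log, keep=2)
        names = sorted(os.listdir(d))
        with open(log + ".1") as f:
            newest = f.read()
        with open(log + ".2") as f:
            oldest = f.read()
        return names, newest, oldest
```

A daemon that keeps its log file open (syslogd, for example) usually has to be told to reopen the file after rotation. This sketch does not handle that step.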
socat
- socat is a relay for bidirectional data transfer between two independent data channels. Each of these data channels may be a file, pipe, device (terminal or modem), socket (UNIX, TCP, UDP, IP6, raw), a file descriptor, a program etc.
Development Status: - Production/Stable
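The idea of a bidirectional relay between two data channels can be sketched as follows. This is not socat itself and covers none of its address types or options. It assumes both channels are already-connected stream sockets, and the names `pump`, `relay`, and `demo` are hypothetical. Each direction is copied by its own thread, and end-of-stream on one side is passed on to the other.

```python
import socket
import threading


def pump(src, dst):
    """Copy bytes from src to dst until src reports end-of-stream."""
    while True:
        data = src.recv(4096)
        if not data:
            break
        dst.sendall(data)
    # Pass end-of-stream on to the other side; ignore an already-closed peer.
    try:
        dst.shutdown(socket.SHUT_WR)
    except OSError:
        pass


def relay(a, b):
    """Start one copying thread per direction and return both threads."""
    threads = [
        threading.Thread(target=pump, args=(a, b)),
        threading.Thread(target=pump, args=(b, a)),
    ]
    for t in threads:
        t.start()
    return threads


def demo():
    # Two socket pairs stand in for the two independent channels.
    left, relay_left = socket.socketpair()
    relay_right, right = socket.socketpair()
    threads = relay(relay_left, relay_right)

    left.sendall(b"ping")          # travels left -> relay -> right
    got_right = right.recv(4096)
    right.sendall(b"pong")         # travels right -> relay -> left
    got_left = left.recv(4096)

    # Closing both write sides lets both copying threads finish.
    left.shutdown(socket.SHUT_WR)
    right.shutdown(socket.SHUT_WR)
    for t in threads:
        t.join()
    for s in (left, relay_left, relay_right, right):
        s.close()
    return got_right, got_left
```

Using one thread per direction keeps the sketch short. A single loop built on `select` would be an equally reasonable design.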
AIX Perl Modules
- The AIX:: namespace. Perl modules that deal with AIX-specific issues.
Development Status: - Alpha