As embedded systems requirements grow and development cycles shrink, developers increasingly integrate commercial application programming interfaces (APIs): collections of functions that a software tool's publisher provides so that an application can use the tool's features. Programmers choose these pre-built libraries rather than code the required functionality by hand. Common examples are communication, messaging, database, and user-interface libraries. Such "middleware" APIs offer benefits in terms of convenience, portability, productivity and time-to-market.
But such libraries also often carry the risk of introducing destructive and devilishly hard-to-find programming errors. This risk stems from the manner in which commercial APIs are implemented. The software functions comprising APIs are nearly always data structure ignorant. Through their use of void pointers to pass data between the API library and the application program, they handle data without "knowing" the type of data on which they are operating.
However, the potential to create an API that catches a much wider range of programming mistakes, and reduces the API learning curve to boot, is built into the C and C++ languages. It is possible to create a programming interface that is data-aware, and thus self-diagnostic, by exploiting the function argument type-checking ability of every ANSI C/C++ compiler. C/C++ continues to grow as the preferred embedded systems development environment, so any improvement based on the environment's inherent capabilities has wide applicability.
Data management is often a core application requirement, and a number of commercial database APIs have emerged to meet this need while addressing embedded systems' performance and footprint requirements.
Historically, database SDKs have offered pre-defined, static programming interfaces to the services offered by the database. For embedded systems, most of these APIs are navigational, with functions that sort, store and retrieve data while navigating through the contents of a database one record at a time. That a developer must learn such a database library to complete a task has generally been viewed positively, or at least neutrally: while an API presents a learning curve that can increase project time, such memorization will potentially be useful in future projects. The common expectation is that this API can address virtually any type and organization of data.
Yet one significant downside is that for a pre-defined database function library to be able to manage data of any database definition, its interfaces must ignore the type of all data. In other words, the database programming interface must treat the data as opaque, or un-typed, data. In plain English, the database library cannot know whether company, people, network node, sensor, highway or any other specific type of information is being read from or written to the database. The programming interface can only know that some data is being read or written.
To accomplish this, all such databases use void pointers to pass data between the database library and the application program. A void pointer is a C/C++ language program variable that can legally point to any type of data. Void pointers are called un-typed because, exactly as the name implies, they carry no information about the type of data they point to.
Because void pointers have no type, neither the C/C++ compiler nor the database run-time can perform any validation on them. This opens the possibility for programming mistakes deriving from passing a pointer to the wrong type of data. The consequences of such mistakes range from nonsense data in the database to a corrupted (unusable) database to a crashed program.
An error in coding the function arguments will result in the database run-time putting data into a location in the database that it was not intended for (e.g. putting make data into a place the database has designated for model data). At best, this causes gibberish to be stored in the database. Worse, the database run-time could try to read beyond the end of the program's stack and cause a memory violation (i.e. a crash).
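The make/model mix-up can be sketched in a few lines of C. Everything in this sketch is hypothetical (the CarRecord layout, the field identifiers and db_put_field are invented for illustration and are not taken from any real database product); it only assumes that a static API moves data through a void pointer plus a byte count.

```c
#include <string.h>

/* Hypothetical record layout and field identifiers, for illustration only. */
enum { FIELD_SIZE = 16 };
enum { FIELD_MAKE, FIELD_MODEL };

typedef struct {
    char make[FIELD_SIZE];
    char model[FIELD_SIZE];
} CarRecord;

/* Hypothetical "static" API call: one function stores any field, so the
   data arrives through an untyped void pointer and a length. */
static int db_put_field(CarRecord *rec, int field_id, const void *data, size_t len)
{
    char *dst = (field_id == FIELD_MAKE) ? rec->make : rec->model;
    if (len > FIELD_SIZE)
        return -1;               /* the size can be checked, the type cannot */
    memset(dst, 0, FIELD_SIZE);
    memcpy(dst, data, len);
    return 0;
}

/* Copy-and-paste slip: the second call was meant to pass `model`. */
static CarRecord store_car(void)
{
    CarRecord rec;
    const char make[FIELD_SIZE] = "Ford";
    const char model[FIELD_SIZE] = "Focus";

    memset(&rec, 0, sizeof rec);
    db_put_field(&rec, FIELD_MAKE, make, sizeof make);
    db_put_field(&rec, FIELD_MODEL, make, sizeof make); /* accepted without complaint */
    (void)model;                                        /* never stored */
    return rec;
}
```

The code compiles without a diagnostic and runs without a fault, yet the record's model field ends up holding "Ford". Nothing in the interface gives the compiler or the library a way to notice.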
Reading data from the database entails other risks. An attempt to read data N bytes wide into a program variable that is less than N bytes wide will cause the database to overwrite random locations in memory. Critical data may be overwritten (such as the program call stack), causing a crash. It is also possible that important database run-time structures will be overwritten and lead to database corruption.
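The read-side hazard can be illustrated with a deliberately contained sketch. The db_get_field function is a hypothetical stand-in for a static API's read call, and a small byte arena plays the role of adjacent program memory so that the overrun stays inside one array and the demonstration remains well defined.

```c
#include <string.h>

/* Hypothetical read call: copies field_len bytes to dest. The void pointer
   says nothing about how large the destination variable really is. */
static void db_get_field(const void *field, size_t field_len, void *dest)
{
    memcpy(dest, field, field_len);
}

/* Stand-in for program memory: bytes 0..3 are the variable the caller
   meant to fill, bytes 4..7 are whatever happens to live next to it. */
typedef struct {
    unsigned char bytes[8];
} Arena;

static Arena read_too_much(void)
{
    static const unsigned char stored[8] = { 1, 2, 3, 4, 5, 6, 7, 8 }; /* 8-byte field */
    Arena a;

    memset(a.bytes, 0xAA, sizeof a.bytes);        /* neighbour starts out as 0xAA */
    db_get_field(stored, sizeof stored, a.bytes); /* caller assumed a 4-byte variable */
    return a;
}
```

After the call, bytes 4 through 7 of the arena no longer hold 0xAA. In a real program those bytes could belong to another variable, a saved return address, or the database run-time's own bookkeeping.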
How easy is it to introduce errors? In fact, such mistakes slip into code with great frequency through the labor-saving practice of cutting and pasting blocks of code. Any editing mistake related to a void pointer, whether it is passing a pointer to the wrong host program variable or passing a pointer to which insufficient memory has been allocated, is undetectable by the compiler or middleware. Regardless of the type of error, using void pointers to pass data strips the C/C++ compiler and middleware run-time of their capability to detect mistakes. The effort to correct these types of errors varies from minimal to monumental.
The self-diagnostic API
The potential to create a better database API, one that catches such programming mistakes and reduces the API learning curve to boot, has existed since function prototypes were first introduced to C and C++ in the 1980s: create a programming interface that is data-aware, and thus self-diagnostic, by exploiting the function argument type-checking ability of every ANSI C/C++ compiler.
Function prototypes are the "signature" of functions. A function prototype declares the name of a function, the number of arguments (parameters) to the function, the data type of each argument, and the data type of the return value of the function. If the actual use of a function doesn't match its signature, the compiler will emit a diagnostic message (always an error in C++; an error or a warning in C, depending on the compiler and its settings), and the offending code must be corrected before the program can be cleanly compiled.
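As a small illustration of what a prototype gives the compiler to work with, consider the sketch below. The average function and its arguments are invented for this example; the mismatched calls are left as comments so that the listing still compiles.

```c
#include <stddef.h>

/* Prototype: the function's name, its two arguments with their types,
   and the type of its return value. */
double average(const int *values, size_t count);

double average(const int *values, size_t count)
{
    double sum = 0.0;
    size_t i;

    if (count == 0)
        return 0.0;
    for (i = 0; i < count; i++)
        sum += values[i];
    return sum / (double)count;
}

static double prototype_demo(void)
{
    const int readings[4] = { 10, 20, 30, 40 };

    /* average(readings);           too few arguments: rejected           */
    /* average("10 20 30 40", 4);   pointer to char, not to int: diagnosed */
    return average(readings, 4);
}
```

Uncommenting either of the two mismatched calls produces a compiler diagnostic before the program ever runs, which is exactly the check a void pointer argument gives up.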
Exploiting the modern ANSI C/C++ compiler's capability for function prototyping requires us to abandon the old idea that a database programming interface must be a static library of functions that a programmer learns and then applies to every possible database design. Instead, the programming interface must be specific to each database design, and therefore aware of the data types of each particular database. In other words, the only way for the database function that populates a model record to mandate that model information is passed by the programmer is if that interface is derived from, and specific to, the database design in which model participates.
McObject's eXtremeDB, an in-memory database system (IMDS) for embedded systems, shows how a self-diagnostic API can be applied to embedded systems middleware. eXtremeDB has a small, static API for universal tasks (opening and establishing a connection to a database, beginning and ending transactions, etc.). However, the majority of the interface, the portion concerned with populating, searching, and reading data, is generated dynamically from the database definition.
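A minimal sketch of a design-specific interface follows, assuming a car database with make and model fields. The type and function names (MakeName, ModelName, TypedCar, Car_make_put, Car_model_put) are hypothetical and chosen only to show the idea: because each field has its own C type and its own function, handing make data to the model function is a type mismatch the compiler can see.

```c
#include <string.h>

/* Hypothetical, design-specific types: one distinct C type per field. */
typedef struct { char text[16]; } MakeName;
typedef struct { char text[16]; } ModelName;
typedef struct { MakeName make; ModelName model; } TypedCar;

/* One function per field, each accepting only that field's type. */
static void Car_make_put(TypedCar *rec, const MakeName *make)    { rec->make = *make; }
static void Car_model_put(TypedCar *rec, const ModelName *model) { rec->model = *model; }

static TypedCar store_car_typed(void)
{
    TypedCar rec;
    MakeName make;
    ModelName model;

    memset(&rec, 0, sizeof rec);
    memset(&make, 0, sizeof make);
    memset(&model, 0, sizeof model);
    strcpy(make.text, "Ford");
    strcpy(model.text, "Focus");

    Car_make_put(&rec, &make);
    /* Car_model_put(&rec, &make);   make passed as model: compiler diagnostic */
    Car_model_put(&rec, &model);
    return rec;
}
```

The price of this safety is that the functions cannot be written once and shipped in a static library; they have to be produced for each database design.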
eXtremeDB database users describe the database using the eXtremeDB database definition language (DDL) typed into a text file. A compiler, mcocomp, processes this DDL, validating its syntax and, if there are no errors, generating .c and .h files that developers include in their application projects. The .c and .h files define the programming interface for that unique database.
Within the generated files are function prototypes (.h file) and implementations (.c file) to create, search, and read every type of class and index that was defined by the database designer. Each interface is purpose-specific for a certain data element and operation; therefore, the element's type is accounted for in the interface definition.
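The idea of generating one typed function per data element can be imitated, on a very small scale, with the C preprocessor. This sketch is not mcocomp output and does not reproduce eXtremeDB's generated names or DDL; the CAR_SCHEMA list, the Car class and the _put/_get naming are all assumptions made for illustration, with a macro standing in for the schema compiler.

```c
#include <string.h>

/* Hypothetical schema: one entry per field (class, field name, C type). */
#define CAR_SCHEMA(X)      \
    X(Car, year,  int)     \
    X(Car, doors, short)   \
    X(Car, price, double)

/* Expand the schema into the record structure. */
#define GEN_MEMBER(cls, field, type) type field;
typedef struct { CAR_SCHEMA(GEN_MEMBER) } Car;

/* Expand the schema into one typed put and one typed get per field. */
#define GEN_ACCESSORS(cls, field, type)                       \
    static void cls##_##field##_put(cls *obj, type value)     \
    { obj->field = value; }                                   \
    static type cls##_##field##_get(const cls *obj)           \
    { return obj->field; }

CAR_SCHEMA(GEN_ACCESSORS)

static int schema_demo(void)
{
    Car c;

    memset(&c, 0, sizeof c);
    Car_year_put(&c, 2004);
    Car_doors_put(&c, 4);
    Car_price_put(&c, 15999.5);
    /* Car_year_put(&c, "2004");   a string where an int is required: diagnosed */
    return Car_year_get(&c) + Car_doors_get(&c);
}
```

Adding a field to the schema list yields a new pair of functions whose argument and return types match that field, so the interface tracks the database design rather than the other way around.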
eXtremeDB also builds on the advantage of exploiting the ANSI C function prototypes by providing a developer edition of the database library that includes extensive (and configurable) run-time checks for other types of programming mistakes that cannot be detected by function prototypes, such as an attempt to use a handle to an object outside the scope of a transaction, or to use a transaction handle that is not valid.
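Run-time checks of this kind might look something like the following sketch. It does not show eXtremeDB's actual library code or error codes; the Transaction and ObjectHandle structures, the DB_RUNTIME_CHECKS switch and the return values are hypothetical, and serve only to show how a developer build can catch misuse that a prototype cannot express.

```c
#include <stddef.h>

/* Hypothetical handle types and result codes, for illustration only. */
typedef struct { int active; unsigned id; } Transaction;
typedef struct { const Transaction *owner; unsigned owner_id; int value; } ObjectHandle;

enum { DB_OK = 0, DB_ERR_NO_TRANSACTION = -1, DB_ERR_STALE_HANDLE = -2 };

#ifndef DB_RUNTIME_CHECKS
#define DB_RUNTIME_CHECKS 1   /* developer build; define as 0 for a release build */
#endif

static int object_read(const ObjectHandle *h, int *out)
{
#if DB_RUNTIME_CHECKS
    /* The handle must belong to a transaction that is still open. */
    if (h == NULL || h->owner == NULL || !h->owner->active)
        return DB_ERR_NO_TRANSACTION;
    /* The handle must have been issued by this very transaction. */
    if (h->owner->id != h->owner_id)
        return DB_ERR_STALE_HANDLE;
#endif
    *out = h->value;
    return DB_OK;
}
```

With the checks compiled in, reading through a handle after its transaction has ended returns an error code instead of silently touching stale data; compiling them out removes the overhead once the application has been debugged.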
Intuitive interfaces lead to greater programmer productivity in the beginning stages of the project, and the benefit extends through the life of the software. Maintenance programmers who come onto the project find it much easier to read and understand purpose-specific functions than non-descriptive code based on obscure, static programming interfaces.
While a new interface emerges for each project, very simple rules govern its generation and use. Mastering the basics of generating and using this type of API can provide an even more powerful and flexible "tool for life" than learning the 100 to 250 functions of a static middleware API.
By Steven T. Graves, President and CEO, McObject LLC, Issaquah, WA