PermaStore

What is the PermaStore project?
The PermaStore system was developed to address emerging needs in stateless programming, particularly for the World-Wide Web. CGI (Common Gateway Interface) programming presents a significant challenge to developers of interactive applications: CGI processes are independent programs with no model for shared data. Yet interactive web sites require that users perceive consistent data within, and even between, web sessions -- imagine an airline reservation system in which the user must log on and recite their travel plans at each step of the process!

The PermaStore system is designed to be a more robust programming model to meet the needs of emerging applications, and to facilitate further evolution of web-based programming. In particular, the most critical aspect of software migration is the preservation and compatibility of existing data. This research addresses these issues by focusing on three primary goals:

  1. Provide a robust data storage model that enables programmers to rapidly develop applications without expending unnecessary effort to store and retrieve complex data structures;
  2. Sufficiently dissociate data I/O from application logic to facilitate migration between rapidly evolving programming environments; and
  3. Maintain or improve the performance of data operations, as perceived by the end user.

How does programming with PermaStore differ from typical CGI programming?
The figure below depicts a traditional data flow model for file-based software. Internal data structures that need to exist only for the lifetime of the current process may be passed around in their native format (typically using pointers, references or handles). Data that must be available to other programs, or that must otherwise persist beyond the lifetime of the current program, are passed through I/O routines that convert between the native data structure and a "serialized" stream of bytes that can be written to files. These I/O routines usually depend on the application's data format, and must be developed separately for every data structure to be stored.
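
To make the cost of this traditional model concrete, here is a minimal sketch of a hand-written pair of I/O routines for a single data structure. The reservation record, its byte layout, and the function names are hypothetical and chosen only for illustration; they are not taken from PermaStore.

```python
import io
import struct

# Hypothetical record: a passenger name plus a list of flight numbers.
# The byte layout below is an assumption made for this example only.

def write_reservation(stream, passenger, flight_numbers):
    """Serialize one reservation to a binary stream."""
    name = passenger.encode("utf-8")
    stream.write(struct.pack("!H", len(name)))            # 2-byte name length
    stream.write(name)                                    # name bytes
    stream.write(struct.pack("!H", len(flight_numbers)))  # 2-byte flight count
    for number in flight_numbers:
        stream.write(struct.pack("!I", number))           # 4 bytes per flight


def read_reservation(stream):
    """Rebuild a reservation written by write_reservation."""
    (name_length,) = struct.unpack("!H", stream.read(2))
    passenger = stream.read(name_length).decode("utf-8")
    (count,) = struct.unpack("!H", stream.read(2))
    flights = [struct.unpack("!I", stream.read(4))[0] for _ in range(count)]
    return passenger, flights
```

Both routines know the exact shape of the record. Adding a field, or storing a second kind of record, means writing and debugging another pair like this one.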

Because CGI processes are short-lived but increasingly require persistent data, a more efficient data model is needed. Such a model should separate application logic from the storage and retrieval of state data. The following figure depicts the PermaStore data flow. Applications use a data-format-independent I/O client that serializes arbitrarily complex data structures and speaks a language- and location-independent communication protocol for storage and retrieval. This communication is implemented over standard TCP/IP sockets, and a fine-tuned data server ensures reliable and efficient persistence.
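
The general shape of this data flow can be sketched as follows. This is not the PermaStore protocol, which this page does not specify. The length-prefixed JSON framing, the "put" and "get" operations, and all names here are assumptions made for illustration, and the server keeps data in memory where a real data server would also write it to disk.

```python
import json
import socket
import socketserver
import struct
import threading


def send_message(sock, obj):
    """Send one JSON value, preceded by its 4-byte length."""
    payload = json.dumps(obj).encode("utf-8")
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, size):
    """Read exactly `size` bytes, or raise if the peer disconnects."""
    chunks = []
    while size > 0:
        chunk = sock.recv(size)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        size -= len(chunk)
    return b"".join(chunks)


def recv_message(sock):
    """Receive one length-prefixed JSON value."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length).decode("utf-8"))


class StoreHandler(socketserver.BaseRequestHandler):
    """Serve one request per connection, as a short-lived client would issue."""

    def handle(self):
        request = recv_message(self.request)
        with self.server.lock:
            store = self.server.store
            if request["op"] == "put":
                store[request["key"]] = request["value"]
                reply = {"ok": True}
            else:  # any other operation is treated as "get"
                key = request["key"]
                reply = {"ok": key in store, "value": store.get(key)}
        send_message(self.request, reply)


class StoreServer(socketserver.ThreadingTCPServer):
    allow_reuse_address = True
    daemon_threads = True

    def __init__(self, address):
        super().__init__(address, StoreHandler)
        self.store = {}               # in-memory only in this sketch
        self.lock = threading.Lock()


def start_server(host="127.0.0.1", port=0):
    """Start the data server in a background thread; port 0 picks a free port."""
    server = StoreServer((host, port))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call(address, request):
    with socket.create_connection(address) as sock:
        send_message(sock, request)
        return recv_message(sock)


def put(address, key, value):
    """Client side: store any JSON-representable value under `key`."""
    return call(address, {"op": "put", "key": key, "value": value})["ok"]


def get(address, key):
    """Client side: fetch the value stored under `key`, or None if absent."""
    reply = call(address, {"op": "get", "key": key})
    return reply["value"] if reply["ok"] else None
```

An application process would call put(address, "session:42", state) before exiting, and a later process would call get(address, "session:42") to pick up where the first left off. Neither contains any code that depends on the shape of the state, and the server's address can point at another machine.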

What are the advantages of the PermaStore architecture?
This client/server model has many advantages. First, it enables the programmer to focus on a proper data model, rather than spending countless hours programming and debugging file I/O routines. If a hash table of pointers to arrays is appropriate to the application data, then the developer should not be burdened with writing functions that traverse the data structure in order to store and retrieve the necessary information. Second, a storage management process can be built and optimized specifically for CGI environments. Such a process may be tuned to the application at hand, or provide general-purpose performance enhancements such as data caching, prefetching, and distributed-protocol semantics. Third, if necessary, the storage mechanism can reside on hardware dedicated to that purpose rather than relying on the sometimes insecure and often overloaded web server.
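
The first advantage can be illustrated with a generic serializer. The sketch below uses Python's standard pickle module as a stand-in for a data-format-independent client; pickle is Python-specific and so is not the language-independent protocol described above, and the table contents and function names are hypothetical.

```python
import pickle


def snapshot(data):
    """Serialize any picklable structure without structure-specific code."""
    return pickle.dumps(data)


def restore(blob):
    """Rebuild the structure; only use on data from a trusted source."""
    return pickle.loads(blob)


# A hash table whose values are references to arrays. Two keys share one array.
shared_leg = [101, 202]
table = {
    "outbound": shared_leg,
    "return": [303],
    "alias_of_outbound": shared_leg,
}

restored = restore(snapshot(table))
```

The serializer walks the table itself, so the application contains no traversal code. It also preserves the shared reference: in the restored table, "outbound" and "alias_of_outbound" still refer to a single array rather than to two copies.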

Where can I find more information?
The PermaStore system was developed by Michael Stangel under the guidance of Professor Donna J. Brown. A PostScript version of the thesis is available here. Professor Brown can be reached at djb@uiuc.edu.