Chapter 1. Introduction

Subject of the Study and Motivation

In this study I propose a new approach, based on automatic code generation, to developing data-intensive web-based applications. I examine this approach to determine whether it simplifies and improves the development process for these applications.

The motivation for this thesis is based on my personal experience as a web developer. Having implemented numerous data-driven web-based applications, I noticed multiple recurring patterns in their implementation, which demonstrated a significant amount of repetitive work. Examining current research in this area revealed that researchers and developers alike are faced with a similar problem: many scholars agree that “web developers spend a significant amount of time for the construction of typical, standardized software modules” (Milosavljevic, Vidakovic, & Konjovic, 2002, p. 1).

Thus, the primary motivation for this study became the search for a way to simplify the development process of data-intensive web-based applications.

Research Hypothesis

This study examines the concept of specifying and automatically generating the code used to access data held in the system’s data storage component, such as a relational database.

A review of current research and similar code generation systems has shown that most approaches require the developer to specify the data access functionality in detail, i.e., each operation individually. The main argument of this thesis is that, contrary to existing research, specifying the data model of the application using a modeling approach similar to the entity-relationship model (Chen, 1976) is enough to automatically generate most of the required data access functionality.

The data access functionality generated from the data model alone will certainly not be an exhaustive list of all possible operations on this data; however, in my opinion, it will represent the functionality most commonly used in data-intensive web-based applications.

Therefore, I hypothesize that it is possible to build a code generator which will significantly improve development of data-intensive web-based applications by generating at least 50% of the data access code based on a specification of the application’s data model. To test this hypothesis I have built such a code generator and used it in developing the data layer for several “real world” applications.
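
As a rough illustration of the idea of deriving data access code from a data model alone, the following minimal Python sketch produces common SQL statements from a single entity description. It is not the code generator built for this study, which targets the .NET / SQL Server platform. The Entity class, the generate_crud_sql function, the Customer example, and the choice of five operations are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    """Hypothetical, simplified description of one entity in a data model."""
    name: str              # table name
    key: str               # primary key column
    attributes: list[str]  # remaining columns


def generate_crud_sql(entity: Entity) -> dict[str, str]:
    """Derive common data access statements from the entity description alone.

    No individual operation is specified by the developer; every statement
    is built from the entity's name, key, and attributes.
    """
    columns = [entity.key] + entity.attributes
    column_list = ", ".join(columns)
    # Named parameters in the @Name style used by SQL Server (illustrative).
    placeholders = ", ".join(f"@{c}" for c in columns)
    assignments = ", ".join(f"{c} = @{c}" for c in entity.attributes)
    key_filter = f"{entity.key} = @{entity.key}"
    return {
        "select_all": f"SELECT {column_list} FROM {entity.name}",
        "select_by_key": f"SELECT {column_list} FROM {entity.name} WHERE {key_filter}",
        "insert": f"INSERT INTO {entity.name} ({column_list}) VALUES ({placeholders})",
        "update": f"UPDATE {entity.name} SET {assignments} WHERE {key_filter}",
        "delete": f"DELETE FROM {entity.name} WHERE {key_filter}",
    }


# Example usage with a made-up entity.
customer = Entity(name="Customer", key="CustomerId", attributes=["Name", "Email"])
statements = generate_crud_sql(customer)
for operation, sql in statements.items():
    print(f"{operation}: {sql}")
```

In this sketch the developer writes only the entity description; the five statements follow from it mechanically. Operations that depend on application-specific logic would still have to be written by hand.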

Significance of the Thesis

This thesis offers the following contributions to current research:

  1. A code generator which has been successfully tested and is currently used in web application development to generate data access code for the Microsoft .NET / SQL Server platform. The generator produces database-level code in SQL and application-level code in C# or VB.NET.

  2. A data definition language for describing the data model of an application. It is a simplified and modified version of the entity-relationship model and has been successfully tested with the code generator.

  3. An approach that is, to the best of my knowledge, novel in this field: generating data access code without explicitly specifying each operation.

Structure of the Thesis

In Chapter 2, I will provide an overview of the research problem: I will briefly describe data-intensive web-based applications and some implementation issues caused by recurring functionality requirements. I will review automatic code generation as a possible solution to these issues, after which I will examine several existing code generators and approaches to modeling the data layer of such applications.

In Chapter 3, I will describe the study’s methodology. I will discuss the target application’s architecture, after which I will define the scope of the code to be generated by examining the data layer implementation and abstracting common functionality. I will conclude by describing the experiment itself, which consists of implementation tasks and testing tasks. The implementation tasks are describing the data model, defining rules for data access methods, and constructing the code generator. The testing tasks are applying the system to generate code for real applications and measuring the results of using this approach.

In Chapter 4, I will describe the results of the implementation and testing tasks listed in the previous chapter. I will introduce the simple XML-based data definition language I created to describe the application’s data model, after which I will define the common data access methods generated by my system. After providing a brief overview of the code generator’s implementation and its process model, I will examine the experimental part, including numeric data on how useful the code generator proved to be in terms of the amount of code generated and how much of that code was actually used by the application.

Chapter 5 will interpret the results of the experiment. I will discuss the major findings and whether this code generator improved the development process of data-intensive web-based applications. I will also discuss the lessons I have learned through this experiment, which include a review of the limitations of this code generation approach and the numerous tradeoffs I had to make, as well as possibilities for future improvements of such systems and for further research in this area.

 