Software Re-Engineering – Screen-Scraping Techniques

Screen Scraping

Screen scraping is the practice of capturing data from a system or program by reading the contents of the screens that a legacy application displays. The term is also often used for parsing the HTML of generated web pages with programs designed to extract particular patterns of content. It is essentially an ad hoc technique, and it is very likely to break on even minor changes to the format of the data being read. When it is applied to a mainframe, the middle tier needs a tool that captures the mainframe screens and communicates with the transaction manager (e.g., CICS, IMS/DC).
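The idea can be illustrated with a minimal Python sketch. It assumes the legacy screen has already been captured as plain text and that each value sits at a fixed row and column. The screen layout, the field names and the scrape function are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str
    row: int     # 0-based row on the captured screen
    col: int     # 0-based starting column of the value
    length: int  # number of characters reserved for the value


# Hypothetical layout: every value starts at column 12 of its row.
FIELDS = [
    Field("account", 1, 12, 8),
    Field("name", 2, 12, 20),
    Field("balance", 3, 12, 10),
]


def scrape(screen: str, fields=FIELDS) -> dict:
    """Cut each field out of the screen text by its fixed position."""
    rows = screen.splitlines()
    result = {}
    for f in fields:
        # Pad short rows so slicing never runs past the end of the line.
        line = rows[f.row].ljust(f.col + f.length)
        result[f.name] = line[f.col:f.col + f.length].strip()
    return result


# Made-up screen image; labels are padded to 10 characters.
SAMPLE_SCREEN = "\n".join([
    "CUSTOMER INQUIRY",
    f"{'ACCOUNT':<10}: 00123456",
    f"{'NAME':<10}: JANE DOE",
    f"{'BALANCE':<10}: 1520.75",
])

print(scrape(SAMPLE_SCREEN))
```

Because the values are located purely by position, lengthening a single label on the screen shifts the value and the scraper silently returns the wrong text, which is the fragility described above.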

 

Tools of this kind are available on the market, for example from PeerLogic / Critical Path (erstwhile uniKix). This approach requires no modification to the legacy mainframe source programs, nor does it require any special product on the mainframe side. The HTML pages generated at the middle tier from the mainframe screens can be modified to improve the look and feel and to incorporate additional logic at the client end. In addition, several existing legacy screens can be combined on the HTML side to achieve consolidation at the presentation layer.
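As a rough sketch of consolidation at the presentation layer, the Python fragment below renders the fields taken from two legacy screens into a single HTML page. It does not reflect how any particular product works; the render_page function, the section headings and the sample field values are assumptions made for illustration.

```python
from html import escape


def render_page(title: str, screens: dict) -> str:
    """Build one HTML page from several scraped screens.

    `screens` maps a section heading to a dict of field name -> value.
    """
    parts = [f"<html><head><title>{escape(title)}</title></head><body>"]
    for heading, fields in screens.items():
        parts.append(f"<h2>{escape(heading)}</h2>")
        parts.append("<table>")
        for name, value in fields.items():
            # Escape everything that came from the legacy screen.
            parts.append(
                f"<tr><th>{escape(name)}</th><td>{escape(value)}</td></tr>"
            )
        parts.append("</table>")
    parts.append("</body></html>")
    return "\n".join(parts)


# Two hypothetical legacy screens shown together on one page.
page = render_page(
    "Customer overview",
    {
        "Account details": {"name": "SMITH & SONS", "account": "00123456"},
        "Open orders": {"count": "3"},
    },
)
print(page)
```

Each section of the page still depends on the layout of its source screen, so a change to either screen has to be carried through to the generated page.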

 

Screen scraping is an attractive solution because it requires no changes at all on the mainframe side. However, it potentially carries a higher maintenance overhead, because any change made to a mainframe screen must also be applied to the corresponding HTML page.

 

Web Enabling involving “wrapping”

Wrapping involves building a set of objects, called “wrappers”, around the existing application. The legacy application, encapsulated in this way, can be deployed in a multi-tier environment with middle-tier support for either a COM-based or an EJB-based architecture. Wrapping ensures that the existing application logic of the legacy system is not altered in any way. New applications interact with the legacy system through the APIs provided by the wrapper. Tools exist on the market (e.g., MERANT Net Express) for building wrappers as well as for deploying the wrapped legacy applications in a multi-tier environment.
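The wrapper idea can be sketched in a few lines of Python. The legacy routine here is a stand-in written for this example, with an assumed fixed-width record interface; the wrapper class and its method names are likewise hypothetical and do not represent the API of any commercial tool.

```python
from decimal import Decimal


def legacy_interest_calc(record: str) -> str:
    """Stand-in for an unmodified legacy routine (hypothetical).

    Input record: 10 digits of principal in cents, then 4 digits of
    yearly rate in basis points. Output: 10 digits of interest in cents.
    """
    principal_cents = int(record[0:10])
    rate_bp = int(record[10:14])
    interest_cents = principal_cents * rate_bp // 10000
    return f"{interest_cents:010d}"


class InterestWrapper:
    """Typed API around the legacy routine; the routine stays untouched."""

    def __init__(self, legacy=legacy_interest_calc):
        self._legacy = legacy

    def yearly_interest(self, principal: Decimal, rate_percent: Decimal) -> Decimal:
        # Translate typed arguments into the fixed-width record format.
        cents = int(principal * 100)
        basis_points = int(rate_percent * 100)
        record = f"{cents:010d}{basis_points:04d}"
        # Call the legacy logic and translate its reply back.
        reply = self._legacy(record)
        return Decimal(int(reply)) / 100


wrapper = InterestWrapper()
print(wrapper.yearly_interest(Decimal("1000.00"), Decimal("5.25")))
```

New code sees only yearly_interest; the record layout stays hidden inside the wrapper, so the legacy logic can be reused without being rewritten.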

For example, MERANT Java support enables you to send messages to Java objects from Object COBOL programs and classes. Java is supported through a Java domain in Object COBOL. The Java domain enables you to declare Java classes inside a COBOL program. You can then send messages to the Java classes. You can also send messages from Java classes to COBOL. The Java domain support works by creating a COBOL proxy object for each Java object. The class itself that you declare is a proxy for the static methods of the Java class.

Wrapping is generally useful for retaining complex, field-proven business logic, at the cost of reduced effectiveness when deploying in a multi-tier architecture. The maintenance overhead may not be significant, although cross-platform skills are ideally needed to support systems transformed in this way.

Data re-engineering / Re-architecture

This approach involves reorganizing some of the existing business logic residing on the mainframe and moving it to the middle tier. It requires transforming programs to a new structure in which files are replaced by message queues and "business functions" are more clearly separated as components. In the process, the application is partitioned between the legacy environment and the middle tier. Tools such as CGI and MQSeries are used for communication between the mainframe and the middle-tier application servers.
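The shift from files to message queues can be pictured with the Python sketch below. It uses the standard library's in-process queue.Queue purely as a stand-in for a real messaging product such as MQSeries, and the message fields, function names and approval rule are invented for illustration.

```python
import json
import queue

# In-process stand-ins for a request queue and a reply queue.
request_q = queue.Queue()
reply_q = queue.Queue()


def middle_tier_submit(order_id: str, amount: int) -> None:
    """Middle tier: post a request message instead of writing a file."""
    request_q.put(json.dumps({"order_id": order_id, "amount": amount}))


def approve_order(amount: int) -> str:
    """A separated "business function" (hypothetical rule)."""
    return "APPROVED" if amount <= 1000 else "REVIEW"


def legacy_worker_step() -> None:
    """Legacy side: take one request, apply the function, post a reply."""
    msg = json.loads(request_q.get(timeout=1))
    reply = {"order_id": msg["order_id"], "status": approve_order(msg["amount"])}
    reply_q.put(json.dumps(reply))


def middle_tier_receive() -> dict:
    """Middle tier: read the next reply message."""
    return json.loads(reply_q.get(timeout=1))


middle_tier_submit("A1", 250)
legacy_worker_step()
print(middle_tier_receive())
```

The two sides share only the message format, which is what allows the application to be partitioned between the legacy environment and the middle tier.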

 

This naturally requires a reasonably in-depth analysis of the existing application to decide how the business logic and associated data should be distributed. Although time-consuming, this provides a much better-architected solution that is amenable to the future evolution of technology.

 

Componentisation

 

 

 

 


Copyright 2005 LinksMultiple - all rights reserved. No part of this information may be copied or reproduced without prior written permission.