Step 1: Collect information about the source code
At the end of this step, you should know how:
- To find the components required for the analysis
- To check if they are correctly defined in order to be processed by the Mainframe Analyzer
- To identify the characteristics of the COBOL program
You should also have information about:
- The location of all required components
- The different types of COBOL programs
- The databases accessed by the programs
- The list of potential missing components
In Mainframe environments, source files do not have extensions. An extension must be added during or after the file transfer from the host machine to a Windows server. It is useful to use dedicated extensions for each type of source file. The following table shows the extensions you can use:
- COBOL programs: COB or CBL
- COBOL copybooks: CPY or COP
- CICS CSD flat file: CSD, CICS or TXT
In addition, it is very useful to isolate the source files in dedicated folders as soon as possible. The following figure shows an example of a source file tree:
A source file tree example
If this is done, the Delivery Manager Tool can be used more easily to collect the whole source code or the specific parts required for the analysis.
The Delivery Manager Tool, known as DMT, is introduced in its own documentation. Please refer to that documentation for more information about the features of DMT.
Distributing the source code
The larger the application you need to analyze, the longer the analysis can take to complete. Large applications, and consequently long analysis times, also increase the risk of memory-related errors occurring during the analysis. If this is the case, it may be necessary to distribute the source code over several packages in the Delivery Manager Tool, which will lead to the creation of multiple Analysis Units in CAST Management Studio. Follow the recommendations listed below:
1. Place all JCL components for analysis in the same package where possible, even if multiple packages contain the same JCL components (see 5 below).
2. Always include copybooks in packages that also contain COBOL programs.
3. COBOL programs can be distributed over multiple packages; CAST will then automatically associate linked objects. It is difficult to estimate the size of COBOL programs because the expanded version is analyzed, so you should try to distribute the COBOL programs evenly across the available packages.
4. CICS files can be placed in their own package.
5. For applications that contain COBOL programs, JCL components and IMS databases together, IMS files can be placed in one separate package, but it is preferable to place the COBOL programs and all the JCL components in the same package (see 1 and 2 above). If you split the COBOL programs over multiple packages (see 3 above), you should place all the accompanying JCL files in every package that contains COBOL programs.
6. Identical objects taken into account in multiple packages associated with the same Analysis Service (formerly known as the Local/Knowledge Base) will be merged.
For example, the source code could be distributed over four Analysis Units (AU) as follows:
- AU 1: all IMS files
- AU 2: all JCL files, the first part of the COBOL program files and all copybook files
- AU 3: all JCL files, the second part of the COBOL program files and all copybook files
- AU 4: all CICS files
All packages must be associated to the same parent Application if you want links between objects in the different associated Analysis Units to be detected.
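The recommendation to spread COBOL programs evenly across packages can be approximated with a simple greedy heuristic. The sketch below is only an illustration and not a DMT feature: it assumes that file size is an acceptable proxy for analysis cost (the expanded size is hard to estimate, as noted above), and the function name and the sample program sizes are hypothetical.

```python
import heapq

def distribute_evenly(sizes, package_count):
    """Assign programs to packages so that total sizes stay balanced.

    sizes: dict mapping program name -> size (for example, kilobytes on disk).
    Returns a list of package_count lists of program names.
    """
    if package_count < 1:
        raise ValueError("package_count must be at least 1")
    # Heap of (current total size, package index); the lightest package is on top.
    heap = [(0, index) for index in range(package_count)]
    heapq.heapify(heap)
    packages = [[] for _ in range(package_count)]
    # Place the largest programs first, each into the currently lightest package.
    for name, size in sorted(sizes.items(), key=lambda item: (-item[1], item[0])):
        total, index = heapq.heappop(heap)
        packages[index].append(name)
        heapq.heappush(heap, (total + size, index))
    return packages

# Hypothetical program sizes in kilobytes.
sample = {"PGMA": 900, "PGMB": 500, "PGMC": 400, "PGMD": 300, "PGME": 100}
for number, package in enumerate(distribute_evenly(sample, 2), start=1):
    print(f"Package {number}: {package}")
```

With the sample sizes, the two packages end up with totals of 1200 and 1000 kilobytes. Copybooks and JCL files still have to be added to each package according to the recommendations above.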
Components that are taken into account
COBOL analysis is the main part of the Mainframe Analyzer. Consequently, it is important to find all COBOL components and to qualify the COBOL source code properly.
You need to find the following information in order to know how many jobs to create and which parameters require activation:
- COBOL type
- Code format
- Source files location
The COBOL Analyzer is based on the COBOL ANSI 85 standard and several extensions (refer to the CAST AIP Release Notes to find out which extensions are taken into account). You need to know which COBOL type the application uses before running the analysis. The customer can provide you with this information.
Originally, COBOL is a strictly column-formatted programming language. A source code line is divided into five areas:
- Left area (columns 1 through 6)
- Indicator area (column 7): used for comment line, continuation character, debug line…
- Area A (columns 8 through 11): program division and section names; first level data definition; procedure division section and paragraph names
- Area B (columns 12 through 72): program clauses; second level (or more) data definition; procedure division statements
- Right area (columns 73 through 80)
Some extensions make it possible to avoid this strict column format. The Left area and the Right area disappear and the developer can use more columns for code (in Area B) than the fixed format allows. This is the case, for instance, with Micro Focus COBOL.
You need to search for the source files: where are the programs and where are the copybooks? It is quite common for the files to be located in several directories. For instance, you could have a directory for the batch programs, another for the transactional programs, another for the subprograms, another for the common copybooks, another for the transactional copybooks…
It is important to look at these directories and to check their content by answering the following questions:
- Is it really and only COBOL code?
- Are they programs or copybooks?
- Which column number is used for the Indicator Area?
- Do they have the Right Area?
Occasionally, source files written in programming languages other than COBOL are hidden among the COBOL source files. When you try to analyze them, the analyzer will produce syntax error messages or will ignore them. You will probably only find them after the first analysis. When you do, you can change their file extension (for instance, to "NOT-CBL") so that they are no longer analyzed.
The COBOL Analyzer needs to know the Indicator Area column number. You can easily find this information by opening a program source file and searching for the '*' column in comment lines. If there is no comment line then look for the "IDENTIFICATION DIVISION" statement. The Indicator Area column number is the column number of this statement minus 1.
It is possible that program source code does not have the Right Area. You can verify this by opening a source file. If there are statements after column number 72 then you will need to check the corresponding option ("End Column No Comments") in the analyzer wizard ("Options" page).
The following piece of code shows Micro Focus COBOL code without the Left area and the Right area. The Indicator area starts at column number 1:
The COBOL terminal format.
If programs have different Indicator Area column numbers or if some programs use the Right Area and others do not, then you will have to analyze the programs in different Analysis Units configured with specific parameters and options.
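The manual checks described above (looking for the '*' column, falling back to the "IDENTIFICATION DIVISION" statement, and looking for text after column 72) can be approximated with a small script. This is a minimal sketch under stated assumptions: the Left area, when present, is assumed to contain only digits or blanks, and the function names and sample lines are hypothetical. Lines reported beyond column 72 still need a manual look to decide whether they hold statements or Right area content.

```python
from collections import Counter

def guess_indicator_column(lines):
    """Return the 1-based Indicator Area column, or None if it cannot be guessed."""
    star_columns = Counter()
    for line in lines:
        # Skip blanks and sequence-number digits that may precede the indicator.
        body = line.lstrip(" 0123456789")
        if body.startswith("*"):
            star_columns[len(line) - len(body) + 1] += 1
    if star_columns:
        # Use the column where '*' appears most often.
        return star_columns.most_common(1)[0][0]
    # Fallback: column of "IDENTIFICATION DIVISION" minus 1.
    for line in lines:
        index = line.upper().find("IDENTIFICATION DIVISION")
        if index > 0:
            return index  # 0-based index equals the 1-based column minus 1
    return None

def lines_beyond_column_72(lines):
    """Return the 1-based numbers of the lines that have text after column 72."""
    return [number for number, line in enumerate(lines, start=1)
            if len(line.rstrip()) > 72]

# Hypothetical samples: classic fixed format and terminal format.
fixed = [
    "000100 IDENTIFICATION DIVISION.",
    "000200* Sample comment line",
    "000300 PROGRAM-ID. DEMO.",
]
terminal = [
    " IDENTIFICATION DIVISION.",
    " PROGRAM-ID. DEMO.",
]
print(guess_indicator_column(fixed))     # 7
print(guess_indicator_column(terminal))  # 1
```

Running such a check over every source directory helps decide whether programs must be split into Analysis Units with different settings.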
Partitioned Data Set (PDS)
If the source code of your COBOL programs is stored in Partitioned Data Sets (PDS), you need to extract all MEMBERs that correspond to the COBOL programs you want to analyze and place them in a flat file. So that this flat file can be divided up (on the Windows machine) into as many files as there are COBOL programs, it is important that each MEMBER is preceded by a banner that contains its name (for example "ADD MEMBER=myprog", where myprog is the name of a member). The DMT provides a specific extractor that retrieves component source code from PDS dump files.
The following image shows an example of a JCL file that can extract MEMBERS stored in a PDS and then save them in a flat file. This JCL must be configured according to the norms used in the execution environment.
JCL for extracting MEMBERs to a flat file
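To illustrate the banner convention, the following sketch splits a dump into one entry per MEMBER. It is not the DMT extractor, only a hypothetical equivalent for checking a dump by hand. It assumes that the banner text is "ADD MEMBER=" followed by the member name, as in the example above, and that everything up to the next banner belongs to that member.

```python
import re

def split_pds_dump(text, banner_prefix="ADD MEMBER="):
    """Split a PDS dump into a dict mapping member name -> source text.

    A line containing banner_prefix followed by a name starts a new member.
    The banner line itself is not copied into the member.
    """
    banner = re.compile(re.escape(banner_prefix) + r"\s*([A-Za-z0-9$#@]+)")
    members = {}
    current = None
    for line in text.splitlines(keepends=True):
        match = banner.search(line)
        if match:
            current = match.group(1)
            members[current] = []
        elif current is not None:
            members[current].append(line)
        # Lines found before the first banner are ignored.
    return {name: "".join(lines) for name, lines in members.items()}

# Hypothetical dump containing two members.
dump = (
    "ADD MEMBER=PROG1\n"
    "       IDENTIFICATION DIVISION.\n"
    "       PROGRAM-ID. PROG1.\n"
    "ADD MEMBER=PROG2\n"
    "       IDENTIFICATION DIVISION.\n"
    "       PROGRAM-ID. PROG2.\n"
)
for name, source in split_pds_dump(dump).items():
    print(name, len(source.splitlines()))
```

Each entry could then be written to a file named after the member, with the extension chosen for that type of source file.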
A COBOL-based application can have a batch processing part. This step explains how to identify batch components and how to verify the completeness of the source code. The Mainframe Analyzer takes into account JCL for the IBM z/OS platform only.
Batch processing allows you to run large groups of tasks (which can invoke programs) without any human interaction. A task is implemented by a JCL running a sequence of programs. A JCL source file can include procedure source files which contain common commands.
A JCL is a sequence of commands (named cards such as JOB, EXEC or DD) and is composed of one or more steps which run a program and assign resources such as data files, database environment… The first thing to do is to look for the source files for JCL and for procedures. You can use a GREP tool to find them. A JCL source file starts with the "JOB" card and a procedure source file starts with the "PROC" card. The second thing to do is to find out if the JCL includes procedures. Calls to procedures are made through the "EXEC" cards. Use your GREP tool to find these cards. There are two categories of "EXEC" cards:
- EXEC <procedure_name>
- EXEC PGM=<program_name>
The first allows the inclusion of procedures into a JCL and the second allows a program to be called (which is not necessarily a COBOL program).
JCL source code example
If you find the first category of "EXEC" cards then it means that there are procedures used by the jobs. You must then make sure you have the corresponding source files. If not, then you can ask the customer to give you the missing procedures.
You will inevitably find the second category of "EXEC" cards because the main goal of a JCL is to run programs (utility, technical and application programs). If you build the list of called programs, then you will be able to check whether any are missing. However, you will have to distinguish the technical programs from the others. It is often easier to run the analysis and to check any unresolved programs after the analysis has completed.
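The search for "EXEC" cards can also be scripted instead of being done with a GREP tool. The sketch below is a simplified illustration: it assumes one EXEC card per line, skips comment cards starting with "//*", and treats "EXEC PROC=name" like "EXEC name". The job, step, program and procedure names in the sample are hypothetical, except IEFBR14, which is a standard IBM utility.

```python
import re

# Matches "//[stepname] EXEC PGM=name", "EXEC PROC=name" or "EXEC name".
EXEC_CARD = re.compile(r"^//(?!\*)\S*\s+EXEC\s+(PGM=|PROC=)?([A-Z0-9$#@]+)")

def list_exec_targets(jcl_text):
    """Return (programs, procedures) named on the EXEC cards of a JCL text."""
    programs, procedures = set(), set()
    for line in jcl_text.splitlines():
        match = EXEC_CARD.match(line)
        if not match:
            continue
        keyword, name = match.groups()
        if keyword == "PGM=":
            programs.add(name)
        else:
            procedures.add(name)
    return programs, procedures

sample_jcl = """\
//PAYJOB   JOB (ACCT),'PAYROLL'
//* EXEC PGM=OLDPGM is commented out
//STEP1    EXEC PGM=PAYCALC
//STEP2    EXEC PAYPROC
//STEP3    EXEC PGM=IEFBR14
"""
programs, procedures = list_exec_targets(sample_jcl)
print(sorted(programs))    # ['IEFBR14', 'PAYCALC']
print(sorted(procedures))  # ['PAYPROC']
```

Comparing the procedure names with the delivered procedure files gives the list to request from the customer. The program list still has to be filtered by hand to separate utilities from application programs.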
A COBOL-based application can have a transactional part. This step explains how to find out whether transactions are used and helps you identify the corresponding components. The Mainframe Analyzer can analyze on-line programs developed to run in the CICS environment. It works with several types of information:
- Macros embedded in the program source code
- Screen definition
- CSD table
You can easily find out if the programs use CICS commands by scanning the source files with a GREP tool to find embedded macros. These macros look like the embedded SQL macros and start with the "EXEC CICS" string.
If there are CICS programs then there are generally also screen definition files (except if the screens are managed by another software layer). These files contain BMS code describing the screens (named maps). You can find them by using your GREP tool and searching for the "DFHMSD" string.
There is a similar concept in the IMS DC environment, where screens are described in MFS files. The MFS language is a macro language like BMS, but it cannot be analyzed by the CAST CICS analyzer, so be careful if you hear the word MFS. In addition, if the customer uses IMS DC, then it is not appropriate to ask for BMS files.
BMS map source code example
Finally, if there are CICS programs, then it is useful to also have the CSD table. It allows the creation of CICS objects and more particularly links from transactions to programs.
The CICS Analyzer currently works with the definition of the CSD resources and not with a dump of them. This definition should contain the following statements (you can see an example in Figure 8):
- DEFINE TRANSACTION()
- DEFINE PROGRAM()
- DEFINE TDQUEUE()
- DEFINE FILE()
- DEFINE MAPSET()
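A quick way to check that a delivered CSD file contains these statements is to list the resources it defines. The following sketch is an assumption-laden illustration and not the CICS Analyzer logic: it assumes the DEFINE syntax with a name in parentheses followed by attributes such as PROGRAM(...), and the resource and group names in the sample are hypothetical.

```python
import re

HEAD = re.compile(r"(TRANSACTION|PROGRAM|TDQUEUE|FILE|MAPSET)\s*\(\s*([^)\s]+)\s*\)")
PROGRAM_ATTR = re.compile(r"\bPROGRAM\s*\(\s*([^)\s]+)\s*\)")

def summarize_csd(text):
    """Return (resources, links) for a CSD definition file.

    resources: dict mapping resource type -> list of names.
    links: dict mapping transaction name -> program name.
    """
    resources, links = {}, {}
    # Each chunk holds one definition: its attributes run until the next DEFINE.
    for chunk in re.split(r"\bDEFINE\s+", text)[1:]:
        head = HEAD.match(chunk)
        if not head:
            continue  # resource type not covered by this sketch
        kind, name = head.groups()
        resources.setdefault(kind, []).append(name)
        if kind == "TRANSACTION":
            attribute = PROGRAM_ATTR.search(chunk, head.end())
            if attribute:
                links[name] = attribute.group(1)
    return resources, links

sample_csd = """\
DEFINE TRANSACTION(PAY1) GROUP(PAYGRP)
       PROGRAM(PAYCALC)
DEFINE PROGRAM(PAYCALC) GROUP(PAYGRP) LANGUAGE(COBOL)
DEFINE MAPSET(PAYMAP) GROUP(PAYGRP)
"""
resources, links = summarize_csd(sample_csd)
print(resources)
print(links)  # {'PAY1': 'PAYCALC'}
```

A file for which this summary is empty is probably a dump or a listing rather than a resource definition.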
This resource definition can be delivered via a copy (into an ASCII flat file) of the script used to define the CICS environment or by using a JCL in order to extract this information from CICS. The following JCL is shown as an example. If you want to use it, then you have to configure it according to the norms used in the execution environment and then you must specify the lists or the groups of CICS objects to collect (please note that you cannot specify a list and a group on the same command). The application owners and/or CICS administrators are best placed to help you retrieve the information you need.
Example of JCL code to extract the CSD
The resulting flat file must look like this:
CICS CSD flat file example
The goal of this step is to explain to the reader how to find information about accesses to databases made by COBOL programs and to indicate which different files the Mainframe Analyzer needs.
A COBOL program can access several types of database:
- Relational (e.g. IBM DB2)
- Network (e.g. CA IDMS)
- Hierarchical (e.g. IBM IMS/DB)
The Mainframe Analyzer can detect the accesses made to a DB2 or an Oracle relational database and to an IMS/DB hierarchical database. However, these detections are not based on the same types of information.
DB2 and Oracle
If the COBOL programs access a relational database like DB2 or Oracle through embedded SQL (queries are delimited by the "EXEC SQL" and "END-EXEC" strings), then the database must be analyzed in the same application as the programs. The CAST Management Studio will automatically perform the database analysis before the program analysis, and links will be drawn when the programs are analyzed. As such, it is important to check whether the COBOL code contains SQL queries before defining the Analysis Units.
If the database is not analyzed in the same application, then the links will not be created in the Analysis Service. However, it is possible to draw complementary links by using the "Dependencies" tab of the Application editor available in CAST Management Studio.
Please read the following Cook Books related to database analysis for more information about that subject:
- "CB - 007 - Oracle Database Analysis", section 3.6
- "CB - 008 - DB2 LUW Database Analysis", section 3.5
- "CB - 010 - ASE and SQL Server Database Analysis", section 3.3
- "CB - 015 - DB2 for zOS Database Analysis", Step 1
The IMS database structure is analyzed directly by the Mainframe Analyzer through the database definition files (DBD). Without these files, you will not be able to analyze these databases. IMS/DB accesses are made via embedded macros starting with "EXEC DLI" (in the CICS context only) or, more generally, through a technical sub-program called "CBLTDLI". In addition, main programs that access IMS contain the entry point "DLITCBL" at the beginning of the PROCEDURE division. So if you want to know whether a program is an IMS program, look for one of these constructs:
- EXEC DLI … END-EXEC
- CALL "CBLTDLI" USING …
- ENTRY DLITCBL USING …
If you find any of these constructs in the COBOL source code, then you can be sure that the programs access IMS.
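The three constructs listed above can be searched for with a short script. The sketch below is a rough illustration: the pattern labels and the sample program lines are hypothetical, comment lines are not filtered out, and a call made through a variable that holds the "CBLTDLI" name would not be detected.

```python
import re

# One pattern per construct listed above.
IMS_PATTERNS = {
    "EXEC DLI": re.compile(r"\bEXEC\s+DLI\b"),
    "CALL CBLTDLI": re.compile(r"\bCALL\s+[\"']CBLTDLI[\"']"),
    "ENTRY DLITCBL": re.compile(r"\bENTRY\s+[\"']?DLITCBL[\"']?"),
}

def find_ims_markers(source):
    """Return the labels of the IMS constructs found in a COBOL source text."""
    upper = source.upper()
    return [label for label, pattern in IMS_PATTERNS.items()
            if pattern.search(upper)]

sample = """\
       PROCEDURE DIVISION.
           ENTRY 'DLITCBL' USING IO-PCB DB-PCB.
           CALL 'CBLTDLI' USING GU-FUNC DB-PCB SEGMENT-AREA.
"""
print(find_ims_markers(sample))  # ['CALL CBLTDLI', 'ENTRY DLITCBL']
```

An empty result does not prove that a program is free of IMS accesses, so the analysis log remains the reference.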
The Mainframe Analyzer does not take into account the IMS/DC transaction manager. Accesses to this transaction manager are also made via the CBLTDLI technical sub-program and associated to PSB files. It is not really possible to know if a program works with IMS/DC or with IMS/DB by looking at the source code. It is necessary to look at the PSB associated to the COBOL program. If this PSB contains a first PCB that is not typed as a DB PCB, then you can be sure that the program makes access to IMS/DC.
If the program works with IMS/DB, then you need to obtain the DBD and the PSB files associated to the IMS databases accessed by programs. You can search for the DBD using the "DBD[ ]+NAME=" string.
Figure 9 - DBD source code example.
You can search for the PSB by using the "PCB[ ]+TYPE=" and "PSBGEN" or "PSBNAME" strings.
PSB source code example.
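The search strings given above can be combined to sort delivered IMS files into DBD and PSB sources and to read the type of the first PCB of a PSB. This is a minimal sketch that reuses the "DBD[ ]+NAME=" and "PCB[ ]+TYPE=" expressions from the text; the function names and the sample statements are hypothetical.

```python
import re

DBD_MARKER = re.compile(r"DBD[ ]+NAME=")
PCB_MARKER = re.compile(r"PCB[ ]+TYPE=([A-Z]+)")
PSB_MARKER = re.compile(r"PSBGEN|PSBNAME")

def classify_ims_file(text):
    """Return 'PSB', 'DBD' or None for an IMS definition source text."""
    if PCB_MARKER.search(text) or PSB_MARKER.search(text):
        return "PSB"
    if DBD_MARKER.search(text):
        return "DBD"
    return None

def first_pcb_type(text):
    """Return the TYPE value of the first PCB statement, or None."""
    match = PCB_MARKER.search(text)
    return match.group(1) if match else None

# Hypothetical fragments of a PSB and of a DBD.
psb = (
    "         PCB   TYPE=DB,DBDNAME=PAYDBD,PROCOPT=A,KEYLEN=8\n"
    "         SENSEG NAME=EMPLOYEE,PARENT=0\n"
    "         PSBGEN LANG=COBOL,PSBNAME=PAYPSB\n"
)
dbd = "         DBD   NAME=PAYDBD,ACCESS=HISAM\n"

print(classify_ims_file(psb), first_pcb_type(psb))  # PSB DB
print(classify_ims_file(dbd))                       # DBD
```

As explained above, a first PCB whose type is not DB suggests that the program works with IMS/DC rather than IMS/DB.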
If you do not have the PSB and the associated DBD then it will be impossible for the Mainframe Analyzer to recognize the database elements accessed by the programs and it will not be able to create the corresponding links. The analyzer will indicate these missing components in the log file/window.
Step 2: Create the application in the CAST Management Studio
Mainframe applications can be analyzed easily by using CAST Management Studio. The following steps describe how to do this in the Regular mode. A separate section explains how to configure analysis settings in Expert mode.
The first thing to do in CAST Management Studio is to have an application through which you can analyze the mainframe source code. If your application has a database part, then you can also add the corresponding Analysis Unit into the CAST Management Studio application object.
Either the application already exists and you only have to add mainframe source code packages, or it does not and then you must create it. To do that, you must either right-click on the Application tree-view and select the Add Application item or push the associated button:
The Add application dialog appears and allows you to create the new application in CAST Management Studio. You must give your application a name and decide whether you want to manage its content directly from CAST Management Studio or register it in AIC Portal.
Click the Next button to define the first version of your application. The following panel is displayed.
You can now click the Finish button. CAST Management Studio then launches the DMT window to manage your application and its content.
Step 3: Create source code packages
The Delivery Manager Tool, known as DMT, is introduced in its own documentation. Please refer to that documentation for more information about the features of DMT. In this cookbook, we will only explain how to use the general features in a Mainframe analysis context.
In the version editor, you can rename the version you have just created and start adding packages to your application. To do that, click the Add New Package button. A wizard appears to help you define the package. The first thing to do is to decide how the source code must be collected:
To configure a package for the Mainframe Analyzer, you can choose either the Mainframe item in the "Select a vendor specific repository" section or Files on your file system item in the "Others" section.
Extracting source code from a PDS dump file
If you selected the first item above, the following panel appears in the wizard. The DMT proposes the parameters needed to carry out this type of extraction. Click the Finish button when you are ready to configure the extraction.
The package configuration editor appears and allows you to specify how to populate your new package. There you can set the name of the new package and the parameters related to the extractor:
- the location of the PDS dump file
- the type of source files to generate (note that only one type of members is supported)
- the left part of the banner (note that the banner has 2 parts: a left part, or prefix, that is static text and a right part that corresponds to the member name)
- the left margin to take into account when extracting lines from the PDS dump
- the number of columns to extract from the PDS dump
Extracting source code from files or directories
If you selected the Files on your file system item, which is certainly the most common way to deliver source code, the following panel appears in the wizard. The DMT proposes the parameters needed to carry out this type of extraction. Click the Finish button when you are ready to configure the extraction.
The package configuration editor appears and allows you to specify how to populate your new package. There you can set the name of the new package and the location of the source code. You must define a root folder and, if necessary, specify filters to include or exclude specific source files from the extraction operation.
Generate the version
Once the packages are defined, you can generate the version content by clicking the Package button in the toolbar.
It is also possible to generate each package separately by clicking the Generate Package or Version button available in the toolbar when you are working at the package level.
When the operation is complete, you can go to the Package Content tab of each package to see the different types of files that have been collected by the extractors. It is especially interesting to verify if unexpected files have been found or if all the expected files are there.
If the different packages defined for the version correspond to what you expected, then deliver the version and quit the DMT. In case you decided to split the application source code in several parts, you can also go back to the version you created before to add another package.
Accept the version
When you are back in CAST Management Studio, you have to accept the packages you created and set the new version as the current one. These operations are done through the Delivery tab of the Application editor.
The following operations will be performed:
- set the version you created as the new "current" version
- copy the source code into the deployment source code repository
- create the Analysis Units for the projects that have been found
- perform a discovery of the code to propose a first configuration of the analysis settings
Step 4: Configure analysis settings
If you want to configure any special analysis settings, then you must use the "Advanced" or the "Expert" mode of CAST Management Studio. You can change the mode by choosing it in the following tool bar:
The "Advanced" mode and "Expert" mode will allow you to access more settings:
Advanced configuration (Inference Engine)
If you experience performance, call resolution, or quality rule computation issues, various technical settings related to the Inference Engine can be used. The Inference Engine is heavily used by the COBOL Analyzer to carry out several tasks:
- Building the paragraph call graphs through the different types of calls allowed by the COBOL language (GO TO, PERFORM, fall through logic)
- Resolving dynamic calls executed through variables
- Computing intermediary results used by some Quality Rules
These settings can be changed in very specific situations for which the default parameter values are not sufficient. This can occur when the programs to be analyzed are complex. Here, "complex programs" does not necessarily mean "large programs" with a high number of lines of code. Instead, it means programs implementing a large control flow graph involving a high number of paragraphs. In this case, results can be partial and it is possible to force the analyzer to generate better results by changing some technical settings. Nevertheless, it is important to keep in mind that the more the analyzer is forced to provide accurate results, the more the performance decreases.
The Inference Engine default settings are defined in the Mainframe Analyzer itself, but can be changed by specifying different values in the Process Settings section of the Production tab of the Application editor.
The main option allows you to deactivate the Inference Engine in the Mainframe Analyzer (by default the Inference Engine is always activated). This will directly impact the building of the paragraph call graph and dynamic call resolution operations. As such, CAST does not recommend deactivating the Inference Engine.
Other options are all related to the COBOL Analyzer. Note that their value must be greater than 0.
The String Concatenation options allow you to limit the number of different strings that will be found during the search for each object called dynamically. This number corresponds to the different paths in the program logic that lead to the use of the variable and in which a value is moved into the variable (see illustration below). As mentioned above, limiting the number of strings can lead to incomplete results but improves the performance of the Inference Engine. If you want more accurate results (because programs are too complex for the current settings), then you will have to accept a performance reduction. Nevertheless, this option should be changed only on the very rare occasions where a large number of objects can be dynamically called via the same CALL statement.
Possible values for a variable
The Procedure Call Depth options limit the number of steps that the Inference Engine will follow in order to obtain the value stored in a variable. These steps are parts of the control flow graph reduced to paragraph calls (see illustration below). The higher the number of potential steps, the sooner the limit of this option is reached. As a consequence, some links to external objects (programs, CICS transactions, ...) will not be drawn. Increasing the value of this option forces the Inference Engine to follow longer paragraph call paths, but the analysis will take more time.
Paragraph call path
If you need to change the value for this option, then CAST recommends doing so in several iterations - by increasing the value by (for example) 10% each time.
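As a worked example of this iterative approach, the following sketch lists the values to try when raising a setting by 10% per iteration. The starting value of 100 is purely hypothetical and is not the documented default of the option. The increment is rounded up so that each value stays an integer and grows by at least 1.

```python
def next_values(start, iterations, percent=10):
    """Return the successive values obtained by raising start by percent each time."""
    if start <= 0:
        raise ValueError("the option value must be greater than 0")
    values = []
    current = start
    for _ in range(iterations):
        # Integer ceiling of current * percent / 100, so the value always grows.
        increment = (current * percent + 99) // 100
        current += increment
        values.append(current)
    return values

# Hypothetical starting value for the Procedure Call Depth option.
print(next_values(100, 3))  # [110, 121, 134]
```

After each iteration, compare the analysis runtime and the number of dynamic links drawn before deciding whether to continue.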
It is important to understand that there is no general rule for tuning the Inference Engine. It depends on the complexity of the logic implemented in the programs, so each application has its own characteristics and requires specific settings. Nevertheless, what is certain is that changing the Inference Engine settings impacts both accuracy and performance. The table below shows an example of such changes with the associated results for the analysis of a complex program:
Table columns: Procedure Call Depth | Analysis runtime (in seconds) | Dynamic links drawn
The CyclicCall.*, OpenInLoop.*, and UninitialisedVariable.* options target the intermediary results calculation for the following Quality Rules, respectively:
- Avoid cyclic calls with PERFORM statements
- Avoid OPEN/CLOSE inside loops
- Variables defined in Working-Storage section must be initialized before to be read
These options can be tuned by respecting the above recommendations related to the LimitString and LimitSubTarget options. The CyclicCall options impact the size of cycles to be searched for, the OpenInLoop options impact the number of nested paragraph levels involved in loops, and the UninitialisedVariable options impact the size of the paths in the control flow in which variables can be assigned.
Before changing the settings of the Inference Engine, first try to estimate the gain you expect against the impact on the analysis performance. This trade-off typically arises for applications with a small number of large and complex programs.
On completion of an analysis containing COBOL programs, the Mainframe Analyzer will display details in the log about the percentage success rate for the various link resolution operations carried out by each of the Inference Engine parameters, for all analyzed programs. This information will help you better configure the Inference Engine for future analyses.
For example, the following log output indicates that all the dynamic calls and the paragraph call paths have been resolved but that the detection of cyclical paragraph calls via the PERFORM instruction was only successful in 80% of the cases. As such, the CyclicCall."String Concatenation" and CyclicCall."Procedure Call Depth" values could be raised slightly.
COBOL dynamic call resolution and paragraph call graph :
  - 100% on string concatenation
  - 100% on procedure call depth
Paragraph cyclic calls detection :
  - 80% on string concatenation
  - 100% on procedure call depth
  - 83% on local procedure complexity
Uninitialised COBOL variables detection :
  - 100% on string concatenation
  - 97% on procedure call depth
  - 100% on local procedure complexity
OPEN in loop detection :
  - 100% on string concatenation
  - 100% on procedure call depth
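When many analyses are run, these success rates can be extracted from the log automatically. The sketch below is a hypothetical helper, not a CAST tool. It assumes that the log prints one entry per line, that each section title ends with a colon, and that entries follow the "- N% on setting" shape of the excerpt above.

```python
import re

ENTRY = re.compile(r"^\s*-\s*(\d+)%\s+on\s+(.+?)\s*$")

def settings_below_full(log_text):
    """Return (section, setting, percent) for every entry under 100%."""
    findings = []
    section = None
    for line in log_text.splitlines():
        entry = ENTRY.match(line)
        if entry:
            percent = int(entry.group(1))
            if percent < 100:
                findings.append((section, entry.group(2), percent))
        elif line.strip().endswith(":"):
            # A line ending with ':' starts a new section.
            section = line.strip().rstrip(":").strip()
    return findings

# Excerpt of the log output shown above, one entry per line.
log_excerpt = """\
COBOL dynamic call resolution and paragraph call graph :
 - 100% on string concatenation
 - 100% on procedure call depth
Paragraph cyclic calls detection :
 - 80% on string concatenation
 - 100% on procedure call depth
 - 83% on local procedure complexity
Uninitialised COBOL variables detection :
 - 100% on string concatenation
 - 97% on procedure call depth
 - 100% on local procedure complexity
"""
for section, setting, percent in settings_below_full(log_excerpt):
    print(f"{section}: {percent}% on {setting}")
```

Each reported line points to the family of options (for example CyclicCall) whose values are candidates for a small increase.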
Step 5: Run analysis
To perform analysis and to generate a snapshot, you can directly go to the Execute tab of the Application editor and click the Take a snapshot of the application action. You can also execute an analysis without generating a snapshot (click Run Analysis Only) or without saving results at all (click Test Analysis). This last action can be useful the first time you set up an application analysis and you want to test your analysis implementation before running the complete operation.
The following wizard appears and allows you to set a date, a name, and an application version for the snapshot. Click the Next button to go to the next step of the process:
If the source code has not been analyzed yet, then click the Finish button to run the analysis. If you already performed the analysis and you only want to generate a snapshot, then check the Skip Analysis Job check-box:
CAST Management Studio then runs the operations required to generate the snapshot. At the end of the process, you can look at the log files by selecting a task that generated a log (there is a "Yes" tag in the Progress column) and by clicking the "Log file" link.
Then, the log viewer will appear with the expected log file:
You can also view the results in Enlighten and in the CAST Dashboard.