Effective Use of MVS Workload Manager Controls
MVS Workload Manager recognizes APPC work that has been scheduled by IBM's
APPC scheduler (ASCH). An installation therefore classifies this work
in the WLM application using rules under subsystem type ASCH.
If an installation uses a different APPC scheduler, consult that product's
documentation for information on classifying its transactions.
Follow the guidelines in this paper to allow the ASCH address space to
be assigned the SYSSTC service class. That is, it should be part of the
transaction name group HI_STC.
This will ensure that the scheduler can quickly process requests for
new APPC transaction programs.
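To make the classification logic concrete, the sketch below models it in
Python. This is purely illustrative and is not WLM syntax: the membership
of HI_STC and the class names MED_STC and APPC_SC are assumptions for the
example, although SYSSTC and SYSOTHER are the system-provided classes.

  # Illustrative sketch of classification, NOT actual WLM syntax.
  # Incoming work carries a subsystem type and attributes such as a
  # transaction name; classification rules map it to a service class.

  def classify(subsystem_type, transaction_name):
      if subsystem_type == "STC":
          # HI_STC is this paper's transaction name group for high
          # priority started tasks; the membership shown is invented.
          hi_stc = {"ASCH", "GRS", "CATALOG"}
          if transaction_name in hi_stc:
              return "SYSSTC"          # system-provided service class
          return "MED_STC"             # illustrative default for STC
      if subsystem_type == "ASCH":
          # APPC transaction programs scheduled by ASCH land here.
          return "APPC_SC"             # hypothetical APPC service class
      return "SYSOTHER"                # WLM default for unclassified work

  print(classify("STC", "ASCH"))       # -> SYSSTC
  print(classify("ASCH", "PAYTP01"))   # -> APPC_SC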
An APPC transaction begins when the ASCH address space schedules a transaction
program. That program is then started by an APPC initiator address space.
This concept is very similar to the JES subsystem scheduling a batch job.
The APPC transaction ends when the program returns to the initiator. This
is similar to the completion of a batch job. Just as there are different
profiles for batch jobs, there are different types of APPC transaction
programs.
If an installation is sure that all APPC transaction programs process
a single network request and then complete, the service class for these
transactions could contain multiple periods with response time goals. But
many APPC transaction programs are not structured so simply. Many keep
a permanent, or lengthy, network connection and repeatedly process individual
conversations across that network.
Such a transaction program therefore continually accumulates more and more
service. If the APPC service class contained multiple periods, the transaction
would fall to the last period and remain there. Assuming the last period's
goal is less desirable than the first period's, users connected to that
APPC transaction program would experience steadily degrading performance.
These long running transaction programs may stay connected to the network
for an indeterminate time period. Therefore a response time goal is inappropriate
and a velocity goal should be used. (3)
In summary, unless an installation is sure that an APPC transaction
program is started and ends for each network interaction, it is best to
assign APPC transactions to single period service classes with velocity
goals.
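To see why multiple periods misbehave for these long-running programs,
consider the following sketch: accumulated service determines a transaction's
current period, so a transaction program that never ends inevitably reaches
the last period and stays there. The durations used are invented for
illustration.

  # Sketch: accumulated service moves a transaction through periods.
  # Durations are hypothetical; the last period is unbounded.
  PERIOD_DURATIONS = [500, 2000]        # service units ending periods 1, 2

  def current_period(accumulated_service):
      boundary = 0
      for period, duration in enumerate(PERIOD_DURATIONS, start=1):
          boundary += duration
          if accumulated_service < boundary:
              return period
      return len(PERIOD_DURATIONS) + 1  # last period

  # A long-running APPC transaction program keeps accumulating service
  # across repeated conversations, so it soon falls to the last period.
  for service in (100, 1000, 50000):
      print(service, "service units -> period", current_period(service))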
OpenEdition/MVS (OMVS) is a new environment for many MVS customers. Controlling
this work through MVS Workload Manager is not difficult, but it has implications
across three subsystem types: STC, OMVS, and TSO.
The basic recommendation for setting goals for the OMVS Kernel and OMVS
daemon processes is to follow the guidelines this paper has outlined for
assigning service classes to started tasks. The OMVS Kernel should be treated
as a high priority started task and should be classified to the SYSSTC
system service class. The OMVS daemons should be treated the same as other
medium-priority started tasks: they are long-lived processes that perform
continuous or periodic system-wide functions such as network control.
A goal such as the one recommended for the MED_STC service class discussed
in the started task section of this paper should be adequate for these
OMVS daemon processes.
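In classification terms, the recommendation amounts to a few more entries
in the rules for subsystem type STC; a minimal sketch follows. The daemon
names are examples only, and MED_STC follows this paper's naming.

  # Illustrative STC classification entries for OpenEdition work.
  STC_RULES = {
      "OMVS":    "SYSSTC",    # the OMVS kernel: high priority started task
      "INETD":   "MED_STC",   # a network daemon (name is an example)
      "SYSLOGD": "MED_STC",   # another typical long-lived daemon
  }
  for address_space, service_class in STC_RULES.items():
      print(address_space, "->", service_class)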
Some of the OpenEdition/MVS work runs within an APPC initiator. These transactions
are called OMVS forked children. The footnote in the APPC section of this
paper mentioned that an APPC transaction program can tell MVS Workload
Manager about its transactions. OMVS does this, so an installation can
establish goals for these transactions. An installation classifies this
work under subsystem type OMVS.
Many, but not all, OMVS transactions use few resources. This is similar
to the normal experience with TSO transactions. A service class with
multiple periods is therefore appropriate for OMVS work, to reflect a
different goal for CPU-intensive work than for interactive work.
The first period should have a response time goal. This is consistent
with other places in this paper recommending that interactive work be given
response time goals.
The duration to be associated with first period will need to be determined
iteratively, such that a predetermined percentage of transactions are considered
trivial and complete in first period.
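One way to picture the iteration: collect the service consumed per
transaction, then pick the duration at the desired percentile, refining it
as RMF data accumulates. The sketch below uses invented service figures
and an assumed 80 percent target.

  # Sketch: choose a first period duration so that a predetermined
  # percentage of transactions complete in first period.
  # The per-transaction service figures below are invented.
  service_per_transaction = [30, 45, 60, 80, 120, 150, 400, 900, 5000, 20000]

  def duration_for_percentile(samples, pct):
      """Smallest duration that lets pct percent of transactions finish."""
      ordered = sorted(samples)
      index = max(0, int(len(ordered) * pct / 100.0) - 1)
      return ordered[index]

  print(duration_for_percentile(service_per_transaction, 80))   # -> 900

In practice the starting duration would then be adjusted after observing,
in RMF data, what fraction of transactions actually end in first period.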
The last period should contain a velocity goal that describes the installation's
view of long-running or CPU-intensive OMVS forked children. A low velocity
goal should be given to this period, with an importance that ranks this
goal against the others in the system.
An installation could choose to insert another period with a response
time goal between the two discussed above. This would be valuable only
if the installation runs a very large number of OMVS forked children
and wants more distinction between trivial and complex transactions.
A TSO user signals the intent to submit OpenEdition work via the OMVS command.
The ensuing OpenEdition work may run in the TSO address space (in the case
of built-in commands) or may run in an OMVS APPC initiator (in the case
of forked children). All the built-in commands are processed the same as
any other TSO work, according to the goals associated with TSO's service
class.
Each command that results in a forked child originates as a TSO transaction,
so that command is initially associated with a goal according to the TSO
user's service class. When the forked child begins to execute in the OMVS
APPC initiator, a new transaction begins and is managed, per the discussion
above, according to the OMVS service class goals. The original TSO transaction
is suspended and most likely remains in first period, no longer accumulating
service but gathering a longer and longer response time.
Some of the OMVS transactions initiated via TSO can run for a long time,
resulting in high average response times for TSO first period. Therefore,
when the OpenEdition interactive environment is established through TSO,
the installation should revisit the goals associated with first period
TSO to ensure that a percentile response time goal is used rather than
an average response time goal.
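A small worked example shows why. With invented response times in which a
couple of long-running forked children sit among many trivial commands,
the average is ruined while a 90th percentile barely moves:

  # Invented TSO first period response times, in seconds: 98 quick
  # commands plus two long-running OMVS transactions initiated via TSO.
  times = [0.2] * 95 + [0.5] * 3 + [600.0, 1800.0]

  average = sum(times) / len(times)
  p90 = sorted(times)[int(0.9 * len(times)) - 1]    # 90th percentile

  print(f"average: {average:.1f}s, 90th percentile: {p90:.1f}s")
  # -> average: 24.2s, 90th percentile: 0.2s

An average goal of half a second would look hopelessly missed here, while
a goal of 90% completing within half a second would still be met.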
When OpenEdition work is initiated directly into OMVS, via rlogin rather
than from a TSO command, there is of course no TSO transaction created
initially.
For MVS Workload Manager, a workload is nothing more than a collection
of service classes that an installation would like grouped together for
an RMF report. Most installations today freely use the term 'workload'
to refer to a collection of their work, whether that is Production, Test,
and End_User, or Banking, Shipping, and Inventory.
In the first example, Production might include some OLTP work, some
TSO work, and some batch jobs. That is, very different types of work that
collectively are the Production workload. Since the types of work are so
different, they would require different goals. So this customer would create
a service class for the OLTP work, a service class for the TSO work, and
a service class for the batch jobs, and have all three service classes
combined into the Production workload. RMF will provide reports for each
service class and summarize information for the workload.
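Schematically, the workload is nothing but a named grouping over which RMF
sums the per-service-class data; the class names and service figures below
are this example's own:

  # A workload is just a reporting group over service classes.
  WORKLOADS = {"PRODUCTION": ["OLTP", "TSOPROD", "BATCHPRD"]}

  # Invented per-service-class CPU service consumption for one interval.
  service_units = {"OLTP": 4000, "TSOPROD": 1500, "BATCHPRD": 2500}

  for workload, classes in WORKLOADS.items():
      total = sum(service_units[c] for c in classes)
      print(workload, "total service units:", total)   # -> 8000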
In the second example, it is possible that the workloads are just different
CICS transactions targeted to separate CICS regions. In this case, the
customer should probably create unique service classes to define goals
for the Banking, Shipping, and Inventory work. This will allow RMF to report
on the units that are of interest to the account. But the installation
will associate those service classes with some new "OLTP" workload.
In summary, the only purpose of a workload for WLM is to allow RMF to
automatically summarize transaction data and resource usage data for related
service classes. RMF will provide reports alphabetically by service class,
within a workload.
RMF will generate type 72 records for every service class and for each
workload. A report class can be used to either aggregate data from
multiple service classes, or to receive a report on some part of a service
class.
There is no benefit in assigning a report class to mirror the exact
contents of a service class, since RMF already provides data for each service
class. Similarly, there is no benefit in aggregating work into a report
class identical to the aggregation into a workload. But suppose
a customer has a Production workload of 3 batch service classes, a TSO
service class, and an OLTP service class. The 3 batch service classes could
all be aggregated into a BATCH report class.
Another use of a report class is to isolate some part of a service class.
For example, suppose a customer wants a unique type 72 record for each
CICS region. Since with CICS/ESA V4.1, each region will be managed to meet
the goals of the transactions it is running, it makes no sense to create
individual service classes for each of these regions. Instead the customer
should classify the regions to unique report classes, since the only objective
is to obtain selective reporting.
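The resulting structure can be sketched as follows: every region is managed
under one service class, so one set of goals is in force, while each region
carries its own report class purely for RMF's benefit. All names here are
invented.

  # One service class governs management; report classes affect only RMF.
  CICS_REGIONS = {
      # region   : (service class, report class)
      "CICSPRD1": ("CICSRGN", "RCPRD1"),
      "CICSPRD2": ("CICSRGN", "RCPRD2"),
      "CICSPRD3": ("CICSRGN", "RCPRD3"),
  }
  for region, (svc, rpt) in CICS_REGIONS.items():
      print(f"{region}: managed in {svc}, reported in {rpt}")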
A resource group is a named collection of work for which an installation
wants to control CPU access. An installation can guarantee work will have
access to a given amount of CPU cycles per second, or restrain the work
from using more than a given amount. As stated earlier, constraining processor
cycles available to work could be in direct conflict with the setting of
service objectives for the same work. Since resource groups introduce these
conflicts, it is expected that most customers will not use resource groups.
However, there are several environments that are appropriate for their
use.
Assume a certain department or organization has purchased an amount of
processing capacity. The installation will want to deliver that amount,
but no more, while still setting goals for the transactions originating
from that group. The work could all be associated with the same service
class, or could be several service classes combined into the same resource
group.
The following table summarizes the experience of associating a resource
group with a collection of TSO users all in the same service class. The
service class had 3 periods, as follows:
- Period 1: 0.4 seconds avg response time, importance 2
- Period 2: 1.0 second avg response time, importance 3
- Period 3: 5 seconds avg response time, importance 4
In three separate measurements, the resource group was constrained to a
maximum of 2000, 1500, or 1000 service units per second.
Table 1. Resource Group Maximums Limit CPU Consumption Per Second

                Group Max 2000        Group Max 1500        Group Max 1000
TSO Period     CPU SU/sec    PI      CPU SU/sec    PI      CPU SU/sec    PI
1                 883.2      0.3        728.0      0.3        505.4      0.4
2                 307.6      0.6        245.4      0.7        159.1      1.0
3                 586.4      0.4        517.7     12.2        298.2     33.0
Total            1777.2               1491.1                  963.7
As the table shows, periods 1 and 2 continued to maintain their goals even
with a very constrained resource group maximum, while third period suffered
because its goal was the least important.
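The performance index (PI) for a response time goal is the achieved response
time divided by the goal, so third period's PI values can be converted back
into approximate response times:

  # PI for a response time goal = achieved response time / goal.
  goal = 5.0                       # third period goal: 5 seconds average
  for pi in (0.4, 12.2, 33.0):     # third period PIs from Table 1
      print(f"PI {pi:5.1f} -> about {pi * goal:.0f} seconds average")
  # -> about 2, 61, and 165 seconds as the group maximum tightens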
For years installations have known that service units are accumulated into
the RMF records or SMF type 30 record for OLTP regions, regardless of the
type of transactions running in those regions. This is still true with
MVS Workload Manager, even though the installation can now specify response
time goals for the transactions. Since the CPU service units are accumulating
to the regions, it is possible to assign a resource group minimum or maximum
capacity to a collection of CICS or IMS regions.
This could be an effective way to constrain the CPU usage of those subsystems
to a predefined level of capacity. This might be valuable, for example,
for a test subsystem.
Resource group maximums cannot be used as a replacement for the old Response
Time Objective function in the IPS. Each addresses a different environment.
RTO was used to induce some swap-in delay when a small amount of work had
excess processor capacity available to it. A resource group maximum is
used to induce processor delay when some work has greater demand than the
maximum specified. Enough work must exist to use the maximum allowed before
capping due to resource group maximums will be experienced.
A resource group controls access to the CPU. If a group specifies a maximum
of 1 service unit per second and work in the group wants to use the CPU,
SRM could make every task in the group non-dispatchable over 98% of the
time. But this would not
force an automatic swap-out. The RESET command with the QUIESCE option
described earlier is a way to force a specific swappable address space
to be swapped out, or to move a non-swappable address space to the bottom
of the dispatching queue. That non-swappable space could continue to use
any CPU resource not used by other work.
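The "over 98%" figure is just arithmetic: to hold a group to its maximum,
SRM must deny the CPU for the fraction of time by which demand exceeds that
maximum. A small sketch, with an invented demand figure:

  # Fraction of time work must be non-dispatchable to enforce a cap.
  def capped_fraction(demand_su_per_sec, group_max_su_per_sec):
      if demand_su_per_sec <= group_max_su_per_sec:
          return 0.0                # no capping needed
      return 1.0 - group_max_su_per_sec / demand_su_per_sec

  # With a maximum of 1 SU/sec and demand of, say, 100 SU/sec, the
  # group must be non-dispatchable 99% of the time.
  print(capped_fraction(100.0, 1.0))    # -> 0.99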
Suppose an installation has three departments or organizations that individually
could run enough discretionary work to monopolize the processor. The installation
may want to ensure that each department has "its fair share", or at least
some occasional access to the processor.
If all the work were run simply as discretionary, it is conceivable that
for some period of time only the work from department "X" would be running.
The installation could have the three departments' work associated with
three separate resource groups, each specifying a minimum service rate,
and SRM would ensure that if possible each would have its minimum appetite
satisfied.
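Conceptually, SRM protects each group up to the smaller of its minimum and
its demand, and whatever is left over remains available to whichever work
demands it. A toy illustration, with all figures invented:

  # Toy allocation honoring resource group minimums first.
  capacity = 1000                                   # SU/sec available
  minimums = {"DEPT_X": 200, "DEPT_Y": 200, "DEPT_Z": 200}
  demand   = {"DEPT_X": 1000, "DEPT_Y": 150, "DEPT_Z": 300}

  # Each group is protected up to min(minimum, demand); DEPT_Y only
  # wants 150, so part of its minimum goes unused.
  protected = {g: min(minimums[g], demand[g]) for g in minimums}
  leftover = capacity - sum(protected.values())
  print(protected, "| leftover for discretionary use:", leftover)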
IBM's parallel sysplex has addressed a long-standing customer concern over
the significant jump in CPU capacity following an upgrade. For those installations
still wishing to reserve some of the upgrade's capacity, a resource group
could be created for a CPU intensive piece of work that is guaranteed to
soak up a given amount of service units per second.
SRM will then have an artificially limited amount of processor capacity
remaining to use in trying to achieve the goals for other work.
This paper has described the MVS Workload Manager function introduced with
MVS/ESA SP V5, and has described some of the changes within MVS to support
simple statements of customer goals for work. In addition, the paper has
made many recommendations for customers to consider before planning a migration
to goal mode. What remains is the question most frequently received at
all presentations about WLM: will work perform better in goal mode?
The answer depends heavily on how much time an installation is able to
invest in tuning its SRM parameters and how fast the installation can
react in the event of a performance problem. With Workload Manager, the
MVS operating system will react immediately when goals are being missed,
and will work aggressively on the most important ones.
The authors wish to thank the following people for their constructive review
and suggestions for this paper: John Arwe, Cathy Eilert, Steve Grabarits,
Cheryl Watson.
Footnotes:
(3)
An APPC transaction program could choose to tell MVS Workload Manager about
its individual transactions. An installation could then establish response
time goals, and even multiple-period controls, for the transactions according
to that product's conventions.