Effective Use of MVS Workload Manager Controls
Deciding how to handle started tasks is probably the most difficult part
of preparing an installation's WLM service definition. There are several
approaches, each with its own advantages and disadvantages. They are summarized
as follows:
- Collect started tasks into a small number of similar groups.
- Classify each started task into its own service class.
- Do not bother classifying any started tasks.
- Use a combination of the previous approaches.
The first option is the best one and it is strongly recommended that installations
choose this option for controlling their started tasks while in MVS Workload
Manager goal mode. The other options are discussed for completeness.
It is probably quite easy to list the really important started tasks, and
also the medium-importance started tasks, and then to accept that the rest
are not worth listing separately.
Creating 3 or 4 such classification groups of all started tasks is a
very legitimate approach. Probably the installation would want the bottom
group to just have a discretionary goal. Then it could assign increasing
velocity goals to the other groups, based on the installation's compatibility
mode RMF reports. (To obtain the velocity from the RMF reports, the installation
might change its IEAICSxx member to assign a unique report PGN using this
same classification group structure).
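For reference, execution velocity expresses the percentage of sampled states
in which the work was using a resource rather than delayed waiting for one.
The following is a minimal sketch of that calculation, using hypothetical
sample counts in place of real RMF report figures:

def execution_velocity(using_samples, delay_samples):
    # Velocity: percentage of samples in which the work was running
    # rather than waiting for a resource.
    return 100.0 * using_samples / (using_samples + delay_samples)

# Hypothetical counts read from a unique report PGN:
print(execution_velocity(using_samples=350, delay_samples=650))  # 35.0

A velocity of 35 observed this way would suggest a goal like the one assigned
to the VEL35 service class in the panel example below.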
Using the ISPF WLM application, the installation's STC classification
rules can simply refer to a named group of started tasks (a transaction
name group), or they can list tasks individually if the installation intends
to use report classes to obtain extra RMF data for an individual started
task.
The highest group probably contains tasks like JES and VTAM. If an installation
prefers to let SRM just handle this group of tasks "well", then it should
consider letting these run in the SYSSTC service class. For example, suppose
a transaction name group is called HI_STC, and it contains VTAM, JES, RMF,
etc. It could also contain the SYSTEM address spaces listed earlier, although
it is fine to omit them entirely. Next, suppose the installation
created a group called MED_STC as well, containing the group of less-favored
started tasks.
The following example of an ISPF WLM classification panel for STC shows
how the high started tasks are allowed to run in the SYSSTC service class:
Subsystem Type . : STC
Description . . . Classification Rules for Started Tasks
-------Qualifier------------- -------Class--------
Action Type Name Start Service Report
DEFAULTS: ________ ________
____ 1 TNG HI_STC__ ___ ________ ________
____ 1 TNG MED_STC_ ___ VEL35___ ________
____ 1 TN * ___ DISCRETN ________
The default service class is left blank. This is an indication that
the installation wishes some started tasks to be handled with the SYSTEM
or SYSSTC service classes described under the topic "System Goals". The
first rule assigns no service class to transaction name group "HI_STC",
so that group inherits the blank default. In other words, the started tasks
in group HI_STC run in the SYSSTC service class, or if they are the special
system address spaces, they will be run in the SYSTEM service class.
The second rule assigns the started tasks identified in the MED_STC
transaction name group into a service class called VEL35. And the last
rule says everything else is classified to a service class called DISCRETN
which has a discretionary goal. SRM recognizes that this last rule should
not apply to the standard system address spaces (like GRS, DUMPSRV, MASTER,
WLM, etc) and will continue to run them in the appropriate SYSTEM service
class.
Though not done in this example, the installation could choose to classify
any of the system address spaces explicitly by name into a service class
just like any other started task. This means the classified address space
would be treated according to the goal specified in that service class.
That would involve less favorable treatment than just leaving the space
in the SYSTEM service class, so this is not recommended.
(As an aside, DUMPSRV is handled differently from the other SYSTEM spaces.
Its access to expanded storage is managed with a space-available policy
rather than least recently used.)
Again, an installation can override the default treatment of any of the
system started tasks or address spaces by writing an explicit classification
rule, that is, a rule that specifies more than just a complete wild card
or an 8-character mask.
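As a hypothetical illustration (this rule is not part of the panel example
above, and the service class name VEL20 is assumed), an explicit rule for
one system address space might look like this:

____ 1 TN DUMPSRV_ ___ VEL20___ ________

Because the rule names DUMPSRV exactly rather than through a mask, that
address space would be removed from the SYSTEM service class and managed
toward the VEL20 goal instead.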
In summary, this first approach was to collect the installation's started
tasks into 3 or 4 groups, letting the more critical tasks run with default
treatment, and using different velocity or discretionary goals for the
other groups.
This is an important point. For many years there has been tremendous
energy spent trying to achieve the optimal ordering of a dispatching priority
queue. That was very important when all systems were uniprocessors. This
effort is unnecessary now because of significant multiprocessing and changes
in the dispatcher to ensure work at a given priority does not monopolize
the CPU compared to other work at the same priority. Therefore more work
can be clustered together and given a single dispatch priority without
worrying about potential lockouts. This is a cornerstone of the success
of the goal mode algorithms.
That brings us to another approach for handling started tasks. With this
approach, each major started task is put into its own service class, similar
to the way many accounts today have each STC in its own Performance Group.
Thus an installation could have service classes with names like VTAM, JES,
LLA, VLF, etc.
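Under that approach, the STC classification rules might look like the
following sketch, where both the task names and the service class names
are merely illustrative:

____ 1 TN VTAM____ ___ VTAM____ ________
____ 1 TN JES2____ ___ JES_____ ________
____ 1 TN LLA_____ ___ LLA_____ ________
____ 1 TN VLF_____ ___ VLF_____ ________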
One drawback is that this is more complicated for the installation to
control. It would not be uncommon for the system programmers to continually
spend effort measuring and adjusting many different velocity goals, trying
to determine what impact, if any, the system will incur when changing a
service class's goal from velocity 43 to velocity 42. (What a waste of
time!)
A second drawback to this approach is that each additional service class
uses more storage to hold accumulated data. This is true for SRM, WLM,
RMF, and SMF type 72 and type 99 records.
Another drawback to this approach is that SRM will be spending time
trying to adjust these many service classes to achieve whatever velocity
goal is assigned to them. SRM will help only one service class each time
it assesses how well the sysplex is meeting the installation's specified
goals. If many service classes exist for individual started tasks, SRM
could spend several intervals trying to handle individual started tasks
rather than addressing a problem facing the on-line or interactive work.
Finally, as mentioned earlier, with only one address space in a service
class, SRM depends on a longer time interval to obtain the number of samples
that can justify changing resource allocations. This means SRM will not
be as responsive to fluctuating circumstances as it could be if more address
spaces were combined in a single service class.
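A rough sketch of the arithmetic follows; the sampling rate and sample
threshold are assumptions chosen for illustration, not actual SRM parameters:

SAMPLES_PER_SPACE_PER_SECOND = 4    # assumed state-sampling rate
SAMPLES_NEEDED = 500                # assumed count needed before acting

def seconds_until_actionable(spaces_in_class):
    # Every address space in the class contributes samples, so a
    # one-space class needs proportionally longer to accumulate
    # enough evidence to justify a resource adjustment.
    return SAMPLES_NEEDED / (spaces_in_class * SAMPLES_PER_SPACE_PER_SECOND)

print(seconds_until_actionable(1))   # 125.0 seconds with one space
print(seconds_until_actionable(10))  # 12.5 seconds with ten spaces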
Suppose an installation's WLM service definition does not have any classification
rules for started tasks. SRM will recognize certain system address spaces
and put them into the pre-defined SYSTEM service class. And RMF will provide
its usual data for that service class just the same as for any service
class defined.
All other started tasks will be put into the SYSSTC service class, and
kept at a high dispatching priority. This is nice for JES and VTAM, etc.
If any given started task is recognized as serving on-line work with a
goal (CICS or IMS transactions), then that address space would be managed
as necessary based on the OLTP goal.
This approach has the advantage of simplicity. But not all started tasks
are well-behaved. So having them all in SYSSTC could allow a CPU-intensive,
or unstudied, started task to consume a large number of processor cycles. If
the processor is lightly loaded, or in a 6-way, 8-way, or 10-way MP, this
might still be desirable because that one task will not affect the ability
of the remaining processors to attack the important work with goals.
This option is not seriously recommended for a regular production environment,
since many started tasks cannot be trusted. But it demonstrates that in
certain environments, it is not necessary to bother with velocity goals
and classification for started tasks. A TSO, batch, and DB2 production sysplex
has been running in goal mode for several months, supporting hundreds of
users, using this approach of not classifying started tasks.
Of course, the installation could take this simple approach and still
classify a few "horrible" started tasks to a discretionary goal, leaving
the rest to be handled by SRM's defaults.
Finally, if using this approach, an installation could still gather
the traditional RMF data for any given started task by writing a classification
rule that just assigns a report class, not a service class, to that started
task.
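For instance, a rule like the following sketch (the task name MYMON and
report class RMYMON are hypothetical) leaves the service class blank, so
the task still receives the default SYSSTC treatment while RMF reports
on it separately:

____ 1 TN MYMON___ ___ ________ RMYMON__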
It is of course possible to combine the above approaches, so that many
tasks are allowed to run in the default SYSTEM or SYSSTC service classes,
many others are combined into a group with a given goal, yet a selected
number of address spaces are explicitly classified to a specific service
class.
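Reusing the transaction name groups from the earlier panel, a combined
rule set might look like the following sketch, where the explicitly classified
task DFHSM and its VEL20 service class are assumed for illustration:

____ 1 TN DFHSM___ ___ VEL20___ ________
____ 1 TNG HI_STC__ ___ ________ ________
____ 1 TNG MED_STC_ ___ VEL35___ ________

Here DFHSM gets its own goal, HI_STC inherits the blank default and runs
in SYSSTC, and MED_STC runs toward the VEL35 goal; anything unmatched also
falls through to the blank default.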
Prior to MVS/ESA SP 5.2 and DB2 Version 4, all Distributed Data Facility
(DDF) transactions ran within the DDF address space at the same dispatch
priority. If the DDF address space was assigned a high dispatch priority
to benefit the OLTP DDF transactions, that high priority also benefited
the low-importance 'batch-oriented' DDF transactions. In essence,
these low importance transactions got 'a free ride'. If the DDF address
space was assigned a low dispatch priority to limit the CPU consumption
of the low importance 'batch-oriented' DDF transactions, the low priority
also limited the CPU consumption of the OLTP DDF transactions.
DB2 V4 DDF takes advantage of a new function in MVS/ESA SP 5.2 called
enclaves. Enclaves enable MVS to manage and report on DDF transactions
by providing the transactions an anchor by which they can be managed. This
can be thought of as similar to the way an address space is an anchor for
other types of transactions (TSO, batch, and so on). Enclaves allow the
DDF transactions to be managed separately from the DDF address space, and
also separately from one another.
Enclaves allow WLM to manage individual DDF transactions by allowing
DDF threads to be assigned to service classes. Thus, DDF transactions of
differing characteristics can now be associated with different workload
manager goals. Deciding what goals to assign to enclaves for DDF transactions
depends heavily on what type of DDF work is being processed. As with most
other types of work, all goal types are appropriate for DDF transactions.
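For illustration, DDF work is classified under its own subsystem type in
the same ISPF application. In the following sketch the subsystem instance
names, service class names, and choice of qualifiers are all assumptions
rather than recommendations:

Subsystem Type . : DDF
Description . . . Classification Rules for DDF Transactions
-------Qualifier------------- -------Class--------
Action Type Name Start Service Report
DEFAULTS: DDFMED__ ________
____ 1 SI DB2P____ ___ DDFHI___ ________
____ 1 SI DB2T____ ___ DDFLO___ ________

Here work arriving through the hypothetical production DB2 subsystem DB2P
is favored over work arriving through the test subsystem DB2T, with everything
else defaulting to DDFMED.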
One of the largest benefits of enclaves, and of being able to assign goals
to DDF transactions, is that DDF transactions can now transition through
multi-period service classes. Any query has the potential to scan thousands
of rows and merge multiple tables, so the same SELECT-type query run on
two different occasions may require more or less CPU service depending
on which rows and tables it touches. By assigning these DDF transactions
to multi-period service classes, an installation has more control over
how the query will be managed.
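A multi-period service class for such work might look like the following
sketch; the durations (in service units), importances, and goals are
hypothetical values that each installation would tune for its own workload:

Service Class DDFMED - Medium DDF queries

 # Duration  Imp. Goal
 - --------  ---- --------------------------------
 1     2000    2  90% complete within 2 seconds
 2    50000    3  Execution velocity of 30
 3                Discretionary

Short queries finish in period 1 against a response time goal; longer-running
queries drop to period 2, and the real resource hogs end up in the
discretionary period 3.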
An installation may choose to lower the importance of a query the longer
it runs, or to set a response time objective for short DDF transactions
that do not consume more than a certain amount of service. This also allows
an installation to solve the problem described earlier regarding the OLTP
DDF transactions versus the low-importance 'batch-oriented' DDF transactions:
these two types of transactions can now be managed not only independently
of the DDF address space, but also separately from one another.
Determining the durations to set for DDF transaction service class periods
can be tricky. Over time, installations have figured out through trial
and error how to set durations for TSO periods to get '80% complete in
period 1'. A similar sort of discovery will have to be done for DDF
transactions. These duration values will vary from installation to
installation, depending on whether an installation's DDF queries are small
or large, and whether they are a critical part of the installation's workload.
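As a rough starting point, that trial-and-error process amounts to choosing
a period-1 duration near the 80th percentile of observed per-transaction
service consumption. The following is a minimal sketch, assuming the
installation has gathered per-transaction service-unit figures from its
SMF or RMF data:

def period1_duration(service_units_per_txn, target=0.80):
    # Return a duration (in service units) such that roughly the
    # target fraction of transactions completes within period 1.
    ordered = sorted(service_units_per_txn)
    index = min(int(target * len(ordered)), len(ordered) - 1)
    return ordered[index]

# Hypothetical service consumption for ten DDF transactions:
samples = [120, 450, 900, 1500, 2100, 3800, 7500, 9000, 15000, 40000]
print(period1_duration(samples))  # 15000: about 80% fall below this value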