COPYRIGHT (C) 1984-2021 MERRILL CONSULTANTS DALLAS TEXAS USA
MXG NEWSLETTER FORTY-SEVEN
*********************NEWSLETTER FORTY-SEVEN ****************************
MXG NEWSLETTER NUMBER FORTY-SEVEN, June 5, 2005.
Technical Newsletter for Users of MXG : Merrill's Expanded Guide to CPE
TABLE OF CONTENTS
I. MXG Software Version.
II. MXG Technical Notes
III. MVS Technical Notes
IV. DB2 Technical Notes.
V. IMS Technical Notes.
VI. SAS Technical Notes.
VII. CICS Technical Notes.
VIII. Windows NT Technical Notes.
IX. z/VM Technical Notes.
X. Incompatibilities and Installation of MXG.
See member CHANGES and member INSTALL.
XI. Online Documentation of MXG Software.
See member DOCUMENT.
XII. Changes Log
Alphabetical list of important changes
Highlights of Changes - See Member CHANGES.
COPYRIGHT (C) 2005 MERRILL CONSULTANTS DALLAS TEXAS USA
I. The Annual MXG Version, 22.22, dated February 1, 2005, was sent
on CD-ROM to all sites by Feb 2, 2005.
1. The current version is MXG 23.05, dated Jun 5, 2005.
See CHANGES member of MXG Source, or http://www.mxg.com/changes.
II. MXG Technical Notes
2. SAS Clones
I. WPS from WPC - THIS NOTE WAS WRITTEN IN 2005.
READ MORE RECENT NEWSLETTERS, AS MUCH HAS CHANGED SINCE THEN
a. A purported clone of the SAS System, the WPS, "World Programming
System", from World Programming Corporation, is, in my opinion,
still only vapor-ware. Several postings to MXG-L have questioned
MXG support for WPS, and MXG users have been told many things by
IBM sales reps; IBM is now an authorized reseller of the product:
- "80% of SAS code is supported"
- "most PROCs are supported but not all proc statements"
- "We are due a new version in third quarter that will have quite
a few of those PROCs you have asked about, except we will not
have PROC REPORT, CHART and CALENDAR by then. If you are
looking for MXG support, we should be able to help by around
end third quarter as well."
- "WPS provides the functionality of SAS BASE utilizing the SAS
Language. WPS in some situations may be a viable alternative
to SAS Base."
- "Existing mainframe SAS applications can be executed under WPS
and in many cases there will be no need to change your existing
applications at all."
And the company's home page states:
- "Note that MXG is not CURRENTLY supported, check back soon".
Several USA sites told me they expected to install WPS in Feb/Mar,
but none have received the product.
I first became aware of WPS in August, 2004, when an IBM contact
asked if I would consider supporting a SAS clone. At that time
I had reservations, but after an hour-long conversation with Sam,
the President of WPC, I made him this offer:
- Send me your product to install. If it works perfectly to run
MXG, then I will be morally obliged to announce that fact to my
MXG users; while I have a strong allegiance to SAS Institute and
its products, I have a stronger allegiance to the thousands of
individuals who have based their careers on MXG Software, and I
would be wrong to remain mute if there existed an alternative
product that they should consider.
- However, if your product fails to execute MXG properly, then I
have no obligation to tell you why it failed.
- He chose not to accept my offer.
b. The ONLY possible reason for replacing the SAS System would be
for cost savings, but it is very unclear that there are any
savings, especially in the short term, based on these 2004 price
quotes from IBM:
-One site was told WPS's price was "no more than 1.5 times the
cost of SAS." The site's quote was $600,000 for a site license
on a "Class K" system, for which a Base SAS license is $38,000.
-One site currently paying $250,000 US for Base SAS site licenses
was quoted $1,250,000 list price for WPS acquisition, although
IBM suggested that might be discounted by 20%. The annual fee
for maintenance (2nd year?) would be 20% of that list price,
($250,000 annually), with no discount possible. And those were
for the "Enterprise License"; the MSU-based license price was
well over $2,000,000.
-But now, apparently having read earlier versions of this article,
IBM has corrected their prices cited above, and has new claims.
The site paying $250,000 annually was told their cost now to
acquire the product would be $170,000 for an 18 month license (so
they would have overlap with the existing SAS License!), and that
maintenance would be $34,000 annually. According to the IBM
announcement, purchase would be from IBM, but maintenance would
be via a separate contract with WPC, rather than with IBM, so it
is no longer an IBM product with a single interface!
But these were only quotes, not actual contracts presented for
review, nor were terms provided (does maintenance include new
versions, or are they priced upgrades, etc.), and prices can
always be changed, and discounts may be offered; if you actually
have a license and can provide your price and terms, I'll gladly
update this note with actual facts rather than these sales quotes.
The difference between a used car salesman
and a computer salesman
is that the used car salesman KNOWS when he's lying.
c. Legality of a clone. There is no legal issue for creating a
product that processes SAS statements, as SAS statements have been
legally defined to be a language. The SAS database architecture
and the code implementation is protected. The original WPS
product was written in Java and uses its own proprietary database
architecture, providing even further legal isolation.
Historical note: in the 1970s, Vanderbilt University folks used
the Source Code that was then distributed with the $100 SAS tape
to attempt to clone SAS for their unix; that copyright case is
cited as the clearest example of copyright infringement, with
entire chunks of SAS source code, and with the same spelling
errors in comments, found in their clone's code.
d. In early 2005, my comments on their java implementation stated:
The WPS product is completely written in Java with their own data
store. Sam acknowledged last year that performance was not up to
snuff yet, but he touted that the new zAAP (IFA) engines would
solve that problem. Since IFAs on z/990s are the same speed as
z/990 CPs, IFAs can't improve performance on those boxes; it is
true that IFAs on z/890s can be faster than the CPs, but that's
only if you chose to buy "degraded" CP engines. I'm not aware of
Java itself being touted as boosting performance, so I remain
skeptical that WPS will ever outperform SAS run times.
Then, in Spring, 2005, I was told that performance was still such
an issue, especially I/O performance, that they were considering
a rewrite of parts of the Java kernel to improve its performance.
e. Now, in Fall 2005: it turns out that WPS had so many problems with
java performance that their entire product was re-written in C,
and java is used no more. So, WPS can NOT exploit IFA engines,
and it can't offload any CPU time from your CP engines. Its CPU
time will impact any MSU-based charges, just like SAS would.
Opinion: I do not believe it is possible to out-perform SAS in
CPU time per megabyte of data processed.
II. Clones in general
a. Whether it's WPS or some future SAS clone, however, my real issue
is: what product evaluation criteria would have to be met before
you or I would consider a technical alternative to the SAS System?
The following bullets are my list of considerations, and if there
ever exists a possible contender to investigate, I will elaborate
and expand these criteria after benchmarking the contender:
1. Execution performance attributes
- CPU consumption, interference, chargeback
(Save $$ for software, but CPU cost $$$$ ??)
- I/O performance, elongation exposure, chargeback
- Elapsed run time comparisons
- Disk Space requirements
- File compression, effectiveness, CPU costs of compression
- Data store - z/OS files or NFS, backup, etc.
- SORT performance - internal to the product, or third party?
2. Numerical accuracy
- SAS floating point representation protects decimal places
- variable length to control significant digits
3. Full support of ALL SAS features and facilities
- informats for input side conversion
- formats for presentation and table lookup
- bit and byte level tests
- important Procedures
- datetime, hex, etc., representations
- old-style substitution macros
- "new-style" %macros, argument sizes.
- multilevel nesting, interleaved old-style and new-style macros
- National Language symbols (UK Pounds vs US Dollars, etc.)
4. Accuracy of the interpreter
- It's one thing to successfully compile error-free SAS code
- But how good are the diagnostics if the code has errors?
- years of SAS development to identify errors clearly
- substitutions when you typo
- clarity of the location and cause of syntax errors
- And of equal importance, how are data errors handled?
- invalid data recognition
- hex dumps of bad records
- exact location of the bad data in the record
- error handling when invalid data is detected
5. Size of the product development and support team
- WPC originally 5 ex-IBMers from Hursley, England, labs
- Now supposedly 21 (how many admin now, versus technical?)
- compare the staff at SAS Institute
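The "variable length to control significant digits" criterion under Numerical
accuracy can be illustrated outside of SAS: shortening a stored 8-byte
floating-point value drops low-order mantissa bits, and with them decimal
precision. A minimal Python sketch of the idea (it models byte truncation in
general, not SAS's exact internal numeric format):

```python
import math
import struct

def truncate_double(value, length):
    """Keep only the first `length` bytes of an 8-byte float and
    zero-pad the rest, modeling a shortened stored numeric."""
    full = struct.pack('>d', value)                # 8-byte big-endian double
    short = full[:length] + b'\x00' * (8 - length)
    return struct.unpack('>d', short)[0]

pi8 = truncate_double(math.pi, 8)    # full length: lossless round trip
pi4 = truncate_double(math.pi, 4)    # half length: ~6 decimal digits survive

print(pi8 == math.pi)                # True
print(round(pi4, 5))                 # 3.14159
print(pi4 == math.pi)                # False: low-order precision is gone
```

The point of the criterion: a clone must reproduce this trade-off exactly, or
shortened numeric variables will round differently than they do under SAS.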
b. Morality of replacing the SAS System with a clone?
Should a product like MXG
- that was created by 'children of the sixties'
- to 'give away the keys to the kingdom'
- that exists solely because of the brilliant design of the SAS
language and its incredible performance with large data files
- that can execute on any platform thanks to the multiple platform
implementation of the SAS System
consider supporting a clone product
- whose sole and total motivation and only possible value is to
make money for them by undercutting the price of SAS on z/OS
- by only replicating parts of the original product?
There are only two reasons that should cause you to consider the
replacement of the SAS System on z/OS:
1. the high cost of SAS on z/OS due to its CEC-based pricing, or
2. bitterness toward SAS Institute for its perceived predatory
pricing on z/OS, and/or toward the perceived arrogant attitude
of its z/OS sales reps.
If the first applies, then you should carefully examine the below
benchmark results of running MXG on a PC, because it shows you can
- save far more money
- have guaranteed execution of all of your SAS programs
- have full technical support from SAS Institute
- and even in your local language
- and gain benefit of continued enhancements to the SAS System
simply by moving your existing SAS applications to a platform with
lower price.
If the second applies, perhaps moving to a cheaper platform will
get you a new sales rep, but as a minimum, you'll impact his/her
sales performance! But in fairness, you need to look, not at the
absolute dollar cost, but the cost of SAS as a percentage of your
total budget for hardware and software, and recognize that, for a
small percentage of the total, you are able to measure, manage, and
control the costs and services provided by that budget, so the
"high" cost of SAS may not really be all that high after all.
1. Can I really move my MXG processing to a PC (or unix Workstation)?
A. Yes, you can:
Glenn Bowman reported via a posting to MXG-L that he had moved his
processing of 13 GigaBytes per day to an IBM PC running Windows/XP.
He provided run time statistics, and at my request ran additional
reports on the sizes of the output libraries, for this note.
This first table shows the size of the output "PDB" data libraries
created from the 13 GB input SMF file; his tailored PDB processes
additional SMF records that create more output datasets in his PDB
library than the "vanilla" MXG PDB. There were 3,642,163 CICSTRAN
observations and 3,046,583 DB2ACCT observations created.
His 13 GB of SMF input required only 9GB of (compressed) disk space
on his PC for all of that data, and the work file was only 5GB in
size (because CICSTRAN and DB2ACCT were directly written to their
own "PDB" libraries):
MXG ANALYSIS OF OUTPUT LIBRARIES SPACE REQUIRED
Library  Number of  Number of  Bytes of  Compressed        Avg %
Name      Datasets  Variables      Data       Bytes  Compression
CICSTAT         89       6021      693M        440M        36.40
CICSTRAN         1        397     9274M       3895M        58.00
DB2ACCT          1        494     9044M       3708M        59.00
PDB            156      15199     1889M       1152M        38.98
SPIN             7        385       21K         21K         0.00
Total                           20900MB      9195MB
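For the single-dataset libraries, the AVG % COMPRESSION column is simply the
fraction of data bytes eliminated; a quick Python check of that arithmetic
against the table above (the multi-dataset libraries report a per-dataset
average, so their totals-based figure differs slightly):

```python
# (uncompressed, compressed) sizes in MB, taken from the table above
libraries = {
    'CICSTRAN': (9274, 3895),
    'DB2ACCT':  (9044, 3708),
}

for name, (raw, comp) in libraries.items():
    pct_saved = 100.0 * (1 - comp / raw)   # percent of bytes eliminated
    print(f'{name:9s} {pct_saved:5.2f}% compression')
# CICSTRAN  58.00% compression
# DB2ACCT   59.00% compression
```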
Glenn initially processed this data with the original technique of
converting the data from VBS to RECFM=U on MVS, then ftping the SMF
data to a PC file, and then running BUILDPDB to read the disk file.
But then, after his initial post, he was made aware that it is NOT
necessary to download any raw data to the PC, by using the SAS ftp
access method to directly read the raw SMF data, with these results:
Original MVS: Daily job on 9672-X77 8 hours, over 480 minutes
Old way on PC: Convert VBS to RECFM=U 25 minutes
ftp the 13 GB SMF file 36 minutes
BUILD001 (READ SMF) 67 minutes
BUILD002-5,ASUMs, etc 44 minutes
Total daily run time 172 minutes
New way on PC: BUILD001 with ftp access 48 minutes
BUILD002-5,ASUMs, etc 44 minutes
Total daily run now 92 minutes
Using ftp access dropped his daily run time from about 3 hours to
an hour and a half, and no disk space for SMF data on PC is needed.
And his hardware platform: A 3.2GHz Pentium 4 with 512MB RAM, with
one 40GB and two 250GB Internal Hard Drives (all 7200 RPM), and one
USB 250GB External Hard Drive.
Additional notes from Glenn:
- Originally he ftp'd with an FTP server product called NFM, but it
took ten times as long as ftp, so NFM was abandoned.
- Originally he used the MVS batch ftp to PUT the SMF data to the PC
but every other day it would fail with some unknown error, so he
then changed his process and used the PC to GET the SMF data.
- His 2x250GB drives are compressed and he keeps two weeks input SMF
data and two weeks detail PDBs online; his weekly PDB backups are
subsequently uploaded back to MVS and backed up to Tape GDGs for
archiving.
- His 13GB of SMF is software compressed by SMS on the mainframe,
and MXG sets the SAS COMPRESS=YES so all MXG data is compressed.
- SAS is NOT licensed on MVS; SAS is ONLY licensed on his PC.
B. "Just because you CAN does not necessarily mean you SHOULD!"
If there are other users of SAS on the mainframe, that may be all
that is needed to justify the cost of SAS on MVS, but even in their
absence, I still recommend that it is better if you can execute
your MXG application to build your MXG datasets on the industrial
strength mainframe, not only because SAS under MVS is technically
superb at handling large volumes of data efficiently, but also
because
- the job scheduling products eliminate human management time
- the file management facilities of MVS (i.e., MVS catalog, JCL,
single DDNAME for an entire PDB library) are more human-friendly,
- ESPECIALLY, HSM performs automatic DASD space management of what
data is on disk and what data is migrated to tape, eliminating
the need for human management of backups and disk space.
- ESPECIALLY, Workload Manager/Goal Mode, can prevent MXG from
interfering with interactive users (if that's what you want!);
ASCII platforms have nothing in the way of intelligent software
that lets you determine who gets what on a shared system.
- everything is automated, so that MVS execution of MXG Software
requires much less human time to monitor and manage the MXG job
stream.
But once you have built your MXG datasets on your mainframe, then
it does make sense to use SAS on your Workstation or PC to graph,
to report, and develop analyses and reports, especially when your
plotters and graphic presentation devices are PC or Workstation
based. You can bring down only the summary data you need for
today's analysis or testing, keeping the PDB data libraries
archived on the mainframe you are measuring.
Many MXG users on MVS process not only their MVS data, but they
also ftp their other platform measurement data (z/VM with its linux
data, raw NTSMF data for their Windows Platforms, AS/400 QACS data,
and TNG and PTX and other unix data) to MVS where they clone their
SMF BUILDPDB process to create and archive the MXG-built SAS
datasets for all of their platforms on MVS.
But MXG was enhanced in 1995 to run on PCs and Workstations, not
because it was the best place to execute MXG, but because some MXG
technicians were told they would lose their job if they could not
move the SAS work off the mainframe (typically, MXG was the only
SAS application at these sites, and the SAS mainframe license cost
can exceed the technician's salary!) MXG Software successfully
executes under all flavors of Windows, UNIX, and linux, and there
are performance benchmarks in past MXG Technical Newsletters.
And you no longer have to even download the raw SMF data; you use
the SAS ftp access method to read the z/OS data directly with MXG.
However: you really must use a dedicated Workstation for MXG. Do
NOT believe that you can put your MXG Application on an existing
shared unix/linux/windows server without serious impact. As NONE
of those operating systems have a concept of a Workload Manager,
and because SAS is designed to be fast and efficient, your daily
MXG job can easily "steal" the entire system from the other users
of those unprotected "ASCII" systems, and thus a dedicated
workstation is far safer than a shared server for MXG execution on
an ASCII Platform. Even if you plan to run your MXG daily job at
3am, there will be a time when a 9am rerun is required, so make
sure your boss is aware of this paragraph if he/she makes you use a
shared server.
And, you will spend much more of your time managing the execution,
backup, etc, as noted above.
A small number of MVS data sources still require partial execution
on MVS, although SAS is NOT required for these programs:
- Assembly programs (all start with ASM in MXG source) cannot be
executed on PCs or Workstations, so IBM'S RMF III data must be
first processed with ASMRMFV before download, and users of
ASMIMSLx will have to run part of the JCLIMSLx process on MVS
before download for the SAS steps.
- INFILE exits are not supported on ASCII platforms. Compressed
data from Landmark and Candle can be read directly by SAS
on MVS because assembly routines for decompression are provided
as INFILE exits (EXITMON6 for ASG/Landmark, Candle provides a
load module), but if you move your processing to ASCII, you will
have to first uncompress the data on the mainframe before download.
But due to SAS Pricing on z/OS platforms, many MXG customers have
successfully moved their MXG application from the mainframe to
Unix/Linux on workstations or dedicated PCs with Windows/XP; all of
my own development and Alpha tests are executed with SAS V9.1.3 on
a Windows/XP platform; only final QA tests are executed on z/OS.
However, many MVS sites have still kept SAS running on their z/OS
platforms, so they can still enjoy the data management with minimal
human time, by creating a "penalty box" on a small capacity machine
to minimize the cost of the SAS base license, using the Scheduling
Environment to direct all of their SAS jobs to run in that LPAR,
using PROC CPORT/CIMPORT to move SAS datasets to their Workstation,
where SAS and SAS/Graph are licensed for analysis and reporting.
The Mullen Award winning paper at CMG 2005 by MP Welch and Chris
Schreck, "Mainframe Software Licensing - Software Licensing Cost
Reduction Strategies for Large Mainframe Environments" described
exactly how SPRINT set up that environment. Their paper can be
downloaded from http://www.mxg.com/downloads.asp.
If you cannot afford to keep SAS on your z/OS platform, you can
save some software dollars by moving the MXG Application to ASCII,
but it will cost some people dollars for management of job streams
and for data management, archiving, and backup, and be prepared to
install a dedicated workstation because of the lack of any workload
management on those ASCII systems in their present state.
III. MVS Technical Notes.
23. DCOLLECT run against 3390-54 produces incorrect information for
DCVPERCT: DCOLLECT shows 21% free, but ISPF shows 0% full with
978,450 tracks total, and 977,940 tracks free, and ISMF shows 99%
free. IBM delivered Usermod ANDCOL1 (as FIXTEST) with Prereq
UA17960, and that corrected the DCOLLECT output.
22. A discussion on IBM-Main caused me to post this technical note:
Don't get a rumor started that RMF is expensive!
Looking at data from one of our largest sites, with ten LPARs and
over 30,000 DASD devices to monitor, they run both RMF Monitor I
and RMFGAT Monitor III (and as strongly recommended, they only
capture the DASD CACHE data from one system):
Monitor I daily CPU costs:
RMF Cache Collecting LPAR 26 minutes per day
RMF Sum of other 9 LPARS 51 minutes per day
total 77 minutes per day
They also choose to run RMF III, which is more than ten times as
expensive as Monitor I, but well worth it for its ability to
investigate delays:
Monitor III daily CPU costs:
RMFGAT Cache Collecting LPAR 330 minutes per day
RMFGAT Sum of other 9 LPARS 540 minutes per day
total 870 minutes per day
Total RMF I and RMF III 947 minutes per day
Their 47 Engines have a total capacity of 67,680 CPU Minutes, so
the total cost of RMF I and RMF III is only about 1.4% of capacity.
They create and process about 400 GigaBytes of SMF data daily.
The SMF dumping of that data took 163 minutes CPU time, and 27
hours elapsed time for all ten systems. The MXG processing of
those 400GB took 12 hours of CPU time and 33 Elapsed Hours (but
there is tremendous parallelism: SMF is dumped frequently and
split, the DB2 and CICS data is processed with each dump, and
the reports are run in parallel, etc.).
Summarized: RMF and RMFGAT Address Spaces 947 minutes
IFASMFDP runs 163 minutes
MXG Processing 400GB data 720 minutes
Total Daily Cost 1830 minutes
which is about 2.7% of installed CPU capacity for the site.
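The capacity arithmetic behind those percentages is easy to reproduce (all
figures come from the note above):

```python
engines = 47
capacity_min = engines * 24 * 60      # CPU-minutes available per day

rmf_i    = 26 + 51                    # Monitor I: cache LPAR + other 9
rmf_iii  = 330 + 540                  # Monitor III (RMFGAT), likewise
smf_dump = 163                        # IFASMFDP runs
mxg      = 12 * 60                    # MXG processing, 12 CPU hours

rmf_total   = rmf_i + rmf_iii
daily_total = rmf_total + smf_dump + mxg

print(capacity_min)                                             # 67680
print(rmf_total,   f'{100 * rmf_total   / capacity_min:.1f}%')  # 947 1.4%
print(daily_total, f'{100 * daily_total / capacity_min:.1f}%')  # 1830 2.7%
```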
21. APAR OA04644 discusses imbalance in page dataset activity if you
have PAV enabled on some paging devices, but not all. At one site
10 of their 20 page volumes had zero utilization when D ASM command
was issued from the console. The 10 unused were all non-PAV.
The APAR applies from z/OS 1.3 onwards, and it creates a separate
queue for PARTE control blocks for page datasets residing on PAV
devices. Then, if a page out occurs, ASM's search for where to put
that page first searches the PAV device queue, before looking at
the other devices. If a PAV device is located with a response time
less than the average of all other device types, it will be chosen
as the candidate for the page out.
20. There are some address spaces that must run at DPRTY='FF' because
they cannot be classified by WLM, but instead must run in the
SYSTEM service class. These include these ASID names:
*MASTER* RASP TRACE DUMPSRV XCFAS GRS SMSPDSE SMSVSAM
CONSOLE WLM IEFSCHAS JESXCF ALLOCAS IOSAS BPXOINIT
IXGLOGR SMF CATALOG JES2MON ANTMAIN WAD1CT
There's nothing you can do about them, but if they start using lots
of CPU time, they can cause severe degradation to other work. For
example, without APAR OA06341, ANTMAIN began eating up full engines
when 4 or 5 SnapShot copies were simultaneously run, killing batch!
19. APAR OA06341 reports increased job elapsed time for DSS Copy and
Virtual Concurrent Copy performing COPY DATASET of data residing
on SVA storage subsystem, and a big increase in CPU utilization
in the ANTMAIN address space. A circumvention is to specify
FASTREPLICATION(NONE) in the DFDSS parameters.
18. APAR OA08991 reports high CPU utilization in SMSPDSE address space
after installation of UA10647; utilization steadily increases and
doesn't level off, eventually leading to CPU performance problems.
17. A recurrence of these errors under SAS Version 9.1.3:
EXPECTING PAGE 218, GOT PAGE -1 INSTEAD.
PAGE VALIDATION ERROR WHILE READING WORK.XXXXXXX.DATA
FILE WORK.XXXXXXX.DATA IS DAMAGED. I/O PROCESSING DID NOT COMPLETE
These erratic errors (4 failures in 25 executions) are believed by SAS
to be caused either by BMC's MainView Batch Optimizer, MBO, fixed
by BMC APAR BQ32297, or by IBM's SHARK DASD, fixed by OA11453.
Aug 2007: BMC says BQ32297 was created in 2003 and released for
general availability in MAINVIEW Batch Optimizer Release 2.3.00,
and that maintenance is also in the now-current Release 2.4.00.
Oct 2010: CATALOG VMXGIN IS IN A DAMAGED STATE with SAS V9.2 is
caused by HyperPAV and Mainview Batch Optimizer.
See MXG Newsletter 42 MVS Technical Note "6. BMC reports Fix...."
16. APAR OA09846 reports high response time in RMF 74 records for DASD
connected to FICON channels when using the Extended Measurement Word
facility; EMW is used on all 2084 and 2086 machines with full
exploitation of the new hardware features. The Disconnect time can
be extremely large; the text indicates that while Connect Time might
be exposed to the same problem, that there were no actual reports of
connect time errors.
15. APAR OA11326 reports SMF 70 CPUWAITM much larger than RMF DURATM, in
records that also have SMF70CNF bit indicating CPU reconfiguration
activity during the interval.
14. APAR OA11469 reports that values for low impact frames could exceed
total central storage on the system, for 64-bit systems.
13. APAR OA11068 discusses high CPU usage with PDSE default Hyperspace
values after APAR OA06884, and documents new PDSE Hyperspace parms
to rectify the problem. It also documents that SMF_TIME(YES) is
the default option in IGDSMSxx, which synchronizes SMF 42 interval
record subtypes 1, 2, 15, 16, 17 and 18, and that SMF_TIME(YES)
overrides IGDSMSxx parameters CACHETIME, BMFTIME, and CF_TIME.
12. APAR OA11465 corrects "negative" values for IFA CPU times in SMF 30
records for multi-step jobs that use zAAP processors, both in
CPUIFATM (SMF30_TIME_ON_IFA) and CPUIFETM (SMF30_TIME_IFA_ON_CP).
An IBM "negative" value means the first bit is on; because MXG
INPUTs these fields as PIB, MXG will show a large positive number
rather than a negative one.
These other APARs also exist in the area of IFAs:
OA10707, OA07950, OA05731, OA09650, OA08110, OA07091, OA07320.
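The PIB-versus-signed distinction is easy to demonstrate: the same four bytes
with the high-order bit on decode as a large positive number when read as
unsigned binary (which is what a PIB informat does) but as a negative number
when read as signed binary. A Python illustration with a made-up 4-byte field:

```python
import struct

raw = b'\xff\xff\xff\xf6'    # hypothetical 4-byte field, first bit on

(signed,)   = struct.unpack('>i', raw)   # IBM's "negative" interpretation
(unsigned,) = struct.unpack('>I', raw)   # PIB: positive integer binary

print(signed)     # -10
print(unsigned)   # 4294967286 -- the large positive number MXG shows
```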
11. Using BMC's CMF Monitor (RMF Replacement) Version 5.5.05, you will
need to apply PTF #BPM9335; without that fix, variable PVTFPFN,
RMF field SMF71FIN (frames in nucleus) that was 5424 before the
version change became only 2339 frames after the new version.
Since PVTFPFN and PVTPOOL are added to create CSTORE in TYPE71
dataset, CSTORE was also incorrect without the fix.
10. APAR II14006 reports a serious problem on 2084 and 2086 CPUs with
(maintenance) MCL089 in the J13486 (SE-SP) stream if the machines
are at Driver 55. The SU_SEC value is one-quarter what it should
be after that maintenance, both in the operating system and in the
RMF records and on RMF reports. The major operational impact is
that period switching is based on service units that are calculated
from that constant, so tasks will receive four times the defined
DUR service units before they are switched to the next period, and
the DB2 facility that terminates DB2 transactions after some number
of service units will also not kick in until the transaction has
used four times the CPU time limit that you chose. Fixes for the
IBM error are available and documented in the APAR text.
While not documented in the APAR text, the value of R723NFFI, the
IFA Normalization Factor, was also incorrect due to MCL089, as
were R791NFFI and R792NFFI values.
9. z/OS 1.6 has revised the SMF buffer processing, and has externalized
new BUFSIZMAX and BUFUSEWARN parameters that you can set in SMFPRMxx,
and documented that BUFSIZMAX can be 1024MB, but 1.6 also removed the
"knobs" that were used with APAR OW56001 to set the size of buffer
expansion increments. Those changes are now very well documented in
Item BDC000031621, in response to a series of user questions:
The IBM answers:
You are correct, with z/os 1.6 level, you lose some control. The
APAR offset will NOT work to set parms. A new mechanism was used to
set the parameters. You can only set the max buffer size and the
warning level percentage. You cannot set the increment size and the
init size as they default to 8 MB. But I understand your concern
and have verified the changes made to SMF buffer processing, and the
conclusion is that you will NOT need these changes that you have
used in the past. Let me now explain the reasons motivating this
positive statement:
During initialization, SMF gets the BUFSIZMAX value that you have
specified and GETMAINs, in subpool 229 of the SMF address space,
BCAs (buffer chains) of BQEs (Buffer Queue Elements holding SMF
CIs) for the size that you have specified.
SMF now has two chains:
In-use chain:
two BCAs are initially set for the initial 8 MB
(pointed to by SLCABCAP)
Available chain:
contains the number of BCAs needed to represent
the storage you have requested minus the initial 8 MB
As SMF records get written, they are saved into BQEs while
waiting for physical write to the SMF datasets. When the in_use
chain fills at 75%, buffers from the available chain get added to
the In-Use chain to offer staging space for the incoming SMF
records. When records have been written and the activity is such
that the buffers are no longer needed, they get returned to the
available chain. Buffers from the available chain never get
freemained unless you dynamically reduce the BUFSIZMAX.
As you can see, the logic has changed and SMF does not do multiple
getmains and freemains as it did before. Therefore it is no longer
necessary to change the increment size, since it is only used to
set the initial In-Use buffers.
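The two-chain scheme the IBM answer describes can be modeled in a few lines.
In this toy Python sketch, the 8MB initial allocation, the two initial BCAs,
and the 75% threshold come from the text above; the 4MB-per-BCA size and the
record counts are illustrative assumptions. It shows buffers migrating between
the chains rather than being GETMAINed and FREEMAINed:

```python
class SmfBufferPool:
    """Toy model of z/OS 1.6 SMF buffering: buffers move between an
    available chain and an in-use chain; storage is only released
    if BUFSIZMAX is dynamically reduced."""
    def __init__(self, bufsizmax_mb, initial_mb=8, bca_mb=4):
        self.in_use = initial_mb // bca_mb              # two initial BCAs
        self.available = (bufsizmax_mb - initial_mb) // bca_mb
        self.filled = 0                                 # BCAs holding records

    def write_record(self):
        self.filled += 1
        # expand once the in-use chain passes 75% full
        if self.filled > 0.75 * self.in_use and self.available > 0:
            self.available -= 1
            self.in_use += 1

    def drain(self):
        # records written to the SMF datasets; surplus buffers are
        # returned to the available chain, never freemained
        self.filled = 0
        surplus = self.in_use - 2
        self.in_use -= surplus
        self.available += surplus

pool = SmfBufferPool(bufsizmax_mb=128)
for _ in range(20):
    pool.write_record()
pool.drain()
print(pool.in_use + pool.available)   # 32 -- total BCAs never shrink
```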
8. APAR OA10901 adds the zAAP Normalization factor into the SMF 30
record.
7. If your site's ACS (allocation) rules have a really stupid default
that has a "fall thru" that puts that allocation in a Data Class
that allocates an Extended Sequential dataset, instead of a simple
Physical Sequential Data Class, when you write to that allocation,
you will get these SAS error messages on the log:
ERROR: OPEN TYPE=J FAILED TO POSITION LIBRARY DATA SET PDB....
ERROR: ATTEMPT TO OPEN SAS DATA LIBRARY PDB FAILED.
and your SYSMSG will have this IBM error message
IEC143I 213-B8,IFG0194D,....
because SAS does not support Extended Sequential for data libraries.
(You can allocate a sequential format, tape, library to a Data Class
that is Extended Sequential, so that you can hardware compress it;
see "How do you hardware compress a SAS dataset" in Newsletter 36.)
6. APAR OA07672 revises how WLM-managed initiators are started, and
adds new SMF 99 trace codes.
5. APAR OA06687 reports incorrect values for Queue Delay (R723CQ) and
Other Unknown Delay (R723CUNK) can be propagated into other service
classes if one of the batch classes registered to WLM-managed
initiators experiences Queue or Unknown delay, and large values may be
seen in those variables for non-batch service classes.
4. MXG revised: Use the IBM Default MSOCOEFF of zero, not .0001.
Thanks to a query to IBM support from an MXG user when I couldn't
answer why his MSOUNITS didn't match his PAGESECS with the IBM
published formula, IBM has discovered that the MSOCOEFF value that
is used internally in their code has a minimum value of 0.0122.
While the WLM definition supports a value as small as 0.0001, and
while that value is reported in TYPE72s and TYPE30s, the smallest
value that can be represented internally is 0.0122. IBM will update
the SMF Manual to show the actual formula used for scaling:
MSO COEFF used for Calculations =
   Input MSO Coeff * 10000     4096
   -----------------------  *  ----  + 1
           10000                50
The result is truncated to the nearest integer value. As a result,
input values between 0.0001 and 0.0122 all result in a value of 1,
and an MSO coefficient of 1 results in a value of 82, which is then
used by WLM to calculate MSO service. That 4096 is a page frame
size of 4096 bytes: "To make MSO roughly commensurate with CPU
service units, the raw number is divided by fifty to yield MSO
Service Units."
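The scaling and truncation can be checked directly; this small Python
function reproduces the formula and the boundary values quoted above:

```python
def effective_mso_coeff(input_coeff):
    """IBM's internal scaling of the WLM MSO coefficient;
    the result is truncated to an integer."""
    return int((input_coeff * 10000) / 10000 * (4096 / 50) + 1)

print(effective_mso_coeff(0.0001))   # 1  -- smallest WLM input value
print(effective_mso_coeff(0.0122))   # 1  -- anything up to 0.0122 yields 1
print(effective_mso_coeff(1))        # 82
```

Note the first input above 0.0122 that scales differently is 0.0123, which
yields 2, confirming why all inputs from 0.0001 through 0.0122 are equivalent.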
IBM's recommended default MSOCOEFF is zero, but MXG had previously
recommended that you use 0.0001, for these two reasons:
- So that MSOUNITS in both type 30 and type 72 are non-zero. Memory
service units are, in principle, poor metrics, because they count
memory ONLY while the program is consuming TCB CPU time, and record
nothing for pages owned while a task is resident and waiting; but
when non-zero they are useful primarily to show how poor they have
always been, since you can now compare the MSOUNITS page-seconds
with the ACTFRMTM resident-frame-seconds; I've seen completely
erratic MSOUNITS when compared with the better ACTFRMTM metric.
However: memory variability is to be EXPECTED; memory is NOT
something a program chooses to use some amount of, but
real memory is "doled out" to programs by WLM, based
on your service level objectives, and the instantaneous
system utilization when your program ran, etc.
I have used that variability to prove to management exactly why we
cannot "charge" for "memory used", so I preferred to record them.
- So that MSOUNITS are NOT a significant percentage of total service
units. In compat mode, Service Unit absorption rates were used in
MPL management (i.e., swapping decisions) and even earlier were
used in those beloved "OBJ" curves, and the early IBM defaults of
10,10,5,3 for CPU/SRB/IO/MSO caused MSO to be a major percent of
total service. But memory is NOT work - work is CPU and I/O, and
making "work" and swap decisions based on total service that was
dominated by MSOUNITS was wrong, which is precisely what the APAR
that changed the minimum MSOCOEFF from 0.1 to 0.0001 stated was
its purpose - to REMOVE the impact of MSO from total service, and,
ultimately, why IBM now sets a default of zero for MSOCOEFF.
That second issue, major decisions based on service units, is no
longer true. In WLM/Goal Mode, service units are now used ONLY to
determine period switch of service classes with multiple periods,
so a high percentage from MSO Service doesn't impact all workloads.
But how bad can the percent of service from MSOUNITS be with 64-bit
hardware, even with a 100:1 ratio of CPUCOEFF to MSOCOEFF? Pretty
bad, as this z/OS 1.6 data from CPUs with SU_SEC 10K-19K shows:
CPU=10 SRB=10 IO=6 MSO=0.1 (i.e. 100:1 CPU:MSO ratio)
Total Service= 40,010,255,811
MSO Service = 29,371,433,067
CPU Service = 6,340,430,161
SRB Service = 2,414,650,968
IO Service = 1,187,416,751
CPU+SRB+IO = 10,388,822,744
Even with a 100:1 ratio, MSOUNITS were 73% of total service units!
Reducing the MSOCOEFF from 0.1 to 0.0122 would reduce MSO Service
to 3,583,384,834, and total to 13,972,137,579, but even then, with
a ratio over 800:1, MSOUNITS would still be 25% of total service.
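A minimal check of that arithmetic, using the service-unit figures
reported above (variable names are mine):

```python
# Service units reported above (z/OS 1.6, CPU=10 SRB=10 IO=6 MSO=0.1):
total_service = 40_010_255_811
mso_service   = 29_371_433_067
cpu_srb_io    = 10_388_822_744   # CPU + SRB + IO service combined

# Even at a 100:1 CPU:MSO coefficient ratio, MSO dominates:
mso_pct = 100 * mso_service / total_service
assert int(mso_pct) == 73        # 73% of total service from MSOUNITS

# Reducing MSOCOEFF from 0.1 to the internal minimum 0.0122 scales
# MSO service by 0.0122/0.1, yet MSO is still about a quarter of total:
reduced_mso = mso_service * 0.0122 / 0.1
reduced_pct = 100 * reduced_mso / (cpu_srb_io + reduced_mso)
assert int(reduced_pct) == 25    # still roughly 25% of total service
```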
And note that this data had CPUCOEFF=10, which is 10 times the recent
recommendation to set CPUCOEFF to ONE! I believe the motivation for
CPUCOEFF=1 was that the SMF 30 records did not contain the service
coefficients, so by setting CPU/SRB=1 you could calculate CPU time
directly from CPU service. But IBM recently put the coefficients in
the 30s, so using a larger CPUCOEFF could be one way to ensure that
MSOUNITS are a small percent of total service while still recording
the MSOUNITS in your 30s and 72s.
Ok, the real truth: I had thought that with MSOUNITS=0 the PAGESECS
field in the SMF 30s would also be zero, so my suggestion to use
MSOCOEFF=0.0001 was really to preserve that SMF 30 field; I had only
used MSOUNITS in the SMF 72s to show how bad they were once IBM
added the Active Frame Time field (ACTFRMTM) in TYPE72GO. Now,
knowing that the "terrible" PAGESECS are still recorded with
MSOCOEFF=0, I believe that MSOCOEFF=0 is the best choice: it
prevents MSOUNITS from being included at all in total service units,
so MSOUNITS will NOT impact period switching of your multiple-period
service classes, and you won't have to change your existing
CPU/SRB/IO coefficients!
Note that you may need to change your DUR value in those multiple
period service classes, if MSOUNITS are currently a significant
portion of the total service units with your present coefficients.
APAR OA10641 documents the above 0.0122 value and the equation.
3. APAR PQ98205 reports excessive SMF 80 records and a "memory leak"
(known to real System Programmers as a MEMORY CODING ERROR) in
WebSphere V5.0. The SMF 80 records are EVENT CODE 67.
The fix, to free the old memory before replacing the security token
with a new one, is still not immediate; this is Java, folks, and the
cleanup is, like Basic on TRS-80s, done via "garbage collection".
2. Info APAR II13427 lists changes required for CA-TOP SECRET and ACF2
by WebSphere 401, 500, and 501 releases.
1. Info APAR II13360 lists many problems caused by third party products
in OPEN/CLOSE/EOV and Access Methods PDSE/HFS/CMM/CVAF/DADSM.
IV. DB2 Technical Notes.
4. IBM Redbook "Stored Procedures: Through the Call and Beyond"
discusses how to keep track of "nested" DB2 times.
3. APAR PQ99525 fixes a couple of problems for "nested" DB2 access;
i.e. Stored Procedures, Triggers, and User-Defined Functions (UDF).
a. Class 1 accounting for these items could capture and record small
fractions of Class 2 In-DB2-Time, causing QWACASC and QWACAJST
to have non-zero values when Class 2 accounting was NOT active.
b. UDFs and Stored Procedures require In-DB2-Time to connect to and
disconnect from DB2; this time was not being accounted for in
Class 2 times (QWACSPTT,QWACSPEB,QWACUDTT,QWACUDEB). Class 3
suspension time is recorded during this connect and disconnect
processing, and thus Class 3 time could be significantly greater
than Class 2 time.
2. APAR PK04803 reports QWACEJST may be zero, or less than QWACBJST for
RRSAF threads that execute in multiple tasks; these records would
also have QWACRINV equal to 6 and/or 16. Fix expected July, 2005.
1. The IBM Redbook "Distributed Functions of DB2 for z/OS and OS/390",
publication SG24-6952 discusses high volumes of SMF 101 records if
DB2 thread pooling is used:
-DB2 thread pooling can be a disadvantage at very high transaction
rates, because an SMF accounting record is cut every time a
thread becomes inactive. In such scenarios it may be better to
avoid thread pooling and stick with CMTSTAT=ACTIVE. If so, you
should ensure that some kind of client-side pooling is active
(such as WebSphere connection pooling, discussed in 6.1.6,
"Application connection pooling with ODBC and JDBC" on page 159),
so that threads are kept open for the application connections,
and the applications perform regular commits, but do not
disconnect.
-The recently announced Version 8 of DB2 for z/OS will provide
some relief in this area as well. DB2 V8 will allow roll-up
accounting information for DDF threads. Instead of writing an
accounting record every time a thread gets pooled, an accounting
record is only written after 'n' occurrences, where 'n' is a new
DSNZPARM. See "DB2 UDB for z/OS Version 8 Technical Preview",
publication SG24-6871, section 6.8.2, "Roll up accounting data
for DDF and RRSAF threads":
If you establish a connection to DB2 V7 via DDF, you normally
want to use DB2's type 2 inactive threads. This function is also
known as thread pooling. This feature is enabled by specifying
CMTSTAT=INACTIVE in DSNZPARM. Using the inactive thread support
allows you to connect up to 150,000 users to a single DB2
subsystem. However, if you are running high volume OLTP workloads
in this environment, you might encounter a performance
bottleneck, because DB2 cuts an accounting record on every COMMIT
or ROLLBACK when using thread pooling, and SMF might have a hard
time keeping up with the massive number of DB2 accounting records
that are produced.
-You may encounter a similar problem when using the RRS attach, in
combination with WebSphere. WebSphere drives the RRS signon
interface on each new transaction, and DB2 cuts an accounting
record when this happens. An accounting record is cut, even
though some of these transactions contain just one SELECT
statement followed by a COMMIT.
-DB2 V8 adds a new installation option to activate the rollup of
accounting information for DDF threads that become inactive, and
RRS threads. The new option DDF/RRSAF ACCUM has been added to
installation panel DSNTIPN. The default is NO. The values
accepted for this option range from 2 to 65535. The
corresponding DSNZPARM is ACCUMACC.
-When NO is specified, DB2 writes an accounting record when a DDF
thread is made inactive, or when signon occurs for an RRSAF
thread. If any number between 2 and 65535 is specified, DB2
writes an accounting record after every n occurrences of an end
user on any RRS or DDF thread, where n is the number specified
for this parameter. An end user is identified as the
concatenation of the following three values:
End user userid (QWHEUID, 16 bytes),
End user transaction name (QWHCEUTS, 32 bytes), and
End user workstation name (QWHCEUWN, 18 bytes).
-Even when you specify a value between 2 and 65535, DB2 may
choose to write an accounting record prior to the nth occurrence
of the end user in the following cases:
-A storage threshold is reached for the accounting rollup
blocks.
-You have specified accounting interval = 'COMMIT' on the RRSAF
signon call.
-When no updates have been made to the rollup block for 10
minutes; that is, the user has not performed any activity
detectable in accounting for over 10 minutes.
-Search the archives at MXG-L for ACCUMACC for several postings
that discuss the data that is lost when ACCUMACC is enabled.
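The roll-up keying and counting that ACCUMACC describes can be
sketched as follows. This is an illustration only, not DB2's code;
the function and variable names are mine, and only the fixed-width
key concatenation and the every-n-th-occurrence logic come from the
Redbook text above:

```python
# Hypothetical sketch of DB2 V8 ACCUMACC-style accounting roll-up:
# occurrences are accumulated per "end user", keyed by concatenating
# userid (16 bytes), transaction name (32 bytes), and workstation
# name (18 bytes); a record is externalized on every n-th occurrence.
from collections import defaultdict

ACCUMACC = 10                 # hypothetical DSNZPARM value: write every 10th

def rollup_key(userid, transname, workstation):
    # Fixed-width concatenation, per the field lengths listed above
    return (userid.ljust(16)[:16] +
            transname.ljust(32)[:32] +
            workstation.ljust(18)[:18])

counts = defaultdict(int)

def end_user_occurrence(userid, transname, workstation):
    """Count one occurrence; return True when a record would be cut."""
    key = rollup_key(userid, transname, workstation)
    counts[key] += 1
    if counts[key] >= ACCUMACC:
        counts[key] = 0       # roll-up block written and reset
        return True
    return False

# Nine occurrences accumulate silently; the tenth cuts a record:
writes = [end_user_occurrence("USER01", "TRANA", "WS01") for _ in range(10)]
assert writes == [False] * 9 + [True]
```

Note that, as the Redbook cautions, DB2 may also write the record
early (storage threshold, ACCOUNTING INTERVAL='COMMIT', or a 10-minute
idle roll-up block), which this sketch does not model.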
V. IMS Technical Notes.
VI. SAS Technical Notes.
5. SAS Clones.
This article was written in the Summer of 2006, and has been
superseded by the WPS Technical Note VI. in MXG Technical
Newsletter FIFTY-ONE, dated December 6, 2007.
a. WPS is still vapor-ware:
Last August I offered the WPC company president a test of WPS:
- if it worked perfectly, I'd be morally obligated to report that
- if it failed, I would not be obliged to share my knowledge
- He couldn't handle that offer.
- Several USA sites were to get the software in March; none has yet.
- Purportedly '80%' of base SAS statements are supported?
- 'Not Ready for Prime Time', maybe this fall?
- Company stated that 'MXG IS NOT SUPPORTED', maybe in 2006!
- Company wants to run your SAS code first to see what breaks.
- Written in Java, so it can exploit the zAAP.
- But they are rewriting their Java kernel for performance!
- Not one single execution comparison has yet been published.
- Pricing is not competitive in the first year(s)?
A site paying the equivalent of USD 200,000 per year for base SAS
alone received these IBM quotes, in equivalent USD:
WPS - MSU Based Price $2,100,000
"List price is $2,500/MSU for WPS"
WPS - 'Enterprise' License $1,200,000
b. Issues for MXG to support any SAS clone:
i. Execution Performance attributes
- CPU consumption/interference/chargeback
- I/O performance/elongation/chargeback
- Elapsed run time comparisons
- Disk Space required
- File compression, effectiveness, costs
- z/OS files or NFS files - backup, etc.
- SORT performance
ii. Numeric accuracy
- floating point representation
- significant digits
iii. Full support of all SAS language facilities
- informats
- formats
- functions
- bit and byte tests
- important Procedures
- datetime, hex, etc, representations
- old-style macros, %macros, heavily nested, interleaved
iv. Accuracy of interpreter
- it's one thing to interpret error-free SAS code
- but how good are the diagnostics when the code has errors?
- years of SAS development to identify errors clearly
- invalid data handling:
- hex dumps of bad records?
- exact location of data error?
- error handling when invalid data detected?
- broken VBS data segment support?
v. Size of development and support team
- was 5 ex-IBMers from Hursley
- now supposedly 21 people
vi. Morality
Should a SAS-based product that is essentially free,
created by 'children of the sixties',
support a SAS-clone product whose total motivation
is to under-cut the price of that outstanding product,
by only replicating parts of that product?
WHEN YOU CAN SAVE MORE MONEY BY MOVING SAS TO A PC?????
4. The ID variable is kept from the last observation when PROC MEANS is
used to create an output dataset.
3. SAS Note SN-014639 documents why you get those BPX messages, like
BPXP018I THREAD ... ENDED WITHOUT BEING UNDUBBED ... CODE 0003E7
They only show that there was a USER 999 ABEND ('3E7'x=999 dec),
and that Unix System Services had been called ('dubbed'), and have
no impact. The USER 999 ABEND means: look on the SAS log for the
real ERROR message; option ERRORABEND is in effect, and it causes
any SAS error message to cause the step to end with USER 999 ABEND.
2. Using PROC MEANS N MIN MAX SUM DATA=PDB.RMFINTRV; VAR CPU: ;
(to investigate all variables starting with CPU) fails with RMFINTRV
because that dataset has character variables that start with CPU.
In emails with SAS Support requesting that the ERROR be changed to
a WARNING, they suggested this alternative, using the _CHARACTER_
dataset option:
PROC MEANS N MIN MAX SUM DATA=RMFINTRV (DROP=_CHARACTER_);
VAR CPU: ;
which works with all PROCs that expect only numeric variables.
And, if only character variables are wanted, you can use this syntax:
PROC FREQ DATA=RMFINTRV (DROP=_NUMERIC_); TABLES CPU: ;
1. SAS option MAUTOLOCDISPLAY will show the library/directory name from
which an AUTOCALLed macro was loaded, useful in diagnosing problems
when a user hasn't removed old VMXGxxxx members that define %MACROs,
since MXG uses AUTOCALL to compile all of its %MACROs.
VII. CICS Technical Notes.
VIII. Windows NT Technical Notes.
IX. z/VM Technical Notes.
1. APAR VM63636 reportedly corrects paging problems in z/VM 5.1.
X. Incompatibilities and Installation of MXG 23.01.
1. Incompatibilities introduced in MXG 23.01 (since MXG 22.22):
See CHANGES.
2. Installation and re-installation procedures are described in detail
in member INSTALL (which also lists common Error/Warning messages a
new user might encounter), and sample JCL is in member JCLINSTL.
XI. Online Documentation of MXG Software.
MXG Documentation is now described in member DOCUMENT.
XII. Changes Log
--------------------------Changes Log---------------------------------
You MUST read each Change description to determine if a Change will
impact your site. All changes have been made in this MXG Library.
Member CHANGES always identifies the actual version and release of
MXG Software that is contained in that library.
The CHANGES selection on our homepage at http://www.MXG.com
is always the most current information on MXG Software status,
and is frequently updated.
Important changes are also posted to the MXG-L ListServer, which is
also described by a selection on the homepage. Please subscribe.
The actual code implementation of some changes in MXG SOURCLIB may be
different than described in the change text (which might have printed
only the critical part of the correction that needs to be made by
users).
Scan each source member named in any impacting change for any comments
at the beginning of the member for additional documentation, since the
documentation of new datasets, variables, validation status, and notes,
are often found in comments in the source members.
Alphabetical list of important changes after MXG 22.22 now in MXG 23.01:
Dataset/
Member Change Description
See Member CHANGES or CHANGESS in your MXG Source Library, or
on the homepage www.mxg.com.
Inverse chronological list of all Changes:
Changes 23.yyy thru 23.001 are contained in member CHANGES.