COPYRIGHT (C) 1984-2021 MERRILL CONSULTANTS DALLAS TEXAS USA
MXG NEWSLETTER FIFTY-FIVE
***********************NEWSLETTER FIFTY-FIVE****************************
MXG NEWSLETTER NUMBER FIFTY-FIVE, JAN 20, 2010
Technical Newsletter for Users of MXG : Merrill's Expanded Guide to CPE
TABLE OF CONTENTS
I. MXG Software Version.
II. MXG Technical Notes
III. MVS, aka z/OS, Technical Notes
IV. DB2 Technical Notes.
V. IMS Technical Notes.
VI. SAS Technical Notes.
VI.A. WPS Technical Notes.
VII. CICS Technical Notes.
VIII. Windows NT Technical Notes.
IX. z/VM Technical Notes.
X. Email notes.
XI. Incompatibilities and Installation of MXG.
See member CHANGES and member INSTALL.
XII. Online Documentation of MXG Software.
See member DOCUMENT.
XIII. Changes Log
Alphabetical list of important changes
Highlights of Changes - See Member CHANGES.
COPYRIGHT (C) 1984,2010 MERRILL CONSULTANTS DALLAS TEXAS USA
I. The 2010 Annual Version MXG 27.27 is dated Jan 20, 2010.
All sites were mailed a letter with the ftp download instructions.
The availability announcement was posted to both MXG-L and ITSV-L.
You can always request the current version using the form at
http://www.mxg.com/ship_current_version.
1. The current version is MXG 27.27, dated Jan 20, 2010.
See CHANGES member of MXG Source, or http://www.mxg.com/changes.
II. MXG Technical Notes
3. SAS option COMPRESS=YES impact on z/OS MXG Execution.
The MXG default for all platforms is COMPRESS=YES. For all ASCII
platforms benchmarks have proven that IS correct and desirable: on
ASCII systems, COMPRESS=YES minimizes both disk space and CPU time
needed to create MXG datasets.
On z/OS, while COMPRESS=YES does minimize disk space required, it
does require additional CPU time. So, like most performance issues,
it all depends - on whether your disk space is the limiting factor
(in spite of the incredible reduction in the real costs for disks)
or if the CPU Time consumption is of more concern (at 3am??).
At CMG 2009, Chuck Hopf reported an MXG tailoring that set option
COMPRESS=NO for the SMF/SORT processing to a TEMPPDB to save CPU
time, followed by a PROC COPY with COMPRESS=YES to minimize the
size of the output PDB data library. But then I discovered that in
SAS V9, COMPRESS=YES can be specified on a LIBNAME statement, which
eliminated the tailoring, the TEMPPDB and the PROC COPY. Chuck then
reran these tests, reading an 11GB SMF file, always compressing the
output PDB library:
Compress Option   CPUTM   WORK    PDB  CICSTRAN  DB2ACCT  PDBTEMP  TOTAL
                    sec    cyl    cyl       cyl      cyl      cyl    cyl
YES, GLOBAL        2745   2441   1813      1223     2765             8242
NO, PROC COPY      1867   6934   1816      5114    10046     6006  29916
NO, LIBNAME YES    2376   6934   1816      1223     2765            12737
NO, TAPE           2061   6934   1816      TAPE     TAPE             8750
The COMPRESS=YES option minimizes the total disk space for all of
the output data libraries AND for the WORK library, but at a cost
of 15 minutes more CPU time, from 31 to 46, a 48% increase.
To save those 15 CPU minutes using COMPRESS=NO+PROC COPY for only
the PDB library, the total disk space for the job increased from
8242 to 29,916 cyls, nearly four times, and the output libraries
increased nearly three times, from 5801 to 16976 cylinders.
The intermediate choice, using COMPRESS=NO for the WORK library with
COMPRESS=YES on the three output LIBNAMEs, saves 6 minutes of CPU
time versus Global COMPRESS=YES (it is 8 minutes, or 27%, above the
minimum); the total disk space for the job increased by about 55%,
from 8242 to 12,737 cylinders, and that increase was all in the
uncompressed temporary WORK library:
//SYSIN DD *
OPTIONS COMPRESS=NO;
%LET PDB2ACC=DB2ACCT;
LIBNAME PDB COMPRESS=YES;
LIBNAME DB2ACCT COMPRESS=YES;
LIBNAME CICSTRAN COMPRESS=YES;
%INCLUDE SOURCLIB(BUILDPDB);
The last choice, writing the CICSTRAN and DB2ACCT datasets to TAPE
LIBNAMEs, compressing only the output PDB library, increased the
CPU time from minimum by only 3 minutes, to 34, with only 8750 cyl
required for the uncompressed WORK and compressed PDB libraries.
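The trade-offs quoted above can be recomputed from the benchmark table. The following is a minimal Python sketch, not part of MXG; the dictionary layout and the function name `compare` are illustrative assumptions, and only the CPU seconds and total cylinders come from the table. Because it works from exact seconds rather than rounded minutes, its percentages can differ by a point from those in the text.

```python
# CPU seconds and total cylinders copied from the benchmark table above.
runs = {
    "YES, GLOBAL":     {"cpu_sec": 2745, "total_cyl": 8242},
    "NO, PROC COPY":   {"cpu_sec": 1867, "total_cyl": 29916},
    "NO, LIBNAME YES": {"cpu_sec": 2376, "total_cyl": 12737},
    "NO, TAPE":        {"cpu_sec": 2061, "total_cyl": 8750},
}

def compare(runs):
    """Per run: CPU minutes, CPU % above the fastest run, disk ratio to the smallest run."""
    min_cpu = min(r["cpu_sec"] for r in runs.values())
    min_cyl = min(r["total_cyl"] for r in runs.values())
    out = {}
    for name, r in runs.items():
        out[name] = {
            "cpu_min": round(r["cpu_sec"] / 60, 1),
            "cpu_pct_over_min": round(100 * (r["cpu_sec"] - min_cpu) / min_cpu),
            "disk_ratio": round(r["total_cyl"] / min_cyl, 2),
        }
    return out

summary = compare(runs)
for name, s in summary.items():
    print(f"{name:16s} {s['cpu_min']:5.1f} min  "
          f"+{s['cpu_pct_over_min']:2d}% CPU  {s['disk_ratio']:.2f}x disk")
```

Laying the four runs side by side this way makes it easier to judge whether CPU time or disk space is the limiting factor at a given site.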
Note that for TAPE output on z/OS, again, it all depends!!
With the Global COMPRESS=YES option, TAPE output is NOT COMPRESSED;
this makes complete sense, since ALL tape control units compress at
that hardware level, at NO CPU cost. However, if COMPRESS=YES
is specified either as a dataset option or as a LIBNAME option, SAS
does compress tape output datasets. This could be required if you
have virtual tape systems that write uncompressed to DASD.
Additionally, compression of datasets in a libref is always at
the dataset level. For example, if you use COMPRESS=YES option on
a LIBNAME, all created datasets will be compressed, but you could,
later, add an uncompressed dataset to that data library.
So, my recommendation for z/OS is to still use the MXG default of
Global COMPRESS=YES. I think the exposure and cost of running out
of disk space, causing a BUILDPDB job to ABEND, is far higher than
the small increase in CPU time, especially if the BUILDPDB is run
in the slack time of day. But, the above tests do quantify the
possible CPU time savings, if that truly is the limiting factor.
2. Removal of duplicate observations from MXG's PDB.JOBS.
A "job" is a unique instance of READTIME JOB JESNR, but a PDB.JOBS
observation is created from multiple SMF type 6, 26 and 30 records
which might be created in several days' SMF datasets.
-There are several sources of possible duplicates in PDB.JOBS:
a. Duplicate records have NEVER been created in the VSAM SMF dataset,
but design errors in the SMF VSAM dumping procedures, human errors
or hardware or software failures in the SMF processing jobstream
have created actual duplicate records in the SMF datasets that MXG
processes. If you have a well designed SMF dumping procedure,
and never experience a job failure, you cannot have duplicates.
b. If duplicated SMF records do exist in the input SMF file that MXG
reads (e.g., same SMF dataset concatenated to itself), BUILDPDB
will NOT create duplicates in PDB.JOBS, because the NODUPRECS SORT
option is used to remove duplicates from the datasets MXG creates
in the BUILDPDB program. These sorts require BY lists that span
all possible sequences so that duplicates are physically adjacent,
and that is why sometimes, the MXG BY list has had to be changed
to guarantee that adjacency for duplicate removal.
c. But "pseudo-duplicates" can be created by BUILDPDB that we do NOT
want to remove: PDB.JOBS observations with the same READTIME JOB
and JESNR, but that are not actual duplicates. The SPINCNT value
in IMACSPIN sets the number of BUILDPDB executions (days) when
records for inflight (incomplete) jobs are held; jobs are "spun"
until the job's Purge record is read. When SPINCNT is exceeded,
whatever records happen to be in that SPIN library will be output
to that PDB.JOBS. Then, when more of that job's records are read,
another observation with the same READTIME JOB JESNR is output, in
a different day's PDB.JOBS. But these are not duplicates; each
will have different sets of variables populated from the different
SMF records that were read. For example, if SPINCNT=0, a job that
executed today, but was in the held output queue when SMF VSAM was
dumped, will create a PDB.JOBS observation with the CPU/EXCP/etc
execution resource variables populated, but all of the scheduling
datetimes (JSTRTIME,JPURTIME,etc) from the Purge record will have
missing values. Then, tomorrow, when the print/purge SMF records
are read, a second observation for that job will be output with
that same READTIME JOB JESNR, but with only the print lines and
scheduling datetime variables populated. We do NOT want to delete
these pseudo-duplicate observations from our PDB.JOBS dataset.
d. But "real" duplicate observations can be created, if
SMF records that were read "yesterday" are accidentally reread
again "today". This would create separate PDB.JOBS observations
in two different daily PDBs that WOULD have identical values for
all resources. Those duplicate observations differ ONLY in their
ZDATE/ZTIME values (the "run date" of the BUILDPDB execution), so
if you do then combine the daily PDBs into the same week's PDB,
you CAN use this PROC SORT to delete these true duplicates.
PROC SORT NODUPKEY DATA=WEEK.JOBS OUT=WEEK.NODUPJOB;
BY READTIME JOB JESNR INBITS JINITIME JTRMTIME INITTIME
PRINTIME JPURTIME CPUTM EXCPTOTL EXCPTODD EXCPNODD PRINTLNE;
Option NODUPKEY must be used here, instead of MXG's normal NODUP,
because ZDATE/ZTIME are NOT identical in each pair of duplicates.
e. BUT: if only some of the job's records are repeated, or if the job
already is "SPINing" (has some records held in the SPIN library),
then the re-reading of only some of a job's SMF records is much
more insidious, and the above PROC SORT would not likely detect
that kind of duplication.
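The difference between a true duplicate (case d) and a pseudo-duplicate (case c) can be illustrated outside SAS. This is a minimal Python sketch of what a NODUPKEY sort over the BY list above does; the helper `make_obs` and all sample values are hypothetical, and the sketch does not reproduce PROC SORT internals.

```python
# Key fields taken from the BY list above; ZDATE/ZTIME are deliberately excluded.
KEY_FIELDS = ("READTIME", "JOB", "JESNR", "INBITS", "JINITIME", "JTRMTIME",
              "INITTIME", "PRINTIME", "JPURTIME", "CPUTM", "EXCPTOTL",
              "EXCPTODD", "EXCPNODD", "PRINTLNE")

def drop_duplicate_keys(observations):
    """Sort by the key fields and keep the first observation for each distinct key."""
    seen = set()
    kept = []
    for obs in sorted(observations, key=lambda o: tuple(o[f] for f in KEY_FIELDS)):
        key = tuple(obs[f] for f in KEY_FIELDS)
        if key not in seen:
            seen.add(key)
            kept.append(obs)
    return kept

def make_obs(zdate, cputm, jpurtime):
    """Hypothetical PDB.JOBS observation; 0 stands in for a missing value."""
    obs = dict.fromkeys(KEY_FIELDS, 0)
    obs.update(READTIME="2010-01-19T08:00", JOB="PAYROLL", JESNR=12345,
               CPUTM=cputm, JPURTIME=jpurtime, ZDATE=zdate)
    return obs

week = [
    make_obs("2010-01-19", 12.5, 0),     # execution records read on day 1
    make_obs("2010-01-20", 12.5, 0),     # same records reread on day 2: true duplicate
    make_obs("2010-01-20", 0, 1700),     # purge records on day 2: pseudo-duplicate
]

kept = drop_duplicate_keys(week)
print(len(kept), [o["ZDATE"] for o in kept])
```

The reread observation is dropped because it differs only in ZDATE, while the pseudo-duplicate survives because its resource and scheduling values differ.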
1. Search Arguments.
Some examples of search arguments for MXG and related information:
Using Google to keyword search at a specific site, for example, at
www.mxg.com or at www.ibm.com:
+websphere +db2 +wlm +classification site:mxg.com
+websphere +db2 +wlm +classification site:ibm.com
Alternatively, this url is the Google Advanced Search page:
http://www.google.com/advanced_search?q=site:www.mxg.com&hl=en
For mxg.com, you can also use the SITE SEARCH option (on left) at
http://www.mxg.com/newsletters/
But the MXG-L ListServer Postings Archive is not at www.mxg.com,
so the above site searches will not find MXG-L postings. The link
to search all MXG-L postings, since its Oct, 1996, inception, is:
http://peach.ease.lsoft.com/scripts/wa.exe?A0=MXG-L
III. MVS, a/k/a z/OS, Technical Notes.
13. APAR OA30974 reports that if you use SMF Logger AND have removed all
MANx datasets from SMFPRMxx, but have LASTDS(HALT) specified there,
an IPL will fail, as the system enters wait state D0D RSN01. Remove
the LASTDS(HALT) to circumvent until IBM has a PTF for the APAR.
12. APAR OA31547 reports that SMF 89 records can stop being written if
you change SMF recording from NOACTIVE to ACTIVE. APAR still open.
11. APAR PK86020 reports A REPORTING PROBLEM WITH THE RMF WORKLOAD
ACTIVITY REPORT WITH RESPECT TO THE TRANS-TIME
ERROR DESCRIPTION:
A reporting problem with the RMF Workload Activity report with
respect to the Trans-Time. Here is a sample report that shows the
TRANS-TIME values that are not correct:
TRANS-TIME HHH.MM.SS.TTT
ACTUAL 973
EXECUTION 875
QUEUED 97
USERS AFFECTED: All users of IBM WebSphere Application
Server V6.1.0 viewing RMF diagnostic
reports.
The RMF Workload Activity report shows an incorrect queued
value under the TRANS-TIME values.
PROBLEM CONCLUSION:
Because the queue times are reported based on the values in the
thread at the time of capture, the values presented may be incorrect
if the thread has switched during the course of processing. This
may occur if SSL or webservices are active. The problem is resolved
by copying these values through the java portion to circumvent the
loss of the values during thread switching.
APAR PK86020 is currently targeted for inclusion in Service Level
(Fix Pack) 6.1.0.29 of WebSphere Application Server V6.1
10. APAR OA31088 reports Z/OS 1.11 B10 INCORRECT FREE SPACE AND LARGEST
FREE EXTENT IN SMS VOLUME CONTROL BLOCK IMMEDIATELY AFTER VARY
ONLINE
ERROR DESCRIPTION:
In z/OS 1.11 environment, free space and largest free extent are
incorrect for new volumes that have been varied online and have not
had any datasets allocated to it. The incorrect statistics are in
the volume statistics control block (IGDVLD) which feeds downstream
systems such as RMF. Beginning in Z/OS 1.11, an additional call is
now being made such that the volume statistics will be updated when
the volume is varied online eliminating the need to allocate a small
1 trk dataset to the volume (APAR OA23901 included in 1.11 base).
This client saw incorrect RMF Storage Group and volume statistics
Additional symptom:
ISMF LISTSYS statistics are incorrect after the vary.
ISMF LISTVOL command statistics seem to be correct.
LOCAL FIX:
Allocate a dataset to the volume which will update the free space
and largest free extent statistics correctly.
9. APAR OA30299 reports SMF74DCI SMF74DCT BLANK FOR DEVICE FOLLOWING
HIPERSWAP
After a hyperswap or DDR device swap, IOS will issue an ENF 28 DDR
and ENF 23 Device Online for the target device of the swap. Both
ENFs are processed by RMF listen exit module ERBMFEAR. But RMF
module ERBMFIDA, which updates the DDB, is only called when
processing the ENF 23 for device online event. Since ENF 28 DDR
runs asynchronously, it can happen that the ENF 28 is processed
before ENF 23, so that the call to ERBMFIDA is skipped. As a result,
the RMF DDB is not updated. Affected RMF releases: z/OS 1.9 up
to and including z/OS 1.11.
PROBLEM CONCLUSION: With this APAR, the ENF 28 listen exit handler
in module ERBMFEAR is changed to call ERBMFIDA when the
configuration token in EDDDB field EDDDSDCT is blank.
8. APAR OA31471 reports variable SMF75AVU, MXG variable AVGUSED,
the average number of used slots in the RMF PP PAGESP report is too
low. The problem occurs when large page datasets are used and RMF
Monitor I zz data gatherer session runs with a very small cycle time
(e.g. 100 ms).
7. APAR OA30633 reports HIGH CPU IN GRS WHEN ZFS QUERYING FILE SYSTEM
OWNER FOR RMFGAT DATA GATHERING.
When running with RMFGAT data gathering option "ZFS", RMF makes
requests to zFS to collect statistics on zFS file systems. As a
result, zFS makes ISGQUERY requests to GRS to determine the owner of
each file system. The GRS GQSCAN routine scans for enqueues across
the sysplex and can be CPU intensive. LOCAL FIX:
BYPASS/CIRCUMVENTION: The collection of ZFS data can be turned off
by specifying Monitor III data gatherer option "NOZFS".
6. APAR OS30551 reports zeros for buffer statistics above 2GB until
buffer utilization exceeds 80%. The error was created when APAR
OA27343 was installed.
5. Daylight Savings Time and CMF and GMT Offset.
With BMC's CMF monitor instead of RMF, you must bounce the MVSPAS
(CMF) Address Space after the CEC clocks are reset for the DST Fall
Back. At one site where the STCs were not bounced, the values in
the CMF fields that MXG INPUTs as GMTOFFTM and GMTOFF72 remained
at the offset prior to the Fall Back. The incorrect GMTOFF72
did not cause incorrect timestamps in the TYPE72GO dataset, but the
incorrect GMTOFFTM variable apparently caused datasets ASUM70PR &
ASUM70LP timestamps to correspond to the incorrect GMT offset.
The wrong GMT Offset will continue to be in your RMF SMF records
until the CMF Address Space is restarted.
4. Comparison of Seconds of CPUTM in PDB.TYPE72GO and PDB.SMFINTRV,
shows RMF and SMF interval data match very well.
     Startime     SYS1   SYS2   SYS3   SYS4   Total  RMF-SMF
SMF  05NOV09:00   4350    671   2641    212    7876
RMF  05NOV09:00   4339    665   2751    217    7974    + 98
SMF  05NOV09:01   3696    670   1473    201    6041
RMF  05NOV09:01   3802    678   1330    206    6016    - 25
SMF  05NOV09:02   5044    753   3041    204    9043
RMF  05NOV09:02   5050    761   3012    211    9036    -  7
SMF  05NOV09:03   7527    836   4359    213   12936
RMF  05NOV09:03   7507    856   4369    268   13002    + 66
SMF  05NOV09:04   4465    851   4411    278   10006
RMF  05NOV09:04   4752    868   4522    237   10380    +374
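To put "match very well" in relative terms, the hourly totals can be compared as percentages. This is a minimal Python sketch using only the Total column of the table above; the function name `rmf_vs_smf` is an illustrative assumption, not an MXG name.

```python
# (start hour, SMF total CPU seconds, RMF total CPU seconds) from the Total column above.
totals = [
    ("05NOV09:00", 7876, 7974),
    ("05NOV09:01", 6041, 6016),
    ("05NOV09:02", 9043, 9036),
    ("05NOV09:03", 12936, 13002),
    ("05NOV09:04", 10006, 10380),
]

def rmf_vs_smf(rows):
    """Return (hour, RMF minus SMF seconds, difference as a percent of SMF) per row."""
    out = []
    for hour, smf, rmf in rows:
        diff = rmf - smf
        out.append((hour, diff, round(100 * diff / smf, 2)))
    return out

comparison = rmf_vs_smf(totals)
for hour, diff, pct in comparison:
    print(f"{hour}  {diff:+5d} sec  {pct:+.2f}%")
```

For these five hours every difference is under 4% of the SMF total, and four of the five are under 1.5%.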
3. An interesting post on IBM-MAIN by John Eells, IBM z/OS Technical
Marketing, on what IBM can/can't do when a change is introduced:
We can't win on the default. We can only pick which group of
customers to annoy:
- If we default a behavioral change we introduce a migration
action. Customers overwhelmingly tell us they hate migration
actions. "Look at this behavioral change, see if you care, and
change something if you don't want it to happen" is a migration
action.
- If we don't default the behavioral change, people who want it
tell us that "everyone" would want it to be the default.
We have historically been poor predictors of which group will be
larger, so we are "defaulting" more and more to avoiding
behavioral changes that "just happen."
2. APAR OA30246 reports that XRC zIIP-eligible-work in Service Class
SYSTEM is not dispatched on a zIIP, but executes on the CP engines
when HiperDispatch is Enabled. Pending a PTF, the APAR recommends
that zIIP-eligible work be moved to Service Class SYSSTC, or to
disable HiperDispatch.
1. Summary: "EXCP" counts recorded for access to HFS & ZFS filesystems:
An HFS file, 10,000 50-byte records, 496K, or 123 4096-byte blocks,
& a ZFS file, 1,000 50-byte records, 49K, or 13 4096-byte blocks,
was created/copied on z/OS 1.9 by different programs.
Total "EXCP"s were between 50 and 23,710 for HFS.
Total "EXCP"s were between 37 and 5,416 for ZFS
These "EXCP" counts are displayed on JOBLOG and are included in the
SMF 30 Address Space Total EXCP count, EXCPTOTL (SMF30TEP).
                                     HFS        ZFS
                                    496K        49K
Job       Description           EXCPTOTL   EXCPTOTL
TEST92LD  -SAS92 LOAD        23710,23290       5416
TEST91LD  -SAS91 LOAD        21856,21785       3867
TEST92RD  -SAS92 READ        13364,13295       4464
TEST91RD  -SAS91 READ        11787,11763       2891
TESTGENR  -IEBGENER READ        309, 306        n/a
TESTFAST  -FASTGENR READ        298, 285         65
TESTSORT  -SYNCSORT READ        209, 209         70
TZOS92LD  -SAS92 LOAD z/OS          3301       3324
TZOS91LD  -SAS91 LOAD z/OS          1764       1771
TSTWGENR  -IEBGENER WRITE            301        n/a
TSTWFAST  -FASTGENR WRITE            268         53
TSTWSORT  -SYNCSORT WRITE            252         62
ZOSCGENR  -IEBGENER COPY             113         53
ZOSCFAST  -FASTGENR COPY              50         28
ZOSCSORT  -SYNCSORT COPY              46         37
All of the SMF records written for two of these test jobs, the SAS job
TEST91LD and the FASTGENR job TESTFAST, are analyzed in detail below.
SAS was used to create a 10,000 record text file of 50 byte records,
written to a dynamically allocated HFS1 Filename.
FASTGENR was then used to copy that hfs file, with a static SYSUT1 DD,
to a disk data set.
A. EXCP counts in DD Segments (in SMF type 30 subtypes 2, 3, 4, and 5):
1. There was no DD segment created for the dynamically
allocated HFS1 DDNAME in the SAS job.
2. While there was a SYSUT1 DDNAME in the type 30 records for the
FASTGENR job, it contained ONLY the DDNAME; there were no EXCPs
recorded, and there was no DEVNR nor DEVCLASS/DEVTYPE information.
B. EXCP counts in the address space fields in the SMF 30 record:
1. HFS "EXCP" counts ARE captured in the SMF 30 record, but only in
the address space total EXCP Count EXCPTOTL(SMF30TEP/TEX).
- RMFEXCP are the EXCPs counted in IO Service Units (SMF30IO/IOL),
and the HFS EXCP count IS included in RMF IO Service Units.
- EXCPTODD is the sum of all EXCPs in the DD segments.
- EXCPNODD is the EXCPs count NOT counted in the DD segments,
calculated as EXCPTOTL minus EXCPTODD.
- EXCPDASD is the total DD EXCPs count to DASD devices.
- SMF30AIS is the total count of DASD SSCH's (NOT BLOCKS/EXCPs)
- IOTM variables are the IO Connect Time durations, as above.
JOB       EXCPTOTL  RMFEXCP  EXCPTODD  EXCPNODD  EXCPDASD  SMF30AIS
SAS          21785    21778      1379     20406      1379       704
FASTGENR       285      280       101       184       101       213
JOB       IOTMTOTL  RMFIOTM  IOTMTODD  IOTMNODD
SAS           0.51      n/a      0.37      0.14
FASTGENR      0.02      n/a      0.02      0.01
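The definition of EXCPNODD given above (EXCPTOTL minus EXCPTODD) can be checked against the table values. This is a minimal Python sketch; the dictionary and function names are illustrative assumptions and only the numbers come from the table.

```python
# EXCPTOTL and EXCPTODD for the two analyzed jobs, copied from the table above.
jobs = {
    "SAS":      {"EXCPTOTL": 21785, "EXCPTODD": 1379},
    "FASTGENR": {"EXCPTOTL": 285,   "EXCPTODD": 101},
}

def excpnodd(job):
    """EXCPs not counted in any DD segment: EXCPTOTL minus EXCPTODD."""
    return job["EXCPTOTL"] - job["EXCPTODD"]

def nodd_share(job):
    """Percent of the address space total that has no DD segment behind it."""
    return round(100 * excpnodd(job) / job["EXCPTOTL"], 1)

for name, job in jobs.items():
    print(f"{name:9s} EXCPNODD={excpnodd(job):6d}  {nodd_share(job):5.1f}% of EXCPTOTL")
```

For the SAS job, more than nine tenths of EXCPTOTL is in EXCPNODD, which is why the HFS activity dominates the total even though no DD segment records it.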
Observations:
a. SAS wrote 10,000 blocks of 50 bytes each, but counted 20,000 EXCP,
and that count was also shown on the SAS log; why 20000 was the
count will be investigated with their HFS guy when he is back from
vacation, but that count of 20000 was passed to IEASMFEX, as it
does show up in the EXCPTOTL and RMFEXCP.
b. FASTGENR, the SYNCSORT replacement for IEBGENER, counted 101
"EXCP"s to the 3390 output disk device in the EXCP segment for
SYSUT2; the "EXCP"s reading the HFS file were counted as 184 in
the EXCPNODD (i.e., included in EXCPTOTL and RMFEXCP).
But FASTGENR and SYNCSORT have NEVER counted EXCPs, but, instead
count SSCHs, and that is what it passed to IEASMFEX.
(I was involved in legal issues between DFSORT and SYNCSORT
because SYNCSORT published false I/O comparisons that used the
SIOs for SYNCSORT but BLOCKS/EXCPs for DFSORT, many years ago.
There was a "Special Core Zap" from SYNCSORT that would change
their count to BLOCKS, but I don't know if it still exists, and
I suspect no one uses it now!).
In addition, the FASTGENR log shows that it read and wrote
10,000 logical records; however it shows a total size of
800,000 bytes, whereas only 500,000 bytes were written, so
even FASTGENR can't correctly count HFS activity.
c. While HFS EXCP counts are in the EXCPNODD field, there are other
I/O counts included in EXCPNODD, for all file I/O that does not
have a DD: Catalog I/O, LinkList I/O, and JES2 SPOOL I/O, and the
JES2 Spool I/O count can be significant.
C. HFS-only EXCP counts do exist in the OMVS Segment of type 30s.
The old "OMVS" segment is now known as
"z/OS UNIX System Services Process Section" in the SMF manual.
I LOVE the fact that UNIX is in CAPITAL LETTERS!
MXG's first technical note on measuring unix, by Chuck Hopf,
was subtitled "or how i learned to type in lower case".
1. The SAS job created one "OMVS" segment, while the FASTGENR created
two segments, having apparently spawned/forked/whatever unix does
that created a second PID for their copy program. The first three
columns are the only block count fields that were non-zero; the
last columns are the only other metrics that were non-zero.
                 DIR     DATA     DATA  PATHNAME  PATHNAME   SYSCALLS
                READ     READ    WRITE  LOOKCALL  LOOKCALL  REQUESTED
JOB           BLOCKS   BLOCKS   BLOCKS   LOGICAL  PHYSICAL         BY
                        FILES    FILES                        PROCESS
             OMVSODR  OMVSOFR  OMVSOFW   OMVSOLL   OMVSOLP    OMVSOSC
SAS               65        0    20000         8        37         21
FASTGENR-1        16        0        0         2         8          3
FASTGENR-2        26      125        0         3        13          4
FASTGENR          42      125        0         5        21          7
Comparing the type 30 with the type 30 OMVS segment:
          Total I/O Blocks OMVS   Total NODD IO COUNT
SAS                       20065                 20406
FASTGENR                    167                   184
Observations:
a. The UNIX segment EXCP counts can indeed be subtracted from the
address space EXCP counts, for sites that do NOT want to include
HFS EXCPs in their billing, if they are using the EXCPTOTL field.
b. I polled MXG users, and most said that when EXCP counts were used
in chargeback, they used only the EXCPDASD and EXCPTAPE counts
(MXG sums DD EXCP counts by device type); EXCPTOTL, which includes
the HFS (and SPOOL) counts, is not commonly used in billing.
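The subtraction described in observation a. can be worked through with the numbers from the two tables above. This is a minimal Python sketch under the assumption that the three OMVS block counts (OMVSODR, OMVSOFR, OMVSOFW) are the counts to back out; it is not a statement of how MXG derives any of its own variables, and the function name `back_out_uss` is hypothetical.

```python
# Values copied from the type 30 address space fields and the OMVS segment table above.
jobs = {
    "SAS":      {"EXCPTOTL": 21785, "EXCPNODD": 20406,
                 "OMVSODR": 65, "OMVSOFR": 0,   "OMVSOFW": 20000},
    "FASTGENR": {"EXCPTOTL": 285,   "EXCPNODD": 184,
                 "OMVSODR": 42, "OMVSOFR": 125, "OMVSOFW": 0},
}

def back_out_uss(job):
    """Subtract the OMVS segment block counts from the address space totals."""
    uss_blocks = job["OMVSODR"] + job["OMVSOFR"] + job["OMVSOFW"]
    return {
        "uss_blocks": uss_blocks,
        "excptotl_without_uss": job["EXCPTOTL"] - uss_blocks,
        # No-DD I/O left over after USS is removed (catalog, spool, and so on).
        "nodd_without_uss": job["EXCPNODD"] - uss_blocks,
    }

results = {name: back_out_uss(job) for name, job in jobs.items()}
for name, r in results.items():
    print(name, r)
```

The remainder in `nodd_without_uss` is small for both jobs, which is consistent with the OMVS counts accounting for most of the EXCPs that have no DD segment.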
D. HFS-only EXCP counts do exist in the Type 92 records.
The jobs each created one subtype 10 and one subtype 11; only the 11
has resource metrics:
          BYTES     BYTES   DIR I/O  DATA I/O  DATA I/O  READCALL  WRITECALL
           READ   WRITTEN    BLOCKS    BLOCKS    BLOCKS    ISSUED     ISSUED
                              RD/WR      READ   WRITTEN
       SMF92CBR  SMF92CBW  SMF92CDI  SMF92CIR  SMF92CIW  SMF92CSR   SMF92CSW
SAS:          0      498K        12         0     20000         0      20000
SYNC:      498K         0        10       125         0         9          0
Observations:
a. While FASTGENR reported 800,000 bytes copied, the SMF 92 shows that
FASTGENR is wrong (it used a default LRECL=80 times 10,000 logical
records), and that SAS was right (it showed 10,000 logical records
with the correct 50 byte LRECL).
b. The EXCP counts for HFS activity, 20012 for SAS and 135 for SYNC
in the SMF 92 are consistent with the counts in the OMVS segment
and the EXCP counts passed into the type 30 step records, but
the values are the counts passed by the application, blocks for
SAS, and SSCH for FASTGENR, and there's no way to tell which is
which.
E. No SMF 42 subtype 6 records were created for hfs for these jobs.
And I did NOT expect to see those records, as they are documented in
the SMF manual "records DASD data set level I/O statistics", and, for
these two jobs, hfs was NOT a DASD data set.
There were type 42 subtype 6 records created for the DASD DDnames for
the two jobs, and they captured these counts, for comparison with the
SMF 30s:
JOB TOTAL NUMBER CACHE Sequential Read Sequential
TOTAL OF IOS CANDIDATES blocks Operations I/O's
IOCOUNT read to Dataset
(S42DSION) (S42DSCND) (S42aMSRB) (S42DSRDT) (S42DSSEQ)
SAS 442 187 27 431 5
FASTGENR 101 1 1 100
Observations
a. Whereas the EXCP counts in the TYPE 30 are whatever the application
access method passed to SMF, type 42 subtype 6 counts are direct
from the hardware, independent of the access method, etc.
b. The FASTGENR SSCH count of 101 SSCHs in the type 42 is the same as
the SSCH count passed by FASTGENR into the SYSUT2 DD segment, and
that was the only DD allocated to DASD, since SYSUT1 points to the
hfs file. But the (relatively new) SMF30AIS field, documented as
"DASD I/O Start Subchannel Count for address space and dependent
enclaves" count of 213 appears to me to be in error.
The SAS EXCPDASD count of 1379 is consistent with SMF30AIS of 704,
as half-track blocking is normally used by SAS.
c. I believe there would be type 42 subtype 6 records created for the
z/OS VSAM file that "contains" the HFS file system, but those data
would have the JOB name of the address space from which the actual
physical I/O occurs, and those counts would be for all users of the
file system, with no counts for the actual jobs that cause the I/O.
F. Data on the banner page may include HFS counts in the "EXCP Count"
This site uses the IBM "banner page" to print EXCP counts on Job Log;
the EXCP count that is printed is, indeed, that EXCPTOTL/SMF30TEP
Address Space Total Count, and which we now know DOES include the HFS
"EXCP"s, and those counts are only slightly larger than the two
products reported on their execution logs:
           Banner   Product
           Page     Log's
           EXCPs    EXCPs
SAS        21785    20240
FASTGENR     285      240
Observation:
a. This is very likely the source of the large EXCP counts that have
been reported, since it requires no analysis of the SMF 30 records
(and I think this is also the EXCP count displayed by SDSF).
G. Conclusions
1. Whatever is counted by the application as an "EXCP" for HFS access
whether blocks or SSCHs (at the whim of the I/O programmer!) is
included in the EXCPTOTL field in the SMF 30 records, and is the
count that is displayed by banner pages and SDSF.
2. The type 30 OMVS segments are now used in MXG 27.08 Change 27.213,
to create the new USSEXCPS count variable that could be used to
"back-out" these large counts, if the site is actually using the
EXCPTOTL field in chargeback and has significant USS activity.
See MXG Newsletter FIFTY-FIVE, MXG Technical Note titled
1. Summary: "EXCP" counts recorded for access to HFS & ZFS ....
HFS "EXCP" counts ARE captured in the SMF 30 record, BUT....
3. With the inaccuracies in counting HFS and zFS EXCPs, and because
they are included in the RMF IO Service Units, alternative ways
to count, including dividing the total bytes in the 92s by 4096
are under consideration by IBM. This research is in progress and
this note will be updated if corrections are made.
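The alternative mentioned in conclusion 3, dividing the bytes in the type 92 record by 4096, can be tried on the numbers reported earlier. This is a minimal Python sketch; it assumes the "498K" printed in the type 92 table means 498 x 1024 bytes (a rounded value), and the function name `blocks_for` is illustrative.

```python
import math

BLOCK_SIZE = 4096  # bytes per block, as used in the text

def blocks_for(byte_count):
    """Number of 4096-byte blocks needed to hold byte_count bytes."""
    return math.ceil(byte_count / BLOCK_SIZE)

logical_bytes = 10_000 * 50      # 10,000 records of 50 bytes
fastgenr_claimed = 10_000 * 80   # FASTGENR log: default LRECL=80 times 10,000 records
smf92_bytes = 498 * 1024         # "498K" from the type 92 table (assumed K=1024, rounded)

print("logical data    :", blocks_for(logical_bytes), "blocks")
print("FASTGENR claimed:", blocks_for(fastgenr_claimed), "blocks")
print("SMF 92 bytes    :", blocks_for(smf92_bytes), "blocks")
```

Under that assumption the byte-based estimate equals the 125 data blocks read that FASTGENR recorded, and is far below the 20000 blocks written that SAS recorded for the same amount of data.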
IV. DB2 Technical Notes.
1. X
V. IMS Technical Notes.
1. X
VI. SAS Technical Notes.
9. SAS Note 32778 reports ABEND 413 Return Code 18 (413-18) can occur
with SAS V9.2, if you create a new library on tape, when the new
tape dataset is allocated in Job Control. For example, this JCL
can cause this ABEND:
//CICSTRAN DD DSN=TAPE.CICSTRAN,DISP=(NEW,CATLG,DELETE),
// UNIT=3590-1
The error message that results will be similar to the following:
IEC145I 413-18,IFG0194A,TAPEDD,V921M0,CICSTRAN,0470,,TAPE.CICSTRAN
The following error messages might also appear in the SAS log:
ERROR: OPEN of CICSTRAN failed. Abend code 413. Return code 18.
ERROR: An I/O error has occurred on file CICSTRAN.
To circumvent this problem, explicitly name the engine with which
the library should be assigned, as in the following example:
//SYSIN DD *
LIBNAME CICSTRAN V9SEQ;
8. Exposure on Windows to FAIL/ABEND with LOCK NOT AVAILABLE ERROR.
SAS Technical Support confirms that execution of SAS under Windows
has ALWAYS been exposed to a LOCK NOT AVAILABLE error because any
file's lock can be "grabbed" by another process at any time, even
a SAS dataset file in the WORK data library! MXG creates a dataset
WORK.ZZdddddd with PROC SORT, reads it with SET ZZdddddd and then
PROC DELETE DATA=ZZdddddd. But in several QA runs under Windows 7,
SAS lost its file lock after the DATA step closed successfully,
causing the PROC DELETE to fail, terminating the QA job:
-"Lock held by another process" is probably caused by a backup
program, antivirus program, encryption, or an indexing
application like Google Desktop that is accessing or touching
the SAS temporary files while they are in use by SAS. If a
backup program or virus scan is running on an interval, that
would explain why the problem is intermittent.
-To fix the lock, add the file extensions used by SAS to the
exclude list of the interfering application; you should exclude
.lck , .sd2, .sc2 , .SPDS, and .Sas*
where the .SAS* wild card excludes these extensions:
.sas7bdat /* DATA */ .sas7bfdb /* FDB */
.sas7butl /* UTILITY */ .sas7bitm /* ITEMSTOR */
.sas7bput /* PUTILITY */ .sas7baud /* AUDIT */
.sas7bcat /* CATALOG */ .sas7bbak /* BACKUP */
.sas7bpgm /* PROGRAM */ .sas7bdmd /* DMDB */
.sas7bndx /* INDEX */ .sas7bods /* SASODS */
.sas7bvew /* VIEW */ .sas /* SAS program file */
.sas7bacs /* ACCESS */
.sas7bmdb /* MDDB */
Caution: be careful when excluding non-temporary SAS data sets from
a backup. SAS recommends that backups occur when SAS is not
running.
Caution two: other applications can use those suffixes:
SC2 - windows scheduler
SD2 - sound designer
LCK - database control
SPDS - ACROBAT
-If the problem application is not a backup program or virus scan
then the cause is still probably a third party program. A way to
determine which program(s) are causing the lock is to use a
utility from Microsoft Sysinternals called Process Monitor. You
can download Process Monitor for free from Microsoft at
http://technet.microsoft.com/en-us/sysinternals/bb896645.aspx?PHPSESSID=d926
Open Process Monitor, click filter and make these 3 changes:
1)Path "begins with" "%temp%\SAS Temporary Files"
(Click ADD) (use your work path name, if different).
2)Process Name is Sas.exe then Exclude (click Add)
3)Process Name is Explorer.exe then Exclude (click Add)
Click Apply and OK.
Then clear the log.
Then start SAS and run the SAS program that creates the lock
error. What Process Name(s) are listed in Process Monitor?
This particular filter doesn't always find the problem.
Usually the best advice is to ask your internal support team
for help using this tool to find the problem.
We have not yet been able to identify what process grabbed the file
lock, because the lock conflict is intermittent.
BUT: The pathname of the WORK data library was NOT the SAS default:
it did NOT contain the text "TEMP" nor "SAS Temporary"
We have changed that pathname to the SAS default, and there has not
(YET!) been a lock conflict, so we presume/assume that the process
causing the conflict automatically excluded scanning of directories
with "TEMP" in their name.
Please See MXG Change 38.091 for Same Problem, Twenty Years Later!
7. SAS USER U1319 ABEND if EXITCICS/CICSIFUE and /VIEW=_WCICTRN used,
OR WITHOUT SAS HOT FIX 37166 for SAS 9.1.3 SP4. Fixed in SAS 9.2.
Using a VIEW for CICSTRAN with the CICSIFUE decompression INFILE
user exit caused a USER ABEND U1319 error, that is now corrected in
the SAS HotFix for SAS Note 37166.
This SYSIN input caused the U1319 abend :
%LET SMFEXIT=CICS;
%INCLUDE SOURCLIB(VMACSMF,VMAC110,VMXGUOW,IMACKEEP);
DATA
_VAR110
/VIEW=_WCICTRN;
_SMF
_CDE110
_S110
with these cryptic messages on the SAS log:
+No MKLEs found
+ERROR: VM 1319: The PCE address= 1848CB54
and MEMORY address=000D98D8
IEA995I SYMPTOM DUMP OUTPUT 749
USER COMPLETION CODE=1319
Removing /VIEW=_WCICTRN, the execution works fine with the Exit.
Also using TYPS110 worked fine (because it doesn't have a /VIEW).
But the same error message will occur with BUILDPDB due to the view
for VMACID. This error can be circumvented by inserting this
statement in your //SYSIN
%LET VWVMACID=;
which disables that sole VIEW in the BUILDPDB.
Change 27.260 is a VERY-EXPENSIVE-ON-Z/OS-alternative to EXITCICS.
6. You can NOT concatenate DSNAMEs behind //LIBRARY DD on z/OS; the
job will die with a 0C4 ABEND, as documented in SAS Note 12807 or
SAS Note 16096. The SYSMSG shows these z/OS messages:
IGD103I SMS ALLOCATED TO DDNAME LIBRARY
IGD103I SMS ALLOCATED TO DDNAME
And subsequently we see this:
IGD104I SYSDPCP.SL9.BILLPROJ.LBL4MATS RETAINED,DDNAME=SYS00004
IEC131I 1,MXGDAY,MXGSASV9,RDJFCB ISSUED FOR DCB WITH BLANK DDNAME
And the SAS log has this error message:
+ERROR: SYSTEM ABEND00C4 OCCURRED IN MODULE SASVC FUNCTION VVCLCHK.
5. Using a WHERE ALSO statement with OPEN=DEFER on a SET statement
that names multiple datasets does not work as expected: while both
the WHERE and the WHERE ALSO are applied to the first dataset in
the SET, only the WHERE expression is applied to all the other
datasets. Removing OPEN=DEFER causes the WHERE ALSO to be applied
to all datasets; if OPEN=DEFER is required (when datasets in the
SET statement are on tape), then the WHERE and WHERE ALSO
expressions must be combined (with an AND) into a single WHERE.
4. SYSTEM COMPLETION CODE=EC6 REASON CODE=0000FD1D is actually a USS
ABEND, because SAS 9 is now a thread running as a USS process,
but that REASON is the old SYSTEM 322 ABEND, CPU Time Exceeded.
Finding the appropriate documentation for a particular failure can
be a little cumbersome. However, for the FD* reason codes on the
SEC6 abend, here is what is documented:
FDxx
If xx is in the range of X'01' to X'7F', a signal was received
causing termination and a dump to be taken. This condition is
usually the result of an application programming exception. For a
description of the signal represented by the value xx, check the
appropriate appendix "BPXYSIGH - Signal Constants" or "Signal
Defaults" in z/OS UNIX System Services Programming: Assembler
Callable Services Reference.
In that referenced document (not very pleasant to read), the FD is
fixed and the 1D is the signal constant in hex, but the doc lists
the constants in decimal. So convert X'1D' to decimal, which is 29.
For 29 we see:
SIGXCPU# EQU 29 CPU time limit exceeded
SAS 9, with its threading, is the cause of these new USS-type
ABENDs, in place of the codes we are accustomed to. When SAS is
executing within a thread and a failure such as the CPU timeout
occurs, it surfaces as the SEC6 system abend code. With this type
of abend, it is the REASON CODE that has the information needed to
determine the cause. While MXG sets OPTION NOTHREADS (See Change
22.207), that simply disables thread-enabled PROCs from using
threads. SAS itself is running as a thread; in SAS V9, the entry
points were changed from the earlier SASHOST/SASXA1/SASXAL to
SAS/SASB/SASLPA, which are the wrapper programs for the TK
environment, which makes SAS itself run as a thread. Hence the
system requirement for an OMVS segment sufficient that the user
environment can be "dubbed".
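The FDxx decoding described in this note can be sketched in a few
lines of Python. This is only an illustration: the function name
decode_ec6_reason and the KNOWN_SIGNALS table are hypothetical (they
are not part of MXG, SAS, or z/OS), the sketch assumes the FD and the
signal constant are the two low-order bytes of the reason code, as in
0000FD1D, and only the one signal named above is listed.

```python
# Hypothetical helper, not part of MXG or SAS: split an EC6 reason code
# into its FD prefix and signal number, per the FDxx rule quoted above.

# Only the signal named in the text is listed; any other value must be
# looked up in the BPXYSIGH appendix.
KNOWN_SIGNALS = {29: "SIGXCPU# (CPU time limit exceeded)"}


def decode_ec6_reason(reason_hex):
    """Return (signal_number, description) for an FDxx reason code.

    Returns None when the code is not FDxx with xx in X'01'..X'7F'.
    """
    value = int(reason_hex, 16)
    prefix = (value >> 8) & 0xFF   # assumed position of the fixed X'FD'
    signal = value & 0xFF          # low byte: signal constant, in hex
    if prefix != 0xFD or not 0x01 <= signal <= 0x7F:
        return None
    return signal, KNOWN_SIGNALS.get(signal, "see BPXYSIGH appendix")


print(decode_ec6_reason("0000FD1D"))
```

For the reason code in this note, the low byte X'1D' is decimal 29,
which the table maps to SIGXCPU#.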
3. ERROR: SYSTEM ABEND 0C4 OCCURRED IN MODULE SASXKERN FUNCTION YPCDO2
was caused by a back-level DSNAME for the SASMSG file.
2. SAS Hot Fix for SN-35332 is REQUIRED for z/OS 1.10 with SAS V9.1.3,
because ERROR: LIBNAME XXXXXXXX IS NOT ASSIGNED can occur for
jobs with a completely valid //XXXXXXXX DD statement. Jobs that
run without error on z/OS 1.9 can fail on z/OS 1.10 using the same
JCL and SAS/MXG datasets. If LIBNAME is LIBRARY, there may also be
a separate message ERROR: FORMAT MGBYTES COULD NOT BE LOADED.
The error has NOT occurred with SAS V9.2.
The error can be circumvented with addition of a LIBNAME statement
that explicitly specifies the engine name: LIBNAME LIBRARY V9 .
But, INSTALL the Hot Fix (or, better, INSTALL SAS V9.2), as adding
a source statement to PROD Source libraries may not be possible!
In z/OS 1.10 IBM increased the internal work area required for its
OBTAIN service call to 140 bytes (from 101), but SAS's work area
was the old size; OBTAIN in 1.10 validates that now-required size,
which caused an OBTAIN failure, which SAS surfaced with the above
error message. The LIBNAME with ENGINE circumvention works because
SAS doesn't need to issue an OBTAIN when the ENGINE is known.
SN-35332 is dated March, 2009, but only one MXG site saw the error,
and not until September, and only on one of their several z/OS 1.10
systems!
1. Out of Space conditions running MXG jobs on WINDOWS may need to be
examined by issuing DOS DIR commands at various points in the job.
You can use
systask command "dir d:\*.* >> d:\mxg\dirsize.txt" nowait;run;
to APPEND the output of each execution to the single file
dirsize.txt, or
systask command "dir d:\*.* > d:\mxg\atstart.txt" nowait;run;
systask command "dir d:\*.* > d:\mxg\aftersort.txt" nowait;run;
etc., to send the output of each dir command to a separately named
file.
-You can run out of space on an empty volume if Disk Quotas have
been enabled by your System Administrator. You can view whether
quotas are enabled, and their size, with steps 1 to 3 below; steps
4 to 7 show how quotas are enabled and sized:
1. Open My Computer.
2. Right-click the volume whose disk quotas you want to check and
click Properties.
3. Click the Quota tab.
4. Click the Enable Quota Management option.
5. To limit the amount of disk space for new users click the Limit
disk space to option.
6. Set the appropriate values for the Limit disk space to and the
Set warning level to options.
7. Click OK.
VI.A. WPS Technical Notes.
1. X
VII. CICS Technical Notes.
1. CICS Capacity was limited by the single QR TCB.
In the old days, a CICS subsystem's capacity was limited by the
amount of CPU TCB time needed for that single QR TCB.
Based on my analysis, done when OTE (Open Transaction Environment)
was brand new, of the CPU time consumed by each of the new CICS
OTE TCBs, I planned this post to argue that going to OTE didn't
help much, because most of the CICS CPU time was still being spent
under the QR TCB.
I could NOT have been more wrong!
Analyzing new CICS/TS 4.1 Open Beta data from a VERY aggressive OTE
exploiter site shows (from their SMF 110, subtype 2 Dispatcher
Statistics segments, MXG CICDS and CICINTRV datasets):
Total TCB CPU in Dispatcher Records = 13,080 seconds
Total TCB CPU in QR TCB = 2,776 seconds
Total TCB CPU in L8 TCB = 10,298 seconds
Total TCB CPU in all other TCBs = 6 seconds
Aha, you say, OTE still doesn't help; the CPU time just moved from
the QR TCB to the L8 TCB, so the capacity limit just moved from one
TCB to the other, right?
Wrong again.
While the QR TCB can attach only a single TCB, these new TCBs can
attach multiple TCBs; in fact, the SMF data shows that the L8 TCB
attached a maximum of 22 TCBs, each of which is a separate
dispatchable unit.
So, it REALLY does look like these multiple OTE TCBs do eliminate
the old "one-TCB" CICS capacity limitation, and do indeed spread
your CICS CPU time across MANY TCBs.
(Total SRB time in the Dispatcher Records was only 65 seconds.)
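The split of TCB CPU time in this note can be checked with a short
Python sketch. The seconds are copied from the Dispatcher totals
quoted above; the names tcb_cpu_seconds and cpu_shares are
hypothetical and chosen only for this illustration.

```python
# CPU seconds copied from the Dispatcher Statistics totals in this note.
tcb_cpu_seconds = {"QR": 2776, "L8": 10298, "other": 6}
reported_total_seconds = 13080

# The three components should account for the reported total.
assert sum(tcb_cpu_seconds.values()) == reported_total_seconds


def cpu_shares(seconds_by_tcb):
    """Return each TCB's percentage share of the summed CPU seconds."""
    total = sum(seconds_by_tcb.values())
    return {name: 100.0 * secs / total
            for name, secs in seconds_by_tcb.items()}


shares = cpu_shares(tcb_cpu_seconds)
for name, pct in shares.items():
    print(f"{name:5s} {pct:6.2f}%")
```

With these totals, roughly 21 percent of the TCB CPU time was under
the QR TCB and roughly 79 percent under the L8 TCBs.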
VIII. Windows NT Technical Notes.
1. X
IX. z/VM Technical Notes.
1. X
X. Email notes.
1.
XI. Incompatibilities and Installation of MXG vv.yy.
1. Incompatibilities introduced in MXG 27.yy (since MXG 26.26):
See CHANGES.
2. Installation and re-installation procedures are described in detail
in member INSTALL (separate sections for each platform, z/OS, WIN,
or *nix), with examples of common Errors/Warnings messages a new MXG
user might encounter, and in member JCLINSTT for SAS V9.2 or member
JCLINSTW for WPS. INSTALL also shows how to read SMF data on PCs/*nix
using the FTP ACCESS METHOD.
XII. Online Documentation of MXG Software.
MXG Documentation is now described in member DOCUMENT.
XIII. Changes Log
--------------------------Changes Log---------------------------------
You MUST read each Change description to determine if a Change will
impact your site. All changes have been made in this MXG Library.
Member CHANGES always identifies the actual version and release of
MXG Software that is contained in that library.
The CHANGES selection on our homepage at http://www.MXG.com
is always the most current information on MXG Software status,
and is frequently updated.
Important changes are also posted to the MXG-L ListServer, which is
also described by a selection on the homepage. Please subscribe.
The actual code implementation of some changes in MXG SOURCLIB may be
different than described in the change text (which might have printed
only the critical part of the correction that need be made by users).
Scan each source member named in any impacting change for any comments
at the beginning of the member for additional documentation, since the
documentation of new datasets, variables, validation status, and notes,
are often found in comments in the source members.
Alphabetical list of important changes after MXG 26.26 now in MXG 27.yy:
Dataset/
Member Change Description
See Member CHANGES or CHANGESS in your MXG Source Library, or
on the homepage www.mxg.com.
Inverse chronological list of all Changes:
Changes 27.yyy thru 27.001 are contained in member CHANGES.