Experiments in virtualizing pay off, till they perform too well.
The story unfolds in the Midwest in the early '70s, when I was the lead systems analyst for a large timesharing installation that was co-located with a large batch-processing complex for the same company. The timesharing environment was hosted on an IBM Model 67, and the batch systems were hosted on several IBM 65s, 50s, and an assortment of peripheral 30s -- all of this was state of the art at the time.
The 67 was running the CP-67 operating system and the 65s were running MVT (an early IBM multitasking batch operating system). We were a small installation when compared with our HQ in the NW, and we generally followed their lead and maintained compatibility with them to facilitate mobility of jobs between the sites as needed. We also had -- for that time -- a 56k high-speed data link between the two sites.
We had a job sharing arrangement between the sites to "offload" eight hours of work each night from the HQ center to help balance the budget. We were obligated to run their operating environment (which contained a few quirks we didn't need) and the modus operandi was to shut down the timeshare system at midnight, run their work, and then reboot the timeshare system in time for the engineers to access it in the morning when they came to work.
My efforts at the time were directed to increasing the capacity of the timeshare system without spending any more money. That left software improvements as my only option. In this pursuit, I discovered that the factor limiting our capacity was fast-access paging space. I further discovered that the stock IBM CP-67 operating system (predecessor to VM-370) searched for places to put pages on slow devices before it looked at faster devices -- in other words, 2311 (really slow moveable head disks), then 2314 (much faster but still moveable head disks), then 2303 (fixed head drums, but multiplexed heads over several tracks), and finally 2301 (fixed dedicated head drums).
Each of these hardware steps was approximately an order of magnitude faster to access than its predecessor. After experimenting a bit to ensure no negative effects, I reversed the order of the search/allocation algorithm, which netted approximately a 40-percent improvement in system capacity and response time. Great.
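For readers who want to see why the search order matters, here is a minimal sketch in Python. It is not CP-67 code: the `PagingDevice` class, the slot counts, and the relative access costs are all hypothetical, chosen only to reflect the rough order-of-magnitude step between device types described above.

```python
from dataclasses import dataclass


@dataclass
class PagingDevice:
    name: str
    relative_access_cost: float  # illustrative units; lower is faster
    free_slots: int              # hypothetical capacity, not a real figure


def make_devices():
    # Listed slowest to fastest, matching the stock search order in the story.
    # Costs differ by a factor of ten per step (an assumption for illustration).
    return [
        PagingDevice("2311", 1000.0, 4),
        PagingDevice("2314", 100.0, 4),
        PagingDevice("2303", 10.0, 4),
        PagingDevice("2301", 1.0, 4),
    ]


def allocate_pages(devices, page_count):
    """Place each page on the first device in search order with a free slot."""
    placements = []
    for _ in range(page_count):
        for device in devices:
            if device.free_slots > 0:
                device.free_slots -= 1
                placements.append(device)
                break
        else:
            raise RuntimeError("no paging space left on any device")
    return placements


def mean_cost(placements):
    """Average relative access cost over all placed pages."""
    return sum(d.relative_access_cost for d in placements) / len(placements)


# Stock order: slowest device searched first.
slow_first = allocate_pages(make_devices(), 6)
# Reversed order: fastest device searched first.
fast_first = allocate_pages(list(reversed(make_devices())), 6)

print([d.name for d in slow_first], mean_cost(slow_first))
print([d.name for d in fast_first], mean_cost(fast_first))
```

With the slow-first order, the pages pile up on the disks while the drums sit idle; reversing the list fills the drums first and spills to disk only when they are full. The numbers in this toy model are far more dramatic than the 40 percent seen on the real system, where many other factors were in play.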
To conduct the experiments I had to convince the batch folks to let me run their system as a virtual machine so that I could do my testing in parallel. My only other option was to come in at 3:00 on Sundays. They were very reluctant at first, until they noticed that their jobs ran faster on the virtual machine than in native mode. The improvement wasn't spectacular, but it gave them a little more breathing room to return their results to the NW each morning. I did my testing using a virtual VM environment, which allowed me to test while they were doing their production. A win-win situation.
Once I completed my testing, I put the revised version of CP-67 into production (and reinstalled an additional 2301 drum that had been removed from the system because it hadn't improved response or capacity earlier), which allowed more timeshare users to use the system simultaneously while also allowing the batch folks to start earlier without having to kick the timeshare users off the system. Another win-win. I made a few adjustments to the virtual machine configuration that they were using to run the batch stuff, namely giving them much larger virtual memory than the machine had for physical memory, a simple consequence of having the virtual machine to support their operating environment.
I thought little of it, thinking I had done them a favor in return for their co-operation with my testing requirements, although I was getting comments back that it ran so fast that the operators had a lot of time on their hands after running their daily allotment of NW work.
Things went along well for a few days, then I was called into the datacenter manager's office. He was livid. He had been chewed out by the NW datacenter manager for making him look bad for running his estimated eight-hour workload in less than an hour in one instance, and averaging only two hours instead of the eight since the changes I had installed for the batch virtual machine were implemented. Just goes to show ya -- no good deed goes unpunished!
About the author, Michael C. Muma: "I am retired but work on embedded microprocessor hardware/software applications as a hobby/consultant. I worked for a large manufacturer for 38+ years in the IT organization. My projects ranged from engineering support applications to large-scale operating system architecture, development, and maintenance as well as network architecture. I started on IBM 7090/94 systems, was heavily involved in the migration to IBM 360/370 large scale systems, primarily 65 & 75 MVT systems as well as CP-67, and their successors, the MVS & VM-370 operating systems. I was a primary architect of the first multi-system, multi-vendor batch job network, which evolved into IBM's NJE product. We adapted it to include CDC and Cray systems as well.
I also implemented the first large scale interactive development system on MVS using IBM 2260 video terminals. It allowed an engineer or programmer to code interactively in the supported languages, primarily Fortran or Cobol, before TSO existed. It also supported operations use as a batch process scheduling and launch system. These efforts evolved into interactive graphics/design systems supported first on large scale systems and later on mini and micro systems and finally on the high-end PC class systems we use today. This trend also justified high speed networks to support the data volumes required to accommodate the design process.
I played major roles in the design and implementation of those networks including participation on ANSI/IEEE committees defining high speed network protocols.
I retired as a member of the Technical Fellow fraternity at the company."