MAIL017

                                                                MAIL017
  
                       RHODES UNIVERSITY COMPUTING CENTRE
                       ----------------------------------
  
                                21 November 1989
                                ----------------
  
                         Changes to the Fidonet Gateway
                         ------------------------------
  
        1. Scope
        2. Changes
        2.1. Points off Settler City
        2.2. Conferences on Settler City
        2.3. Relocate the PC
        2.3.1. Logging by Ops Staff
        2.3.2. Operations Schedule
        2.3.3. Controlling .BAT File
        2.4. String Handling
        2.5. Log for Operations
        2.6. Responsibilities to Fidonet
        2.7. Better Logic
        2.8. Possible Bug in Archiving?
        2.9. Error Recovery
        2.10. Testing Environment
        2.11. Current Log File
        2.12. Commenting
        2.13. Operations Log
        2.14. Backup
        2.15. Manual for Ops Staff
        2.16. Explanation of Disk Files
        2.17. Description of Sources
        2.18. Parameter for START.BAT
        2.19. Recovery from Failure
        2.20. Function Key for Forced Dialing
        2.21. Mailspec Extensions
        2.22. Algorithms
        2.23. Repetition
        2.24. Modem Parameters
        2.25. Break-in
        2.26. Changing Directories
        2.27. Cleanup
        2.28. Regular Backups
  
        1. Scope
        --------
  
        The present Fidonet operation has been developed under all kinds
        of pressures, with never enough time to stand back and review.
        This review must now be done, as beyond doubt, the system is
        creaking.
  
        This document sets out to specify the changes that need to be
        made to the present operation of the Fidonet Gateway.  It is NOT
        intended to be a criticism of how the gateway currently operates
        - if it is read in such a light, then it will have been
        mis-interpreted.
  
  
        2. Changes
        ----------
  
        2.1. Points off Settler City
        ----------------------------
  
        Close all Points by 31 December 1989, except for
  
                Pat Terry
                the Rhodes PRO office
                Dave Wilson testing
                Tim Bouwer testing
  
        The affected sysops are to be notified by 30 November 1989 of
        this proposed action.
  
        Operation of Points carries some kind of commitment to see
        that facilities are available to these Points. These facilities
        appear to be quite inconsistent.
  
        There is nothing to stop these sysops from operating their own
        Fidonet nodes, or from using Uninet (given the necessary
        authorisation by a Uninet participant site).  Indeed, the
        closing notification is to state these options very clearly.
  
        The reasons for encouraging Points no longer apply.  Uninet is
        accessible to most research institutions, and Fidonet is now
        officially recognised by the SAPT as a common interest group.
        So the present Point sysops have at least one viable
        alternative.
  
  
        2.2. Conferences on Settler City
        --------------------------------
  
        Close all conferences by 31 December 1989 except for
  
                those required to operate the zonegate
                those of direct interest to Pat Terry
  
        Subscribers who will be affected are to be notified by 30
        November 1989 of this proposed action.
  
        It is not beyond reasonable expectations that an alternative
        mailing gateway will come into existence.  This gateway will not
        be based on Fidonet.  If there are users who are dependent on
        the Fidonet system for their conferences, they will exert
        pressure to keep Fidonet open.  It will be better to stop this
        right now, as it appears that these conferences interfere with
        the primary operation of the gateway, viz to transfer mail to
        and from Uninet.
  
        There is nothing whatsoever preventing anyone who is
        inconvenienced by this action from operating their own Fidonet
        system to receive these conferences.
  
  
        2.3. Relocate the PC
        --------------------
  
        Install the PC and modem in the Computer Room so that it can be
        driven and monitored by the operations staff.
  
        If the operation of the zonegate is not brought up to the point
        whereby it is operated in a routine manner, then it will not be
        possible for the present sysop to be absent from the Computing
        Centre.
  
        In order to bring the system up to a high level of reliability,
        the operation should be made to be manual.  What will be needed
        includes (amongst possible omissions that are to be added to
        this document):-
  
  
        2.3.1. Logging by Ops Staff
        ---------------------------
  
        A log to be prepared for the Ops staff. This must include
  
                a record of operator activities
  
                a record of incidents that are inconsistent
                with reliable operation of the system
  
        The record of system failures must have space for a report or
        comment on what caused the incident, how it was cleared, what
        the lost time was, when the incident occurred, who reported the
        incident, and who cleared the incident.
  
  
        2.3.2. Operations Schedule
        --------------------------
  
        A schedule of operation must be drawn up.  This must show when
        various events should take place.  These include, for example:-
  
                time for the sysop
  
                national mail hour
  
  
        2.3.3. Controlling .BAT File
        ----------------------------
  
        Set up a function key that will (in this order)

                cause the PC to exchange mail with RURES

                cause mail received from RURES to be prepared for
                sending by the PC.
  
        Set up a function key that will
  
                cause a single dialing attempt to 1:105/42,
                allowing a simple break-in process (eg some sort
                of stop-key, or modem powering off) should the
                operator decide to abort the dialing attempt

                if the connection to 1:105/42 was successful, cause
                any incoming Uninet mail to be delivered to RURES.
  
        It must be possible to repeat any of these operations ad nauseam
        without any adverse effect on the system.  The repeat should run
        quickly, avoiding duplication of runs, and it must not cause
        information to be lost.  If errors occur, it must fail safe.
  
        At times when the PC is not being driven by the operator, it
        will be in a state to receive calls from other Fidonet PCs.
  
        The PC will also respond to National Mail Hour, causing dialing
        out to other Fidonets in accordance with standards and
        guidelines acceptable to the RSA Fidonet organisation.
  
        Critical points in the operation must be identified, so that it
        is just about impossible to continue automatically after a crash
        or an untoward interruption if corruption or loss of information
        is about to occur.
  
  
        2.4. String Handling
        --------------------
  
        Double-check the programs that have been written at Rhodes -
        there seems to be a problem with string handling in CONF2NOS.C,
        and this might well be more widespread.
  
  
        2.5. Log for Operations
        -----------------------
  
        Design a log for the Ops Staff.
  
        This should follow the lines of that used for the Cybers.  It
        should record when the gate is in production mode, when it is
        down for scheduled maintenance, when it has had a problem and
        the method of attending to the problem and the cause of the
        problem.
  
        It must be possible to produce daily and weekly reports on the
        performance of the facility, its reliability, and the causes and
        durations of stoppages.  The number of dial-ups to the USA that
        were attempted, and their outcomes, should be readily visible.
        These reports will be produced manually by the Ops Staff, as is
        currently done for the Cybers.
  
  
        2.6. Responsibilities to Fidonet
        --------------------------------
  
        The primary reason why Rhodes uses Fidonet at all is for the
        international gateway for Uninet email.  The responsibilities to
        Fidonet need to be spelled out in a Computing Centre document.
  
        For example, what is supposed to happen about the nodefile, what
        files must be transferred from the USA, and what is supposed to
        happen to them when they arrive, what times the Zonegate is
        supposed to accept incoming calls, etc etc.
  
        It would be sensible to classify ideas in this regard as either
        requirements or desirables.
  
  
        2.7. Better Logic
        -----------------
  
        The processing of the messages seems to be illogical.  There are
        too many copies of files being kept and being re-processed.  A
        good system diagram is needed to make the flow of information
        more visible, but it is clearly quite ridiculous to use the PC
        as a filestore when it is a gateway.  If backup copies of files
        need be kept, which is very likely an absolute necessity, then
        they should not be kept on the PC.  Also, when a message has
        been processed, it should be removed.  Here, processing of a
        message does not just mean getting it out of the received
        archived file and into a packet, nor out of a packet into a
        message area.  Once all of the work is done on a message, it
        must be removed from the system.  Reprocessing of messages is to
        be avoided.
  
        Similarly, when an archived file has had all of its processing
        done, it should be removed, as should a packet file (and any
        other file).  Too many files are being left on the PC.
  
        Further, it seems that not all messages are being archived.
        Those that are relevant to a conference are not being archived.
        Everything that passes through the gate should be archived to
        tapes on the Cyber.
  
        There seems to be a problem with looking for an RFC 822
        subheader within the Fidonet text field.  Is this a new
        specification that has been introduced, and if so, where is it
        documented?  What is the purpose of checking for the text
        "Return-Path:" in the first 5000 characters of the text field?
  
        There is no guard against pressing a wrong function key.  When a
        function key is pressed, before proceeding with what might be a
        time-consuming operation the system should notify the operator
        what is about to be done, and wait for a proceed / abort reply.
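
        As an illustration only, the proceed / abort check described
        above could take the following shape.  This is a minimal sketch
        in Python rather than in the gateway's own .BAT files; the names
        confirm and run_if_confirmed, and the P / A replies, are
        invented for the example.

```python
def confirm(action, read_reply=input):
    """Tell the operator what is about to happen and wait for proceed/abort."""
    while True:
        reply = read_reply(f"About to {action}. Proceed (P) or abort (A)? ")
        answer = reply.strip().upper()
        if answer in ("P", "PROCEED"):
            return True
        if answer in ("A", "ABORT"):
            return False
        # Anything else is treated as a slip of the finger: ask again.


def run_if_confirmed(action, task, read_reply=input):
    """Run task() only when the operator has explicitly confirmed it."""
    if confirm(action, read_reply):
        task()
        return True
    return False
```

        The reply reader is passed in as a parameter so that the check
        can be exercised without a keyboard.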
  
  
        2.8. Possible Bug in Archiving?
        -------------------------------
  
        Check for a bug in the process that archives the .MSG files.  If
        these files do not have a 4- (or more-?) digit number, they do
        not get archived.
  
        (Refer to Pat Terry for details - he reported this problem).
  
  
        2.9. Error Recovery
        -------------------
  
        The present method of error recovery leaves a lot to be desired.
        It would be far better to implement a method of recovery that is
        sensitive to the context in which a failure occurred, so that
        recovery can be automated where possible, and prevented (and the
        operator advised accordingly) when auto-recovery is not
        possible.
  
        The system as it was (17 Nov 89) was capable of starting up
        quite merrily, and then failing on transfers to RURES, because
        an interlock file had not been reset.  Yet, there was no warning
        at startup that this interlock was set.
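
        A startup check of the kind that was missing might look like the
        sketch below.  It is written in Python for illustration, and the
        function names and the idea of reporting the age of the
        interlock file are assumptions, not a description of the present
        system.

```python
import os
import time


def check_interlock(lock_path, now=None):
    """Return a warning string if the interlock file exists, else None."""
    if not os.path.exists(lock_path):
        return None
    current = now if now is not None else time.time()
    age_minutes = (current - os.path.getmtime(lock_path)) / 60
    return (f"WARNING: interlock {lock_path} is still set "
            f"(last touched {age_minutes:.0f} minutes ago); "
            "transfers will fail until it is cleared.")


def startup(lock_path):
    """Refuse to start silently when a leftover interlock would block transfers."""
    warning = check_interlock(lock_path)
    if warning:
        print(warning)
        return False
    return True
```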
  
  
        2.10. Testing Environment
        -------------------------
  
        It must be appreciated that the Settler City PC is in full-scale
        production.  It is therefore most unwise to use it as a test-bed
        for ideas.  Changes should not be made "on the fly" except in
        cases of emergency.  A test-bed system must be set up on the AT,
        so that, for example,
  
                programs can be debugged
  
                new .BAT files can be tested
  
                operations on incoming and outgoing .MSG, .PKT
                and .MOx files can be tested
  
                the BINKLEY.EVT file can be edited
  
        The production PC would be used to copy files to a floppy (or a
        series of floppies, using the DOS BACKUP command) and then
        loaded onto the AT for testing / examination or whatever.
  
        Similarly, new processes can be thoroughly tested on the AT
        before being loaded onto the production PC.
  
        Apart from the obvious benefits that will arise from the above,
        it will then be possible to strip the production PC to a minimum
        as far as files are concerned.  Far fewer utilities need be kept
        on-line, only those required for emergency use.  Many of these
        emergency utilities can be kept off-line on floppy disks anyway.
  
  
        2.11. Current Log File
        ----------------------
  
        It should be possible to produce a snapshot of the current daily
        log information at the press of a function key.  This would
        typically then be copied to a floppy disk for examination.  The
        process must not destroy the contents of the daily log.
  
  
        2.12. Commenting
        ----------------
  
        The small number of comments in the .BAT, .EVT and .C files is
        appalling.  It is simply not possible for an intelligent person
        to make any progress with these files without a great deal of
        further study of the Fidonet system, or without getting hold of
        an expert and annoying him considerably.
  
        The lack of a version number in the major files (eg START.BAT,
        BINKLEY.EVT) indicates a sloppy approach to computing, and
        indicates a total lack of appreciation about running a
        production system.
  
        These files should have at least
  
                a version number
  
                a modification record
  
                a method of obtaining ANY of the earlier versions
  
                a ratio of 2:1 for lines of comment to lines of code
  
                a description of how to install, compile or otherwise
                put into production any changes that are made (this to
                be comments in the file itself)
  
                appropriate words of warning or other cautions that
                should be known to anyone who contemplates changing
                the file.
  
  
        2.13. Operations Log
        --------------------
  
        A log of the activities that are performed by the operator and
        the system maintainer must be kept. This should reflect the
        amount of time that the system was
  
                in normal production
  
                stopped due to hardware failure
  
                stopped due to program failure
  
                stopped for hardware maintenance
  
                stopped for program maintenance

        and should also record

                manually initiated mail transfer attempts
  
                any other untoward incident
  
        The log must show the date/time of these events, the name of the
        person who recorded the event, and a brief comment about the
        event itself.
  
        This log will have an associated fault reporting system, similar
        to that for the equipment in the computer room.  When, say, a
        program failure occurs, there must be a comprehensive report
        provided on the failure, describing what program failed, what
        the problem was, how the situation was cleared, what was done to
        fix the program.
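
        To show how such a log could yield the times asked for above,
        here is a minimal sketch in Python.  The record layout (time,
        state entered, recorded by, comment) and the name
        time_in_each_state are assumptions made for the example; the
        log itself may well be kept on paper.

```python
from collections import defaultdict
from datetime import datetime


def time_in_each_state(entries, end_of_period):
    """Sum the hours spent in each state over the logged period.

    entries is a list of (timestamp, state, recorded_by, comment) tuples;
    each state lasts until the next entry, or until end_of_period.
    """
    totals = defaultdict(float)
    ordered = sorted(entries, key=lambda entry: entry[0])
    following = ordered[1:] + [(end_of_period,)]
    for current, nxt in zip(ordered, following):
        hours = (nxt[0] - current[0]).total_seconds() / 3600
        totals[current[1]] += hours
    return dict(totals)
```

        Given one entry each time the state changes, the result is a
        table of hours per state, which is the figure a daily or weekly
        report would need.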
  
  
        2.14. Backup
        ------------
  
        A backup system must be put into place AND TESTED to ensure that
        the system can be reloaded.  The DOS BACKUP and RESTORE commands
        are to be used for this.  The testing of the reloading must be
        onto a PC or AT that is cleared of all except the DOS bootup
        files.
  
        A .BAT controlling file must be provided to cause the backup to
        take place.  The backing up of files over and beyond those
        needed to run the system is to be avoided.  The normal backup
        process is to back up entire directories, and this should be the
        case here.  At the same time, the TREE/F (or similar) command
        must be used to produce a floppy disk file for documentation.
  
  
        2.15. Manual for Ops Staff
        --------------------------
  
        A write-up for the Ops Staff must be provided.  This must
        describe at the least

                how to carry out normal operations

                how to distinguish between normal and failed conditions

                how to cause a full international email interchange to
                take place

                how to record faults

                whom to notify in case of failure
  
  
        2.16. Explanation of Disk Files
        -------------------------------
  
        For each file or category of file on the PC there must be a
        description containing at least
  
        Pathname    Purpose   Creating process  Deleting process
        --------    -------   ----------------  ----------------
  
        Without this it is impossible to determine what files are
        necessary and what are not.
  
        There are also some Ramdisks set up.  There must be a
        description of their use, the size constraints on them, which
        programs use them, and any further information of use to an
        intelligent programmer who is unfamiliar with the details of how
        Fidonet works.
  
  
        2.17. Description of Sources
        ----------------------------
  
        On the Fidonet PC, there must be a documentation file as the
        first file of the sorted C:\FIDO directory describing at least
  
                where to find the original Fidonet disks
  
                how to get help in times of trouble
  
                where the source of Rhodes University programs
                are stored
  
                how to store any changed source programs (NB this
                includes .BAT, .EVT files)
  
                how to use the test AT for debugging
  
                any other useful information
  
  
        2.18. Parameter for START.BAT
        -----------------------------
  
        Modify START.BAT to take a parameter to allow it to be invoked
        to run from a particular point, by default from the beginning.
        Ensure that only sensible values can be provided for this
        parameter.
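
        The validation intended here can be sketched as follows.  The
        sketch is in Python, not in .BAT, and the step names are
        hypothetical: the actual stages of START.BAT are not listed in
        this document.

```python
# Hypothetical step names, in the order they would run.
STEPS = ["FETCH", "UNPACK", "CONVERT", "DIAL", "DELIVER"]


def steps_to_run(start_from=None):
    """Return the steps from the requested restart point, defaulting to all."""
    if start_from is None:
        return list(STEPS)
    name = start_from.strip().upper()
    if name not in STEPS:
        raise ValueError(
            f"unknown restart point {start_from!r}; "
            f"expected one of {', '.join(STEPS)}")
    return STEPS[STEPS.index(name):]
```

        Rejecting an unknown value outright, rather than falling back to
        the beginning, keeps a mistyped parameter from starting a full
        run by accident.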
  
  
        2.19. Recovery from Failure
        ---------------------------
  
        Modify START.BAT so that it will restart automatically from the
        most sensible point after a power failure or any other
        interruption.
  
        If restart is not possible, then START.BAT must inform the
        operator accordingly, and should provide as much useful
        information as possible to help with the manual restart.
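
        One possible way to find "the most sensible point" is to record
        each completed step in a small checkpoint file.  The sketch
        below is in Python and rests on that assumption; the file
        layout and the names record_checkpoint and restart_point are
        invented for the example.

```python
import os


def record_checkpoint(path, step):
    """Record the last completed step, replacing the file in one operation."""
    temporary = path + ".tmp"
    with open(temporary, "w") as handle:
        handle.write(step + "\n")
    os.replace(temporary, path)


def restart_point(path, steps):
    """Return the step to resume from, or None when the operator must decide."""
    if not os.path.exists(path):
        return steps[0]            # no checkpoint: start from the beginning
    with open(path) as handle:
        last_done = handle.read().strip()
    if last_done not in steps:
        return None                # unreadable checkpoint: do not guess
    following = steps.index(last_done) + 1
    # After the final step the cycle is complete, so begin a new one.
    return steps[following] if following < len(steps) else steps[0]
```

        A return value of None corresponds to the case where restart is
        not possible and the operator must be informed.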
  
  
        2.20. Function Key for Forced Dialing
        -------------------------------------
  
        A function key should be set up to force the dialing to the USA.
        The method of typing Alt-M and responding with 1:105/42 is not
        satisfactory, as it is easy to type the wrong characters.
  
        Before dialing, the operator should be prompted to indicate
        whether mail from RURES should be collected first.  The question
        should timeout after 2 minutes, and assume an affirmative reply.
  
        After the telephone connection has been completed, the operator
        should be prompted to indicate whether the incoming mail should
        be processed.  This question should timeout after 2 minutes, and
        assume an affirmative response.
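
        The timed question with an affirmative default can be sketched
        as below.  This is an illustration in Python under stated
        assumptions: the name ask_yes_no and the [Y/n] wording are
        invented, and the reply is read on a helper thread so that the
        wait can be abandoned after the timeout.

```python
import queue
import threading


def ask_yes_no(question, timeout_seconds=120, read_line=input):
    """Ask a yes/no question; assume "yes" if no reply arrives in time."""
    replies = queue.Queue()

    def reader():
        try:
            replies.put(read_line(question + " [Y/n] "))
        except EOFError:
            pass  # no console: the wait below then times out to the default

    # Daemon thread, so an unanswered prompt cannot keep the program alive.
    threading.Thread(target=reader, daemon=True).start()
    try:
        reply = replies.get(timeout=timeout_seconds)
    except queue.Empty:
        return True  # timed out: affirmative by default
    return not reply.strip().lower().startswith("n")
```

        Note that after a timeout the helper thread is still waiting for
        a keypress; a production version would need to discard any late
        reply.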
  
  
        2.21. Mailspec Extensions
        -------------------------
  
        What extensions have been added or attempted over and above the
        mailspec given in the MAILnnn files?
  
        For example, it seems that a test was put into CONF2NOS to look
        in the first 5000 characters for the text "Return-path:".  This
        is a case-sensitive check, which violates RFC822 standards for
        starters, and ignores totally the concept of a header followed
        optionally by a blank line and text.
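
        For comparison, a check that respects both points (field names
        compared without regard to case, and the header ending at the
        first blank line) might be sketched as follows.  It is written
        in Python for illustration and is not the code in CONF2NOS; the
        function names are invented, and a repeated field simply
        overwrites the earlier one.

```python
def parse_header(text):
    """Split a message into (fields, body) at the first blank line.

    Field names are lower-cased because RFC 822 treats them as
    case-insensitive; continuation lines (leading space or tab) are
    folded into the preceding field.
    """
    lines = text.replace("\r\n", "\n").split("\n")
    fields = {}
    last = None
    for index, line in enumerate(lines):
        if line == "":
            return fields, "\n".join(lines[index + 1:])
        if line[0] in " \t" and last is not None:
            fields[last] += " " + line.strip()
        elif ":" in line and not line[0].isspace():
            name, value = line.split(":", 1)
            last = name.strip().lower()
            fields[last] = value.strip()
        else:
            # Not a header line at all: treat the whole text as body.
            return {}, text
    return fields, ""


def has_return_path(text):
    """True only when Return-Path appears as a header field, in any case."""
    return "return-path" in parse_header(text)[0]
```

        Text in the body that merely looks like a header field is not
        matched, because parsing stops at the blank line.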
  
  
        2.22. Algorithms
        ----------------
  
        Sorely lacking in CONF2NOS.C is any description of the
        algorithms used.  This is a major oversight, and must be
        corrected.
  
        The programs should be written first and foremost for people to
        read, and then have the computer code added.
  
  
        2.23. Repetition
        ----------------
  
        It seems pretty obvious that when function key F10 is pressed
        immediately after the PC has finished processing a previous F10,
        a great deal of processing takes place.  This is grossly
        inefficient - when a file is processed successfully, it should
        not be processed again except by deliberate and intentional act.
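
        One way to guarantee this is to move each file out of the input
        directory as soon as it has been handled successfully, so that a
        second run finds nothing to do.  The Python sketch below assumes
        a hypothetical staging directory (done) from which files would
        later be archived and cleared; the names are invented.

```python
import os


def process_pending(inbox, done, handle):
    """Process each file in inbox once, moving it to done on success."""
    os.makedirs(done, exist_ok=True)
    processed = []
    for name in sorted(os.listdir(inbox)):
        source = os.path.join(inbox, name)
        if not os.path.isfile(source):
            continue
        # handle() raises on failure, which leaves the file in place
        # for a later retry instead of losing it.
        handle(source)
        os.replace(source, os.path.join(done, name))
        processed.append(name)
    return processed
```

        Running process_pending twice in succession does the work once:
        the second call returns an empty list.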
  
  
        2.24. Modem Parameters
        ----------------------
  
        Provide a .BAT file to set up the modem to its correct settings.
        The file should have a comprehensive description of what it is
        doing.  It must also describe any hardware settings required by
        the modem.
  
        Currently, if the modem gets a bad setting, the settings have to
        be guessed.
  
  
        2.25. Break-in
        --------------
  
        Describe how to break into a poll attempt, and how to break into
        other telephone activities.  (eg at worst, power off the modem).
  
  
        2.26. Changing Directories
        --------------------------
  
        AVOID LIKE THE PLAGUE any changing of directories from within a
        program or .BAT file.  It is dangerous, and caused a series of
        files to be wiped out when a directory did not exist.
  
        When a program fails, the PC is left in an arbitrary directory,
        and this is dangerous.
  
        In cases where it is absolutely unavoidable, then devise a
        foolproof check to see that the change took place.
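
        Such a check might be sketched as follows, in Python for
        illustration.  The name working_directory is invented; the
        point is that the change is verified before any work is done,
        and that the original directory is restored even when the work
        fails.

```python
import os
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Change to path, verify the change took place, and always change back."""
    original = os.getcwd()
    target = os.path.realpath(path)
    os.chdir(target)  # raises OSError if the directory does not exist
    if os.path.realpath(os.getcwd()) != target:
        os.chdir(original)
        raise RuntimeError(f"directory change to {path} did not take effect")
    try:
        yield target
    finally:
        os.chdir(original)
```

        Work that deletes or overwrites files would go inside the
        "with working_directory(...)" block, so it cannot run in the
        wrong place when the directory is missing.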
  
  
        2.27. Cleanup
        -------------
  
        When mail has passed through the PC, it should be cleaned out of
        the PC.  Any files required for backup should be stored in a
        tape archive on the Cyber.
  
        Do not use the PC as a filestore, nor as a backup system.  It is
        a gateway.
  
  
        2.28. Regular Backups
        ---------------------
  
        A regular backup procedure must be instituted.  This must use
        the DOS BACKUP process, to back up the PC files onto floppy
        disks.  This backup should be used to re-create the Fidonet
        system, and must be tested to do this.  It should not re-create
        any files with messages unless these are essential to run the
        system.
  
  
        MAIL017 Ends
