Disks

By Steve Pedler

 

Issue 25

Jan/Feb 87


Steve Pedler discovers just how all that data is stored on your disks

Although knowledge of the structure of files stored on disk is not necessary in order to use a disk drive, the subject is an interesting one and information about it is essential if you wish to carry out certain tasks such as repairing damaged files or creating boot programs. The following article examines the structure of various types of disk file, and in the second part of the article I will present a sector editor enabling you to directly read and write to disk sectors.

All references to DOS and disk drives in the article relate to the current Atari standard of 1050 drive and DOS 2.5, unless stated otherwise.

THE DISK ITSELF

A floppy disk consists of a thin, circular piece of plastic coated with metal oxides which store the data in magnetic form. As initially supplied the disk is not usable, and the surface must first be organised to store data, by a process known as formatting. The surface of a formatted disk is divided into 40 concentric tracks. Each track is in turn divided into 18 (single density, or 26 enhanced density) sectors, each of which holds 128 bytes of data. Data is therefore packed rather more closely onto an enhanced density disk, which means that the disk surface must be higher quality to ensure reliable storage. In fact, the only difference between disks designated by the manufacturer as single or double density is that one has been tested for higher quality. Prior to formatting the drive cannot distinguish between them. It is important to use a quality disk as formatting a disk designated as single density with DOS option I will automatically result in an enhanced density format, which might lead to unreliable data storage. To specifically format a disk in single density, use DOS option P.

Once the disk is formatted, the 1050 (but not the 810) drive can distinguish between single and enhanced density and use the disk accordingly. The 810 drive can use a single density disk formatted on a 1050, but not an enhanced density one. Note that DOS 2.0S can read an enhanced density disk in a 1050 drive, but sectors numbered 720 or greater are invisible to it and files using these sectors will be unavailable.

SECTOR NUMBERS

From the figures above, you will see that theoretically a single density disk contains 720 sectors (40 tracks * 18 sectors per track = 720 sectors) and an enhanced density disk contains 1040 sectors. Examination of a freshly formatted disk (not containing DOS files) shows however that you only have 707 or 1010 free sectors respectively. What happened to all those missing sectors?

On a single density disk, as part of the format process, eight sectors (361-368) are reserved for the disk directory and a further sector (360) for the Volume Table of Contents (VTOC). The structure and use of these sectors is described below. Three more sectors (1-3) are reserved for the DOS file manager boot file (see below). Finally, one sector is lost due to a discrepancy between the original version of DOS and the original disk drives. As far as the drive is concerned, the 720 sectors on the disk are numbered from 1 to 720, but DOS numbers them from 0 to 719. The result is that sector 720 just does not exist as far as DOS is concerned. No doubt this could have been corrected with later versions of DOS, but then there would have been a loss of compatibility between the various versions. Anyway, this makes a total of 13 unavailable sectors, leaving 707 free for use. (Note that these sectors are only unavailable within the confines of DOS - you can use any of them in any way you like by bypassing DOS and doing direct sector-oriented disk access.)

Although 1040 sectors are present on an enhanced density disk, due to the file link structure DOS 2.5 cannot use sector numbers greater than 1023. The reason for this will become apparent when discussing linked sector files below. Of the 1023 sectors available, 12 are reserved for the directory, VTOC, and DOS boot file as above. Although sectors numbered 720 or above can be used by DOS 2.5, to ensure maximum compatibility with DOS 2.0S sector 720 is marked as unavailable. This leaves 1010 sectors free for use.
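The sector arithmetic above can be checked with a few lines of Python. This is only a sketch of the figures quoted in the text; the function and constant names are my own and do not correspond to anything in DOS.

```python
TRACKS = 40               # tracks per formatted disk surface
DIRECTORY_SECTORS = 8     # sectors 361-368
VTOC_SECTORS = 1          # sector 360
BOOT_SECTORS = 3          # sectors 1-3


def total_sectors(sectors_per_track):
    """Physical sectors on the disk: tracks times sectors per track."""
    return TRACKS * sectors_per_track


def free_sectors_single():
    """Free sectors on a freshly formatted single density disk."""
    reserved = DIRECTORY_SECTORS + VTOC_SECTORS + BOOT_SECTORS
    lost_to_numbering = 1  # sector 720 does not exist as far as DOS is concerned
    return total_sectors(18) - reserved - lost_to_numbering


def free_sectors_enhanced():
    """Free sectors on a freshly formatted enhanced density disk."""
    addressable = 1023     # limit quoted in the text for DOS 2.5
    reserved = DIRECTORY_SECTORS + VTOC_SECTORS + BOOT_SECTORS
    kept_for_compatibility = 1  # sector 720 is marked as unavailable
    return addressable - reserved - kept_for_compatibility
```

Calling free_sectors_single() and free_sectors_enhanced() should reproduce the 707 and 1010 free sectors reported on a freshly formatted disk.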

THE DIRECTORY

The directory consists of eight sectors starting at sector 361. These were chosen because they are in the middle of a single density disk and therefore give the shortest average disk access time. Each directory entry is 16 bytes long, giving eight entries per sector and a total of 64 entries. The 16 bytes of each entry are used as follows:

Byte 1 - flag or status byte. The various bits in this byte, if set, have the following meanings:

bit 0 - special meaning for DOS 2.5 - see below
bit 1 - file created by DOS 2 (if this bit is clear, it is a DOS 1 file)
bits 2-4 - spare
bit 5 - file is locked
bit 6 - entry in use (i.e. not that the file is OPEN, but that this directory entry is valid and cannot be used for a new file)
bit 7 - file has been deleted

In most publications the setting of bit 0 of the status byte is said to indicate that the file is OPEN. However, under DOS 2.5 if this bit is set it appears to indicate that the file uses sectors numbered 721 or greater, this file therefore being unavailable to DOS 2.0S. When doing a directory read, DOS 2.5 will bracket these files to indicate this to the user. Such files have the value 3 in the directory entry status byte. (Not 67 as you might expect from the list of bit values above. If you deliberately change the value from 3 to 67 using a sector editor, the file will no longer appear when the directory is read.) The status byte can therefore contain the following values:

value (decimal)   meaning
3                 DOS 2.5 file using sectors numbered 721 or more
35                as above, but file locked
66                DOS 2 file, entry in use
98                as above, but file locked
128               file deleted

When a file is deleted, bit 7 of the flag byte is set (and all other bits cleared) but the filename is not removed from the directory. The file data itself is not erased, but the sectors used by the file are marked in the VTOC as being available for use again (see below). Under certain conditions it may be possible to recover a deleted file (e.g. using the DOS 2.5 utility DISKFIX.COM), but probably not if another file has been written to the disk since the old one was deleted. The new file may have used the directory space and sectors occupied by the deleted file, making recovery impossible.

Bytes 2 and 3 - total number of sectors used by the file, in low and high byte format.

Bytes 4 and 5 - sector number of the first sector in the file, again in low and high byte format.

Bytes 6 - 13 - primary filename. If this directory space has never been used, this area contains only zeroes.

Bytes 14 - 16 - filename extension (or zeroes).

Normally, when you do a directory read you only get the filename and sector count, plus an asterisk marker if the file is locked. To get the rest of the information in the directory entry, you will need to use a sector reader which bypasses DOS and reads in the entire sector. From BASIC the directory is usually read using a statement such as: OPEN #1,6,0,"D:*.*". However, DOS 2.5 can use sector numbers greater than 720, which would not be usable by DOS 2.0S. If you use the following statement instead: OPEN #1,7,0,"D:*.*", DOS will bracket any file using sector numbers greater than 720 (e.g. as <FILENAME.EXT>).
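To make the layout of a directory entry concrete, here is a minimal Python sketch that decodes one 16 byte entry from a raw directory sector. The function name and the dictionary keys are hypothetical, and I have assumed that short filenames are padded with spaces, which the description above does not state. Note that the article counts the bytes from 1, while Python indexes them from 0.

```python
def parse_directory_entry(entry):
    """Decode one 16 byte directory entry (article bytes 1-16, indexed 0-15 here)."""
    if len(entry) != 16:
        raise ValueError("a directory entry is exactly 16 bytes long")
    flag = entry[0]
    return {
        "flag": flag,
        "dos25_extended": bool(flag & 0x01),  # bit 0: uses sectors above 720
        "dos2_file": bool(flag & 0x02),       # bit 1: created by DOS 2
        "locked": bool(flag & 0x20),          # bit 5
        "in_use": bool(flag & 0x40),          # bit 6
        "deleted": bool(flag & 0x80),         # bit 7
        "sector_count": entry[1] | (entry[2] << 8),  # low byte, then high byte
        "start_sector": entry[3] | (entry[4] << 8),
        # Assumption: names are padded with spaces; unused entries hold zeroes.
        "name": entry[5:13].decode("ascii", errors="replace").rstrip(" \x00"),
        "ext": entry[13:16].decode("ascii", errors="replace").rstrip(" \x00"),
    }
```

A 128 byte directory sector can be split into eight such entries by slicing it at multiples of 16 before calling the function.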

THE VTOC

This is located in sector 360 (single density) or sectors 360 and 1024 (enhanced density). As indicated above, two VTOC sectors are necessary for an enhanced density disk as one sector is insufficient to store information about all 1023 sectors. Its purpose is to provide a map of which sectors are being used to store files and which are currently free to be used in a new file. The first five bytes of sector 360 contain miscellaneous information:

Byte 0 - directory type byte. According to the OS User's Manual, this should always be zero, but appears to be set to 2 under DOS 2.5 and DOS 2.0S.

Bytes 1 and 2  total sector count (in low and high byte format) on the disk available to DOS. Should equal 707 for single density and 1010 for enhanced density.

Bytes 3 and 4  free sector count. This is the number of currently available (free) sectors up to a maximum of 707. It is therefore the same number that appears at the end of a directory read as 'xxx FREE SECTORS' on a single density (but not an enhanced density) disk. On an enhanced density disk, the number of additional free sectors is stored in bytes 122 and 123 of sector 1024.

Starting at byte 10 of sector 360 is the sector use bitmap. Each byte in the map contains the in-use status of eight sectors, one bit per sector. On a single density disk, the map continues to byte 99 of sector 360, but one sector is insufficient to map all the sectors on an enhanced density disk and so sector 1024 is used as well. Each byte is used as shown:

Byte 10   bit      7   6   5   4   3   2   1   0
          sector   0   1   2   3   4   5   6   7

Byte 11   bit      7   6   5   4   3   2   1   0
          sector   8   9  10  11  12  13  14  15

If a bit is clear, the sector is in use; if set, it is available for a new file. Note that sector zero, although present in the map, does not exist (see above). The map continues as shown above to byte 99 of sector 360, bit 0 (the rightmost bit) of which represents sector 719. It should be noted that even on an enhanced density disk the map finishes here, and no more bytes of this sector are used. On such a disk, the bitmap in sector 1024 starts at byte 0 (not byte 10 as in sector 360). Bit 7 (the leftmost bit) of byte 0 represents sector 48. The bitmap continues to byte 121, bit 0 of this byte representing sector 1023. Bytes 122 and 123 store the number of currently available free sectors in addition to those stored in bytes 3 and 4 of sector 360. In other words, a freshly formatted enhanced density disk (without DOS files) will have a total of 1010 free sectors. This number is stored in bytes 1 and 2 of sector 360 and will remain unchanged. Bytes 3 and 4 of sector 360 will contain the number 707, and bytes 122 and 123 of sector 1024 the number 303 (707 + 303 = 1010). These numbers will be updated as files are saved and deleted.

Because the bitmap in sector 1024 starts at sector 48, there is a considerable amount of overlap between the two VTOC sectors. Both sectors will need to be examined to get the free sector count on a directory read, and both may need to be updated when a file is written to disk. This presumably accounts for the considerable amount of drive head movement with this version of DOS, which did not happen with DOS 2.0S or DOS 3.
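The mapping from a sector number to its bit in the VTOC can be sketched in Python as follows. The function names are my own. Since sectors 48 to 719 appear in both maps, the sketch simply consults sector 360 for anything below 720 and sector 1024 for the rest; that choice is an assumption, not a description of what DOS 2.5 itself does.

```python
def vtoc_position(sector):
    """Return (vtoc_sector, byte_index, bit_number) for a DOS sector number."""
    if 0 <= sector <= 719:
        # Sector 360: the map starts at byte 10, bit 7 represents the lowest sector.
        return 360, 10 + sector // 8, 7 - sector % 8
    if 720 <= sector <= 1023:
        # Sector 1024: the map starts at byte 0, whose bit 7 represents sector 48.
        offset = sector - 48
        return 1024, offset // 8, 7 - offset % 8
    raise ValueError("sector number outside the range mapped by the VTOC")


def sector_is_free(vtoc360, vtoc1024, sector):
    """True if the sector's bit is set (available), False if clear (in use)."""
    which, byte_index, bit = vtoc_position(sector)
    data = vtoc360 if which == 360 else vtoc1024
    return bool(data[byte_index] & (1 << bit))
```

Here vtoc360 and vtoc1024 stand for the raw 128 byte contents of the two VTOC sectors, however you have read them from the disk.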

DISK FILE STRUCTURE

After all the above (necessary) preliminaries, let us now look at the structure of files stored on disk. Generally speaking, there are two main types of file. These are firstly, files created and maintained by the disk file manager (linked or chained sector files) and secondly boot program files.

CHAINED SECTOR FILES

These are the commonest type of file and examples include those created by BASIC SAVE or LIST commands, the Binary Save option from DOS, word processor text output, assembler object files and so on. With this type of file, only the first 125 bytes (bytes 0 - 124) of each sector contain file data. The remaining three bytes contain the file link data, which is stored in the following way:

Byte 125 - the most significant six bits of this byte contain the file number, which corresponds to the position of the filename in the directory, and will be in the range 0 - 63. The remaining two bits (bits 0 and 1) plus the whole of byte 126, make up the 'forward pointer'.

Byte 126 - this byte plus two bits from byte 125 is the forward pointer, and contains the sector number of the next sector in the file. Bit 1 of byte 125 is therefore the most significant bit of the pointer. 10 bits of pointer can only store a maximum number of 1023 in binary form and this is why the sectors numbered from 1024 to 1040 on an enhanced density disk are unavailable to DOS 2.5. The same amount of pointer was also used on DOS 3, but note that just one extra bit of pointer would have allowed a true double density disk drive! Presumably Atari did not do this when developing the 1050 and DOS 3 in order to maintain compatibility with previous versions of DOS. However, DOS 3 when produced was totally incompatible with DOS 2.0S for other reasons!

Byte 127 - this byte contains the actual number of data bytes stored in this sector. For all but the last sector in the file, this should be 125. The last sector might contain 125 bytes, but this won't happen unless the file length is an exact multiple of 125.

From this you can see that the disk file manager finds the first sector of a file from the directory. 125 bytes of data are loaded from that sector and loading continues from the sector specified in the link data. This process is repeated until the forward pointer reads zero, which indicates that this is the last sector in the file. As each sector is loaded, DOS checks that the file number (stored in byte 125) is the same as the file entry position in the directory. If the numbers differ, loading stops and error 164 (File Number Mismatch) is returned. Although this may seem a complex process, it does have the advantage that files do not need to be stored in a string of consecutive sectors, but can be scattered around the disk if necessary, depending on the availability of storage space.
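The loading process just described can be sketched in Python. This is an illustration only, not the DOS code: read_sector is a hypothetical callback that returns the 128 bytes of a given sector, and the loop guard is my own addition to protect against a damaged chain that points back on itself.

```python
def parse_link(sector):
    """Split the three link bytes of a 128 byte sector into their fields."""
    file_number = sector[125] >> 2                            # top six bits
    next_sector = ((sector[125] & 0x03) << 8) | sector[126]   # 10 bit pointer
    byte_count = sector[127]                                  # data bytes in use
    return file_number, next_sector, byte_count


def read_chain(read_sector, first_sector, file_number):
    """Follow the forward pointers and return the file's data bytes."""
    data = bytearray()
    visited = set()
    current = first_sector
    while current != 0:               # a zero pointer marks the last sector
        if current in visited:
            raise ValueError("sector chain loops back on itself")
        visited.add(current)
        raw = read_sector(current)
        number, next_sector, count = parse_link(raw)
        if number != file_number:
            raise ValueError("error 164: file number mismatch")
        data += raw[:count]
        current = next_sector
    return bytes(data)
```

The first sector and the file number come from the directory: the file number is the position of the entry (0 - 63), and the first sector is held in bytes 4 and 5 of the entry.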

There are two special cases of this kind of file we should consider. Binary files are machine code programs created by the Binary Save option of DOS (which saves a specified area of memory to disk) or the object code output from an assembler. The first six bytes of any such file are known as the file header, and have this format:

Bytes 0 and 1 - both set to 255 (hex $FF). This is an identifier for a binary file.
Bytes 2 and 3 - the start address in low and high byte format.
Bytes 4 and 5 - the end address, again in low and high byte format.

When you select DOS option L (Binary Load) the start and end addresses are obtained from the first six bytes of the first sector of the file, and the program itself loaded into memory, beginning at the load address and continuing until the end address is loaded. The Binary Save option of DOS allows you to specify optional initialization and run addresses. If present, these are appended to the end of the file. On loading the file, the initialization address will be loaded into locations 738 and 739 (INITAD) and the run address into locations 736 and 737 (RUNAD). On completing the load, control is passed back to the DOS menu if neither of these addresses has been specified. If an initialization address is present, DOS performs a machine language JSR instruction to the address contained in INITAD. The code specified here should end with an RTS instruction to return control to DOS. If a run address is specified, DOS will then JSR to this. Either or both (or neither) of these addresses may be used. Note that they do not need to point to code within the loaded program - they could be used to call operating system routines for example, or pass control to BASIC. An AUTORUN.SYS file is simply a special case of a binary file. After DOS is booted on powerup, it will look for a file named AUTORUN.SYS on the disk and load and run it if present. To autorun, the file must have either an initialization or run address appended.
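As a small illustration of the six byte header, the following Python sketch decodes it from the start of a binary file. The function name is hypothetical. I have taken the end address to be the last byte loaded, which is how I read 'continuing until the end address is loaded', so the length works out as end - start + 1.

```python
def parse_binary_header(data):
    """Decode the six byte header of a binary load file."""
    if len(data) < 6 or data[0] != 0xFF or data[1] != 0xFF:
        raise ValueError("not a binary load file (missing $FF $FF identifier)")
    start = data[2] | (data[3] << 8)   # low byte, then high byte
    end = data[4] | (data[5] << 8)
    if end < start:
        raise ValueError("end address is below start address")
    length = end - start + 1           # assumes the end address is inclusive
    return start, end, length
```

The data bytes to be placed in memory follow immediately after the header, so a loader would copy data[6:6 + length] to the start address.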

The second 'special case' is that of a file created by the BASIC SAVE command. A BASIC program is stored in memory in tokenised form, whereby the BASIC keywords and variable names are represented by one byte tokens rather than their full ATASCII form. This has the advantage of saving considerable amounts of memory, but means that BASIC must maintain lists of variable names and their current values so that it knows which token represents which variable. Logically enough, these are called the variable name and variable value tables. When a BASIC SAVE is made, the program is saved in tokenised form and the above tables must be saved with it. In fact, a series of zero page pointers and several blocks of memory are also saved, including the following:

1) zero page pointers:

locations   name     function
128,129     LOMEM    pointer to the lowest memory location usable by BASIC
130,131     VNTP     pointer to the beginning of the variable name table
132,133     VNTD     pointer to the end of the variable name table
134,135     VVTP     pointer to the beginning of the variable value table
136,137     STMTAB   pointer to the beginning of the tokenised program
138,139     STMCUR   pointer to the token in a program line currently being processed, either during input of a line or when the program is run
140,141     STARP    pointer to the beginning of the string and array storage area, and therefore to the end of the program

 

These seven pointers are saved to disk in the order shown, but before doing so one change is made - the value in LOMEM is subtracted from each one and the resulting value saved. Since LOMEM itself is saved first, this means that the first two bytes of the file are always zero.

2) sections of the tokenised program:

This comprises the following blocks of memory in this order:

the variable name table

the variable value table

the tokenised program

the immediate mode line

Note that the string/array storage area is not saved, as all strings and arrays are redimensioned each time the program is run.

 

When a BASIC LOAD is made, the seven pointers are read in first, and the value in MEMLO (locations 743,744 - the operating system pointer to the bottom of free memory) is added to each one. The values in two more zero page pointers, RUNSTK (142,143 - pointer to a software stack used by BASIC in processing GOSUB statements and FOR...NEXT loops) and MEMTOP (144,145 - pointer to the top of memory used by BASIC, including the string/array area) are set to the value in STARP. Next, 256 bytes directly above the value in LOMEM are reserved as an output buffer used when BASIC is tokenising a line. Finally, the variable tables and the tokenised program are read in to memory immediately following the output buffer.
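The way the seven pointers are made relative on SAVE and absolute again on LOAD can be sketched in Python. This is not how BASIC itself is written; the function names are hypothetical, and I have assumed each pointer is stored low byte first, as is usual on the 6502.

```python
POINTER_NAMES = ["LOMEM", "VNTP", "VNTD", "VVTP", "STMTAB", "STMCUR", "STARP"]


def pointers_for_save(pointers):
    """Return the 14 header bytes: each pointer minus LOMEM, low byte first."""
    base = pointers["LOMEM"]
    out = bytearray()
    for name in POINTER_NAMES:
        offset = pointers[name] - base
        out += bytes([offset & 0xFF, offset >> 8])
    return bytes(out)


def pointers_after_load(header, memlo):
    """Rebuild absolute pointers by adding MEMLO to each saved offset."""
    result = {}
    for index, name in enumerate(POINTER_NAMES):
        offset = header[2 * index] | (header[2 * index + 1] << 8)
        result[name] = offset + memlo
    return result
```

Because LOMEM minus itself is zero, the first two bytes produced by pointers_for_save are always zero, as noted above. Storing offsets rather than absolute addresses is what allows the same file to load correctly on a machine whose MEMLO differs from the one it was saved on.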

BOOT PROGRAM FILES

These are machine code programs which are loaded into memory and run (if desired) by the operating system at powerup. Unlike the binary files discussed previously they do not require DOS to be present in memory or on the disk in order to be loaded or run, nor do they need the presence of BASIC or any other language. The file structure therefore differs fundamentally from chained sector files. Because DOS is not used, sector chaining is not needed and boot program sectors contain 128 bytes of program data and no link data. The operating system boot loader routine always attempts to load boot files at powerup starting at sector 1 of drive 1, meaning that generally speaking there can only be one boot file per disk and this must consist of a consecutive string of sectors beginning at sector 1. These files do not require a directory entry, and sector usage need not be indicated in the VTOC. There is an important exception to these rules, discussed below. As with the binary files discussed earlier, these files contain a six byte header. The six bytes are used as follows:

Byte 0 - flags byte. This is not generally used and is usually zero.
Byte 1 - number of sectors to be loaded, including the first sector. This can range from 1 - 255. If it is zero, 256 sectors will be loaded. What if the file is longer than 256 sectors? See below for the explanation.

Bytes 2 and 3 - the load address. The file is read into memory starting at this address.

Bytes 4 and 5 - the initialization address.


What exactly happens during the boot process? The procedure is described in considerable detail in De Re Atari or the Operating System User's Manual, but the following is a brief outline. Cassette users should note that the process is essentially similar for the cassette boot process.

 

As part of the powerup routine, the operating system (OS) checks to see if a cartridge is present (or built-in BASIC enabled). If so, the cartridge's 'Allow disk boot' flag is checked, to determine if the cartridge software permits the disk to be booted (as it would in the case of BASIC or other languages, but not in most games). Providing a disk boot is allowed, or if no cartridge is present and BASIC is disabled, the boot process goes ahead.


Assuming drive 1 is switched on, the OS will attempt to read sector 1 into memory. If it cannot do so - if no disk is in the drive for example - the boot process is aborted and the message 'BOOT ERROR' written to the screen. If all is well, the 128 bytes in sector 1 are read into a specified area of RAM (the cassette buffer in fact). The first six bytes (the header) are described above. The values in these bytes are then moved to the following locations:

 

Byte 0 to location 576 (DFLAGS)

Byte 1 to 577 (DBSECT)
Bytes 2 and 3 to 578,579 (BOOTAD)

Bytes 4 and 5 to 12,13 (DOSINI).

The entire sector (including the header) is then moved to the area of memory beginning at the address now present in BOOTAD. The remaining sectors are then read from disk directly into the memory area following the first sector.
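For readers who want to inspect sector 1 of their own disks, here is a hedged Python sketch that decodes the six byte boot header and works out where the image will sit in memory. The function name and dictionary keys are my own; only the byte positions and the OS location names come from the description above.

```python
SECTOR_SIZE = 128


def parse_boot_header(sector1):
    """Decode the boot header held in the first six bytes of sector 1."""
    if len(sector1) < 6:
        raise ValueError("need at least the six header bytes")
    count = sector1[1] or 256              # a zero count means 256 sectors
    load = sector1[2] | (sector1[3] << 8)  # low byte, then high byte
    init = sector1[4] | (sector1[5] << 8)
    return {
        "flags": sector1[0],               # copied to DFLAGS (576)
        "sector_count": count,             # byte 1 is copied to DBSECT (577)
        "load_address": load,              # copied to BOOTAD (578,579)
        "init_address": init,              # copied to DOSINI (12,13)
        "image_end": load + count * SECTOR_SIZE - 1,  # last byte loaded
    }
```

Since boot sectors carry no link bytes, the image size is simply the sector count multiplied by 128.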

When the load is complete, the OS performs a JSR to the address contained in BOOTAD + 6 (i.e. to the first byte of the actual program). This part of the program need not do anything, but if the file was longer than 256 sectors any remaining sectors should be loaded by the part of the program contained here. This part of the program should end by clearing the 6502 carry flag to indicate a successful load (even if no further sectors were loaded) or setting the carry flag if the load was unsuccessful. It must terminate with an RTS.

The OS will next JSR to the address in DOSINI for program initialization. Again, this section need do nothing, if so desired. It must end with an RTS. However, if the booted program is at some stage to take control of the computer, this section of the program should store the run (or 'restart') address of the program into locations 10 and 11 (DOSVEC). If this is not to be the case, DOSVEC should be left unchanged. On powerup, DOSVEC is set to point to the memopad (400/800) or self-test (XL/XE) routines. If DOS is booted, it will change DOSVEC to point to the routine to load the DOS menu. BASIC will jump through DOSVEC when you type the keyword DOS, and this explains why, if you call DOS when it has not been booted, you go into the self-test/memopad routine.

Finally, the OS will pass control to the cartridge software (or BASIC) if present. If both BASIC and cartridges are absent, the OS passes control directly to the booted program by jumping through DOSVEC. Booting DOS without a cartridge or BASIC will therefore go straight to the DOS menu; powering up the machine without cartridge or disk boot and with BASIC disabled will proceed to the memopad/self-test routine. Note that whenever the Reset button is pressed, at the end of the warmstart process the OS will carry out the final two steps described above.

One special case of booted software is that of DOS itself. Although DOS is booted into memory on powerup, it actually consists of two separate files - the three boot sectors (1-3) and the file DOS.SYS. On powerup, the OS reads in the boot sectors and these will in turn load DOS.SYS. This has the advantage that DOS.SYS can be located anywhere on the disk, and can be deleted if required. Otherwise, a string of 40 consecutive sectors would have to be permanently reserved for it, even if you did not want DOS on a particular disk. However, this does mean that sector 1 takes on a slightly different format. The six byte header is the same as before, but the three bytes following the header are a JMP instruction to the code which loads in DOS.SYS. Following these three bytes, there is a series of data bytes needed by DOS. The use of these bytes and their (usual) value is as follows (bytes 0 - 5 are the file header):

byte       usual value   function
0          0             flag byte
1          3             number of sectors to load
2,3        0,7           load address for the three boot sectors
4,5        64,21         initialization address
6,7,8      76,20,7       JMP instruction to bypass the data bytes (JMP $0714)
9          3             maximum number of simultaneously open disk files (you can have open files to other devices as well). Each open file is allocated a 128 byte buffer. You can increase this number to a maximum of seven, but you will lose 128 bytes for every additional buffer.
10         3             drive numbers supported - in this case drives 1 and 2. Up to four drives can be supported, and each drive is represented by one bit in this byte (bit 0 = drive 1, bit 1 = drive 2 and so on). Again, this byte can be altered to add more drives to your system.
11         0             buffer allocation direction (no, I don't know what it means either, but apparently it should always be zero)
12,13      204,25        boot image end address + 1
14         1             if zero, it means that the file DOS.SYS is not present on the disk. A nonzero value means that it is.
15,16      4,0           starting sector of the file DOS.SYS in low and high byte format.
17,18,19   125,203,4     I am uncertain of the use of these bytes.

Note that the value of some of these bytes may vary from the above depending on disk configuration and customisation of DOS. The Disk File Manager (three boot sectors and the file DOS.SYS) forms an exception to the usual rules for boot programs. Although DOS.SYS acts to all intents and purposes as a boot file, it has a directory entry, its sectors are marked as 'in use' in the VTOC and it has a linked sector structure. The initial three boot sectors however are a conventional boot file with the slight variation to sector 1 described above.
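The DOS data bytes in sector 1 can be decoded with a short Python sketch such as the one below. It covers only the bytes whose purpose is described above; the function name and dictionary keys are hypothetical, and it checks only the four drive bits mentioned in the text.

```python
def decode_dos_boot_config(sector1):
    """Decode the DOS data bytes that follow the boot header in sector 1."""
    if len(sector1) < 17:
        raise ValueError("need at least the first 17 bytes of sector 1")
    opcode, low, high = sector1[6], sector1[7], sector1[8]
    # Bit 0 = drive 1, bit 1 = drive 2 and so on (four drives per the text).
    drives = [n + 1 for n in range(4) if sector1[10] & (1 << n)]
    return {
        # 76 ($4C) is the 6502 JMP absolute opcode; anything else is left undecoded.
        "jmp_target": (low | (high << 8)) if opcode == 0x4C else None,
        "max_open_files": sector1[9],
        "drives": drives,
        "image_end_plus_one": sector1[12] | (sector1[13] << 8),
        "dos_sys_present": sector1[14] != 0,
        "dos_sys_start": sector1[15] | (sector1[16] << 8),
    }
```

With the usual values from the table, the JMP target decodes to $0714, drives 1 and 2 are enabled, and DOS.SYS starts at sector 4.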

And that just about completes our discussion of Atari disk file structure! In order that you may learn a little more about disk files, I have written a simple sector editor but that will have to wait for the next issue. See you then!
