
File Management


Q.1 What is the role of the File Manager?

To carry out its responsibilities, the File Manager must perform these four tasks:

1. Keep track of where each file is stored.

2. Use a policy that will determine where and how the files will be stored, making sure to efficiently use the available storage space and provide efficient access to the files.

3. Allocate each file when a user has been cleared for access to it, then record its use.

4. Deallocate the file when the file is to be returned to storage, and communicate its availability to others who may be waiting for it.

 

Q.2 What are the tasks performed by the file manager?

The File Manager performs its four tasks as follows:

1.   In a computer system, the File Manager keeps track of its files with directories that contain the filename, its physical location in secondary storage, and important information about each file.

2.   The File Manager’s policy determines where each file is stored and how the system, and its users, will be able to access them simply—via commands that are independent from device details. In addition, the policy must determine who will have access to what material, and this involves two factors: flexibility of access to the information and its subsequent protection.

3.   The computer system allocates a file by activating the appropriate secondary storage device and loading the file into memory while updating its records of who is using what file.

4.   Finally, the File Manager deallocates a file by updating the file tables and rewriting the file (if revised) to the secondary storage device. Any processes waiting to access the file are then notified of its availability.
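As a rough illustration of the four tasks above, the toy sketch below keeps a dictionary-based directory, allocates a file to one user at a time, and hands the file to the next waiting user when it is deallocated. The class name ToyFileManager, its method names, and the one-user-at-a-time rule are assumptions made for this example; they are not taken from any real operating system.

```python
from collections import deque


class ToyFileManager:
    """Hypothetical, heavily simplified model of the four tasks."""

    def __init__(self):
        self.directory = {}  # task 1: filename -> where the file is stored
        self.in_use = {}     # task 3: filename -> user currently holding it
        self.waiting = {}    # task 4: filename -> queue of waiting users

    def create(self, name, location):
        # Task 2 (the storage policy) is reduced to "the caller picks the location".
        self.directory[name] = location

    def allocate(self, name, user):
        """Give the file to the user, or queue the user if it is already in use."""
        if name not in self.directory:
            raise FileNotFoundError(name)
        if name in self.in_use:
            self.waiting.setdefault(name, deque()).append(user)
            return False
        self.in_use[name] = user
        return True

    def deallocate(self, name):
        """Release the file; return the waiting user who receives it next, if any."""
        del self.in_use[name]
        queue = self.waiting.get(name)
        if queue:
            next_user = queue.popleft()
            self.in_use[name] = next_user
            return next_user
        return None
```

For example, if "alice" holds a file and "bob" asks for it, allocate returns False for "bob", and the later deallocate call returns "bob" as the new holder.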

 

 Q.3 Define the following terms: field, file, record, database, program files, directories.

Field: A field is a group of related bytes that can be identified by the user with a name, type, and size.

Record: A record is a group of related fields.

File: A group of related records that contains information to be used by specific application programs to generate reports.

Database: A group of related files that are interconnected at various levels to give users flexibility of access to the data stored.

Program file: A file that contains instructions for the computer.

Directory: A storage area in a secondary storage volume (disk, disk pack, etc.) containing information about files stored in that volume.

 

Q.4 State the embedded and interactive commands with which the user communicates with the File Manager. Give examples of each. Or:

What does it mean for commands to be device independent?

The user communicates with the File Manager, which responds to specific commands.

Some examples are OPEN, DELETE, RENAME, and COPY. The first time a user gives the command to save a file, it’s actually the CREATE command. In other operating systems, the OPEN NEW command within a program indicates to the File Manager that a file must be created. These commands and many more were designed to be very simple to use because they’re device independent: to access a file, the user doesn’t need to know its exact physical location on the disk pack or the network specifics.

Each logical command is broken down into a sequence of signals that trigger the step-by-step actions performed by the device and supervise the progress of the operation by testing the device’s status.

For example, when a user’s program issues a command to read a record from a disk, the READ instruction has to be decomposed into the following steps:

1. Move the read/write heads to the cylinder or track where the record is to be found.

2. Wait for the rotational delay until the sector containing the desired record passes under the read/write head.

3. Activate the appropriate read/write head and read the record.

4. Transfer the record to main memory.

5. Set a flag to indicate that the device is free to satisfy another request.

While all of this is going on, the system must check for possible error conditions. The File Manager does all of this, freeing the user from including in each program the low-level instructions for every device to be used: the terminal, keyboard, printer, CD, disk drive, etc.
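The five steps of the READ decomposition can be sketched as a small simulation. Everything here is hypothetical: ToyDisk is an in-memory stand-in for a drive, and the log entries merely name the step that real hardware would carry out.

```python
class ToyDisk:
    """Hypothetical in-memory stand-in for a disk drive."""

    def __init__(self, sectors):
        self.sectors = sectors   # {(cylinder, sector): record bytes}
        self.head_cylinder = 0
        self.busy = False
        self.log = []            # names of the steps performed, in order


def read_record(disk, cylinder, sector):
    """Device-independent READ broken down into device-level steps."""
    disk.busy = True
    try:
        # Step 1: move the read/write heads to the right cylinder.
        disk.head_cylinder = cylinder
        disk.log.append("seek")
        # Step 2: wait until the sector rotates under the head.
        disk.log.append("rotational delay")
        # Error check performed along the way.
        if (cylinder, sector) not in disk.sectors:
            raise OSError(f"no record at cylinder {cylinder}, sector {sector}")
        # Step 3: activate the head and read the record.
        record = disk.sectors[(cylinder, sector)]
        disk.log.append("read")
        # Step 4: transfer the record to main memory (here, a new bytes object).
        memory_buffer = bytes(record)
        disk.log.append("transfer")
        return memory_buffer
    finally:
        # Step 5: flag the device as free for the next request.
        disk.busy = False
```

The caller only ever writes read_record(disk, 3, 7); the seek, wait, read, and transfer details stay hidden, which is the point of device independence.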

 

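Q.5 What is a file descriptor? What information does the file descriptor contain?

File descriptor: Information kept in the directory to describe a file or file extent. It typically includes the file’s name, type, size, location, date and time of creation, owner, and protection information.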



Q.6 Describe the terms: master file directory, subdirectory.

Master file directory (MFD): a file stored immediately after the volume descriptor. It lists the names and characteristics of every file contained in that volume.

Subdirectory: a directory created by the user within the boundaries of an existing directory. Some operating systems call this a folder.

 

Q.7 State the difference between a relative file name and an absolute file name.

An absolute filename (also called the complete filename) is the long name that includes all path information. A relative filename is the name without path information, as it appears in directory listings and folders.
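The difference can be seen with Python's standard pathlib module. The path below is made up for the example, and the working directory is an assumption: a relative filename is only meaningful once it is combined with some directory.

```python
from pathlib import PurePosixPath

# Hypothetical absolute (complete) filename: it carries all path information.
complete = PurePosixPath("/home/alice/reports/budget.txt")

# The relative filename is just the last component, as shown in a directory listing.
relative = complete.name

# Combining the relative filename with its directory rebuilds the absolute filename.
working_directory = PurePosixPath("/home/alice/reports")
rebuilt = working_directory / relative
```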

 

Q.8 Differentiate between fixed length and variable length records. What are their disadvantages?

Fixed-length records are the most common because they’re the easiest to access directly. That’s why they’re ideal for data files. The disadvantage of fixed-length records is the size of the record. If it’s too small—smaller than the number of characters to be stored in the record—the leftover characters are truncated. But if the record size is too large— larger than the number of characters to be stored—storage space is wasted.
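A minimal sketch of fixed-length records, assuming a record size of 15 characters and ASCII data (both chosen only for this example). It shows why direct access is easy (record n starts at byte n times the record size) and demonstrates both disadvantages: truncation and wasted space.

```python
RECORD_SIZE = 15  # assumed record size in bytes, chosen for the example


def pack_fixed(text, size=RECORD_SIZE):
    """Truncate or blank-pad the text so it occupies exactly `size` bytes."""
    data = text.encode("ascii")
    return data[:size].ljust(size, b" ")


def read_fixed(storage, n, size=RECORD_SIZE):
    """Direct access: the n-th record starts at byte n * size."""
    start = n * size
    return storage[start:start + size].decode("ascii").rstrip(" ")


names = ["ADAMS", "SMITHBERGER, JOHN", "LEE"]
storage = b"".join(pack_fixed(name) for name in names)
```

Here "SMITHBERGER, JOHN" (17 characters) loses its last two characters, while "LEE" wastes 12 bytes of padding.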

 

Variable-length records don’t leave empty storage space and don’t truncate any characters, thus eliminating the two disadvantages of fixed-length records. But while they can easily be read (one after the other), the disadvantage of variable length records is they’re difficult to access directly because it’s hard to calculate exactly where the record is located. That’s why they’re used most frequently in files that are likely to be accessed sequentially, such as text files and program files or files that use an index to access their records.
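A comparable sketch for variable-length records, assuming each record is stored with a 2-byte length prefix (one common layout, picked here for illustration). Nothing is truncated or padded, but reaching record n means walking past every earlier record unless an index of starting offsets is kept.

```python
import struct


def pack_variable(texts):
    """Store each record as a 2-byte big-endian length followed by its bytes."""
    out = bytearray()
    for text in texts:
        data = text.encode("utf-8")
        out += struct.pack(">H", len(data)) + data
    return bytes(out)


def read_sequential(storage, n):
    """Reach record n by skipping over every earlier record."""
    pos = 0
    for _ in range(n):
        (length,) = struct.unpack_from(">H", storage, pos)
        pos += 2 + length
    (length,) = struct.unpack_from(">H", storage, pos)
    return storage[pos + 2:pos + 2 + length].decode("utf-8")


def build_index(storage):
    """Record each record's starting offset so later reads can jump straight to it."""
    offsets, pos = [], 0
    while pos < len(storage):
        offsets.append(pos)
        (length,) = struct.unpack_from(">H", storage, pos)
        pos += 2 + length
    return offsets


storage = pack_variable(["ADAMS", "SMITHBERGER, JOHN", "LEE"])
```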

 

Q.9 Explain the concept of contiguous storage allocation. What are its advantages and disadvantages?

When records use contiguous storage, they’re stored one after the other. This was the scheme used in early operating systems.

Advantages:

1) It’s very simple to implement and manage.

2) Any record can be found and read once its starting address and size are known, so direct access is easy.

Disadvantages:

1) The file might become completely full, because it can’t be expanded unless there’s empty space immediately following it.

2) At that time, the file must be reorganized and rewritten to a larger section of the disk, which requires intervention by the File Manager.

3) As files are deleted, free space is broken into scattered pieces that may each be too small to hold a new file.
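Two well-known costs of contiguous allocation are that a file can grow in place only if the block right after it is free, and that free space can end up scattered in pieces too small to use. The toy model below treats the disk as a Python list of blocks; the first-fit search and both function names are assumptions made for this sketch.

```python
def allocate_contiguous(disk, name, n_blocks):
    """First-fit search for n_blocks adjacent free blocks; return the start or None."""
    run = 0
    for i, owner in enumerate(disk):
        run = run + 1 if owner is None else 0
        if run == n_blocks:
            start = i - n_blocks + 1
            disk[start:i + 1] = [name] * n_blocks
            return start
    return None


def can_extend(disk, start, length):
    """A file can grow in place only if the block right after it is free."""
    next_block = start + length
    return next_block < len(disk) and disk[next_block] is None


disk = [None] * 8                              # eight empty blocks
start_a = allocate_contiguous(disk, "A", 3)    # A occupies blocks 0-2
start_b = allocate_contiguous(disk, "B", 2)    # B occupies blocks 3-4
a_can_grow = can_extend(disk, start_a, 3)      # blocked by B
disk[0:3] = [None] * 3                         # delete file A
start_c = allocate_contiguous(disk, "C", 4)    # six blocks free, none 4 in a row
```

After file A is deleted, six blocks are free in total, yet a four-block file cannot be placed because no four of them are adjacent.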

 

Q.10 Explain non-contiguous storage allocation with its disadvantages and advantages if any.

Noncontiguous storage allocation allows files to use any storage space available on the disk. A file’s records are stored contiguously only if there’s enough empty space. Any remaining records, and all later additions to the file, are stored in other sections of the disk. In some systems these sections are called the extents of the file and are linked together with pointers. The physical size of each extent is determined by the operating system and is usually 256 bytes or another power of two.

File extents are usually linked in one of two ways. Linking can take place at the storage level, where each extent points to the next one in the sequence, or at the directory level, where each extent is listed in the directory with its physical address, its size, and a pointer to the next extent.



Advantages

1) Both noncontiguous allocation schemes eliminate external storage fragmentation and the need for compaction.

Disadvantages

1) They don’t support direct access because there’s no easy way to determine the exact location of a specific record.
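The lack of direct access shows up clearly when extents are linked at the storage level: reaching a record means following the chain from the first extent. In this sketch the Extent class and the hop counter are hypothetical, introduced only to make the cost visible.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Extent:
    """Hypothetical extent: a few records plus a pointer to the next extent."""
    records: list
    next: Optional["Extent"] = None


def find_record(first, n):
    """Follow the chain until the extent holding record n is reached.

    Returns the record and the number of pointers that had to be followed.
    """
    hops, extent = 0, first
    while extent is not None:
        if n < len(extent.records):
            return extent.records[n], hops
        n -= len(extent.records)
        extent = extent.next
        hops += 1
    raise IndexError("record number is past the end of the file")


third = Extent(["r4"])
second = Extent(["r2", "r3"], third)
first = Extent(["r0", "r1"], second)
```

Reading record 4 requires following two pointers, whereas record 0 requires none; the cost grows with the record's position in the file.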

 

Q.11 Explain indexed storage allocation. Also state the disadvantages and advantages of each if any.

Indexed storage allocation allows direct record access by bringing together the pointers linking every extent of that file into an index block. Every file has its own index block, which consists of the addresses of each disk sector that make up the file. The index lists each entry in the same order in which the sectors are linked. For example, the third entry in the index block corresponds to the third sector making up the file.

When a file is created, the pointers in the index block are all set to null. Then, as each sector is filled, the pointer is set to the appropriate sector address. To be precise, the address is removed from the empty space list and copied into its position in the index block.

This scheme supports both sequential and direct access, but it doesn’t necessarily improve the use of storage space, because each file must have an index block, usually the size of one disk sector.

For larger files with more entries, several levels of indexes can be generated. In that case, to find a desired record, the File Manager accesses the first index (the highest level), which points to a second index (lower level), which points to an even lower-level index and eventually to the data record.
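A single-level index block can be sketched as a list of sector addresses that starts out all null (None). The IndexedFile class, the dictionary used as a disk, and the list used as the empty space list are assumptions for this illustration.

```python
class IndexedFile:
    """Hypothetical file with a single-level index block."""

    def __init__(self, index_size):
        # On creation, every pointer in the index block is null.
        self.index = [None] * index_size

    def append_sector(self, free_list, disk, data):
        """Fill the next sector: move an address from the free list into the index."""
        slot = self.index.index(None)   # raises ValueError if the index block is full
        address = free_list.pop(0)      # address leaves the empty space list
        disk[address] = data
        self.index[slot] = address

    def read_sector(self, disk, n):
        """Direct access: the n-th index entry gives the n-th sector's address."""
        address = self.index[n]
        if address is None:
            raise IndexError("sector not allocated")
        return disk[address]


disk = {}                  # sector address -> contents
free_list = [9, 2, 5]      # empty space list; addresses need not be adjacent
f = IndexedFile(index_size=4)
for chunk in ["first", "second", "third"]:
    f.append_sector(free_list, disk, chunk)
```

The third sector is reached with one lookup, f.read_sector(disk, 2), even though its address (5) is unrelated to the addresses of the sectors before it.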

 

Q.12 What is data compression? What are the three methods of data compression?

Data compression algorithms consist of two types: lossless algorithms typically used for text or arithmetic files, which retain all the data in the file throughout the compression-decompression process; and lossy algorithms, which are typically used for image and sound files and remove data permanently. At first glance, one wouldn’t think that a loss of data would be tolerable; but when the deleted data is unwanted noise, tones beyond a human’s ability to hear, or light spectrum that we can’t see, deleting this data can be undetectable and therefore acceptable.

Text Compression

To compress text in a database, three methods are described briefly here: records with repeated characters, repeated terms, and front-end compression.

Records with repeated characters: Data in a fixed-length field might include a short name followed by many blank characters. This can be replaced with a variable-length field and a special code to indicate how many blanks were truncated. For example, let’s say the original string, ADAMS, looks like this when it’s stored uncompressed in a field that’s 15 characters wide (b stands for a blank character):

ADAMSbbbbbbbbbb

When it’s encoded it looks like this:

ADAMSb10
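The ADAMS example can be reproduced with a few lines of code. As in the text, the letter b stands in for a blank; the function names and the exact "text, marker, count" layout are assumptions for this sketch.

```python
BLANK = "b"  # stand-in for a blank character, following the notation in the text


def compress_trailing_blanks(field, blank=BLANK):
    """Replace the run of trailing blanks with one blank marker and a count."""
    stripped = field.rstrip(blank)
    count = len(field) - len(stripped)
    return f"{stripped}{blank}{count}"


def expand_trailing_blanks(encoded, blank=BLANK):
    """Rebuild the fixed-width field from the text, marker, and count."""
    text, _, count = encoded.rpartition(blank)
    return text + blank * int(count)


encoded = compress_trailing_blanks("ADAMSbbbbbbbbbb")
restored = expand_trailing_blanks(encoded)
```

The 15-character field shrinks to 8 characters, and expanding it restores the original field exactly.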

Repeated terms: These can be compressed by using symbols to represent each of the most commonly used words in the database. For example, in a university’s student database, common words like student, course, teacher, classroom, grade, and department could each be represented with a single character. Of course, the system must be able to distinguish between compressed and uncompressed data.
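One way to sketch repeated-term compression is a lookup table that maps each common word to a single character. Using control characters as the symbols is an assumption made here so that compressed symbols cannot be confused with ordinary text; a real system could use any reserved codes.

```python
# Hypothetical code table: each common word maps to one control character.
CODES = {
    "student": "\x01",
    "course": "\x02",
    "teacher": "\x03",
    "classroom": "\x04",
    "grade": "\x05",
    "department": "\x06",
}
WORDS = {symbol: word for word, symbol in CODES.items()}


def compress_terms(text):
    """Swap each common word for its one-character symbol."""
    return " ".join(CODES.get(word, word) for word in text.split(" "))


def expand_terms(text):
    """Swap each symbol back for the word it represents."""
    return " ".join(WORDS.get(word, word) for word in text.split(" "))


original = "student enrolled in course taught by teacher"
compressed = compress_terms(original)
```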

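Front-end compression: Each entry builds on the previous data element by replacing the leading characters it shares with the preceding entry with a count of those characters. For example, a student database where the students’ names are kept in alphabetical order could be compressed as shown in the table below.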

Original List         Compressed List

Smith, Betty          Smith, Betty

Smith, Donald         7Donald

Smith, Gino           7Gino

Smithberger, John     5berger, John
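The table can be reproduced with a short sketch: each entry is compared with the one before it, and the shared leading characters are replaced by their count. The function names are made up for the example, and the decoder assumes that no name begins with a digit.

```python
def front_compress(names):
    """Replace the prefix shared with the previous entry by its length."""
    out, previous = [], ""
    for name in names:
        shared = 0
        limit = min(len(previous), len(name))
        while shared < limit and previous[shared] == name[shared]:
            shared += 1
        out.append(name if shared == 0 else f"{shared}{name[shared:]}")
        previous = name
    return out


def front_expand(items):
    """Rebuild each entry from the previous one (assumes names never start with a digit)."""
    out, previous = [], ""
    for item in items:
        digits = 0
        while digits < len(item) and item[digits].isdigit():
            digits += 1
        shared = int(item[:digits]) if digits else 0
        name = previous[:shared] + item[digits:]
        out.append(name)
        previous = name
    return out


names = ["Smith, Betty", "Smith, Donald", "Smith, Gino", "Smithberger, John"]
compressed = front_compress(names)
```

Note that the count for "Smith, Donald" is 7 because the shared prefix "Smith, " includes the comma and the space.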

 

Q.13 What does an access control matrix state?

The access control matrix is intuitively appealing and easy to implement, but because of its size it only works well for systems with a few files and a few users. In the matrix, each column identifies a user and each row identifies a file. The intersection of the row and column contains the access rights for that user to that file, as shown in the table below. In each entry, R = Read, W = Write, E = Execute, D = Delete, and a dash means that access is not allowed.

          User 1    User 2    User 3    User 4    User 5

File 1    RWED      R-E-      ----      RWE-      --E-

File 2    ----      R-E-      R-E-      --E-      ----

File 3    ----      RWED      ----      --E-      ----

File 4    R-E-      ----      ----      ----      RWED

File 5    ----      ----      ----      ----      RWED

 

Q.14 What is an access control list?

The access control list is a modification of the access control matrix. Each file is entered in the list and contains the names of the users who are allowed to access it and the type of access each is permitted. To shorten the list, only those who may use the file are named; those denied any access are grouped under a global heading such as WORLD, as shown in the table below.

File      Access

File 1    USER1 (RWED), USER2 (R-E-), USER4 (RWE-), USER5 (--E-), WORLD (----)

File 2    USER2 (R-E-), USER3 (R-E-), USER4 (--E-), WORLD (----)

File 3    USER2 (RWED), USER4 (--E-), WORLD (----)
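The access control lists above can be sketched as one dictionary per file, with the WORLD entry acting as the default for anyone not named. The helper names are assumptions for the example.

```python
# Access control lists copied from the table above.
ACL = {
    "File 1": {"USER1": "RWED", "USER2": "R-E-", "USER4": "RWE-",
               "USER5": "--E-", "WORLD": "----"},
    "File 2": {"USER2": "R-E-", "USER3": "R-E-", "USER4": "--E-",
               "WORLD": "----"},
    "File 3": {"USER2": "RWED", "USER4": "--E-", "WORLD": "----"},
}


def rights_for(filename, user):
    """Named users get their own entry; everyone else falls under WORLD."""
    entry = ACL[filename]
    return entry.get(user, entry["WORLD"])


def is_allowed(filename, user, right):
    """Test whether the user's entry for the file contains the requested right."""
    return right in rights_for(filename, user)
```

USER3 is not named in the list for File 1, so the lookup falls back to WORLD and every right is denied.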

 

Q.15 What is a capability list?

A capability list shows the access control information from a different perspective. It lists every user and the files to which each has access, as shown in the table below.

User      Access

User 1    File 1 (RWED), File 4 (R-E-)

User 2    File 1 (R-E-), File 2 (R-E-), File 3 (RWED)

User 3    File 2 (R-E-)
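Because a capability list holds the same information as the access control matrix, only grouped by user instead of by file, it can be derived by regrouping the matrix. The sketch below repeats the matrix values from Q.13 so that it is self-contained; the function name is hypothetical.

```python
USERS = ["User 1", "User 2", "User 3", "User 4", "User 5"]

# Access control matrix from Q.13: one row per file, one column per user.
MATRIX = {
    "File 1": ["RWED", "R-E-", "----", "RWE-", "--E-"],
    "File 2": ["----", "R-E-", "R-E-", "--E-", "----"],
    "File 3": ["----", "RWED", "----", "--E-", "----"],
    "File 4": ["R-E-", "----", "----", "----", "RWED"],
    "File 5": ["----", "----", "----", "----", "RWED"],
}


def capability_lists(users, matrix):
    """Regroup the matrix by user, keeping only files with some access."""
    capabilities = {user: {} for user in users}
    for filename, row in matrix.items():
        for user, rights in zip(users, row):
            if rights != "----":
                capabilities[user][filename] = rights
    return capabilities


caps = capability_lists(USERS, MATRIX)
```

The entries for User 1, User 2, and User 3 come out the same as the rows of the table above.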

 

Q.16 What is a lockword?

Lockword: A sequence of letters and/or numbers provided by users to prevent unauthorized tampering with their files. The lockword serves as a secret password: the system will deny access to the protected file unless the user supplies the correct lockword when accessing the file.
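A lockword check can be sketched in a few lines. The ProtectedFile class is hypothetical, and keeping the lockword as plain text in memory is a simplification for the example; hmac.compare_digest from the standard library is used only so that the comparison takes the same time whether or not the guess is close.

```python
import hmac


class ProtectedFile:
    """Hypothetical file guarded by a lockword."""

    def __init__(self, contents, lockword):
        self._contents = contents
        self._lockword = lockword  # simplification: a real system would not keep this in the clear

    def open(self, supplied_lockword):
        """Return the contents only if the supplied lockword matches."""
        matches = hmac.compare_digest(
            supplied_lockword.encode("utf-8"),
            self._lockword.encode("utf-8"),
        )
        if not matches:
            raise PermissionError("incorrect lockword")
        return self._contents


payroll = ProtectedFile("salary data", lockword="X7QP2")
```

Calling payroll.open("X7QP2") returns the contents, while any other lockword raises PermissionError.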