File Management
Q.1 What is the role of a file manager?
To carry out its responsibilities, the File Manager must
perform these four tasks:
1. Keep track of where each file is stored.
2. Use a policy that will determine where and how the
files will be stored, making sure to efficiently use the available storage
space and provide efficient access to the files.
3. Allocate each file when a user has been cleared for
access to it, then record its use.
4. Deallocate the file when the file is to be returned
to storage, and communicate its availability to others who may be waiting for
it.
Q.2 What are the tasks performed by the file manager?
The File Manager performs these four tasks:
1. In a computer system, the File Manager keeps track of
its files with directories that contain the filename, its physical location in
secondary storage, and important information about each file.
2. The File Manager’s policy determines where each file
is stored and how the system, and its users, will be able to access them
simply—via commands that are independent from device details. In addition, the
policy must determine who will have access to what material, and this involves
two factors: flexibility of access to the information and its subsequent
protection.
3. The computer system allocates a file by
activating the appropriate secondary storage device and loading the file into
memory while updating its records of who is using what file.
4. Finally, the File Manager deallocates a file by
updating the file tables and rewriting the file (if revised) to the secondary
storage device. Any processes waiting to access the file are then notified of
its availability.
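The four tasks can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any real operating system implements them: the class ToyFileManager, its method names, the in-memory dictionaries, and the sample filename and users are all hypothetical.

```python
from collections import deque


class ToyFileManager:
    """Toy model of the four tasks; every name here is illustrative."""

    def __init__(self):
        self.directory = {}   # task 1: filename -> location on secondary storage
        self.in_use = {}      # filename -> user currently holding the file
        self.waiting = {}     # filename -> queue of users waiting for it

    def create(self, filename, location):
        # Task 2 (policy) is reduced here to "store wherever the caller says".
        self.directory[filename] = location

    def allocate(self, filename, user):
        # Task 3: hand the file to a user and record who is using it.
        if filename in self.in_use:
            self.waiting.setdefault(filename, deque()).append(user)
            return False
        self.in_use[filename] = user
        return True

    def deallocate(self, filename):
        # Task 4: release the file and pass it to the next waiting user, if any.
        del self.in_use[filename]
        queue = self.waiting.get(filename)
        if queue:
            next_user = queue.popleft()
            self.in_use[filename] = next_user
            return next_user
        return None


fm = ToyFileManager()
fm.create("report.txt", location=("cylinder 3", "sector 7"))
first = fm.allocate("report.txt", "alice")    # file is free, so alice gets it
second = fm.allocate("report.txt", "bob")     # file is busy, so bob waits
notified = fm.deallocate("report.txt")        # alice is done; bob is notified
```

In this sketch a waiting user is handed the file as soon as it is released, which is one simple way to model "communicating its availability".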
Q.3 Define the following terms: field, file, record, database, program files,
directories.
Field: A field is a group of related bytes that can be
identified by the user with a name, type, and size.
Record: A record is a group of related fields.
File: A group of related records that contains information to be used by specific
application programs to generate reports.
Database: A group of related files that are interconnected at various levels to
give users flexibility of access to the data stored.
Program file: A file that contains instructions for the computer.
Directory: A storage area in a secondary storage volume (disk, disk pack, etc.)
containing information about files stored in that volume.
Q.4 State the embedded and interactive commands with which the user
communicates with the File Manager. Give examples of each. Or:
What does it mean for commands to be device independent?
The user communicates with the File Manager through specific commands, which
may be embedded in the user’s program or submitted interactively.
Some examples are OPEN, DELETE, RENAME, and COPY. The first time a user gives
the command to save a file, it’s actually the CREATE command. In other operating
systems, the OPEN NEW command within a program indicates to the File Manager
that a file must be created. These commands, and many more, were designed to be
very simple to use; they’re device independent. Therefore, to access a file, the
user doesn’t need to know its exact physical location on the disk pack or the
network specifics.
Each logical command is broken down into a sequence of
signals that trigger the step-by-step actions performed by the device and
supervise the progress of the operation by testing the device’s status.
For example, when a user’s program issues a command to
read a record from a disk, the READ instruction has to be decomposed into the
following steps:
1. Move the read/write heads to the cylinder or track
where the record is to be found.
2. Wait for the rotational delay until the sector
containing the desired record passes under the read/write head.
3. Activate the appropriate read/write head and read
the record.
4. Transfer the record to main memory.
5. Set a flag to indicate that the device is free to
satisfy another request.
While all of this is going on, the system must check
for possible error conditions. The File Manager does all of this, freeing the
user from including in each program the low-level instructions for every device
to be used: the terminal, keyboard, printer, CD, disk drive, etc.
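A minimal sketch of how one logical READ could expand into the five device-level steps listed above. The ToyDisk class, its fields, and the read_record function are hypothetical; real device drivers work with hardware signals, not Python objects.

```python
class ToyDisk:
    """Hypothetical disk: data[track][sector] holds one record."""

    def __init__(self, data):
        self.data = data
        self.head_track = 0
        self.busy = False
        self.log = []          # records each low-level step for inspection


def read_record(disk, track, sector, memory):
    """Decompose a logical READ into the step-by-step device actions."""
    if disk.busy:
        raise RuntimeError("device is busy")          # error check
    if track not in disk.data or sector not in disk.data[track]:
        raise LookupError("no such track/sector")     # error check
    disk.busy = True
    disk.head_track = track                           # 1. move the heads
    disk.log.append(f"seek to track {track}")
    disk.log.append(f"wait for sector {sector}")      # 2. rotational delay
    record = disk.data[track][sector]                 # 3. activate head and read
    disk.log.append("read record")
    memory.append(record)                             # 4. transfer to main memory
    disk.log.append("transfer to memory")
    disk.busy = False                                 # 5. flag the device as free
    disk.log.append("device free")
    return record


disk = ToyDisk({5: {2: b"ADAMS"}})
main_memory = []
result = read_record(disk, track=5, sector=2, memory=main_memory)
```

The caller only names the record it wants; the seek, wait, read, transfer, and release steps stay hidden inside read_record, which is the point of device independence.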
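Q.5 What is a file descriptor? What information does the file descriptor
contain?
File descriptor: Information kept in the directory to describe a file
or file extent. It typically includes the filename, file type, file size,
file location, date and time of creation, owner, and protection information.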
Q.6 Describe the terms: master file directory, subdirectory.
Master file directory (MFD): a file stored immediately after the volume descriptor.
It lists the names and characteristics of every file contained in that volume.
Subdirectory: A directory created by the user within the boundaries of an
existing directory. Some operating systems call this a folder.
Q.7 State the difference between a relative file name and an absolute
file name.
An absolute filename (also called the complete filename) is the long name that
includes all path information. A relative filename is the name without path
information, as it appears in directory listings and folders.
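The distinction can be illustrated with Python’s standard pathlib module. The path used here is hypothetical and chosen only for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical path, chosen only for illustration.
absolute = PurePosixPath("/home/flynn/reports/budget.txt")

relative = absolute.name            # name without path information
directory = absolute.parent         # the path information on its own
rebuilt = directory / relative      # joining them restores the absolute name
```

Here relative holds only budget.txt, which is what a directory listing would show, while absolute carries every directory from the root down to the file.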
Q.8 Differentiate between fixed length and variable length records.
What are their disadvantages?
Fixed-length records are the most common because they’re the easiest to
access directly. That’s why they’re ideal for data files. The disadvantage of
fixed-length records is the size of the record. If it’s too small (smaller than
the number of characters to be stored in the record), the leftover characters
are truncated. But if the record size is too large (larger than the number of
characters to be stored), storage space is wasted.
Variable-length records don’t leave empty storage space and don’t truncate any
characters, thus eliminating the two disadvantages of fixed-length records. But
while they can easily be read one after the other, the disadvantage of
variable-length records is that they’re difficult to access directly because
it’s hard to calculate exactly where the record is located. That’s why they’re
used most frequently in files that are likely to be accessed sequentially, such
as text files and program files, or in files that use an index to access their
records.
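The trade-off can be seen in a short sketch. The 15-byte record size, the sample names, and the 2-byte length prefix used for the variable-length layout are all assumptions made for this example, not a format prescribed by the text.

```python
import struct

RECORD_SIZE = 15  # assumed fixed record length in bytes

names = ["ADAMS", "BAKER", "CHRISTOPHERSON-LEE", "DIAZ"]

# Fixed-length: pad short names with blanks, truncate long ones.
fixed_file = b"".join(n.encode()[:RECORD_SIZE].ljust(RECORD_SIZE) for n in names)


def read_fixed(data, n):
    """Direct access: the offset of record n is simply n * RECORD_SIZE."""
    start = n * RECORD_SIZE
    return data[start:start + RECORD_SIZE].decode().rstrip()


# Variable-length: each record is prefixed with a 2-byte length (assumed format).
variable_file = b"".join(struct.pack(">H", len(n)) + n.encode() for n in names)


def read_variable(data, n):
    """No offset formula exists, so earlier records are skipped one at a time."""
    pos = 0
    for _ in range(n):
        (length,) = struct.unpack_from(">H", data, pos)
        pos += 2 + length
    (length,) = struct.unpack_from(">H", data, pos)
    return data[pos + 2:pos + 2 + length].decode()
```

Reading record 2 from the fixed-length file is a single offset calculation but returns a truncated name; the variable-length file keeps the full name and uses less space, but reaching record 2 means stepping over records 0 and 1 first.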
Q.9 Explain the concept of contiguous storage allocation. What are its
advantages and disadvantages?
When records use contiguous storage, they’re stored one after the other. This
was the scheme used in early operating systems.
Advantages:
1) It’s very simple to implement and manage.
2) Any record can be found and read once its starting address and size are
known.
Disadvantages:
1) A file can’t be expanded unless there is empty space immediately following
it, so the file might become completely full with no room to grow.
2) At that time, the file must be reorganized and rewritten, which requires
intervention by the File Manager.
Q.10 Explain non-contiguous storage allocation with its disadvantages
and advantages if any.
Noncontiguous storage allocation allows files to use any storage space
available on the disk. A file’s records are stored in a contiguous manner only
if there’s enough empty space. Any remaining records and all other additions to
the file are stored in other sections of the disk. In some systems these are
called the extents of the file and are linked together with pointers. The
physical size of each extent is determined by the operating system and is
usually 256 bytes or another power of two.
File extents are usually linked in one of two ways. Linking can take place at
the storage level, where each extent points to the next one in the sequence, or
at the directory level, where each extent is listed with its physical address,
its size, and a pointer to the next extent.
Advantage:
1) Both noncontiguous allocation schemes eliminate external storage
fragmentation and the need for compaction.
Disadvantage:
1) They don’t support direct access because there’s no easy way to determine
the exact location of a specific record.
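A minimal sketch of storage-level linking shows why direct access is not supported. The block numbers, the dictionary standing in for the disk, and the function read_nth are hypothetical.

```python
# Hypothetical disk: block number -> (data, pointer to next block or None).
disk = {
    12: ("record A", 40),
    40: ("record B", 7),
    7:  ("record C", None),   # None marks the last extent
}
first_block = 12  # assumed to be stored in the file's directory entry


def read_nth(disk, first_block, n):
    """Follow the chain n hops; return the data and the blocks visited."""
    visited = []
    block = first_block
    for _ in range(n):
        visited.append(block)
        block = disk[block][1]
        if block is None:
            raise IndexError("file has fewer extents than requested")
    visited.append(block)
    return disk[block][0], visited


data, path = read_nth(disk, first_block, 2)
```

To reach the third extent, the sketch has to pass through blocks 12 and 40 first; nothing in block 12 alone reveals where the third extent lives.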
Q.11 Explain indexed storage allocation. Also state the disadvantages
and advantages of each if any.
Indexed storage allocation allows direct record access
by bringing together the pointers linking every extent of that file into an
index block. Every file has its own index block, which consists of the
addresses of each disk sector that make up the file. The index lists each entry
in the same order in which the sectors are linked. For
example, the third entry in the index block corresponds to the third sector
making up the file. When a file is created, the pointers in the index block are
all set to null. Then, as each sector is filled, the pointer is set to the
appropriate sector address—to be precise, the address is removed from the empty
space list and copied into its position in the index block. This scheme
supports both sequential and direct access, but it doesn’t necessarily improve
the use of storage space because each file must have an index block—usually the
size of one disk sector. For larger files with more entries, several levels of
indexes can be generated; in which case, to find a desired record, the File
Manager accesses the first index (the highest level), which points to a second
index (lower level), which points to an even lower-level index and eventually
to the data record.
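The single-level scheme can be sketched as follows. The disk size, the contents and order of the empty space list, and the four-entry index block are assumptions made for this example.

```python
SECTOR_COUNT = 8  # assumed size of the toy disk

sectors = [None] * SECTOR_COUNT           # the disk itself
free_list = [3, 6, 1, 5]                  # empty space list (assumed order)
index_block = [None] * 4                  # all pointers start as null


def append_sector(data):
    """Fill the next sector: move an address from the free list to the index."""
    slot = index_block.index(None)        # first unused index entry
    address = free_list.pop(0)            # remove address from empty space list
    sectors[address] = data
    index_block[slot] = address


def read_sector(n):
    """Direct access: entry n of the index gives the sector address."""
    return sectors[index_block[n]]


for chunk in ["part 0", "part 1", "part 2"]:
    append_sector(chunk)

third = read_sector(2)   # third index entry -> third sector of the file
```

Unlike a linked chain, read_sector reaches any part of the file with one lookup in the index block, at the cost of reserving that block for every file.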
Q.12 What is data compression? What are the three methods of data
compression?
Data compression algorithms consist of two types: lossless algorithms typically used for
text or arithmetic files, which retain all the data in the file throughout the
compression-decompression process; and lossy algorithms, which are typically
used for image and sound files and remove data permanently. At first glance,
one wouldn’t think that a loss of data would be tolerable; but when the deleted
data is unwanted noise, tones beyond a human’s ability to hear, or light
spectrum that we can’t see, deleting this data can be undetectable and
therefore acceptable.
Text Compression
To compress text in a database, three methods are
described briefly here: records with repeated characters, repeated terms, and
front-end compression.
Records with repeated characters: Data in a
fixed-length field might include a short name followed by many blank
characters. This can be replaced with a variable-length field and a special
code to indicate how many blanks were truncated. For example, let’s say the
original string, ADAMS, looks like this when it’s stored uncompressed in a
field that’s 15 characters wide (b stands for a blank character):
ADAMSbbbbbbbbbb
When it’s encoded it looks like this:
ADAMSb10
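The ADAMS example can be reproduced with a short sketch. The function names are hypothetical, and the decoder assumes that a field stored without truncation never itself ends in a blank followed by digits.

```python
import re


def compress_blanks(field):
    """Replace trailing blanks with a single blank followed by their count."""
    stripped = field.rstrip(" ")
    blanks = len(field) - len(stripped)
    return f"{stripped} {blanks}" if blanks else field


def expand_blanks(encoded):
    """Undo compress_blanks: restore the blanks named by the trailing count."""
    match = re.fullmatch(r"(.*) (\d+)", encoded)
    if match is None:
        return encoded                      # nothing was truncated
    return match.group(1) + " " * int(match.group(2))


original = "ADAMS".ljust(15)                # 'ADAMS' plus 10 blanks
encoded = compress_blanks(original)         # 'ADAMS 10'
restored = expand_blanks(encoded)
```

The 15-character field shrinks to 8 characters, and because the count is kept, the original can be rebuilt exactly, which is what makes this a lossless method.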
Repeated terms: These can be
compressed by using symbols to represent each of the most commonly used words
in the database. For example, in a university’s student database, common words
like student, course, teacher, classroom, grade, and department could
each be represented with a single character. Of course, the system must be able
to distinguish between compressed and uncompressed data.
Front-end compression: Each entry builds on the previous data element. For
example, the student database where the students’ names are kept in
alphabetical order could be compressed as shown in the table below:
Original List         Compressed List
Smith, Betty          Smith, Betty
Smith, Donald         7Donald
Smith, Gino           7Gino
Smithberger, John     5berger, John
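The table above can be reproduced with a minimal sketch. The function names are hypothetical, and the decoder assumes that no name begins with a digit, since a leading digit is read as the count of shared characters.

```python
def front_compress(names):
    """Replace the prefix shared with the previous entry by its length."""
    out, prev = [], ""
    for name in names:
        shared = 0
        while (shared < min(len(prev), len(name))
               and prev[shared] == name[shared]):
            shared += 1
        out.append(name if shared == 0 else f"{shared}{name[shared:]}")
        prev = name
    return out


def front_expand(entries):
    """Rebuild each name from the count and the previously expanded name."""
    out, prev = [], ""
    for entry in entries:
        digits = ""
        while entry[len(digits):len(digits) + 1].isdigit():
            digits += entry[len(digits)]
        name = prev[:int(digits)] + entry[len(digits):] if digits else entry
        out.append(name)
        prev = name
    return out


names = ["Smith, Betty", "Smith, Donald", "Smith, Gino", "Smithberger, John"]
compressed = front_compress(names)
```

The count 7 covers the characters "Smith, " including the comma and the blank, while 5 covers only "Smith", because "Smithberger" diverges from "Smith, Gino" at the sixth character.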
Q.13 What does an access control matrix state?
The access control matrix is intuitively
appealing and easy to implement, but because of its size it only works well for
systems with a few files and a few users. In the matrix, each column identifies
a user and each row identifies a file. The intersection of the row and column
contains the access rights for that user to that file, as shown in the table
below (R = read, W = write, E = execute, D = delete, and a dash means that
access is not allowed):
         User 1   User 2   User 3   User 4   User 5
File 1   RWED     R-E-     ----     RWE-     --E-
File 2   ----     R-E-     R-E-     --E-     ----
File 3   ----     RWED     ----     --E-     ----
File 4   R-E-     ----     ----     ----     RWED
File 5   ----     ----     ----     ----     RWED
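The matrix lookup can be sketched directly from the table above. The names USERS, MATRIX, and is_allowed are hypothetical, and the sketch assumes the letters R, W, E, and D stand for read, write, execute, and delete.

```python
USERS = ["User 1", "User 2", "User 3", "User 4", "User 5"]

# Rows are files, columns are users, as in the table above.
MATRIX = {
    "File 1": ["RWED", "R-E-", "----", "RWE-", "--E-"],
    "File 2": ["----", "R-E-", "R-E-", "--E-", "----"],
    "File 3": ["----", "RWED", "----", "--E-", "----"],
    "File 4": ["R-E-", "----", "----", "----", "RWED"],
    "File 5": ["----", "----", "----", "----", "RWED"],
}


def is_allowed(user, filename, right):
    """Return True if `right` (one of R, W, E, D) appears in the cell."""
    cell = MATRIX[filename][USERS.index(user)]
    return right in cell


cell_count = sum(len(row) for row in MATRIX.values())
```

Even this small example stores 25 cells, most of them empty; the table grows with the number of users times the number of files, which is why the scheme suits only small systems.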
Q.14 What is an access control list?
The access control list is a modification of
the access control matrix. Each file is entered in the list and contains the
names of the users who are allowed to access it and the type of access each is
permitted. To shorten the list, only those who may use the file are named;
those denied any access are grouped under a global heading such as WORLD, as
shown in the table below:
File     Access
File 1   USER1 (RWED), USER2 (R-E-), USER4 (RWE-), USER5 (--E-), WORLD (----)
File 2   USER2 (R-E-), USER3 (R-E-), USER4 (--E-), WORLD (----)
File 3   USER2 (RWED), USER4 (--E-), WORLD (----)
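A minimal sketch of the same three lists, with the WORLD entry acting as the default for anyone not named. The dictionary layout and the function rights_for are hypothetical.

```python
# Access control list for the three files shown above.
ACL = {
    "File 1": {"USER1": "RWED", "USER2": "R-E-", "USER4": "RWE-", "USER5": "--E-"},
    "File 2": {"USER2": "R-E-", "USER3": "R-E-", "USER4": "--E-"},
    "File 3": {"USER2": "RWED", "USER4": "--E-"},
}
WORLD = "----"   # rights for everyone not named in a file's list


def rights_for(user, filename):
    """Look the user up in the file's list; fall back to the WORLD entry."""
    return ACL[filename].get(user, WORLD)


named_entries = sum(len(entries) for entries in ACL.values())
```

Only 9 named entries are stored for these three files, compared with 15 cells in the corresponding rows of the access control matrix.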
Q.15 What is a capability list?
A capability list shows the access control
information from a different perspective. It lists every user and the files to
which each has access, as shown in the table below:
User     Access
User 1   File 1 (RWED), File 4 (R-E-)
User 2   File 1 (R-E-), File 2 (R-E-), File 3 (RWED)
User 3   File 2 (R-E-)
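The "different perspective" can be made concrete by regrouping per-file lists into per-user lists. This is only a sketch: the dictionary layout and the function to_capability_list are hypothetical, and the data is restricted to Users 1 to 3 to match the table above.

```python
from collections import defaultdict

# Per-file view of the access rights, restricted to Users 1 to 3.
acl = {
    "File 1": {"User 1": "RWED", "User 2": "R-E-"},
    "File 2": {"User 2": "R-E-", "User 3": "R-E-"},
    "File 3": {"User 2": "RWED"},
    "File 4": {"User 1": "R-E-"},
}


def to_capability_list(acl):
    """Regroup the same rights by user instead of by file."""
    caps = defaultdict(dict)
    for filename, entries in acl.items():
        for user, rights in entries.items():
            caps[user][filename] = rights
    return dict(caps)


capabilities = to_capability_list(acl)
```

The information is identical in both views; the capability list simply answers "what can this user reach?" without scanning every file's list.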
Q.16 What is a lockword?
Lockword: A sequence of letters and/or numbers provided by users to prevent
unauthorized tampering with their files. The lockword serves as a secret
password: the system will deny access to the protected file unless the user
supplies the correct lockword when accessing the file.