A program in Python using the dbm or shelve modules for indexing.


Project 3, due November 29

RDBMS and Database

The RDBMS for this project is a program in Python using the dbm or shelve modules for indexing.

The database format is a binary file of disk blocks. The disk block size is 4,096 bytes, and the blocking factor bfr is 10, so each block holds 10 records. Each record has the fields described by the following SQL DDL column definitions:



    first_name VARCHAR(20) NOT NULL,
    last_name VARCHAR(20) NOT NULL,
    job VARCHAR(70) NOT NULL,
    company VARCHAR(40) NOT NULL,
    address VARCHAR(80) NOT NULL,
    phone VARCHAR(25) NOT NULL,
    birthdate DATE NOT NULL,
    ssn VARCHAR(12) NOT NULL,
    username VARCHAR(25) NOT NULL,
    email VARCHAR(50) NOT NULL,
    url VARCHAR(50) NOT NULL


Strings are composed of ASCII characters and are null-terminated. Dates are stored as three 32-bit integers in native byte order representing the day, month, and year.
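As a rough illustration of the two encodings just described, the sketch below decodes one string field and one date. The helper names decode_text and decode_date are hypothetical, and the idea that each string occupies a fixed-width slot padded with null bytes is an assumption; the handout only states that strings are null-terminated.

```python
import struct


def decode_text(raw):
    """Return the ASCII text that precedes the first null byte in raw."""
    # Everything after the terminator is treated as padding (an assumption).
    return raw.split(b"\x00", 1)[0].decode("ascii")


def decode_date(raw):
    """Unpack three 32-bit integers (day, month, year) from 12 bytes."""
    # "=" selects native byte order with standard sizes, so "i" is
    # always 4 bytes and no alignment padding is inserted.
    day, month, year = struct.unpack("=3i", raw)
    return day, month, year


# Hand-built sample values, not taken from the test databases.
sample_name = b"Ada".ljust(20, b"\x00")
sample_date = struct.pack("=3i", 10, 12, 1815)

name = decode_text(sample_name)
birthdate = decode_date(sample_date)
```

Whether the records in the actual files contain alignment padding between fields is not stated here, so the byte offsets of each field should be checked against small.bin before relying on any particular layout.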

There are two test databases: small.bin.gz, of size 40,960 bytes, containing 100 records, and large.bin.gz, of size 4 GiB, containing over 10 million records. These files are compressed with GNU gzip for download and should be uncompressed before use.
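The stated sizes can be checked against the block size and blocking factor with a little arithmetic. The sketch below is one way to do that; the function names capacity and locate are hypothetical, and locate assumes records are numbered from 0 in file order, which the handout does not state.

```python
BLOCK_SIZE = 4096  # bytes per disk block
BFR = 10           # blocking factor: records per block


def capacity(file_size):
    """Return (number of blocks, maximum number of records) for a file."""
    blocks, leftover = divmod(file_size, BLOCK_SIZE)
    if leftover:
        raise ValueError("file size is not a whole number of blocks")
    return blocks, blocks * BFR


def locate(record_number):
    """Return (block number, slot within block) for a 0-based record number."""
    return divmod(record_number, BFR)


small = capacity(40_960)        # (10, 100)
large = capacity(4 * 2**30)     # (1048576, 10485760)
block, slot = locate(57)        # (5, 7)
block_offset = block * BLOCK_SIZE  # 20480, byte offset of that block
```

The results agree with the figures above: 10 blocks of 10 records for the small file, and 10,485,760 record slots for the large one.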

Indexes will be created as DBM files using one of the libraries listed above.
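A minimal sketch of such an index, assuming the key is a field value and the stored value is the record's location. The choice of ssn as the key, the (block, slot) encoding, and the names build_index and lookup are all illustrative assumptions, not requirements from the handout.

```python
import dbm
import os
import struct
import tempfile

# Hypothetical value encoding: block number (unsigned 64-bit) and
# slot within the block (unsigned 8-bit), little-endian.
LOCATION = struct.Struct("<QB")


def build_index(index_path, entries):
    """Write (key, block, slot) entries to a DBM file at index_path."""
    with dbm.open(index_path, "c") as db:  # "c" creates the file if needed
        for key, block, slot in entries:
            # DBM keys and values are stored as bytes.
            db[key.encode("ascii")] = LOCATION.pack(block, slot)


def lookup(index_path, key):
    """Return (block, slot) for key, or None if the key is not indexed."""
    with dbm.open(index_path, "r") as db:
        raw = db.get(key.encode("ascii"))
    return None if raw is None else LOCATION.unpack(raw)


# Usage with made-up entries in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ssn_index")
    build_index(path, [("123-45-6789", 0, 3), ("987-65-4321", 5, 7)])
    found = lookup(path, "987-65-4321")
    missing = lookup(path, "000-00-0000")
```

With shelve instead of dbm, the location could be stored as a plain tuple, since shelve pickles its values and uses string keys.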


You may use any platform to develop and test your code.


The Python 3 standard library

Reading binary files

You may use any method to read binary files, but you may find the following useful:

     Python: read() into a bytes object, then decode with the struct module.
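One possible shape for the read() approach is sketched below: the file is read one block at a time, and each block is cut into record-sized slices that can then be passed to the struct module. The names iter_blocks and split_records are hypothetical. The record size is left as a parameter because it is not stated here, and the assumption that records sit back to back from the start of each block, with any padding at the end, should be verified against small.bin.

```python
import io

BLOCK_SIZE = 4096  # bytes per disk block
BFR = 10           # records per block


def iter_blocks(stream, block_size=BLOCK_SIZE):
    """Yield (block number, block bytes) for each block in a binary stream."""
    block_number = 0
    while True:
        block = stream.read(block_size)
        if not block:
            break  # end of file
        if len(block) != block_size:
            raise ValueError("file ends with a partial block")
        yield block_number, block
        block_number += 1


def split_records(block, record_size, bfr=BFR):
    """Slice a block into bfr consecutive records of record_size bytes."""
    # Assumes records start at offset 0 and any padding is at the end.
    return [block[i * record_size:(i + 1) * record_size] for i in range(bfr)]


# Toy demonstration with 16-byte blocks holding three 5-byte records.
toy = io.BytesIO(bytes(range(32)))
blocks = list(iter_blocks(toy, block_size=16))
records = split_records(blocks[0][1], record_size=5, bfr=3)
```

For the real databases, the stream would come from open("small.bin", "rb") with the default block size, after the file has been uncompressed.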
