Airfield Models

Choosing the "Right" Data Storage Method

December 19, 2021



Home
About
Site Feedback
Register
Contact
Site Map
Add to Favorites
Comments
Search Airfield Models

Back to VB DB

 

Airfield Models (http://www.airfieldmodels.com/)Choosing the "Right" Data Storage Method

Today there exists a wide variety of programming tools.  In any craft, choosing the correct tool is an important part of the building process.  Unfortunately, human nature is such that we find a few tools that we are comfortable using and then habitually use them without considering the other possibilities.

Generally speaking, if a tool works as advertised, then I don't have any predisposition to not using it.  But like anyone else, I make my choices based on what I need to accomplish and what I know about using the tool.

A database is simply a tool for storing information.  They vary widely in capability and functionality, but all commercial database applications also carry a large amount of overhead both in distributed size and memory usage.  I have used these databases frequently as a back-end to my applications and will continue to do so when they are the right tool.

 
 

Common Ways to Store Data

  • Traditional database system — Microsoft Access, Oracle, SQL Server, etc.
  • Sequential file access
  • Random file access
  • Binary file access
  • Simple text files
  • Registry/Ini file entries

There may be other methods that I'm not aware of, but when it comes right down to it, all files are stored by writing bytes to the storage media.  The real difference is the interface provided by the software.

Visual Basic provides capability to access all these file types.  Each has its own advantages and disadvantages.

Points to consider before selecting a file access method

  • Does the program need to be SQL compatible (queries)?

If yes then a commercial database is the way to go unless you want to write an SQL parser as well as the code to do something with the results.

  • Does the program need to link tables?  If it doesn't use SQL, then probably not.

If yes then linking tables in your own programs requires indexing and will certainly be very complex.  If you need this capability then the commercial database is probably the way to go.

  • Does the application already use a back-end database?

If you already have to distribute the database then you can create a new table and store your data in it.  There are times you may not want to do this, however.  For example, you may want to create your own security file that contains user names, passwords and privileges.  By creating a separate file, you can get this information before opening the database.

  • Does the file need to be imported by other applications that you didn't write (Excel, Access, etc.)?

This can be accomplished with either a sequential access file or a commercial database.

  • Does the file need to be written to in a multi-user environment?

Requires a commercial database or carefully written code to ensure users don't overwrite each other's data.

A commercial database back-end is usually the best choice for intensive data-entry applications or multi-user environments.  It provides querying capability, security, multiple-related tables and many other advanced capabilities.

Many applications store information but don't need the advanced capabilities of a commercial database.  A file written using the intrinsic VB methods is much smaller than any commercial database as well as being easier to distribute.

Additionally, there are often installation problems when using a commercial database as a back-end, not to mention that the size of the distributed database components (DLLs) and libraries can be many times larger than the database itself.

Let's look at the intrinsic Visual Basic file writing methods (which are basically the same in every language I've written software in).

 
 

Sequential File Access

Sequential files are generally the easiest for a new programmer to read and write.  Unfortunately, they leave a lot to be desired when it comes to manipulating the data.  The two preferred ways to read them are:

  • Comma delimited Line Input # statement to read lines individually.  The individual values can be used to populate a User-Defined Type (UDT) or fields of a Record object defined by a class in the project.
  • Read the entire file into the program and then get the individual lines using a Split statement.  From there, each line has to be split up even further to get the individual fields.  This is a lot of work to do what the other file access methods achieve automatically.

Advantages

  • Fast — no database engine to initialize.
  • Size — much smaller than a traditional database.  Field length is not pre-defined.
  • Compatibility — can be imported/exported by a variety of applications making them the most flexible way to share data.
  • File can be viewed and edited using a simple text editor.
  • Existing records and new records can be placed directly into the existing file.

The one big advantage a sequential file has over all other types, including commercial databases, is that many other programs that are capable of importing data can recognize the fields and import from or export to the same format. In fact, even if your application uses another file type, having an export function that saves data into a sequential file is an almost foolproof way to allow other applications to access the same data.

Disadvantages

  • Not flexible in data storage.  All records must be of the same type.  That is, if you want your program to loop through and read the fields from a sequential file, then each line needs to have the same number of fields and be of the type expected.
  • Difficult to navigate.  File must be looped through to retrieve an individual record.
  • Usually requires parsing or multiple arrays — particularly in real-world applications.
  • When deleting records, a new file must be written containing the remaining records and then renamed to the original file name.
 
 

Random File Access

Random access files store UDT records.  Each UDT must be a fixed size and the file is limited to storing nothing but one type of record.

Advantages

  • Can easily pull individual records from the file.  Because each record is the same size, there is no need to loop through the file to find an individual record.  Just tell VB which record number you want and it goes and gets it.
  • Speed.  These files read very quickly because of the fixed record-length.  Again, there is no database engine to initialize.
  • Easy to read and write.  Existing records and new records can be placed directly into the existing file.
  • Smaller than a traditional database.

Disadvantages

  • String fields must be a fixed length resulting in wasted space and the need to trim strings before displaying them.  This also means that if you want to have a record that has a field containing the Street address, for example, then the field must be defined to hold the longest conceivable entry.  Records that do not use the entire amount of space just waste it.  The field is stored in the record using the full amount defined resulting in a lot of dead space in the file.  If the data is longer than the defined size then the string is truncated resulting in lost data.
  • Not flexible.  All records must be of the same type.
  • Can't be imported by other applications unless they recognize the specific file type.  Usually this limits importing to other applications you have written.
  • When deleting records, a new file must be written containing the remaining records and then renamed to the original file name.  An option to this is to read all the records into memory, open the file for output (clear the file) and then put the remaining records back.
 
 

Binary File Access

Binary access files are the most difficult of the intrinsic Visual Basic file-writing methods.  They require the program to know exactly how the file was written so that it can read the data back from the file.  In spite of this, I prefer Binary files to store data because they are ultimately flexible.

Advantages

  • Extremely flexible.  Data can be written to the file any way the programmer wants it.  Multiple record types, object data, and arrays of numbers or strings can all be saved in the same file.
  • Fast — no database engine to initialize.
  • Size — much smaller than a traditional database and usually smaller than Random access files.  Size is generally comparable to a Sequential access file containing the same data.

Disadvantages

  • Can't be imported by other applications unless they recognize the specific file type.  Usually this limits importing to other applications you have written.
  • File must be looped through to retrieve an individual record.
  • When editing or deleting existing records, a new file must be written.  Because the record size can vary, an existing record can not be simply placed into the file because it may overwrite subsequent records.  There are two ways that I know of to handle this:
  1. If the entire file is in memory, delete the contents of the file and then write the records in memory to the file on disk.
  2. Make a copy of the file, delete the contents of the current file, copy records before the changed record from the file copy to the empty file, write the current record, copy records after the changed record from the file copy to the current file.

The downside to Binary access is that the program must know how the file was written before it can be read.  My way of handling this is to create a header record that is stored at the beginning of the file.  The header contains specific information that lets my program read the file without problems.

I personally think the advantages of Binary access far outweigh it's disadvantages as well as the advantages of other file types for many purposes.  The flexibility and efficient storage are major selling points to me.

In all cases, developing a good set of file manipulation routines takes time.  Once you have them, they generally aren't easy to port from application to application.  Because of this, I wrote VB DB to create binary access files that are extremely flexible.

I have made this code available to all VB developers.  I personally use it in many of my programs.  It has seen extensive use and has proven to be very solid and reliable.  You can read more about it and download the source code here.

 
 

Previous —
Next —

Visual Basic Database (VB DB) Source Code
How to Create Binary Files using Visual Basic

Comments about this article

 
 

Airfield Models Home

 
 

Copyright © 2002-2005 Paul K. Johnson