(no subject)

From: brindley@ece.orst.edu
Date: Mon Dec 11 1995 - 09:07:41 EET


Marshall Burns wrote:
> Dear Mike,
>
> You wrote:
> > Of course, a tagged data format would handle all of this in its sleep.
> ...
> > I hereby publicly request ... use a tagged data format.
>
> I can think of several things you might mean by "tagged data
> format." Could you say specifically the sort of technique you are
> referring to?

Well, the basic idea behind a tagged data format is that the data
includes a tag (label, identifier) which identifies the content.
Traditionally, data on computers have relied on the file name to
identify the contents. This has often taken the form of
content.ext where content identifies the area which the data
in the file refers to and ext defines the form of the data. An
example would be the file 'sales95.wk1'. The first part of the
name tells you that the file has sales figures for the year 1995,
while the last part tells you that the file is a spreadsheet
file in Lotus 1-2-3 format. Now what happens if the name is
changed (perhaps by accident)? The data is still intact, but
you can't identify it. If the data is tagged, it has an identifier
as part of the data, so a person or program can identify the contents
and make sense of them. Now, what happens if you wish to bundle
some other data together with the sales spreadsheet so that they
will stay together (or at least be difficult to separate)? With a
tagged data format, you can put a wrapper around the different
types of data, encapsulating them into one file. So you can
make a file that holds the sales spreadsheet, all the charts and
slides you made for your presentation to the Board of Directors,
a picture of you being congratulated by the President of the
company for your record setting year, a report analyzing why
you did so well and suggesting how to apply those principles
to other parts of the company, and so on.
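
To make the wrapper idea concrete, here is a minimal sketch in
Python. It assumes a simple layout loosely modelled on IFF-style
chunks: a 4-byte tag, a 4-byte big-endian length, the payload, and
a pad byte when the payload length is odd. The tags 'BNDL', 'WK1 ',
'SLID' and 'TEXT' are made up for illustration; they are not taken
from any real standard.

```python
import struct

def make_chunk(tag: bytes, payload: bytes) -> bytes:
    """Return one tagged chunk: tag, big-endian length, payload, pad."""
    if len(tag) != 4:
        raise ValueError("tag must be exactly 4 bytes")
    # Pad odd-length payloads so every chunk starts on an even offset.
    pad = b"\x00" if len(payload) % 2 else b""
    return tag + struct.pack(">I", len(payload)) + payload + pad

def make_bundle(chunks) -> bytes:
    """Wrap several (tag, payload) pairs in one outer 'BNDL' chunk."""
    body = b"".join(make_chunk(tag, payload) for tag, payload in chunks)
    return make_chunk(b"BNDL", body)

# Hypothetical contents; real payloads would be the actual file data.
bundle = make_bundle([
    (b"WK1 ", b"sales figures"),
    (b"SLID", b"slide 1"),
    (b"TEXT", b"analysis report"),
])
```

Because the outer chunk carries its own tag, the bundle stays
identifiable even if the file is renamed.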

Now, you send this packet to a friend in another division. She
wanted a copy of some of your slides and your sales figures.
The spreadsheet program reads in the spreadsheet data and
ignores all the rest of the stuff (the different types of data
are clearly labelled, so the program can easily extract the
information it wants while ignoring the rest). And a graphics
program reads in the slides. But wait, there's more! You
find out that your friend read the analysis report and found
some good pointers to apply to her situation.
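
The "read what you want, ignore the rest" behaviour can be sketched
as follows. This assumes the same made-up chunk layout as a typical
IFF-style file (4-byte tag, 4-byte big-endian length, payload,
optional pad byte); the tags and the chunk() helper exist only to
build sample data for the example.

```python
import struct

def iter_chunks(data: bytes):
    """Yield (tag, payload) for each chunk in a flat chunk sequence."""
    pos = 0
    while pos + 8 <= len(data):
        tag = data[pos:pos + 4]
        (size,) = struct.unpack(">I", data[pos + 4:pos + 8])
        yield tag, data[pos + 8:pos + 8 + size]
        # Step over the payload plus its pad byte, if any.
        pos += 8 + size + (size % 2)

def extract(data: bytes, wanted) -> list:
    """Return payloads whose tag is in `wanted`; skip everything else."""
    return [payload for tag, payload in iter_chunks(data) if tag in wanted]

def chunk(tag: bytes, payload: bytes) -> bytes:
    """Helper that builds one sample chunk (illustration only)."""
    pad = b"\x00" * (len(payload) % 2)
    return tag + struct.pack(">I", len(payload)) + payload + pad

packet = (chunk(b"WK1 ", b"sales figures")
          + chunk(b"SLID", b"slide 1")
          + chunk(b"FOTO", b"handshake"))

# A spreadsheet program would ask only for the tag it understands.
spreadsheet_data = extract(packet, {b"WK1 "})
```

The reader never needs to understand the 'FOTO' chunk; the length
field is enough to step over it.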

So, minimally a tagged data format would clearly identify the
data it contains, preferably in the first few bytes of the file.
The STL file format is not a tagged data format; it does not
identify the data in any way. The GIF format is tagged; it
has the tag 'GIF87a' or 'GIF89a' at the very beginning of the file.
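
As a small illustration, a program can recognize a GIF from its
contents alone, whatever the file is called. The GIF signature is
the six bytes 'GIF87a' or 'GIF89a' at offset 0. The function name
identify() is just a name chosen for this sketch.

```python
def identify(data: bytes):
    """Return ('GIF', version) if data starts with a GIF signature,
    otherwise None. Only the first six bytes are examined."""
    signature = data[:6]
    if signature in (b"GIF87a", b"GIF89a"):
        return "GIF", signature[3:].decode("ascii")
    return None

# The file name plays no part in the decision.
gif_like = b"GIF89a" + b"\x00" * 20
untagged = b"\x00" * 84
```

Data with no tag, like the second sample, simply cannot be
identified this way.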

The reasons for identifying the overall chunk of data apply
equally well to internal subdivisions of the data. One useful
tag might be a version number so that new features can be added
(the numbers in the GIF tags are basically version numbers;
they refer to the year the standard or revision was made). For
a rapid prototyping file format, individual solid volumes
could be specified (and identified as object or support) and
colors or other attributes could be assigned to the individual
solid volumes.
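
Here is a rough sketch of what that could look like. Everything in
it is hypothetical: the file tag 'RPFF', the chunk tags 'VERS' and
'SOLD', and the layout of a solid (one role byte, three color
bytes, then opaque geometry data) are invented for illustration and
are not part of any existing rapid prototyping format.

```python
import struct

ROLE_OBJECT, ROLE_SUPPORT = 0, 1  # hypothetical role codes

def pack_solid(role: int, rgb, geometry: bytes) -> bytes:
    """One 'SOLD' chunk: role byte, R, G, B bytes, then geometry."""
    payload = struct.pack(">B3B", role, *rgb) + geometry
    return b"SOLD" + struct.pack(">I", len(payload)) + payload

def pack_file(version: int, solids) -> bytes:
    """File tag, a 'VERS' chunk, then one chunk per solid volume."""
    vers = b"VERS" + struct.pack(">IH", 2, version)
    return b"RPFF" + vers + b"".join(pack_solid(*s) for s in solids)

def read_file(data: bytes):
    """Return (version, solids); chunks with unknown tags are skipped."""
    if data[:4] != b"RPFF":
        raise ValueError("not a tagged RP file")
    pos, version, solids = 4, None, []
    while pos + 8 <= len(data):
        tag = data[pos:pos + 4]
        (size,) = struct.unpack(">I", data[pos + 4:pos + 8])
        body = data[pos + 8:pos + 8 + size]
        if tag == b"VERS":
            (version,) = struct.unpack(">H", body)
        elif tag == b"SOLD":
            role, r, g, b = struct.unpack(">B3B", body[:4])
            solids.append((role, (r, g, b), body[4:]))
        pos += 8 + size
    return version, solids

sample = pack_file(1, [
    (ROLE_OBJECT, (255, 0, 0), b"part geometry"),
    (ROLE_SUPPORT, (128, 128, 128), b"support geometry"),
])
```

A reader written for version 1 would keep working on a later file
that adds new chunk types, since it steps over tags it does not
know.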

Does that explain the kind of thing I'm talking about? For more
information, you might try the documentation for one such
scheme: Appendix A of 'Amiga ROM Kernel Reference Manual:
Devices (Third Edition)' by Commodore-Amiga, Inc., 1991,
published by Addison-Wesley, ISBN 0-201-56755-X.

  --> Mike Brindley brindley@ece.orst.edu Corvallis, Oregon, USA
"I take unanimity on any difficult topic as a danger sign."
  - P. J. Plauger


