Questions TheMusicSack Answers

Some questions that may help you understand the purpose and function of the MusicSack.



Q:   What is the Music Sack?
Q:   How big is the Music Sack database?
Q:   What is the mission of the Music Sack?
Q:   Is this doable?



Q:   Is the Music Sack a replacement for the books that now contain this information?
Q:   What does the Music Sack currently contain?
Q:   Do you have any more content ready?
Q:   What types of music were these 1,000,000 people involved in?



Q:   How did you decide which people to include?
Q:   Does the Music Sack store the actual texts of books and articles?
Q:   What about videos? Sheet music? Can I download stuff onto my iPod?
Q:   Are there mistakes in the Music Sack?



Q:   Why does the world need the Music Sack?
          1. The Huge Amount of Data
          2. The Scattered Nature of the Data
               How do you search for information today? (without the Music Sack)
               What about "The Music Index"?
          3. The Complex Nature of the Data



Q:   What information does the Music Sack store about people?
          Types of Names
          Significant Dates
          Interesting Arrivals and Departures
          The Source of this Data
Q: What information does the Music Sack store about Books and Articles?
          Types of Bibliographic Items
          The Relationship Between them
Q:   Don't libraries have this information about books and articles?
Q:   Can you show a sample of the interesting bibliographic items in the Music Sack?



Q:   You state above that you want the Music Sack to replace printed bibliographies. Explain.
Q:   What advantages will the Music Sack offer over traditional printed bibliographies?
Q:   What are the disadvantages of traditional printed bibliographies?
Q:   Who will benefit from this new way of creating and distributing bibliographies?



Q:   How would you define cataloging?
Q:   Do you need a knowledge of cataloging to design a database to store bibliographic data?

Q:   Is a library catalogue a database?
Q:   You said earlier that Library Science is locked into an "1840s-1960s time warp" -- Explain
          A Library as a collection of objects
          The Music Sack consists of several collections of objects
          Library catalogues are different.
          "A 19th century information retrieval system"
          The Cataloging Process.
          The Drive to reduce costs
                1. Cooperative cataloging.
          The Origins of Cataloging.
                2. By restricting the number of entries each item had in the catalogue.
                3. By reducing the amount of information in some of those entries.

          Antonio Panizzi and his Main Entry
          The "Enter Under" Myth.
          The MARC Format
          The IBM Mainframe with 8 K of memory
          Magnetic Tape Storage v. The Hard Drive
          Library science : stuck in the 1960s
                1. The use of a unit record structure.
                2. The use of names to identify objects.
                3. The co-mingling of content and display.

Q:   Is making a choice of "Main Entry" unique to library cataloging?


Notes and Explanations



Q: What is the Music Sack?


Q: How big is the Music Sack database?


Q:   What is the mission of the Music Sack?


Q:   Is this doable?


Q:   Is the Music Sack a replacement for the books that now contain this information?


Q:   What does the Music Sack currently contain?

Q:   Do you have any more content ready?


Q:   What types of music were these 1,000,000 people involved in?


Q:   How did you decide which people to include?

Q:   Does the Music Sack store the actual texts of books and articles?


Q:   What about videos? Sheet music? Can I download stuff onto my iPod?


Q:   Are there mistakes in the Music Sack?

Anyone finding errors in the Music Sack should bear in mind that it is the work of one person.

A major task in creating the Music Sack has been matching names to real people.

Elvis Presley was a real person and any document that mentions the name "Elvis Presley" must be referring to that real person.

Similarly, references to "Beethoven", "Mozart", "Stravinsky" can be assumed to refer to the most famous bearers of those last names.

"Bach" usually means Johann Sebastian Bach; "Haydn" means Josef Haydn and not Michael.

Other names are so unusual that it is safe to assume that there is only one real person (in the field of music) that has that name. For example the following names must be unique:

(Unless a father and son have the same name)

In between these safe assumptions are many thousands of names where the degree of certainty is less than 99%. It would have been impractical to investigate each and every name. Some of the assumptions I have made are incorrect.

Users should be especially wary with author names.

In some instances I have matched the author name to a real person in the database. Some of these will be incorrect.

Users will also find, for example, instances of opera singers giving performances before they were born.

These errors can easily be corrected. When you find them, please let me know.

Other issues:


Q:   Why does the world need the Music Sack?

How do you search for information today? (without the Music Sack)

A music library patron seeking information about a person involved in music would typically do the following:

If the patron is still looking then next would be:

Even step three would not guarantee success:

Of course no library patron would go to all this trouble and would simply give up.

Another option some might suggest is to check the Music Index.

The Music Index is an index to 775 international music periodicals. Although an impressive resource, the Music Index has limited value as a source of information about people involved in music.

Any survey of articles in the Music Index will show a huge number of articles about a very limited number of persons. You can easily find several thousand articles about Bach, Berg, Beethoven, etc.

But this total rapidly diminishes to single figures for even moderately well-known people.

The reason for this is the nature of the material indexed -- periodicals and journals.

The scholarly journals publish scholarly articles -- but the number of persons worthy of a scholarly article is very limited.

The popular music magazines have to appeal to consumers and thus write about the artists their readers are interested in.

The most important repositories of information about people involved in music are biographical dictionaries and encyclopedias.

The Music Sack provides the solution to finding information among these thousands of dispersed sources.

Users seeking information about people involved in music can check 300,000 sources with the click of a button!


3. The Complex Nature of the Data

The Music Sack stores information about people, the books and articles that have been written about them and the activities that those people have been involved in.

This information is complex in the sense that it cannot be represented on paper without an enormous amount of duplication.

Consider the Relationship between Person and Item

The relationship between Person and Item is Many-to-Many, that is: (Note: In this context "Many" simply means "more than one")

For example the Music Sack has details of 21,841 items on Fryderyk Chopin.

In the Music Sack database the information about Fryderyk Chopin is stored just once.

Each time a user selects one of those 21,841 items and seeks information about the subject they are looking at this single record about Fryderyk Chopin and not at 21,841 individual copies of that record.

The 1980 New Grove has information on 20,506 people (by my count)

As in the above example there is only one record describing the New Grove (1980)

When users select any of those 20,506 persons described within, they are looking at that single copy of the information about the New Grove (1980).
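A minimal sketch of how a Many-to-Many relationship of this kind can be held in a relational database. The table and column names (person, item, person_item) and the second item title are assumptions made for illustration; they are not taken from the Music Sack itself.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not the Music Sack's own design.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE item   (item_id   INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- One row per (person, item) pair carries the Many-to-Many link.
    CREATE TABLE person_item (
        person_id INTEGER NOT NULL REFERENCES person(person_id),
        item_id   INTEGER NOT NULL REFERENCES item(item_id),
        PRIMARY KEY (person_id, item_id)
    );
""")

# Each person and each item is stored exactly once.
conn.executemany("INSERT INTO person VALUES (?, ?)",
                 [(1, "Fryderyk Chopin"), (2, "Franz Liszt")])
conn.executemany("INSERT INTO item VALUES (?, ?)",
                 [(10, "New Grove (1980)"), (11, "A made-up Chopin study")])
# Only the small link rows multiply as the relationships grow.
conn.executemany("INSERT INTO person_item VALUES (?, ?)",
                 [(1, 10), (2, 10), (1, 11)])


def items_about(person_name):
    """Return the titles of every item linked to the named person."""
    rows = conn.execute("""
        SELECT item.title FROM item
        JOIN person_item USING (item_id)
        JOIN person USING (person_id)
        WHERE person.name = ? ORDER BY item.title
    """, (person_name,))
    return [title for (title,) in rows]


def people_in(item_title):
    """Return the names of every person linked to the named item."""
    rows = conn.execute("""
        SELECT person.name FROM person
        JOIN person_item USING (person_id)
        JOIN item USING (item_id)
        WHERE item.title = ? ORDER BY person.name
    """, (item_title,))
    return [name for (name,) in rows]
```

Both directions of the relationship are answered from the same three tables, and no person or item description is ever copied.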

Other Many-to-Many relationships in the Music Sack are:

This type of data can only be efficiently described and stored in a relational database. It cannot be efficiently described and stored on paper and certainly not in the MARC format used by libraries.

The MARC format is simply an electronic representation of a paper record.

Compare the efficiency of the Music Sack to the redundancy and duplication found in the way libraries store data

A search of the University of Toronto library for "Chopin, Frédéric" (in any field) returned 1329 items.

Since all these items are stored in the MARC format the text "Chopin, Frédéric, 1810-1849" is repeated 1329 times.

Similarly if a library has 500 recordings of Beethoven's 5th Symphony then the data about the symphony is duplicated 500 times.

It is this duplication that requires the existence of an "Authority File". Librarians speak of the importance of their "Authority Files" and consider them one of the strengths of a library catalogue but in reality they are a sign of its weakness.

Databases were developed to reduce this redundancy and a well-designed database does not need any "Authority Files"
Q:   What information about people does the Music Sack store?

The Music Sack stores the following data about each person (if known)

The Music Sack also stores the source of the data. For example, when a person's date of birth varies from one printed source to another, or when there are variations in the spelling of a person's name, these variations are noted, as is the printed source they came from.
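One way to record which source says what is sketched below. This is an assumption about how such data could be laid out, not a description of the Music Sack's actual tables; the table names (person, source, claim), the person and the sources are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER PRIMARY KEY);
    CREATE TABLE source (source_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- Each row records what one source says about one attribute of one person.
    CREATE TABLE claim (
        person_id     INTEGER NOT NULL REFERENCES person(person_id),
        source_id     INTEGER NOT NULL REFERENCES source(source_id),
        attribute     TEXT NOT NULL,   -- e.g. 'name' or 'birth_year'
        claimed_value TEXT NOT NULL
    );
""")

# Invented data: one person whose birth year differs between sources.
conn.execute("INSERT INTO person VALUES (1)")
conn.executemany("INSERT INTO source VALUES (?, ?)",
                 [(1, "Source A"), (2, "Source B"), (3, "Source C")])
conn.executemany("INSERT INTO claim VALUES (?, ?, ?, ?)", [
    (1, 1, "birth_year", "1850"),
    (1, 2, "birth_year", "1851"),
    (1, 3, "birth_year", "1851"),
])


def variants(person_id, attribute):
    """Map each recorded value to the list of sources that give it."""
    rows = conn.execute("""
        SELECT claim.claimed_value, source.title
        FROM claim JOIN source USING (source_id)
        WHERE claim.person_id = ? AND claim.attribute = ?
        ORDER BY claim.claimed_value, source.title
    """, (person_id, attribute))
    result = {}
    for value, title in rows:
        result.setdefault(value, []).append(title)
    return result
```

Nothing is overwritten when sources disagree; every variation stays attached to the source it came from.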

To see how the Music Sack handles data sources : click here.


Q:   What information does the Music Sack store about Books and Articles?

Bibliographic Items can appear in many different forms -- the Music Sack stores them all.

The Music Sack stores:


Q:   Don't libraries have this information about books and articles?

Libraries do not catalog the range of items that the Music Sack has. Generally they only catalog books.

When a library catalogs a book the cataloguer takes the information from the item on hand.

The book is then placed on the shelf and forgotten. (Any intensive library user can find evidence of this)

Libraries do not:

Information about the publishing history of an item and the relationship between items will not be found in library catalogs but in bibliographies.

Moreover no library in the world has every book that is detailed in the Music Sack.




Q:   Can you show a sample of the interesting bibliographic items in the Music Sack?

The bibliographic items in the lists below demonstrate the strength and flexibility of the Music Sack. They also show how the Music Sack can describe any sort of bibliographic item and any type of relationship that may exist between bibliographic items.

These examples also show how limited, and indeed how primitive, the typical traditional library catalogue is.

A selection of interesting bibliographic items:

1. Basic Books
2. Books with a short note regarding their contents
3. Books with sections
4. Books with an annotation from a bibliography
5. Books and articles that have been reprinted
6. Book Reviews and Books
7. A Selection of interesting items and their relationships.

Q:   You state above that you want the Music Sack to replace printed bibliographies. Explain.


Q:   What advantages will the Music Sack offer over traditional printed bibliographies?


Q:   What are the disadvantages of traditional printed bibliographies?


Q:   Who will benefit from this new way of creating and distributing bibliographies?


Q:   How would you define cataloging?

Since the functionality of the Music Sack exceeds that of a typical library catalogue, and since the Music Sack was created without reference to any of the library profession's cherished set of cataloging rules, this is an interesting question.

Here is my definition of cataloging:

Cataloging is the process of creating a card catalogue with the constraint that there can be only a single full description of the item being catalogued.

Since cataloging involves the thoughtful placing of multiple entries (one of them special) for each item catalogued, cataloging is inextricably linked to a manual filing system.

As I noted elsewhere, the essence of cataloguing is the making of choices in the interest of economy. If a card catalogue could have an unlimited number of entries per item, then much of cataloging would vanish -- since no choices would have to be made.

Similarly, if the data is stored on a hard drive, there would be a need for only a single entry; and this single entry would be equally accessible regardless of its location on the hard drive. Hence there would be no point in a cataloguer devoting any of his or her time in trying to decide where to place it. (Not that a cataloguer can affect the physical location of a MARC record)

Note: A constraint is simply some sort of condition that must be incorporated into an application. It is usually referred to as a business rule. Constraints that librarians will be familiar with include:
Q:   Do you need a knowledge of cataloging to design a database to store bibliographic data?

No you don't. A database designer needs to understand the structure of the data to be stored, and how users will use that data. The structure of bibliographic data is obvious to any person familiar with books and bibliographies.

To rephrase what I have said elsewhere, cataloging is about the economical construction of a card catalogue, and, the placing of data elements within a heritage data structure.

The arrival of the database enabled what was, under a manual filing system, a single task, to be split into two tasks -- one skilled, one unskilled.

Today, the first task (the skilled one) is the design of the structure to hold the data and the associated applications that enable users to use that data. The second, unskilled task, is the entry of the data into that structure.

In the traditional cataloging process, the cataloguer performed both roles -- entering the data (the description) and then the choice of access points.

In a database today, the choice of "access points" is not made on a record-by-record basis by the person entering the data, but by the database designer (in consultation with the users of the data) when the database is created or revised.


Q:   Is a library catalogue a database?

No it isn't. Many librarians writing about library catalogues describe them as databases. But given its heritage and structure, I don't see how anyone could describe a collection of MARC records as a database.

Here is a definition of a database:

A database may be defined as a collection of interrelated data stored together without harmful or unnecessary redundancy to serve multiple applications; ....

The key phrase in the above definition is "without harmful or unnecessary redundancy". One of the goals of database design is to design a structure that minimizes redundancy -- that is the storing of more than one copy of a set of data. The process of eliminating redundancy requires the existence of multiple places to store the data -- and a library catalogue has only a single place to store data -- the MARC record.

To be described as a database, a library catalogue would need a minimum of two structures -- the first to store information about the book; the second to store information about people (authors, editors, and if the book was about a person, then the subject of the book).

When everything is stuffed into a single structure, redundancy is unavoidable. Thus, if a library catalogue has 10,000 books that involve Shakespeare (as author or subject), then this is stored 10,000 times, once with each MARC record:

100 1# $a Shakespeare, William, $d 1564-1616

If a library has 500 recordings of Beethoven's 5th symphony, not only is "Beethoven, Ludwig van, 1770-1827" duplicated 500 times, the data about the symphony is similarly duplicated 500 times.
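The difference between one structure and two can be sketched in a few lines. The flat record layout below is a deliberate simplification of a MARC record (an assumption for illustration), and the function name normalize is hypothetical.

```python
# Flat, MARC-like records: the heading string is copied into every record.
flat_records = [
    {"title": "Hamlet", "heading": "Shakespeare, William, 1564-1616"},
    {"title": "Macbeth", "heading": "Shakespeare, William, 1564-1616"},
    {"title": "King Lear", "heading": "Shakespeare, William, 1564-1616"},
]


def normalize(records):
    """Split flat records into a people structure and a books structure."""
    people = {}   # person_id -> heading; each heading is stored once
    ids = {}      # heading -> person_id; used only while converting
    books = []    # (title, person_id) pairs that point at the people structure
    for record in records:
        heading = record["heading"]
        if heading not in ids:
            ids[heading] = len(ids) + 1
            people[ids[heading]] = heading
        books.append((record["title"], ids[heading]))
    return people, books


people, books = normalize(flat_records)

# How many copies of the heading does each arrangement hold?
copies_in_flat_records = sum(
    1 for record in flat_records
    if record["heading"] == "Shakespeare, William, 1564-1616"
)
copies_after_normalizing = len(people)
```

With three records the saving is trivial; with 10,000 the flat arrangement holds 10,000 copies and the two-structure arrangement still holds one.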

I sampled a single MARC record about Beethoven and found it had three copies of the above:

Since a library catalogue is just a card catalogue on a computer, and since no attempt has been made to reduce the redundancy inherent in a card catalogue, I do not see how anyone could describe a library catalogue as a database.

On page 26 of the book from which the definition of a database was taken, there is a series of diagrams illustrating the evolution of data-storage methods. They are described as:

A collection of MARC records resides firmly in Stage 1 of the author's evolutionary diagrams.

On page 25 the author notes:
The term data-base became current in the late 1960s. Prior to that time, the data-processing world had talked about files of data and data sets. As so often happens when a new term becomes fashionable, many users promoted their files by changing their title to data-base without changing their nature to include data independence, controlled redundancy, interconnectedness, security protection, or, in many cases, real-time access.

Q:   You said earlier that Library Science is locked into an "1840s-1960s time warp" -- Explain

A Library as a collection of objects.

A library is a collection of objects (books, recordings, etc.); users enter the library looking for a particular object and need some device to enable them to locate individual objects from among the many objects that the library may contain.

This device is the library catalogue. The principles and rules that govern the construction of this catalogue can be traced back to the 1840s in Victorian England.

A database can be seen as simply a collection of objects. Users query the database to see if the database contains a particular object. If the answer is affirmative, the user presumably wants to see information about that object displayed.

The role of the database designer is to create a database schema that allows users to make the queries that return the objects sought.


The Music Sack consists of several collections of objects:

For example, the Music Sack has information on 1,000,000 people who have been involved in music. Users seeking information about a particular person have several ways of locating that single person from within the entire collection of 1,000,000.

Users can locate a person within the Music Sack by:

and, if applicable:

The rules for designing a database are simple and can be grasped with a few hours of study.

These rules can be applied to any collection of objects.


Library catalogues are different.

Although a library is simply a collection of objects, library catalogues are not constructed according to the rules used by the non-library world. Instead libraries use their own set of rules -- a set of rules with their origin in the 19th century. What these rules are and why libraries use them is something I find quite intriguing.

What appears below is what I believe to be the very first outsider's account of cataloging and the cataloging process.

Without exception, authors who write about cataloging are insiders -- people who have been educated entirely within the domain of library science and who have spent their entire working lives within its confines, seemingly unaware how the rest of the world has moved on.

Indeed, two of the most distinguished luminaries of library science (both with PhDs in Information and Library Science) have shown in their writings (to my satisfaction, anyway) that they lack an understanding of even the most basic aspects of the way the non-library world stores information.

How can someone have a PhD in Information and Library Science and yet clearly fail a test given to students who had completed a week's instruction in database design?

Even today, I doubt that one library science graduate in a hundred understands the divide that separates their information world from the rest.



"A 19th century information retrieval system"

Although a library catalogue may appear to a user in 2006 as a modern piece of technology, the structure of a library catalogue and the rules used to construct it are very primitive.

One librarian has described a typical library catalogue as "nothing more than a 19th century information retrieval system"

To understand why a modern library catalogue could be described in this manner, a little history of how libraries have stored information over the centuries is required.

Way back in the early 19th century, library catalogues were in the form of bound books. However, library collections are continually being added to, and since bound volumes do not allow for the easy addition of new material, this was a rather inflexible format.

Then along came a new technology that allowed for additions to be easily made -- this was the card catalogue. The library catalogue remained in the physical form of a card catalogue until the 1970s.


The Cataloging Process.

    Cataloguing an item involves three steps:

  1. Listing the attributes of the book (author, title, publisher, etc)
  2. Creating a card (or several cards) -- first hand-written, then typed.
  3. Filing those cards within the existing set of cards.
As the number of books published increased in the 19th century, libraries sought to reduce their cataloging expenses.

They did this in three ways:

  1. By cooperative cataloging
  2. By restricting the number of entries each item had in the catalogue
  3. By reducing the amount of information in some of those entries.

1. Cooperative cataloging.

Back in the 19th century each library catalogued its own collection. This was expensive and not very efficient -- if a book was in 100 libraries then it was catalogued 100 times.

To reduce this cost libraries began to share their cataloguing chores. One library would catalogue each book, create the necessary cards, then duplicate the cards and distribute the copies of the appropriate cards to libraries that had a copy of that particular book.



The Origins of Cataloging.

All cataloging theory, and all the rules that libraries use today to construct their catalogues have their origin in the next two items. These cost-saving measures forced the cataloguer to make a series of choices while cataloging the item on hand. To guide the cataloguer in this choice, and to ensure some uniformity of outcome, a set of rules was required. The latest version of these rules can be seen today in the Anglo-American Cataloguing Rules (2002)

If a card catalogue (or a catalogue in book form) could have an unlimited number of entries, these choices would not have to be made, and much of cataloging would vanish.

The conditions that required those choices to be made no longer exist, but library science still chugs along as if they still did.


2. By restricting the number of entries each item had in the catalogue.

Examples of how libraries restricted the number of entries an item could have in a catalogue:

3. By reducing the amount of information in some of those entries.

The third way that libraries sought to reduce their cataloguing expenses was the most significant.

A decision, made in the 1840s, created the foundation of library cataloguing. Meeting this requirement spawned an intellectual industry that toils to this day.

The man behind that big decision was Antonio Panizzi.


Antonio Panizzi and his Main Entry

Antonio Panizzi was in charge of the Library of the British Museum. The existing catalogue (in book form) was hopelessly out of date and in need of revision.

In order to reduce the cost of the new catalogue (also in book form) he introduced a revolutionary change to the way the catalogue was to be constructed.

Each book in the catalogue was to have a single, full description. If any other entries were required, they were to be in an abbreviated form -- essentially directing the catalogue user to the single, full description of the book.

Thus if a book had two authors, one would be chosen for the single full description of the book (and the full description would be placed under that chosen author's name); the entry under the second author would be an abbreviated entry that essentially told the catalogue user to look under the first author's name for the single full description of the book.
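The main-entry arrangement can be illustrated with a small sketch. The function name, the "See under" wording, the rule of choosing the first listed author, and the sample book are all assumptions made for illustration; they do not reproduce Panizzi's actual rules.

```python
def catalogue_entries(title, authors, details):
    """Build one full main entry plus abbreviated entries that refer to it."""
    # Assumption: the first listed author is chosen for the main entry.
    main_author, *other_authors = authors
    # The single full description, filed under the chosen author.
    entries = [(main_author, f"{title}. {details}")]
    # Every other author gets only a pointer to the main entry.
    for author in other_authors:
        entries.append((author, f"{title}. See under: {main_author}"))
    # A printed or card catalogue files its entries in heading order.
    return sorted(entries)


# An invented two-author book.
entries = catalogue_entries(
    "An Example Book",
    ["Smith, Ann", "Jones, Ben"],
    "London, 1845. 300 p.",
)
```

The user who starts at the second author has to make a second lookup to reach the full description; that extra step is the price of the saving in space.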

Some of the "problems" that cataloguing theorists have laboured over in their quest to make that choice of main entry strike an outsider as faintly preposterous.

For example: an item has content from more than one person: Which person should the main entry be under?

This situation can be found in:

Or, the content is from a single person, but the person uses a pseudonym.

Should the main entry be under:

Or, an author writes in two distinct fields and uses a different pseudonym in each?

Should the main entry be under:

Or, there is more than one form of the author's name?

Should the main entry be:

Any competent database designer would recognise the above situations and would design a database to accommodate them.

Today, the choice described above does not have to be made -- and yet at the same time the objectives in creating a catalogue could still be met.


The "Enter Under" Myth.

When a library catalogue was in the form of a book or a card catalogue, each catalogue entry had an actual physical location. The word "under" described the physical location of the entry that the cataloguer had chosen. But when records are stored electronically on a hard drive, the physical location of the data is irrelevant -- and is under the control of the disk operating system and not the person entering the data.

And the order was significant -- if the entries were not well-ordered then the catalogue was useless.

Multiple entries for each item had to be made for the catalogue to be useful. If there was only a single entry, and that was under, say, the author, then a user who remembered only the title could not find the item (short of examining every entry in turn until reaching the one under the book's author).

But when bibliographic data (or any data) is stored in a database it is not stored "under" anything -- it is simply a row in a table and that row is accessible from any of the attributes of the item (more accurately, from any attribute the database designer makes accessible by indexing)
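A small sketch of the point, using SQLite from Python. The table, its columns, the index names and the two sample books are invented for illustration, and keeping the author as a plain text column is a simplification made only to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE book (
        book_id INTEGER PRIMARY KEY,
        title   TEXT,
        author  TEXT,
        year    INTEGER
    );
    -- The designer, not the person entering data, decides what is searchable.
    CREATE INDEX book_by_title  ON book(title);
    CREATE INDEX book_by_author ON book(author);
""")
conn.execute("INSERT INTO book VALUES (1, 'An Example Book', 'Smith, Ann', 1845)")
conn.execute("INSERT INTO book VALUES (2, 'Another Example', 'Jones, Ben', 1850)")

SEARCHABLE = ("title", "author")


def find(column, value):
    """Return the ids of the rows whose indexed column equals the value."""
    if column not in SEARCHABLE:
        raise ValueError(f"{column!r} has not been made searchable")
    query = f"SELECT book_id FROM book WHERE {column} = ?"
    return [book_id for (book_id,) in conn.execute(query, (value,))]
```

The same single row is reached whether the user starts from the title or from the author; nothing was entered "under" either of them.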

The idea that data is stored under some heading is one of the enduring myths of library science. This myth is perpetuated by the MARC format.

Open any page of the AACR(2002) and you will see the phrase "enter under". The expression "enter under" is meaningless when applied to information in a database -- it has meaning only when the information is stored in a manual filing system -- which in reality is what a library catalogue is.


The MARC Format.

As cooperative cataloging developed, the central role was assumed (in the US) by the Library of Congress. This role involved the printing, stocking, taking orders for, and shipping of untold millions of cards -- a huge logistical undertaking.

In the 1960s, as computers became available, libraries looked for ways to employ them in this process of creating and distributing cards. Wouldn't it be terrific if the actual bibliographic data could be entered, not directly onto a standard library card, but onto a computer medium?

This medium could then be distributed to libraries, who could then print copies of the cards they required locally.

Thus was born the MARC format. This is the format in which just about every academic library in North America, the UK and Australia stores bibliographic data.

Today library science is a-buzz over the prospect of converting MARC format records into XML records. However this transformation would retain the underlying theme of the MARC format. To understand what that theme is, you have to go back to the 1960s when the original MARC format was created.

When you see images of the Beatles arriving in New York, or of men wearing hats and the president riding in a convertible, the 1960s do seem a long time ago. However, the 1960s seem even more remote when the field of computers is considered.

But this was the era that produced the MARC format -- and, I would argue, the era in which library science has remained.


The IBM Mainframe with 8 K of memory

The MARC format was developed on an IBM mainframe with 8 K of memory. This 8 K of memory had to hold the operating system for the computer, the data to be processed, and the instructions on how to process that data. In some instances, it was found that the title of a particular book was too long to fit in the computer's memory.

More significantly, none of the libraries participating in the development of the MARC format possessed a hard drive storage device. All data was stored on magnetic tape.

Another point to bear in mind is that the original purpose of the MARC format was to print catalogue cards -- its use for online storage came much later.

What the creators of the MARC format came up with was simply an electronic version of a standard library catalogue card. What else could they do with just 8 K of memory? And what else did they need when its sole purpose was to print cards?

A library catalogue made up of MARC records is just a card catalogue on a computer.


Magnetic Tape Storage v. The Hard Drive

A magnetic tape, like a card catalogue, is a serial device -- that is, it can only be accessed in one direction, and if the record you are looking for is at the end of the tape, then you have to spool through the entire tape to locate it.

This is how a card catalogue is accessed. Users inspect each card in sequence until they find the one they are looking for, or, if the library does not own a copy of the sought item, until they pass the location where the item would be.

The hard drive is a random access device. All physical locations on the disk are equally accessible (timewise). Moreover the physical location of the data on the disk is not decided by the person entering the data, but by the disk operating system.
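The contrast can be modelled in memory. This is only an analogy: a Python list read from the front stands in for the tape and a dictionary stands in for the disk, and both function names are hypothetical.

```python
def serial_lookup(tape, key):
    """Read records in order, as a tape must; return (record, records_read)."""
    for records_read, (record_key, record) in enumerate(tape, start=1):
        if record_key == key:
            return record, records_read
    return None, len(tape)


def random_lookup(disk, key):
    """Go straight to the record by key, wherever it happens to be stored."""
    return disk.get(key)


# 1,000 numbered records, held once as a "tape" and once as a "disk".
tape = [(number, f"record {number}") for number in range(1, 1001)]
disk = dict(tape)

last_record, records_read = serial_lookup(tape, 1000)
same_record = random_lookup(disk, 1000)
```

Reaching the last record costs 1,000 reads on the serial device and a single lookup on the random access one, which is why the order of the records stops mattering.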

Yet much of cataloging involves the thoughtful placing of individual records within an existing set of records, to aid library patrons in their search. The hard drive made redundant the ordering of rows within a set of data -- library science seems unaware of this.


Library science : stuck in the 1960s

While the non-library world has moved on, taking advantage of increasing computing power to enhance and simplify the storing and access of structured data, the library world clings to the 1960s technology of the MARC format. I can point to three main areas where Library Science is stuck in that bygone era:
  1. The use of a unit record structure.
  2. The use of names to identify objects.
  3. The co-mingling of content and display.

1. The use of a unit record structure.

The unit record structure is, of course, the MARC format. Libraries stuff everything into it; the result is incredible complexity. Contrast this with the simplicity of the structure of the Music Sack.

The part of the Music Sack that stores the bibliographic items is more powerful than a typical library catalogue, yet the structure is so simple that I can sketch it from memory in a couple of minutes.

The hard drive made it possible to split data into separate tables. Deciding how many tables are required for a particular project, and then deciding how the information to be stored was to be distributed between these tables required a process of data analysis.

The development of this process brought forth terms such as:

The above terms, fundamental to database design, are totally foreign to Library Science -- and why should librarians need to know them, since they simply stuff everything into the MARC format?

As an example of how data analysis can be applied to bibliographic data, take the case of series (a technique used by publishers to reduce their marketing costs).

After examining examples of how books and series are connected, the database designer would note that:

From these simple observations, a database designer could design a structure to store information about books and the series they form part of -- and that structure would be simpler and superior to anything library science could produce.


2. The use of names to identify objects.

Humans use names to identify objects -- Lewis Carroll, the Magic Flute, Cluj.

However, names are an imprecise method of identification, since the same object can have multiple names, and those names can have multiple spellings, etc.

Lewis Carroll is the pseudonym of the English writer and mathematician Charles Lutwidge Dodgson -- so which name should be used to identify him?

To complicate matters, his birth name, Charles Lutwidge Dodgson, can take varying forms. The University of Toronto Library catalogue has three different references:

The Mozart opera, known in the English-speaking world as the Magic Flute, is also known as:

The city of Cluj, in Romania (or Rumania), has also been known as:

The three entities above, a person, an opera, and a city, are discrete, easily identifiable objects.

Library science, still in manual filing system mode, has expended an enormous amount of intellectual effort in trying to select one name from among many as the identifier for an object.

The non-library world has recognised the futility of this effort and adopted a simpler system for identification. This involves identifying each discrete object with a number -- thus students are identified by their student ID (unique within a single educational establishment), librarians are identified by their employee number, books are identified by a number expressed as a barcode.

All the variations in names listed above can be stored, together with a description as to their significance:
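The idea of one number per object, with any number of described names attached to it, can be sketched as below. This is only an illustration in Python, not the Music Sack's own code: the class NameIndex, its methods, and the ID value 1 are all hypothetical. The two names and their descriptions come from the Lewis Carroll example above.

```python
class NameIndex:
    """Hypothetical store: each object has one number and many described names."""

    def __init__(self):
        self.names = {}   # object number -> list of (name, description)
        self.lookup = {}  # lower-cased name -> set of object numbers

    def add_name(self, object_id, name, description):
        """Attach one more name, with its significance, to an object."""
        self.names.setdefault(object_id, []).append((name, description))
        self.lookup.setdefault(name.lower(), set()).add(object_id)

    def find(self, name):
        """Return the numbers of all objects known under this name."""
        return self.lookup.get(name.lower(), set())

index = NameIndex()
AUTHOR_ID = 1  # arbitrary number chosen for this illustration
index.add_name(AUTHOR_ID, "Lewis Carroll", "pseudonym")
index.add_name(AUTHOR_ID, "Charles Lutwidge Dodgson", "birth name")

# Either name leads to the same object; neither has to be "the" name.
print(index.find("Lewis Carroll") == index.find("Charles Lutwidge Dodgson"))
```

No name has to be chosen as the identifier, because the number does that job and every name is simply another way of reaching it.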


3. The co-mingling of content and display.

As I noted above, the MARC format is simply an electronic version of a standard library card, and its original purpose was the printing of those cards. When the cards were printed, the sequence was basically:
  1. Read a character from the tape.
  2. If this is a printing character -- then send it to the printer.
  3. If it is not -- then read the next character from the tape.
  4. etc, etc.
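The loop above can be sketched in Python as follows. This is an illustration under assumptions, not a reconstruction of any real 1960s program: the tape and printer are simulated with in-memory text streams, the control characters on the tape are stand-ins for non-printing codes, and Python's str.isprintable() is used as the test for a "printing character".

```python
import io

def print_from_tape(tape, printer):
    """Copy printing characters from tape to printer, one character at a time."""
    while True:
        ch = tape.read(1)        # 1. read a character from the tape
        if ch == "":             # end of tape
            break
        if ch.isprintable():     # 2. printing character: send it to the printer
            printer.write(ch)
        # 3. otherwise fall through and read the next character

# Simulated tape: \x1f and \x1e stand in for non-printing control codes.
tape = io.StringIO("Shakespeare,\x1f William\x1e")
printer = io.StringIO()
print_from_tape(tape, printer)
print(printer.getvalue())
```

Only one character is held at a time, so the text can come out only in the order in which it was stored.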

Back in the 1960s, in order to print:

The characters had to be stored in the order they were to be printed.

Thus the MARC record had to look like this:

If the data was stored as above, it would not have been possible to print it out in a different order (first name, then last name, then dates), since this would have involved:
  1. Reading the entire line of text and storing it in a variable.
  2. Then locating the personal name. This involved:

  3. Discarding all characters up to the first "$" and all characters after the second "$"
  4. Leaving "$a Shakespeare, William, $"

  5. Trimming the first three and last three characters
  6. Leaving "Shakespeare, William"

    Then locating the first name and last name; this involved:

  7. Finding the position of the comma (in this instance 12)
  8. Giving a variable LastName the value of the first 11 characters (those before the comma)
  9. Giving a variable FirstName the value of the characters from position 14 (12+2) onward
  10. The date would have to be extracted from the full line of text (stored above)

    This involved:

  11. Discarding all characters up to the second "$"
  12. Discarding the next three characters
  13. Leaving "1564-1616"

    Then the three variables could be printed:

  14. FirstName LastName ", " Dates
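For comparison, the steps listed above take only a few lines today. The sketch below follows them in Python, using the record shown in this section; the function name reorder_heading is hypothetical, and the code assumes the record has exactly the shape of this one example (two "$" markers, a comma followed by a space inside the name).

```python
def reorder_heading(record):
    """Turn '... $a Last, First, $d dates' into 'First Last, dates'."""
    # Step 1: the whole line is already held in one variable (record).
    first_dollar = record.index("$")
    second_dollar = record.index("$", first_dollar + 1)

    # Steps 3-4: keep the text from the first "$" through the second "$".
    name_part = record[first_dollar:second_dollar + 1]

    # Steps 5-6: trim the first three and last three characters.
    personal_name = name_part[3:-3]

    # Steps 7-9: split on the comma (0-based index 11, the 12th character).
    comma = personal_name.index(",")
    last_name = personal_name[:comma]
    first_name = personal_name[comma + 2:]   # skip the comma and the space

    # Steps 11-13: skip "$d " to leave the dates.
    dates = record[second_dollar + 3:]

    # Step 14: print the three variables in the new order.
    return f"{first_name} {last_name}, {dates}"

print(reorder_heading("100 1# $a Shakespeare, William, $d 1564-1616"))
```

Every step depends on having the whole line in memory at once, which is the point of the comparison.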

This simple task would have been impossible using a 1960s computer, with only 8 K of memory and no hard drive.

Today, nobody would store data like this: 100 1# $a Shakespeare, William, $d 1564-1616

Instead, only the basic data elements would be stored, and in four fields in a database:

These data elements could then be displayed in any order desired.
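As a small illustration of separating content from display, the sketch below stores four plain fields and formats them in two different ways. The field names (last_name, first_name, birth_year, death_year) and both formatting functions are assumptions made for this example, written in Python; they are not taken from the Music Sack.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Person:
    # Hypothetical field names: content only, no punctuation or ordering.
    last_name: str
    first_name: str
    birth_year: int
    death_year: int

def catalogue_style(p):
    """Display as 'Last, First, birth-death'."""
    return f"{p.last_name}, {p.first_name}, {p.birth_year}-{p.death_year}"

def natural_style(p):
    """Display as 'First Last, birth-death'."""
    return f"{p.first_name} {p.last_name}, {p.birth_year}-{p.death_year}"

shakespeare = Person("Shakespeare", "William", 1564, 1616)
print(catalogue_style(shakespeare))
print(natural_style(shakespeare))
```

Adding a third display format means writing a third function; the stored data does not change.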

The increase in the power of the computer, together with the hard drive, has made it possible to store data in a simpler format. This in turn has made it possible to completely separate content from display.

The raw bibliographic data stored in the Music Sack is formatted by about 700 lines of code before it is displayed.


Q:   Is making a choice of "Main Entry" unique to library cataloging?

No, it isn't.

The choice of "Main Entry" is one that library science has been making since the 1840s, and making this choice is the very foundation of cataloging. However, this type of choice -- selecting one option from among many -- is not unique to library science.

The two examples below illustrate how this type of choice, based on a business rule, can be found in other situations, and should serve to place "Cataloging Theory" in perspective:

Example 1: A video store.

The store has a policy of stocking only one copy of each movie. All the movies in the store are shelved by genre. Since some movies fall into two or more genres (comedy and sci-fi, etc.), the store must have a set of rules to enable employees to determine under which genre to shelve the single copy of the movie.

The location of the single copy of the movie on the shelf is the main entry -- if the owner decided to include a note in the other genre locations directing the user to the actual location of the movie, then these would be added entries.

The rules to enable employees to make that choice are the equivalent of the Anglo-American Cataloging Rules, and the employees who apply the rules are the cataloguers.

Then the company changes its policy and decides that it will stock as many copies as necessary -- the result: that choice for main entry no longer has to be made -- hence there is no need for those rules.

Example 2: A multi-unit dwelling

Most apartment buildings have some sort of directory to enable visitors to locate the person they are visiting.

In smaller buildings this consists of a brass plate with a slot for each unit and a button for buzzing that unit. There is room in that slot for only one name.

In larger buildings the directory was made up of strips of Dymo tape with each occupant's name and unit number. Since this directory was maintained manually by the super, he or she insisted on there being only one name per unit.

But what if two or more people lived in the unit and they had different last names?

A choice had to be made! After a series of workshops, roundtables, and international conferences, a set of rules was devised to help people living in multi-unit dwellings make that choice.

But then along comes a new technology that allows for more than a single name per dwelling -- in fact you can have as many names as you want -- no choice has to be made.

All those rules for making that choice are no longer required.

This is exactly what has happened in library science. A set of rules devised to lessen the cost of implementing a 19th-century technology (the card catalogue) has been made redundant by a new technology (the hard drive).

Yet cataloging theorists toil away, seemingly unaware of the significance of that new technology.




Notes and Explanations


Biographical Dictionary

A biographical dictionary could be defined as a book (of one or more volumes) containing a number of discrete articles about people. Thousands of these biographical dictionaries have been published over the centuries and they are a major source of information.

Biographical dictionaries come in many flavours:

The point to note is that the term "biographical dictionary" is not an easy one to define precisely.

Before you can index a biographical dictionary you have to find it. Several attempts have been made to compile lists of them, but none is complete.

Of the 300,000 items indexed in the Music Sack, I estimate that some 7,900 can be classified as biographical dictionaries.

My goal is to index every biographical dictionary (and every other source of significant biographical information).




Selectivity of Biographical Dictionaries

The authors and editors who compile biographical dictionaries have always had to be selective as to who is included.

The reasons for this selectivity vary:






Item

The sources of information indexed in the Music Sack have appeared in various formats.

The term I use to cover all of these formats is "Item".




343,547 Persons

This quantity is the result of a query I ran against the database. It is based on the 7,895 biographical dictionaries currently indexed in the Music Sack. The figure is subject to revision because of the difficulty of defining a biographical dictionary.




1,858,376 Articles and Books

1,858,376 really is the total number of items referenced in the Music Sack. However, only some 300,000 of these items are individually described.

The remaining items are articles from biographical dictionaries in which all the persons described are listed in alphabetical order. For example, anyone looking for one of the 20,506 people in The New Grove (1980) does not need the volume number and page number.





Types of Names

All the names a person has used are in the Music Sack (if known), each with a description of that name.

Here is a sample of these name descriptions:





Significant Dates

The Music Sack stores all the significant dates in a person's life (if known).

Each date has a description attached to it. Here are some of them:

  • born
  • baptised
  • born (Julian Calendar)
  • born (Gregorian Calendar)
  • date of birth on marriage certificate
  • place born
  • place from
  • died
  • buried
  • will signed
  • will proved
  • date of inquest
  • death announced
  • died (Julian Calendar)
  • died (Gregorian Calendar)
  • place died
  • place buried
  • date of Variety obituary
  • active
  • active from
  • active until
  • place active
  • active in church
  • active in church from
  • active in church until
  • dates of service
  • Work produced

If a person has a connection to an institution, the Music Sack stores that information:

  • The Curtis Institute
  • The Curtis Institute (faculty)
  • The Curtis Institute (student): major
  • The Curtis Institute (graduated)
  • The Curtis Institute: role
  • "Il Conservatorio di Musica 'Giuseppe Verdi' di Torino"
  • "Student at: Il Conservatorio di Musica 'Giuseppe Verdi' di Torino"
  • Das Konservatorium fur Musik in Prag
  • Student at: Das Konservatorium fur Musik in Prag
  • Musik Gymnasium Wien
  • Musik Gymnasium Wien -- Eintrittjahr
  • Musik Gymnasium Wien -- Maturjahr
  • Die Wiener Musikhochschule
  • Taught at: Die Wiener Musikhochschule
  • Taught at: Die Wiener Musikhochschule (starting date)
  • Accademia Filarmonica di Bologna
  • Accademia Filarmonica di Bologna (anno della prima aggregazione)
  • Accademia Filarmonica di Bologna (anno di classe)
  • Berlin Philharmonic: date of performance
  • Berlin Philharmonic: member
  • Berlin Philharmonic: member from this date
  • Berlin Philharmonic: member until this date
  • Berlin Philharmonic: debut as soloist/conductor
  • Philadelphia Orchestra
  • Philadelphia Orchestra: member
  • Philadelphia Orchestra: member from this date
  • National Youth Orchestra of Wales
  • National Youth Orchestra of Wales: member
  • National Youth Orchestra of Wales: member from this date
  • Philharmonic Orchestra of Los Angeles
  • Philharmonic Orchestra of Los Angeles: member
  • Philharmonic Orchestra of Los Angeles: member from this date
  • Vancouver Symphony Orchestra
  • Vancouver Symphony Orchestra: member
  • Vancouver Symphony Orchestra: member from this date
  • Vancouver Symphony Orchestra: role
  • Orquesta Sinfonica de Mexico:
  • Orquesta Sinfonica de Mexico: member
  • Orquesta Sinfonica de Mexico: member from this date
  • Orquesta Sinfonica de Mexico: role
  • Toronto Symphony Orchestra:
  • Toronto Symphony Orchestra: member
  • Toronto Symphony Orchestra: member from this date
  • Joined the Handel and Haydn Society of Boston, Massachusetts
  • Debut with Paris Opera
  • Ano del debut en el teatro Colon
  • With the Theatre National de l'Opera
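Both lists above follow the same pattern: a value attached to a person, together with a description of what that value means. A minimal sketch of that pattern in Python is shown below. The class PersonFacts, its methods, and the person number 1 are hypothetical, and nothing here is taken from the Music Sack's own code. The descriptions "born" and "died" come from the list above, and the years are those of the Shakespeare example used earlier.

```python
class PersonFacts:
    """Hypothetical store of (description, value) rows for each person number."""

    def __init__(self):
        self._rows = {}  # person number -> list of (description, value)

    def add(self, person_id, description, value):
        """Record one described fact; a person may have any number of them."""
        self._rows.setdefault(person_id, []).append((description, value))

    def get(self, person_id, description):
        """Return every value recorded for this person under this description."""
        return [v for d, v in self._rows.get(person_id, []) if d == description]

facts = PersonFacts()
PERSON_ID = 1  # arbitrary number chosen for this illustration
facts.add(PERSON_ID, "born", "1564")
facts.add(PERSON_ID, "died", "1616")

print(facts.get(PERSON_ID, "born"))
```

A new kind of date or connection needs only a new description, not a new field.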






    The Relationship between Bibliographic Items

    Bibliographic items can be interconnected in a bewildering number of ways. The Music Sack stores information about each item and any relationship it may have with other items (if known). Where a relationship is known it is described.

    Here is a sample of those descriptions.





    Interesting Arrivals and Departures

    If the way a person arrived in or departed this world is in any way unusual, then the Music Sack stores that information.

    Here is a sample.




    page 22:
    Martin, James : Computer data-base organization
    Second edition
    Prentice Hall, 1977

    (c) 2017. Frank Greene. All rights reserved.