Gert Says
a rather irreverent discussion
File Naming Redux
Recently some listservs that I frequent have been discussing two topics that, together, gave me a new twist on my perspective of sustainable digital imaging in terms of file naming.
The two discussion topics were:
- What happens after slides?
- Tagging files with metadata
My initial reaction to the file naming discussion was to respond with my usual mantra – not meaningful but not meaningless either. We all know that the file name needs to be unique. Many now use a simple alphanumeric code that is generated by the computer.
I am not entirely happy with unstructured alphanumeric file names for several reasons. File names can perform some functions beyond naming the file uniquely.
First, to truly ensure uniqueness, I recommend adding a prefix that identifies your organization. It is very likely that your images will become part of a collective repository, whether as one office in a multi-office firm, as a centralized university collection, or as part of a cooperative effort such as the Colorado Digital Project. It is therefore important that your naming criterion be exportable and still be unique. This prefix would not only identify your images but would also guarantee that your alphanumeric file name stays unique in the collective system.
Secondly, file names can assist in creating an efficient and accurate workflow. File names that are based on an existing organizing principle, such as accession numbers or project numbers, can provide a key field to relate to already assembled metadata.
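To make those two points concrete, here is a minimal sketch in Python of one way such a name could be put together. The function names, the underscore separator, the three-digit view number, and the sample prefix and accession number are all assumptions made for illustration, not an established convention.

```python
import re


def build_file_name(org_prefix, identifier, sequence, extension):
    """Combine an organization prefix, an existing identifier, and a view number."""
    # Anything in the identifier that is not a letter or digit (dots, slashes,
    # spaces, underscores) becomes a hyphen so the name is safe to export.
    safe_id = re.sub(r"[^A-Za-z0-9]+", "-", identifier).strip("-")
    if not safe_id:
        raise ValueError("identifier must contain at least one letter or digit")
    return f"{org_prefix.lower()}_{safe_id}_{sequence:03d}.{extension.lower()}"


def identifier_from_file_name(file_name):
    """Recover the sanitized identifier, the key field for matching metadata."""
    stem = file_name.rsplit(".", 1)[0]
    # Split from the right so the last two underscores bound the identifier.
    _prefix, safe_id, _sequence = stem.rsplit("_", 2)
    return safe_id


# Hypothetical organization prefix "XYZ" and accession number "1998.12.3".
name = build_file_name("XYZ", "1998.12.3", 2, "TIF")
key = identifier_from_file_name(name)
```

Note that the cleanup step loses information: "1998.12.3" and "1998-12-3" would produce the same name, so the management system should store the sanitized form alongside the original number.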
Now, some museums that are publishing their images for public use are using file names as promotional devices. Because the file name is part of the searchable text in a web page, they believe that a file name describing the object increases the number of hits to their page. I have to disagree with this rationale for continuing the old habit of using the file name as your sole management tool. It would be better to put this information in the “ALT” tags, which all web site designers should supply for images used on their site, both so that the visually impaired can have their computers “read the image” to them and so that search engines have searchable text.
However, I have stated in the past that the file name should in some way identify the image. This is because data records can become corrupted or even lost. Think of it as a slide losing its label. I remember when we were converting an architectural office’s image management system from a proprietary one to a database with links to images. We had 3450 images and 3400 records. We had converted the files to names that referred to the master image CD, so we could always find the master image again, but that did not help us identify the original object.
One of my goals for a file name then became to reference an identifier that was maintained in another system and thus double the chances of maintaining that information. It is not that I don’t trust computers and management systems. Let’s just say that my first work with computers was in 1969 at State Street Bank, when I was trying to find out when stock accounts had been wiped out by creases in the magnetic tape. Since then, I prefer to keep my eggs in at least two places. By this, I mean systems as well as physical locations. For example: for marketing collections, I use job numbers, which are maintained by the accounting department; for museums, I use an object’s accession number, which is maintained by the registrars; and for those slide and photo libraries that have created accession numbers for their slides, I use those, which I had assumed would be maintained through the slide label.
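When the file name does carry an identifier, a mismatch like that one can at least be listed quickly. The following is a minimal sketch in Python; `find_orphans`, `key_of`, and the sample job numbers are hypothetical, and how the identifier is pulled out of a name depends entirely on your own naming scheme.

```python
def find_orphans(image_names, record_ids, key_of):
    """Return the image file names whose identifier has no matching record.

    key_of is a caller-supplied function that extracts the identifier
    from a file name.
    """
    known = set(record_ids)
    return sorted(name for name in image_names if key_of(name) not in known)


# Illustrative data only: the identifier is assumed to be the text
# before the first underscore.
images = ["J1021_001.tif", "J1021_002.tif", "J1044_001.tif"]
records = ["J1021"]
orphans = find_orphans(images, records, key_of=lambda name: name.split("_")[0])
```

Here `orphans` holds the one image whose job number has no record. The list tells you which images need attention, and the identifier in each name tells you where to look in the other system.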
However, the discussion about the future of slides, which is mentioned above, reminded me that the slides and their labels might not be around to identify the image. According to that discussion, some collections are digitizing and then “deleting” their copy slides.
In addition, many organizations are developing integrated databases, or knowledge bases. The goal of a knowledge base is to make all data accessible and to take it out of the “silos.” The downside is that the very isolation of the siloed information protected it from corruption. That is when the second discussion, regarding metadata tags, presented a solution.
Granted, this was not the point of that discussion, which was whether to maintain descriptive metadata within the file or in a database. In this situation, the poster was involved in creating an easily searchable “company-wide database - along the lines of a knowledge management system.” Many brought up the issue that while maintaining data that is specific to the file in its file header is practical, maintaining data that is specific to the object is not. This is from a cataloger’s perspective: file headers are not relational databases with authority (thesaurus) files and a standardized data structure.
While I agree that trying to maintain descriptive data about the original in a surrogate file’s header is a regression to maintaining a flat file database, I now see that maintaining an object identifier in the header, one that would not need to be an alphanumeric, computer-friendly code, has some value in preventing loss of that data. If you have standardized your data structure, you could then assign the data field that describes the original object to be entered into the file header. You could also extract that metadata for insertion in an “ALT” tag.
The ideal to me is that this data would still be saved in two ways: in the file and in a management system. Too many belts and suspenders? I think not.
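As a rough illustration of that last step, here is a minimal sketch in Python that turns an already extracted description into an image tag. It assumes the description has been read from the file header or the database by whatever tool you use; the function name `img_tag` and the sample file name and description are hypothetical.

```python
from html import escape


def img_tag(file_name, description):
    """Build an HTML image tag whose alt text carries the object description."""
    # escape() with quote=True converts &, <, > and quotation marks so the
    # description cannot break out of the attribute value.
    return '<img src="{}" alt="{}">'.format(
        escape(file_name, quote=True),
        escape(description, quote=True),
    )


tag = img_tag("xyz_1998-12-3_002.tif", 'Bowl, "blue" & white')
```

The descriptive text lives in the alt text, where screen readers and search engines can both find it, and the file name stays a plain identifier.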
3/21/05
