DISCUSSION PAPER NO. 54

DATE: November 22, 1991
REVISED:

TOPIC: Providing Access to Online Information Resources

SOURCE: Library of Congress

SUMMARY: This paper discusses the efforts to provide descriptive and access information to electronic information resources, particularly within the context of USMARC. It suggests issues to consider in terms of the scope of such efforts and identifies data elements for the description of online information resources. A mapping to USMARC fields in the bibliographic format, including the incorporation of some holdings and community information format fields is suggested.

FOR STATUS SEE DOCUMENT: DP54 COV

DISCUSSION PAPER NO. 54: PROVIDING ACCESS TO ONLINE INFORMATION RESOURCES

1. INTRODUCTION

The USMARC Format for Bibliographic Data includes, among others, a structure for coding the bibliographic information about computer files in machine-readable form. It describes both the data files stored in machine-readable form and the software, or programs used to process that data.

Librarians and information professionals, as well as other users, operate in increasingly networked and internetworked environments. Many different kinds of electronic information resources, whether they are numeric databases, computer forums, discussion groups, mailing list servers, online public access catalogs (OPACs), full-text databases, or other varieties of information resources, are available to users over one or more networks such as the Internet, BITNET, etc.

Information in electronic form requires special description, location and retrieval information. The USMARC Bibliographic Format provides fields for describing computer files, primarily with descriptive information with minimal attention to access (i.e., information to logon, electronic addresses, etc.). This structure is adequate for individual computer files, but not for online information resources that require additional data elements to provide location and access information.
It is desirable to make this type of directory information for electronic information resources accessible to the USMARC environment so that such information may be available within the same systems as other records. For instance, a user may wish to conduct a subject search on a topic and retrieve both citations for books and journals and citations to electronic databases and computer forums.

Discussion Paper No. 49 (Dictionary of Data Elements for Online Information Resources) first presented a list of data elements needed. At the USMARC Advisory Group discussion in June 1991, participants agreed that this effort should be further pursued. Some of the information describes broader resources, rather than just data files.

Several groups have used the initial work presented in that discussion paper to further identify the data elements needed. MERIT, an organization in Michigan that provides information support services to the networking community, used the list of data elements and attempted to map them to an X.500 application (X.500 is an OSI (Open Systems Interconnection) application protocol for directory service). This has provided the first step in establishing an Internet protocol specification. The Coalition for Networked Information's (CNI) Working Group on Directories has been working to refine the list of data elements. CNI's Architecture and Standards Working Group has begun to develop a working standard for referencing networked information objects.

In addition, OCLC was awarded a U.S. Department of Education grant to provide investigation and analysis of Internet resources. (The project entitled "Assessing Information on the Internet: Toward Providing Library Services for Computer-Mediated Communication" is funded from Oct. 1991 to Sept. 1992.) It will conduct an extensive search on the Internet to locate and sample electronic information resources and develop and test a descriptive taxonomy of Internet resources.
OCLC is using the list of data elements for online information resources and expects to further refine it as a result of its research.

As a result of the discussions at the June MARBI meeting, a subcommittee was formed to further consider the issues brought forth in the paper. This paper brings forth much of the subcommittee's deliberations on the issue, which were conducted through electronic mail.

This discussion paper will attempt to do the following:

1. Consider the scope of types of networked information resources for which it is desirable to supply access. This provides the first step in compiling a list of data elements needed. This discussion may raise many questions.

2. Identify data elements to describe and provide location and access information for online information resources. This will include the list presented in Discussion Paper No. 49 with some additions.

3. Suggest possible USMARC fields that the data elements could be mapped to. Unlike the earlier discussion paper, fields adapted from the holdings and community information formats will be included.

4. Present a list of questions that need to be addressed before further work can be done on the topic.

2. SCOPE

Discussion Paper No. 49 identified types of information resources and included the following list:

     Online Public Access Catalogs
     Bulletin Boards
     Mailing List Servers
     Computer Discussion Groups and Forums
     Data Archives
     Computational Resources
     White Pages
     Network Information Centers
     Full-text Databases
     Numeric Databases
     Other types of citation databases

This was intended as a preliminary list, and other resources may be added.

In order to discuss the subject of access to online information resources, the types of information to which we need access need to be refined, including how they relate to one another. Following are questions to consider:

1. Do we want access to both "computer-mediated communication" (e.g., email) and to "resource sharing" (i.e., use of distant computing resources via the network, e.g. FTP, TELNET)?

2. At what level do we want to catalog (i.e. provide access to) this information?
     a) From the document (a computer file) itself?
     b) From the depository location (e.g., an FTP server) for the computer file?
     c) From the subdirectory it resides in?
   One may want to retrieve a file that is located at the Internet address "wuarchive.wustl.edu" in a directory called "/systems/mac/info-mac". Do we need a record for the FTP server, the file, and the directory in order to be able to search for such documents? (This brings up the related question of how to deal with the hierarchy among the types of information resources.)

3. What is an online information resource? Is it an electronic system in which either collections or descriptions of discrete units of information documents are contained? Can any level of information be included, e.g. a system (such as GLADIS) or a database within it (such as the UC Berkeley catalog)? Does this imply a distinction such as that for collection level description versus item level description?

4. Are some information resources actually services? For instance, is a Listserv a service or an information resource? Is it something to be included on our list of information resources to be accessed?

Clearly, electronic documents (files) fit into the computer files format well, but other types of resources do not. Do we want to provide access to as much as possible that is available electronically, i.e. describe individual documents, broader information resources, and everything in-between?

To assist in developing categories to put all these levels of online information resources into, the following Internet functions with sub-functions have been identified:

1. Email
     - to individuals
     - to directory servers (e.g. mail to anyone at an institution; has one address with different users)
     - to listservs (has any number of individual subscribers)
     - to mail daemons (servers that process mail and automatically mail back a document)

2. FTP (File Transfer Protocol)
     - individual documents available through FTP
     - directories within FTP servers
     - FTP servers themselves

3. TELNET (Remote Login)
     - systems requiring a password and login
     - bulletin boards
     - library catalogs and/or library information systems
     - single function services (e.g. the National Weather Service address)
     - multi-function services (e.g. FreeNet)

It might be a reasonable requirement to specify that in order to create a record for any of these, it must have an address.

Note: In the above list, it would be desirable to exclude Email to individuals. Are any others to be excluded?

3. Dictionary of Data Elements for Online Information Resources

The previous discussion paper raised many questions that need to be answered regarding providing enough information in a record. In addition, it listed specific data elements for the information resource description and mapped them to USMARC fields. The following is a revised list. New data elements not previously included are indicated by an asterisk (*). Suggested changes in the mapping to USMARC fields appear in brackets [ ]. In some cases, more than one suggestion is given for the corresponding USMARC field. A few of the fields identified in Discussion Paper No. 49 are being made obsolete with format integration; although they may have been appropriate, they have been replaced here by other fields.

In considering the appropriate fields for these data elements, in some cases fields from the community information format are appropriate when resources that are services are described. In other cases holdings fields seem appropriate for location and some access information.
It might be useful to think of something like an FTP server as access information, characteristic not of the database itself, but of the holding. A data file that is mounted on various systems may then have multiple locations. Thus, existing holdings format fields have been included as suggested holders for this data. It might also be considered whether a block of fields should be defined in the USMARC Holdings Format for electronic locations (the 85X/86X fields are full), since much of this is indeed holdings/location information, rather than bibliographic information. Keep in mind that holdings format fields are defined in the bibliographic format, since embedding holdings information in bibliographic records is allowed.

It must be emphasized that the mappings to USMARC fields are suggestions; further discussion needs to take place after the list of data elements itself is finalized.

Data Element                      USMARC Field

*Standard Identifier              026? Define new field for standard
                                  electronic ID number, including agency
                                  assigning
Name of the Resource              245 $a Title Statement
Acronym/Initialism                211 $a Acronym or Shortened Title
Producer (see discussion below)   260 $a$b Place and Name of distributor
Distributor of the Resource
Location (and Sublocation)        [852 $a Location; $b Sublocation; may
                                  have multiple occurrences for
                                  distribution centers, systems mounting
                                  the file, etc.]
Contact Name and Address          [535 Location of Originals (define
                                  subfields)] or
                                  [270 Primary address (adapt from CIF)]
Network Access (universal)        [541 Source of acquisition]
Network Address(es)               [541 Source of acquisition] or
                                  270 $m Electronic mail address or
                                  [852 $e Address (Internet); associate
                                  with 852 $a and $b Location and
                                  Sublocation]
Hours of Service                  [301 Hours, Etc. (adapt from CIF)] or
                                  [541 Source of Acquisition (new
                                  subfield)]
Telephone                         [270 $k Telephone number (adapt from
                                  CIF)]
Fax                               [270 $l FAX Number (adapt from CIF)]
Network Access Instructions       [590 Local Notes; also can be
                                  associated with 852 Location field]
Terminal Emulation Supported      538 $a Technical Details Note
Logon/Subscription Instructions   [Create new field]
Logoff/Unsubscribe Instructions   [Create new field]
Type of the Resource              516 $a Type of File or Data Note
Size of Resource                  256 $a File Characteristics
Frequency of Update               310 $a Current Frequency
Language of Resource              546 $a Language Note [$b code]
Profile of Resource               520 $a Summary, Abstract, etc. Note
Audience                          521 $a Target Audience Note
Restrictions on Access            506 $a Restrictions on Access Note
Authorization                     506 $e Authorization or
                                  845 Terms Governing Use (consider
                                  holdings information related to 852)
Source Machine                    538 $a Technical Details Note
Cost for Use                      [541 $h Cost of use] or
                                  [531 Eligibility, Fees, Procedures
                                  Note (adapt from CIF)]
Coverage                          513 $a$b Type and Period Note
Indexing Terms                    653 $a Index Term
Databases Available               505 $a Contents Note
Other Providers of Database       [775 Other Edition Entry]
Documentation Available           556 $a Information About Documentation
                                  Note
Responsibility for Record         040 $a$d Cataloging Source
  Maintenance
Date/Time of Last Update of       005 Date and Time of Latest
  Directory Information           Transaction
Local Access Information and      590 Local Notes
  Guidelines
*Processing Status                [Define in 008; e.g., draft or final;
                                  peer reviewed]
*Collection strength              [Define in 008; describe strengths of
                                  collection of particular organization
                                  that may have its OPAC on the network]

4. Producers and Distributors

For each online resource, there are a variety of people involved in making the information contained in the resource available to users.
Since the focus of this discussion paper is on the data elements to gain access to the online information resource, the variety of people that are involved in the complicated process of creating, organizing, and making accessible the online resource will be conceptually divided into two categories: producers and distributors.

Producer: The category of producer includes those who have organized the database or information resource. While it may or may not include the actual creator of records, the important aspect to describe is the party (institution or individual) responsible for the information resource as a whole.

Distributor: The category of distributor includes those who are making the database or information resource available online to users (i.e., the organization that has mounted the database, the center that is hosting a mailing list server, etc.). Any database can have many distributors. For example, "BIOSIS Previews" (see example 3 in Attachment A) can be mounted on the University of California system under GLADIS or on the Harvard system under HOLLIS, as well as by BRS/After Dark.

A single organization may in some cases serve in both roles since a producer may mount its own database and make it accessible online to users, and a distributor may also be producing an information resource in the course of its activity as a distributor.

5. Questions for Further Discussion

It is important to refine the list of data elements. Some institutions may choose a format other than USMARC to which to map the data elements. The fields above are suggestions which attempt to view this information in a traditional bibliographic way. The following questions are raised to further discussion on this topic:

1. What types of information resources will be included/excluded for providing description and access?

2. Is the list of data elements complete? Has any further work identified additional data elements that need to be included that do not fall into any category above?

3. Does a review of the cataloging rules for computer files need to be done to finalize the data element list?

4. How should USMARC be expanded to accommodate types of information resources that do not easily fit into the computer file description? Is it useful to consider that some of this information belongs in holdings fields?

5. How will normalization of access information be accomplished (e.g., terminal emulation, address, etc.)?

See Attachment A for examples of data for three electronic information resources. These examples are for illustrative purposes only, and all data may not be current or applicable.

Example 1. University of California, Berkeley, Online Catalog

Data Element                      USMARC Field

Standard Identification Number    () XXXXXX (numerics)
Name of the Resource              UC Berkeley Online Catalog
Acronym/Initialism                GLADIS
Producer                          University of California
Distributor of the Resource       University of California
Location                          Berkeley, CA
Contact Name and Address          Roy Tennant, Public Service Automated
                                  Systems Coordinator
                                  Internet: rtennant@library.berkeley.edu
                                  BITNET: roy@ucbgarne
Network Access                    Internet
Network Address(es)               gopac.berkeley.edu
Hours of Service                  24 hours
Telephone                         (415) 642-3532
Fax                               (FFF) FFF-FFFF
Network Access Instructions       telnet to gopac.berkeley.edu
Terminal Emulation Supported      VT100 emulation possible, but not
                                  explicitly present
Logon/Subscription Instructions   Information present for signing on
Logoff/Unsubscribe Instructions   Information present for signing off
Type of the Resource              OPAC
Size of Resource                  XX million records
Frequency of Update               Daily
Language of Resource              English
Profile of Resource               Public access catalog covering the
                                  holdings of most UCB libraries.
                                  Circulation information is available
                                  for some library locations.
Audience                          Students, Researchers, Faculty, Public
Restrictions on Access            Public access to catalog
Authorization                     No password required for access
Cost for Use                      No charge
Coverage                          Catalog is complete for monograph
                                  holdings from 1977 to present. All
                                  serial titles, both past and present,
                                  are reflected in GLADIS. Contains
                                  records for maps, manuscripts,
                                  audiovisual materials and computer
                                  software.
Indexing Terms                    Library Catalog, Citation Database
Databases Available               Medline (restricted)
Responsibility for Record         Public Service Automation Office
  Maintenance
Date/Time of Last Update of       4/15/91 10:50
  Directory Information

Example 2. Public-Access Computer Systems Forum

Data Element                      USMARC Field

Standard Identification Number    () XXXXXX
Name of the Resource              Public-Access Computer Systems Forum
Acronym/Initialism                PACS-L
Producer                          University Libraries, University of
                                  Houston
Distributor of the Resource       University Libraries, University of
                                  Houston
Location                          University of Houston, Houston, TX
Contact Name and Address          Charles Bailey, Assistant Director for
                                  Systems, University Libraries,
                                  University of Houston, Houston, TX
                                  77204-2091
                                  BITNET: lib3@uhupvm1.bitnet
                                  CompuServe: 71161,3410
Network Access                    BITNET
Network Address(es)               listserv@uhupvm1.bitnet
Hours of Service                  24 hours
Telephone                         (713) 749-4241
Fax                               (713) 749-3867
Network Access Instructions       email to listserv@uhupvm1.bitnet
Logon/Subscription Instructions   To join PACS-L, send the following
                                  email message to listserv@uhupvm1:
                                  SUBSCRIBE PACS-L FirstName LastName
Logoff/Unsubscribe Instructions   To sign off PACS-L, send the following
                                  email message to listserv@uhupvm1:
                                  UNSUBSCRIBE PACS-L
Type of the Resource              Computer Forum
Size of Resource                  1600 subscribers
Frequency of Update               Daily
Language of Resource              English
Profile of Resource               A moderated, international computer
                                  conference that deals with computer
                                  systems that libraries make available
                                  to their patrons.
Audience                          Persons interested in automation in
                                  libraries
Restrictions on Access            Public access to forum
Authorization                     No password required for access
Cost for Use                      No charge
Coverage                          Conference was established in June
                                  1989. All messages to the conference
                                  are automatically archived.
Indexing Terms                    Library Automation, Information
                                  Technology
Databases Available               To see what files are available, send
                                  the following email message to
                                  listserv@uhupvm1: INDEX PACS-L
Responsibility for Record         Moderator of list
  Maintenance
Date/Time of Last Update of       4/29/91 6:50
  Directory Information
Local Access Information and      Archived files for PACS-L are
  Guidelines                      available through local OPAC. See
                                  librarian for assistance in searching
                                  files.

Example 3. Database Available Online

Data Element                      USMARC Field

Standard Identification Number    () XXXXXX
Name of the Resource              BIOSIS Previews
Producer                          BIOSIS
Distributor of the Resource       BRS/After Dark
Location                          1200 Route 7, Latham, New York 12110
Contact Name and Address          BRS/After Dark Customer Support
Network Access                    Telenet
Network Address(es)               C XXXXXX
Network Access                    Tymnet
Network Address(es)               XXXXXX
Hours of Service                  24 hours
Telephone                         (518) 783-1161, (800) 345-4277
Fax                               (FFF) FFF-FFFF
Network Access Instructions       Dial local node for Telenet or Tymnet
                                  connection and key in appropriate
                                  BRS/After Dark address for network
                                  used.
Type of the Resource              Bibliographic Citation Database
Size of Resource                  7,000,000 records
Frequency of Update               Monthly
Language of Resource              English
Profile of Resource               Contains citations, with abstracts, to
                                  international literature on research
                                  in the life sciences.
Audience                          Scientists, Students, Researchers,
                                  General Public
Restrictions on Access            Authorized BRS/After Dark Accounts
Authorization                     User ID and Password Needed
Cost for Use                      Rates dependent on user category.
                                  Contact BRS/After Dark for current
                                  rates.
Coverage                          International, 1970 to present
Indexing Terms                    Life Sciences, Biology
Other Providers of Database       Data-Star, DIALOG, DIMDI, ESA-IRS, STN
                                  International
Responsibility for Record         BRS/After Dark
  Maintenance
Date/Time of Last Update of       1/1/91 13:55
  Directory Information
Local Access Information and      Mediated searching available at main
  Guidelines                      library location. Fees include connect
                                  time charges for both database and
                                  telecommunications and printing
                                  charges.
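
As an illustration only, the suggested mapping could be applied to one of the examples by machine. The following is a minimal sketch in Python, not part of any USMARC specification. It uses only those data elements for which the list in section 3 gives a single unbracketed field suggestion, and takes values from Example 2. The names SUGGESTED_FIELDS and to_fields, and the "tag $subfield value" display form, are assumptions made for this sketch. Elements for which the paper proposes a new or adapted field are reported as unmapped rather than guessed.

```python
# Hypothetical sketch: SUGGESTED_FIELDS and to_fields are illustrative names,
# not part of USMARC. Tags follow the unbracketed suggestions in section 3.
SUGGESTED_FIELDS = {
    "Name of the Resource": ("245", "a"),
    "Acronym/Initialism": ("211", "a"),
    "Type of the Resource": ("516", "a"),
    "Size of Resource": ("256", "a"),
    "Frequency of Update": ("310", "a"),
    "Language of Resource": ("546", "a"),
    "Audience": ("521", "a"),
    "Restrictions on Access": ("506", "a"),
    "Indexing Terms": ("653", "a"),
}


def to_fields(record):
    """Return (sorted display lines, names of elements with no settled field)."""
    lines, unmapped = [], []
    for element, value in record.items():
        if element not in SUGGESTED_FIELDS:
            # The paper proposes a new or adapted field here; do not guess one.
            unmapped.append(element)
            continue
        tag, code = SUGGESTED_FIELDS[element]
        # Simplifying assumption: a list value yields one field per entry.
        values = value if isinstance(value, list) else [value]
        for v in values:
            lines.append(f"{tag} ${code} {v}")
    return sorted(lines), unmapped


# Values taken from Example 2 (PACS-L).
pacs_l = {
    "Name of the Resource": "Public-Access Computer Systems Forum",
    "Acronym/Initialism": "PACS-L",
    "Type of the Resource": "Computer Forum",
    "Size of Resource": "1600 subscribers",
    "Frequency of Update": "Daily",
    "Language of Resource": "English",
    "Indexing Terms": ["Library Automation", "Information Technology"],
    "Logon/Subscription Instructions": "SUBSCRIBE PACS-L FirstName LastName",
}

fields, unmapped = to_fields(pacs_l)
for line in fields:
    print(line)
print("No settled field:", unmapped)
```

Keeping the unmapped elements visible, rather than forcing them into a local note, reflects the open question above of how USMARC should be expanded for resources that do not fit the computer file description.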