THE THIN-FILM DATABASE
A new concept in scientific literature searching

Dr. Ilana Fried
Dr. Alfred Rosenstein
INFODISK ltd.
P.O. Box 3611
Rosh-Ha'ayin 40800, Israel

The explosion of scientific information, as illustrated by the ever-growing number of papers published, is now upon us. Whereas some fifty years ago one could read "Chemical Abstracts" on the commuter train, and thirty years ago one could still manage a section or two, today it is very difficult to keep up with what is being published even in one's own very narrow specialty. This situation has two results:

a) while one knows everything in one's own area, one's knowledge diminishes very rapidly outside that narrow field. This decreases the chances for creative cross-fertilization;
b) when the need to read about another specialty arises, the work is often delegated to junior staff or to non-scientists (librarians and information specialists), thus diminishing the exposure of the most experienced, senior scientists and engineers to new information.

The computerized databases available today are no longer of much help. Their reliance on keywords, coupled with the ambiguity of language, complicates even simple searches. Moreover, when one needs to look for data, such as pressures or temperatures, these data are usually either not searchable or not given at all. The time consumed by a literature search, and the frustration of conducting one, are known to us all.

Infodisk proposes a new concept in the presentation and retrieval of data from the scientific literature. Infodisk's first product is a database on thin films, the TF Database.

THE APPROACH

The Infodisk Thin Film Database contains only the 'hard data' from the original published articles. Infodisk's research scientists read all available articles on thin films and extract all quantifiable information from those dealing with the preparation and properties of thin films. When an article discusses two materials, each material is the subject of its own record.
The same record may be present in one or more files, depending on the application of the material. Infodisk created complete files, so that searching in more than one file is usually unnecessary. The result is a database which answers difficult questions quickly (within a few minutes) and easily.

THE CONTENTS OF THE THIN FILM DATABASE

The TF Database is divided into seven distinct files:

1. Microelectronics
2. Superconductors
3. Electronic displays
4. Optics and Optoelectronics
5. Magnetic Storage
6. Optical Data Storage
7. Protective Coatings

Each record contains 110-120 fields, which are grouped into six sections:

1. Bibliography
2. System Characteristics
3. Preparation Method
4. Preparation Conditions
5. Film Properties
6. Other Properties and Notes

The structure of all files is identical in all sections except 'Film Properties'. The structure of the sections is as follows:

1. Bibliography
This section contains all the bibliographic information of the article, plus one 'reference for more detail' if that reference is essential for understanding the article.

2. System Characteristics
This section contains the following fields:
* Film composition
* Applications
* Substrate
* Buffer layer
* Top layer
* Multilayer structure
* Characterization methods
* Preparation method

3. Preparation Method
This section contains the following fields (please note that the field "Preparation method" appears in two places; this is intentional):
* Preparation method
* Material source
* Dopant source
* Reactive gas
* Carrier gas
* Discharge gas
* Component ratio
* Substrate pretreatment
* System pretreatment
* Post-deposition treatment
* Post-deposition processing

4. Preparation Conditions
This section contains the following fields.
It contains the data on temperatures, pressures and power used in preparation, as well as some other data:
* Background pressure
* Total pressure
* Reactive gas pressure
* Carrier gas pressure
* Discharge gas pressure
* Flow rate
* Source temperature
* Deposition temperature
* Growth rate
* Deposition duration
* Film thickness
* Discharge beam current
* Source characterization
* Power density
* Target voltage
* Other technological parameters

In most categories, two fields are provided, for the maximum and minimum values. Some articles give a range for one or more parameters; such a range is recorded by entering the two extreme values of the parameter.

5. Film Properties
This section differs from one file to another. Examples:

Electrical properties
* Conductivity type
* Carrier density
* Band gap
* Resistivity
* Conductivity
* Carrier mobility
* Dielectric constant
* Dielectric strength
* Other electrical properties

Superconducting properties
* Critical temperature (onset)
* Critical temperature (R=0)
* Critical current density
* Critical field
* Normal state resistivity
* Low temperature susceptibility

Optical properties
* Absorption
* Spectral characteristics
* Optical transmittance
* Refractive index
* Other optical properties

Here, too, many categories contain fields for maximum and minimum values.

6. Other Properties and Notes
This section contains three fields:
* Other properties
* Notes, which contain important information given by the author
* Conclusions, which contain a brief summary of the article written by the abstractor

EXAMPLES OF SEARCHES

The following problems are given as examples:

1. Find all films that were deposited using CVD or a related method, at temperatures between 100 and 300 deg C.

The preliminary step is to choose the file (Microelectronics) and the function (Carry out a search). The first step is to choose all records containing CVD in the 'Preparation method' field, and then the function that links the steps together ('and').
The second step is to choose the field 'Deposition temperature' and input the two extreme values: 'greater than' 100 'and' 'less than' 300. Choosing 'end selection' initiates the search and displays the results. This process takes less than a minute.

2. Find all films whose carrier density values are of the order of 10^11 per cm^3 and which were deposited at rates greater than 1 nm/sec.

The preliminary step is the same as before. In the first step one chooses, from 'Film Properties, Electronic Properties', the category 'Carrier density' and inputs the two values: 'greater than' 1*10^11 and 'less than' 1*10^12. The logical function 'and' concludes this step. From 'Preparation Conditions' one then chooses 'Growth rate' and inputs the value 'greater than' 1. Concluding the search with 'end selection' produces the results in a few seconds. Again, the whole search takes less than a minute.

The searcher is most often surprised by the small number of articles retrieved. This is because all of these articles are relevant to the problem at hand. By contrast, presently available methods return a large number of titles which, after a lot of searching and sifting, eventually yield only a very small number of relevant articles.

OPERATING THE THIN FILM DATABASE

The TF Database has five functions:

1. View records
One can view the records in two forms: a Table Format, which shows up to 17 records on the screen and so facilitates comparisons, and a Label Format, which gives one record per screen and is intended for more detailed viewing.

2. Carry out a search
Each and every field is searchable. The program's special feature of "mixed fields" makes it possible to put explanations next to numbers, and to search the quantitative categories both as numeric fields (equal, greater than, less than) and as textual fields (contains, does not contain).

3. Sort data
This function sorts the retrieved data according to any field.
The sort can be done in ascending or descending order, according to numerical or alphabetical values.

4. Find text
This function finds a string anywhere in the file, regardless of category.

5. Produce a report
The user decides which fields to put in the report. Reports are produced as ASCII files in the active directory. There are two forms of reports: the Table format facilitates comparison between records, but gives only the first 12 characters of every field; the Label format presents all the data in the chosen fields. A special case is when one wants to see all the data from the retrieved articles, in all non-vacant fields. This option is also available.

CONCLUSIONS

Infodisk has succeeded in preparing a database that shortens the time required for a literature search from days to minutes. The implications of this accomplishment are:

* less time is spent in the library, and more time in the laboratory;
* it is now possible, with only a small investment of time, to search outside one's own specialty, increasing the chances for creative cross-fertilization;
* everybody can use this database: scientists, both senior and junior, students, and information specialists;
* students can be given "real life" exercises, exposing them to real research problems at an earlier stage in their education;
* writing reviews, whether for publication or for internal use, becomes a much easier task.

FOR INFORMATION ON THE ACQUISITION OF THE DATABASE PLEASE CONTACT

INFODISK ltd.
Tel +972-3-931 8669
Fax +972-3-930 1049
P.O. Box 3611
Rosh Ha'ayin, 40800 Israel
e-mail: INFODISK@Zeus.Datasrv.co.il
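
ILLUSTRATIVE SKETCH OF THE SEARCH LOGIC

The two searches described under EXAMPLES OF SEARCHES combine textual tests ('contains') and numeric tests ('greater than', 'less than') with the linking function 'and'. The following is a minimal Python sketch of that logic only. It is not the TF Database program: the helper names (contains, greater_than, less_than, search) are hypothetical, the three records are invented placeholders rather than data from the database, and each numeric field is simplified to a single value instead of a minimum/maximum pair.

```python
from typing import Any, Callable, Dict, List

# A record is modelled as a plain mapping from field name to value.
Record = Dict[str, Any]
Condition = Callable[[Record], bool]


def contains(field: str, text: str) -> Condition:
    """Textual test: the field contains `text` (case-insensitive)."""
    return lambda rec: text.lower() in str(rec.get(field, "")).lower()


def greater_than(field: str, value: float) -> Condition:
    """Numeric test: the field is filled in and greater than `value`."""
    return lambda rec: rec.get(field) is not None and rec[field] > value


def less_than(field: str, value: float) -> Condition:
    """Numeric test: the field is filled in and less than `value`."""
    return lambda rec: rec.get(field) is not None and rec[field] < value


def search(records: List[Record], *conditions: Condition) -> List[Record]:
    """Return the records satisfying every condition (logical 'and')."""
    return [rec for rec in records if all(cond(rec) for cond in conditions)]


# Placeholder records, invented for this sketch only (not real data).
RECORDS: List[Record] = [
    {"Film composition": "film A", "Preparation method": "Plasma-enhanced CVD",
     "Deposition temperature": 250.0, "Growth rate": 0.5,
     "Carrier density": 5e11},
    {"Film composition": "film B", "Preparation method": "Sputtering",
     "Deposition temperature": 200.0, "Growth rate": 2.0,
     "Carrier density": 3e11},
    {"Film composition": "film C", "Preparation method": "CVD",
     "Deposition temperature": 450.0, "Growth rate": 1.5,
     "Carrier density": 2e13},
]

# Example 1: CVD or a related method, deposited between 100 and 300 deg C.
example_1 = search(
    RECORDS,
    contains("Preparation method", "CVD"),
    greater_than("Deposition temperature", 100),
    less_than("Deposition temperature", 300),
)

# Example 2: carrier density of the order of 10^11 per cm^3,
# growth rate greater than 1 nm/sec.
example_2 = search(
    RECORDS,
    greater_than("Carrier density", 1e11),
    less_than("Carrier density", 1e12),
    greater_than("Growth rate", 1),
)

print([rec["Film composition"] for rec in example_1])  # ['film A']
print([rec["Film composition"] for rec in example_2])  # ['film B']
```

Because the 'contains' test matches a substring, a record whose preparation method is entered as "Plasma-enhanced CVD" is retrieved by a search for CVD, which is one plausible reading of "CVD or a related method" in the first example.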