Design, query, and evaluate information retrieval systems
Information retrieval is the cornerstone of what it means to be an information professional. Indeed, it was the idea of helping people find the information they are seeking that ignited my interest in this field, as information professionals have historically served as gateways to locating items of recorded human knowledge. Today, patrons of archives and libraries are more independent in their information quests, navigating electronic Information Retrieval Systems on their own. Yet the goal of the information professional remains the same: to provide access to information. The design and use of these systems are now the defining traits of library and information science professions; therefore it is imperative that practitioners can successfully do this.
Information retrieval systems (IRSs) must be tailored to the informational needs of their users, with the internal structure working seamlessly with the outward-facing interface so that users can have successful search experiences. To enable users to use an IRS independently, the interface must be visually, structurally, and logically sound. User-centered design of these systems, then, is central to providing access.
The successful design process of an information retrieval system begins with identifying its intended users; knowing the audience from the outset allows us to focus on user access throughout the process. This focus guides important decisions about which metadata to include, the construction of thesauri and controlled vocabularies, and the design of the user interface.
Information professionals add value to information objects by assigning metadata to each object's representative record in an IRS. It is the descriptive metadata that provides searchers with results for their text-based queries. Alongside advances in full-text search technologies, subject-specific IRSs continue to maintain controlled vocabularies of indexing terms in order to maintain consistency, and therefore efficiency, within the system. These terms are carefully chosen to meet the needs of the subject at hand and of the IRS's users, and are therefore an important component of the structural development that affects search experiences.
The use of controlled vocabularies and thesauri for retrieving records in an IRS also allows us to use recall and precision as powerful evaluation tools. Recall is the proportion of relevant records retrieved out of all the relevant records in a given IRS. Precision is the proportion of retrieved records in a given search that are actually relevant. A strong IRS exhibits high values for both measures, which makes them excellent tools for evaluating effectiveness: they provide a quantitative way to measure users' search experiences. By extension, recall and precision can also serve to evaluate the indexing and vocabulary choices applied to IRS records.
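The two measures above can be sketched in a few lines of code. This is an illustrative example only, not part of any coursework described here; the record IDs and result sets are hypothetical.

```python
# Recall and precision for a single query, given hypothetical sets of
# record IDs. "relevant" is every relevant record in the IRS; "retrieved"
# is what the search actually returned.

def recall(retrieved, relevant):
    """Fraction of all relevant records that were retrieved."""
    if not relevant:
        return 0.0
    return len(set(retrieved) & set(relevant)) / len(set(relevant))

def precision(retrieved, relevant):
    """Fraction of retrieved records that are relevant."""
    if not retrieved:
        return 0.0
    return len(set(retrieved) & set(relevant)) / len(set(retrieved))

# Hypothetical query: the IRS returns 4 records; 6 records are relevant.
retrieved = ["r1", "r2", "r3", "r4"]
relevant = ["r1", "r2", "r5", "r6", "r7", "r8"]

print(recall(retrieved, relevant))     # 2 of 6 relevant found -> 0.333...
print(precision(retrieved, relevant))  # 2 of 4 retrieved are relevant -> 0.5
```

The example makes the trade-off concrete: a search could achieve perfect recall by returning every record in the IRS, but its precision would then collapse, which is why a strong system must score well on both measures at once.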
The first item of evidence I am presenting illustrates my understanding of precision and recall as tools for evaluating an information retrieval system. In an essay composed for the midterm of my Information Retrieval course, I was asked to summarize definitions of these concepts from scholarly articles in conjunction with van Rijsbergen's (1979) exemplary work. This item of evidence also displays my ability to query an information retrieval system, as the assignment required me to query the preselected Library Literature & Information Science Full Text database to locate the articles on which my summaries are based.
The evaluation of an IRS based on recall and precision values is an extension of an evaluation of the descriptive metadata of its records. My second piece of evidence also supports my ability to evaluate an IRS using index-centric criteria. After assigning my own theoretical indexing terms to a scholarly article in an assignment for my Vocabulary Design course, I queried three separate databases for the same article and critiqued the indexing terms that each had assigned to it.
To demonstrate my competence in IRS design, I am submitting a report summarizing a database design project completed in my Information Retrieval course. This project was completed with a partner, and together we described the needs of the projected users of the database. My role was to identify (and help assign) the metadata and vocabularies necessary to fully represent the items within the database - the items in question are various styles of hats for baseball teams from numerous cities. We created mock records and added them to an Inmagic DB/TextWorks database. Exported lists of the record structures and the full suite of records can be found in the document's Appendices. Our database and accompanying documentation were subject to review by classmates, and this piece of evidence includes our responses to external evaluations and the resulting record-based structural improvements.
The evidence I have presented shows my understanding of design and evaluation criteria for effective electronic information retrieval systems. Fulfilling this competency prepares me for future design of and interaction with IRSs, and will serve as a fundamental reference point as technologies continue to advance and expand the capabilities of modern information search and retrieval.
Van Rijsbergen, C. J. (1979). Information retrieval. London: Butterworths. Retrieved September 12, 2010, from http://www.dcs.gla.ac.uk/Keith/Preface.html