Grant Information
The National Gallery of the Spoken Word

Project Description

The National Gallery of the Spoken Word (NGSW) will create a significant, carefully organized on-line repository of spoken word collections. A collaborative project among the humanities, engineering, and library science, the gallery will provide the first large-scale repository of its kind through the identification and digital preservation of crucial materials in tape libraries throughout the United States. It will pioneer developments in information storage as it creates a recognized set of standards for preservation and access, and constructs sophisticated and integrated search mechanisms. Just as important, the collaborators on this project identify a complex set of opportunities for research, teaching and outreach, because the most significant measure of the value of the project will be the users it attracts. High school teachers, college professors, government officials, journalists and engaged citizens will therefore be crucial collaborators in the creation of the NGSW.

The NGSW will address critical technical problems that remain unsolved in the delivery of high-quality voice materials on the WWW. First, many older analog versions of speech resources suffer from machine noise, copying distortion, background sound and media deterioration. As one of its primary tasks, the NGSW will create a repository of high quality digital versions of key spoken material with standard bibliographic and metadata access. Second, while a number of search techniques work well for text, search techniques for very-large-scale databases do not yet exist for spoken materials. Participants in this project include researchers who are recognized leaders in the development of algorithms for searching using acoustic and linguistic models. Third, in the first attempts to present sound files on the WWW, little attention has been paid to standardization of digitization techniques. This project will create a set of standards for future development of sound on the web, including standards for formatting, sampling procedures, archiving of sound, and the presentation of materials.

The NGSW will help create a history of sound in the age of its virtual reproducibility. By bringing the spoken word across the Internet into living rooms, classrooms, research laboratories, libraries, and government offices, and by delivering the transformative power of language, rhetoric and speech via the Internet, the NGSW has the potential to create a worldwide virtual community. However, the difference between the classical polis and its online equivalent will be the profoundly democratic nature of the online community. This process is already occurring around the world, as is recorded by studies like Rhonda and Michael Hauben's Netizens, but it has been based primarily on writing: e-mail, online chat, bulletin boards. As Internet telephony becomes more feasible and popular, more and more online community-building discourse is likely to take the form of speech. Unless an organized project like the NGSW preserves and publicizes historical speech, however, the emerging online discourse will be deprived of the historical context out of which it emerges. By making sure that all voices are represented, we can promote open and democratic online discourse, with a rich trove of material to support each perspective, rather than permitting information-rich countries, nations, ethnic groups or ideological movements to define online oral and aural history.

Project Significance: Voices on the Internet

At the heart of this project stands a set of insights derived from media theory (McLuhan 1962, Singletary and Stone 1988). Speech incites reaction, while writing admits nothing in return. Reading is fundamentally private; speech is necessarily social and even political. Indeed, the Western rhetorical tradition is founded on the public nature of speech and the physical proximity of audience and speaker. While speech recording and radio broadcasts in the early 20th century virtually diminished the distance between speaker and audience, the Internet has opened a new phase in the history of the reproducibility of speech by translating the spoken word into a digital file that can be heard quickly, anywhere in the world. This technology, however, is still very young.

In order to create a fully communicative online public sphere, the NGSW will focus upon four key themes:

* The development of political culture in the twentieth century

* Popular culture in its historical context

* The dialectic of community and the individual

* International relations and conflicts in the modern era

In addition to these themes, we intend to situate these oral sources in several key contexts. These voices originate in historical, contemporary and personal contexts, and the NGSW will provide interpretive materials to help orient them for the user. In this way the NGSW, like physical museums, will provide both a storage place for the overall collection, and a public exhibit "space" for its most evocative elements. However, unlike a physical museum, our virtual storehouse need never rotate items out of the exhibited collection; we can continue to build the most accessible parts of the exhibits and leave them on display. Moreover, the NGSW avoids such traditional preservation problems as physical deterioration, inaccessibility of fragile materials, unavailability to multiple simultaneous users, and inaccessibility to persons with disabilities.

Michigan State University's (MSU) H-Net has trained scholars from all over the world in the use of the Internet. Its specific training projects, involving teachers and scholars from South Africa, Senegal, Poland, Russia, Portugal and Japan, have developed public discourse through creative online dialogues. Its most ambitious thematic project, the "Pluralism and Unity" resource created for the World Exposition in Lisbon, initiates a public discourse on many dimensions of American democracy, and will serve as the starting point for an international discussion of the same themes. As the partner responsible for coordinating development of the NGSW, H-Net will draw on its experience to develop a broad user base and to engage users as active participants in the community of public discourse. We especially target the following groups of users:

* Educational users: The collections in the NGSW will comprise a valuable resource for educators and students at every level. By creating Web-based materials, lesson plans and exhibits, the project partners will work directly with a range of school systems and individual teachers to provide instruction and technical support for classroom use of the NGSW. Materials from the NGSW could be used under a wide spectrum of educational strategies, ranging from a traditional teacher-centered approach to a more open and exploratory student-centered approach. Students with visual disabilities could especially benefit from the ready availability of sound resources.

* Government and policy-makers: An easily available record of articulate public opinion can be used by policy-makers to help develop programs which take full account of past arguments, including public perceptions. As the NGSW develops collections based on interviews with policy makers, it will provide a growing base of information about crucial and controversial subjects unmediated by historical or other interpretations.

* Broadcast: Clean, broadcast-quality audio files can provide an enormous resource for radio, television and Internet journalists. This could prove to be an important route by which the NGSW reaches a broader public.

* Research: Since not all the material in the NGSW is available in transcript form, researchers in a variety of historical and social science fields will find primary material that is not readily available in any other form. At the same time, as scholars increasingly incorporate multimedia and hypertext in their own work, the voices in the NGSW will be available for inclusion in on-line publications.

* Publishers: As publishers move to bring multimedia materials into online and CD-ROM publications, a resource like the NGSW will prove invaluable to their efforts. With proper management, it is possible that commercial publishers could prove to be sources of revenue and intellectual partners in the NGSW's continuing development.

* Engaged citizens: The Internet would not exist without the enthusiastic support of a much broader public base than the constituencies above. The American public has demonstrated repeatedly that it will engage in and learn from serious cultural and historical debates if these are made properly accessible to them; cultural tourism is booming, for example, and C-SPAN continues to have a devoted group of followers. The NGSW will reach out to all levels of American society, engaging the widest possible range of the members of the democracy.

Project Significance: Technological Innovation

The participation of the Speech Processing Laboratory at MSU (Department of Electrical and Computer Engineering) in this project allows the development of a robust search capability. The great weakness of most sound data collections is the need to use transcripts to navigate at a detailed level. As the NGSW attracts audience and attention, users will benefit from a search capability that will allow identification of key words, concepts and names. In collaboration with the Robust Speech Processing Laboratory at Duke University, we will work to develop solutions to the challenge of enabling keyword, topic, speaker, and language searches of the sound files themselves. This search facility will help make the site much friendlier for users in all of our target categories.

In collaboration with the MSU Library, the NGSW will also develop a set of standards and practices for the preservation and presentation of recorded speech. First, we intend to develop standards of clarity for files containing spoken sound (as opposed to musical preservation and presentation) in cooperation with a range of institutions already working in this area (see below). Second, clarifying copyright issues regarding sound files is necessary. Third, the project will evaluate current software as well as develop new processes for removing machine noise and reducing copying distortion. Fourth, we will continue our commitment to training staff and users in the use and development of these materials. Finally, project staff will engage in active promotion of the site, travelling to conferences and seeking publicity that will bring attention to the quality and range of the NGSW's collections to ensure the broadest possible public participation.

Institutional Context

The NGSW represents an important outgrowth of ongoing educational and research initiatives in progress at each of the partnership institutions. First, participants in this project have demonstrated expertise in the analysis, use, and development of speech materials. The Deller and Hansen laboratories represent more than 35 combined years in speech processing research funded by an array of federal, state, and private agencies including the NSF, ONR, NIH, US Veterans' Administration, the Whitaker Foundation, Ameritech, IBM, AT&T and others. Deller and Hansen co-authored the internationally used textbook Discrete-Time Processing of Speech Signals, which will appear in a new edition in 1999. The baseline development for the NGSW search engines will be a state-of-the-art topic-spotting algorithm developed by Hansen's group at Duke. This development is described below in the section "Search Algorithms and MetaData." Synergistic application of Deller's recent work in system identification adaptive filtering for speech recognition and coding, together with Hansen's longstanding expertise in speech enhancement, will deliver the machinery needed for web-based enhancement tools.

The Chicago Historical Society [CHS] is preserving and cataloging Studs Terkel's vast collection of oral interviews. The majority of the Studs Terkel interview tapes are currently at risk of physically disintegrating because of a manufacturing error in the cassette tapes themselves. The project to preserve these cassettes has three phases. The first, funded by Chicago public radio station WFMT and the Chicago Community Trust, involved the creation of a sound preservation laboratory, transcription of the labels on each reel of tape, and the appointment of Studs Terkel as Distinguished Scholar in Residence at CHS. The second, funded in part by the National Endowment for the Humanities (NEH), involved systematically appraising, prioritizing, arranging, and cataloging the tapes in order to make them available to researchers and the public. A third phase will involve preservation reformatting of the tapes and creation of a cool storage environment for the original tapes.

Jerry Goldman at Northwestern University has pioneered the delivery of historical audio material over the Internet with two award-winning projects, "Oyez, Oyez" and "History and Politics Out Loud." H-Net has received a large grant from the NEH in collaboration with Goldman to expand "History and Politics Out Loud," digitize the 900 original hours of MSU's Vincent Voice Library (VVL), and create web-based audio resources for scholarly and educational use. This project will build upon H-Net's use of digitized voice files and JavaScript animations to develop historical arguments around citizenship and national identity in turn-of-the-century America. This project represents many accumulated hours of practical experience with technical and presentation issues the NGSW will face. H-Net and the CHS are also partnering with the Archives of the African National Congress to develop on-line oral histories of the South African Liberation struggles.

Second, MSU has made a very serious commitment to providing the infrastructure and resources necessary to bring online material into the classroom and the public sphere. MSU was the first large state university to provide email accounts to all its students. Currently, the MSU system supports over 45,000 email accounts and permits each to host webpages. MSU hosts MichNet, an important backbone of the Internet in the Midwest. MSU has consistently supported projects such as H-Net: Humanities and Social Sciences OnLine, the largest organized consortium of scholarly networks in the world. MSU Libraries is actively involved in text digitization efforts with classroom applications. As an acknowledgement of MSU's commitment to pioneering educational technology, the State of Michigan has awarded the university an annual grant of $10,400,000.

Third, many participants have developed related expertise in a number of other critical areas. Therefore, the NGSW will pioneer creation of an extensive testbed whose utility will be largely generalizable. Application and integration of intellectual tools developed under programs such as Phase 1 of the Digital Library Initiative are necessary to realizing the full potential of this research. By bringing together five units at MSU including H-Net, the Speech Processing Laboratory, the MSU Libraries, MSU Museum, and faculty in the College of Education, as well as Northwestern University, and the CHS, the NGSW will draw on the strengths of each of these individual units. Developing this project in a partnership between a premier urban historical society, a public land-grant and two private universities, the NGSW will demonstrate how these techniques and research methods can be applied and used across a range of institutions of diverse size and mission, as well as for a broad constituency of academic and public individuals. Collaboration with the Linguistic Data Consortium (LDC) at the University of Pennsylvania and Duke University's Robust Speech Processing Laboratory (RSPL) will contribute additional digitization, archiving, production, distribution, and database management expertise. Applying these materials in a range of educational levels and settings serves as an important testing ground for the utility of these applications in classrooms across the nation.

Project Partners

H-Net: Humanities and Social Sciences OnLine

H-Net will coordinate project efforts, construct the web-based user-interface, and design and implement evaluation metrics. While its technical and institutional hub resides at MSU, H-Net is an interdisciplinary and worldwide consortium of scholars and teachers. A diverse organization engaged in many disparate and wide-ranging projects, H-Net has a longstanding commitment to maintaining high scholarly and pedagogical standards while increasing access and availability of resources to international scholars and students. The energizing force behind H-Net is a shared desire to develop new opportunities for scholarship and teaching that stem from the rapid technological development in computers and electronic communication.

With more than 94 networks, which reach over 90 countries, H-Net is the largest distributor of electronic discussion lists in the world, and hosts one of the most extensive websites in the humanities and social sciences. Serving both academic and public audiences, H-Net is currently involved in a number of cooperative endeavors to digitize and make widely available archival materials, artwork, artifacts, oral histories, and music. Creation of online databases, exhibitions, photo archives, educational outreach programs, and research tools is an important part of H-Net's work. Central to these endeavors is a commitment to build unique, active, cordial and enduring cooperation and dialogue among scholars and teachers around the globe. It is a challenge to integrate resources into partnering institutions and to overcome the physical limitations of individual collections. Yet such partnerships can be an effective way to serve local needs, while also developing collaborative endeavors. H-Net's partnerships with the CHS, American Historical Association, American Political Science Association, and Oral History Association will contribute expertise and an extensive audience for the NGSW.

The NGSW will be housed on servers at MSU. From MSU, H-Net will create and manage the web-based interface as well as facilitate development of educational materials and online exhibits. Because it is such an interdisciplinary organization, which reaches both academic and public audiences, H-Net's outreach endeavors will involve the largest possible number of users. Through its discussion networks and website, H-Net will ensure that the NGSW is highly publicized. H-Net will also play a role in evaluation of the NGSW and related resources.

Engineering Collaborators and Facilities

Speech Processing Laboratory: Michigan State University: The Speech Processing Laboratory at MSU (SPL) is a cognate lab in the Signal Processing Laboratories Consortium in the Department of Electrical and Computer Engineering at MSU and the primary site for the fundamental speech-processing research undertaken in this project at MSU. Much of the major equipment necessary for this work is already in place in this laboratory and in the general computing environment in the College of Engineering. Some of the "local" facilities in the SPL include an array of networked (Windows NT) Pentium-II-based personal computers, a SUN Sparc 10/30 workstation (Unix), and extensive software support for algorithm development and signal acquisition and preprocessing. In addition to facilities for signal processing, support facilities for graphics, word-processing, and the like are also in place.

Robust Speech Processing Laboratory at Duke University: The Robust Speech Processing Laboratory (RSPL) in the Department of Electrical Engineering has been actively involved in sponsored research for ten years. Most of the necessary computing and data acquisition equipment is in place at the RSPL. Equipment and facilities available in RSPL include a network of 7 SUN SPARCStations with 16-bit linear digital audio I/O (SS-10/402, SS-5's, SS-4's, LX's), 17 GigaBytes of disk storage, a 600Mbyte optical read/write disk system, a dual-channel 48kHz Gradient 16-bit stereo audio I/O system with dual telephone interface, a 48kHz DAT-Link 16-bit stereo audio I/O system, a collection of X-window workstations and IBM PS-2 PC's with real-time digital signal processing hardware, three Sony Digital Audio Tape Decks/units (DTC-700, DTC-3, DTC-7), Shure SM10A head mounted close talking microphones, a Kenwood KA-88 integrated amplifier and other audio equipment, and a 130 square foot sound resist anti-audio chamber. Signal processing analysis packages such as Matlab are available, along with a site license for the Entropic Systems Waves-ESPS/HTK speech analysis packages. RSPL has an extensive library of speech data (5 GigaBytes), and is also a member of the LDC.

Linguistic Data Consortium: The LDC is a consortium of universities, companies and government research laboratories. It creates, collects and distributes speech and text databases, lexicons, and other resources for research and development purposes. The University of Pennsylvania is the LDC's host institution. The LDC was founded in 1992 with a grant from the Advanced Research Projects Agency (ARPA), and is partly supported by a grant from the Information and Intelligent Systems division of the National Science Foundation. The LDC will derive a subcollection from the NGSW for use in its speech-processing research collection.

InvoTek, Inc.: Tom Jakobs, President of InvoTek, Inc., is a licensed Professional Engineer in the state of Arkansas and has been developing devices to assist people with disabilities for the past 10 years. Over this period he has been responsible for the design of 9 products and numerous custom designs for individuals with severe disabilities, including vision aids. InvoTek will consult with the speech laboratories at MSU and Duke University to ensure that interfaces to the NGSW are friendly to persons with disabilities.

Collections and Digitization Sites

Michigan State University Libraries: The VVL is a unit within the MSU Libraries. It houses over 50,000 audio recordings from the dawn of recording to the present. Its holdings include rare recordings from Edison cylinders, wire recorders, and unique dictation devices. The MSU audio library is named after G. Robert Vincent, inventor of the V-Disc in World War II, and chief sound engineer at the Nuremberg Trials, who founded the VVL when he came to MSU in 1961. The MSU Library has initiated a digitization program for the original 900 hours of the VVL's holdings in partnership with H-Net, and will digitize the NGSW materials. Through this collaboration, the MSU Library has demonstrated its ability to:

* Select appropriate voice material for this project;

* Use the latest technology to enhance the audio quality and comprehension of the voice material, by reducing machine noise and distortions from copying;

* Efficiently transfer these often-fragile materials from audiotape to digital form.

The MSU Libraries cataloging department has extensive experience in the creation of bibliographic records. The Library has been creating and contributing MARC records to OCLC since 1975. Approximately 25,000 Voice Library titles in analog format have been cataloged by the MSU Libraries and are accessible in the online catalog. In the last several years the Library has been actively cataloging electronic resources, including a partnership role in a CIC cooperative cataloging project to provide bibliographic access to ARTFL, a digital archive of classics in French language and literature. The Library holds an elected seat on the Policy Committee (Executive Board) of the Program for Cooperative Cataloging (PCC), an international cooperative cataloging effort housed at the Library of Congress, and is a participant in the PCC's NACO project for name authorities.

The MSU Library holds over four million volumes, which makes it the 25th largest collection in the Association of Research Libraries - larger than the libraries of some Ivy League universities. It serves not only the 42,000 students at MSU, but also every citizen of Michigan through its community borrower program, and has active outreach programs to school districts. The library also has an active text-digitization program, which works closely with the curriculum needs of undergraduate writing classes. Through its Digital Sources Center the library supports a wide range of projects including SGML, GIS, text and recorded speech digitization. The Library also serves as a source of copyright information for the university community.

Michigan State University Museum: Founded in 1857, the MSU Museum was one of the first museums on a college campus in the Midwest, and is one of the oldest university museums in the country. Research is supported by grants from a variety of public and private sources, as well as such federal and state agencies as the NEH, Michigan Council for the Arts, Michigan Humanities Council, NSF, Army Corps of Engineers, National Geographic Society, National Park Service, Institute of Museum Services, and the Smithsonian Institution.

In addition to serving as a research center for extensive public archeology throughout the Great Lakes region, agricultural history, vertebrate paleontology, and vertebrate biology, the MSU Museum is the state center for research, preservation, and documentation of Michigan and Midwest traditions such as quilting, rag-rug making, decoy carving, Mexican-American music, foodways, African-American gospel music, inland river boat building, and Great Lakes maritime traditions. Staff folklife researchers have developed a national model for identifying and assessing traditional cultural resources. Part of the Michigan Traditional Arts Research Collection includes over 300 hours of oral history interviews with Native-American and Mexican-American folk artists, quilters and musicians from throughout the Midwest. Interview topics range from life histories of migrant fieldworkers, to creation stories and local histories. All of these materials are catalogued. Many of the tapes include accompanying patterns, quilts, photographs and other physical artifacts. The majority of the interviews with Native-American folk artists also have been transcribed.

Chicago Historical Society: The nation's premier historical society, the CHS is a privately endowed, independent institution devoted to collecting, interpreting, and presenting to the public the rich multicultural history of Chicago and Illinois, as well as selected areas of American history, through exhibitions, programs, research collections, and publications. The CHS is committed to using its resources for research and education. In addition to a number of educational initiatives and training sessions, CHS has secured title to the Studs Terkel tapes and the intellectual property they encode. CHS also holds a collection of audio and videotaped histories conducted by high school students in Chicago Neighborhoods.

The Terkel collection includes 9,000 hours of interviews done by Studs Terkel since 1954 on WFMT radio in Chicago. Terkel interviewed broadly in politics and the arts. A complete listing runs for pages but includes anyone of significance in American life and culture who visited Chicago while Terkel's program ran: from Bertrand Russell to Abbie Hoffman, from Bob Dylan to Luciano Pavarotti. The collection also includes all the interviews Terkel has done over many years for his books. The collection is especially strong in the arts, including musicians, painters, sculptors, novelists, essayists, poets and artists of all kinds. The collection presents an almost encyclopedic view of the arts in the United States in the last 45 years and, accordingly, is an important research resource. The NGSW will provide the widest possible access to these tapes, as well as the opportunity for electronic public history presentations bearing upon American history and culture in the 20th century.

The Chicago Neighborhoods material includes about 50 hours of interviews in each of four Chicago neighborhoods (Douglas/Grand Boulevard, Rogers Park/West Ridge, Pilsen/Little Village, and Near West Side/East Garfield Park). High school students trained by CHS conducted most of the interviews. CHS asked them to interview their families and also to find a neighborhood elder who could talk about what the area "used to be like." The interviews are all with ordinary people talking about their lives and the places where they live. They are moving and touching - and also funny in very unexpected ways.

CHS works with approximately 5,000 teachers each year from Chicago and suburban schools. About 50,000 elementary school students visit CHS each year. CHS runs a regular series of workshops for teachers. These workshops feature the use of the CHS website materials and occur both in Chicago schools and at the CHS facilities. CHS has a special relationship with the schools in each of four neighborhoods cited above, and has developed online and print curricular materials on the history of each neighborhood. CHS also has a specially-funded project with a group of west-side schools called "History Explorers" and Douglas Greenberg has a personal mentoring relationship with the administration, faculty and students of the Paul Cuffee School on Chicago's south side.

Northwestern University: Northwestern University houses the "Oyez! Oyez! Oyez!" multimedia relational database devoted to United States Supreme Court and constitutional law, created and run by Jerry Goldman. Supported by grants from NEH and NSF (9602170), the Oyez project has become an authoritative resource for scholars, students, and professionals across a range of disciplines. The Oyez audio archive holds more than 500 hours of oral arguments; this number will double in the next two years. When complete, the audio archives will provide an accessible portal to the great constitutional controversies of the last half of the twentieth century.

While the authority and authenticity of the audio database have proved popular to tens of thousands, our preliminary assessment reveals that the database has also proved frustrating for many scholars and students. In a sense, this is not surprising. The arguments are lengthy, ranging from one to three hours. Digesting these materials is difficult work. In addition, the arguments are not syllogistic exercises, neatly laid out in Socratic fashion. They are expositions, with frequent interruptions from the justices who may move from topic to topic as their concerns direct them. Finding coherence in such material is daunting to experienced and dedicated scholars and professionals. It can be overwhelming to the uninitiated.

Recent advances in streaming technologies offer the promise of surmounting audio's inherent linearity. Synchronized Multimedia Integration Language (SMIL) is a proposed specification of the WWW Consortium (W3C) (Lohr 1998). SMIL is a powerful way of linking text, images, and audio over the WWW. The Oyez project is currently using SMIL to mark up argument transcripts to synchronize audio and still images of the speakers with the displayed text of the argument. This requires the use of transcripts stored in the Supreme Court Library. They require scanning and verifying against the audio source materials. In addition, voices must be identified and all text must be tagged and time-coded. Once synchronized, however, it will be possible to search the database using standard Boolean logic and return multimedia replies. We maintain that such capabilities will break the linearity barrier that today makes audio materials on the WWW both compelling and frustrating.
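To make the idea of Boolean search over a time-coded transcript concrete, the following is a minimal Python sketch and not the Oyez project's implementation. It assumes each transcript segment has already been tagged with a speaker and with start and end times in seconds. The Segment record, the function names, and the sample text are all hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One time-coded, speaker-tagged stretch of transcript (hypothetical layout)."""
    speaker: str
    start: float  # seconds from the beginning of the audio file
    end: float
    text: str


def tokenize(text):
    """Lower-case the text and return its set of distinct words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def build_index(segments):
    """Map each word to the set of segment positions that contain it."""
    index = {}
    for position, segment in enumerate(segments):
        for word in tokenize(segment.text):
            index.setdefault(word, set()).add(position)
    return index


def search(segments, index, all_of=(), any_of=(), none_of=()):
    """Boolean query: AND over all_of, OR over any_of, NOT over none_of."""
    hits = set(range(len(segments)))
    for word in all_of:
        hits &= index.get(word.lower(), set())
    if any_of:
        union = set()
        for word in any_of:
            union |= index.get(word.lower(), set())
        hits &= union
    for word in none_of:
        hits -= index.get(word.lower(), set())
    # Return matches in playback order so the audio can be cued at segment.start.
    return [segments[position] for position in sorted(hits)]


# Invented sample data, not drawn from any real argument.
SEGMENTS = [
    Segment("Counsel", 0.0, 12.5, "The statute restricts speech in public schools"),
    Segment("Justice", 12.5, 20.0, "Does the statute apply to private schools"),
    Segment("Counsel", 20.0, 31.0, "No, the record shows only public funding"),
]
INDEX = build_index(SEGMENTS)
```

For example, search(SEGMENTS, INDEX, all_of=["statute", "schools"], none_of=["private"]) returns only the first segment, whose start time of 0.0 seconds could then be used to cue the synchronized audio and text.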

SMIL offers the promise of a real advance in the ability to stream multimedia in a useful (i.e. searchable) format on the WWW. The Oyez project has a steady core of users who could easily be called upon to test interactions of a SMIL database. With more than 80,000 page views a month, the Oyez project can attract and hold the interest of a diverse audience. An empirical test of SMIL is the logical next step, and the Oyez project is a good place to begin. Northwestern University offers a state-of-the-technology network and an established infrastructure to distribute substantial multimedia collections. Previous grant support has provided the hardware, licensing and content base to test the robustness of various SMIL implementations.

Educational Testbeds and Partnership School Districts

Integrating the NGSW resources into a range of K-12 school districts, as well as college and university classrooms across the United States, is an important part of this project. Faculty and staff in MSU's College of Education will oversee the K-12 initiatives, while H-Net will coordinate the university outreach. The partnership school districts span a range of communities, from urban and suburban districts that face a serious lack of social and economic resources, to poor rural districts, to one district with a wealth of resources. These districts were chosen both because of needs perceived by the districts themselves and because of the enthusiasm of individual teachers and superintendents. Each of these districts also has strong and long-standing ties to the collection sites and other partnership institutions. Through H-Net's networks and college outreach programs, international faculty will also use the NGSW's collections in their lectures and discussion sections across a broad range of disciplines. NGSW resources will be integrated into classrooms at MSU, Northwestern, and Duke University as well.

Beginning in the second year of the project, the College of Education at MSU will conduct training sessions and summer workshops for teachers from three targeted school districts in Michigan. The MSU College of Education will provide follow-up and technical support for the districts through the university's 24-hour helpline, coordination with district staff, and site visits. The College's Departments of Teacher Education and Educational Technology have international reputations as leaders in the education of teachers. The vision of these programs parallels MSU's basic land-grant philosophy, which is rooted in a strong commitment to contribute positively to the challenges facing resource-poor rural and urban communities. Workshops described in the section on CHS will train teachers from the Chicago area in the use of web technologies. Members of CHS, including Greenberg, also work directly in the city's school system through mentoring relationships with administrators, faculty, and students.

Targeted school districts include public school districts such as Baldwin, Benton Harbor, and Oak Park in Michigan, and the Paul Cuffee School in Chicago. The Baldwin Community School district is located in a small, rural community of approximately 6,000 persons in upper Michigan. 37% of the students are African-American, and 31% of the students live below the poverty margin. In spite of its size and location, Baldwin has a well-developed technology program, provides regular hands-on technology training for its teachers, and employs a teacher to manage technology in the district. A newly-hired superintendent has also taken over management of the system, and has implemented a five-year plan designed to engage parents, community businesses, and organizations in collaborative efforts to improve academic performance and prepare students for post-secondary work. The NGSW will be a fundamental part of this plan.

Benton Harbor and the Paul Cuffee School are both urban districts. Benton Harbor is a small, urban town of 36,000 persons situated along the St. Joseph River on the western coast of Michigan. Once a prosperous fruit-growing region, Benton Harbor was also home to foundries and plants for automobile parts, the Heath Company and Whirlpool. The loss of these foundries and auto plants in the 1960s and 1970s seriously undermined the local economy. 84.3% of the 6,400 students in the Benton Harbor School District currently qualify for free and reduced lunches, well above the state average. During the last two years, however, community stakeholders concerned with the education, health, and social welfare of youth in Benton Harbor have committed themselves to long-term partnerships with the school district to improve educational resources. The NGSW fits squarely with the aims of these community businesses and parents, and will be an important part of these initiatives.

The Paul Cuffee School is also located in an economically poor, urban neighborhood. However, the CHS has played a large role in revitalizing the school by providing a wealth of resources. Through local donors and commitment of the district, the school has acquired computer technology to enhance educational resources. Greenberg has personally been involved in CHS's mentoring program and has worked extensively in the Cuffee School. NGSW is a natural extension of this work, and will provide further resources to the Cuffee classrooms.

Oak Park is another targeted district. Located in the Detroit area, Oak Park is an ethnically diverse and economically stratified community. 40% of students in Oak Park's public school system are eligible for the free and reduced lunch program. The district has recently implemented a six-year strategic planning process to effect substantive improvement in student achievement. A professional development committee, with representation from the administration, classroom teachers, community, and parents, is focusing on improving student achievement and establishing parameters for measuring success. The NGSW will continue to improve access to classroom resources, and to set new measures for success in the districts. By integrating these multimedia resources into lesson plans and student research projects, the NGSW can help to transform the students' learning experiences, while continuing to build on community initiatives and goals.

Discussion of Key Aspects of Project

Preservation and Access: Preservation has at least three meanings for librarians and archivists (Conway 1996), and is central to preserving the quality, longevity, integrity and accessibility of the digital data contained in the NGSW. First, preservation makes valuable resources available. With appropriate standards setting, digital conversion can be one of the most cost-effective and viable means of preserving deteriorating audiotapes. Second, digital conversion can be used to create a high-quality copy of an item, thus protecting the original. By obviating physical handling of audiotapes, digitization prevents further deterioration. Third, protecting the data stream from corruption or destruction through careful choice of a storage medium is necessary to ensure a long life expectancy of a digital audio system. The NGSW meets all these preservation needs, as well as allowing use of these digitized resources by the widest possible audience.

Longevity of the collections will be ensured by storing the sound data in standardized formats to facilitate the ability to migrate data, indexes and software to future technologies. The NGSW collaborators will also remain intimately involved in the creation of those access systems (Comm. on Preservation and Access 1995). Quality, defined in this context as the usefulness and usability of systems, is conditioned significantly by the limitations of capture, storage, and replay technology. Digital conversion places less emphasis on obtaining a faithful reproduction of the original in favor of finding the best representation of the original in digital form. By developing mechanisms and techniques for judging quality of digital audio reproductions, the NGSW will make it possible to capture and preserve as much intellectual and aural content as is technically possible and then make that content available to listeners in ways that are most appropriate to their needs.

Setting standards for digitization of speech files is an essential component of preservation (see below). We will adhere to sampling frequency and resolution standards (16 kHz/16 bit) that faithfully preserve acoustic content and endeavor to develop digital enhancement and filtering techniques that improve perceptual measures while minimally affecting authenticity. Another important task in the creation of a massive acoustic database is the specification of lossless compression routines that make efficient use of available channel bandwidth. Indexing and careful authentication procedures to make sure files are not altered intentionally or accidentally (Lynch 1995, Wiebel 1995) will be important steps toward ensuring the physical integrity of the digital audio files. Developing metadata interchange standards for audio files, including tools and techniques that will allow structured, documented, and standardized information about data files and databases to be shared across platforms, systems, and international boundaries, is a central objective of the NGSW. Finally, these steps will assure the widest possible access to these historically significant resources. The NGSW will develop access systems that take advantage of the multiple interfaces possible on the WWW. The collaborators will also encourage vendors to provide open system architectures for audio digitization and encourage backward compatibility in new system designs.
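One common way to check that a file has not been altered, intentionally or accidentally, is to record a cryptographic digest when the file is ingested and recompute it later. The sketch below illustrates that idea only. The choice of SHA-256 and the function names are assumptions made for illustration; the proposal does not name a specific authentication procedure.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-256 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unaltered(path, recorded_digest):
    """Return True if the file's current digest matches the digest recorded at ingest."""
    return sha256_of_file(path) == recorded_digest
```

Reading in chunks keeps memory use constant, which matters for long recordings. The recorded digest would need to be stored separately from the audio file, for example alongside its catalog record, for the comparison to be meaningful.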

Digitization and Search Engine: The digitization efforts proposed for the NGSW will involve several of the nation's largest physical repositories of historically-significant spoken word collections which, because of physical problems of preservation and access, have previously been accessible to only a small number of individuals. Digitizing these speeches, oral histories, and newscasts will make these voices available to a wide public audience for the first time. At the same time, this effort will preserve these recordings for future generations. Many of the collections are currently decaying and in danger of being lost altogether. MSU's VVL and Museum, and the CHS house some of the nation's most significant speech holdings. By digitizing approximately 18,000 hours of these important collections, this project will serve as a model for preservation activity at other universities, at historical societies, and for private collections.
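To give a sense of scale, the uncompressed size of roughly 18,000 hours of audio at the 16 kHz/16-bit standard stated earlier can be estimated with simple arithmetic. The calculation below is a back-of-envelope sketch that assumes single-channel (mono) recordings and ignores file headers and metadata; neither assumption comes from the proposal.

```python
SAMPLE_RATE_HZ = 16_000   # from the stated 16 kHz sampling standard
BYTES_PER_SAMPLE = 2      # 16-bit resolution
CHANNELS = 1              # assumption: mono recordings
HOURS = 18_000            # approximate figure given for the collections


def uncompressed_bytes(hours, rate=SAMPLE_RATE_HZ, width=BYTES_PER_SAMPLE,
                       channels=CHANNELS):
    """Raw PCM size in bytes for the given number of hours of audio."""
    return int(hours * 3600 * rate * width * channels)


per_hour = uncompressed_bytes(1)
total = uncompressed_bytes(HOURS)
print(f"{per_hour / 1e6:.1f} MB per hour")      # prints "115.2 MB per hour"
print(f"{total / 1e12:.2f} TB uncompressed")    # prints "2.07 TB uncompressed"
```

Under these assumptions the collection comes to about two terabytes before any compression, which is why the choice of compression routines and storage media figures so prominently in the plan.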

Currently several speech digitization standards are in use on the Internet, yet no archival standards have been adopted for speech files. In order to ensure the widest possible access to these materials, development of such standards is key. MSU will work with the LDC to develop speech digitization standards and preserve these holdings. Because the LDC is an open consortium of universities, companies and government research laboratories which creates, collects and distributes speech and text databases, lexicons and other resources for research and development purposes, its guidance and partnership will contribute an important research and knowledge base to this endeavor. Based on H-Net's successful collaborative model of decentralized control, digitization will be conducted at the individual collection sites in consultation with the LDC and the MSU and Duke speech processing laboratories. This process will assure that collection curators retain control over their individual collections while ensuring uniformity of digitization standards and protocols.

Extensive consultation between engineers at MSU's SPL, Duke University's RSPL, and the LDC will also address methods used to compress speech data for efficient storage. In collaboration with the individual collection sites, the project partners will preserve and digitize the recordings, and construct search and retrieval mechanisms. Balancing preservation concerns with the ability to provide quick access to acoustic features for search will be a central aim of this project.

Each of the project partners will strive to preserve the original quality of recordings while also allowing the materials to be efficiently searched. Algorithms will be developed to enhance noisy and degraded recordings. This will improve listening quality, intelligibility and authenticity of the recordings, while also allowing each user of the database to adjust the enhancement to suit his or her particular needs. Ultimately, this project will produce web-based equalization, noise reduction, and enhancement software that can be used by the researcher, educator, student or general public to adjust the acoustics to optimize important perceptual features.

Search Algorithms and Metadata: This project will result in construction of a search engine, or set of search engines, to find speech citations in response to four classes of keyboard inquiries:

* Inquiry Type 1: Speaker and subject are known (and entered by the user in response to initial interrogations).

* Inquiry Type 2: Speaker and approximate wording are known (entered). The intention of the search is to locate the precise wording of a particular quotation.

* Inquiry Type 3: Identify speaker of given (or approximately given) quotation (entered).

* Inquiry Type 4: Find all speeches on a given topic (topic entered).

The SPL at MSU and RSPL at Duke will work collaboratively on this problem and will incorporate results developed by LDC for the editing process. The basis for the search engine(s) will be an efficient search algorithm for topic identification developed by Hansen at the RSPL. The topic-spotting system developed by the RSPL is based on context-dependent, continuous density hidden Markov models (HMMs) (Pellom and Hansen 1997, 1998). The user specifies a set of text-based keywords for a topic search. The spotter automatically extracts the phonetic pronunciation for each keyword from a 120,000-word phonetic dictionary developed at Carnegie Mellon University. If the keyword does not exist in the dictionary, a set of letter-to-sound rules is used to approximate the phonetic transcription. Keywords are then modeled using quasi-triphonic HMMs. Non-keyword speech is modeled using context-independent HMMs. For pre-recorded data, the topic spotter can process data approximately six times faster than real time (24 hours of audio in 4 hours of computing). It can handle arbitrarily long files and has been used to scan as much as 24 hours of data (from CNN Headline News) at a time.
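The keyword preparation step described above, a dictionary lookup with a rule-based fallback, can be sketched as follows. This is an illustration of the general pattern, not the RSPL system: the two dictionary entries are invented, the phone symbols are illustrative, and the one-phone-per-letter table is a deliberately crude stand-in for real letter-to-sound rules. The HMM modeling that follows this step is not shown.

```python
# Toy pronunciation dictionary (invented entries; phone symbols are illustrative).
PRONUNCIATIONS = {
    "vietnam": ["V", "IY", "EH", "T", "N", "AA", "M"],
    "court": ["K", "AO", "R", "T"],
}

# Deliberately naive fallback: one phone per letter. This is an assumption made
# for illustration and is far cruder than real letter-to-sound rules.
LETTER_TO_PHONE = {
    "a": "AE", "b": "B", "c": "K", "d": "D", "e": "EH", "f": "F", "g": "G",
    "h": "HH", "i": "IH", "j": "JH", "k": "K", "l": "L", "m": "M", "n": "N",
    "o": "AA", "p": "P", "q": "K", "r": "R", "s": "S", "t": "T", "u": "AH",
    "v": "V", "w": "W", "x": "K", "y": "Y", "z": "Z",
}


def phones_for(keyword):
    """Return (phones, source): the dictionary entry if present, else a rule-based guess."""
    word = keyword.lower()
    if word in PRONUNCIATIONS:
        return PRONUNCIATIONS[word], "dictionary"
    # Fallback path: approximate the pronunciation letter by letter.
    guess = [LETTER_TO_PHONE[ch] for ch in word if ch in LETTER_TO_PHONE]
    return guess, "letter-to-sound"


for keyword in ["Court", "Oyez"]:
    phones, source = phones_for(keyword)
    print(f"{keyword}: {' '.join(phones)} ({source})")
```

Returning the source alongside the phones lets a caller flag keywords whose pronunciation was only approximated, since those are the ones most likely to produce missed or false detections.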

The digitized data will contain a wide range of recording conditions, microphone variation, background/telephone/channel distortions, and distortions due to age and condition of the analog media. The RSPL topic search engine has processing phases to address a range of these issues. While scanning an input audio stream, the keyword spotter can classify the incoming data as (male/female), (music/noise), and (high quality/telephone quality). The (music/noise) model was trained from material captured from NPR radio broadcasts. In addition, for tasks requiring speaker identification, the group at RSPL has recently developed a speaker identification method based on non-uniform feature sampling which achieves the same performance (99.3%) at a rate 23 times faster (Pellom and Hansen 1997) than a standard Gaussian mixture model approach by MIT Lincoln Lab (Reynolds and Rose 1995).

Starting with Hansen's search engine, a major challenge will be to develop methods for adaptation and enhancement of the existing acoustic and language models to perform satisfactorily on various partitions of the VVL and other databases. Much work will need to go into this partitioning in conjunction with the methods used for search. This work will begin immediately upon the availability of a small database of digitized speech in the early stages of the sampling process, and will continue for most of the project duration to incorporate new data, results, archive structures, and methods for adaptation as these result from cognate research. The other main "search" issue is how to incorporate the user-supplied information (speaker, dates, etc.) into the search and, more importantly, to find fast, hierarchical search methods to permit as much searching in near-real-time as possible. Some current work in Deller's lab on fast HMM evaluation could be very useful for speeding up the search for appropriate utterances (Deller and Snider 1993, Lee and Deller 1998). In addition, it will be necessary to maintain a dictionary of available speakers for search, and to allow for fast re-training methods for new speech data as they become available (Arslan and Hansen 1998, 1999).

Incorporating metatags and other identifiers will play a role in making search possible across an expanding collection of sound resources spanning a broad range of original recording type, age, and subject matter (Heery 1996). Several levels of metadata will also be used to help users identify speeches based on content, or audio-oriented data which will provide information about the sound itself. Through a system of relational databases which will balance the bibliographic and audio-oriented expertise of the partners, the NGSW will develop and apply the first system of audio-oriented metadata while optimizing the effectiveness of speech search tools (De Rose 1995). Searching mechanisms for audio files must provide access based not only on author, title, edition, imprint, subjects, etc., but also on characteristics of the sound itself. The ability to search large bodies of audio information for such ephemeral qualities as the speakers' accent could reveal relationships as yet unnoticed or unremarked, and possibly open up new areas of scholarship. In addition, markup that identifies relevant content and structure would facilitate such a discovery process.

Although there is no clear agreement in the library and archival communities about how audio files should be encoded, SGML is the standard coding system for text and offers many benefits for sound files. SGML imposes no fixed set of component types, and is a public, non-proprietary standard, to which software vendors conform. Such encoding will require some extra effort, but careful selection of syntax and conventions will make the encoding task manageable (Herwijnen 1986). In addition to SGML tagging, traditional library catalog records and metadata will be used to ensure a workable access system for users of the NGSW from the beginning of the project. MSU Library staff will create MARC bibliographic records that will be accessible in the Library's online catalog and on OCLC, an international bibliographic database. These records will be converted using one of several existing programs from MARC to Dublin Core metadata records that will provide bibliographic access via the WWW. Both MARC and Dublin Core are international standards that assure consistent bibliographic record structures for effective retrieval (USMARC 1994; OCLC 1997). Access points on both MARC and Dublin Core records will include authorized forms of speakers and keyword access to key subject concepts. Name authority standards will be applied to assure that consistent forms of speakers' names are used as access points. These bibliographic records will allow the user to define or narrow a search to a particular topic or speaker through a textual approach.
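The conversion from MARC to Dublin Core amounts to a crosswalk from numbered MARC fields to named Dublin Core elements. The sketch below shows the general shape of such a mapping. It is not one of the existing conversion programs mentioned above: the four-tag subset, the function name, and the sample record are assumptions chosen for illustration, and subfields and indicators are ignored entirely.

```python
# Simplified crosswalk from MARC tags to Dublin Core elements.
# Assumed subset for illustration; a real crosswalk covers many more fields.
MARC_TO_DC = {
    "100": "creator",      # main entry, personal name (here, the speaker)
    "245": "title",        # title statement
    "520": "description",  # summary note
    "650": "subject",      # topical subject heading
}


def marc_to_dublin_core(marc_fields):
    """Convert a list of (tag, value) pairs into a dict of Dublin Core value lists."""
    record = {}
    for tag, value in marc_fields:
        element = MARC_TO_DC.get(tag)
        if element is None:
            continue  # tags outside the assumed subset are skipped
        record.setdefault(element, []).append(value)
    return record


# Invented sample record, for illustration only.
sample = [
    ("100", "Roosevelt, Theodore, 1858-1919"),
    ("245", "Campaign speech"),
    ("650", "Presidents -- United States"),
    ("650", "Political oratory"),
    ("999", "local processing note"),
]

for element, values in marc_to_dublin_core(sample).items():
    print(element, values)
```

Because Dublin Core elements are repeatable, each element holds a list, so a recording with several subject headings keeps all of them as access points.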

The LDC currently has automated procedures to facilitate the task of developing the audio-oriented metadata, while the SPL and RSPL laboratories at MSU and Duke University will work collaboratively to provide quality assessment (Deller, Proakis, and Hansen 1993) and user-friendly interfaces for adjusting perceptual quality. Use of a standard relational database management system will facilitate this coordination of efforts. MSU will operate the servers and the database management system, and perform standard operational tasks including tape backups. Through careful design of metadata and searching techniques, the NGSW will begin to answer some of the questions most challenging to electrical and computer engineering as well as library and information science. Our paramount concern in this project is making this digital information available to future generations. Metadata, digitization standards, and carefully designed search systems will help ensure the longevity and data quality of these digital documents (Rothenberg 1997; Comm. on Pres. and Access 1996).

Enhancement, Restoration and Robust Search Mechanisms: Development of algorithms for enhancement of noisy and degraded recordings will also be a central objective of this project. This will improve the listening quality, intelligibility, and authenticity of the recordings. Yet these features are not necessarily improved concurrently. Tradeoffs will exist in any attempt to improve one of these perceptual features. Each user of the database may wish to adjust the enhancement to suit a particular need. A historian or archivist may view particular types of background noise as important context for a given subject matter, while a linguist or engineer may be more interested in a specific speech or recording feature. One objective of this research is to develop web-based equalization, noise-reduction, and enhancement software that can be used by the researcher, student or general inquirer to adjust the acoustics to optimize important perceptual features. This will further facilitate use of the NGSW materials to suit a range of multi-disciplinary purposes.

The MSU and Duke speech processing laboratories will work collaboratively on this issue. Hansen's group at Duke has been responsible for formulating a number of effective speech enhancement algorithms based on constrained iterative spectral constraints (Auto-LSP) (Hansen and Clements 1991), auditory constrained speech enhancement algorithms (ACE-I [Nandkumar and Hansen 1995], ACE-II [Hansen and Nandkumar 1995]), and morphological based constraints (MCE) (Hansen 1994). In addition, many of the traditional and more recent speech enhancement algorithms developed by other researchers are also available for system integration. One novel approach is a text-based speech enhancement method where knowledge of the phone sequence is used to formulate a dependent enhancement method for the requested audio stream (Hansen and Pellom 1997). Research will be needed to formulate acoustic background classifiers that prescribe which enhancement method would be most effective for a given distortion. Pre-defined enhancement configurations will be made available for users wishing to select a preferred enhancement method, as well as suggestions for settings that yield the most "authentic" sounding reproduction with some optimal degree of noise suppression.

Some novel research that combines Hansen's expertise on quality assessment (Hansen and Arslan 1995, Hansen and Nandkumar 1995) with Deller's work on set-membership identification (Deller 1996) will be explored to develop a user-friendly interface for adjusting perceptual quality. Work on this issue will begin in Year 2, following collection of a suitable database.

Compression, Encryption, and Copyright Protection: All academic communities inherently contain differing and often conflicting perspectives on intellectual property issues. As producers of intellectual property, university presses and faculty are concerned with preserving copyright protection; as consumers of intellectual property, university libraries and faculty are more concerned with issues of "fair use"; while instructional design groups are both producers and consumers. These conflicting perspectives lie at the heart of H-Net, which is constitutionally committed to open access at the same time it emerges as one of the largest humanities publishers in the digital age. Public discussion helps to develop national policies on intellectual property rights that will be in the best interests of higher education. Copyright issues for recording of the spoken word are especially important and largely unresolved because there is a lack of litigation and case law in this area. One of the research projects in this grant is to examine the issues and to develop guidelines for this and other voice collections. In this work, we start from the position that: a) 17 USC 108(f)(3) exempts news broadcasts; b) 17 USC 107 allows the NGSW to make fair use of broadcast segments, depending on the amount and substantiality of the segment; c) For broadcasts since 1978, permission from broadcast networks should be sought in a good faith effort to secure their support; d) For voices since 1923, permission from the speakers should be sought in a good faith effort to secure their support. The exception is federal employees speaking on government business: these speeches are presumed to have the status of government documents.

Recent research on encryption coding of speech and images (Kuo, Deller and Jain 1996), conducted in Deller's SPL at MSU, could be employed to create novel and highly secure "watermarks" for the speech files. The transform encryption coding technique can potentially be used to create both highly-compressed and highly-secure signal transmissions with virtually indestructible watermarks. Much research has been conducted on how to encode digital images with a small "signature" that is not perceived by the eye because of the natural masking properties of the human visual system, but will protect commercial interests and intellectual property (IEEE Int. Conf. ASSP, 1998). This project will develop a similar system for audio files. The differences between the speech and image files are significant because of the differences in the physical properties of the sensory stimuli (dynamic auditory signals vs. static images), in their digital representations (bandwidth, signal dimension) and in the human perception of these stimuli. The "masking" effects and construction of digital techniques to exploit these effects pose new challenges. Developing such a system will make it easier to obtain distribution permissions and new resources from other parties. Work on this problem will begin as soon as a minimal database is available.

User Interfaces for Blind Persons: At approximately the midterm of the project, following development of a prototype user interface for the evolving NGSW, we will begin work on the development of web-based interfaces for the blind. This work will consist principally of adapting existing prototype interfaces to operate in a sound-only mode, assuming the availability of certain audio and speech-recognition capabilities at the user's input terminal. The interfaces will be designed to support a flexible array of commercially available, state-of-the-art audio interface devices. InvoTek, Inc., which has extensive experience in the development of augmentative and assistive devices, will be the key developer of these interfaces, in consultation with the speech groups at MSU and Duke.

Gallery Collections: In addition to allowing users access to the full repository of sound files, the NGSW will be composed of collections that span a broad range of topics and interests. Exhibits will be designed with accompanying text and graphics. Connected through a set of relational databases, this system will facilitate use of the collections in classrooms as well as for a broad range of resource purposes. The proposed collections will include over 60,000 bibliographic records. Drawing from the rich collections of the CHS, MSU's VVL, MSU Museum, and Northwestern University, sample collections within the NGSW would include:

News and Newsmakers: Drawing primarily on the holdings of MSU's VVL, this will include selections of speeches by Teddy Roosevelt, Eugene V. Debs and Buffalo Bill Cody, as well as news broadcasts and special events from 1940 through the 1980s which are currently housed as part of the Historical Voices and Janak collections at MSU. Watergate/Vietnam, including a wide variety of perspectives on Vietnam and Watergate, from presidential speeches to newscasts, is another strength of this collection.

20th Century Inventors and Scientists: From Thomas Edison's first cylinder recordings to John Glenn talking about exploring space, this collection will include recordings that are historically significant because of their content and speakers, as well as the technical achievements discussed. These holdings are currently located in the VVL.

American Life: Using the oral interviews on which Studs Terkel based his books, and which are owned by the CHS, this collection will showcase a broad range of American experience and stories that span social, political and cultural life in the 20th century.

Chicago Neighborhoods: Owned by the CHS, this collection includes family genealogies and oral histories conducted in several Chicago neighborhoods by local high school students. These recordings provide a detailed account of urban life, and offer a full range of neighborhood accents for linguistic study.

Folklife and Lore: This collection is composed of taped interviews with a variety of American folk artists. Recorded stories of Native-American quilters and Mexican-American folk artists from across the Midwest are a special strength of the collection. These holdings are currently housed at the MSU Museum.

History and Politics Out Loud: Voices of U.S. presidents, secretaries of state and other government officials make up the vast majority of this collection, which is housed at Northwestern University.

Supreme Court Decisions: U.S. justices and a range of court cases can be heard in these recordings, providing a far greater range of experience to listeners than reading the transcripts alone. This collection is also provided by Northwestern University.

World War II: Drawing on the Ripps collection at the VVL, this collection includes a selection of broadcast news recorded from 1940 to 1945, from Pearl Harbor to the dropping of the atomic bombs.

The range and scope of recorded materials included in this collection will make the NGSW a central resource for a broad range of social sciences. The NGSW will also become part of the educational infrastructure as a place where teachers at all levels can go for reliable aural-learning resources.

Educational Resources and Tools: The NGSW represents an opportunity to think about education at both K-12 and post-secondary levels in new and exciting ways that make full use of the new information technologies, while maintaining the highest pedagogical and scholarly standards. This project will continue efforts that will break down communication barriers between teachers and scholars, while facilitating development of classroom tools which empower instructors, provide training, support and models of successful new teaching techniques, and forge new links between scholarly societies, museums and educational institutions. Additionally, the NGSW represents an opportunity to focus on children who reside in poor and minority school districts by developing and implementing curricula that promote multiple approaches to learning. Aural resources can be used to challenge students and develop skills, while drawing on students' background and interests.

A variety of web interfaces will be tailored for use by students in a range of grade levels, teachers, and college faculty. Not only will users be able to search, select and listen to sections and subsections of sound files, but they will also be able to generate webpages that will contain the information required for the classroom. Following the "shopping basket" online model of commercial operations, the WWW interface will allow teachers to collect materials and use them to create a webpage. The page itself will either reside on the project server at MSU, or be downloadable to a server or desktop for further editing and incorporation into the user's existing Web resources. The sound files themselves will be served from NGSW's machines. Using relational databases to provide narrative context, video clips, and graphics will help teachers to easily construct multimedia online lesson plans, or students to construct multimedia classroom projects. Teachers and students can also choose to keep their sites private, or to include them in what will become a growing public gallery of lesson plans and educational tools created by NGSW personnel. Using the H-Net model, the galleries will be "curated" by scholars and teachers to ensure quality.

Users will be able to search for particular sound clips. An advisory board with representatives from scholarly societies and a range of universities and K-12 schools will determine criteria, such as discipline, speaker, time period, and other access points to index the search engine. Each display will contain detailed information about the file, such as length, language, whether an accompanying video clip is available, and so on. Once the user finishes collecting material, she can move to a "collating page" which will permit the organization of links to prepare a classroom presentation or a "reserve readings" page where students can later link for study and review. Programming would combine server-side CGI scripts and client-side Java to reduce server load. Users can draw from the entire repository or specific collections. For teachers who have no multimedia feed into their classrooms, this method of WWW delivery will make material collection particularly easy and much less time-consuming than trips to various libraries. NGSW will provide instructions for copying sound files to cassette tapes. Slides for classroom use can be created from online images. Texts can, of course, be printed out along with accompanying outline maps and either distributed in paper form or printed on transparency film.

Exhibits and model lesson plans will also be provided. Each will be highly interactive and will prompt written student responses, as well as providing invitations to explore different levels and pathways within the NGSW website. Student responses could be posted publicly, providing an important opportunity for exchange between students around the globe. Students will have the option of responding either to the exhibit itself, or to comments by other students.

Educational Testbeds: Testing these applications and tools in a range of educational institutions, as well as incorporating teacher-designed materials as a central part of this site, is fundamental to ensuring wide use of the NGSW. Focusing on schools in districts that serve economically disadvantaged and minority children also provides an important opportunity to use new ways of thinking about teaching and learning to address the high dropout rates, sporadic attendance and poor academic performance confronting these areas. The NGSW project will also effectively respond to calls for students to be educated to meet the challenges of a rapidly changing technological society.

This project will harness the intense support of teachers who are committed to multiple approaches to learning and the integration of educational technology into their curricula and instructional strategies. Over a five-year period, the NGSW will be integrated into multiple classrooms within the collaborating districts. We see four potential models for classroom integration of the materials in the NGSW. First, and in the simplest case, the teacher can use the collections to bring sound material into a traditional lesson plan. Second, the teacher can invite students to explore one or more of the collections using a set of thematically appropriate criteria in a guided version of active learning. Third, students can initiate their own exploration of the materials, especially in advanced classes. Fourth, the more controversial issues can be framed as debates in which students can see contrasting points of view and work out their own solutions to the issues.

Programmers at the NGSW will create public interfaces tailored to all four models. The creative use of interactive database programming will reduce the opportunity cost for busy teachers who wish to utilize the NGSW's resources in the classroom. The interface will allow teachers to construct and download classroom presentations; a student version will help students create their own projects, which will either be stored on the NGSW's servers or downloaded to a school or personal computer. In addition, we will provide technical support.

Through a series of summer workshops and training sessions, teachers in these districts will receive professional development credit and training to help them integrate these materials into their classrooms. During these workshops, teachers will build web-based resources to incorporate materials from the NGSW into their lesson plans, and will be partnered with teachers in different school districts who teach the same grade levels. In addition to receiving year-round technical support from MSU's College of Education and staff, these teachers will also communicate with each other as they work to implement the tools in their classrooms. Follow-up sessions throughout the academic year, and site visits by MSU College of Education staff, will also facilitate the success of these projects.

Similar programs will be implemented for college faculty and instructors across a range of disciplines. Through summer workshops, faculty will learn to construct syllabi and tools that draw on the aural resources in the collection for use both inside and outside their classrooms. These lesson plans will also become available as models for public use. Faculty will be recruited through a variety of means, including the extensive, international network of over 90,000 H-Net subscribers.


First year: Create prototype interfaces and digitize at least 1000 items to populate the NGSW. The educational interface, which will allow educators to build lesson plans, is installed.

Second year: The first teacher groups will meet in the summer to work on curriculum preparation. First use of NGSW in K-12 setting. NGSW is used in college teaching. Search engine is tested. Twenty-five percent of digitization and cataloging of originally targeted collections is complete. Construction of exhibits begins.

Third year: Additional grades and classrooms begin to use the NGSW. Search engine testing and training of the system against a wide range of speech-types continues. Outreach through H-Net recruits other college faculty to use the NGSW in teaching. Fifty percent of the digitization and cataloging is completed. Construction of exhibits continues.

Fourth year: Additional grades and classrooms use the NGSW. Search engine is made available for limited search types to get feedback from real users. Seventy-five percent of the digitization and cataloging is completed. Construction of exhibits and solicitation of feedback on utility of site continues.

Fifth year: NGSW is institutionalized at testbed schools. The search engine is completed and installed. One hundred percent of digitization and cataloging of original collections is completed. Work begins on new collections and outreach continues to add oral history and other interview material.

Broad Implications for Research and Generalizability

The real strength of the NGSW project lies in the scope of its multi-disciplinary and collaborative partners. The range of institutions, organizations and individuals involved will ensure that the NGSW will be widely used and easily expandable to meet a range of research and pedagogical needs. The material archived and made accessible in the NGSW will be used for purposes that go well beyond pedagogy. Research in history and the social sciences is gradually moving to the Web. For example, H-Net publishes the largest and most timely collection of book reviews in the world. The NGSW will provide primary materials currently inaccessible to most scholars. Currently, only a few scholarly projects have emphasized sound: the most notable of these is "May it Please the Court," which provided cassette tapes of Supreme Court hearings as well as analysis of these materials. As online multimedia journals begin to build interactive resources and so change the nature of scholarly discourse, historians will be able to include clips from the NGSW as evidence for their arguments. Cultural historians informed by anthropological research have focussed on the semiotics of speech, the ways in which speakers signal their audiences with subtle verbal cues. Writing about this is less effective than showing it, and the discourse among historians as these materials are examined will inevitably reveal aspects of the discourse previously hidden. With the use of the advanced search capability, scholars will be able to scan large bodies of untranscribed sound files. Since the reduction of speech to a transcript is an expensive process, the need to create a transcript is currently a significant barrier to accessibility. So, for example, large bodies of primary oral history material, such as the Columbia University Oral History Project, could become far more useful to scholars with the successful development of the NGSW's unique and innovative search capability.

This project will enhance the techniques used in speech technologies through creation of novel search, compression, and watermarking techniques. Development of enhancement software will break important ground, while facilitating use of this resource to suit a variety of research needs. Making speech resources widely accessible to scholars, teachers, students, and a wide public audience remains the guiding aim behind this endeavor. Through development of standardized digitization processes at a number of collection sites, this project will further ensure that the NGSW becomes a network of information that will be able to expand indefinitely to interface with a wide variety of other campus networks and tools.

Not only will scholars benefit from this, but the creators of the NGSW believe that this resource will be of substantial benefit to policy makers. In particular, one of the most significant problems in policy decisions is a lack of awareness of previous arguments on the same subject. Through the search capability, research assistants for policy makers will quickly be able to provide historical background material which can help shape current arguments and decision-making processes. Historical sound clips can be downloaded and used in the preparation of briefing documents, which can be delivered online or offline. Although Washington has had the funding available for research of this kind for some time, the NGSW will make historical materials readily available for state, county and local governments, and for NGOs. In current debates over such issues as the environment and welfare, understanding the historical context can lead to better policy decisions and the formulation of more persuasive arguments. By bringing sound clips into the public sphere as public-domain resources, the NGSW will help level the playing field in current political debates at every level. As term limits become more and more common, the shared historical memory of policy makers disappears more rapidly. The NGSW can help mitigate this effect by providing a way to preserve the oral lore of governing communities.

One of the great dangers of the current state of the Internet is that many people are beginning to exploit primary materials for commercial purposes. This could have a chilling effect on multimedia teaching and inhibit the development of innovative research techniques. More important, it can stifle the free-wheeling creativity and independence of spirit which have been the hallmarks of the Internet culture. We have no way of predicting what use the broad general public will wish to make of these materials, but we have confidence that the Library of Congress, through its American Memory project, has made the right decision in making its material freely available to the public. Restricting access to the materials that make up America's collective cultural memory can only smother an active engagement with the American past.


The design and implementation process of the NGSW will be guided by a number of integrated evaluation methods. An interactive, online evaluation process will be implemented beginning at the first phase of project development. Use of the Internet and website to provide and solicit feedback from users and collaborators, user tracking and questionnaires will all be integral parts of the NGSW website. As this site is intended to support scholarship, school learning, and visits by curious web surfers, automated measures that identify users, their navigational choices and their reactions are crucial (Cohen, Tsai, Chechile, 1995; Dede, 1985). Audience profile, concerns, and feedback on usability will be central to determining who is using the site, how easy it is to navigate for specific research and educational purposes, as well as ensuring that this information is accessible to the widest possible public audience. In particular, we are interested in what keeps a user interested in the selection they have chosen, especially when the user is a student or a curious surfer. To make this assessment process manageable, users will be selected at random and asked to participate in an NGSW interview. Early efforts will be aimed at measuring the reliability of these data, with follow-up interviews for some users in order to determine the veracity of their responses.

Measuring the successes and limitations of the NGSW in educational settings will be a significant focus of the evaluation process. These efforts will focus primarily on the partnership districts and on the teachers who participate in the workshops conducted by the MSU College of Education. In particular, we are interested in 1) the use and value of the audio as stand-alone media for education, and 2) how the audio can best be used with complementary visual media. We anticipate differential patterns of use based on the available media and the student aptitudes, and expect our research to help tune the NGSW to assist the widest variety of students. Using an aptitude-treatment interaction (ATI) model (Snow, 1989) with an emphasis on assessing conceptual and complex understanding (Mayer and Sims, 1994), learning will be modeled along with covariant measures of aptitude and use. Once the model is calibrated and seamless mechanisms for online data collection are available, we anticipate using the model as an online tool to support students with quick diagnosis of poor learning strategies.

As portions of the NGSW are adopted for curricular use, groups of teachers will be selected from K-12 and post-secondary schools to identify additional needs and further evaluate project performance and usability. We expect these groups to inform the development team of key content and tools whose addition would extend the use of the NGSW in schools, as well as ways to dovetail components of the NGSW with traditional parts of the curriculum. Lastly, groups of teachers will be used to collect data for the purpose of validating outcomes from learning assessments.

Publication and dissemination of methods and findings will be a key component to generate and provide scholarly feedback on the research questions and tools. Published papers and reports in academic journals and Internet discussion networks will facilitate multi-disciplinary peer review as well as making an increasing number of individuals aware of the NGSW.

Finally, as this is a true interdisciplinary effort, it is important to chronicle and assess the collaborative process (Kruger, Cohen, 1996). Several research groups at different sites are playing key roles, with the simultaneous goals of producing research, standards, and a remarkably useable product. Some kinds of collaborations are likely to serve some goals, but not others. For what purposes are face-to-face meetings essential? When is email appropriate? These and related questions become crucial for developing the NGSW as well as similar projects. To help address these questions, key personnel will be required to keep an online diary of their ideas and reflections on the project and its goals. We anticipate that these diaries will help keep the project on track and offer an opportunity to understand the role different kinds of collaborations play.

In summary, we expect the assessment to offer valuable answers to the following questions:

* How do scholars, students and curious users interact differentially with the resources in the NGSW?

* What interactions best serve individuals in each of these groups?

* How can we best diagnose potentially unrewarding visits to the NGSW and improve the chances for success?

* What kind of collaborative processes best serve large scale research ventures such as the development of the NGSW?

* How can the resources of the NGSW best serve a diverse set of students and teachers?


Arslan, L.M., J.H.L. Hansen. "Selective Training in Hidden Markov Model Recognition," accepted for publication in IEEE Trans. on Speech and Audio Processing. vol. 7, no. 1, January 1999.

Arslan, L.M., J.H.L. Hansen. "Likelihood Decision Boundary Estimation between HMM Pairs in Speech Recognition." IEEE Trans. Speech and Audio Processing. vol. 6, no. 4, pp. 410-414, July 1998.

Bearman, D. and K. Sochats. "Metadata requirements for evidence," Archives and Museum Informatics, 1996.

Bender, W., D. Gruhl, N. Morimoto, A. Lu. "Techniques for Data Hiding," IBM Systems Journal. vol. 35, no. 3-4, pp. 313-336, 1996.

Brassil, J., L. O'Gorman. "Electronic Marking and Identification Techniques to Discourage Document Copying," Info Hiding, pp. 227-235, 1996.

Cohen, S., Tsai, F., and Chechile, R.A. "A model for assessing student interaction with educational software," Behavior Research Methods, Instruments, and Computers. vol. 27, n. 2, pp. 251-256, 1995.

Conway, Paul. Preservation in the Digital World (Washington: Commission on Preservation and Access, 1996).

Caronni, G. "Assuring Ownership Rights for Digital Images," in H.H. Brüggemann and W. Gerhardt-Häckl, eds. Reliable IT Systems-VIS '95. pp. 251-263 (Germany: Vieweg, 1995).

Cox, I.J. and Matt L. Miller. "Human Vision and Electronic Imaging II," SPIE 3016, pp. 92-99, February 1997.

Cox, I.J., J. Kilian, T. Leighton, T. Shamoon. "A Secure, Robust Watermark for Multimedia," Info Hiding, pp. 185-206, 1996.

Day, M.W. "Extending metadata for digital preservation." Ariadne. No. 9, May 1997.

Day, M.W. "Preservation of electronic information: a bibliography," 1997.

Dede, C. Intelligent Computer Assisted Instruction: A Review and Assessment of ICAI Research and Its Potential for Education. (Educational Technology Center: Cambridge, MA, 1985).

Deller, J.R., Jr. "Application of OBE algorithms to speech analysis, recognition, and coding," invited chapter in: M. Milanese, J. Norton, H. Piet-Lahanier, and E. Walter, eds. Bounding Approaches to System Identification (London: Plenum, 1996).

Deller, J.R., Jr., J.G. Proakis, and J.H.L. Hansen. Discrete Time Processing of Speech Signals (New York: Macmillan, 1993).

Deller, J.R., Jr., R.K. Snider. "Reducing redundant computation in HMM evaluation," IEEE Trans. Speech and Audio Processing. vol. 4, pp. 465-471, October 1993.

De Rose, S. "Structured Information: Navigation, Access, and Control," presented at the Berkeley Finding Aid Conference, April 1995.

Foote, J., G. Jones, K. Jones, S. Young. "Talker-Independent Keyword Spotting for Information Retrieval," Proc. Eurospeech, vol. 3, pp. 2145-2149, 1995.

Hansen, J.H.L. "Morphological Constrained Enhancement with Adaptive Cepstral Compensation (MCE-ACC) for Speech Recognition in Noise and Lombard Effect," IEEE Trans. Speech and Audio Processing: Special Issue on Robust Speech Recognition, vol. 2, n. 4, pp. 598-614, October 1994.

Hansen, J.H.L., L. Arslan. "Robust Feature Estimation and Objective Quality Assessment for Noisy Speech Recognition using the Credit Card Corpus," IEEE Trans. Speech and Audio Processing. vol. 3, n. 3, pp. 169-184, May 1995.

Hansen, J.H.L., M. Clements. "Constrained Iterative Speech Enhancement with Application to Speech Recognition," IEEE Trans. on Signal Processing. vol. 39, n. 4, pp. 795-805, April 1991.

Hansen, J.H.L., S. Nandkumar. "Robust Estimation of Speech in Noisy Backgrounds Based on Aspects of the Auditory Process," Journal of Acoustical Society of America. vol. 97, n. 6, pp. 3833-3849, June 1995.

Hansen, J.H.L., S. Nandkumar. "Objective Quality Assessment and the RPE-LTP Vocoder in Different Noise and Language Conditions," Journal of the Acoustical Society of America. vol. 97, n. 1, pp. 609-627, January 1995.

Hansen, J.H.L., B. Pellom. "Text-Directed Speech Enhancement using Phoneme Classification and Feature Map Constrained Vector Quantization," Speech Communication. vol. 21, pp. 169-189, April 1997.

Hardy, I.T. "Internet Archives and Copyright," Documenting the Digital Age Conference, 1997.

Heery, R. "Review of metadata formats," 1996.

Herwijnen, Eric. Practical SGML. (Netherlands: Kluwer Academic Publishing, 1990).

IEEE International Conference on Acoustics, Speech, Signal Processing. Conference Proceedings, Seattle, May 1998.

International Organization for Standardization. Information Processing -- Text and Office Information Systems -- Standard Generalized Markup Language. ISO 8879: 1986 (E).

Koch, E., J. Zhao. "Towards Robust and Hidden Image Copyright Labeling," Proceedings of 1995 IEEE Workshop on Nonlinear Signal and Image Processing, pp. 452-455 (Neos Marmaras, Halkidiki, Greece, June 20-22, 1995).

Kruger, L., Cohen, S., Marca, D., and Matthews, L. "Using the Internet to extend training in team problem solving," Behavior Research Methods, Instruments, and Computers. vol. 28, n. 2, pp. 248-252, 1996.

Kuo, C.J., J.R. Deller, Jr., and A.K. Jain. "Pre/post filter for performance improvement of transform coding," Image Communication, vol. 8, pp. 229-239, 1996.

Lee, Y.B. and J.R. Deller, Jr. "State-space formulations of discrete symbol HMM decoding for fast match," in preparation.

Lohr, S. "Real Networks Hopes New Software Will Open Up Medium," New York Times, July 13, 1998.

Lynch, C. "The Integrity of Digital Information: Mechanics and Definitional Issues," Journal of the American Society for Information Science vol. 45, pp. 77-84, April 1995.

Matsui, K., K. Tanaka. "Video-Steganography: How to Secretly Embed a Signature in a Picture," Journal of the Interactive Multimedia Association Intellectual Property Project. vol. 1, n. 1, pp. 187-205, January 1994.

Mayer, R., and Sims, V. "For whom is a picture worth a thousand words? Extensions of a dual-coding theory of multimedia learning," Journal of Educational Psychology. vol. 86, pp. 389-401, 1994.

McLuhan, M. The Gutenberg Galaxy: The Making of Typographic Man (Toronto, 1962).

Nandkumar, S., J.H.L. Hansen. "Dual-Channel Iterative Speech Enhancement with Constraints Based on an Auditory Spectrum," IEEE Trans. Speech and Audio Processing, Vol. 3, n. 1, pp. 22-34, January 1995.

OCLC. "Dublin Core Metadata," 1997.

OCLC. "Description of Dublin Core Elements," 1997.

Payette, Sandra D. and Oya Y. Rieger. "Supporting Scholarly Inquiry: Incorporating Users in the Design of the Digital Library," The Journal of Academic Librarianship. vol. 24, n. 2, pp. 121-129, March 1998.

Pellom, B.J., J.H.L. Hansen. "An Efficient Scoring Algorithm for Gaussian Mixture Model based Speaker Identification," submitted to Signal Processing Letters, December 1997.

Pellom, B.J., J.H.L. Hansen. "Automatic Segmentation of Speech Recorded in Unknown Noisy Channel Characteristics," Speech Communication: Special Issue on Robust Speech Recognition in Unknown Communication Channels. vol. 24, Fall 1998.

Pellom, B.J., J.H.L. Hansen. "A Duration-Based Confidence Measure for Automatic Segmentation of Noise Corrupted Speech," accepted to ICSLP-98: Inter. Conf. Spoken Language Processing, Sydney, Australia, December 1998.

Preserving Digital Information (Washington: Commission on Preservation and Access, 1995).

Reynolds, D., R. Rose. "Robust Text-Independent Speaker Identification Using Gaussian Mixture Speaker Models," IEEE Trans. Speech and Audio Processing. vol. 3, n. 1, pp. 72-83, 1995.

Rose, R., D. Paul. "A Hidden Markov Model Based Keyword Recognition System," Proc. IEEE ICASSP-90. vol. 1, pp. 129-132, 1990.

Rothenberg, J., "Ensuring the longevity of digital documents," Scientific American. vol. 272, n. 1, pp. 24-29, January 1995.

Rothenberg, J. "Metadata to support data quality and longevity," 1996.

Singletary, M. and G. Stone. Communication Theory and Research Applications (Ames, Iowa, 1988).

Smith, J.R., B.O. Comiskey. "Modulation and Information Hiding in Images," Info Hiding, pp. 207-226, 1996.

Swanson, M.D., B. Zhu, A.H. Tewfik. "Transparent Robust Image Watermarking," IEEE International Conference on Image Processing, vol. 3, pp. 211-214, 1996.

Task Force on the Archiving of Digital Information, Preserving Digital Information. (Washington, D.C.: Commission on Preservation and Access, 1996).

USMARC Format for Bibliographic Data, including Guidelines for Content Designation. Washington, Cataloguing Distribution Service, Library of Congress, 1994.

van Schyndel, R.G., A.Z. Tirkel, C.F. Osborne. "A Digital Watermark," International Conference on Image Processing, vol. 2, pp. 86-90, Austin, TX, 1994.

Weibel, Stuart. "The Foundation of Resource Description," D-Lib Magazine, July 1995.
