Find data for text mining
Monday, January 20, 2020
Are you tracking a word’s semantic change across multiple periodicals over many decades? Or maybe you’re looking to perform sentiment analysis or measure changes in word frequency. Do you know how to find and access the data you need?
Yale University Library has added new tags to the Quicksearch catalog that make it easier to identify datasets for text and data mining projects. The format (XML, TIFF, etc.) and the quality of the optical character recognition (OCR) vary widely, so we recommend starting with a sample issue once you identify a dataset that might work.
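Once you have a sample issue as plain text, a short script can give a rough first impression of whether the OCR is clean enough for your project. The Python sketch below is only an illustration: the sample string, the function names, and the "word-like token" heuristic are all assumptions made for this example, not part of Quicksearch or any Library tool, and the ratio it reports is a crude proxy rather than a true measure of OCR accuracy.

```python
import re
from collections import Counter

# Assumption: a token is "word-like" if it contains only letters,
# optionally joined by internal apostrophes or hyphens.
WORD_RE = re.compile(r"^[^\W\d_]+(?:['’-][^\W\d_]+)*$")


def tokenize(text):
    """Split on whitespace, strip surrounding punctuation, and lowercase."""
    tokens = (tok.strip(".,;:!?\"'()[]“”‘’") for tok in text.split())
    return [tok.lower() for tok in tokens if tok]


def wordlike_ratio(text):
    """Return the share of tokens that look like words (0.0 for empty text)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(1 for tok in tokens if WORD_RE.match(tok)) / len(tokens)


def top_words(text, n=5):
    """Return the n most frequent word-like tokens with their counts."""
    words = [tok for tok in tokenize(text) if WORD_RE.match(tok)]
    return Counter(words).most_common(n)


# Hypothetical sample text with two simulated OCR errors ("qu1ck", "d0g").
sample = "The qu1ck brown fox jumps over the lazy d0g. The fox sleeps."

print(f"word-like ratio: {wordlike_ratio(sample):.2f}")
print(top_words(sample, 2))
```

A low ratio suggests the text may need cleanup, or a different source, before you measure word frequency or run sentiment analysis on it. A high ratio does not guarantee accuracy, since OCR can also produce real words that are simply the wrong ones.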
To locate newspapers and magazines that the Library licenses for current Yale students, faculty, and staff with an active NetID, add ‘yuldsetmediated’ to your search box in Quicksearch. You can then filter by fields such as language, subject region, or subject era to refine your results. To ask a question or arrange access to the data, email Research Data, and a librarian will follow up.
To identify transcripts, recordings, and other linguistic data, try searching with the more general ‘yuldsettxt’.
To find all datasets, including text, geospatial, numeric, and image data, use ‘yuldset’.
For more information, visit the Text and Data Mining research guide.