Creativeness Digital Scholarship Group (CDSG)
The CDSG is a team of researchers uncovering and exploring the forgotten sources, meanings, and social worlds of creativeness prior to the meteoric rise of a scientific “creativity” in the 1970s. The CDSG focuses on applying a range of Natural Language Processing and Machine Learning techniques to perform an archeology of discourses of creativeness and related concepts: unearthing new finds, making new connections, and interpreting their cultural and political relevance for the period in which they were embedded. Most of our sources date from the post-Civil War period to the end of the Space Race, roughly the century from 1870 to 1970. This was a period in which the noun “creativity” rarely appeared; it took its current form only toward the end of that century, especially during the 1950s. We have yet to find a dictionary, however, that doesn’t assert that creativeness and creativity are synonyms. In historical English usage, we have seen a multitude of marginalized terms other than ‘creativeness’, such as inventiveness and inventivity. We want to better understand, across time, (1) whether they are indeed synonyms; (2) the lost sources and meanings of creativeness, early creativities (a polysemic noun used in very diverse settings in the first half of the 20th century), and the range of phenomena connected to a creative imagination; and (3) how best to think about the displacement of creativeness and related nouns with the adjective “creative” by “creativity,” which was nearly complete by 1970.
Archive-Vision (archv or arch-v) is a collection of computer vision programs written in C++ that use functions from the OpenCV library to analyze large image sets. Its primary function is to locate recurring patterns within each image in a set: arch-v locates features from a given seed image within an image set and outputs the image(s) with the most similarities. The first program, processImages.cpp, generates text files containing each image’s keypoints and their mathematical descriptors; these are what later make it possible to compare images and find matches. The second program, scanDatabase.cpp, finds the images that are most similar to a given seed image. The third program, drawMatches.cpp, compares two images, locates their matches based on homography, then draws the keypoints and their corresponding matches; this is most useful when the best matches have already been found.
The best use for arch-v is to find images which are similar to a seed image. The standard method is to process the image set that you will compare your seed image to, scan through the generated dataset with a given seed image to find the best matches, then draw identifiers for matching features between the seed image and its best match.
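The scanning step described above can be illustrated with a small sketch. This is not arch-v’s code: arch-v is written in C++ on top of OpenCV, and the source does not say how it scores similarity. The sketch assumes a nearest-neighbour ratio test (a common matching heuristic), uses tiny made-up two-dimensional descriptors in place of real ones, and the names `count_matches` and `rank_images` are hypothetical.

```python
from math import dist


def count_matches(seed_desc, cand_desc, ratio=0.75):
    """Count seed descriptors that have a clearly best match in the candidate.

    A match is accepted only when the nearest candidate descriptor is much
    closer than the second nearest (ratio test). The 0.75 threshold is an
    assumed value, not one taken from arch-v.
    """
    if len(cand_desc) < 2:
        return 0
    good = 0
    for d in seed_desc:
        distances = sorted(dist(d, c) for c in cand_desc)
        best, second = distances[0], distances[1]
        if best < ratio * second:
            good += 1
    return good


def rank_images(seed_desc, database):
    """Return (image name, match count) pairs, best match first."""
    scores = {name: count_matches(seed_desc, desc)
              for name, desc in database.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy descriptors standing in for the per-image files a processing step
# would have written; real descriptors have many more dimensions.
seed = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
database = {
    "near_copy.jpg": [(0.1, 0.0), (1.0, 1.1), (5.0, 4.9), (9.0, 9.0)],
    "unrelated.jpg": [(20.0, 20.0), (21.0, 21.0), (22.0, 22.0)],
}

ranking = rank_images(seed, database)
print(ranking)
```

Counting accepted matches per image and sorting by that count is only one plausible way to pick the “best match”; a homography check, as drawMatches.cpp is described as doing, would further filter matches that are not geometrically consistent.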
Mining the US Food and Drug Administration
The FDA Mining project focuses on the cultural politics of dietary health and the values and beliefs that shape American eating habits. There is often a disconnect between the legal meaning of words, as defined by the Food and Drug Administration and used by labelers, and the common definitions of these words as understood by consumers. We engage in various modes of scraping and text mining (such as sentiment analysis, source tracing, and topic modeling) of online public comments on FDA proposals and decisions as a means of better understanding the complex personal, corporate, official, and legal discourses that surround the labeling of food.
The Pioneering Punjabis Digital Archive
The Pioneering Punjabis Digital Archive (http://pioneeringpunjabis.ucdavis.edu/) offers a window into the story of South Asian immigrants from the Punjab region in north India to California since the turn of the twentieth century. Explore over 700 video interviews, speeches, diaries, photographs, articles, and letters in which Punjabi Americans share their life stories, values, and contributions to California’s history over the last hundred and twenty years.
Play the Knave Modlab
The project, in coordination with the DSI, involves the creation of a gaming environment in which students recreate scenes from many works of Shakespeare. Movement and vocal data are gathered as participants act out a given scene. The data is then turned into a video of the production that can be shared with others. This is an exploratory project in which the researchers are trying not only to bring about a better understanding of Shakespeare’s works but also to recognize speech and movement patterns.
BIBFLOW is a two-year project funded by the Institute of Museum and Library Services. Its purpose is to investigate the future of library services, including cataloging and related workflows, new data models, and new encoding and exchange formats. At the end of the two years, the project will deliver a roadmap for the academic and library communities that serves as a guide to the changes occurring in academia.
Gender and Citation Disparities
This project looks at the books Silent Spring and The Death and Life of Great American Cities. Both books were written by women and are now recognized as major texts in the ecology realm. Despite their significance, however, they were not heavily cited at the time of their publication. Regardless of the lack of actual citations, many high-profile male scholars picked up on their work and propagated the ideas from the books until those ideas were able to come to the forefront. The main aim of this project is to mine bibliometric data from Google Scholar and other sources in order to build a network of the citations of each book.
Predicting Length of Hospital Stays
One of the most significant problems that hospitals across the country currently face is predicting how long each patient will remain in the hospital. This project is attempting to build a better predictive model by taking into account both quantitative and qualitative data from hospitals. The main source of information comes from classifying and mining doctors’ and nurses’ notes, and that information is used to create a model that gives a better estimate of each patient’s length of stay.
Places in Walt Whitman
Walt Whitman was an American poet who worked during the period of transition from transcendentalism to realism. As a result, many of his works are rooted in physical spaces. This project used text mining methods to isolate all of the locations mentioned in Whitman’s works and then created a visual map of those locations.
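One simple way to isolate locations in a text is to look up names from a gazetteer (a list of known place names) and count how often each occurs. The source does not say which method the project used, so the sketch below is only an assumption for illustration; the gazetteer, the sample sentence, and the function name `count_places` are all made up, and the sample is not a quotation from Whitman.

```python
import re
from collections import Counter

# Hypothetical, hand-written gazetteer; a real project would likely use a
# much larger list or a named-entity recognizer.
GAZETTEER = ["Long Island", "Brooklyn", "Manhattan"]


def count_places(text, gazetteer=GAZETTEER):
    """Count whole-word occurrences of each gazetteer name in the text."""
    # Try longer names first so multi-word places win over shorter overlaps.
    names = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(n) for n in names) + r")\b"
    )
    return Counter(pattern.findall(text))


# Invented sample sentence, used only to exercise the function.
sample = ("I walked from Brooklyn to Manhattan, then sailed past "
          "Long Island and back to Brooklyn.")

place_counts = count_places(sample)
print(place_counts.most_common())
```

The resulting counts could then be joined to coordinates for each place to draw a map; that step is omitted here because it depends on the mapping tool.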
English Short Title Catalogue
This project was originally intended to create a “machine-readable catalogue of books, pamphlets and other ephemeral material printed in English-speaking countries from 1701-1800.”
English Broadside Ballad Archive
EBBA began as an attempt to provide better access to broadside ballads. Since 2003, EBBA has made broadside ballads from various holdings easily and fully accessible by gathering them on one site and cataloguing them extensively. The EBBA website also offers basic and advanced search functions, which widen the ways the ballads can be used.
Social Networks of Citation
The purpose of this project was to create a peer network of all publications and collaborations that stem from a single faculty member. The network was successfully created by mining MEDLINE data.
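A collaboration network of this kind can be sketched as a weighted graph in which each pair of co-authors is an edge and the weight is the number of publications they share. The source does not describe the project’s data format or tooling, so the following is a minimal illustration under that assumption; the author names are placeholders and `build_coauthor_network` is a hypothetical helper.

```python
from collections import defaultdict
from itertools import combinations


def build_coauthor_network(publications):
    """Map each unordered author pair to the number of shared publications.

    `publications` is a list of author lists, one per publication.
    """
    edges = defaultdict(int)
    for authors in publications:
        # Sort and deduplicate so each pair has one canonical key.
        for a, b in combinations(sorted(set(authors)), 2):
            edges[(a, b)] += 1
    return dict(edges)


# Placeholder records standing in for mined publication data.
publications = [
    ["Faculty A", "Collaborator B"],
    ["Faculty A", "Collaborator B", "Collaborator C"],
    ["Faculty A", "Collaborator C"],
]

network = build_coauthor_network(publications)
for (a, b), weight in sorted(network.items()):
    print(f"{a} -- {b}: {weight} shared publication(s)")
```

Because every record here includes the same faculty member, the result is an ego network centred on that person, with extra edges wherever collaborators also published with each other.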