Exploratory Reports

This service helps researchers quickly assess the overall quality of their data, reveals general trends and patterns, and provides recommendations for further data cleaning, analysis, and visualization. Where applicable, the code used to prepare the report is provided, and researchers are encouraged to “recycle” it for future needs.

Example reports include:

  • Data Pathology. An initial assessment of your data to determine its general quality and suitability for analysis. DataLab staff will not clean your data, but will provide feedback on its current form and a set of recommendations to help you with the data munging process. Datasets particularly suitable for a pathology report include continuous and categorical experimental and survey data, geospatial datasets, bibliographies, and text corpora.
  • Data Wrangling. Found the data you need, but it’s locked in a webpage, PDF, or other unstructured format? We’ll take a look and suggest approaches and tools to help you gather and wrangle the data.
  • Text Mining and Natural Language Processing (NLP). Given a corpus (i.e., a set of text documents), this report generates preliminary word frequencies, word clouds, and a topic modeling visualization. It helps uncover and synthesize patterns in discourse to augment and extend close reads. Data pre-formatting is required.
  • Coauthor Networks. Turn your bibliography into a visualization of the relationships between coauthors or another defined feature. Data pre-formatting is required.
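
As a rough illustration of the word-frequency step mentioned for the text mining report, the sketch below counts words across a small corpus using only the Python standard library. The function name `word_frequencies`, the tokenizing pattern, the stopword list, and the sample sentences are all assumptions made for this example; it is not the code DataLab uses to prepare reports.

```python
import re
from collections import Counter


def word_frequencies(documents, stopwords=frozenset()):
    """Count lowercase word tokens across documents, skipping stopwords.

    Hypothetical helper for illustration only.
    """
    counts = Counter()
    for text in documents:
        # Treat runs of letters and apostrophes as words (a simplifying assumption).
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in stopwords)
    return counts


# Made-up two-document corpus.
corpus = [
    "The river floods the valley.",
    "The valley farms rely on the river.",
]
freqs = word_frequencies(corpus, stopwords={"the", "on"})
print(freqs.most_common(3))
```

A real corpus would usually need a fuller stopword list and more careful tokenization, which is part of why data pre-formatting is required.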

Available geospatial reports include:

  • Geocoding data. Estimate geographic coordinates for up to 200 addresses. The report includes a basic map showing the geocoded locations, with a limited number of vector reference datasets (e.g., roads or state boundaries) to help visualize the output.
  • Spatial Joining/Georeferencing based on place names. Make a non-spatial dataset spatial by joining its tables to spatial (vector) data tables, using place names such as city, county, state, ZIP code, or country as the key.
  • CSV conversion to spatial formats. Convert a table containing geographic coordinates into a spatial data format such as a shapefile or GeoJSON.
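
To give a sense of what the CSV conversion involves, here is a minimal sketch that turns a table with coordinate columns into a GeoJSON FeatureCollection using only the Python standard library. The column names `latitude` and `longitude`, the function name `csv_to_geojson`, and the sample rows are assumptions for illustration, not a description of the workflow DataLab uses.

```python
import csv
import io
import json


def csv_to_geojson(csv_text, lon_field="longitude", lat_field="latitude"):
    """Build a GeoJSON FeatureCollection of points from CSV text.

    Hypothetical helper; the default column names are assumed.
    """
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Remove the coordinate columns; the remaining columns become properties.
        lon = float(row.pop(lon_field))
        lat = float(row.pop(lat_field))
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": row,
        })
    return {"type": "FeatureCollection", "features": features}


# Made-up sample table with illustrative coordinates.
sample = "name,latitude,longitude\nSite A,38.54,-121.74\nSite B,38.58,-121.49\n"
collection = csv_to_geojson(sample)
print(json.dumps(collection, indent=2))
```

The sketch assumes the coordinates are already in decimal degrees. Tables that use another coordinate reference system, or that need shapefile output, would call for a dedicated geospatial library.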

Request an exploratory report