Digital Humanities Software Tools

A librarian for American studies, anthropology, and sociology, Nancy K. Herther recently published an article in Computers in Libraries that includes a good list of digital humanities tools. Here are links to these:

DH Press (https://github.com/Digital-Innovation-Lab/dhpress) “DH Press is a plugin for WordPress that enables scholars to visualize their humanities-oriented data and allow users to access that data from the visualizations themselves.”

Omeka (https://omeka.org/) Omeka provides open-source web publishing platforms for sharing digital collections and creating media-rich online exhibits.

Scalar 2 (https://github.com/anvc/scalar) – media-rich, scholarly electronic publishing.

Chronos Timeline (http://hyperstudio.mit.edu/software/chronos-timeline/) Chronos is a flexible jQuery plugin developed by HyperStudio, the digital humanities lab at MIT.

TimelineJS (https://timeline.knightlab.com/) an open-source tool that enables anyone to build visually rich, interactive timelines.
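TimelineJS can also be driven by a JSON file rather than a spreadsheet. As a rough sketch (the field names below follow the TimelineJS 3 JSON format; the dates and headlines are made-up examples), a minimal timeline can be generated like this:

```python
import json

def make_event(year, headline, text=""):
    """Build one event object in the TimelineJS 3 JSON shape."""
    return {
        "start_date": {"year": str(year)},
        "text": {"headline": headline, "text": text},
    }

# Illustrative data only; a real timeline would use your own events.
timeline = {
    "title": {"text": {"headline": "A Sample DH Timeline"}},
    "events": [
        make_event(1966, "Computers and the Humanities founded"),
        make_event(2017, "TimelineJS in wide use"),
    ],
}

# Serialize to JSON that TimelineJS can load via its `source` option.
print(json.dumps(timeline, indent=2))
```

Saving this output to a file (or serving it from a URL) is enough to feed the TimelineJS embed a timeline without touching a spreadsheet.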

Historypin (https://github.com/Historypin) a community archiving platform.

QGIS (https://qgis.org) A Free and Open Source Geographic Information System.

Concordle (http://folk.uib.no/nfylk/concordle/) “Concordle has one point common with Wordle: it makes word clouds. But these are only text, and in a browser in general the choice of fonts is limited, so the clouds are not so very pretty. But it is much more clever: All the words in the cloud are clickable, i.e. they have links to concordancer function.”

Netlytic (https://netlytic.org) “a community-supported text and social networks analyzer that can automatically summarize and discover social networks from online conversations on social media sites”

Palladio (http://hdlab.stanford.edu/palladio/) Stanford University’s online visualization tool that takes CSV files and SPARQL endpoints (beta) as input.

Prism (http://prism.scholarslab.org/) a tool for “crowdsourcing interpretation.” Users are invited to provide an interpretation of a text by highlighting words according to different categories, or “facets.”

Tableau (https://www.tableau.com/) a well-known data visualization tool, especially popular in business.

Umigon (https://github.com/seinecle/Umigon) sentiment analysis on Twitter.

Voyant Tools (https://voyant-tools.org/) One of the DH text analysis tools listed in a previous post.

IIIF Open Source Developments

IIIF (International Image Interoperability Framework) is a community of research libraries and image repositories working on interoperable technology and a community framework for image delivery. Its goals are uniform, rich access to image-based resources hosted online, and common APIs for image repositories that enable a great user experience when viewing, comparing, manipulating, and annotating images.

IIIF development has been driven by its Image API (http://iiif.io/api/image/2.1/#table-of-contents), which allows for the retrieval of pixels through a REST web service, and its Presentation API (http://iiif.io/api/presentation), which drives viewing interfaces. In addition, there are a Search API (http://iiif.io/api/search/1.0) and an Authentication API (http://iiif.io/api/auth/1.0/). The APIs use JSON-LD (https://json-ld.org/) throughout.
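The Image API's REST interface is just a URL template, so requesting pixels needs no special client. As a minimal sketch following the URL pattern in the Image API 2.1 specification linked above (the server base and identifier here are hypothetical):

```python
def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Compose an IIIF Image API 2.1 request URL following the template
    {scheme}://{server}{/prefix}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
    """
    return f"{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{fmt}"

# A 300-pixel-wide JPEG of the full image (hypothetical server and identifier):
url = iiif_image_url("https://example.org/iiif", "page-0001", size="300,")
print(url)  # https://example.org/iiif/page-0001/full/300,/0/default.jpg
```

Any IIIF-compliant image server will answer such a URL with derived pixels, which is what lets the viewers below interoperate across repositories.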

IIIF Image Servers:

IIIF Image API Viewers:

IIIF Presentation API Viewers:

The full list of viewers is available here: https://github.com/IIIF/awesome-iiif

Demonstration IIIF sites: http://iiif.io/apps-demos/


Digital humanities text analysis tools

Distant Reading & Text Analysis

The Versioning Machine (http://v-machine.org/) is a framework and an interface for displaying multiple versions of text encoded according to the Text Encoding Initiative (TEI) Guidelines.

Voyant Tools (https://voyant-tools.org/) a web-based reading and analysis environment for digital texts.

Twine (http://twinery.org/) an open-source tool for telling interactive, nonlinear stories. You don’t need to write any code to create a simple story with Twine, but you can extend your stories with variables, conditional logic, images, CSS, and JavaScript when you’re ready.

Spoken audio analysis tools

Open Source

WaveSurfer (https://sourceforge.net/projects/wavesurfer/) is an open-source tool for sound visualization and manipulation. Typical applications are speech/sound analysis and sound annotation/transcription. WaveSurfer may be extended by plug-ins as well as embedded in other applications. Homepage: http://www.speech.kth.se/wavesurfer/

Praat: doing phonetics by computer (http://www.fon.hum.uva.nl/praat/) a widely used tool for analyzing, synthesizing, and manipulating speech in phonetics research.

ELAN (https://tla.mpi.nl/tools/tla-tools/elan/) an annotation tool for audio and video recordings.

Gentle (http://lowerquality.com/gentle/) a forced aligner. Forced aligners are computer programs that take media files and their transcripts and return extremely precise timing information for each word (and phoneme) in the media.

Drift (http://drift3.lowerquality.com/) outputs pitch and timing; it samples what human listeners perceive as vocal pitch.
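Gentle writes its alignment as JSON, which makes the timing data easy to reuse. A small sketch, assuming the output shape Gentle documents (a "words" list with start/end times in seconds and a per-word "case" status; the sample data below is invented):

```python
import json

# Invented sample in the shape of Gentle's JSON alignment output.
sample = json.loads("""
{
  "transcript": "hello world",
  "words": [
    {"word": "hello", "start": 0.12, "end": 0.48, "case": "success"},
    {"word": "world", "start": 0.55, "end": 0.98, "case": "success"}
  ]
}
""")

def word_timings(result):
    """Return (word, start, end) tuples for successfully aligned words,
    skipping words the aligner could not locate in the audio."""
    return [(w["word"], w["start"], w["end"])
            for w in result.get("words", [])
            if w.get("case") == "success"]

for word, start, end in word_timings(sample):
    print(f"{word}: {start:.2f}-{end:.2f}s")
```

The same tuples can then feed a subtitle generator, a concordance keyed to audio, or pitch tools like Drift.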

Kaldi (http://kaldi-asr.org/) a toolkit for speech recognition written in C++ and licensed under the Apache License v2.0. Kaldi is intended for use by speech recognition researchers.

Sonic Visualiser (http://www.sonicvisualiser.org/) an application for viewing and analysing the contents of music audio files.

Audacity (http://www.audacityteam.org/) a free, easy-to-use, multi-track audio editor and recorder for Windows, Mac OS X, GNU/Linux and other operating systems.

SIDA (https://github.com/hipstas/sida) Speaker Identification for Archives. Includes a notebook that walks through training and running a speaker classifier: it takes speaker labels and audio, extracts features (including vowels), trains a model, and runs it.

Audio Labeler (https://github.com/hipstas/audio-labeler) an in-browser app for labeling audio clips at random, using Docker and Flask.

ARLO (https://sites.google.com/site/nehhipstas/documentation) was developed for classifying bird calls and using visualizations to help scholars classify pollen grains. ARLO has the ability to extract basic prosodic features such as pitch, rhythm and timbre for discovery (clustering) and automated classification (prediction or supervised learning), as well as visualizations. The current implementation of ARLO for modeling runs in parallel on systems at the National Center for Supercomputing Applications (NCSA). The source code for ARLO is open-source and will be made available for research purposes for this and subsequent projects on sourceforge at http://sourceforge.net/projects/arlo/.

Not open source, but available for academic use:

STRAIGHT (http://www.wakayama-u.ac.jp/~kawahara/STRAIGHTadv/index_e.html) a tool for flexibly manipulating voice quality, timbre, pitch, speed, and other attributes. It is a continually evolving system that aims for sound quality close to the original natural speech by incorporating advanced signal-processing algorithms and findings from computational studies of auditory processing.

STRAIGHT decomposes sounds into source information and resonator (filter) information. This conceptually simple decomposition makes it easy to conduct experiments on speech perception using STRAIGHT (the tool's initial design objective) and to interpret experimental results in terms of the large body of classical studies.

Online Services:

Pop Up Archive (https://www.popuparchive.com/) is a platform of tools for organizing and searching digital spoken word. It processes sound for a wide range of customers, from large archives and universities to media companies, radio stations, and podcast networks. Drag and drop any audio file (or let the service ingest your RSS, SoundCloud, or iTunes feed), and within minutes receive automatically generated transcripts and tags.