For work, I’m transcribing a 1966 Thanksgiving Sermon from the Riverside Church. While we have a site set up so anyone can help us correct transcripts, this work is to improve our transcript creation process. We use software called Kaldi to create transcripts from audio and video files. The software isn’t perfect and the transcripts have a lot of mistakes. However, something is better than nothing, and even an imperfect transcript improves our keyword search capability for the American Archive of Public Broadcasting.
We’ve been working with the Computational Linguistics Lab at Brandeis University to improve our transcripts with pre- and post-processing. One tool the Brandeis team developed is an audio segmenter, which gives us the timecodes for music in a file. We can then use those timecodes to skip the music sections so that Kaldi won’t try to make words out of music.
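To make the skipping idea concrete, here is a minimal sketch in Python. It is not the Brandeis segmenter or anything from Kaldi; the function name `non_music_segments` and the input format (a list of start and end times in seconds) are assumptions made up for illustration. Given the music timecodes and the length of the file, it returns the stretches that are left over, which are the only parts a speech recognizer would need to see.

```python
def non_music_segments(music, duration):
    """Return the (start, end) spans, in seconds, that are not covered by music.

    `music` is a hypothetical list of (start, end) timecodes in seconds,
    standing in for whatever a segmenter reports. `duration` is the total
    length of the file in seconds.
    """
    keep = []
    cursor = 0.0  # end of the last music section seen so far
    for start, end in sorted(music):
        # Clamp timecodes to the bounds of the file.
        start = max(0.0, min(start, duration))
        end = max(0.0, min(end, duration))
        # Anything between the cursor and the next music section is kept.
        if start > cursor:
            keep.append((cursor, start))
        # Advance past this music section (handles overlapping sections).
        cursor = max(cursor, end)
    # Keep whatever follows the final music section.
    if cursor < duration:
        keep.append((cursor, duration))
    return keep


if __name__ == "__main__":
    # Made-up example: an opening hymn and a later musical interlude
    # in a 30-minute recording.
    for start, end in non_music_segments([(0.0, 30.0), (600.0, 720.0)], 1800.0):
        print(f"transcribe {start:.1f}s to {end:.1f}s")
```

Each returned span could then be cut out of the recording and transcribed on its own, with the start time added back to the transcript's timestamps so they still line up with the original file.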
In order to test the software, I needed some transcripts with music in them. So I grabbed the Thanksgiving Sermon, given by James A. Farmer Jr., and I’ve been transcribing it. Along the way, I’ve been thinking of monks copying and illuminating manuscripts, and I have this parallel feeling of doing something similar with modern technology.
I’m copying religious words for preservation and access. The mind wanders when copying a sermon. At moments, I dwell on what the minister says. Lose myself briefly as I think about this church in New York City at that moment in time.
There won’t be any doodles in the margins of my transcripts. How does one doodle in a text editor? ASCII art? There is, however, a line which passes from me through history to other people who have copied and preserved the words of God.