MyHeritage recently released Scribe AI, its software for transcribing old documents. This is a subject I have been interested in for years: I have written about it here and in some articles, and made it part of several presentations. My blog posts: Reading and Transcribing Old Handwritten Documents: Transkribus; FamilySearch Full-Text Search ... and other AI processes for reading old handwritten documents; FamilySearch’s Full-Text Search Exploration Revisited.
Using AI in searching for and transcribing old records is
probably the most useful aspect of the new technology. Simply improving writing
techniques or results using the assistance of programs such as ChatGPT is
interesting but does not advance anyone’s knowledge of their family history or
use of old documents.
Scribe AI is another program that allows one to quickly
transcribe old handwritten records. Others that I have experimented with or used
include FamilySearch’s Full-Text Search, Ancestry’s Document
Transcription Tool, Microsoft’s Copilot and OpenAI’s ChatGPT.
And, of course, I have laboured over my own transcriptions, having gained some
expertise in reading old handwritten documents over the past several years.
The available options for AI uses have varying degrees of
success, dependent on the quality of the writing, the age of the document, the
handwriting style, the type of preserved record and the language used. As the
tools become more widely used, the results will undoubtedly get better. Each
program improves as it is used to transcribe more documents and “learns” how
these documents are created.
So how did Scribe AI stack up in my testing?
I have recently been reviewing the history of Dunwich, a
lost town along the east coast of Suffolk. It was a vibrant commercial hub for
hundreds of years from before the Norman Conquest in the 11th
century to well into the 16th century. Throughout that period, the
coastline was being eroded away by waves, currents and storms in the North Sea.
Hundreds of buildings, including many churches, have been
destroyed. In most cases, the records of people who lived there were also lost.
One set that has been preserved comprises the baptism, marriage and burial registers
for St. Peter Church from the time of the Reformation to the middle of the 17th
century when the church was abandoned. The last vestige of the building itself
went into the sea in 1697.
Some family history websites have indexes of these old
records, but nowhere are the actual images of the registers published. The
Suffolk Archives has a microfilm copy of the register, but the original
document is now at the British Library.
I engaged a genealogical consultant to photograph the
document, which is amazingly complete except for much of the middle of the book,
which was seriously water-damaged. I am now going through the entries to
transcribe the information and see what I can learn about the history of the
people who lived in Dunwich before it was gone forever.
I selected one page from the 1654 marriage register and
uploaded it to the MyHeritage Scribe AI site to see how it would handle the
document.
Apart from some spacing and capitalization issues and the older style spelling (e.g. “marriage” instead of “marriage”) used by the writers, the result was quite good. If we discount these differences, there were only 11 errors (highlighted in yellow) in 322 words, an error rate of about 3.4%. The unfortunate part was that they were almost all surnames. That is not surprising, since these are names we are not used to seeing, but it does emphasize that these are exactly the things we should pay careful attention to. By the way, I tested the document with Transkribus and the error rate was 16%.
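For anyone who wants to repeat this kind of check on their own pages, the error rate above is simply the number of wrong words divided by the total number of words. Here is a minimal sketch in Python; the function name error_rate is my own for illustration and has nothing to do with Scribe AI itself.

```python
def error_rate(errors: int, words: int) -> float:
    """Return the share of incorrectly transcribed words as a percentage."""
    if words <= 0:
        raise ValueError("word count must be positive")
    return 100.0 * errors / words


# The 1654 marriage page: 11 errors in 322 words.
marriage_page = error_rate(11, 322)
print(f"{marriage_page:.1f}%")  # prints 3.4%
```

Keep in mind that the count of errors still depends on what you choose to ignore, such as old-style spelling or spacing.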
An interesting aspect of the Scribe AI process is that, in addition to the transcription, it also provides notes on: the historical context of the document; details mentioned, including those of principal individuals and associated individuals; key findings; and suggested next steps.
Among those next steps are good reminders to search
historical record collections for the named people, consult local archives for
more information about the area, look for wills and probate records and
investigate local area histories.
I also had Scribe AI look at a page of baptisms from 1539 to
1542. These are reasonably clear compared to many other pages in the register,
so I was hopeful that the AI transcription would be helpful.
It was! Compared to my transcription, there were 11 errors (highlighted in yellow) out of 420 words. I ignored some old-style spelling and some spacing problems. The errors were mostly in names (25 last names and 3 forenames) and dates (17). The Roman numerals gave Scribe AI a bit of a problem.
The baptism transcription test result also included many notes about historical context, details mentioned, key findings and suggested next steps. Among those suggestions were to have a look at other parish collections on MyHeritage, particularly marriages, to keep in mind spelling variations in surnames and to check probate records.
In my limited tests, the results of Scribe AI were very
good. I expect that with more testing, and with corrections entered for the
errors I found, the results would improve. I did try a page that was severely
water-damaged, and it had, predictably, poor results. It did remind me that much
of the page was illegible, though.
I recommend using Scribe AI for any old documents you might
want to have transcribed.
I also recommend you continue to use FamilySearch’s
Full-Text Search to find those old documents and use their transcription
process. You might want to copy those documents to MyHeritage’s Scribe AI and
compare the results.
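If you save two transcriptions of the same page as plain text, a comparison like that can be done word by word with a few lines of Python. This is only a sketch using the standard difflib module; the function name word_differences and the sample sentence are invented for illustration and are not taken from the Dunwich register or from either website.

```python
import difflib


def word_differences(reference: str, candidate: str) -> list[tuple[str, str]]:
    """List the word runs where two transcriptions of one page disagree."""
    ref_words = reference.split()
    cand_words = candidate.split()
    matcher = difflib.SequenceMatcher(None, ref_words, cand_words, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Pair the reference wording with the candidate wording.
            differences.append(
                (" ".join(ref_words[i1:i2]), " ".join(cand_words[j1:j2]))
            )
    return differences


# Invented sample text, not a real register entry.
mine = "John Smyth and Mary Browne"
theirs = "John Smith and Mary Browne"
print(word_differences(mine, theirs))  # prints [('Smyth', 'Smith')]
```

Each pair shows my reading first and the other transcription second, which makes it easy to see whether the disagreements are concentrated in surnames and dates.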



