Sunday, 7 September 2025

FamilySearch’s Full-Text Search Exploration Revisited

I wrote about FamilySearch’s Full-Text Search module on 15 July 2025 (FamilySearch Full-Text Search ... and other AI processes for reading old handwritten documents). In that post I described how FamilySearch is making major inroads to viewing old documents, providing valuable transcriptions of them, and expanding the abilities of family historians to find elusive ancestors. At that point the technique was still in the testing stage, part of a group of experimental programs.

Well, Full-Text Search has now gone mainstream. Searching scanned documents in their library for names and keywords for almost any subject is now a main category under the Search menu on the FamilySearch home webpage. To use the almost limitless capabilities of the FamilySearch resources you only need to register for a free account.


Hundreds of thousands of records from over 4,300 collections have been added to the inventory of searchable documents with more coming online regularly. Most importantly, non-indexed names and words can now be found, a major advancement in the hunt for family information.

Since I wrote my last blog post, I have wanted to look in a bit more detail at possible records of interest. This time I went looking for all mentions of our surname “shepheard” from the main home page search menu.


In all categories and in all places worldwide, the total hits for the name were 36,999. Most (19,158) were in records from the United Kingdom and Ireland with a close second for USA records (16,129). The other 1,712 references were from Africa (30), Asia & Middle East (3), Australia & New Zealand (1,192), Canada (446), Caribbean & Central America (15), Continental Europe (12), South America (2) and Other (24).


I selected the UK & Ireland regions, moved down to England (16,790 hits) and then down to Devon (3,203). That is the area where most of my Shepheard family originated. I could have further narrowed down the review in various specific Collections (15 of them), Centuries (1500 to 1900) and Record Types (59). These are summarized below.


Vital records, that is birth/baptism, marriage and death/burial entries, represent the largest proportion, as might be expected (2,212 hits, 69%). These are records that are most likely to be indexed.
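For readers who like to check the arithmetic, the 69% figure is simply the vital-records count divided by the Devon total. A minimal Python sketch of that calculation follows; the function name `share` is my own and is used only for illustration.

```python
def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


DEVON_TOTAL = 3203  # all Shepheard hits for Devon

# Vital records as a share of the Devon hits
print("Vital records:", share(2212, DEVON_TOTAL))

# The same calculation for the century counts that carry percentages in the summary
for century, hits in {"1700s": 535, "1800s": 1592, "1900s": 221}.items():
    print(century, share(hits, DEVON_TOTAL))
```

The same one-line division works for any of the filtered counts, which makes it easy to see which collections or record types dominate a set of results.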


Shepheard appears in records from 79 Devon parishes, most in the southern part of the county. Again, this would have been predicted as that is the region where the Shepheard families were most prevalent.

The images presented are of very good quality, many much better than the copies I obtained over the past few years from other sources. There are a few inconsistencies in the transcriptions, as one might expect with old documents. Some of the hits are different pages of the same collection but that does not diminish their importance.

Restricting my search to a specific spelling, of course, meant that I missed seeing many individuals, but that can easily be remedied by doing more broadly defined searches. For different spellings one can substitute a ? for a letter.

So, a search for “sheph?rd” resulted in 2,216,970 hits (UK & Ireland = 285,409; USA = 1,806,249; England = 256,779; Devon = 12,927). And another search for “shep??rd” got 3,261,473 hits (UK & Ireland = 389,635; USA = 2,643,351; England = 344,944; Devon = 15,980). Both identified a much greater number of individuals in the primary regions of Devon.
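To get a feel for which spellings a wildcard pattern will pick up before running a search, the idea can be tried out locally. The sketch below is only an illustration, not how FamilySearch does its matching: it assumes that each ? stands for exactly one character, it uses Python's standard fnmatch module, and the list of spellings and the helper name `matching_spellings` are my own.

```python
from fnmatch import fnmatchcase


def matching_spellings(pattern, names):
    """Return the names that fit the pattern, where each ? stands for one character."""
    return [n for n in names if fnmatchcase(n.lower(), pattern.lower())]


# A hypothetical list of surname variants to test against
spellings = ["Shepheard", "Shepherd", "Shephard", "Sheppard", "Shepperd"]

print(matching_spellings("sheph?rd", spellings))
print(matching_spellings("shep??rd", spellings))
```

Under that assumption, “sheph?rd” picks up Shepherd and Shephard, while “shep??rd” also picks up Sheppard and Shepperd. Neither pattern matches Shepheard itself, since that spelling has one more letter, so the wildcard searches complement the exact-spelling search.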

The results of just this basic search for potential family members could lead to months if not years of work to review all the documents and look for possible relationships. Full-Text Search is truly a game changer.

And after my Shepheard ancestors I can look for other lines. For example, a search for Mayfield had 544,125 hits (UK & Ireland = 15,579; USA = 519,062; Maryland = 2,231; Baltimore = 729; 1800s = 301). This was the name of one of my 3rd great-grandfathers, who was born in England in 1778 and migrated to the USA around 1810.

Obviously, there is an infinite number of searches that could be done which could add interesting and valuable information to my whole family tree.

Following is a summary of the 3,203 Shepheard search results for Devon in terms of numbers by area, collections, centuries and record types:

Full-Text Filtered Search Results (3,203)

Place: Devon, England, United Kingdom and Ireland

Collections (15)

England, Devon, Employment, from 1741 to 1838 (29)

England, Devon, Legal, from 1300 to 1600 (23)

England, Devon, Properties, 1745 (25)

United Kingdom, Biographies, from 1963 to 2009 (1)

United Kingdom, England, Biographies, from 1860 to 1884 (15)

United Kingdom, England, Deaths, from 29 September 1589 to 9 April 1877 (753)

United Kingdom, England, Education, from 1828 to 1839 (110)

United Kingdom, England, Employment, from 1753 to 1795 (73)

United Kingdom, England, Legal, from 28 March 1842 to 25 March 1845 (1,545)

United Kingdom, England, Marriages, from 1813 to 1868 (534)

United Kingdom, England, Others, from 1780 to 1933 (22)

United Kingdom, England, Poor Relief, from 1 March 1874 to 31 March 1874 (118)

United Kingdom, England, Properties, from 1628 to 1677 (200)

United Kingdom, England, Religious, from 12 April 1762 to 5 April 1863 (1,021)

United Kingdom, England, Residences, 1905 (9)

Totals by Century

1500s (2) 0.1%

1600s (110) 3.4%

1700s (535) 16.7%

1800s (1,592) 49.7%

1900s (221) 6.9%

Record Types (59)

Vital Records, Death Records, Burial Records (706)

Religious Records, Baptism Records (629)

Vital Records, Marriage Records (528)

Voting Records, Voting Registers (492)

Legal Records, Property Records, Land Estate Records, Land Estate Tax Records (438)

Religious Records, Parish Records (292)

Government Records, Tax Records, Tax Assessment Records (175)

Legal Records, Court Records (128)

Legal Records, Court Records, Probate Records (125)

School Records, School Enrollment Records, School Admission Registers (110)

Legal Records, Property Records, Rent Records (104)

Religious Records, Churchwarden Records (97)

Legal Records, Property Records, Land Records (86)

Religious Records, Poor Rate Records (86)

Government Records, Tax Records, Rate Books (56)

Business Records, Sale Records (44)

Government Records, Overseer Records (44)

Government Records, Overseer Records, Overseer Accounts (40)

Business Records, Employment Records, Personnel Files (39)

Legal Records, Court Records, Court Orders (33)

Vital Records, Death Records, Cemetery Records, Burial Registers (31)

Legal Records, Property Records (30)

Business Records (17)

Reference Materials, History Records (16)

Government Records, Public Records (16)

Government Records, Poor Law Records, Almshouse Records (9)

Religious Records, Poor House Records (9)

Reference Materials, Historical Geographies (8)

Vital Records, Death Records (7)

Vital Records, Death Records, Cemetery Records, Gravestone Transcription Records (7)

Genealogies (7)

Religious Records, Poor Relief Records (7)

Miscellaneous Records, Society Records (7)

Periodicals, Directories (6)

Legal Records, Property Records, Land Records, Land Assessment Records (6)

Business Records, Occupation Records (6)

Government Records, Poor Law Records, Poor Law Settlement Records (6)

Religious Records, Religious Marriage Records (6)

Religious Records, Baptism Records, Christening Records (4)

Government Records, Oath Rolls (4)

Government Records, Tax Records, Valuation Records (4)

Religious Records, Bishop Transcript Records (3)

Periodicals, Directories, City Directories (3)

Legal Records, Property Records, Land Records, Freeholder Records (3)

Genealogies, Heraldry Records (2)

Miscellaneous Records (2)

Legal Records, Court Records, Property Settlement Records (2)

Legal Records, Court Records, Bastardy Records, Bastardy Declarations (2)

Government Records, Nobility Records (2)

Voting Records, Voter Lists (2)

Vital Records, Death Records, Cemetery Records (1)

Vital Records, Death Records, Cemetery Records, Grave Registers (1)

Genealogies, Family Histories, Family History Record Indexes (1)

Legal Records, Court Records, Probate Records, Probate Indexes (1)

Government Records, Poor Law Records, Parish Poor Law Records (1)

Government Records, Town Records (1)

Government Records, Poor Law Records (1)

Reference Materials (1)

Monday, 11 August 2025

Natural Disasters: Present, Past and Future

I write and talk a lot about the effects Mother Nature has and has had on people and communities. You can read my list of published articles and books on this site here. My presentations are summarized here. I also try to maintain a bibliography of books and articles about the relationships of natural phenomena and family history. You can see the reading list here.

We are constantly bombarded (or so it seems lately) with news headlines and opinion pieces about natural disasters around the world, in many cases because of their purported connection with climate change. Opinions differ on whether the events we are observing now, around the world, are unique in terms of intensity, history and their impact on human settlements.

Many are related to short and long-term weather patterns: droughts, heat waves (the polar vortex in the winter), floods, storms, wildfires, etc. Other disasters that are part of the Earth’s normal geological processes include earthquakes, landslides, volcanic eruptions and tsunamis. All, of course, can cause distress and mayhem to people. As they have done for eons!

A short list of recent major events includes:

· Drought in Western Canada, Western USA and around the world

· Earthquakes in Japan, Russia, USA

· Floods in New York and New Jersey (and Algeria, Australia, Bolivia, China, the Congo, Jordan, Nigeria, USA)

· Glacier collapse in Switzerland

· Storms (including hurricanes and tornadoes) in Australia, Canada, Egypt, Kuwait, the Philippines, Tunisia, UK, USA

· Wildfires in Southeast Europe, Canada, Korea, USA

· Volcanic eruptions in Iceland, Indonesia, Russia

News reports and some studies state that events are getting worse – in frequency, intensity and regional scope. But these comments only relate to a few past decades, not to the overall historical record.

The costs associated with damage from natural disasters have reached record levels. But they are mainly in areas where there are large populations and highly developed infrastructure. Would anyone expect the relative cost of the drought and fires in Europe in 1540 or 1842 to have been much different on a per capita basis?

Our records of natural events, including written historical documents, only go back a few hundred years, in most regions much less time. For example, you can read about major storms in millions of newspaper articles going back to the early 18th century.

Geological and geographical records show major catastrophic events have occurred regularly in past centuries and over hundreds of millions of years.

Regarding family history studies, it is informative to look at what is going on in the modern world to appreciate how such events may have affected our ancestors, who were not likely to have been as well prepared or warned about impending natural disasters.

To take one example from the present and relate it to outcomes in the past, we can look at the drought conditions plaguing western North America. The current ongoing megadrought began in 2000. Such dry periods have not been rare occurrences in this region.

The medieval era in western North America was also characterized by widespread and regionally severe, sustained droughts. Proxy data, primarily in the form of tree rings, indicate decades-long periods of increased aridity illustrated as peaks on the graph and shown as red on the map from AD 1150 across the central and western U.S.

In the Colorado and Sacramento River basins, reconstructions show long periods of persistently below average river flows during several intervals including much of the 9th, 12th, and 13th centuries.

Other proxy records include the position of tree lines, melting of glaciers and the types of chironomids present. Chironomids are a distinct group of lake flies whose populations and types can be correlated with specific climatic conditions.

All these proxies are consistent in supporting periods of elevated warmth in the medieval period that coincide with periods of severe and widespread drought. The driest episode was in the mid-12th century and was more extensive and persistent than any modern drought experienced.

One of the casualties of the long drought was the collapse of the Anasazi or Ancestral Pueblo society that had thrived for hundreds of years in the southwest part of North America. A series of megadroughts of the 10th to 13th centuries finally took their toll on the residents and forced them to move.

More recent droughts that may have impacted our ancestors, possibly driving them to migrate, that we can learn about in published records and family stories include: the 1930s Dust Bowl; the Great Plains droughts in the 1890s, 1870s and 1860s; central Europe in the 1840s; 1790s in Australia; 1760s in the British Isles; early 1600s in the American colonies; and the mid-16th century in Europe.

Drought is a normal, recurrent feature of climate that occurs in virtually all climate zones. Further to that thought, droughts have occurred virtually every year someplace and megadroughts have been experienced at least once per century, not uncommonly more frequently. The Earth will certainly continue to experience them in the future.

One wonders: if residents living in the dry southwest region now could move, would they?

 

References and Data Sources

Ancestral Puebloans https://en.wikipedia.org/wiki/Ancestral_Puebloans

Southwestern North American megadrought https://en.wikipedia.org/wiki/Southwestern_North_American_megadrought

The U.S. Drought Monitor (USDM) is a map released every Thursday, showing where drought is and how bad it is across the U.S. and its territories.  https://droughtmonitor.unl.edu/CurrentMap.aspx

The National Integrated Drought Information System (NIDIS) is a multi-agency partnership that coordinates current drought monitoring, forecasting, planning, and information internationally and also historically. https://www.drought.gov/international

The North American Drought Monitor (NADM) is a cooperative effort between drought experts in Canada, Mexico and the United States to monitor drought across the continent on an ongoing basis. https://www.ncei.noaa.gov/access/monitoring/nadm/maps

The Global Drought Information System (GDIS) is a tool for visualizing drought related data across the globe. https://gdis-noaa.hub.arcgis.com/

The Canadian Drought Monitor (CDM) is Canada's official source for the monitoring and reporting of drought nationally. https://agriculture.canada.ca/en/agricultural-production/weather/canadian-drought-monitor

Copernicus is an EU program aimed at developing European drought information services based on satellite Earth Observation and in situ (non-space) data. https://drought.emergency.copernicus.eu/

Tuesday, 15 July 2025

FamilySearch Full-Text Search ... and other AI processes for reading old handwritten documents

Generative Artificial Intelligence (AI) is becoming a more prevalent technique for transcribing old documents, in particular those with handwritten text. Programs are growing in number for very old records and for use in many languages.

The process revolves around Optical Character Recognition for written or printed documents that were created using old handwriting or printing styles. The idea was to create a digital library or memory of letter and word shapes that could then be compared to new, scanned images to produce an interpretation of what the document contained. The results would be applicable to everything from reading single pages or entries (of interest to genealogists) to mass conversion of documents containing a multitude of pages stored in archives. The objectives are to do so quickly, easily and accurately.

FamilySearch Full-Text Search

One feature that has caught the eye of genealogists lately is the Full-Text Search function developed by FamilySearch that has expanded options for searching handwritten documents in the thousands of collections and millions of images it has in its digital library.

Full-Text Search, introduced in 2024, is one of a group of experimental programs in FamilySearch Labs “where you can explore emerging FamilySearch features that are not yet ready for public release”. Researchers are invited to participate in refinement of the programs, through testing and feedback.

Full-Text Search uses AI processes developed by FamilySearch to scan and locate information on digitized documents that have not been fully indexed. Their technique makes it possible to find specific words or phrases, including names of people that may not have been the main parties. All one needs to participate is an account on the site, which is free to obtain.

I wanted to try out the Full-Text Search to see how it would work. Beginning on the FamilySearch Labs page, I selected the experiment titled Expand your search with Full Text and clicked on “Go To Experiment.”

A search form came up asking for information on keywords, names, places, dates and collections to locate data about people and events.

On the form I entered just Nicholas Shepheard, which is the name of several of my ancestors, including four in my direct line, who lived in Devon, England for several centuries. As recommended, I put his name in quotation marks so that it would look for instances where both names were present and with the exact spelling.

I did not fill in any other data as I wanted the widest search possible. A list came up with 79 results with some documents having the name repeated several times within them. The documents were spread across many areas in the United Kingdom and Ireland (72) and the United States of America (7). In the UK and Ireland group, 66 were from English sources, four were from Ireland, and two were from Wales. The English sources included 11 counties or regions.

Then I narrowed down the list by choosing only those in Devon and got 37 results. Images of each original document found could be opened so I could see who the individual was and whether they were part of my family. A full transcription accompanied the images, both of which could be downloaded.

Some of the records I had seen before from searches of other collections and websites. Some were new to me. For convenience, Full-Text Search highlighted Nicholas’s name on the images and in the transcription.

One of the documents was a 1786 settlement examination from Ermington parish (reference FamilySearch: England, Devon, Plymouth, Parish Chest Records, 1556-1950). It shows that a man named Hercules Ferris worked for Mr. Nicholas Shepheard around 1761-62 at his farm called “Quay” in Cornwood parish.

AI transcriptions are not perfect. The settlement transcription had 14 errors plus six missing words out of a total 189 words in the actual transcription, a 10.6% character error rate (CER). I determined that Quay was a misinterpretation for Gnats. This location later became Notts and was the Shepheard family seat for over 170 years – between 1630 and (probably) 1806.

This was the first time I had seen this document. I had not encountered it on any search of Cornwood or Ermington parish indexes or lists that included Nicholas Shepheard’s name. Important to me, the document appeared to confirm the family was living at Gnats/Notts in the mid-18th century.

So, my experiment was a success!

Future projects will be to investigate those other examples with Nicholas Shepheard in UK and USA documents and to investigate other family members and locations.

Or I may look at natural events, one of my favorite subjects. For example, I did a quick search for “hailstorm” and got 6,536 results: 595 in the United Kingdom and Ireland and 5,711 in the United States of America. A search for “floods” got 149,004 hits: 10,107 in the UK and Ireland and 132,618 in the USA; “earthquakes” got 3,531 and 231,628, respectively; “famine” got 18,861 and 194,020, respectively. The results included mentions in newspapers which FamilySearch has in its library. Each search can be narrowed down to locales, years and names which will be handy for looking at specific families and past homes.

I highly recommend family historians take advantage of this new program and do some searches for their ancestors. I think you will be very pleasantly surprised.

Other AI Transcription Options

In reviewing AI transcription options, I also wanted to test and compare other techniques, so I uploaded the 1786 Ermington settlement example to other platforms. The results were eye-opening.

Transkribus

Transkribus has become one of the internationally recognized go-to programs for transcribing historical documents.

The development of software to transcribe old records began back in the late 1990s. Libraries were already using Optical Character Recognition to digitize printed books, but primarily for those written in English.  Another program was needed for material published in other languages.

Researchers came up with Analysed Layout and Text Object format which stored text and images of handwritten letters and words. The images were transcribed and stored for comparison to other documents over time building up a library of words and phrases. Such documentation became what is called Ground Truth, a growing repository of images that could serve machine learning, or artificial intelligence processing.

The Transkribus project was established and backed by several institutions, coming together as the READ-Coop, formed to test and further develop the programming. The group became the official guardians of the Transkribus platform. There are now more than 100 European members of the coop.

The process is simple to use. Just set up a free account, open the program, drag an image into the left-hand side of the window and the program will immediately begin. After a few minutes waiting in the queue, a transcription will be available. A line-by-line comparison with the original image can be produced.

I followed this formula with the 1786 Ermington settlement document.

Ancestry

Ancestry is developing a new process – still in Beta testing at present – called Document Transcription Tool. The function can read and transcribe a variety of old handwritten documents. This feature can be used globally across all Ancestry platforms, including the app, mobile, and desktop websites and in multiple languages.

To use the program, a user must have an Ancestry account and a family tree posted on their site. A target document is first loaded on to the Gallery section of an ancestor’s tree profile. Once opened, a button marked “Transcribe” is selected and the process will begin. The transcription takes only a few minutes.

For my test, I added the 1786 Ermington settlement document to the profile of my 5th great-grandfather, Nicholas Shepheard. I then let Ancestry do its thing and come up with a transcription.

ChatGPT and Copilot

As part of my assessment, I also looked at having two other AI sites attempt a transcription: ChatGPT, developed by OpenAI; and Microsoft Copilot. These are two main-line platforms, developed by well-known groups, now commonly used in AI processing.

After uploading the 1786 Ermington settlement document to each of them, I asked, “Can you transcribe this image?”

Again, almost immediately I had transcriptions of the document.

Results

I compared Transkribus, Copilot, ChatGPT and Ancestry results with my own (actual) transcription. On the illustration here, my transcription, which I believe is accurate, is on the right. All the words in the four processes which matched the actual transcription are highlighted in yellow.

All AI techniques worked well. Character Error Rates (CER) were calculated for each from the number of words transcribed wrongly plus any word count difference in the result.

      The best CER was in the ChatGPT transcription at just 7.9%, including missing seven words.

      Copilot was right behind with a CER of 8.5%. This transcription was very close to the actual with only seven words mis-transcribed. It did miss nine words, though.

      The Ancestry transcription ended up with fewer words than the actual transcription. It missed a whole phrase, along with the last word. Its 12.7% CER is acceptable but such large rates need close, line-by-line checking.

      Most of the 16 errors in the Transkribus transcription were words where letters were misread, such as often mistaking ‘e’ for ‘o’. Curiously it looked at a blemish on the document and transcribed it as a number. The CER of 11.1% is a bit misleading as it was very close to the original in word count, format and spelling.

      The FamilySearch Full-Text Search transcription had, as noted above, a CER of 10.6%, comparable to the other platform results (14 errors plus six fewer words).
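For anyone wanting to repeat the comparison on their own documents, the calculation described above can be sketched in a few lines of Python. This is a minimal illustration of the formula as I have described it (wrongly transcribed words plus the word count difference, divided by the 189 words in the actual transcription); the function and variable names are my own. Because it counts whole words rather than individual characters, it behaves more like what is often called a word error rate.

```python
def error_rate(wrong_words, word_count_difference, reference_words):
    """Errors as a percentage of the words in the reference (actual) transcription."""
    errors = wrong_words + word_count_difference
    return round(100 * errors / reference_words, 1)


REFERENCE_WORDS = 189  # words in the actual transcription of the 1786 settlement

# FamilySearch Full-Text Search: 14 errors plus six fewer words
print("FamilySearch:", error_rate(14, 6, REFERENCE_WORDS))

# Copilot: seven words mis-transcribed plus nine words missed
print("Copilot:", error_rate(7, 9, REFERENCE_WORDS))
```

With the counts given above, this returns 10.6 for FamilySearch and 8.5 for Copilot, matching the rates quoted.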

Many errors can be a result of penmanship as much as historical writing styles.

Transcriptions from any of the processes can be improved by making corrections to the results offered and resubmitting them. Over time, as the archive of “ground truth” (more examples processed and corrections submitted) is built for similar documents, the transcriptions will get better.

Overall, the results of all the techniques were very encouraging. I certainly will be using each, or all, of them going forward.

Online References

AI know how for family history: Have you tried the FamilySearch AI Full-Text Search. https://www.family-tree.co.uk/how-to-guides/ai-know-how-for-family-history-have-you-tried-the-familysearch-ai-full/

Ancestry News: Ancestry launches Document Transcription Feature https://www.ancestry.com/c/ancestry-blog/ancestry-news/document-transcription-feature

ChatGPT https://chatgpt.com/

Copilot https://copilot.microsoft.com/

FamilySearch Full-Text Search https://www.familysearch.org/en/search/full-text

FamilySearch Labs https://www.familysearch.org/en/blog/familysearch-labs

Mühlberger, Günter. (2023). A Short History of Transkribus. https://blog.transkribus.org/en/a-short-history-of-transkribus-with-gunter-Muhlberger

Transkribus https://blog.transkribus.org/en

BYU Library helpful recent videos

Using the FamilySearch Full Text Search Feature-A Genealogical Goldmine – James Tanner (2 June 2024) https://youtu.be/YRYn7wyo7OA?si=J7P10grh7p_pxhPA

AI, Handwriting Recognition, and Full Text Searches – James Tanner (2 February 2025) https://youtu.be/5PVUHrJLT4w?si=I5LLqSafJ3iJ6BPm

FamilySearch Full-Text Search: A New Key to Tearing Apart Brick Walls – Amy Peacock (4 February 2025) https://youtu.be/udU2xT0ssXA?si=Mnf06b4K793Nem-m

Getting to Know FamilySearch's New Full-Text Search – Kathryn Grant (19 February 2025) https://youtu.be/LhNE8znSPgM?si=_Fz1lT-UTfbPXQlj

The Needle in the Haystack: Researching Women and Minorities using FamilySearch’s Full-Text Search – Julia A. Anderson (26 February 2025) https://youtu.be/jPg0qTcsBVM?si=6Qxrxzypi2-UdBS4

The FamilySearch Full Text Function – Jerroleen Sorensen (10 May 2025) https://youtu.be/4M3h-bSiQGM?si=5p5z20qZ2u9dz87b

Legacy Family Tree Webinars

Full-Text Search: Genealogy Game Changer – Geoff Rasmussen (11 March 2024).  https://familytreewebinars.com/webinar/full-text-search-genealogy-game-changer/

Secrets for Success: How to Harness the Power of FamilySearch’s Full-Text Search – Julia A. Anderson (21 May 2025) https://familytreewebinars.com/webinar/secrets-for-success-how-to-harness-the-power-of-familysearchs-full-text-search/