Incorporating non-baseball biographical sources

The appearance of the Macmillan Baseball Encyclopedia in 1969 is perhaps the single most significant milestone in the recording of baseball history. For the first time, players’ career statistics were systematically re-tabulated by going back to the original sources.

Statistics naturally made up the bulk of the Big Mac, as the encyclopedia came to be known. However, the book also marked the first appearance of a truly systematic register of people who had played in the Major Leagues, listing not only their year-by-year career performance, but also their full names and their dates and places of birth and death.

The existence of this information, which forms the core of the biographical data on baseball persons, owes much to Lee Allen, who was historian at the Hall of Fame in the 1960s. Allen made it his business to know about players as people: what they got up to outside of baseball and after their retirement, where they lived, and when and where they passed away.

The Lee Allen group

Consulting non-baseball sources is essential for filling out the broader picture of people in the Register and linking them to their other activities. The informal Lee Allen group contributes to the Register by finding and indexing records from sources outside the game, and matching them to entries in the Register. Its members look for records on baseball people in sources including:

  • Obituaries and death notices. We maintain a file of references, and full text where possible, on obituaries and death notices relating to the passing of baseball people, thanks especially to the BaseballNecrology e-group. These are semi-structured data: extracting the essential details into a form that can be processed requires the skills of a human researcher. These sources vary widely in their coverage (and accuracy), but can provide important leads to a person’s activities that allow linkage to other records.

  • Public and administrative records. These include birth and death certificates, military service records, Social Security records, Census records, and similar items, which we bundle together into a collection we refer to (slightly inaccurately) as ‘vitals’. These often aren’t conclusive on their own, because few explicitly mention baseball as an occupation, but can often be linked usefully to other records.

  • Entries on findagrave.com. We maintain an index of findagrave memorial records, which are quite useful alongside obituaries and death notices. We use this information with caution, as its user-contributed nature means it is not fact-checked to the standard that other sources might be. For instance, there are many cases in which findagrave contributors have copied Register information into a findagrave memorial (usually via Baseball-Reference), so such a memorial does not independently confirm what the Register already says. These records are most helpful in distinguishing different people from each other.

Progress and data quality

The group follows a systematic strategy of looking for information based on data which are missing or ambiguous in baseball records. In a typical week, the group identifies and confirms new records updating the profiles of about 100 people.

Working systematically, and keeping track of which people have (and have not) been researched, is essential, given that the population of professional baseball players is likely around 250,000. Experience shows that, even for relatively uncommon names, there are many potential close matches in public records for a given player. Further, with incomplete names and inaccurate dates of birth so prevalent in baseball records, matching records with a sufficient degree of confidence is a labor-intensive process.
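To give a feel for why matching is laborious, here is a minimal sketch in Python of how candidate public records might be scored against a Register entry. It is not the group's actual procedure: the `Person` fields, the weights, the two-year tolerance on birth year, and the 0.7 threshold are all assumptions chosen for illustration. Name similarity comes from the standard library's `difflib.SequenceMatcher`.

```python
from dataclasses import dataclass
from datetime import date
from difflib import SequenceMatcher
from typing import List, Optional, Tuple


@dataclass
class Person:
    """A minimal record. All field names here are hypothetical."""
    name: str
    born: Optional[date] = None
    birthplace: Optional[str] = None


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between two names, from 0 to 1."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_score(player: Person, record: Person, max_year_gap: int = 2) -> float:
    """Combine name, birth year, and birthplace evidence into a rough score.

    The weights (0.6, 0.3, 0.1) are illustrative assumptions only.
    Missing fields contribute nothing rather than counting against a match.
    """
    score = 0.6 * name_similarity(player.name, record.name)
    if player.born and record.born:
        gap = abs(player.born.year - record.born.year)
        if gap <= max_year_gap:
            # Full credit for the same year, less for each year of difference.
            score += 0.3 * (1 - gap / (max_year_gap + 1))
    if player.birthplace and record.birthplace:
        if player.birthplace.lower() == record.birthplace.lower():
            score += 0.1
    return score


def rank_candidates(
    player: Person, records: List[Person], threshold: float = 0.7
) -> List[Tuple[float, Person]]:
    """Return (score, record) pairs at or above the threshold, best first."""
    scored = [(match_score(player, r), r) for r in records]
    kept = [pair for pair in scored if pair[0] >= threshold]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)
```

A score of this kind can only shortlist candidates. Because several distinct people may clear the threshold for one player, each shortlisted record would still need review by a researcher before anything is entered in the Register.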

Therefore, while the group welcomes, in principle, leads on non-baseball records of the kinds listed above that are candidate matches for baseball people, it cannot guarantee a response to correspondence about a particular record, or offer a timeline for when new information on a particular person will appear in the Register.