Sunday, May 18, 2025

Not Everything is on the Internet


I have often heard it said that “everything is on the internet.” This is patently false. Despite the great advantage the internet offers by putting vast amounts of knowledge at our fingertips, it is important for individuals, especially the younger generations who never knew a pre-internet world, to understand its limitations.

There is the obvious problem that when the internet became popular, everything that was previously in print didn’t magic itself into the digital world. Archivists, historians, and computer scientists have been hard at work digitizing massive amounts of printed material. Most impressively, The New York Times has managed to digitize and archive all of its articles from its founding in 1851 to the present, and The Times of London has done the same back to 1791. Yet very few periodicals have been able to retroactively digitize their complete publications. Of most interest to young people, DC Comics and Marvel, while providing access to large portions of their historic comic books, have been unable to digitize their complete back catalogs.

Many internet databases of scholarly articles have only been retroactively digitized back to the 1970s. Of course, we can arrogantly claim that there is little of value to learn (especially in the sciences) from pre-1970s research. But even without debating that point, it is easy to see how ruling out sources from before that time could limit our perspective on a myriad of topics.

There is yet another thing to consider. Similar to how a tendency to only publish the results of experiments that show statistical significance leads to a publication bias, a selection bias exists in archival digitization. In the context of digitization, selection bias occurs when archivists prioritize manuscripts they personally or institutionally value, often excluding others that may be equally or more relevant to different researchers or communities.

A great example of this is the effort of the Library of Congress to digitize documents related to the Founding Fathers. Due to budgetary and time constraints, the LoC has prioritized key figures, such as Thomas Jefferson and Benjamin Franklin, and has yet to digitize documents related to lesser-known contributors to our nation’s founding, such as George Read, a signer of the Constitution from Delaware. The George Read papers consist of 8 boxes, occupying 2 linear feet of space in the library archives. Because no historian has bothered to research George Read in depth and publish a book on his works, his manuscripts are not even indirectly available through the internet in the form of a biography that could be purchased online. Only two documents related to George Read have been digitized: a letter from James Madison to Read (digitized because of its association with Madison, a former US president) and a eulogy written by George Read honoring Captain James Lawrence and Lieutenant A.C. Ludlow, who were killed in the War of 1812. While the efforts of the Library of Congress are commendable, Americans need to understand that the Library must be selective in what it chooses to digitize, and this selection inherently introduces bias.

Over the years, when voicing my disagreement with the statement that “everything is on the internet,” I have often been met with the challenge “like what?” While a complete list of what is not available on the internet would be impossibly long, I can provide specific examples from research I have conducted in recent years.

For an undergraduate paper on self-esteem, I discovered that the internet did not contain the source of the Rosenberg Self-Esteem Scale (RSES), the first attempt to operationalize the concept of self-esteem for measurement. Morris Rosenberg’s book Society and the Adolescent Self-Image was added to JSTOR in December 2014, a year after I completed my paper. For my research, I had the pleasure of accessing the print book from the Harold B. Lee Library at BYU.

While researching my Icelandic heritage, I discovered that the Icelandic Sagas do not exist in their entirety on the internet, nor do the majority of archaeological manuscripts, whether they be manuscripts of the Iliad or the Dead Sea Scrolls.

In my ongoing work to create a book that gives ethnographic and anthropological details about the Native American tribes after which 27 states are named, I discovered that there is very little online information about the Alabamu tribe, from which the state of Alabama derives its name. For my research, I chose to purchase books related to the tribe, such as The Alabama-Coushatta Indians by Jonathan B. Hook, Alabama-Coushatta Indians: ethnological report and statement of testimony by Daniel Jacobson, Documents on the Alabama and Coushatta tribes of Texas by Howard N. Martin, and Myths & Folktales of the Alabama-Coushatta Indians of Texas by Howard N. Martin. Some of these books may have recently been digitized, yet the point remains that online information regarding the Alabamu tribe is scarce.

In other areas of research, I have found that the majority of primary sources, and essential secondary sources, are not available online for topics such as the Battle Creek Skirmish (1849) and the Battle of Fort Utah (1850) in central Utah. Information on the history of Guyana is relatively scarce, specifically on the rule of Forbes Burnham and his oppression of the Indian population there. The same is true of documents related to the voyage of Christopher Columbus and the Age of Discovery: an English translation of the oft-cited and invoked “History of the Indies” by Bartolomé de las Casas has not been digitized in its entirety (although the far less detailed “A Short Account of the Destruction of the Indies” is widely available in digital libraries). Finally, much of the Soviet defector literature from the Cold War era, specifically the works of Yuri Bezmenov, Ion Pacepa, and Anatoli Golitsyn, is not available in ebook format.

Admittedly, these are specialized topics, all related to history, one of my areas of expertise. However, the examples illustrate the incomplete nature of digitization and online knowledge, and highlight the importance of seeking out analog media in our self-study.

