How To Use the Trans-Atlantic Slave Voyage Database — Basic Data Science for Historians
Although data science might seem like an abstract realm where the computer scientist is more comfortable than the humanist, the so-called Fourth Industrial Revolution, with its rapid increase in available data, has required that teachers start to train students in data literacy. This type of thinking combines algorithmic sense with statistical fluency, always within a social and ethical context.
Historians are an essential part of that equation, because of the many types of archives and databases already established. In this short explanation, I will demonstrate how to introduce a few of these concepts such as the basic structure of a query, with the Trans-Atlantic Slave Trade Database. I will also show some ways that a person could expand this lesson with additional formats.
What is the Trans-Atlantic Slave Trade Database?
Before we step further into how to use the database, let me explain a little about its background. The Trans-Atlantic Slave Trade Database is a collection of records on individual ships and their cargo, as well as other types of data, that document the Trans-Atlantic slave trade. The material comes from many sources, like ship manifests and customs documents. Scholars collaborated to create the entire database over several decades. Most of the material in the present database comes from the 1999 Cambridge University Press publication, The Trans-Atlantic Slave Trade: A Database on CD-ROM. When the designers moved that data to an online database, they also allowed scholars to update and add to the archive as they discovered new information. Today, the database consists of almost 36,000 ship voyages from 1514 to 1866.
Begin to Query
Let’s dive right in. To begin, first check out the main page, www.slavevoyages.org. The top navigation bar consists of the following headings: Voyages Database, Assessing the Slave Trade, Resources, Educational Materials, About the Project, and Language. Feel free to check out some of the other resources. The site contains academic essays about slavery and lots of summary statistics and maps that visualize the trade.
We are going to focus, however, on the main database. So, highlight Voyages Database in the top navigation menu and select “Search the Voyages Database.” (You can also follow the link in the main body of the page, the first of the three main links.)
Whenever I teach students how to use this database, I always stop here and ask them to analyze and think about what they are seeing. Before any data analysis, one must be familiar with the data. We can “be familiar” with the data in many ways, such as understanding general patterns and outliers, but in order to do that, we must understand the structure of the database and the data.
In other words, how is the content organized?
What is a database?
In order to assess the organization, we have to reflect briefly on what a database is.
Databases are pretty straightforward and are almost ubiquitous today. The majority of people are familiar with the concept, because Google is a kind of database, and so are most search engines. The local library hosts a database of books, and businesses create databases of clients and expenditures.
At the simplest level, a database is an organized collection of data for a specific purpose.
Hence, the local library might collect data about books and other media. Each book has the title, the author (or maybe the composer in the case of music), the subject, and other relevant features like an ISBN number. Other data that might be included could be media type and date of publication. All of this information tells you data about an object. And together all the objects and their data form what we call the library’s database.
It’s pretty simple, right?
So far we have four terms: objects, variables, data, and databases.
Objects: the particular item which is the source of the data. In the case of the library, this would be the book, the CD, the tape, or whatever specific item. In a database like Google, this would be the individual website. What would it be for the Voyages then?
Variables: the specific type of data collected from the object. In the case of the library, this would be the author of the object, the medium of the object (BOOK versus CD), the publication date, etc. The idea is similar with Google, although I think it’s less common for someone to access these advanced features. While we almost exclusively search Google by keywords, each website has variables. When we search by keywords, we are almost always searching one variable, the content. [That is not exactly true, but bear with me.] We can apply other variables to Google searches to target our search better. For instance, Google also categorizes websites according to the type of file. Most users would know this from the difference between Web and Images in the navigation bar underneath the text search bar. But in formal language, one could use the filetype search operator to limit searches to a particular type of file, say pdf versus jpg.
We will continue to explore this concept further in the Voyages Database. But for the time being, consider what kinds of variables one would want to collect about shipping and slaves.
Data: the information collected from an object. I am using this term in this restrictive sense. In the library example, all the variables together give us the data of an object. And each time we add an object to the database, we add to the amount of data available to analyze.
Database: the collected set of all the data.
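To make these four terms concrete, here is a minimal sketch of the library example in Python. The class name, field names, and catalog entries are all invented for illustration; a real library system would be organized differently.

```python
from dataclasses import dataclass


@dataclass
class CatalogItem:
    """One object in the library. Each field is a variable."""
    title: str
    creator: str  # author, or composer in the case of music
    medium: str   # e.g. "BOOK" or "CD"
    year: int     # date of publication


# The database: the collected set of all the data.
# These entries are hypothetical placeholders, not real catalog records.
catalog = [
    CatalogItem("Example Title One", "Author A", "BOOK", 1999),
    CatalogItem("Example Title Two", "Composer B", "CD", 2005),
    CatalogItem("Example Title Three", "Author A", "BOOK", 2010),
]

# A query restricts the database by the value of one variable.
items_by_author_a = [item for item in catalog if item.creator == "Author A"]

for item in items_by_author_a:
    print(item.title, item.medium, item.year)
```

Each CatalogItem is an object, each field is a variable, the values stored in the fields are the data, and the list as a whole plays the role of the database.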
Navigating the Voyages Database
Now, returning to the Voyages Database, it has two major regions for its interface: the left column navigational and query center, and the central output of the data.
Each entry in the database is a ship voyage, and it is assigned an individual identification number when it is entered. One thing to note here is that this number does not reflect chronological order. Because the database was developed over a long period of time and compiled from several earlier databases, the order in which voyages were added does not match the order in which they sailed. [What is the database identification number for the earliest historical record in this database? Answers at the end.]
The main results section is organized as a table. The current view shows basic information about each entry: its Database Identification number, Vessel Name, Captain’s Name, Year Arrived with Slaves, Principal Region of Slave Purchase, and Principal Region of Slave Landing. Each entry is a link that opens the complete list of information about that voyage, but the data is not complete for every entry. Some have most of the variables filled, while others have only a couple of fields filled.
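The difference between an identification number and chronological order can be sketched in a few lines of Python. The records and field names below are invented for illustration and are not taken from the Voyages Database, so this does not give away the answer to the question above.

```python
# Hypothetical voyage records: the IDs and years are made up.
voyages = [
    {"voyage_id": 101, "year": 1750},
    {"voyage_id": 102, "year": 1620},
    {"voyage_id": 950, "year": 1530},
]

# The record with the smallest identification number...
lowest_id = min(voyages, key=lambda v: v["voyage_id"])

# ...is not necessarily the record with the earliest year.
earliest = min(voyages, key=lambda v: v["year"])

print("Lowest ID:", lowest_id["voyage_id"], "sailed in", lowest_id["year"])
print("Earliest voyage:", earliest["voyage_id"], "sailed in", earliest["year"])
```

Sorting by the year variable instead of by the identification number is the idea to carry into the web interface.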
The left column is the search engine to input queries and is the primary tool to access information in the database.
The first variable is time, and the user can restrict the data to any set of years ranging from 1514 to 1866. For instance, if one wanted to know all the ship voyages in a particular year, they could put the year in both columns to reveal how many voyages were made in that year. If one wanted to know the number of ship voyages for a particular period of time, such as the number of voyages made before the creation of the United States, one would set the range of years from 1514 to 1789. To return to the original settings, simply click the “restore it” link underneath the time range. Try setting some of your own restrictions to see how it works.
Restricting time is a great way to limit the number of voyages, but it is rarely enough. Scan the rest of the Basic and General Variables to get an idea of the types of information available to the user. The variables are categorized into groups and offer many ways to access the data. For instance, while the most common criterion for Voyage Date is the year that the ship arrived with the slaves, what if you wanted to know the date that the crew left the port and the date of return? Under the general variables, there are nine different variables to help answer that question. Of course, the data is not equally accurate for every ship, but in many cases this information is accessible.
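The time restriction behaves like a simple range filter. Here is a minimal sketch in Python, assuming each record carries a year field; the function name, the field names, and the sample records are hypothetical and only mimic what the web form does.

```python
def voyages_between(records, first_year, last_year):
    """Return the records whose arrival year falls within the inclusive range."""
    return [r for r in records if first_year <= r["year"] <= last_year]


# Invented sample records, for illustration only.
sample = [
    {"voyage_id": 1, "year": 1514},
    {"voyage_id": 2, "year": 1700},
    {"voyage_id": 3, "year": 1789},
    {"voyage_id": 4, "year": 1790},
    {"voyage_id": 5, "year": 1866},
]

# A single year: use the same value for both ends of the range.
in_1700 = voyages_between(sample, 1700, 1700)

# A period: everything from 1514 through 1789.
up_to_1789 = voyages_between(sample, 1514, 1789)

print(len(in_1700), len(up_to_1789))  # prints: 1 3
```

Note that both ends of the range are inclusive, which is why entering the same year twice returns the voyages for that single year.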
Time to Query
Now that we have a good idea of what kind of data is in this database and how the database is structured, it’s time to make some queries!
Two notes as you follow through this: 1) None of these queries is particularly challenging, but since some are adapted from the instructional materials at the Voyages Database, check out those solutions if you need help. 2) Make sure to start each query fresh rather than building on the previous one. After you select a variable, a new box will appear. You can delete the box by selecting the X in its upper right corner, or if you wish to use that same variable, you can select new query to input a new search for that variable.
I already gave a challenge query earlier. If you haven’t tried it yet, try it now.
Query 1: What is the database identification number for the earliest historical record in this database?
Query 2: The first recorded voyage in this dataset is from 1514. What year was the fourth voyage? What do we know about this voyage?
Query 3: Find the 1858 voyage of the Wanderer. Open the voyage record to read more. What was an important outcome of this voyage? At the top of the screen, select the map to see the specific journey of this voyage. Describe the voyage from start to finish. Where were the slaves purchased and where were they disembarked? How did the journey end?
Query 4: Find the 1860 voyage of the Clotilda. Open the voyage record to read more. What was an important outcome of this voyage? At the top of the screen, select the map to see the specific journey of this voyage. Describe the voyage from start to finish. Where were the slaves purchased and where were they disembarked? How did the journey end?
Combining Sources
When I started this tutorial, I said that history is an important field to encourage learning and to provoke questions about data science. So far, we have only looked at the most static way to engage with this database. But when a historian is doing research, this database is far more useful when it comes to combining sources and investigating.
To create more advanced queries, one should use multiple variables to subset the data. For example, how many American ships purchased slaves from the Bight of Biafra? To search for this, one might first select Nationality from the Basic Variables. Then, under the Nationality box, the user would select American. However, this search would show all American voyages. Since we want to determine the number of American ships that purchased slaves in the Bight of Biafra, we need to narrow this result. To do that, we would then add a second variable; from Voyage Itinerary, select Principal Place of Slave Purchase. A new box will appear. If we already knew where the location sat in the list, we could go through the dialogs and select it. But a faster method is to use the quick search tool. Choose “Select” and a small bar will appear. Type “Biafra”, and there will be two remaining options. Tick the selection boxes, and click “OK”. Now you can search, and the data will be limited to the right source. If you are interested in summary statistics, you could then click the summarize data tab from the table and see the numbers.
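Subsetting with more than one variable can be sketched in Python as a filter applied twice. The helper function, the field names, and the records below are hypothetical stand-ins invented for illustration; they do not reproduce the site's internals or its real counts.

```python
def subset(records, **criteria):
    """Keep only the records that match every variable=value pair given."""
    return [
        r for r in records
        if all(r.get(var) == value for var, value in criteria.items())
    ]


# Invented records, for illustration only.
voyages = [
    {"voyage_id": 1, "flag": "American", "purchase_region": "Bight of Biafra"},
    {"voyage_id": 2, "flag": "American", "purchase_region": "Gold Coast"},
    {"voyage_id": 3, "flag": "British", "purchase_region": "Bight of Biafra"},
    {"voyage_id": 4, "flag": "American", "purchase_region": "Bight of Biafra"},
]

# First variable: nationality. This still shows all American voyages.
american = subset(voyages, flag="American")

# Second variable: narrow the previous result by place of purchase.
american_biafra = subset(american, purchase_region="Bight of Biafra")

print(len(american), len(american_biafra))  # prints: 3 2
```

Each added variable can only shrink the result or leave it unchanged, which is why stacking variables is the basic way to move from a broad question to a specific one.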
Try using multiple variables through the following examples.
This extract comes from The Gentleman’s Magazine, October 1773, 523:
Query 5: Identify the captain from this article, and the event described here. Then, after determining how many other ships he crewed, describe his career. Use the database as a key source to create a narrative about his life.
Query 6: Read the following passage and use the information from the memoir to determine the name of the ship and the year in which he came to America. What other questions could you develop around this to investigate?
A Narrative of the Life and Adventures of Venture, A Native of Africa, 1798
Chapter 1
I WAS born at Dukandarra, in Guinea, about the year 1729. My father’s name was Saungm Furro, Prince of the Tribe of Dukandarra. My father had three wives. Polygamy was not uncommon in that country, especially among the rich, as every man was allowed to keep as many wives as he could maintain. By his first wife he had three children. The eldest of them was myself, named by my father, Broteer. The other two were named Cundazo and Soozaduka. My father had two children by his second wife, and one by his third. I descended from a very large, tall and stout race of beings, much larger than the generality of people in other parts of the globe, being commonly considerable above six feet in height, and every way well proportioned.
…
The invaders then pinioned the prisoners of all ages and sexes indiscriminately, took their flocks and all their effects, and moved on their way towards the sea. On the march the prisoners were treated with clemency, on account of their being submissive and humble. Having come to the next tribe, the enemy laid siege and immediately took men, women, children, flocks, and all their valuable effects. They then went on to the next district which was contiguous to the sea, called in Africa, Anamaboo. The enemies provisions were then almost spent, as well as their strength. The inhabitants knowing what conduct they had pursued, and what were their present intentions, improved the favorable opportunity, attacked them, and took enemy, prisoners; flocks and all their effects. I was then taken a second time. All of us were then put into the castle, and kept for market. On a certain time I and other prisoners were put on board a canoe, under our master, and rowed away to a vessel belonging to Rhode-Island, commanded by Capt. Collingwood, and the mate Thomas Mumford. While we were going to the vessel, our master told us all to appear to the best possible advantage for sale. I was bought on board by one Robertson Mumford, steward of said vessel, for four gallons of rum, and a piece of calico, and called VENTURE, on account of his having purchased me with his own private venture. Thus I came by my name. All the slaves that were bought for that vessel’s cargo, were two hundred and sixty.
Conclusion
There are additional ways to extend this project, such as investigating other slave narratives and comparing the story with the database. Another way to complicate this story would be to add layers of ethics on top of this database. For instance, many historians find it useful to study the Zong ship and its subsequent life. The database serves as an important tool to allow one to see the social relations between the owner and the captain. Combining the database with the legal documents, students could develop a deeper understanding about the limitations of data and the need for ethical responsibility in considering what data is used for.
In terms of data science, this short article introduced the general structure of a database, data entries, and data variables. After considering the structure of the data, this path also showed how to construct basic queries and demonstrated how to then build advanced queries by building subsets of queries. These skills showed how to access a database and how to extract information from the database.
Answers to the specific queries
Query 1: 49518
Query 2: 1525; This particular voyage is a rich source of data. There are some obvious facts about it, and it is worth checking out in more detail. What do you think is going on with the data on the slave cargo?