Using DEVONthink for Historical Research: Approaching Your Data
As I mentioned in my last post on DEVONthink, the ability to capture archival and secondary sources digitally has expanded our capacity to collect bits and pieces of information at a scale and speed that is, in my opinion, transformative for historical research. And it is the combination of speed and scale that is potentially revolutionary. I say potentially because, in and of itself, collecting endless amounts of quantitative and qualitative data is nothing new. Witness Braudel's history of the Mediterranean. Reproducing such an endeavor with the aid of computers could simply mean taking ponderously large collections of note cards, file cabinets, Moleskine notebooks, legal pad transcriptions, rolls of microfilm, and stacks of books, and reducing the storage space from sprawling all over the office to a pocket-sized USB hard drive. But is that enough? Likewise, years and years of archival collecting and transcribing can be reduced to months with an itchy shutter finger. Digital photos, starting at about 3.2 megapixels, are truly superior to photocopies (and better for the documents, as no flash is necessary). Of course, you still have to read and transcribe/take notes on all those photos.
I want more out of my technology. I want to be more efficient, but I also want to push up against the traditional means of collection, analysis, and emplotment that have defined historical practice regardless of one's theoretical emphases, while remaining rooted in a fundamental historicity. Are we there yet? Maybe, but I don't think we're quite there. What I hope for, and want to develop in my own work, is the ability to combine the qualitative and the quantitative in analyses of historicized texts. This isn't a plea for including regressions and statistics on word or concept frequency in some sort of renewed social scientism. Nor is it quite a desire for textual discourse analysis aided by text mining and QDA applications. (Though I find these very interesting in their own way, and potentially fruitful; the Center for History and New Media is pioneering the potential of text mining for historical analysis.) The differences I'm thinking my way through here approximate, in a way, the differences between what the word archive means for Derrida and for Steedman, and the differences between the kinds of texts implicated by those understandings of the archive.
DEVONthink's classification, search, and AI infrastructure is a step in the right direction. For people who work mostly with the ever-growing mass of information available online, the ability to import, auto-classify, and connect disparate pieces of information is very cool, particularly as the internal structure of your database becomes increasingly tight and predictable. (It also helps to be working in one language.) You can keep dumping PDFs, web archives, your own bits of writing, and the like into the database and find connections. This is essentially the basis of Steven Berlin Johnson's earlier piece of DEVONthink evangelism. I've managed to sell one colleague in my department on DEVONthink. His initial response, after reading around online and looking at the program, was that it promised to work really well for bloggers and journalists who wanted the computer to read for them. He's a bit of a surly guy, though, and the application's potential overcame his cynical disposition.
I don't use DEVONthink's classify functions, which are designed to group a new or existing record with other similar records and to suggest places to store a piece in my database. As I've mentioned before, I stick to a fairly well-defined note tree hierarchy that groups data by its place in the archive, as well as its general place/role in the book project. I do, on the other hand, love the search capacities of the program. Some historians may find the classification tools, smart groups, and other means of arranging data very convenient. Using replicants, it is possible to store an individual note file in numerous places at once, which can be especially helpful when building thematic databases. An individual criminal case, for example, can be housed in a folder for all criminal cases, and also replicated in other folders that deal with, say, the particular type of crime (property, violent, sexual, etc.) or with other criminal cases whose defendants or plaintiffs share certain characteristics (gender, social class, honorifics, property, occupation, marital status, etc.).
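For readers who think in code, the idea behind replicants can be sketched in a few lines. DEVONthink's internals are proprietary, so the names and structure below are purely illustrative, not its actual API: the point is simply that a replicant is a shared reference to one record, not a copy, so an annotation made from any group is visible from every group that holds it.

```python
# Illustrative sketch of the replicant idea: one record object is
# referenced from several groups, so editing it anywhere "updates" it
# everywhere -- unlike a duplicate, which forks into a separate copy.

class Record:
    def __init__(self, title, text):
        self.title = title
        self.text = text

# Groups are just named collections holding references to records.
groups = {
    "criminal_cases": [],
    "crimes_property": [],
    "defendants_female": [],
}

# A hypothetical case file (archive reference invented for the example).
case = Record("Criminal vol. 123, exp. 4",
              "Theft of livestock; defendant a widow of modest property...")

# "Replicate" the case into every thematically relevant group.
for name in ("criminal_cases", "crimes_property", "defendants_female"):
    groups[name].append(case)

# An annotation made via one group is visible from all of them.
groups["crimes_property"][0].text += " [note: compare with exp. 9]"
print(groups["defendants_female"][0].text)
```

A duplicate, by contrast, would be `Record(case.title, case.text)`: a new object whose later edits never propagate back.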
What I do use are the incredibly strong search functions and, supplemental to search, the "See Also," "Classify," "Context," and "Spelling" tools. As I approach the writing phase of my work, I will have identified important themes, concepts, categories, etc. that emerged over months on end of transcribing documents and reading the secondary literature. Usually, I'll jot notes to be kept in the Note folder on the tree. For example:
In this case, the Classify and See Also commands can provide clues for taking the work in a particular direction, or for connecting it with other sources:
Alternatively, I can choose terms from within a secondary source that I'm working with and connect them to other bits of data in the database. Here, for example, is the result of selecting pecado nefando from an article by Zeb Tortorici on sodomy in colonial Mexico:
The problem is that I'm working in two different languages, which limits the utility of these functions. The search tools, on the other hand, are much more striking. So let's say I'm working on a piece dealing with sodomy prosecutions. I can search for a variety of terms to find the cases, legal writings, etc. that connect:
The pane on the right lists the many different spellings that show up in my database. I performed this search using fuzzy spelling, so anything approximating the search term is included. I can then double-click on a specific spelling and new results will appear in the search window; double-clicking on any of the entries will open that file. In addition to spelling, the search results can also be approached through the context in which the search term most often appears, that is, the adjacent words in the file:
This process is very helpful. By seeing contextual terms visually, and then being able to link to and re-search them, I can consider the discursive trends associated with particular concepts. Fuzzy spelling searches are vital for working with early modern documents, when spelling conventions weren't yet set, and given that my own transcriptions may contain spelling errors of their own. These searches and their cross-indexing can be repeated for names of individuals, phrases, longer strings of text, and more. What this does, for me, is provide an extra layer of immersion that helps me visualize trends and connections in the documents that I may have missed or not thought to look for. On the one hand, it provides ready access to specific pieces of information, a sort of intensification of the classic notecard-and-subject means of organizing research. But it also moves in the direction of computer-aided analysis, breaking out of the note tree hierarchy to reorganize and redeploy data without having to abandon the comfort of the hierarchy. When DEVONthink 2.0 makes it out of beta, I will most likely maintain a Spanish-language database for my next project alongside a separate English database. Because we will be able to open and search two databases simultaneously, I hope this will make the internal structure of the data even more robust.
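A rough sense of what fuzzy searching and the Context view do can be sketched with standard-library tools. DEVONthink's matching algorithm is its own; here Python's `difflib` stands in for it, the sample sentences are invented for illustration ("another witness spoke of the pecado nefando committed there," etc.), and the 0.75 similarity cutoff is an arbitrary choice. The sketch finds early modern spelling variants of "nefando" and then prints each hit with its adjacent words, keyword-in-context style.

```python
# Sketch of fuzzy matching on early modern spelling variants, with a
# simple keyword-in-context (KWIC) view of the adjacent words.
# difflib is a stand-in for DEVONthink's own (proprietary) matching.
import difflib
import re

# Invented sample transcription with period spelling variants.
transcriptions = (
    "El pecado nefando fue denunciado ante el tribunal. "
    "Otro testigo hablo del peccado nephando cometido alli. "
    "El fiscal llamo nefandos a los acusados."
)

words = re.findall(r"\w+", transcriptions.lower())

# Keep every token sufficiently close to the canonical spelling.
variants = sorted(set(
    w for w in words
    if difflib.get_close_matches("nefando", [w], cutoff=0.75)
))
print(variants)

# KWIC view: two words of context on either side of each hit.
for i, w in enumerate(words):
    if w in variants:
        print(" ".join(words[max(0, i - 2):i + 3]))
```

Here "nephando" and "nefandos" match while "peccado" does not; in practice the cutoff would need tuning against real transcriptions, and the trade-off between false hits and missed variants is exactly why a human reader still has to work through the results.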
So what do I do from here? Well, as I move into the writing phase, I bring together these search and classification capacities with the organizational writing environment of Scrivener, dragging and dropping the research materials into Scrivener, where they can be grouped together and reorganized to support the narrative development of a given chapter or article. I like Scrivener for the grunt work of writing specifically because it allows me to have the research and writing environments side by side without needing multiple monitors. Though I have multiple monitors, I'm not always somewhere they're accessible when I'm writing, so this feature is excellent. My current writing environment looks like this:
The research notes in the left-hand pane can be rearranged to suit the development of the piece, as a working source outline that keeps me on track while writing. By the time words are being put on the screen, I know exactly how I want the narrative to develop, and I have the exact sources necessary to do that on the left, and on display in the bottom pane as I write. These note files include the bibliographic references from Bookends, which I incorporate directly into the flow of writing.
Scrivener also has a nice full screen writing environment that minimizes distractions:
As the draft nears completion, or is ready for final formatting, I will compile the draft sections into a single piece (the program does this for you), export from Scrivener as a .doc or .rtf file, and open it in a word processor, usually Mellel or Word 2004, for that work-up. I'll scan with Bookends (set for the citation style required by the publisher), double-check the citations, make sure the formatting meets the publisher's expectations, and then send it off. For documents that don't require .doc status, I will almost always save them as PDFs.
So, those are the basics of my workflow. There are about a thousand other things I could say about working in DEVONthink and with the writing apps available for the Mac, but I hope these three posts do a good enough job of outlining a workflow that can help the professional historian collect, manage, approach, analyze, export, and use their research materials. If you have any questions or recommendations, please leave a comment; I'm always interested in seeing how others use technology, attuned to the needs of the academic humanities, to help with their research and writing.