As I’ve written about on a number of other occasions (here and here), I love using a digital camera for archival research. I’ve been photographing in the archive since 2002, and I evangelize for the practice with graduate students, undergrads writing honors theses, and any of my colleagues who will listen. There are various approaches to this, some of which you can find described, for example, here and here.
My approach to the camera in the archive is to go big or go home! I can, and do, take thousands of photos in a week. I was in Quito this past February for a week’s worth of collecting at the ANE, and brought another 5,000 or so images back with me. It was a raid. I have criteria that I test against in the archive, and any folder that meets the tests gets photographed, even if I may not ultimately use it.

In between taking the pictures and using the pictures, what do I do with them? First things first: I move them off the camera and onto multiple drives when I get to the hotel each evening (to my laptop and to an external drive, which I never carry together in the same bag!). Then I reproduce the archive’s own box/folder structure on my hard drive and move the photos into their appropriate folders. I keep a log of documents and the corresponding image numbers in a Moleskine notebook, and divide things up based on that key.

Still, that leaves me with many, many folders full of randomly named images. That calls for batch processing. As I am planning on putting many of these photos up with their transcriptions on my project site, I need the file names to make sense and include a little bit of metadata. So I rename the photos like this: YYYYMMDD_Box.Folder_Series_picNumber. And to do this, I decided to put together a Python script as an exercise.
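To make the naming pattern concrete, here is a minimal sketch of how one name could be assembled. The function name build_photo_name, its parameters, and the sample date, box, folder, and series values are all hypothetical and chosen only for illustration; they are not taken from the script in this post.

```python
import datetime


def build_photo_name(date, box, folder, series, pic_number, ext=".jpg"):
    """Return a name of the form YYYYMMDD_Box.Folder_Series_NNN.jpg.

    All arguments are supplied by the caller; nothing is read from disk.
    """
    stamp = date.strftime("%Y%m%d")  # e.g. 20110214
    # Zero-pad the picture number to three digits so names sort correctly.
    return "{0}_{1}.{2}_{3}_{4:03d}{5}".format(
        stamp, box, folder, series, pic_number, ext
    )


# Hypothetical values: box 12, folder 3, a series simply called "Series".
example = build_photo_name(datetime.date(2011, 2, 14), 12, 3, "Series", 7)
print(example)
```

Zero-padding the picture number matters because most file browsers sort names as text, so 010 would otherwise land before 002.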
The script has one dependency outside of the standard modules, easygui:
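#!/usr/bin/env python
# encoding: utf-8
"""
photoRename.py
Created by Chad Black.
"""
import os
import easygui

# Ask for the directory of photos and work from inside it.
photo_dir = easygui.diropenbox(msg="Choose a directory")
os.chdir(photo_dir)

# Ask for the base name; the default is the last four characters of the path.
new_file_name = easygui.enterbox(msg="New File Name: \n"+str(photo_dir)+'\n'+
    '(YYYYMMDD_Box.Folder_Series_)', title='New File', default=photo_dir[-4:])

# os.listdir() returns names in arbitrary order, so sort them first
# to keep the new numbers in the same sequence as the camera's names.
files = sorted(os.listdir('.'))
index = 1
for filename in files:
    os.rename(filename, new_file_name+str(index).zfill(3)+'.jpg')
    index += 1

# List the renamed files as confirmation.
newFiles = sorted(os.listdir(photo_dir))
print("Finished! \nHere's a list of the new files: \n"+'\n'.join(newFiles))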
What this file does first is open a dialogue box to pick the directory where your photos are:
Then, a second dialogue box allows you to enter the new base name that all the files will have:
Then, it walks through all the files in the directory and renames them with the base name plus a three-digit number starting with 001. Finally, it lists all the renamed files as confirmation that it worked:
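For anyone who would rather skip the dialogue boxes, the same renaming step can be sketched as a plain function. This is only an illustration under a few assumptions of my own, not the script itself: the name rename_photos and the dry_run switch are hypothetical, only .jpg/.jpeg files are touched, and the names are sorted before numbering.

```python
import os


def rename_photos(photo_dir, base_name, dry_run=True):
    """Plan (and optionally perform) the renaming of photos in photo_dir.

    Returns a list of (old_name, new_name) pairs. With dry_run=True nothing
    on disk is changed, so the plan can be checked first.
    """
    # Only pick up JPEG files, in sorted order, and ignore everything else.
    photos = sorted(
        name for name in os.listdir(photo_dir)
        if name.lower().endswith((".jpg", ".jpeg"))
    )
    plan = []
    for index, old_name in enumerate(photos, start=1):
        new_name = "{0}{1:03d}.jpg".format(base_name, index)
        plan.append((old_name, new_name))

    if not dry_run:
        for old_name, new_name in plan:
            old_path = os.path.join(photo_dir, old_name)
            new_path = os.path.join(photo_dir, new_name)
            # Refuse to overwrite a different file that already has the name.
            if old_name != new_name and os.path.exists(new_path):
                raise ValueError("Target already exists: " + new_name)
            os.rename(old_path, new_path)
    return plan
```

Calling rename_photos(folder, base) with the default dry_run=True returns the planned pairs without touching anything; passing dry_run=False carries the plan out. Since renaming is hard to undo, checking the plan first seems worth the extra call.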
The script can be modified for any […]

To make the file executable from a terminal on a Mac, first change to the directory with the file and change permissions:
bash$ chmod +x photoRename.py
Then, edit your .bash_profile file, adding this:
export PATH="$PATH:/path/to/the/directory/"
From now on (once you have opened a new terminal window so the change is picked up), you can invoke the script from the command line by simply entering its name.
Speed is of the essence when working in archives – travelling to the archive and staying in the vicinity is costly both in terms of time and money. I also have a list of criteria by which I assess items in the archive. Very little reading happens in the archive – that occurs when I am reviewing the items I have collected at home.
Photographed handwritten items are much easier to read than the original because they can be enlarged on a computer. If I am doing a lot of train travel I find this time ideal to transcribe material.
I am not as sophisticated as you with naming and organising files. While I am in the archives I download photos in batches as I go to check that the photos are readable. Then I assign folders to them using the system used by the archive as you do. However, I rename them manually. You have given me food for thought.
Perkinsy, thanks for the comment. When I started digitally photographing my manuscripts in 2002, I downloaded in batches at the archive too. I did so for two reasons: (1) to check the quality of the pictures, and (2) because my SD card was full. Now I don’t do it, because I have a really big SD card and because I’ve so regularized my system for taking the photos that I know they’ll be clear. The best way to do this is to use a tripod or monopod (clamped to the table), and to use repeatable procedures to photograph each page or pair of pages. In part that’s easy for me because the size of paper used in the 18th century was pretty standard.
On batch renaming: for Windows, I think there are a number of applications out there for it. You might check bulkrename or batchfilerenamer. No need to reinvent the wheel, right? I’m just trying to pick up a bit of programming ability.
Thanks for the suggestion – I will check it out.