a cross-series cluster image using ncd

I have more to say about this later, but I find this image pretty interesting. When doing a box-by-box comparison of 12 different series finding guides from the Archivo Nacional del Ecuador, you get an image like this:

(Click to make it bigger, which you’ll need to do!)

The graph represents a normalized-compression-distance (NCD) comparison of 950 boxes, each described by the contents of its manuscripts. (Which took 22 hours of CPU time to accomplish.) Given the nature of the documents, it is a somewhat artificial comparison. But you'll notice that the clusters are almost always within series. That's what we'd expect from institutional documents. I still need to chart the years from the various series clusters and see if they come from roughly the same decades.
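For anyone who hasn't run into NCD before, here is a minimal sketch of the idea in Python. It is not the script behind the image: the choice of `zlib` as the compressor and the helper names (`compressed_size`, `ncd`, `distance_matrix`, `pair_count`) are assumptions for illustration. Two texts that share a lot of vocabulary compress better together than apart, so their distance comes out closer to 0; unrelated texts come out closer to 1.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data (compressor is an assumption)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C is the compressed size.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def pair_count(n: int) -> int:
    """Number of unordered pairs among n items."""
    return n * (n - 1) // 2


def distance_matrix(texts: list[bytes]) -> list[list[float]]:
    """Symmetric matrix of pairwise NCD values, with 0.0 on the diagonal."""
    n = len(texts)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Compute each unordered pair once and mirror it.
            matrix[i][j] = matrix[j][i] = ncd(texts[i], texts[j])
    return matrix
```

The all-against-all design is why this gets expensive: `pair_count(950)` is 450,775 comparisons, and each one means compressing the two descriptions joined together. The resulting matrix is what a clustering or graph-layout tool would take as input.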


Associate Professor of Early Latin America Department of History University of Tennessee-Knoxville

Posted in Digital History, Latin American History, programming, Research and Writing



Chad Black

I, your humble contributor, am Chad Black. You can also find me on the web here.