Thursday, 27 February 2014

Making it Free, Making it Open - Transcribe Bentham, publications, and unexpected benefits

A few years ago I made a commitment to Open Access - in an attempt to reach a wider audience for my academic work, and to tell people about research as it was happening (not three or four years later once it was locked behind a paywalled journal). I'm really pleased to have something new to talk about once again, and this time I can share it with you before it even comes out in print. Allied to this are a few spin-offs from the project in question - Transcribe Bentham, which aims to make the work of the philosopher and reformer, Jeremy Bentham (1748 – 1832) available via a
 double award-winning collaborative transcription initiative, which is digitising and making available digital images of Bentham’s unpublished manuscripts through a platform known as the ‘Transcription Desk‘. There, you can access the material and—just as importantly—transcribe the material, to help the work of UCL’s Bentham Project, and further improve access to, and searchability of, this enormously important collection of historical and philosophical material. [Link]
First, the article: a pre-publication version which will be published in April in a special issue of the International Journal of Humanities and Arts Computing, from Edinburgh University Press. In it, Tim Causer and I talk about crowdsourcing transcriptions of Bentham's writings, the impact of Transcribe Bentham on the work of the Bentham Project, and the use of volunteers to help us with tasks traditionally associated with lone academic researchers. We give particular examples of new Bentham material transcribed by volunteers dealing with the subjects of political economy, animal welfare, and convict transportation and the history of early New South Wales, which has further clarified and widened our understanding of certain aspects of Bentham’s thought. You can go and get it here:
 Causer, T. and Terras, M. M. (2014) "Crowdsourcing Bentham: beyond the traditional boundaries of academic history". International Journal of Humanities and Arts Computing, 8 (1) (In press). Link to PDF version in UCL Repository.
I'm pleased it is up there quickly, and openly, and free for all to see. It's one of the aims of the Transcribe Bentham project, of which I am only a small cog, to make Bentham's writings better known, more accessible, and more searchable, over the long term. Allied to that is the ethos of involving a wider group of society in contributing to the project - this is about "co-creation" (as it gets called in Gallery, Library, Archive, and Museum (GLAM) circles) rather than academic broadcast. It would make no sense for us to take the product of something developed in online crowdsourcing, and lock it back in the academic ivory tower, given we asked for help to understand and find the material in the first place. We're finding our way with how to credit transcribers along the way (some of them are named in the article above, and we did ask their permission to do so) and to carry out crowdsourcing in as ethical a way as possible (something which is also of concern to others figuring out crowdsourcing in GLAM as we go). All in all, open access here is part of the Transcribe Bentham product: make it free, make it open.

And future doors line up ahead of us to walk through. This week we hit over 7000 manuscripts transcribed via the Transcription Desk, and a few months ago we passed the 3 million words of transcribed material mark. So we now have a body of digital material with which to work, and make available, and to a certain extent play with. We're pursuing various research aims here - from a Digital Humanities side, a Bentham studies side, a Library side, and a Publishing side. We're working on making canonical versions of all images and transcribed texts available online. Students in UCL Centre for Publishing are (quite literally) cooking up plans from what has been found in the previously untranscribed Bentham material, unearthed via Transcribe Bentham. What else can we do with this material?

And other doors open. I've talked before about reuse of the code behind Transcribe Bentham - in use by the Public Record Office of Victoria, and parts of it (the Transcription Desk bar, since you ask) have since been used in the Letters of 1916 transcription project, too. We're also in talks with other collections who are thinking of doing crowdsourcing, and who may use the Transcription Desk: watch this space. Again, this is part of the same trajectory: make it free, make it available.

And other doors open. The development of systems to read handwritten material (more advanced than Optical Character Recognition, which to date really only has success on printed, clean material) depends on having datasets of images of handwritten texts, plus checked, validated transcripts of their content in a useful format, to train and test systems and algorithms. Transcribe Bentham is pleased to be part of the Transcriptorium project (as am I!), looking into Handwritten Text Recognition (HTR) technologies, and a set of 433 pages of Bentham's manuscripts plus the crowdsourced transcriptions are this year making up the "ICFHR 2014 Handwritten Text Recognition on the tranScriptorium Dataset" - to evaluate and test the current algorithms on Handwritten Text Recognition. How great is that? Did any of us sitting round the table first discussing crowdsourcing and Bentham back in 2009 ever expect we (and our transcribers) would be creating a benchmarked dataset with which to train handwriting recognition technologies? No. It is wonderful.

Create. Involve. Research. Make it available. Some of this by planning, some of this by happy accident. I now see the Open Access ethos underpinning all of this, and driving forward the direction of my research into the use of computing in culture, heritage, and the humanities. So, enjoy the article. We had access to, and did, and found out, some cool stuff, you know - and we made it freely available.



Wednesday, 5 February 2014

Male, Mad and Muddleheaded: Academics in Children's Picture Books

Academics in children's picture books tend to be elderly, old men, who work in science, called Professor SomethingDumb. Why does this matter?


Like many academics, I love books. Like many book-loving parents, I'm keen to share that love with my young children. Two years ago, I chanced upon two different professors in children's books, in quick succession. Wouldn't it be a fun project, I thought, to see how academics, and universities, appear in children's illustrated books? This would function both as an excuse to buy more books (we do live in a golden age of second hand books, cheaply delivered to your front door) and to explain to my kids - now five and a half, and twins of three - what Mummy Actually Does.

It turns out it's hard to search just for children's books, and picture books, in library catalogues, but I combed through various electronic library resources, as well as Amazon, eBay, LibraryThing, and Abe, to dig up source material. I began to obsessively search the bookshelves of kids' books in friends' houses, and in doctors', dentists' and hospital waiting rooms, whilst also keeping on the lookout on our regular visits to our local library: often academics appear in books without being named in the title, so don't turn up easily via electronic searches. I parked my finds on a dedicated Tumblr which was shared on social media, and friends, family members, and total strangers tweeted, facebooked, and emailed me to suggest additions. People sidled up to me after invited guest lectures to whisper "I have a good professor for you..." Two years on, I've no doubt still not found all of the possible candidates, but new finds in my source material are becoming less frequent. 101 books (or individual books from a series*) and 108 academics, and a few specific mentions of university architecture and systems later, it's time to look at what results from a survey of the representation of academics and academia in children's picture books.

What are academics in children's books like?

The 108 academics found consist of 76 Professors, 21 Academic Doctors, 2 Students, 2 Lecturers, 1 Assistant Professor, 1 Child, 1 Astronomer, 1 Geographer, 1 Medical Doctor who undertakes research, 1 researcher, and 1 lab assistant. In general, the Academic Doctors tend to be crazy mad evil egotists ("It's Dr Frankensteiner - the maddest mad scientist on mercury!"), whilst the Professors tend to be kindly, but baffled, obsessive egg-heads who don't quite function normally.

The academics are mostly (old, white) males. Out of the 108 found, only 9 are female: 90% of the identified academics are male, 8% are female, and 2% have no identifiable gender (there are therefore far fewer women in this cohort than in reality, where it is estimated that one third of senior research posts are occupied by women). They are also nearly all Caucasian: only two of those identified are people of colour, one Professor, and one child who is so smart he is called The Prof. Both are male, and this is scarily close to the recent statistic that only 0.4% of the UK professoriat are black. 43% of those found in this corpus are elderly men, and 33% are middle aged (comprising 27% male and 6% female; there are no elderly female professors, as they are all middle aged or younger). The women are so lacking that the denouement of one whodunnit/ solve the mystery/ choose your own adventure book for slightly older children is that the professor they have been talking about was actually a woman, and you didn't see that coming, did you? Ha!
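For anyone who wants to check the sums, here is a minimal sketch of how the gender percentages above fall out of the raw counts. The totals (108 academics, 9 women) are from the survey; the count of 2 characters with no identifiable gender is an assumption back-calculated from the quoted 2%, not a number recorded above.

```python
# Counts from the survey: 108 academics in total, 9 of them female.
total = 108
female = 9
# ASSUMPTION: only "2%" is quoted for characters with no identifiable
# gender; 2 characters is the count that rounds to that figure.
unknown = 2
male = total - female - unknown


def share(count: int, whole: int) -> int:
    """Return count as a whole-number percentage of whole."""
    return round(100 * count / whole)


shares = {
    "male": share(male, total),
    "female": share(female, total),
    "unknown": share(unknown, total),
}
print(shares)
```

Under that assumption the rounded shares come out at 90, 8 and 2, matching the percentages quoted.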

The earliest published academic in a children's book found was in 1922 (although it's probable that the real craze for featuring baffled old men came after the success of Professor Branestawm, which was a major international bestseller, first published in 1933, and not out of print since). The first woman Professor found is the amazing Professor Puffendorf - billed as "the world's greatest scientist" - published in 1992, 70 years after the first male professor appears in a children's book. 70 years (although it is frustrating that the book really isn't about her, but what her jealous, male lab assistant gets up to in her lab when she goes off to a conference. More Puffendorf next time, please). There is also a more recent phenomenon of using a Professor as a framing device to suggest some gravitas to a book's subject, but the professor themselves does not appear in any way within the text, so it's impossible to say if they are male or female. Male Professors in children's books have appeared much more frequently over the past ten years: women not so much.


What areas do these fictional academics work in? (There is an entirely different genre of children's books covering the lives of real academics - but that's for another obsessive compulsive mini research project). Here we identify the subject areas of the 108 academics:

Most of the identified academics work in science, engineering and technology subjects. 31% work in some area of generic "science", 10% work in biology, a few in maths, paleontology, geography, and zoology, and lone academics in rocket science, veterinary science, astronomy, computing, medical research and oceanography. There is one prof who is a homeopath, and I wasn't sure whether to put them in STEM or Fiction, so I plumped for STEM as they seemed to be trying to see if homeopathy worked (I like to presume all the academics here have proper qualifications, but who knows if fictional characters can buy professorships online these days). Subjects classed as Fictional were serpentology, dragonology, and magic. Arts, Humanities and Social Science subjects identified are archaeology (6% of the total), and linguistics, psychology, arts and theatre. 27% of those with an academic title make no reference to what type of area they are supposed to work in: they are generally just trying to take over the world. Just out of interest, the female academics identify their subject areas as serpentology, maths, paleontology, ecology, and three generic scientists (with two further unknown subjects), so it's not as if the women are doing the "soft" subjects in children's books, when they actually appear.

Not all of these academics featured are humans: 74% are human, 19% are animals, 4% are aliens, 2% are unknown, and 1% are vegetable. There are no discernible trends regarding animals that are chosen to represent wisdom - it's not like they are all owls - with three mice, three dogs, two toads, a kingfisher, a gorilla, a woodpecker, a pig, a crow, an owl, a dumbo octopus, a mole, a bumble bee, a shark, a cockroach, and a wooden bird. If you spot any defining similarities there, let me know.

There are some other fun trends to note. 46% of those humans featured are bald (higher than the average percentage?) - no women are bald. 35% had very big, messy hair, and it seems to be that if you are in academia, you should be a bit disheveled, in general. 45% have white hair - but none of the women have white hair. 13% had ginger hair (higher than the average percentage?). 37% had moustaches, and 16% had beards (higher than the average percentage?) - but no women had facial hair.  What they wore is also interesting:
Labcoats, suits (but not if you are female!) or safari suits (but not if you are female!) are the academic uniform du jour.

The names given to the academics are telling, with the majority being less than complimentary: Professor Dinglebat, Professor P. Brain, Professor Blabbermouth, Professor Bumblebrain, Professor Muddlehead, Professor Hogwash, Professor Bumble, Professor Dumkopf, Professor Nutter, and two different Professor Potts. There is the odd professor with a name that alludes to intelligence: Professor I.Q, Professor Inkling, Professor Wiseman, but those are in the minority.

What types of book are they featured in? 82% of the 101 books are fiction stories, and the theme of the stories tends to be "academic is out of touch with how the world works, with hilarious consequences" in the case of professors, or "is evil and wants to take over the world, but is thwarted by our plucky hero (never heroine)" in the case of doctors. 7% of the books are factual, using a fictional academic to explain how science or experiments work, and 1% are cookbooks. The remainder, 10%, are a curious genre I have called "tall tales" - where the fictional academic character is brought in to bring gravitas and explain something, but the explanations are either fictional or bordering on fiction. It's a curious blend of science and fiction: they are not traditional stories, but work in a way which subverts the traditional children's science books, injecting fiction into the process (not very successfully, in most cases).

What can we draw from this? If you are going to be a fictional human academic in a children's book, you are most likely to be an elderly, old man, with big white hair, who wears a lab coat, has facial hair, works in science, and is called Professor SomethingDumb or Dr CrazyPants, featuring in a story about how you bumble around causing some type of chaos. Close your eyes and think of a Professor. Is this what you see? Or this?  (One wonders how far well-circulated images of Einstein have percolated into the subconscious of writers and publishers to emerge as the obvious representation of an academic in children's illustrated books).


Universities in Children's Picture Books

What about the universities themselves? They don't feature as often as the academics associated with them - children's books seldom focus on an institution that will only affect the reader so far in the future, although some characters plan well in advance. Lectures, when depicted, are obviously very boring and impenetrable. University buildings are like castle schools for grown ups or the site of secret underground lairs or the best holiday park ever. There are a couple of sweet kids' books from the USA that attempt to describe the university campus and rituals of specific actual colleges - Baylor University and Boston College. But in general, the children's books revolve around the characters, rather than the fact they are in a university, per se.

Why is this relevant? 

Obviously, this has been a bit of a fun project. Given the lengths gone to to gather this corpus of children's books, it is unlikely that any individual child would happen across all of the books noted. It's actually interesting to think how few children's picture or illustrated books feature academics or academia (at time of writing, Amazon lists 1.3 million books in its children's section, and 101* different books (or book series) were identified in this project). While no doubt there are other books out there not on the list, this has been a darn good crack at finding as many as possible, not only in the English Language. Professors and academic Doctors in children's books are a useful device on occasion, but really are not terribly frequent in the scheme of things.
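To put "not terribly frequent" into numbers, a quick back-of-the-envelope sketch using the two figures quoted above. The Amazon count is approximate and a snapshot from the time of writing, so treat the result as an order-of-magnitude illustration and nothing more.

```python
# Figures quoted above; the Amazon listing count is approximate.
books_found = 101
amazon_childrens_listings = 1_300_000

# Share of listings that feature an academic, as a percentage.
share_percent = 100 * books_found / amazon_childrens_listings
# The same ratio expressed as "1 in N" listings.
one_in = amazon_childrens_listings / books_found

print(f"{share_percent:.4f}% of listings, or roughly 1 in {one_in:,.0f}")
```

That works out at well under a hundredth of one percent of the listings.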

That said, the difference in gender, and how women and men are represented, and the underrepresentation of those who are anything but white in children's books about academia, is shocking, especially given that almost all scientific fields are still dominated by men, that women are frequently discriminated against, and that although 46% of all PhD graduates in the EU are female, only 1/3 of senior research posts are occupied by women. At a time when researchers are asking if available toys can influence later career choice, can the same be said about books? At a time when it is becoming the parents' job to encourage girls into science and technology - and to educate all children about science and engineering careers - does the lack of anything but white, old men as academics in children's books reinforce the impossibility of anyone other than those making a contribution? At a time when the leaky pipe of academia shows that women are leaving in droves at every level of the academic ladder, should we be worried that there are no female academics in children's books above middle age? Laugh at this analysis if you will, but sociological analysis of other children's books has shown that
there is a hidden language or code inscribed in children’s books, which teaches kids to view inequalities within the division of labor as a “natural” fact of life  – that is, as a reflection of the inherent characteristics of the workers themselves.  Young readers learn (without realizing it, of course) that some... are simply better equipped to hold manual or service jobs, while other[s]... ought to be professionals. Once this code is acquired by pre-school children... it becomes exceedingly difficult to unlearn.  As adults, then, we are already predisposed to accept the hierarchical, caste-based system of labor that characterizes the... workplace. [link]
Another analysis of 6000 children's books published between 1900 and 2000 suggests the gender disparity, and the lack of women characters, sends children a message that "women and girls occupy a less important role in society than men or boys":
The messages conveyed through representation of males and females in books contribute to children's ideas of what it means to be a boy, girl, man, or woman. The disparities we find point to the symbolic annihilation of women and girls, and particularly female animals, in 20th-century children's literature, suggesting to children that these characters are less important than their male counterparts... The disproportionate numbers of males in central roles may encourage children to accept the invisibility of women and girls and to believe they are less important than men and boys, thereby reinforcing the gender system. [link]
As for the diversity issue - in general, children's books have been shown to be stubbornly white, even though "children of all ethnicities and races need role models of all ethnicities and races. That breeds normalcy and acceptance, and it's good for everybody." [link] What we are seeing here in this corpus, then, is a microcosm of what is happening in children's literature in general, although played out alongside an ongoing debate about the involvement of women and minorities in the academy. That doesn't make it ok, mind.

There are wider nuances, though, that don't just involve headcounts of men and women, black and white. Children's perceptions of scientists have been shown to be based on various stereotypes, and the stereotypes of academics presented and promulgated in these books are the product of writers and publishers who, taken together, quite clearly don't think academics are much cop, which will percolate back to those who read the books, or have the books read to them. Academics are routinely shown as individuals obsessed with one topic who are either baffled and harmless and ineffectual, or malicious, vindictive and psychotic, and although these can be affectionate sketches ("bless! look at the clueless/psychopathic genius!") academics routinely come across as out of touch weirdos - and what is that teaching kids about universities? In this age of proving academic "impact", it might be not so bad for us to be able to show we were relevant to society? That there is more to academia than science? Or for the kids' books I show my kids to have more positive and integrated representations of professors and academics? Perhaps this is not the role of kids' books though, and I should just be telling my kids my own tales of academic derring-do.

I mean, who would spend two years gathering a corpus of kids' lit for fun, and then count how many beards the people in the books had? Weirdo. Weirdos, the lot of them.

Top Children's Picture Books Featuring Academia

Out of all of the books found in this project, there are some which have been read and read again by my boys, and some which got tossed aside as soon as they arrived. There is also one I adore, but the boys are not so interested in. If you wanted to read some children's books which feature academics and universities, you could do worse than start with the following:

1. Dr. Dog, by Babette Cole, Red Fox, London, 1994. Dr Dog is a medical Doctor who also does research. It has one fantastic page where Dr Dog goes to a conference in Brazil to give a talk about bone marrow, and that one page has explained where Mummy goes when she goes in the airplane, on many occasions. Very useful. For age 2+

2. Professor Puffendorf’s Secret Potions. Robin Tzannes, Korky Paul, Oxford University Press, 1992. The most read story in our house about a Professor. Prof P goes off to a conference and her lazy lab assistant wants to steal her secret potions for himself... (I would have preferred to see more about her, though). 2+

3. Mahalia Mouse Goes to College, by John Lithgow, illustrated by Igor Oleynikov, Simon and Schuster, 2007, New York. Mahalia is a brave little mouse who wants to go to Harvard and study maths, and succeeds. Uplifting. 3+

4. The Rooftop Rocket Party, by Roland Chambers, Andersen Press, 2002. Doctor Gass is a rocket scientist, who doesn't believe a little boy that the water coolers on top of the New York skyline are capable of going to the moon... Delightful. 3+

5. Professor Astro Cat’s Frontiers of Space, by Dominic Walliman and Ben Newman. Flying Eye Books, 2013. This is a lovely, well illustrated, detailed and well written kids' introduction to astronomy, which is explained by Professor Astro Cat. Nice paper too, bibliophiles. For age 5+.

6. Professor Wormbog in Search for the Zipperump-a-Zoo (Mercer Mayer Classic Collectible: Little Monsters), by Mercer Mayer, Golden Pr. 1976. Professor Wormbog is searching for the only thing he hasn't got in his zoology collection... perhaps it has been right under his nose all along? 2+

7. Mungo and the Spiders from Space. By Timothy Knapman, illustrated by Adam Stower. 2007, Puffin, London. A rollicking space adventure about a little boy who gets an old book about an evil doctor... and steps into the book... 4+

8. Any of the Octonaut books, by Meomi. Now a popular tv programme, the Octonauts started off as a book series. Professor Inkling shows how he can work with others to, ya know, deliver impact in the field etc etc. The Meomi books (Harper Collins) are delightful, with lots of detail that demands rereading - start off with the Octonauts Explore the Great Big Ocean (but steer clear of the tv spin off books published by Simon and Schuster - they aren't a patch on the illustrated books by Meomi). Much, much loved in our house. 2+

9.  The Dr Xargle and Professor Xargle books (he gets promoted at some point, evidently). By Jeanne Willis and Tony Ross, different publishers. Xargle explains various things about human society, or science, to his university class of aliens, with hilarious consequences. 3+

10. Professor Twill’s Travels, written and illustrated by Bob Gumpertz. Ward Lock Limited, London, 1968. A sweet tale of Professor Twill, travelling the world to collect animals. The illustrations in this book are very much of the era - it's just beautiful. A forgotten classic. 1+

And one just for the adults: Jack Dawe and The Professors, Bedtime Stories for Technically Inclined Little Ones, 1964, illustrated by Brian Green. (By "Uncle B", no press listed). An Oxford Professor wrote down and vanity published the tales of academia he told his nieces and nephews. They are absolutely hilarious.

Happy reading. And if you find any more academics or universities that I don't know about in children's picture books... do let me know!


*There were a few characters that appear in series of books, for example Professor Branestawm, Dr Xargle, and Professor Inkling in Octonauts. Only one book from each series was counted: if all the books from series were included, there would be over 140 books in total. Please note, none of the spin offs from the children's film Monsters University were included in this analysis, as we're dealing here with things that started as books, rather than spin offs, and it would take over the corpus, and, hmmm, that deserves an analysis of its very own... uh-oh...

Wednesday, 27 November 2013

I'm not going to edit your £10,000 pay-to-open-access-publish monograph series for you

Over the last three or four months, I’ve been talking with an academic publisher – one of the big names that most people have heard of – who approached me to talk about launching a series in Digital Humanities. Now, Digital Humanities is quite fashionable at the moment, with many presses launching books and series about digital arts, culture, humanities and heritage, but goodness knows there is a need for a series that would publish only academic monographs in the area, rather than text books like this and this. I’ve been enjoying talking through the issues of publication with the press in question, and I asked Bethany Nowviskie to join me as co-editor, hoping to work together and thinking about how we could do something that suits our academic neck of the woods: offering good digital as well as print content, and tackling the open access monograph issue in as brave a way as possible, committing to delivering a high quality print publication that would also be available in open access too.

Last week they emailed me with their new company policy on open access. They are fully committed to offering high quality open access versions of their high quality academic books. But to produce open access versions, authors would be required to pay £10,000 (with applicable taxes added on top) to cover the “number of costs” that are involved to “produce” these titles.

I believe – at a time when rumours are flying that the next Research Excellence Framework will require all submissions to be available in open access, including monographs (although, please see later update at the end of the post about this) – that placing a £10,000 cost-to-publish fee onto monographs is iniquitous and will exclude many, if not most, early career scholars in the humanities from publishing their books in open access, as well as excluding any academic who is not at a very rich institution that has the resources to meet this publisher’s ransom demand. (There's an excellent blog post by Mercedes Bunz which demonstrates this very point). This will have deleterious effects on humanities academic career progression, as the monograph is still seen to be the proof of academic excellence (even if “just print” will no longer “count”). I believe that this stance by publishers to place the costs of publishing open access monographs onto humanities academics (in particular) is perfidious, and the only way we can counteract it is to stop engaging with presses who behave in this manner, refusing to submit manuscripts to them, but also, refusing to peer review manuscripts for them, and refusing to edit manuscripts – or a series of manuscripts - for them.

So I'm not going to edit their £10,000 pay-to-open-access-publish monograph series. And here is my reply to them. I’m not sure about the legalities of talking about this, so I have stripped out any identifying information regarding the individual publisher to safeguard myself. I would very much welcome your comments.


Dear (doesn’t matter which particular publisher, this could be directed to the whole shower of those who are asking for £10,000 for pay to open access publish a humanities monograph).

I understand that you are operating in a world where traditional publishing mechanisms and relationships have been turned on their heads. I understand that you have revenues to make to cover your costs, and profits to report to shareholders. I understand that, given a lot of authors from now on will have to provide open access versions of their research, you see this as an opportunity to further extend your profits. But I cannot understand the maths involved in calculating that it will cost £10,000 to turn a ready-for-print PDF proof into an ebook (seriously, I've been round the block a few times in book production, and that's some hourly rate those folks are charging you). You have looked at the £2,000 per academic journal paper model for open access in the sciences, and simply multiplied it and stuck it onto what you think is a humanities equivalent: a monograph equals about five journal papers, right? It would be more honest for you to say: we are charging £10,000 to offset the open access copy against loss of potential revenue for book sales. I understand that this is a concern for you, of course I do, and it would be better to say this up front.

But even with this concern, I do not see that humanities authors are the people you should be targeting to make a profit.

The £10,000 cost for open access is not a commitment to open access at all. It is a shield behind which you can keep open access away from those who might harm your profit margin. But think of the poor humanities academic who *has* to publish their work in open access. What are they going to do? Turn to their institution? Only the best ranked institutions in the world will be able to cover their costs: are you seriously saying that only those in the top universities worldwide are welcome to publish open access with you? Even within those institutions, only the top ranked individuals with prior grant income would have such a request entertained: here's a secret which you probably haven't figured out: most humanities faculties aren't rolling in money. So should aspiring book writers get the £10k from grant income? But you are applying a model from the sciences that doesn't apply to the humanities: in the UK the average Russell Group humanities academic brings around your cost for open access in grant income a year, and funding councils who have had their own incomes slashed cannot be expected to prop up the publishing industry. Some have suggested that the £10,000 is seen as an "investment in self" where individuals would seriously pony up the £10,000 from their own meagre funds (read: credit cards), in the hope that they would recoup this through promotion, tenure, etc. It's a huge gamble to take, at a time when many - including most early career scholars - are exhausted from carrying the student debt albatross round their necks. As a result, the numbers publishing open access with you will be few and far between. With your "commitment" to open access, you will still be able to publish print editions for those who do not care about securing an open access copy.
There's your open access commitment right there - you are more likely to never, ever have to publish an open access volume, even though you have a "policy", as it is just not achievable for all but the independently wealthy. And academic success for all just moves that step further away again. Hurrah for building the pristine ring-fenced arena that no-one can ever use, unless they bring their own polo horse! *snort*. It's just odious. 

I know that my list of suggestions for pursuing an open access monograph series in Digital Humanities was not usual (just to recap, I asked for: the print book for sale, with full contents available for free in an open access digital version, with a creative commons license to be agreed with each individual author (some of them might allow commercial reuse, such as CC-BY, some of them might be more conservative going for ND). This would be Diamond Open Access - so full peer review process, item available free in digital form, but no "author pays" model, and the resulting book should be published in various ebook formats, with no digital rights management (DRM). The author should retain copyright. Ideas for offsetting costs and potential lost revenue include lowering the level of royalty payments, or increasing the point at which the publication will start to recoup costs, depending on a realistic cost model, which we could help work out.) I'll also point out that I have never once asked for payment in any of this (and just for math's sake: what proportion of that £10,000 per open access book will go to the series editors? Oh that's right, none). So you expect to use my contacts, and to use my time, and for me to help feed into an exclusionary model that keeps your wheels turning, that takes money from institutions, or grants, or individuals, and to do that for you without even listening to anything I have been saying about the need for open access in the humanities, particularly within our community, or what we can do to fix – or at least experiment with - the existing model to be in everyone's favour?

The open access agenda is a huge issue in Digital Humanities. It is at the heart of the discipline: doing things in the open, experimenting, being the voice for the humanities in the digital age, showing people how it is done. Digital Humanities is big business at the moment, as can be witnessed by the explosion of Digital Humanities titles published in the past year alone (which is why you are talking to me, after all). Goodness knows we need more research monographs to come out that give people the space to seriously consider and present their research ideas amongst all these textbooks. But this can only be done by operating within the research modes of the community. We could have committed to doing a trial of, say, 5 or 10 books that would be printed with diamond open access too, and been absolutely open and honest about the costs and the revenues and the potential losses and gains, and really led the way in a discussion about where open access monograph publishing goes, and what works, and what doesn't, and what the realistic costs of producing open access research to a high standard are. We would have been famous, we would have sold books, we would have attracted the best and brightest minds with the most brilliant texts, no matter what their bank balance was. As it is, your £10,000 (plus taxes) seems entirely one-size-only-fits-you, jumping on the bandwagon of a scared publishing industry whose fear is contagious, copying an approach which doesn't work for anyone, but allows you to have a policy that will never actually have to be exercised. I'm sad, as I see this as a missed opportunity for us to work together.

I am at a stage in my career where I do not have to take on anything that I do not want to do, or do not agree with. I am at a stage in my career where I should be sticking up for what I think is right, and also looking out for early career scholars coming up behind me. I am uncomfortable putting my name to a Digital Humanities series that touts a £10,000 pay to publish open access policy as fair or egalitarian. I'm not going to edit your £10,000 pay-to-open-access-publish monograph series. I doubt that any leading figure in our field would, but I wish you well in finding the person to take this book series forward.

I hope your book series in Digital Humanities is a success, I really do. It's been a pleasure scoping out what a book series could have looked like, especially with the challenges that face us in the digital environment. But I am left frustrated that we could have done so much together. Please do get in touch in the future, when this £10,000 open access model doesn't work for you, when you may like to - or have to - be braver.

Update 28/11/2013: Since posting last night, this has gone a little... viral. With the result that, ring-ring! that's HEFCE calling (via a tweet from Ben Johnson, thanks Ben) to point out the current state of affairs on the requirements for open access in the next REF. This is sketched out in paragraphs 46-50 of this policy document. So there won't be a requirement in the next REF for open access, but their view is "that open access publication for monographs and books is likely to be achievable in the long term".

Obviously, if I had been able to find this (rather than the rumours) this would have tempered a couple of sentences in my blog post above, but only a couple, so I'm not going to retool it. The fact remains that open access monographs are on the horizon, and that publishers are attempting to profiteer from this without any adequate costing model as to how to achieve them. I'm not happy about being any part of that, and will not give up my time, advice, and hard work to support a model which excludes many from taking part in making their work available via open access.

Tuesday, 15 October 2013

For Ada Lovelace Day – Father Busa’s Female Punch Card Operatives

15th October 2013 is Ada Lovelace Day – the annual celebration of women in science, technology, engineering and maths, named after Ada Lovelace, the first computer programmer.  Working with Charles Babbage in 1840, Lovelace understood the significance of his Analytical Engine (a machine that can conduct a number of different functions, such as addition, subtraction, multiplication and division) and its implications for computational method. She saw that via the punched card input device the Analytical Engine opened up a whole new opportunity for designing machines that could manipulate symbols rather than just numbers. Lovelace attempted to draw together romanticism and rationality to create a ‘poetical science’ that allowed mathematics and computing to explore the world around us, recognizing the potential for a move away from pure calculation to computation, and possessing a vision that foretold how computing could be used in creative areas such as music and literature.

It seems apposite on Ada Lovelace Day to look at some female punchcard operators from the very first days of available electronic computation, working on one of the first "poetical science" projects in "Humanities Computing". From 1949, an Italian Jesuit priest called Father Roberto Busa (November 13, 1913 – August 9, 2011) pioneered the use of computing for linguistic and literary analysis, teaming up with IBM to produce an index of the works of St Thomas Aquinas. Thomas Aquinas wrote some 9 million words of medieval Latin, and so Busa’s project to index his works via computational methods took over 30 years, being one of the earliest and most ambitious projects in the field which is now called Digital Humanities.

To produce an index, the works of St Thomas Aquinas had to be encoded onto punchcards, and Marco Passarotti, from the CIRCSE Research Centre, Università Cattolica del Sacro Cuore, Milan, Italy (where the Index Thomisticus Treebank project is hosted), explains how this happened:
Once, I was told by father Busa that he was used to choose young women for punching cards on purpose, because they were more careful than men. Further, he chose women who did not know Latin, because the quality of their work was higher than that of those who knew it (the latter felt more secure while typing the texts of Thomas Aquinas and, so, less careful). These women were working on the Index Thomisticus, punching the texts on cards provided by IBM. Busa had created a kind of "school for punching cards" in Gallarate. That work experience gave these women a professionally transferable and documented skill attested to by Father Busa himself.
Update! (23/10/13): We now know the name of the woman top left: Livia Canestraro. She also appears in many of the pictures below.


Livia Canestraro
Livia Canestraro, above and below.
Update! (23/10/13): We now know the name of the woman back left: Rosetta Rossi Bertolli. Livia Canestraro is bottom right, and below.


Update! (23/10/13): We now know the name of the woman second from the left: Gisa Crosta.


These previously unpublished images come from the archive of Father Busa and date from the late 1950s and early 1960s. Taken in Gallarate, Italy, they show the ranks of women involved in encoding and checking the punchcard content of Thomas Aquinas’ works. The women can also be seen demonstrating the technologies to visiting dignitaries, and overseeing the loading of the punchcards into the mainframe.

We don’t know the names of these women: further research and enquiries are ongoing to try to establish their identities, and their role in the project. However, it shouldn’t be that surprising to us that women were so important in Father Busa’s pioneering computing project: in the early 1960s computer programmers were commonly women.   It’s pleasing to show on Ada Lovelace Day how important women were to one of the first projects in my academic field - look at the scale of the operation! - although further research is needed to uncover the role and responsibilities of women in this project: the majority of them seen here are doing data entry, albeit in a skilled and new format. The project certainly could not have happened without their input.

The images shown here are kindly made available under a Creative Commons CC-BY-NC license by permission of  CIRCSE Research Centre, Università Cattolica del Sacro Cuore, Milan, Italy. For further information, or to request permission for reuse, please contact Marco Passarotti, on marco.passarotti AT unicatt.it, or by post: Largo Gemelli 1, 20123 Milan, Italy. This year is the 100th anniversary of the birth of Father Busa, which will be celebrated with a workshop in Sofia on the Annotation of Corpora for Research in the Humanities.



Tuesday, 23 July 2013

Digital Humanities in works of literature?

This post finds me jetlagged and happily worn out after my trip over the pond to the Social, Digital, Scholarly Editing conference in Saskatoon, followed in quick succession by Digital Humanities 2013 in Lincoln, Nebraska. 10 days, 6 flights, 2 countries, 2 conferences, 2 papers, 1 panel session, 2 chaired meetings and 3 posters later, I made my way home yesterday and decided not to work on the plane home (shock! horror!) but to treat myself to a nice novel. I picked up "Her Fearful Symmetry" by Audrey Niffenegger, and happily battered through it whilst airborne - laughing to myself when the following paragraphs emerged...

Martin shook his head... "I used to work at the British Museum, translating ancient and classical languages. But now I work from home".

Julia smiled. "So they bring the Rosetta Stone and all that here to you?"...

"No, no. I don't often need the actual objects. They take photographs and make drawings - I use those. It's all become so much easier now everything is digital. I suppose someday they'll just wave the objects over the computer and it will sing the translation in Gregorian chant.  But in the meantime they still need somebody like me to work it out." Martin paused, then said, rather shyly, "Do you like crossword puzzles?"  (Niffenegger, A.  (2009). Her Fearful Symmetry, p. 129.  Scribner, New York.)

Later on in the book - set in and around Highgate Cemetery in London - the following is also said:

 "Perhaps we ought to make another sign to post at the gate," said James. "All uncertain grave owners please present yourselves during office hours when the staff can attend to your very time-consuming requests".

"We want to help them," said Jessica. "But they must call ahead. These people who pitch up on the cemetery's doorstep wanting us to do a grave search while they wait - it's beyond anything."

"They think the records are digitised," Robert said.

Jessica laughed.  "Ten years from now, perhaps. Evelyn and Paul are typing in the burial records as fast as their fingers can fly, but with one hundred and sixty-nine thousand entries -"

"I know."
It's not the first time I've seen digital humanities/digitisation creep into fiction - I remember some ludicrous database in Dan Brown's Da Vinci Code* - but it did make me think that people are starting to notice the kind of things we've been working on for (in my case) over a decade. It's great to see something that so relates to my doctoral work and published texts pop up in a work of fiction. Heck, the people at US immigration who ask you what you do when you say you are going to a conference might even understand what "Digital Humanities" means next! Maybe not.

Anyone else stumble across mentions of computing, culture, humanities and heritage in fiction? If so, I might feel another Tumblr coming on. Uh-oh...

* I don't have a copy of the Da Vinci Code, but the internet has provided an illegal online version, so I copy the scene here. First one to send me a cease and desist and ask me to take it down wins.

She glanced at her guests. "What is this? Some kind of Harvard scavenger hunt?" Langdon's laugh sounded forced. "Yeah, something like that." Gettum paused, feeling she was not getting the whole story. Nonetheless, she felt intrigued and found herself pondering the verse carefully. "According to this rhyme, a knight did something that incurred displeasure with God, and yet a Pope was kind enough to bury him in London."

Langdon nodded. "Does it ring any bells?"

Gettum moved toward one of the workstations. "Not offhand, but let's see what we can pull up in the database."

Over the past two decades, King's College Research Institute in Systematic Theology had used optical character recognition software in unison with linguistic translation devices to digitize and catalog an enormous collection of texts – encyclopedias of religion, religious biographies, sacred scriptures in dozens of languages, histories, Vatican letters, diaries of clerics, anything at all that qualified as writings on human spirituality. Because the massive collection was now in the form of bits and bytes rather than physical pages, the data was infinitely more accessible.

Settling into one of the workstations, Gettum eyed the slip of paper and began typing. "To begin, we'll run a straight Boolean with a few obvious keywords and see what happens."

"Thank you."

Gettum typed in a few words:

LONDON, KNIGHT, POPE

As she clicked the SEARCH button, she could feel the hum of the massive mainframe downstairs scanning data at a rate of 500 MB/sec. "I'm asking the system to show us any documents whose complete text contains all three of these keywords. We'll get more hits than we want, but it's a good place to start."

The screen was already showing the first of the hits now.

Painting the Pope. The Collected Portraits of Sir Joshua Reynolds. London University Press.



Gettum shook her head. "Obviously not what you're looking for." She scrolled to the next hit.

The London Writings of Alexander Pope by G. Wilson Knight.

Again she shook her head.

As the system churned on, the hits came up more quickly than usual. Dozens of texts appeared, many of them referencing the eighteenth-century British writer Alexander Pope, whose counter religious, mock-epic poetry apparently contained plenty of references to knights and London.

Gettum shot a quick glance to the numeric field at the bottom of the screen. This computer, by calculating the current number of hits and multiplying by the percentage of the database left to search, provided a rough guess of how much information would be found. This particular search looked like it was going to return an obscenely large amount of data.

Estimated number of total hits: 2,692

"We need to refine the parameters further," Gettum said, stopping the search. "Is this all the information you have regarding the tomb? There's nothing else to go on?"

Langdon glanced at Sophie Neveu, looking uncertain.

This is no scavenger hunt, Gettum sensed. She had heard the whisperings of Robert Langdon's experience in Rome last year. This American had been granted access to the most secure library on earth – the Vatican Secret Archives. She wondered what kinds of secrets Langdon might have learned inside and if his current desperate hunt for a mysterious London tomb might relate to information he had gained within the Vatican. Gettum had been a librarian long enough to know the most common reason people came to London to look for knights. The Grail.

Gettum smiled and adjusted her glasses. "You are friends with Leigh Teabing, you are in England, and you are looking for a knight." She folded her hands. "I can only assume you are on a Grail quest."

Langdon and Sophie exchanged startled looks.

Gettum laughed. "My friends, this library is a base camp for Grail seekers. Leigh Teabing among them. I wish I had a shilling for every time I'd run searches for the Rose, Mary Magdalene, Sangreal, Merovingian, Priory of Sion, et cetera, et cetera. Everyone loves a conspiracy." She took off her glasses and eyed them. "I need more information."

In the silence, Gettum sensed her guests' desire for discretion was quickly being outweighed by their eagerness for a fast result.

"Here," Sophie Neveu blurted. "This is everything we know." Borrowing a pen from Langdon, she wrote two more lines on the slip of paper and handed it to Gettum.

You seek the orb that ought be on his tomb. It speaks of Rosy flesh and seeded womb.

Gettum gave an inward smile. The Grail indeed, she thought, noting the references to the Rose and her seeded womb. "I can help you," she said, looking up from the slip of paper. "Might I ask where this verse came from? And why you are seeking an orb?"

"You might ask," Langdon said, with a friendly smile, "but it's a long story and we have very little time."

"Sounds like a polite way of saying “mind your own business.”"

"We would be forever in your debt, Pamela," Langdon said, "if you could find out who this knight is and where he is buried."

"Very well," Gettum said, typing again. "I'll play along. If this is a Grail-related issue, we should cross-reference against Grail keywords. I'll add a proximity parameter and remove the title weighting. That will limit our hits only to those instances of textual keywords that occur near a Grail-related word."

Search for: KNIGHT, LONDON, POPE, TOMB

Within 100 word proximity of: GRAIL, ROSE, SANGREAL, CHALICE

"How long will this take?" Sophie asked.

"A few hundred terabytes with multiple cross-referencing fields?" Gettum's eyes glimmered as she clicked the SEARCH key. "A mere fifteen minutes."

Langdon and Sophie said nothing, but Gettum sensed this sounded like an eternity to them.

"Tea?" Gettum asked, standing and walking toward the pot she had made earlier. "Leigh always loves my tea."

Monday, 27 May 2013

On Changing the Rules of Digital Humanities from the Inside

There has been a lot of talk recently about how my field – Digital Humanities – has to change. We are too insular. We’re excluding those who want to partake in it. The structures that have been built within the discipline preclude the type and means of research which we claim to do. Issues of gender, race, ethnicity, and class raise their heads. There are a few online resources that exist which sum up these feelings: see the “Toward an Open Digital Humanities” Google discussion document and, more recently, the Open Thread on “The Digital Humanities as a Historical “Refuge” from Race/Class/Gender/Sexuality/Disability?” over at Postcolonial Digital Humanities.

I’m not denying that there are issues in Digital Humanities. One need only look at the recently published program for DH2013 and cast an eye over the authorship of the accepted papers to see that this year’s Digital Humanities presenting cohort is around 65% male, 35% female. But what I would say, speaking on a personal level and not representing any authority here, is an obvious point which I don’t often hear voiced. Most people “within” Digital Humanities – that is those within the ADHO committee structures, those helping to run the conferences, those helping to allocate student bursaries and prizes, those helping to review papers and manuscripts, and heck, even the cool kids on twitter – are people who want Digital Humanities to be as open and as great as possible. This whole field has been built on the hard work of many academics who have given up their free time to try and entrench the use of computing in humanistic study into an academic field of enquiry, and it wouldn't exist without them, even if the form it exists in is currently imperfect. I would say, from where I sit on various committees, that people want to keep DH growing, and growing healthily. So if there are things wrong with DH, then do give concrete examples, or propose concrete solutions, so they can be taken forward. They're listening - we're listening.

There are things that have really frustrated me within DH, and it is only recently that I’ve started to actively question and pursue them, to get them to be changed. For example, in 2006 I first noticed that the TEI guidelines encouraged the use of ISO 5218:2004 to assign the sex of persons in a document (with attributes being given as 1 for male, 2 for female, 9 for non-applicable, and 0 for unknown). I find this an outmoded and problematic representation of sex, which in particular formally assigns women to be secondary to men, and so, in one of the core guidelines in Digital Humanities, we allow and indeed encourage sexist structures to be encoded. I was shocked to find this – and have often brought it up when discussing entrenched issues in DH about gender balance. In a recent conversation on twitter about this topic, Stephen Ramsay summed up the issue:



James Cummings responded to our tweets, asking why, if it bothered me (and others) so much, hadn’t anyone submitted a feature request to TEI about it? And you know, it had never occurred to me that there would be an easy route to question this sort of stuff. He pointed me to where to submit a request, which I did here. The discussion which follows is really very interesting – look out for the “you can't possibly be offended!” argument, or the “but we’ve always done it this way!” response. Also look out for very vocal support from Gabriel Bodard, in particular, who helped steer the discussion forward to ensure that at
“the TEI Council meeting in Brown, 2013-04, we agreed to change the datatype of person/@sex, personGrp/@sex and sex/@value from ISO 5218 to data.word, so as to allow the use of locally defined values or alternative published standards to be used in these attributes.” 
Women are secondary in the TEI rules no more! Hurrah! – and all it needed for that to happen was for someone to raise the issue in the correct forum, and explain the issue to those who did not understand it, until they finally did.
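
To make the change concrete, here is a minimal Python sketch of the difference between the two datatypes. It is an illustration only, not anything from the TEI tooling: the function names are hypothetical, the ISO 5218 codes are the four listed above, and data.word is approximated (my assumption) as a single non-empty token containing no whitespace.

```python
import re

# The four ISO 5218 codes as described in the post above.
ISO_5218 = {"0": "unknown", "1": "male", "2": "female", "9": "not applicable"}

# Assumption: data.word is approximated here as one token with no whitespace.
WORD = re.compile(r"\S+")


def valid_iso_5218(value: str) -> bool:
    """Old rule: only the four numeric ISO 5218 codes are accepted."""
    return value in ISO_5218


def valid_word(value: str) -> bool:
    """New rule (approximation): any single whitespace-free token is accepted."""
    return WORD.fullmatch(value) is not None


if __name__ == "__main__":
    for value in ["1", "2", "F", "nonbinary", "not known"]:
        print(repr(value), valid_iso_5218(value), valid_word(value))
```

Under the old rule only the numeric codes pass; under the relaxed rule a project can define its own values, or adopt another published standard, as long as each value is a single token.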

I’m Program Chair for DH2014 and issues of diversity and equality are currently on my mind as we discuss and choose plenary speakers for the Lausanne conference. It was recently pointed out to me, though, that the ADHO conference protocols don’t allow issues of diversity to be taken into consideration when choosing plenary speakers, originally saying
“Keynote speakers are decided by the International Program Committee in consultation with the Local Organiser, and should ideally represent a range of disciplines, interests, and geography.”
This isn’t good enough, as it means that you can't say “We’ve got a man to be one of the speakers, how about having a woman for the other one?” without being at risk of being accused of breaching protocol. I’ve recently chased an amendment round the ADHO committee structures, which means the ADHO conference protocols, since last week, state:
"Keynote speakers are decided by the International Program Committee in consultation with the Local Organiser, and should ideally represent a range of complementary disciplines, interests, and geography, with consideration given to issues of gender equality, and economic, ethnic, cultural, and linguistic diversity." 
Perhaps a small deal, focussing on the choice of a couple of speakers a year at our international conference, but pointing to the fact that the ADHO constitution needs to be looked over, to see where we can enshrine issues of gender equality, and other issues of diversity, within our communities. We need to make the rules that people have to abide by. We can make the rules, and we can change the rules. What rules would it help to have?

Of course, changing rules and guidelines won't make everything change overnight, and I wouldn't like to naively claim they will solve everything, but they are a start. I guess what I’m saying here is that, in general, folks “within” Digital Humanities are doing their best, and open to discussion and improvement, and are not willfully obstructive to those of a different gender, race, or economic class, etc. Criticism is helpful, and if there are things that need changing, or unconscious biases that need rectifying, then point them out, tell us. Tell us where concrete things are that we can act upon. We all want Digital Humanities to be the best it possibly can be, and I, for one, don’t mind changing the rules from the inside, in the time that I remain there.

28/05/13 Addendum to the original post: for an ADHO-led initiative on diversity see GO::DH. I'd also like to encourage anyone who is interested in discussing change to consider standing for election to one of the ADHO organisations - we always need volunteers who want to roll up their sleeves!


Sunday, 26 May 2013

On Throwing Your Klout Around

I am @melissaterras. I have just shy of 4500 followers on twitter, a blog which garnered 100,000 readers last year, and a current Klout score of 64. I tend to take this kind of thing with a grain of salt: I hang out on social media because I enjoy it and it has also proved useful and beneficial to my career. I’m aware I’m not Justin Bieber and that my stats – while above average - are not particularly big shakes.  But over the past few weeks a few things have happened which have made me think about digital identity, responsibility, and where academic use of social media crosses into the “real life” arena.

Case 1. I travel a lot with work, usually using Opodo to book tickets. A few weeks ago I found myself locked out of “My Opodo” and couldn’t access it to check itineraries, tickets, or print boarding passes, etc. I tried getting in touch with customer services, spending hours on the phone, emailing, tweeting and asking for help. Nothing. With an upcoming trip, and growing frustration (spending an hour on hold to Opodo is never in the plan of my day) I posted a few disgruntled tweets about their shocking customer service, which, retweeted by some followers, had the potential to reach over 10,000 users within a matter of minutes. My mobile rang. Opodo – a firm renowned for not answering customer complaints in a timely fashion – had phoned me to help resolve the problem.

I’ve seen it reported that Klout scores and twitter follower counts are now being paid attention to by customer services, but while I can provide various concrete examples of why having a digital profile has helped my academic career, this is the first time I can point to something which has actually helped resolve an issue I have had with a commercial entity. I’m simultaneously aghast that it would take an above average twitter following to help you get on a departing flight, and relieved that it helped me to get an increasingly pressing travel issue sorted out. What about those not-so-valued customers that didn’t manage to get the issue resolved in time?

Case 2 is where I now am aware that writing something online could cost a local business tens of thousands of pounds in business. I’m not happy with the project management company who looked after a build at our home, as the ceiling is now leaking, and they are ignoring any enquiries we are making to help have this sorted.  It would be easy for me to name them here, linking to their website, and within a couple of days if you googled for them my blog post would appear above their own website in the rankings, due to the fact that my blog is tapped into more existing networks than theirs.

It would seem that, at the moment, the easiest tool at my disposal is my digital identity. Indeed, it is probably the only leverage I have to stop the growing discolouration of our new dining room ceiling. But that makes me uneasy, as I know how difficult it would be for them to claw back a negative customer comment once it has been broadcast online, and we are happy in general with our build and are sure this is a minor issue to resolve. Should I be throwing my klout around, if it will negatively affect others in the long term?

I’m left thinking of the increasingly intertwined nature of customer service, digital presence, and moral responsibility. Whilst I was playing at this, this stuff got real.