In order to better integrate my blog with my website, better manage comment spam, and reduce my dependence on Google, this blog has moved. In order to avoid broken links I won't be deleting content from here, but no new content will be added, so please update your bookmarks and feeds.

Thursday 22 September 2011

Thoughts on "Cheat's Guide to Project Management"

Sally Pewhairangi's workshop "Cheat's Guide to Project Management" covered the planning stages of managing a project in a way that made it clear why the planning is so vital.

We started by discussing reasons projects fail -- one of those brainstorming sessions everyone always has plenty of material for and which can get downheartening. But Sally concluded this section by saying that while we can't always make these problems disappear, we can manage them; and looking back at my notes now I can see that the vast majority of the problems we talked about would be much alleviated by the process the rest of the session modelled.

This, much abbreviated and paraphrased, was:
  1. Find out/figure out how the project fits into the institution's goals. A project to merge serials into the main collection will go differently if the aim is to free up space or to aid findability. If push comes to shove, which consideration will win?
  2. Define the heck out of the project. Make sure everyone's on the same page about exactly what is to be achieved, by when, and with what resources. What's included/excluded? Get it in writing and signed off by everyone to prevent confusion, co-option, mission creep, the sudden discovery that you have no budget, etc.
  3. Break the project down into tasks and subtasks so you know everything that has to be done and don't get surprised.
  4. Work out who's doing which subtasks by which dates.
For someone like me who just wants to achieve something, this often seems like a nuisance, and during the session my group was constantly having to rein ourselves back from rushing ahead to the what when we hadn't sorted out the why. But when we did plan it all, it became much easier to come up with a much more innovative and relevant approach to solving the problem.

One of the other fascinating things came during the "silent brainstorm" section - that is, everyone scribbling out all the tasks they could think of in silence. No talking meant no-one dominating or being shy, and no derailing into knocking ideas prematurely. And this really brought out the different strengths of different team members - when we categorised the tasks as a team we could see one person focusing more on communicating with stakeholders, one person on technical aspects of the project. Come to think of it, this could be a good way of deciding who should be responsible for managing what.

In short, a fantastic workshop which has given me a whole new perspective on planning and, more practically, the tools to do it systematically.

Plus, the template we worked through was so useful in breaking things down, guiding us through, and giving a real sense of accomplishment at the end, that I'm now pondering how something similar might work in an infolit class: guiding students through thinking about what information they need and where to find it. I'm thinking something like:

Plan your search
1. What’s your topic?

2. What kind of information do you need?
Well-tested research <-----------------------------------------------------------> Cutting-edge knowledge
Summarised information <------------------------------------------------------------> Detailed information
Layman’s level <---------------------------------------------------------------------------> Research level

3. Who would have written about it? When? Where would they have published?
Kinds of people
Date-range published
Kind of publication

4. What words would they have used to talk about it?
Synonyms - any other words that mean the same

5. What sources would hold the publications from #3? What search features are available?
Database or other source
Available search features

and then some stuff on analysing results, facets, pearl-growing, etc. (I may abbreviate the above to try and fit the whole thing to a single A4 sheet for a one-hour class; or may leave it at two sides for the class I get two hours with.) I won't have a chance to test this out probably until next year so would be happy to hear any ideas in the meantime!

Tuesday 20 September 2011

The death of organised data

I've been hearing rumours that the big IT companies may be giving up on organised data. Which is kind of a big thing for the same reason that it makes perfect sense: there are terabytes upon terabytes of data pouring onto computers and servers all the time, and organising all of that into a useful format takes a heck of a lot of time.

Especially because data organised to suit one need isn't necessarily going to suit most actual needs. If you're a reference librarian (either academic or, I suspect, public) you'll have had the student coming to your desk who can't quite understand why typing their assignment topic into a database doesn't return the single perfect article that explicitly answers all their questions.

So I think there's two ways of organising data:
  • "pre-organising" it - eg a dictionary, which is organised alphabetically, assuming you want to find out about a given word. It has information about which are nouns and what dates they derive from (to a best guess, obviously) but there's no way to search for nouns that were used in the 16th century because the dictionary creator never imagined someone might want to know such a thing.
  • organising it at point of need - eg a database which had all this same information but allowed you to tell it you want only nouns deriving from the 16th century or earlier; or only pronunciations that end in a certain phonetic pattern; or only words that include a certain other word in the definition.
Organising data at point of need solves one problem (it's much more flexible) but it doesn't actually save time on the organising end. In fact, it's likely to take quite a lot more time.
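To make the difference concrete, here's a minimal sketch of the two approaches. Everything in it is an assumption for illustration: the `Entry` fields, the function names, and especially the sample words and dates, which are made up rather than checked against any real dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    headword: str
    part_of_speech: str
    first_attested: int  # year, to a best guess


# Made-up sample data; the dates are illustrative only.
ENTRIES = [
    Entry("atlas", "noun", 1589),
    Entry("blog", "noun", 1999),
    Entry("ponder", "verb", 1350),
    Entry("quill", "noun", 1400),
]

# "Pre-organised": an index built in advance that answers exactly one
# kind of question (look up a given word) and nothing else.
BY_HEADWORD = {entry.headword: entry for entry in ENTRIES}


def lookup(word):
    """Return the entry for a headword, or None if it isn't listed."""
    return BY_HEADWORD.get(word)


# "Organised at point of need": the question is supplied at query time,
# so it can combine any of the fields that were recorded.
def find(entries, predicate):
    """Return the headwords of all entries that satisfy the predicate."""
    return [entry.headword for entry in entries if predicate(entry)]


early_nouns = find(
    ENTRIES,
    lambda e: e.part_of_speech == "noun" and e.first_attested <= 1600,
)
print(early_nouns)
```

Note that the flexible version only works because someone already split every entry into fields, which is exactly the organising effort that doesn't go away.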

So is humanity doomed to be swimming in yottabytes of undifferentiated, unorganised, and thus useless data? I frowned over this for a while, and after some time I remembered the alternative to organising data: parsing it. (This is just what humans do when we skim a text looking for the information we want.) So, for example, a computer could take an existing dictionary as input, look for the pattern of a line which includes "n." (or s.b. or however the dictionary indicates a noun) and a date matching certain criteria, and return to the user all the lines that match what was asked for.
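As a rough sketch of what that might look like, assuming a dictionary whose lines happen to follow the pattern "headword, marker (year) definition" (a hypothetical layout, as are the sample lines and the function name):

```python
import re

# Assumed line layout (hypothetical): "headword, marker (year) definition"
LINE = re.compile(r"^(?P<word>[\w-]+), (?P<marker>[a-z.]+) \((?P<year>\d{4})\)")


def nouns_up_to(lines, latest_year, noun_marker="n."):
    """Return the raw lines that are marked as nouns and dated no later than latest_year."""
    matches = []
    for line in lines:
        found = LINE.match(line)
        if found is None:
            continue  # the line doesn't fit the pattern, so skip it
        if found["marker"] == noun_marker and int(found["year"]) <= latest_year:
            matches.append(line)
    return matches


# Made-up sample lines; the dates are illustrative only.
SAMPLE = [
    "atlas, n. (1589) a bound collection of maps",
    "ponder, v. (1350) to weigh in the mind",
    "blog, n. (1999) a regularly updated website",
    "Not a dictionary line at all",
]

print(nouns_up_to(SAMPLE, 1600))
```

Nothing here reorganises the dictionary; the text stays as it is and the pattern does the work at the moment of asking.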

Parsing is hard, and computers have historically been bad at it. (Bear in mind though that for a long time humans beat computers at chess.) This is not because computers aren't good at pattern-matching; it's because humans are so good at making typos, or rephrasing things in ways that don't fit the criteria. (One dictionary says "noun", one says "n.", one says "s.b.", one uses "n." but it refers to something else entirely...) A computer parsing data has to account for all the myriad ways something might be said, and all the myriad things a given text might mean.
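One small piece of that problem, the same abbreviation meaning different things in different dictionaries, can be sketched as a lookup table per source. The source names and tables below are hypothetical, and a real parser would need far more than this (typos and free-text rephrasing aren't handled at all):

```python
# Hypothetical per-source tables: what each marker means in each dictionary.
MARKERS_BY_SOURCE = {
    "dictionary_a": {"noun": "noun", "verb": "verb"},
    "dictionary_b": {"n.": "noun", "v.": "verb"},
    # Here "n." is already taken by something else entirely.
    "dictionary_c": {"s.b.": "noun", "n.": "neuter", "v.": "verb"},
}


def part_of_speech(source, marker):
    """Return the normalised meaning of a marker, or None if it is unknown for that source."""
    table = MARKERS_BY_SOURCE.get(source, {})
    return table.get(marker.strip().lower())


print(part_of_speech("dictionary_b", "n."))
print(part_of_speech("dictionary_c", "n."))
print(part_of_speech("dictionary_a", "n."))
```

The `None` result matters as much as the matches: a parser that guesses when it doesn't recognise a marker will quietly return wrong answers.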

But if you look around, you'll see parsing is already emerging. One of the things the LibX plugin does is look for the pattern of an ISBN and provide a link to your library's catalogue search. You may have an email program that, when your friend writes "Want to meet at 12:30 tomorrow at the Honeypot Cafe?", gives you a one-click option to put this appointment into your calendar. Machine transcription from videos, recognition of subjects in images, machine translation - none of it's anywhere near perfect, but it's all improving, and all these are important steps in the emergence of parsing as a major player in the field of managing data.
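To give a feel for the ISBN case, here's a minimal sketch of spotting ISBNs in running text. It is not how LibX itself does it (I haven't looked at its code); it just combines a loose pattern with the standard ISBN-10 and ISBN-13 check-digit rules so that phone numbers and the like get rejected. The catalogue address is a placeholder, not a real service.

```python
import re

# Loose candidate pattern: 13 digits, or 9 digits plus a check character
# (digit or X), with optional hyphens or spaces between them.
CANDIDATE = re.compile(r"\b(?:\d[- ]?){12}\d\b|\b(?:\d[- ]?){9}[\dXx]\b")


def is_valid_isbn(raw):
    """Apply the ISBN-10 or ISBN-13 check-digit rule to a candidate string."""
    chars = re.sub(r"[- ]", "", raw).upper()
    if re.fullmatch(r"\d{9}[\dX]", chars):
        # ISBN-10: weights 10 down to 1, X counts as 10, total divisible by 11.
        total = sum((10 - i) * (10 if c == "X" else int(c)) for i, c in enumerate(chars))
        return total % 11 == 0
    if re.fullmatch(r"\d{13}", chars):
        # ISBN-13: weights alternate 1 and 3, total divisible by 10.
        total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(chars))
        return total % 10 == 0
    return False


def find_isbns(text):
    """Return every valid ISBN found in the text, with separators removed."""
    return [
        re.sub(r"[- ]", "", match.group()).upper()
        for match in CANDIDATE.finditer(text)
        if is_valid_isbn(match.group())
    ]


def catalogue_link(isbn, base="https://catalogue.example.org/search?isbn="):
    """Build a search link; the base URL is a hypothetical placeholder."""
    return base + isbn


text = "See ISBN 978-0-306-40615-7 or the older 0-306-40615-2; call 021-555-0100 ext 3."
for isbn in find_isbns(text):
    print(catalogue_link(isbn))
```

The phone number in the sample matches the loose pattern but fails the check digit, which is the kind of "n. means something else entirely" trap that a parser has to account for.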

So yes, if I was a big IT company I might want to get out of the dead-end that is organising data, too - and get into the potentially much more productive field of parsing it.

Friday 16 September 2011

Keeping track of contacts

Last week I put an Access database template up on my website and thought I'd better get around to actually mentioning it. What it does is keep brief notes of all my interactions (face-to-face, phone, email) with the academics, research students, and undergraduates I liaise with. (The templates actually on the website of course only have dummy data.)

My old system was handwritten notes on a copy of the welcome letter each grad student received on enrolling. This was Suboptimal for many reasons that only began with the fact that I could never decide whether to sort by first name, surname, or department... The database lets me:
  • sort by anything I like;
  • see together everything I've talked about with any given person, or everything I've talked about regarding any given course;
  • or my favourite - sorting all my contacts to show at the top the people I talked with the longest ago, so I can be reminded to catch up with them. (I had fantastic help from @dakvid in coding the SQL for this; more generally I also want to acknowledge my colleague Margaret Paterson for her inspiring beta-testing.)
As I said, I'm currently using it for liaison work but I suspect it could be put to other uses too, so if anyone wants to nab a copy, there it is.
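For anyone curious about the "longest ago first" idea, here's a minimal sketch of one way to do it. It is not the SQL in the Access template: it uses SQLite from Python instead of Access, and the table names, column names, and dummy rows are all assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema and dummy data, not the template's actual design.
conn.executescript("""
    CREATE TABLE people (
        id INTEGER PRIMARY KEY,
        name TEXT,
        department TEXT
    );
    CREATE TABLE contacts (
        id INTEGER PRIMARY KEY,
        person_id INTEGER REFERENCES people(id),
        contact_date TEXT,  -- ISO dates (YYYY-MM-DD) sort correctly as text
        medium TEXT,
        notes TEXT
    );
    INSERT INTO people VALUES
        (1, 'Dummy Academic', 'History'),
        (2, 'Dummy Student', 'Physics'),
        (3, 'Dummy Postgrad', 'History');
    INSERT INTO contacts (person_id, contact_date, medium, notes) VALUES
        (1, '2011-03-02', 'email', 'database access'),
        (1, '2011-08-15', 'phone', 'course readings'),
        (2, '2011-05-20', 'face-to-face', 'referencing'),
        (3, '2011-09-01', 'email', 'thesis search');
""")

# Take each person's most recent contact, then list the oldest of those first.
LONGEST_AGO_FIRST = """
    SELECT p.name, MAX(c.contact_date) AS last_contact
    FROM people AS p
    JOIN contacts AS c ON c.person_id = p.id
    GROUP BY p.id, p.name
    ORDER BY last_contact ASC
"""

rows = conn.execute(LONGEST_AGO_FIRST).fetchall()
for name, last_contact in rows:
    print(name, last_contact)
```

The trick is grouping before sorting: sorting the raw contacts by date would put someone's oldest note at the top even if you'd spoken to them yesterday.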

Thursday 8 September 2011

Links of Interest 8/9/2011 - news in e-resources

Michael S Hart, founder of Project Gutenberg, died on the 6th September - read his obituary.

JSTOR is making much of its public domain material openly accessible. (Library Journal also comments.)

The Journal of Librarianship and Scholarly Communication "is a quarterly, peer-reviewed, open-access publication for original articles, reviews and case studies that analyze or describe the strategies, partnerships and impact of library-led digital projects, online publishing and scholarly communication initiatives." It's put out a call for submissions for its inaugural issue in [Northern Hemisphere] Spring 2012.