Open Calais + site-wide tags = semantic site architecture

Preamble about people

Over the last month, I’ve started to grow an embryonic social web publishing platform that can be many things but fundamentally offers a personalised and collaborative environment for research, teaching and learning. (Where? You’re looking at it!) There are a few active blogs (currently fewer than on the pilot Learning Lab blogs), nearly 70 users and the word is starting to get out at a pace that I can manage. So, now it’s time to look to the future…

By running BuddyPress, the connections between people are pretty much taken care of. Sign in to http://dev.lincoln.ac.uk with a Lincoln username and password and you’ve joined a community that, as it grows, will increasingly and effortlessly connect people through the information they choose to add to their profile. Staff and students can click on a link and find other people who have similarly tagged their profile.

Notice the comma-separated hyper-linked data

What is of equal interest to me, and potentially very useful to the university community, is how we link the content that is being generated by staff and students and make those links accessible. It is not difficult to appreciate what the potential is when you have a revolving community of 10,000 people who, over time, document their work, their research, teaching and learning using cutting edge web publishing tools, but I’m writing this post to try and understand and sketch out how I might evolve what I have begun.

Put simply, WordPress Multi-User (WPMU) allows one person (me) to provide and manage multiple web sites which other people (staff and students) take ownership of. Typically, every action, every new user and every new page and post on every site is recorded and held in a shared database(s). Although at this low level the data is relational, on the surface each site pretty much stands alone, and so it should. We’re not talking about a single website with lots of users; we’re talking about lots of websites with lots of users. They might be working collaboratively with others, but they’re working as individuals or in distinct groups that benefit from a distinct online identity. BuddyPress helps bring things together by aggregating people’s actions (e.g. posting blog updates, making friends, joining groups, posting messages) but the visibility of those connections is transient. Social networks display our actions along a timeline and the connections between people are, for the most part, buried until the next time person A interacts with person B.

Enough about connecting people.

Site-wide content aggregation

Site content is a mixture of text, multimedia and metadata. The last thing I’ll do when completing this blog post is to categorise and tag it. Each time I write, I publish text, (sometimes images) and metadata which summarises and categorises the full text. Why am I telling you this? You know it already. What you may not know is that each post created on our university WPMU installation, by any person, providing their blog is public, is aggregated into a single site and re-published a second time. So this post exists here on this site and there, on the Community Posts site. Notice how the Community Posts version links back to the original post. We’re not creating a whole new resource, we’re creating a powerful linked resource that allows others to search, filter, browse and discover content held across multiple sites. With only a few sites up and running here at the moment, the opportunity to discover varied content is limited, but over time that will change. Look at wordpress.com, where there are 5 million sites:

Browse by user-generated metadata

Search over 5 million sites

On the university blogs, this is made possible through the use of the site-wide-tags plugin, which was developed by @donncha, the same person that develops WPMU and the wordpress.com site. By using this plugin, a WPMU installation can share similar functionality to what you see on wordpress.com. I say ‘similar’ because, as I’ll mention later, designing how people discover content is key to all of this and something I, or we as a community, would benefit from thinking about and acting on collectively.

Community Posts

On the Community Posts site, you can search the full-text of every post, filter resources by category and tag, and subscribe to feeds from any combination of tag or category. Any search can be turned into a feed by appending ‘&feed=rss’ to the end of the resulting URL.

e.g. http://tags.dev.lincoln.ac.uk/?s=gaming&feed=rss

To create a feed from a tag or category, just click on a tag or category and append ‘/feed’ to the end of the URL.

e.g. http://tags.dev.lincoln.ac.uk/tag/games/feed

You can combine tags with ‘+’, too:

http://tags.dev.lincoln.ac.uk/tag/games+development/

You can also specify the type of feed you want by appending:

/feed/rss/
/feed/rss2/
/feed/rdf/
/feed/atom/
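
The URL patterns above are regular enough to generate. Here is a minimal Python sketch, assuming the Community Posts site follows exactly the patterns described in this post; the function names `search_feed` and `tag_feed` are my own, purely for illustration, and nothing here is part of WordPress or the site-wide-tags plugin.

```python
from urllib.parse import quote, urlencode

# The Community Posts site named in this post.
BASE = "http://tags.dev.lincoln.ac.uk"

# The feed types listed above.
FEED_TYPES = {"rss", "rss2", "rdf", "atom"}


def search_feed(term, base=BASE):
    """Turn a full-text search into a feed: ?s=<term>&feed=rss"""
    return f"{base}/?{urlencode({'s': term, 'feed': 'rss'})}"


def tag_feed(tags, feed_type=None, base=BASE):
    """Build a feed URL for one tag, or for several tags joined with '+'."""
    if feed_type is not None and feed_type not in FEED_TYPES:
        raise ValueError(f"unknown feed type: {feed_type}")
    slug = "+".join(quote(tag, safe="") for tag in tags)
    url = f"{base}/tag/{slug}/feed"
    if feed_type:
        url += f"/{feed_type}/"
    return url
```

Calling `search_feed("gaming")` gives the search feed URL shown earlier, and `tag_feed(["games", "development"], "atom")` gives the combined-tag URL with `/feed/atom/` on the end.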

Mixing categories and tags is currently broken by a bug but is due to be fixed in the next version of WordPress.

So it’s not difficult to imagine, over time, an active community of thousands of university web publishers, having their content aggregated into a site-wide resource that allows full text searching, browsing and filtering with a choice of feeds to syndicate that content elsewhere. See how it’s happening at the University of Mary Washington, where over 2400 sites have been created in under three years.

Semantic technology

Yesterday, I discovered OpenCalais. It’s a semantic technology that’s been around since January 2008, so you might be tired of hearing about it, but if not, ‘Welcome to Web 3.0!’

The Calais Web Service automatically creates rich semantic metadata for the content you submit – in well under a second. Using natural language processing, machine learning and other methods, Calais analyzes your document and finds the entities within it. But, Calais goes well beyond classic entity identification and returns the facts and events hidden within your text as well.

Nice. And it’s installed on this site. There are three Calais plugins available for WordPress. This one allows writers to submit their blog posts to the OpenCalais web service API and fetch back a number of auto-generated tags based on the content of their post. The longer the post, the more tags are returned. Tags are returned in just seconds. Those tags can be added to the post in their entirety or used selectively (actually, you have to add them all and then remove those you don’t want to include – a minor irritation). This next plugin allows you to automatically go through every post you’ve written and tag each one using the Calais web service. It’s all or nothing, but following the auto-tagging of archive content, you can then go to the ‘tags’ menu and delete any tags you don’t want to use. I’ve done that to this site and to the Community Posts site. Calais looks for names, facts and events and the API allows for up to 40,000 transactions a day and up to four per second. It returns some predictable tags and a few odd ones, but on the whole is fast and works like magic.

The third plugin also allows blog authors to fetch tags for the post they are writing and, in addition, it also suggests Creative Commons licensed images based on a dynamic evaluation of the chosen or suggested tags.

The tagaroo interface

Image suggestion is a nice idea, but tends to return some fairly generic images.

Having used OpenCalais to auto-tag the Community Posts site, a whole new and richer set of semantic metadata has been added with barely any effort. The challenge now is to figure out how to 1) automate this as a scheduled process, so that the Calais plugin looks for new content every hour, say, and tags whatever has been recently introduced (a cron job that calls the plugin and a modification to the plugin to look at the timestamp of the post and ignore anything older than when it was last run?); 2) present the semantic data in an accessible way and this mostly, I think, comes down to appropriate site design.  The wordpress.com screenshots above show one way of doing it. A del.icio.us style approach is a more powerful and versatile model of tag filtering. Until then, it’s a matter of constructing filters, searches and feeds in the way I’ve outlined above.
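
To make the first challenge a little more concrete, here is a rough Python sketch of what an hourly job might do: pick out posts newer than the last run and tag them one at a time, pausing between calls to stay under the four-per-second limit mentioned above. It is only a sketch. The post structure (`id`, `text`, `published`) is invented for the example, and `fetch_tags` is a hypothetical stand-in for whatever the Calais plugin does internally; it is not the plugin’s or the Calais API’s real interface.

```python
import time
from datetime import datetime


def posts_to_tag(posts, last_run):
    """Return the posts published after the previous run, oldest first."""
    fresh = [post for post in posts if post["published"] > last_run]
    return sorted(fresh, key=lambda post: post["published"])


def tag_new_posts(posts, last_run, fetch_tags, max_per_second=4, sleep=time.sleep):
    """Tag each new post, pausing after every call to respect the rate limit.

    fetch_tags is a hypothetical callable (post text -> list of tags).
    Returns a dict mapping post id to the tags that came back.
    """
    delay = 1.0 / max_per_second
    tagged = {}
    for post in posts_to_tag(posts, last_run):
        tagged[post["id"]] = fetch_tags(post["text"])
        sleep(delay)  # never more than max_per_second calls in any second
    return tagged
```

A cron entry would then call something like `tag_new_posts(...)` once an hour and store the time of that run for next time.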

So how might all of this semantically structured data be used? It seems to me that most of the advantages are proportional to the quantity of information available. For teaching and learning, it could be used by students and staff who want to find and re-use material that has been posted in the past for a specific course or subject area. Great for new students who want to measure the type and quality of work produced by students in previous years. In a similar way, it could be used by staff looking for posts by colleagues on subjects they might be teaching, and because searches and tags can be turned into feeds, past content could be aggregated into a new course site. A widely adopted, semantically tagged WPMU installation could also reveal trends in the type of work occurring at the university and, by tagging names of people, queries against references to Prof. X’s work could be made (I also wonder whether through the use of feeds, content from the institutional repository could be joined up with all of this, too – but it’s late in the day and I can’t think straight).

You’ll see from the image below that using Calais on the Community Posts site resulted in a much richer variety of tags than would have appeared if we relied on user-generated tagging alone (136 posts now have 558 tags). Some people don’t even bother to tag their work… Shame on them! Notice, too, that with the Firefox Operator plugin, you can take a tag on the site and use it to find related resources elsewhere. So if you’re looking at work tagged ‘client-applications’ on WPMU, you can conveniently hop over to delicious and find further web resources or, on a whim, look at what books on this subject are available on Amazon.

Operator provides a way to use tags on one site to discover related resources on another site

Anyway, if you’re still reading, you might remember from the title of this post that my overriding interest in all of this is how it can be understood as and developed into a site-wide ‘architecture’. Again, I’m thinking how user-generated tags have determined the way delicious is designed for navigation and searching of resources. I need to learn more about how WordPress themes are constructed and consider how available functions can be best exploited and usefully presented on this type of site. If you have any ideas or want to work on a specific theme to get the most out of the site-wide-tags plugin, please do leave a comment or get in touch on Twitter @josswinn

OAuth, OpenID, XMPP with WordPress

Automattic, the company behind WordPress, released an update to Prologue, their theme for group discussion, today. I read about this minutes after reading about the new OAuth features in WordPress 2.8 and an hour or so after reading about a new Facebook Connect plugin for BuddyPress, the social networking layer for WordPress. All this stimulation proved a bit too much for me, so this post is an attempt to plot what’s happening here and what might be possible in just a few months from now…

So, I have the BuddyPress Facebook Connect plugin working on my test installation…

BuddyPress Facebook Connect

Nothing fancy going on there. Basically, new users to the site can register using their Facebook credentials. The plugin doesn’t do anything for existing users on the site. They just login with their local account as usual. For a first release, the plugin is a good proof of concept and with a bit more integration work will make it easy for Facebook users to join BuddyPress sites.

The new Prologue theme, P2, is impressive, too…

P2 on wordpress.com

It takes advantage of the new threaded comments feature in WordPress 2.7+, has ‘realtime’ notifications (unless I’ve missed something, the use of the term ‘realtime’ is a stretch – see below) and has some nice keyboard shortcuts…

Keyboard shortcuts

One thing that’s lacking is a Twitter-like realtime notification that a new post has been made and you should refresh your browser. Twitter doesn’t use it for the user home page, but they do on their search page and I like it.

Twitter notifications

Moving on, OAuth functionality for WordPress is still in development but the latest code from the SVN trunks of both the DiSo plugin and WordPress does appear to work…

OAuth options

Be warned that it does not run on a server where PHP runs as a CGI. I tried to run it first on Dreamhost, but it gave an error showing that getallheaders() is an undefined function.
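
For anyone wondering why CGI makes a difference: under CGI the web server hands request headers to the script as environment variables (most of them prefixed `HTTP_`) rather than through a server API, so code that needs the headers has to rebuild them. The Python sketch below only illustrates that idea; it is not a patch for the plugin, and the function name is my own.

```python
def headers_from_cgi_environ(environ):
    """Rebuild request headers from CGI-style environment variables."""
    headers = {}
    for key, value in environ.items():
        if key.startswith("HTTP_"):
            name = key[5:]  # drop the HTTP_ prefix
        elif key in ("CONTENT_TYPE", "CONTENT_LENGTH"):
            name = key  # these two are passed without the prefix
        else:
            continue
        # USER_AGENT -> User-Agent
        pretty = "-".join(part.capitalize() for part in name.split("_"))
        headers[pretty] = value
    return headers
```

In a real CGI script you would pass in `os.environ`; whether a particular header survives the trip depends on how the server is configured.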

I need to spend more time with the OAuth plugin to see how it will actually work in practice. One of the first use-cases for it is to allow client applications, like the iPhone app, to post remotely using XML-RPC without sending a password. If anyone has any ideas and wants to test it with me, please leave a comment. As I understand from the announcement, it’s working but it’s still early days… For more information, see Will Norris’ presentation from last August.

Finally, there’s mnw, a new plugin for WordPress that provides support for the OpenMicroBlogging specification. With this, users from other sites using the specification, such as identi.ca and other Laconica-based services, can subscribe to your blog/omb site and receive updates whenever you publish a new post or page. So this…

WP OMB

…ends up here…

WP posts on identica

mnw is still a bit rough around the edges but it was only released as V0.1 a month ago, so that’s to be expected. Note that mnw only seems to work on single WP installations (WPMU produces a familiar error message which I think is wp_nonce related) and does not work on WP 2.8 trunk. Also, identi.ca complained of my avatar image being the wrong size. In the example above, I’d removed my avatar from the mnw settings, but I’ve since found that a .png of 96px seems to work OK.

What does it mean for me and you?

So, what does all this mean? In terms of wordpress.com, we might speculate that before too long, they will add the BuddyPress layer to their 4.5m blogs to create a sizeable social network. The P2 theme shows posts in realtime, they’re already offering an XMPP firehose of blog posts and there are plugins that offer XMPP functionality for WordPress, so remote real-time updates aren’t far away and realtime remote publishing already exists using XML-RPC. With the P2 theme, anyone can create a Twitter-like site that any number of registered users can post to and anyone can comment on. Add OpenID authentication and OAuth authorisation and you’ve got a large, mature and open social (micro)blogging service.

For self-hosted WordPress users, it’s even closer to being a reality. I’ve had a site running today that accepts new user registrations via the DiSo OpenID plugin and those users can then post updates to the Prologue themed site and join a threaded group discussion. If I enabled XML-RPC posting, users could post in ‘realtime’ to the group site from their iPhone or other client app. With OAuth support, this would be possible from desktop and mobile applications as well as other sites such as Flickr, without exchanging protected user data such as a password. Those updates could also be broadcast via XMPP in realtime, which I’ve done on another blog I was testing.
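
To show what remote posting over XML-RPC looks like on the wire, here is a small Python sketch that builds (but does not send) a `metaWeblog.newPost` request, one of the blogging API methods WordPress accepts at its `xmlrpc.php` endpoint. The blog id, username, password and post content are placeholders. Notice that the password travels inside every request, which is exactly what OAuth is meant to avoid.

```python
import xmlrpc.client


def build_new_post_call(blog_id, username, password, title, body, publish=True):
    """Return the XML body of a metaWeblog.newPost request; nothing is sent."""
    content = {"title": title, "description": body}
    params = (blog_id, username, password, content, publish)
    return xmlrpc.client.dumps(params, methodname="metaWeblog.newPost")


# Placeholder values, for illustration only.
payload = build_new_post_call(1, "someuser", "secret", "Hello", "Posted remotely")
```

A client app would POST `payload` to the site’s `xmlrpc.php`; `xmlrpc.client.ServerProxy` can do the same thing in one step against a real site.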

WordPress Flickr account setup

Things are a bit different for WordPressMU/BuddyPress installations. As you’ve seen above, I’ve got a BuddyPress site running that accepts users joining via Facebook Connect. Functionality is limited to social networking and it still has some issues that need working on before it’s ready for every-day use (I’ve noted them on the BP forum). WPMU blogs (by which I mean blogs, not the overall site) don’t allow new-user registrations so the blog administrator needs to sign up new users. Users registered via Facebook don’t have an email address associated with their account, so blog admins can’t add these types of users as the process requires a username and email address of a new or existing user.

However, by activating the right plugins, registered WPMU users (I’m thinking university staff and students) could participate in a group microblog using the P2 theme, LDAP and/or OpenID for login and XML-RPC and XMPP for remote publishing and receiving posts. It won’t be too long before you can send and receive WordPress posts via your GMail or Jabber account (on your iPhone/iPod) in realtime (hopefully with support for tagging), and all of that data is simply WordPress data and has RSS feeds hanging off every tag and wrapped around every post.

Just a thought.

BuddyPress: A university’s social network

To cut to the chase, this post is about using WordPress MU and BuddyPress with enterprise authentication (LDAP) to create an internal/private social network while leaving the blogs, by default, public.

Since May 2008, I’ve been running WordPress MU on the Learning Lab, a Linux server I maintain at the University of Lincoln, for experimenting, trialling and evaluating software that may enhance and support research, teaching and learning. It’s a great job 😉

Of all the software we’ve looked at over the last few months, ‘WordPress Multi-User’ has clearly shown the most potential for use by staff and students at the university. It’s a mature, well maintained, very popular open source blogging platform. In fact, it’s more than that. It’s a web content management system that runs 5 million blogs on wordpress.com and 280,000 blogs on edublogs.org. While I was evaluating WPMU on the Learning Lab, 65 blogs were registered by 123 users. I didn’t advertise the service at all during this period, preferring to work with individuals on specific projects and get their (informal) feedback. The feedback has been positive. People initially needed support but, once they were set up and running, they tended to contact me only when they wanted to push WordPress to do more for them through plugins and custom themes.

During this period, I’ve been watching and doing my best to help with the progress made on BuddyPress, a set of plugins for WordPress MU, developed by Automattic, the company behind WordPress. It’s been interesting trying to get everything to work together at times but over the last few weeks it’s all come together.

BuddyPress Profile

Automattic also develop bbPress, open source forum software which integrates with BuddyPress, too. Jim Groom at the University of Mary Washington pioneered the integration of all three products and I’ve had it working here at the University of Lincoln quite nicely. However, bbPress is still beta software and I’d like to be able to offer privacy options on forums, too, which is currently unsupported (there are some plugins, but they’re not mature enough for our use yet). So currently, we’re running WordPressMU, BuddyPress, an LDAP plugin for WPMU and a privacy plugin that’s commonly used on WPMU installations. It works really well.

I’ve documented some of the set up on our wiki. It’s not been difficult. For the time-being, while BuddyPress matures, I’ve chosen to stick with the default home and members themes, changing just the logo. Forums are, as mentioned above, turned off for now. I wonder if we’ll ever turn them on as the ‘Wire’ (similar to the Facebook Wall) is available and people are used to using services like Twitter and the Facebook Wall to communicate these days. We’ll see what demand there is for forums.

The final set up is really quite sweet. A member of the university goes to https://dev.lincoln.ac.uk for the first time and logs in with their usual credentials. The first time they log in, they are signed up. That’s it. No sign up page needed. It’s as if they were already a member of the social network, which, being members of the university, they are of course. From there, they see the BuddyPress home pages, can join groups, change their profiles and, when they’re ready, create or join a blog.
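
The ‘signed up on first login’ behaviour boils down to a very small piece of logic. The Python sketch below shows the idea only. The `authenticate` callable is a hypothetical stand-in for the LDAP check and the `members` dict stands in for the local user table; the real LDAP plugin for WPMU will differ in the details.

```python
def log_in(username, password, authenticate, members):
    """Check credentials against the directory; create the member on first login.

    authenticate: hypothetical callable (username, password) -> bool.
    members: dict acting as the local user table.
    Returns the member record, or None if the credentials are rejected.
    """
    if not authenticate(username, password):
        return None
    if username not in members:
        # First successful login: no sign up page, just create the account.
        members[username] = {"username": username, "blogs": []}
    return members[username]
```

The point of the design is that the directory stays the only place where passwords are checked, while the local table holds only what the social network itself needs.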

I’ve finally finished setting it up for general use today. The few people who know about it and have already joined instantly see the benefits of having the social networking layer on top of the blogs. I’m excited to see how this works out over time. It’s not something we’re going to launch in a big way just yet (it’s only me supporting it at the moment), but I’m guessing that it will spread quite quickly through word-of-mouth.

The university web team are supportive and are sending staff and whole departments my way when they want a web site. The IT support team have been trained to use WordPress, should enquiries come their way. We’ve got a few projects that have been waiting patiently for the new home of the blogs and a number of the Learning Lab blog users are migrating across already. The potential for supporting personalised and group online learning is now better than it’s ever been and the social networking element only helps bring peers together for collaboration and discussion.

Many thanks to Jim Groom and D’Arcy Norman who have been working on WordPressMU at their universities in ways which I hope we can emulate and contribute to here at the University of Lincoln.

Storytlr: Make your social networking tell a story

Storytlr is a relatively new ‘lifestreaming’ service that allows you to aggregate your activity on a growing number of social networking sites  (and other sites that provide RSS feeds) into one single stream that can then be manipulated to create visual narratives within a given time period.  There are other lifestreaming and aggregation services. FriendFeed is one. I use the WordPress Lifestream plugin on another blog, too.

There are several things I especially like about Storytlr that are worth highlighting here:

  • Manipulate the stream: You can edit the title, text content, date and time of each item in the stream, make items private or the entire stream private.
  • Visual Narratives: Create ‘stories’ from isolated feeds within a certain time frame. For example, I might go to a conference and use this blog to report back to my colleagues. Using Storytlr, I might include Twitter, Flickr and YouTube posts to create a narrative over two or three days. However, I’m probably also using Twitter to keep in touch with other conference participants; things like what time to meet up for a beer or to ask where a presentation is when I have forgotten the room number. Stuff that I wouldn’t necessarily want to include in my report of the conference. Storytlr will allow me to create this conference report selecting specific items from the Twitter, Flickr, YouTube and blog feeds. You can see how this could also be used by students (or staff) who want to tell the story of a project they are working on, or a field trip they’re away on. Several people could share and post to the same account.
  • Some feeds are pulled in realtime: Storytlr uses GNIP to import updates from Twitter, Digg, Delicious and Seesmic in realtime. Increasingly, there’s an expectation that our online activity will show in realtime. RSS/Pull is being replaced by XMPP/Push architectures such as GNIP. No more waiting for RSS feeds to refresh! Watch for news sites like the BBC to start offering realtime news updates using GNIP or similar.
  • Backup to plain text: You can backup/download each of your feeds in their entirety at any time as CSV files.
  • Custom CSS and domain names: It’s your story so why not host it under your domain name in a theme that you have designed?
  • You can share stories on external sites: Once you’ve created a story or aggregated your lifestream, you can then embed it on other sites using Storytlr widgets.
  • Edit, archive, search and republish your lifestream: I use Delicious and Google Reader’s Shared Items to bookmark web pages that I want to share or, more often, bookmark to read at a later date. Storytlr provides a way to aggregate these items, archive them by month and search through them. Nice.
  • Support for Laconica microblogging sites: They support my personal installation of Laconica. It’s the first time I’ve seen this. Support for Identica is growing but it’s nice to see support for other Laconica installations. It’s a distributed microblogging application after all!
  • Forthcoming: It’s early days. They have plans for lots of other features, which users can vote for. Their blog is worth reading, too.

A few issues

  • Login is not secure: There’s no https or lock icon in my browser when I log in and there are only two of us voting for this feature to be implemented!
  • Home-made: It’s self-financed and being developed by two blokes in their spare time from the living room.
  • Speed: It’s a bit slow. A search through your feeds can take a while. However, the good news is that they’re moving to new servers at the end of January, which should resolve this.

Interested? Here are links to my lifestream and a test story of notes from my Christmas break.

Skilling up for WordPress

A post on the WordPress Publisher’s Blog highlights a large increase in the number of new job offerings that include WordPress as a required skill. The original oDesk report shows that Joomla is clearly the ‘most in demand skill in 2008’, although WordPress has the ‘fastest growing demand.’ WordPress only shows 55% of the demand that Joomla has but the growth is still very impressive. At that rate, by the end of 2009, WordPress is very likely to be the ‘most in demand skill’ among oDesk’s clients. oDesk is a ‘marketplace for online workteams,’ a ‘job board for freelance and contract technical jobs’. Their tagline is ‘Hire, Manage, and Pay remote contractors as if they were in your office.’

Of course, according to this video we will all be contractors before too long, quoting the US Department of Labour’s estimate that today’s learner will have 10-14 jobs before the age of 38.

Regardless of how the figures might be (mis)interpreted, the report does suggest that the demand for WordPress-related skills, whether they are technical, administrative or just user-side, is increasing significantly.  A 400% increase in demand for WordPress technical skills means that someone’s got to be managing and posting to those WordPress sites at the end of the day.

When advocating the use of WordPress to the university, I argue that learning how to use online web applications such as blogs and wikis is as relevant to today’s graduates as learning how to use word processors and spreadsheets was a few years ago. My last job was not in ‘technology’, but most of my productive work was done using Confluence, enterprise wiki software that was rolled out throughout the organisation.

Of course, universities are not solely responsible for ensuring students have the right IT skills. Note that 157,690 new blog posts have been made by 170,828 users on wordpress.com alone today, with over 10 million published WordPress blogs worldwide. That’s a lot of people learning for themselves. WordPress is probably the best choice of platform if you want to learn how to navigate around a modern, productive Web 2.0 site. It’s free to use, more popular than Blogger and growing faster, too, and unlike Facebook, you can actually get some work done 🙂

You can read more about wordpress.com statistics here.

History of the Internet, PICOL and CC video

Just a couple of videos which I came across by accident. Both demonstrate how well information can be communicated through animated graphics and images. The first, History of the Internet, “is an animated documentary explaining the inventions from time-sharing to filesharing, from Arpanet to Internet.” I read Where Wizards Stay Up Late this year, which is a compelling read about the same subject. I can imagine the video being used as an effective teaching resource in class with the book included on a reading list.

[vimeo 2696386]

The video looks fantastic in HD on my 24″ iMac display 🙂 One of the reasons for this is the use of the PICOL icons, which are an impressive attempt to “find a standard and reduced sign system for electronic communication.” PICOL stands for Pictorial Communication Language and the icons are CC licensed. While reading about the PICOL project, I came across a decent video introducing Creative Commons, which I hadn’t seen before. I think I’ll use it for my Thinking Aloud seminar later this month.

Facebook to the repository via SWORD

A post to note that I have successfully deposited a document into our institutional repository from my Facebook account using the Facebook SWORD app, written by Stuart Lewis.

There are a few things worth mentioning: It’s a 3.1.1 EPrints IR, hosted at our university and maintained by EPrints Services. EPrints has supported SWORD since v3.1. Originally, the FB app didn’t work for the following reasons:

  • The ‘Depositing on behalf of’ field has to be left empty. I was told by Seb at EPrints Services that this is ‘disabled by default’.
  • The repository URL needs to point at the ‘service document’, not the base URL of the IR. For us, that is http://eprints.lincoln.ac.uk/sword-app/servicedocument
  • We use LDAP for authentication and the IR configuration needed to be tweaked to account for this when depositing via SWORD.
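
For readers who haven’t looked at SWORD itself, a deposit is essentially an HTTP POST of a package to a collection URL, which a client discovers from the service document. The Python sketch below only describes such a request; it sends nothing. The header names are as I understand them from the SWORD 1.x profile, so treat them as an assumption and check the spec. The collection URL and the packaging identifier are placeholders, not values from our repository. It also shows the first point above: when ‘Depositing on behalf of’ is left empty, the corresponding header is simply not sent.

```python
def build_sword_deposit(collection_url, filename, package_bytes, packaging,
                        on_behalf_of=None):
    """Describe a SWORD deposit as (method, url, headers, body); nothing is sent.

    packaging is the package format identifier the repository expects
    (a placeholder here). on_behalf_of is omitted unless a value is given.
    """
    headers = {
        "Content-Type": "application/zip",
        "Content-Disposition": f"filename={filename}",
        "X-Packaging": packaging,
    }
    if on_behalf_of:
        headers["X-On-Behalf-Of"] = on_behalf_of
    return "POST", collection_url, headers, package_bytes


# Placeholder values, for illustration only.
request = build_sword_deposit(
    "https://repository.example.org/sword-app/deposit/inbox",
    "deposit.zip",
    b"zip bytes would go here",
    "http://example.org/packaging/placeholder",
)
```

A real client would add HTTP authentication (LDAP-backed, in our case) before sending the request.
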

Once we’d overcome these issues, my ‘test.txt’ doc was successfully deposited from my desktop to the University of Lincoln IR via Facebook:

…with a few caveats:

  • The app announced ‘Item Deposited!’ and gave a URL which resulted in a 404 dead link: http://eprints.lincoln.ac.uk/sword-app/collections/1738/deposit. I don’t know why. I thought it was because I wasn’t logged in to the IR, but even after logging in, the link was dead.
  • The app (maybe it’s defined in the SWORD spec, I haven’t checked) zipped up my metadata and document, which resulted in depositing two items: My test.txt document and the original zip file were both showing in my item list. This could be because of the way our IR is configured to unpack zip folders. I don’t know.
  • The metadata mapping was partially successful. The refereed status didn’t map across at all and the URL reference I gave mapped to the ‘Identification number’ field in EPrints, rather than the ‘Related URLs’ field, which was what I was expecting. Maybe the SWORD app field could be renamed ‘Identification URL/DOI’ or similar? The title, abstract and my name were correctly mapped. It’s a shame that my email address wasn’t autocompleted as it would be if I were depositing through the normal EPrints workflow.

Despite these issues, it’s good to see this working in principle and I imagine that the above could be rectified quite easily. Perhaps someone can offer their solutions here?

As Stuart notes on his blog, the main value in this kind of app is the ability to broadcast to your Facebook friends that you’ve just deposited something in an IR. My main gripe, however, would be that it doesn’t make the deposit process any easier, which is what interests me about the SWORD protocol. Working this way…

  • I have to use two applications to make my document public, the benefit being that other people are told about what I’ve just done. 
  • The EPrints URL that the app points to, even if it was working, points to a non-public space, so my friends don’t have a direct link to the document from within Facebook. 
  • The metadata fields in the present version of the app are not configurable, which means that I have to add more metadata through the EPrints interface.
  • Finally, it does seem odd to upload a document from my desktop to Facebook only to send it to another application and finish off the process of deposit there. It would be more useful, if I could deposit files that I already hold in Facebook. I don’t use Facebook enough to really know if there are apps that allow you to create documents within Facebook, but if there were, then perhaps Facebook could be used as a (collaborative?) working space and the SWORD app used to deposit final versions to an IR.

Outsourcing email and data storage case studies

The JISC published four case studies on Friday concerned with ‘outsourcing email and data storage’. They are quick reads and straight to the point. Pulling together all the ‘Lessons Learned’, we are told the following:

  • Handle the beta mentality – expect things to change, ask not how you can control change but how will you respond to it.
  • Web 2.0 is as much an attitude as any technical standard.
  • Ensure that your contractual and procurement processes allow for the provision of a free service. They may be designed for a traditional system of tendering with providers bidding to provide the service, and may not cope with a bidding system based on a ‘free’ service.
  • Ensure that students and staff are aware of the reasons behind the change.
  • Who is a student and who is a member of staff? If you have a high proportion of graduates who undertake various jobs and duties for the University, will they need a staff or a student email account, or both?
  • What emails and data do you need to keep private and confidential?
  • Are you aware of the jurisdiction that any external third party servers are under?

Useful observations. For me though, what the reports didn’t address was why each university was providing an email address to students in the first place. Isn’t the issue less about ‘email and data storage’ and more about having a trusted and portable university identity? Providing a GMail or Windows Live hosted account still doesn’t guarantee that the majority of students would use that email address as their primary address (prior to outsourcing at the University of Westminster, “96% of students did not use the University email system”). I’m assuming that the new, third-party managed email addresses are still *.ac.uk accounts – this wasn’t clear to me from the reports. Having a *.ac.uk account is useful, primarily for online identification purposes.

Personally, I think that the benefit of having Google or Microsoft manage a trusted university identity for students, is not the email service itself (yet another address that students wouldn’t necessarily use for messaging), but the additional services that Google provide such as their online office apps, instant messaging, news reader (all accessible from mobiles) and, most importantly, the trusted identity that is used across and beyond those value-added services. Furthermore, as both Google and Microsoft embrace OpenID, that trusted identity will assume even greater ‘value’ beyond their own web services. Email addresses are well established forms of online identity and most people are happy to have that identity managed by a third-party.

I like the URI approach that OpenID currently uses although I think that adoption will be slow if users can’t alternatively use their email address (e.g. johnsmith@gmail.com, rather than http://johnsmith.id.google.com or whatever Google settles on). Some services do allow that option using Email Address to URL Translation, which highlights the value of having an email address, not for the communication of messages but for the communication of one’s identity.
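
The basic idea behind translating an email address into a URL can be shown in a few lines of Python. This is only a sketch, loosely modelled on the template approach, where the provider publishes a URL pattern and the user’s name is dropped into it. It is not the exact algorithm from the Email Address to URL Translation spec, and the template below is a made-up example rather than anything Google or anyone else actually publishes.

```python
def email_to_url(email, template):
    """Map user@domain to a URL by filling {username} in a template."""
    username, sep, domain = email.strip().partition("@")
    if not sep or not username or not domain:
        raise ValueError(f"not an email address: {email!r}")
    return template.replace("{username}", username)
```

With the invented template `http://{username}.id.example.com`, the address `johnsmith@gmail.com` becomes `http://johnsmith.id.example.com`. A real implementation would first have to discover which template applies to the address’s domain.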

Anyone with any thoughts on this? It’s pretty simple to get a message across these days but harder to manage our online identities.