Announcing the Curator’s Workbench
I am proud to announce this new desktop tool, which is definitely the coolest software I’ve worked on this year. It solves several problems we faced in our submission workflow, and we hope it can dramatically speed up processing for large collections with custom metadata. The features break down into three loosely overlapping categories: capture, rearrangement and description.
Here are some screenshots of the interface:
This screenshot shows the project tree to the left and a MODS editor on the right. The user is editing the MODS elements for a single folder called “TUCASI”. The attributes of the selected MODS name element are editable in the properties view in the lower right quadrant.
The most novel feature and the one I most want to highlight is batch metadata crosswalks. The screenshot above shows a crosswalk editor, which consists of a canvas and a palette of widgets. The end user can construct a pretty sophisticated mapping of custom metadata to MODS by “visual programming”. By dropping widgets on the canvas and linking them together, they define how a field becomes an element. Presently the editor only supports tab-separated metadata sources, but as time allows we plan to extend the feature to support any delimited file and XML sources.
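To make the idea of a field becoming an element concrete, here is a minimal sketch of a tab-separated-to-MODS crosswalk in Python. It is not the workbench's implementation (the workbench is built on Eclipse and drives this from the visual editor); the column names, the `CROSSWALK` table, and the function names are all hypothetical and chosen only for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)

# Hypothetical mapping: source column -> nested MODS elements to create.
CROSSWALK = {
    "Title": ("titleInfo", "title"),
    "Creator": ("name", "namePart"),
    "Summary": ("abstract",),
}


def row_to_mods(row, crosswalk=CROSSWALK):
    """Build one MODS record from one row of tab-separated metadata."""
    mods = ET.Element(f"{{{MODS_NS}}}mods")
    for column, path in crosswalk.items():
        value = (row.get(column) or "").strip()
        if not value:
            continue  # skip empty fields rather than emit empty elements
        parent = mods
        for tag in path:
            parent = ET.SubElement(parent, f"{{{MODS_NS}}}{tag}")
        parent.text = value
    return mods


def crosswalk_tsv(text, crosswalk=CROSSWALK):
    """Generate one MODS record per data row in a tab-separated source."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [row_to_mods(row, crosswalk) for row in reader]


# Made-up sample data: one row with an empty Summary field.
sample = "Title\tCreator\tSummary\nField notes\tDoe, Jane\t\n"
records = crosswalk_tsv(sample)
print(ET.tostring(records[0], encoding="unicode"))
```

The mapping table plays the role of the links drawn on the canvas: changing the table and rerunning regenerates every record, with no per-collection script to write.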
Whenever a crosswalk definition is saved, it is used to generate or regenerate a set of MODS records. These MODS records can be automatically associated with files and folders through a matcher widget on the canvas, which works as long as you have file and folder names in your custom metadata. Otherwise you can drag and drop a MODS record onto the appropriate item in the arrangement.
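The matching step can be pictured roughly as follows. This is only a sketch under assumptions: the `Filename` column, the sample paths, and the rule that ambiguous names are left for manual drag and drop are all hypothetical, not a description of how the matcher widget actually behaves.

```python
from pathlib import PurePosixPath


def match_records(rows, paths, name_column="Filename"):
    """Pair metadata rows with arrangement paths by file or folder name."""
    by_name = {}
    for path in paths:
        by_name.setdefault(PurePosixPath(path).name, []).append(path)

    matched, unmatched = {}, []
    for row in rows:
        candidates = by_name.get(row.get(name_column, ""), [])
        if len(candidates) == 1:
            matched[candidates[0]] = row
        else:
            unmatched.append(row)  # missing or ambiguous: link by hand
    return matched, unmatched


# Made-up rows and arrangement paths for illustration.
rows = [
    {"Filename": "report.pdf", "Title": "Annual report"},
    {"Filename": "missing.tif", "Title": "Lost scan"},
]
paths = ["TUCASI/report.pdf", "TUCASI/images/scan01.tif"]
matched, unmatched = match_records(rows, paths)
print(sorted(matched), len(unmatched))
```

Rows that end up in `unmatched` correspond to the records you would still have to drop onto an item yourself.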
This visual programming and automation of crosswalks saves a lot of valuable time on the part of curators and programmers, who would otherwise be engaged to create custom scripts for each new custom metadata format. Since we are collecting data from disparate parts of the university, each collection may come with a unique descriptive metadata format, often manually created spreadsheets or discipline-specific XML. It’s just not resource efficient to create custom scripts for most incoming collections. The crosswalk feature lets us migrate literally thousands of descriptive records at a time and link them to data objects without new software development.
The last feature to mention today is staging of files. I designed the workbench to process large numbers of files and folders in one submission. However, repository ingest happens via a web interface, which is not the most reliable way of transmitting thousands of large files, let alone a SIP containing such numbers. So we needed to stage files in advance. The diagram above shows how data flows from incoming data to staging, archival and access storage. Individual users have accounts in a staging area within our iRODS grid. Files placed there by the workbench are readable by Fedora at ingest time, when they are copied into archival storage.
This approach comes with several advantages:
- There are no data transmission failures at submission time
- The transmission of files to staging can be incremental, controlled and “paranoid” with a checksum comparison
- The workbench can inform users of staging issues as they arise, so they can be addressed before submission
- Files are staged in the background while you work on arrangement and description
- There are efficiencies to be gained at ingest time, when copying from a staging grid location to an archival grid location.
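The incremental, “paranoid” transfer in the list above can be sketched as follows. This is an illustration only: it copies to a local directory as a stand-in for the iRODS staging area, and the choice of SHA-256, the function names, and the file names are assumptions rather than details of the workbench.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def stage_file(source, staging_dir):
    """Copy one file into staging and verify it; return (target, status)."""
    source = Path(source)
    target = Path(staging_dir) / source.name
    expected = sha256_of(source)
    if target.exists() and sha256_of(target) == expected:
        return target, "already staged"  # incremental: skip verified copies
    shutil.copyfile(source, target)
    if sha256_of(target) != expected:
        target.unlink()  # never leave a corrupt copy in staging
        raise OSError(f"checksum mismatch while staging {source.name}")
    return target, "staged"


# Demo on a throwaway directory (file names are hypothetical).
with tempfile.TemporaryDirectory() as tmp:
    incoming = Path(tmp) / "incoming"
    staging = Path(tmp) / "staging"
    incoming.mkdir()
    staging.mkdir()
    sample = incoming / "scan01.tif"
    sample.write_bytes(b"example payload")
    _, first_status = stage_file(sample, staging)
    _, second_status = stage_file(sample, staging)

print(first_status, second_status)
```

Because a verified copy is skipped on the second call, an interrupted staging run can simply be restarted, and any mismatch surfaces as an error long before submission.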
Some Notes on the Software Technology
The workbench is built upon a considerable pile of open source code and standards, including the following:
- Eclipse Rich Client Platform (RCP)
- Eclipse Modeling Framework (EMF) and Graphical Modeling Framework (GMF)
- METS XML for project definition files and submission files
- MODS XML
- iRODS Jargon client libraries
The Eclipse RCP is extensible via the OSGi framework. This means that parts of the tool can be made modular and/or mashable to better fit non-UNC environments. This will require some refactoring that we need to do anyway, but most of it is already there with OSGi.
One module that I’d like to see is a way to integrate Google Refine into workflows. This seems like a natural fit for cleaning up custom metadata and normalizing various sources before crosswalks are applied.
Another modular area would be export for submission. The current implementation transforms our internal METS project definition into a submission METS for ingest into the CDR. Needless to say, this submission METS is in a CDR-specific profile. So a natural extension point would be to support other export modules for other repositories.
The BETA software is available for download, experimentation and use. We cannot provide any support, but we do welcome your comments here, or you can contact us directly. Oh yeah, you may only download and use the software at your own risk. See our download page.

[...] This post was mentioned on Twitter by Erin O'Meara, gregj. gregj said: Phew, finished my first real post on the CDR blog. http://www.lib.unc.edu/blogs/cdr/index.php/2010/12/01/announcing-the-curators-workbench/ [...]
Tweets that mention Announcing the Curator’s Workbench at Carolina Digital Repository Blog -- Topsy.com
1 Dec 10 at 4:16 pm
[...] available for download to get feedback from the larger community. Please check out our earlier blog post about [...]
CDR update for Fall 2010 at Carolina Digital Repository Blog
15 Dec 10 at 2:52 pm
This looks really slick Greg! Good job. I noticed the downloads are all binaries. Any plans for releasing the source?
Seth
17 Dec 10 at 10:40 am
Seth – we hope to have the code available soon. If you DM Greg, he might be able to share.
eomeara
21 Dec 10 at 11:10 am
[...] looked at some useful tools out there including DROID – digital record object identification, Curators workbench – useful tool from University of North Carolina, creates a MODS description and Archivematica – [...]
JISC Beginner's Guide to Digital Preservation » Blog Archive » Approaches to Digitisation
11 Feb 11 at 4:05 am
Greg, nice job!
Daniel Aldrich
12 Jul 11 at 4:37 pm
[...] versus not; physical versus online access; and dynamic versus static. Erin described her use of Curator’s Workbench within FOXML and Solr to control access permissions and assign restrictions and roles to e-records. [...]
Professional Development – Audra at SAA, Day 1: Collecting Repositories and E-Records Workshop
29 Aug 11 at 7:30 pm
hmmm… yes yes, agree!
liebherr šaldytuvai
12 Nov 11 at 2:29 am
Hi there, just became alert to your blog through Google, and found that it’s really informative. I am going to watch out for brussels. I’ll be grateful if you continue this in future. Many people will be benefited from your writing. Cheers!
Allegro Zwierzeta
19 Nov 11 at 10:16 pm
Great article I like it keep them coming
Sarrah Philips
22 Nov 11 at 10:52 pm
Great post Greg. Nice software!
Rod Fewer
1 Dec 11 at 9:53 am
great post Greg and good software
i will bookmarked your blog .
thank you greg
reza
29 Dec 11 at 11:13 pm
yes best article for horseracing
horse racing system
3 Jan 12 at 4:36 am
good software
Marta
4 Jan 12 at 9:38 am
I hadn’t heard of this, but useful to know. Thanks for sharing…
Byan@Phone Tips and Tricks
13 Jan 12 at 9:47 am
Awesome Software thanks for posting !
Joomla Real Estate
Francisco d'Anconia
17 Jan 12 at 1:15 pm
it’s really good post, i think this is very useful. thanks for sharing with us
readable books
22 Jan 12 at 11:03 pm
Greg, I am going to bookmark your blog, too. Really great post! Good luck!
Julie Homes
1 Feb 12 at 6:56 am
Positive comments should always be posted as quickly as possible, so after looking through the site I decided to add my own small contribution to building its good image. It deserves full respect, because I like spending time on it, and unfortunately I do not have much of it. You are a relaxing getaway for me, if I may put it that way. I like reading articles that, in my opinion, are at a high level of substance. I know from experience that this is rare. Sad, but unfortunately true. Well, I hope to keep meeting only such good authors along my way. See you soon. A friend of the site.
Piotr
10 Feb 12 at 6:10 am
This is awesome stuff you posted. I really enjoy reading it and very informative.
Horse Racing System
12 Feb 12 at 3:11 pm