OpenRefine (formerly Google Refine) is a fantastic tool for cleaning data. We initially explored the software because we had heard it was a great tool to help us transform our metadata into RDF and because we could reconcile with other data sets like the Library of Congress Subject Headings. But even if you are not ready to start creating linked data, OpenRefine is a great tool to help clean up your data.
In our current digital asset management system, CONTENTdm, there are lots of things we can't do using the provided Project Client that helps us manage our metadata creation. We can't make global changes, add fields on the fly, sort in a robust way, or see the contents of fields and break apart elements. And our most serious GRIPE: the software does not have an UNDO button in case of mistakes. Many a time we have "filled down" an entire column, ruining hours of work and cursing the software developers for not building this simple function.
But OpenRefine has all these features and much more. Here is a great intro video to get you started. Once you see the power, you will be hooked! Below the video are links to other resources, including a fabulous book that is chock full of OpenRefine recipes and functions.
And this excellent book: Using OpenRefine.
Please feel free to share with us any ways that you have found OpenRefine to be useful in working with your data.