Monday, 8 September 2008

Some Success!!

After some weeks of searching, experimenting and (seemingly) getting nowhere, I have finally (thanks to the JISC forums & Sam Rowley) worked out a way to get Word docs marked up against a metadata schema and an XML file created - which can then be used to populate the metadata fields in the HIVE repository.

After last week's meeting, it was agreed that we would use/explore the IEEE LOM metadata scheme for learning items.

See: http://standards.ieee.org/reading/ieee/downloads/LOM/lomv1.0/xsd/

Marking up Word docs for Metadata (XML)

I was working on applying a schema to a Word doc so that I could mark it up and create an XML file from it. Thankfully, I managed to get Word to recognize and validate the schema. I got this to work by downloading all the xsd files from the IEEE standards website and recreating the file structure (as referenced in the xsd files too) on my local drive - see image right.
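The folder layout can be read off the xsd files themselves, because each schema names the files it depends on in `schemaLocation` attributes. Here is a minimal Python sketch of that idea, assuming the references sit on `include`, `import` or `redefine` elements. The function name `referenced_schemas` and the sample schema text are made up for illustration and are not taken from the LOM files.

```python
import xml.etree.ElementTree as ET

# Namespace of XML Schema elements such as xs:include and xs:import.
XS = "{http://www.w3.org/2001/XMLSchema}"


def referenced_schemas(xsd_text):
    """Return the schemaLocation values found on include/import/redefine elements."""
    root = ET.fromstring(xsd_text)
    locations = []
    for tag in ("include", "import", "redefine"):
        for node in root.iter(XS + tag):
            location = node.get("schemaLocation")
            if location:
                locations.append(location)
    return locations


# Hypothetical schema text, for illustration only (not copied from lom.xsd).
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:include schemaLocation="common/dataTypes.xsd"/>
  <xs:import namespace="http://example.org/extend"
             schemaLocation="extend/custom.xsd"/>
</xs:schema>"""

if __name__ == "__main__":
    # Each printed path is relative to the folder holding the schema file.
    for path in referenced_schemas(SAMPLE_XSD):
        print(path)
```

Running the same function over each downloaded xsd file should show which sub-folders need to exist next to lom.xsd.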

To apply a schema - go to 'Tools' > 'Templates and Add-ins' > 'XML Schema' tab > 'Add Schema' (Word 2003)

It is a bit buggy (I don't know whether the LOM schemas are well written) - anyway, to get around the problem of Word not finding some 'namespaces', or some not being declared properly, I used the following in conjunction with the primary lom.xsd:
  • extend\custom.xsd
  • unique\loose.xsd
  • vocab\loose.xsd
You can change the schema settings and browse for versions to be referenced.

To start - apply the schema to the whole document - then highlight text and select the LOM element it is to be marked up with.

One thing of note - Word does not like you to have any extraneous text/data in the Word doc before you 'Save As' an XML doc (save data only). If you do not want the bother of removing this 'extra text/data' - you can force Word to save the XML file without validating it against the schema (see XML options). This doesn't seem to compromise the process of using the XML file to populate the metadata fields when uploading a document to HIVE (using the Harvest Road Explorer 3.0 screen).
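If you would rather find the extra text than switch validation off, the saved XML file can be checked for it. The Python below is only a rough sketch. It assumes that un-marked-up text ends up as loose text sitting between elements in the saved file, which I have not confirmed for Word's 'save data only' output. The function name `stray_text`, the sample content and the namespace URI in the sample are illustrative assumptions.

```python
import xml.etree.ElementTree as ET


def stray_text(xml_text):
    """Return text that sits between child elements instead of inside a leaf element."""
    root = ET.fromstring(xml_text)
    found = []
    for element in root.iter():
        children = list(element)
        if not children:
            continue  # leaf element: its text counts as marked-up data
        # Text before the first child of a container element.
        if element.text and element.text.strip():
            found.append(element.text.strip())
        # Text following each child (its "tail") inside the container.
        for child in children:
            if child.tail and child.tail.strip():
                found.append(child.tail.strip())
    return found


# Hypothetical saved file; the namespace URI is assumed, not checked against lom.xsd.
SAMPLE_XML = """<lom xmlns="http://ltsc.ieee.org/xsd/LOM">
  Draft notes, not marked up
  <general>
    <title><string>Intro to XML</string></title>
  </general>
  See also the handout
</lom>"""

if __name__ == "__main__":
    for fragment in stray_text(SAMPLE_XML):
        print("Un-marked-up text:", fragment)
```

Anything the function reports is a candidate for either marking up or removing before saving again.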

The schema type I told HIVE I was supplying is IMS 1.3.

The image supplied shows the validation screen in Word for the XML metadata bindings - the 'X' logo indicates that there is a validation issue - this disappears when you get rid of the extra 'un-marked-up' text/data.

If you want more details about uploading files to HIVE using Harvest Road Explorer and the XML files - please see previous blog entries.

1 comment:

FlorenceinSummer said...

This is great Ben - this will help me with the Enable project. Especially if I can have extraneous data in the Word document (let's face it, not all content needs to be XML tagged to go into HIVE)