Tuesday, August 31, 2010

EPUB Development - The Details


Connexions recently introduced EPUB files for Collections and Modules. EPUB files are electronic books (ebooks) that can be read on mobile devices, dedicated electronic book readers, and on computers running ebook reader (eReader) software. Creating the EPUB files presented a unique set of problems, and this post describes how we solved them.

Our content is stored as XML files at the Collection (collxml) and Module (cnxml) levels. To convert a Collection to an EPUB, we first retrieve the Collection and Module XML files. The files are then converted to DocBook, and from DocBook into HTML, which is used to create the EPUB files. DocBook already had a DocBook-to-EPUB transformation that generates the EPUB-specific configuration files and directory structures. However, that transformation is geared toward standard books, not textbooks: it lacked transformations for bibliographic information, annotations, and more detailed footnotes, so we added them. We also added linkage between Exercises and Solutions. All of this is triggered by a Ruby script that originally came from DocBook.
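To make the "configuration files and directory structures" concrete, here is a minimal Python sketch of the container layout that any EPUB needs. It is not the Ruby script or the DocBook stylesheets we actually use. The function name build_epub_skeleton and the OEBPS/content.opf path are assumptions chosen for illustration; the mimetype entry and META-INF/container.xml are required by the EPUB container format.

```python
import io
import zipfile

# container.xml tells the reader where the package (OPF) file lives.
# The OEBPS/content.opf path is a common convention, assumed here.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_epub_skeleton(files):
    """Return the bytes of an EPUB zip container.

    `files` maps archive paths (for example "OEBPS/index.html") to bytes.
    This hypothetical helper only packages files; it does not generate
    the HTML or the OPF for you.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The mimetype entry must come first and must be stored uncompressed.
        zf.writestr(zipfile.ZipInfo("mimetype"), "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML,
                    compress_type=zipfile.ZIP_DEFLATED)
        for path, data in sorted(files.items()):
            zf.writestr(path, data, compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()
```

The generated HTML, stylesheet, and images would be passed in through `files`; everything else in the sketch is fixed boilerplate that the DocBook-to-EPUB transformation takes care of for us.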

Because EPUBs are usually displayed on small screens, we had to create a new stylesheet. The new style has less indentation so the text uses all of the available screen area. Boxes around notes and examples were also modified so that they still separate the content but take up less space.

We tested the EPUBs on a variety of devices. The initial developer testing was done using the ebook reader in Calibre or the Firefox EpubReader plugin. Our formal testing used an iPad and an iPod Touch. We tested using Stanza and iBooks on both devices. More recently, we have tested on the enTourage eDGe. Some outside testing was performed by "friends and family" on a variety of platforms including iPads, iPhones, iPod Touch and Droid phones.

There were several challenges in creating the EPUB files:
  • Handling MathML: EPUB readers do not support MathML, so the obvious solution was to convert simple MathML to text and convert all of the more complex MathML to images. We first tried converting to SVG, but found that although SVG is part of the EPUB spec, it is not supported by ebook readers. Converting to PNG solved the problem, because PNG is supported by all of the readers.
  • Cover Images: Book cover images were needed for the "bookshelf" display in ebook readers. The image needed to be customized with the title for each EPUB and needed to be generated when the EPUB was created. Our team created an SVG image of the cover. When an EPUB is created, the title of the EPUB is added to the SVG and it is converted to a PNG image for the cover. The same image is used as part of the title page of the EPUB. The image is created using part of the as-yet-unreleased SVG 1.2 spec for automatic text reflow (using Inkscape).
  • Metadata: CNXML had a problem with Module metadata that was corrected in version 0.7. However, most of our content is still stored as CNXML 0.5. We solved this by creating a CNXML 0.7 version of the CNXML for every module as part of the Collection source zip and the Module source zip. The EPUB generation uses these source zip files to retrieve the XML. This upgraded version of the CNXML allowed access to the needed metadata and will allow developers to use the source zip files without having to be concerned about the version of the CNXML.
  • EPUB Limitations: As we have tested our EPUB files, we discovered some limitations with the EPUB format and readers. Since Connexions authors have not been entering content with EPUBs in mind, some content does not display correctly as an EPUB. We created a set of author guidelines that discusses the limitations found and offers some suggestions for entering content that will display correctly as an EPUB.
  • Offline HTML Zip: There have been numerous requests for a downloadable HTML version of Modules and Collections. Creating EPUBs required the creation of a new HTML version of the content, so it seemed like a good time to offer an Offline HTML. We soon discovered that what looked good in an EPUB reader does not always look good in a browser. Our original thought was to reuse the HTML generated for the EPUB, but that was abandoned for a separate HTML generation that corrected the problems. See the downloads help page for more information on the files available.
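As an illustration of the cover image step described above, here is a minimal Python sketch of adding a title to an SVG template. It is not our production code. The template, the cover-title id, and the function name render_cover_svg are all assumptions made for this example. It uses a plain SVG text element, so unlike the Inkscape flowed text we rely on, a long title will not wrap. Rasterizing the result to PNG is left out.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Serialize SVG elements without an "ns0:" prefix.
ET.register_namespace("", SVG_NS)

# Hypothetical cover template; the real cover artwork is more elaborate.
TEMPLATE = """<svg xmlns="http://www.w3.org/2000/svg" width="600" height="800">
  <rect width="600" height="800" fill="#224466"/>
  <text id="cover-title" x="40" y="120" font-size="36" fill="#ffffff">TITLE</text>
</svg>"""


def render_cover_svg(template, title):
    """Return the SVG template with the placeholder text replaced by `title`."""
    root = ET.fromstring(template)
    for element in root.iter("{%s}text" % SVG_NS):
        if element.get("id") == "cover-title":
            # ElementTree escapes characters such as "&" on output.
            element.text = title
            break
    else:
        raise ValueError("template has no element with id 'cover-title'")
    return ET.tostring(root, encoding="unicode")
```

Editing the SVG as XML, rather than with string substitution, means that titles containing characters like "&" or "<" still produce a well-formed file.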
The EPUB files are an important step in allowing Connexions users to have their content easily available on multiple platforms. We welcome any feedback so we can continue to improve them.

Monday, August 30, 2010

Introducing Minh Do

“Next in our series of introductions of the members of the Connexions Consortium Technology Committee, I am delighted to introduce Minh Do, the director of a team that was one of the first to embrace Connexions and Enterprise Rhaptos internationally. The Vietnam Foundation has been working with Connexions and running Rhaptos in Vietnam since 2006 and they have been pioneers in the areas of internationalization, proxy caching, language support in PDF generation, customization, training and local support for Rhaptos. Their vision for a 21st century education in Vietnam is inspiring and their commitment unwavering. Minh and his team have been trusted advisors in the development of Enterprise Rhaptos and they have been key beta testers for new features in the software, offering their time on short notice and helping ensure quality.”
Kathi Fletcher -- Technology Director and Project Manager at Connexions

I am Minh Do and I am the Vietnam Open Education Resources (VOER) Program Director at the Vietnam Foundation (VNF). My background is in eLearning, open course ware, and information technology. Before joining the VNF, I was the Vietnam Open Course Ware Program Manager for the Vietnam Education Foundation (VEF). And prior to that, I was the Deputy Director of the Center for e-Learning and Online Testing Technology at the Information Technology Institute (ITI) at the Vietnam National University in Hanoi (VNU Hanoi). There, I coordinated the "in-country training course," for the second phase of the "Vietnam Information Technology Training Project" hosted by JICA Japan and VNU Hanoi.

My involvement with Connexions began in June 2006 at the CNX training workshop at VNU Hanoi. I met Hung Tran from the VEF technical team and Rich Baraniuk, Sidney Burrus, and Chuck Bearden from the CNX team. I was impressed by the software features and the LEGO idea and kept in touch from that time on.

My e-Learning center collaborated with the VEF team to increase the Open Courseware (OCW) activities in Vietnam by installing a version of Rhaptos in Vietnam and training users to create and share their knowledge with it. We volunteered to translate the Rhaptos interface into Vietnamese, and I myself went around the country with Hung Tran to set up the caching servers in preparation for the launch of the VOCW Program. Day by day, I got more and more deeply involved in VOCW. Finally, I decided to fully join the VEF team and lead the program with Hung Tran.

The transition from VOCW to VOER is quite a long story, but now, let’s talk about where VOER is heading with its “3 development legs” model (Technology, Content, and Community):

  • Technology Development: Having an innovative, reliable, and inexpensive platform plays a vital role for any OER project. Thanks to the release of Enterprise Rhaptos (ER) from Connexions, we were able to deploy a robust software platform for VOER with a major data center located in Hanoi. The VOER website is now available for free public access from the Internet. We are also cooperating with an open source development company named NextG Solutions. NextG specializes in Zope/Plone solutions and will test and customize ER, and develop more functions for ER as the community needs. A support center named Connexcent (Asia-Pacific Connexions Support Center) was established under the umbrella of the VNF in order to provide best practices and support services to the regional community in particular and the global community in general.

  • Content Development: Besides the technology, content is the heart of VOER. To provide the seed content for the project, we aimed to collect teaching materials from various sources and make them available on the VOER website. Connexions Rhaptos makes the usage of textbooks more effective by dividing a textbook into independent parts, or modules, and allowing combination of modules from different textbooks to create a new one. In the first year of VOER, we plan to publish 20,000 modules on Rhaptos. The first source of available textbooks is a library of 1000 textbooks from MOET (Ministry of Education and Training). Other sources include universities, companies, and research institutions. After just a few months working with these organizations, we have around 10,000 modules available in VOER.

  • Community Development: The last leg makes the project sustainable. We are in the process of building the community for VOER by cooperating with our partners in technology development and content production, providing training to students and faculty members on using and sharing content, and sharing information about VOER with the public. So far we have established the Alliance for Vietnam Open Learning Technologies (VOLT); VOLT has early members from important entities in Vietnam such as FPT (the largest technology company in the country), the Institute of Sociology (the leading sociology research institute in the country), the Online Management Training Company, the Center for Promotion of Advancement of Society (a non-profit organization belonging to the Vietnam Union of Science and Technology Associations) and the Hanoi Obstetrics And Gynecology Hospital (the leading and first-of-its-kind hospital in Reproductive Health Care in Hanoi). Members of this community are now using customized versions of Enterprise Rhaptos to create and share knowledge both internally and externally. In addition, we have won participation commitments from major universities in Vietnam such as the University of Danang, the University of Education, and the Vietnam National University in Hanoi and HoChiMinh City.

Based on my experience with VOER, as a Connexions Consortium Technology Committee member, I really want to help accelerate all the “3 development legs” activities for both Connexions and Rhaptos. The software needs to be more user-friendly, more flexible to customize, as well as able to handle large amounts of uploading and sharing. The community needs expansion beyond individuals to organizations like universities or institutes, because the number of users will determine the success of CNX. Besides that, content on different Rhaptos instances should be linked together to help people use modules and collections more flexibly and easily. As I am focusing on the development of Enterprise Rhaptos, I hope our technology team will be instrumental in the very near future in designing and implementing a way to link all the needed modules from any Rhaptos repository into collections.

Please visit the links below to see more about what we are doing.

Monday, August 23, 2010

Introducing Roché Compaan

“I am pleased to introduce technology committee member, Roché Compaan of Upfront Systems. Roché's expertise in software development and Plone and Zope technologies has been invaluable, and Upfront Systems has rapidly become expert in Connexions/Rhaptos development. Over the past year and a half, Upfront built several new lens features used by Siyavula, a proxy-cache to speed up access in South Africa, the new Connexions rating system, and the new Express Edit feature that helps authors quickly check out content for editing and helps readers derive a copy to adapt. Roché's performance expertise and advice led to a configuration change that halved the time authoring tasks consume. We are very lucky to have his involvement in the Consortium and Technology Committee.”
Kathi Fletcher -- Technology Director and Project Manager at Connexions

In 1998 I co-founded a software development company called Upfront Systems, located in Stellenbosch, South Africa. Very early on I felt myself drawn to the open source movement and thought that it was a very healthy and productive protest against the establishment. I didn't think it was crazy to build a business on open source principles, but my partners did, and since 2000 I have been the sole owner of the company.

Around 2001, a colleague of mine encouraged me to look at Zope. It was a web framework that was years ahead of its time. It was a significant departure from the then-common CGI-style web apps, and it boasted an object database, a multi-threaded web server, and a publisher that could traverse and publish objects. I have been involved with Zope and the community around it ever since, and saw Plone grow up to become one of the major content management systems in the world. Upfront Systems was the first Zope solution provider in South Africa and we contributed a Zope and Plone training course to the community early on. We used this same course to train developers at Computer Associates in New York when they had a brief flirtation with Plone in 2004.

About 3 years ago I met Mark Horner, the Open and Collaborative Resources Fellow at the Shuttleworth Foundation. As part of his Siyavula project, Mark was looking for a platform to use for the publication of a whole curriculum of workbooks bought from a private school in South Africa. Connexions caught his eye, mainly because it did a darn good job at producing printed books by using LaTeX for typesetting. Mark is a LaTeX junky. So when LaTeX junky and Plone pundit met, no other framework stood a chance. Admittedly the patience and charm of the Connexions team had a lot to do with the choice to go with Connexions as a platform. Over the past few years we've undertaken numerous 20-hour trips to Houston to scope and plan the development of extensions to Connexions. The extensions helped us present content in a way that is familiar to South African teachers, while ensuring that the features that we develop are generally useful to other Connexions users.

We were the first external development team that worked on features that would be released on cnx.org itself. It shouldn't come as a surprise that this wasn't smooth sailing in the beginning. But I believe this is exactly what Connexions needed - a remote development team that can help surface development practices and knowledge that were mostly held by members of the Connexions team and not visible to the outside world.

As a committee member I would like to focus on growing the Rhaptos developer community. As a long-time member of the Plone community I will naturally look there to recruit developers. At the upcoming Plone Conference in Bristol I will lead a sprint where we will start the migration of Rhaptos to Plone 4.0. Rhaptos is still running on Plone 2.5, and moving it forward to the latest Plone version would make developing for it significantly more attractive to existing Plone developers.

Looking forward to seeing you all at the Plone Conference in Bristol!

Introducing the Connexions Consortium Technology Committee Members

Over the next few weeks, the Connexions Consortium Technology Committee members will be introducing themselves on this blog to the wider developer community. The Connexions Consortium is a group of organizations and individuals, including the world's foremost leaders in education, who work together to advance open source educational technology and open access educational content.

The technology committee is responsible for the technical aspects of the Connexions Consortium. Among other things, it oversees the technical development, implementation, and maintenance of the Connexions platform. The committee reviews and makes recommendations to the Board about the technical affairs and policies of the Connexions Consortium.

We are lucky to have a really outstanding and committed group of eight individuals on the technology committee. They participate in conference calls together about every 6 weeks, and they provide expert input to Connexions, the Connexions Consortium, and Connexions and Rhaptos partners. I look forward to introducing them to you.