Whither the ebook index?—Erin Mallory (ISC conference 2014)

Erin Mallory is the manager of cross-media at House of Anansi Press, which has been publishing ebooks (in addition to its print books) since 2009. Mallory launched the Indexing Society of Canada’s 2014 conference with an overview of the current state of ebook indexing workflows.

Ebook formats

Ebooks come in three main formats:

  • PDFs support some multimedia and interactivity and are easy to create but have limited sales channels. The static format of PDFs makes them popular for technical or reference books but may create a poor reading experience on certain devices (for example, a smartphone).
  • EPUB is the most popular ebook format and is essentially a self-contained website, using XML and CSS. Text is reflowable. EPUB is a neutral, standard format compatible with all current e-readers except the Kindle. EPUB 2, still the most commonly used version, is based on HTML 4 and CSS 2. EPUB 3 is a newer format, with many improvements in functionality, accommodating languages that read vertically or from right to left, as well as MathML.
  • MOBI is also based on XML and CSS but is proprietary to Amazon and is compatible only with Kindle devices and apps.

The main reading engines are:

  • Adobe Reader Mobile SDK, which renders ebooks on Adobe Digital Editions, Kobo, and Nook.
  • WebKit, which renders ebooks on most mobile e-readers, including the iPad, and browser-based e-readers.

Publisher’s considerations

Ebook indexes are really only useful if they are fully hyperlinked. Until recently, hand coding each hyperlink was the only way to create a fully functional ebook index, so publishers had to consider the return on investment. Not only is creating an ebook index time consuming, but proofing the index adds time to the quality-assurance process.
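To make "fully hyperlinked" concrete, here is a minimal Python sketch of what hand coding one index entry amounts to: every locator becomes a link to an anchor in the book's XHTML. The function name, file names, and anchor ids are hypothetical and chosen only for illustration; this is not a description of Anansi's workflow or of any tool's output.

```python
from html import escape

def index_entry(heading, targets):
    """Return an XHTML paragraph for one index entry.

    targets is a list of (label, href) pairs, one per locator.
    """
    links = ", ".join(
        f'<a href="{escape(href, quote=True)}">{escape(label)}</a>'
        for label, href in targets
    )
    return f'<p class="index-entry">{escape(heading)}, {links}</p>'

# Hypothetical file names and anchor ids.
entry = index_entry(
    "reflowable text",
    [("12", "chapter03.xhtml#idx-0041"), ("57", "chapter05.xhtml#idx-0102")],
)
print(entry)
```

Multiply that by every locator in a back-of-book index, plus a matching anchor in the text for each one, and the return-on-investment question becomes obvious.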

Further, the publisher has to consider what devices its audience is using. First-generation Kindles and Kobos don’t support hyperlinking, and not all e-readers support a “back to” function.

Because of these limitations, Anansi decided when it launched its ebook program in 2009 not to include indexes in ebooks at all. Today, the publisher has adopted a workflow that has streamlined some aspects of ebook index creation.

Recent improvements

Scripts for Adobe Creative Suite 5+ can be very useful; some auto-generate cross-references in a formatted index that are maintained when exported to EPUB. The scripts aren’t perfect, so some (about half) of the links still need to be hand coded. These scripts use styles, so if a designer hasn’t properly styled the index, they won’t work properly.

There are also scripts that convert an external index (for example, one created in Word or a program like Cindex) to create an index in InDesign that is maintained on PDF export.

The Creative Cloud version of InDesign allows for linked indexes to be exported into EPUB. Publishers can be reluctant to relinquish control of their InDesign files to an indexer, but Mallory acknowledges that if professional indexers can save time by embedding the index, publishers may have to push aside their reluctance and find ways of working with them.

Project considerations

For each project, ask yourself the following:

  • Does your ebook need an index?
  • Does the index have to match the print book?
  • What devices will your readers use?
  • Can the index be adapted to better serve the digital reading experience?
  • Can you change your indexing workflow to simplify the ebook index creation process?
  • What kind of markers do you want to use?

Mallory points out that in an ebook, using page numbers may not make the most sense. Some indexers in the audience remarked that seeing a page range communicates important information about subject coverage. InDesign indexes can allow the range to be listed but link only the first page number.
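As a rough illustration of "list the range but link only the first page number," here is a short Python sketch that builds the markup for one locator. The function name and the href are made up for the example, and the output is not meant to match what InDesign actually exports.

```python
def linked_range(first, last, href):
    """Show the whole page range but wrap only the first page number in a link."""
    link = f'<a href="{href}">{first}</a>'
    if last is None:
        return link
    return f"{link}–{last}"

# Hypothetical target file and anchor.
print(linked_range("54", "55", "salads.xhtml#p54"))
```

The reader still sees the extent of coverage, while the link drops them at the start of the discussion.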

(On Day 2 of the conference, Judy Dunlop gave an excellent summary of the workflow she used in a recent project doing embedded indexing in InDesign Creative Cloud. Post coming soon!)

Indexing Society of Canada and Editors’ Association of Canada conferences 2014—personal highlights

I’m back from four and a half days in Toronto, where I attended ISC’s and EAC’s national conferences. As in previous years, I’ll be posting summaries of some of the talks I attended—a process that, as I’ve learned, will take me several weeks. Both conferences were excellent, featuring a variety of sessions that appealed to novices as well as seasoned pros and that tackled not only the technical aspects of indexing and editing but also the business side of freelancing. Best of all was being able to see old friends and pick up conversations as if no time had passed, as well as meeting new colleagues and putting faces to names.

My days were packed: I had the privilege of introducing indexing superstar Enid Zafran at her talk about indexer–author relations at the ISC conference, and at the EAC conference I ran a two-part senior editors’ unconference: at a lunchtime session on Saturday, editors shouted out topics they wanted to discuss. I recorded the topics on a flip chart, then, with the help of sticky dots, the editors voted on their favourite ones. I ranked the topics based on votes and created our discussion agenda for our session on Sunday. It was impossible to get through all fourteen of the proposed topics, and it would have been great to have more time, but in general I thought the format worked reasonably well. It also helped that we had a great group; I’m consistently amazed by how much can happen when you just get a bunch of smart people talking to each other about what they know.

The highlight of my week, though, was the EAC banquet. Not only did we learn from Moira White that EAC has established a new award (for a person or organization that has helped advance the editing profession) in memory of our late friend Karen Virag, but we also saw Certification Steering Committee co-chairs Anne Brennan and Janice Dyer acknowledged for their enormous volunteer contributions to the association. Both won the President’s Award for Volunteer Service, a well-deserved and long-overdue recognition of the hours and hours and hours of work they put into steering the certification program. Congratulations go out to all the President’s Award winners, including Lee d’Anjou Award–winning volunteer of the year, Michelle Boulton. (Just as a note to the national executive, I would have loved to hear what these fantastic volunteers had done for EAC, not just their names! Please consider giving a one-sentence summary of each volunteer’s contributions at next year’s banquet.)

Congratulations, also, of course, to Claudette Upton Scholarship winner Daniel Polowin, and to University of Alberta Press’s Peter Midgley, who finally, finally received the Tom Fairley Award for Editorial Excellence he so deserves.

For me, the most exciting part of the evening was being able to present, on behalf of the Certification Steering Committee, designations of Honorary Certified Professional Editor to six pioneers of EAC’s certification program. Without them, the program simply wouldn’t exist. As someone who’s benefited tremendously from certification, both as a CPE and as a CSC member who’s had the privilege to work for the past two and a half years with some of the most brilliant, funniest people I know, I want to thank and congratulate these champions, mentors, and friends for their dedication: Lee d’Anjou, Peter Moskos, Maureen Nicholson, Jonathan Paterson, Frances Peck, and Ruth Wilson. I would not be where I am today without them.

If anyone has any photos of the presentation they could send me, I’d be grateful for them. Believe me—the amount of restraint it took to keep from spilling the beans about this surprise was enormous!

Book review: Starting an Indexing Business

You’ve taken indexing courses. Read the indexing chapter of the Chicago Manual of Style and Nancy Mulvany’s Indexing Books. Bought yourself indexing software.

Now what?

For most would-be indexers hoping to start their own freelancing business (as many of us are now aware), the actual indexing work isn’t the biggest challenge. Getting that work, not to mention managing the financial and administrative details of self-employment, is the tough part, and it’s one that gets very little attention in most indexing reference books. Starting an Indexing Business, edited by Enid Zafran and Joan Shapiro, is a rare exception, offering people who are launching—or considering—a career as a freelance indexer some insider wisdom about running their own business.

The fourth edition of Starting an Indexing Business was published in 2009, but it was recently released as an ebook. With chapters on moonlighting as an indexer while holding down a full-time job (Melanie Krueger), the business of being in business (Pilar Wyman), and liability and exposure issues for indexers (Enid Zafran), this book tries to answer a lot of questions that a freelancer just starting out might have. It’s a quick read, and it’s packed with tips from indexing veterans who have spent years in the trenches. Seeing the issues from different indexers’ perspectives is helpful, and the diversity of contributors shows that, despite having similar traits that make us good at what we do, different indexers take different approaches to running their business. Particularly interesting is the debate about whether to invest in disability insurance, with Wyman advocating for it and Zafran saying she didn’t see the need.

Zafran’s chapter about liability has a lot to offer, spurring the reader to think about how best to protect their business and to assert their copyright to make sure they get paid. A sample letter of agreement for indexing services also appears as an appendix to the book, and it serves as a helpful tool for freelancers to communicate clearly with a new client and start off their working relationship on the right foot.

Although the book has plenty of solid advice for new indexers, much of it will be old hat to people who have had a few projects under their belt. Being five years old, it also needs an update. I suspect that cold calling and mailing out brochures to prospective clients, as marketing strategies recommended by a few of the contributors, have largely given way to email enquiries and websites. I would also hope that a fax machine is no longer a must-have in the home office. Workflow and file transfer technologies have also evolved dramatically since the book’s publication, and ebooks and self-publishing have exploded. Further, the book is geared toward a primarily American audience, with references to health insurance and U.S. taxes that wouldn’t apply to Canadian indexers.

New freelancers may find Starting an Indexing Business helpful, although I wouldn’t call it a must-read. For those with a few years’ experience already, there isn’t much in this book that you won’t already know. And beyond the sample letter, I don’t see much in this book that you would refer to time and again, so I’d be inclined to borrow it from the library, if you can. If you do want to add this title to your collection, I’d suggest waiting for an updated edition, so that the advice better reflects current practices and technology.

Back to school: A self-indulgent personal post

This week I got an official letter of acceptance to the PhD program in SFU’s Faculty of Health Sciences, where I’ll be studying knowledge translation. In particular, I’ll be looking at ways to apply plain language principles to mental health research to make it more accessible to patients, practitioners, advocacy groups, and policy makers. I’m thrilled by the prospect of applying my editorial skills and clear communication knowledge to increase health and scientific literacy.

Although I’m heading back to school, in no way will I be leaving publishing; I adore my career, and my plan (although plans may change, of course) is to come right back once I’ve completed the degree. In the meantime, I’ll be dialing down the amount of publishing work I take on to a small handful of projects a year so that I can focus on my research.

I’ll also be drastically cutting back on my volunteer commitments with organizations such as the Editors’ Association of Canada. Over the past two years I’ve been a member of EAC’s Certification Steering Committee, which oversees the national program that certifies editors who have demonstrated excellence in proofreading, copy editing, stylistic editing, or structural editing. This committee is made up of some of the smartest, funniest, and most dedicated people I know, and working with them on projects to promote and strengthen the certification program has been a huge privilege. Leaving this collegial, optimistic, and productive group in August will be bittersweet.

At the branch level, I’ve worked with Frances Peck for the past two seasons (and with Micheline Brodeur last year) on the EAC-BC Programs Committee to set topics and invite speakers for our monthly meetings. We managed to put together an impressive lineup of speakers on fascinating subjects from forensic linguistics and cartography to subcontracting and the evolving role of libraries. Our ideas have spilled over into next season, and whoever takes over on the committee next year will be able to hit the ground running.

I can’t emphasize enough that my experiences on these committees—not to mention the professional relationships and friendships I’ve forged—have been tremendous for professional development, and I urge anyone considering volunteering for EAC to seize the opportunity. I will still be an active EAC member, and I am still happy to volunteer for small jobs here and there or for one-off events, but I’ll no longer have the time for ongoing committee work. If there’s still demand after this year’s PubPro unconference, a peer-driven professional development event for publication production professionals, I would be more than willing to run it again. And I still hope to attend EAC meetings and conferences and write up what I’ve gleaned from the sessions on my blog (although once I’m off the Programs Committee, I may allow myself to miss the odd meeting).

Speaking of my blog, my intention is still to post regularly on editorial, indexing, publishing, and plain language topics, but you might start seeing a bit more of a knowledge translation, health literacy, or mental health bent to my writing. Realistically, though, I won’t have time to do any more book reviews once school starts up. I’d love to keep crapping out my dumb little cartoons, but I might not be able to keep up with my monthly schedule.

Finally, I’d love to keep teaching in SFU’s Writing and Communications program. Changes are afoot in how those courses are being offered, though, so I’m not sure if I’ll still have a role to play. If it turns out that I will, I’ll be sure to post news about upcoming courses.

I’d like to thank all of my friends, colleagues, and mentors who have given me encouragement and advice as I’ve plotted this next step, which I have wanted to take for a long time. I feel incredibly lucky to be surrounded by so many amazing, supportive people.

Cookbook indexing in Microsoft Word

I’ve just wrapped up a cookbook index, and while I was putting it together I found myself referring to notes I’d made a while ago for a friend who wanted to do cookbook indexing but didn’t want to invest in indexing software. When I worked in house, I’d prepared several cookbook indexes using only Microsoft Word and figured out, through trial and error, a reasonably efficient system. I figured I’d share my notes here for anyone interested. If you have a client with a specific house style, you might have to adjust the approach a bit.

What follows isn’t a guide for writing a good cookbook index. For that kind of information, I’d suggest “A Piece of Cake? Cookbook Indexing–Basic Guidelines and Resources” by Cynthia D. Bertelsen and Recipes into Type by Joan Whitman and Dolores Simon (relevant excerpt about indexes here). The notes below are just a step-by-step system you can follow to take advantage of Word’s functions when creating a cookbook index even though it ordinarily isn’t a great program to use for indexing.

***

Specialized indexing software is invaluable if you’re indexing most nonfiction titles, but a cookbook index has a straightforward structure that Word can easily accommodate. The key is to keep the following in mind:

  • As tempting as it might be to sort as you go along—as indexing software allows you to do—don’t. You’ll have a much easier time if you alphabetize near the end.
  • The pages may not be final when you start data entry. Be prepared to adjust your locators if they move around.
  • Microsoft Word does not sort letter by letter; you may have to go through your index at the end and tweak the ordering of the entries.
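For anyone curious what letter-by-letter ordering means in practice, here is a minimal Python sketch of a sort key that ignores case, spaces, and punctuation in the heading up to the first comma. It's a simplification written for this post, not a full implementation of Chicago's alphabetizing rules, and the sample entries and page numbers are invented.

```python
import re

def squash(text):
    """Lowercase the text and drop everything except letters and digits."""
    return re.sub(r"[\W_]+", "", text.casefold())

def letter_by_letter_key(entry):
    """Sort key: heading up to the first comma, then the rest of the line."""
    heading, _, rest = entry.partition(",")
    return (squash(heading), squash(rest))

# Invented entries, chosen because letter-by-letter and word-by-word sorting disagree on them.
entries = ["New York cheesecake, 90", "newburg sauce, 41", "new potatoes, 77"]
print(sorted(entries, key=letter_by_letter_key))
```

Letter by letter, "newburg sauce" files ahead of "new potatoes" because the space is ignored; a word-by-word sort would put it last of the three.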

1. Data entry

a) Start with the first recipe. Key in the recipe title verbatim (or copy and paste from a PDF), along with the page range. If the recipe has a photo, add that page number in italics.

Type the title in as it appears if it starts with a descriptor:

Deen’s Buttered Bacon Rolls, 108–9, 109

If the title starts with a main ingredient, state the main ingredient category first, followed by a comma. Keep everything on the same line for now.

chickpeas, Chickpea, Green Onion and Quinoa Salad, 54–55

b) Copy the recipe title and locator, that is, everything after the category and its comma:

chickpeas, Chickpea, Green Onion and Quinoa Salad, 54–55

c) Paste the recipe title and locator after keying in all other main ingredients and broad categories (like “beef,” “fish,” “salads,” “sauces,” etc.) on separate lines:

green onions, Chickpea, Green Onion and Quinoa Salad, 54–55
quinoa, Chickpea, Green Onion and Quinoa Salad, 54–55
salads, Chickpea, Green Onion and Quinoa Salad, 54–55

d) If the recipe title starts with more than one descriptor, add entries for all possible inversions that readers might look up. Add a special mark like an asterisk, which indicates that this entry could be considered for cutting if space is tight:

Buttered Bacon Rolls, Deen’s, 108–9, 109*

e) Key in any subrecipe titles and page ranges, under an appropriate category if necessary. If the subrecipe title is generic, you may also have to add the full recipe title for clarity. Append a double-asterisk, indicating that this is a subrecipe:

dressing, Special Dressing, for Chickpea, Green Onion and Quinoa Salad, 54–55**

f) Index special ingredients or techniques only if they are defined/discussed in detail. If the book contains many definitions, you may want to indicate these by setting the locators in boldface. Again, append a double-asterisk:

cold smoking, 56, 56–59**

g) Repeat steps 1a to 1e for all recipes in the cookbook, proceeding in order. Apply 1f as needed, as you go along.

h) Add any logical cross-references.

beef. See also veal

i) Run a spell check on the index.

j) Save this file as index_v1.

k) Once the cookbook’s pages have been finalized, confirm locators, making any necessary adjustments. Save index_v1.
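The copy-and-paste routine in steps 1a to 1c is mechanical enough to sketch in a few lines of Python, in case you'd rather generate the lines than paste them. The function name and the marker parameter are my own inventions for illustration; deciding which categories a recipe belongs under is still the indexer's job.

```python
def entry_lines(title, locator, categories, marker=""):
    """Build one 'category, title, locator' line per category (steps 1a to 1c).

    marker is an optional trailing flag such as "*" or "**" (steps 1d to 1f).
    """
    return [f"{category}, {title}, {locator}{marker}" for category in categories]

for line in entry_lines(
    "Chickpea, Green Onion and Quinoa Salad",
    "54–55",
    ["chickpeas", "green onions", "quinoa", "salads"],
):
    print(line)
```

Each printed line has the same shape as the hand-keyed examples above and can be pasted straight into the index_v1 file.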

2. Structuring

a) Alphabetize: select all, go to Table → Sort… → Sort by paragraphs, ascending.

b) You’ll have lists like these:

beef, Chinese Five-Spice Beef Short Ribs
beef, Curried Beef and Vegetable Skewers
beef, Grilled Garam Masala Burgers
beef, Wine-Marinated Prime Rib Roast
beef. See also veal

Move the general category and any cross-references to the top, then replace the category in all other entries with a tab indent by selecting that segment of text and using Word’s Find and Replace function.

beef. See also veal
     Chinese Five-Spice Beef Short Ribs
     Curried Beef and Vegetable Skewers
     Grilled Garam Masala Burgers
     Wine-Marinated Prime Rib Roast

Go through the index and repeat this step for all categories that have two or more subentries.

c) For main ingredient categories that have only one recipe, just invert the recipe name to showcase that ingredient first:

quinoa, Chickpea, Green Onion and Quinoa Salad, 54–55

becomes

Quinoa, Green Onion and Chickpea Salad, 54–55

d) Add line spaces after the end of each section that begins with the same letter. Add group headings “A,” “B,” etc. before each section only if there is enough room.

e) Add a headnote mentioning that photos are referenced in italics and definitions in boldface.

f) Save as index_v2.
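Step 2b is the fiddliest part to do by hand, so here is a hedged Python sketch of the same grouping. It assumes every line is either "category, rest" or "category. See ...", which is how the v1 file is set up above; the function name is hypothetical and the page numbers in the sample are invented. Categories with a single recipe are left on one line so that you can invert them yourself (step 2c).

```python
from collections import defaultdict

def structure(lines, indent="\t"):
    """Group 'category, rest' lines under one heading per category (step 2b)."""
    subentries = defaultdict(list)
    cross_refs = {}
    for line in lines:
        if ". See" in line:
            # Cross-reference line, e.g. "beef. See also veal".
            cross_refs[line.split(". See", 1)[0]] = line
        else:
            category, _, rest = line.partition(", ")
            subentries[category].append(rest)
    out = []
    for category in sorted(set(subentries) | set(cross_refs), key=str.casefold):
        subs = sorted(subentries.get(category, []), key=str.casefold)
        if len(subs) == 1 and category not in cross_refs:
            # Single-recipe category: keep on one line for manual inversion.
            out.append(f"{category}, {subs[0]}")
            continue
        out.append(cross_refs.get(category, category))
        out.extend(indent + sub for sub in subs)
    return out

# Invented locators, for illustration only.
sample = [
    "beef, Grilled Garam Masala Burgers, 70",
    "beef. See also veal",
    "beef, Chinese Five-Spice Beef Short Ribs, 64",
    "quinoa, Chickpea, Green Onion and Quinoa Salad, 54–55",
]
print("\n".join(structure(sample)))
```

The plain sort used here has the same limitation as Word's, so the letter-by-letter check at the end is still needed.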

3. Cutting to spec and finalizing the index

a) Save as index_v3.

b) If the index is too long, consider first eliminating whole categories that readers are unlikely to look up or that are redundant. For example, if the book itself has a section devoted to desserts, having a dessert category in the index is not needed.

c) If the index is still too long, consider combining some categories and adding cross-references. For example, if you have divided “fish” and “shellfish,” consider combining them under “seafood” and adding cross-references to the new category under both “fish” and “shellfish.” Doing so will allow you to cut duplicates of recipes that include both fish and shellfish.

d) If the index is still too long, consider cutting all subrecipes and special ingredients/techniques, which you’d marked off earlier with double-asterisks.

(If the index needs a lot of cutting and you’re confident you will need to cut all entries marked off with double-asterisks, you can use Word’s Replace function to get rid of all of them at once. If you’re comfortable with wildcard searches, place your cursor at the top of the document, then, in the Replace dialogue box, put [!^13]@\*\*^13 in the “Find what” field and nothing in the “Replace with” field. Make sure “Use wildcards” is checked. Clicking “Replace all” should get rid of any lines that end with a double-asterisk.)

e) If the index is still too long, evaluate for cutting or abridging only those entries that have an asterisk. (Never cut out or modify an entry that matches the recipe title exactly.) If it makes sense to cut the whole entry, do it. You could also cut part of the title if it refers to sauces and garnishes that aren’t a fundamental part of the dish.

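f) Delete all the asterisks. (Using the Replace function, put * in the “Find what” field and nothing in the “Replace with” field. Make sure “Use wildcards” is unchecked first; with wildcards on, * matches any string of characters, so you would need to search for \* instead.)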

g) Edit the index as outlined in Chicago 16.133, in particular double-checking alphabetization, then save index_v3 and submit it.
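If you keep a plain-text copy of the index, the two clean-up passes in steps 3d and 3f can be sketched in Python as follows. The regular expression is my approximation of the Word wildcard search described above, not an exact translation, and the function names are hypothetical.

```python
import re

def cut_marked(text):
    """Drop every line that ends with a double asterisk (step 3d)."""
    return re.sub(r"^[^\n]*\*\*(?:\n|\Z)", "", text, flags=re.MULTILINE)

def strip_asterisks(text):
    """Remove any remaining literal asterisks (step 3f)."""
    return text.replace("*", "")

draft = (
    "cold smoking, 56, 56–59**\n"
    "Buttered Bacon Rolls, Deen’s, 108–9, 109*\n"
    "beef. See also veal\n"
)
print(strip_asterisks(cut_marked(draft)))
```

Run the cut first: once the asterisks are stripped, there is no way to tell which entries were flagged.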

Versioning system

  • Index_v1: This version makes it easier to update locators if pages—especially if spreads or larger groups of pages—are moved around.
  • Index_v2: Go back to this version if the publisher decides to add pages to allow more room for the index.
  • Index_v3: Your final submitted index.

Recent innovations in indexing software

ISC conference attendees were treated to a tour of four of the most popular indexing programs.

TExtract

Harry Bego, developer of TExtract, came from the Netherlands to give us a presentation and demo of his “semi-automatic” indexing software. Having been a researcher in natural language processing at Tilburg University, Bego incorporated linguistic and statistical analysis algorithms into TExtract; these identify important terms and compile them all into an initial draft index, taking out a lot of the grunt work of data entry. Bego was quick to emphasize that the user is always completely in control. Although TExtract puts together the initial index automatically, the indexer can review each entry and choose whether to accept or discard it. For each entry, the program shows its frequency and “significance score.” Users can adjust the significance threshold of a text to control what kinds of terms are picked out in the initial index, and they can add filters to determine which terms the program should exclude or include.
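To give a feel for the threshold-and-filter idea, here is a toy Python sketch. To be clear, this is not TExtract's algorithm, which wasn't described in that level of detail: raw word frequency stands in as a crude proxy for a significance score, and the function name, sample text, and exclusion list are all invented.

```python
import re
from collections import Counter

def candidate_terms(text, threshold=2, exclude=frozenset()):
    """Return (term, frequency) pairs at or above threshold, most frequent first."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(word for word in words if word not in exclude)
    return [(term, n) for term, n in counts.most_common() if n >= threshold]

# Invented sample text.
text = "Cold smoking cures fish. Smoking fish takes time; cold smoking takes longer."
for term, n in candidate_terms(text, threshold=2, exclude={"takes"}):
    print(term, n)
```

Raising the threshold shortens the candidate list and lowering it lengthens the list. In either case the result is only a starting draft for the indexer to accept or discard entry by entry.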

TExtract also has a “document replacement” feature that allows the indexer to compare a new version of the text with an old one and update the index accordingly. The entries are linked to the text—a feature that supports in-context navigation and editing.

Although TExtract is in itself a complete indexing program, Bego told us that TExtract outputs can then be fed into any of the other major indexing programs (such as the ones below), if an indexer is more comfortable editing on different software.

SKY Index

SKY Index’s developer, Kamm Schreiner, was unable to join us in person, but he sent a video that showed off some of this program’s features.

SKY has a spreadsheet-like interface and allows you to do data entry on the right-hand pane while a preview pane on the left shows you the index as it’s being built. Schreiner has built in several functions that in other programs might require a macro: SKY Index can easily consume subheadings, swap acronyms, and so on. The program also flags common errors for the indexer (e.g., adding a locator to an entry that already has a See cross-reference).
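The error check mentioned above (a locator on an entry that already has a See cross-reference) is easy to picture as code. This Python sketch is only my illustration of that kind of rule, using a made-up record format of (heading, reference) pairs; it says nothing about how SKY Index implements it.

```python
def flag_see_with_locators(records):
    """Return headings that have both a 'See' cross-reference and page locators."""
    has_see, has_locator = set(), set()
    for heading, reference in records:
        if reference.startswith("See also "):
            continue  # "See also" may legitimately sit alongside locators
        if reference.startswith("See "):
            has_see.add(heading)
        else:
            has_locator.add(heading)
    return sorted(has_see & has_locator)

# Invented records.
records = [
    ("calves", "See veal"),
    ("calves", "112"),
    ("beef", "See also veal"),
    ("beef", "64"),
    ("veal", "110–15"),
]
print(flag_see_with_locators(records))
```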

SKY Index has an edit view, which allows for quick and efficient editing. You can open up a browse pane that allows you to see and compare two separate sections of the index side by side. The browse pane can be used in data entry view as well, and the program bookmarks where you left off before opening the browse pane.

More information and videos are available on the SKY Software website.

CINDEX

Frances Lennie, a freelance indexer since 1977, established Indexing Research in 1986 to develop CINDEX, which is available on both Windows and Mac. The company also features a publishers’ edition of CINDEX, available only on Windows, which accommodates multi-user production environments, such as legal publishing houses or government publishing houses.

CINDEX uses an index card metaphor and allows up to 15 levels of subheadings. The indexer enters data in the record entry window while the index builds in the background. All records are date and time stamped.

CINDEX is fully Unicode compliant and supports different sorting conventions and spell checking in a variety of languages. Lennie gave us a demonstration of an index created in Hebrew, which reads right to left and so has entries in inverted order (locator on the left, entry on the right).

Lennie showed how CINDEX supports searching specific entries based on a character string, page range, or style attributes. Editing is also easy: CINDEX offers both global and individual editing options, and it also uses a variety of techniques to check the integrity of the index. Finally, the index may be exported in many different formats and file types.

Macrex

Gale Rhoades, the North American publisher of Macrex, gave us a demo of Macrex’s just-released ninth version. Macrex is a Windows-based program that was first developed in 1981. It boasts a seemingly overwhelming list of features, but the secret to using it, explained Rhoades, is to focus only on your current project and not worry about what may seem like overhead. She has worked with many indexers over the years, she explained, and she has always managed to help them configure Macrex to do what they need it to do.

Macrex is big on giving the user control. Entries are written directly in the index; different components (e.g., cross-references, locators, etc.) are colour coded, and you can change the colour palette to suit your tastes. Further, you can create a folder with a particular client’s specifications (e.g., sort order, layout, cross-reference format, output format, etc.) and use it essentially as a template for all of the work you do for that client.

Rhoades emphasized the client support that you get with Macrex. She hosts a weekly chat session for North American Macrex users to talk about indexing and software issues; she can also connect directly to your computer to troubleshoot Macrex problems.

Louise Spiteri—User-generated metadata: boon or bust for indexing and controlled vocabularies? (ISC conference 2013)

Louise Spiteri is the director of the School of Information Management at Dalhousie University, and she spoke at the ISC conference about social tagging and folksonomies. As a trained cataloguer, Spiteri said to us, “I’m a firm believer in controlled vocabularies, but we have to accept the fact that that’s not what our clients use.” She added, “User-generated metadata is here. Let’s accept it and learn to work with it rather than against it.”

Traditionally, a document’s metadata has been the purview of cataloguers, information architects, and professional indexers. Users could search for an item based on its existing classification, but they couldn’t amend that item’s categorization and organization based on their own needs and understanding.

In recent years, however, many blog and social media platforms have made it possible for users to store and categorize items—blog posts, photos, music, articles, and so on—based on their interests. They can organize these items by adding their own keywords, and in many cases they can add further metadata in the form of ratings or reviews.

Users typically add keywords using tags, which are non-hierarchical. A social dimension to user tagging was popularized by such sites as Delicious, CiteULike, and Flickr, on which users could not only tag information but also share those tags with a wider community. The collective tagging effort of such a community produces a folksonomy (a portmanteau of “folk” and “taxonomy”): the set of terms that a group of users has used to tag content. Although such a set is open and uncontrolled, some sites offer tag recommendations based on what others have assigned, allowing for the potential for consensus.
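Tag recommendation based on what others have assigned can be sketched very simply: count the tags already applied to an item and suggest the most common ones. The Python below is a minimal illustration with invented users, items, and tags; real platforms presumably do something more sophisticated.

```python
from collections import Counter

def recommend_tags(taggings, item, limit=3):
    """Suggest the tags most often applied to an item by other users."""
    counts = Counter(
        tag.strip().lower() for _user, tagged_item, tag in taggings if tagged_item == item
    )
    return [tag for tag, _count in counts.most_common(limit)]

# Invented (user, item, tag) records.
taggings = [
    ("ana", "photo1", "Port"),
    ("ben", "photo1", "port"),
    ("cy", "photo1", "harbour"),
    ("ana", "photo2", "wine"),
]
print(recommend_tags(taggings, "photo1"))
```

Lowercasing the tags merges trivial variants like "Port" and "port," but it does nothing for true synonyms or ambiguity.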

User tagging has its limitations, of course—from ambiguity and polysemy (does the tag “port” refer to wine or a computer port or the left side of a ship?) to synonymy (especially in cases of spelling variants and singular versus plural nouns) to variations in the level of their specificity—but it can also be enormously powerful. In some communities, for example, dedicated users—avid fans who are intimately familiar with the content—can generate a set of tags that are more useful and informative than classifications offered by the vendor or a cataloguer, who is more likely to do the minimum level of cataloguing. Social tagging’s major strength is that terms can be individualized to users’ own needs. Further, folksonomies can adapt quickly to changes in user vocabulary, accommodating new terms with virtually no cost to the user or the system. Over time, particularly if the platform supports recommendations for tags, an item’s tags will tend to stabilize into an organically curated set.

Spiteri also briefly discussed newer forms of social tagging, including hashtags and geotags. Hashtags, common on Twitter, Tumblr, Instagram, and now Facebook, allow users to quickly follow a stream of content about a particular topic. However, they suffer the same problems as uncontrolled vocabularies; Spiteri strongly advocated promoting an official hashtag for a public event so that everyone uses the same one and the conversation isn’t split among multiple streams. Geotags, by contrast, add geographic metadata to information—allowing users to follow location-based news or identify the place a photo was taken, for example—and because they are often given in numerical format, such as latitude, longitude, and altitude, they are likely to be more consistent.

Social tagging, emphasized Spiteri, isn’t going away. How do we indexers work with it? Ideally, we would have a system that combines both controlled vocabularies and tags. On many blogs, for example, you can assign a post to one or more categories, which can be tightly controlled. User tags can then supplement or complement these categories, serving special user-focused functions. For instance, in multi-cultural communities, users can tag an item in their own language. Tags can also connect like-minded users, a function that controlled vocabularies don’t readily support. Most importantly, indexers can learn from user tags, adapting their subject headings to the language of their clients.

Caroline Diepeveen—Team indexing: The way forward? (ISC conference 2013)

Caroline Diepeveen led a small team that indexed the five-volume Encyclopedia of Jews in the Islamic World (EJIW), published by Brill. Her efforts, along with those of her co-indexers, Pierke Bosschieter and Jacqueline Belder, won the team the Society of Indexers’ Wheatley Medal in 2011. Most gratifying for Diepeveen was the jury’s remark that they couldn’t tell that this index had been composed as a team.

Indexers are used to working in isolation, Diepeveen said, and some seem averse to the idea of working in a team. But her own experience with EJIW was positive, and in a small survey she conducted about team indexing, with eleven indexers responding, she found that 73% had had good experiences, while 27% said that their experience was okay; nobody had found team indexing particularly negative. The respondents had mostly worked in groups of two or three and used such strategies as constant discussion and a controlled vocabulary to achieve consistency in their work. Many teams had one main indexer who was responsible for putting the team together and ensuring the quality of the final product.

In Diepeveen’s case, team indexing became a necessity because of EJIW’s project deadlines. She had initially signed on as the encyclopedia’s sole indexer. In theory, the encyclopedia would be built one article at a time; the editors expected a steady flow of articles from the authors, and Diepeveen could index at her leisure. In reality, the bulk of the articles came at the end, and the options for the publisher were to extend the deadline or to bring in more indexers.

Fortunately, the encyclopedia itself was compiled using a sophisticated content management system (CMS) with a fine-tuned workflow. Team members were allowed access to only the parts of the CMS that they needed. Authors from all over the world contributed articles directly into the CMS; the articles were then edited by a team of editors and finally released for indexing. With the CMS, articles could easily be assigned to one indexer and then reassigned as needed; there was no need to mail files around. (Brill had attempted to develop a software module that allowed embedded indexing directly in the CMS, but the first version of the indexing module didn’t allow basic indexing features, such as selecting a range, and so was deemed unacceptable. In the end, the index was not fully embedded and instead was compiled using anchors in the text as locators.)

Serving as team captain, Diepeveen not only put together the indexing team but also oversaw her team’s work. She had already done some of the indexing before she brought on the other indexers, so the other team members could use her work as a reference. Helpfully, the articles in the CMS showed all indexed terms highlighted in green, and Diepeveen could easily see whether her teammates were over- or under-indexing and provide feedback as needed. She emphasized the importance of regularly communicating with team members to build trust and a strong working relationship. Geographically separated team members may not be able to meet in person, but teleconferencing and web conferencing go a long way in clarifying roles and tasks, not to mention allowing team members a chance to get to know one another.

To keep the process running smoothly, the team had to lay some groundwork:

  • Diepeveen did a thorough edit of the index near the start of the project so that all team members would have a basic structure to work towards.
  • The team disallowed double postings; cross-references could be converted to double postings at the very end if needed.
  • The team stipulated that all entries must have a subheading. When you see only one part of a publication, you don’t know how much weight or detail is given to a particular subject in another part of the publication. Again, unnecessary subheadings could be edited out at the very end if needed.

Most importantly, Diepeveen said, the team “kept asking questions. EJIW worked almost like peer review on the go. We asked each other, ‘Why did you decide to do things this way?’ We kept each other sharp by asking questions. That improved the quality of the index.”

As larger and larger electronic publications become the norm, Diepeveen said, team indexing will probably become more common. Emerging technological tools may help with the logistics, but the most important aspect of team indexing, she reiterated, was the team itself. It is critical to invest in trust, not only at the beginning of the project but also regularly throughout.

Pilar Wyman—Metadata, marketing, and more (ISC conference 2013)

Pilar Wyman (@pilarw on Twitter) is the immediate past president of the American Society for Indexing, as well as a member of the ASI’s Digital Trends Task Force, and she spoke at the ISC conference about promoting indexes as metadata and showing our clients how our indexes can be used as sales tools for their books.

We’re used to thinking of a book’s metadata as information about the book as a product (its title, author, ISBN, etc.), but a book’s index can also serve as metadata: each index heading and subheading can be thought of as a tag for a chunk of text that we want readers to see. As a result, readers can use this metadata to get a filtered view of the content that reveals specific facets or dimensions of a book.

Indexes, Wyman argues, are as important for ebooks as a search function. They

  • add browsability and help readers find what they need by expanding the number of access points to content
  • serve as a navigational tool
  • offer pre-analysis: indexes give readers a good sense of the range of topics covered and the importance of each
  • provide a conversation with the reader, allowing publishers to show what their product has to offer

Wyman advocates giving away a book’s index for free (as Amazon essentially does with its Look Inside feature) as a marketing strategy, to let readers know what they could be getting. She also showed us the potential of index mashups, in which you combine the indexes of several publications in a collection, allowing users to browse or search across all of them. These mashups could be enormously useful for “scrapbook files” (collections of content from a variety of sources, as you’d find in a university course pack, for example). Each heading in the mashed-up index is a link, taking you either directly to the content or to a summary screen of available information, with context. Most importantly for publishers, these indexes would offer users a direct link to purchase any of the books included in the mashup.
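
For readers curious about what a mashup involves at the data level, here is a minimal Python sketch. It assumes each book's index is available as a simple mapping from heading to a list of locators; the function name mash_up, the book titles, and the locator IDs are all invented for illustration and do not come from Wyman's talk or from any real tool.

```python
from collections import defaultdict


def mash_up(indexes):
    """Merge several per-book indexes into one combined index.

    `indexes` maps a book title to that book's index, which maps each
    heading to a list of locators (hypothetical anchor IDs).
    """
    combined = defaultdict(list)
    for book, index in indexes.items():
        for heading, locators in index.items():
            for locator in locators:
                # Keep the book title with each locator so a link can
                # point to the right publication (or its purchase page).
                combined[heading].append((book, locator))
    # Present headings alphabetically, ignoring case.
    return dict(sorted(combined.items(), key=lambda item: item[0].lower()))


# Invented sample data for two books in a collection.
indexes = {
    "Book A": {"treaties": ["ch2-p14"], "land claims": ["ch5-p03"]},
    "Book B": {"treaties": ["ch1-p07", "ch4-p22"]},
}

mashup = mash_up(indexes)
for heading, hits in mashup.items():
    print(heading, hits)
```

A real mashup would also need to reconcile headings that differ between books (synonyms, spelling variants, differing subheading structures), which is editorial work this sketch does not attempt.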

To exploit this marketing potential of ebook indexes, whether they are standalone indexes or mashups, publishers should link them—both in to the content and out to further resources or places to buy the book. These linked indexes should be included as back-of-the-file chapters or, better yet, in the front of an ebook so that the index gets searched first. For usability, the index should be accessible wherever you are in the book (just as you can flip to the back of a print book anytime you want), and the “find” tool should bring up the best hits, as identified by the index. Results should show snippets of a term in context, and cross-references should help the reader refine their search terms.

Generic cross-references can often present a dilemma for the indexer (e.g., does “See specific battles” really give readers the information they need?), but Wyman’s vision for the EPUB index eliminates this problem: “specific battles” would link to a list of those battles, which would in turn link to the corresponding headings in the index. She also adds that smart use of tagging would allow you to filter based not only on concept but also on type of content. For example, many of us already indicate definitions with boldface, images with italics, etc. This “decoration metadata,” as Wyman calls it, can be another layer of information that users can use to narrow their search down to what they need. Wyman also introduced the concept of a reverse index: users can highlight a section of text and discover what terms in the index are associated with it, allowing them to easily jump to other places in the text that discuss the same topic.
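
The reverse index idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: locators are taken to be paragraph numbers, a highlighted selection is a range of paragraph numbers, and the function names and sample headings are invented rather than drawn from Wyman's presentation or any e-reader's actual behaviour.

```python
def reverse_lookup(index, start, end):
    """Return the headings with at least one locator inside the selection."""
    return sorted(
        heading
        for heading, locators in index.items()
        if any(start <= loc <= end for loc in locators)
    )


def elsewhere(index, start, end):
    """For each heading found in the selection, list its other locators."""
    return {
        heading: [loc for loc in index[heading] if not start <= loc <= end]
        for heading in reverse_lookup(index, start, end)
    }


# Invented sample index: heading -> paragraph numbers.
index = {
    "Gettysburg, Battle of": [12, 48],
    "Lincoln, Abraham": [13, 90],
    "railways": [70],
}

# The reader highlights paragraphs 10 through 15.
found = reverse_lookup(index, 10, 15)
jumps = elsewhere(index, 10, 15)
print(found)
print(jumps)
```

The second function gives the "jump to other places" behaviour: once the headings attached to the selection are known, their remaining locators are the other passages on the same topic.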

As indexers, Wyman said, we’re already skilled at figuring out aboutness and can easily apply those skills, especially if we’re already familiar with embedded indexing, to semantic tagging of text. If we can persuade our clients of the value of using our indexes as a sales tool, we can further leverage our expertise.

***

(My take: I think the idea of index mashups is brilliant. My colleagues who work in academic publishing spend huge amounts of time compiling different catalogues for different subject areas and markets. Offering one index mashup of all of their Aboriginal studies titles and another for their women’s studies titles, for example, could allow them to show the breadth and depth of their list to particular target markets, including academics considering course adoptions and subject-specific libraries.)