Ten principles for creating better indexes—Margie Towery (ISC conference 2014)

Margie Towery, two-time winner of the American Society for Indexing’s award for excellence in indexing, not to mention an indexer of The Chicago Manual of Style, treated Indexing Society of Canada conference-goers to a three-hour seminar covering ten principles for creating better indexes.

An index, said Towery, should

  • help readers find specific information faster
  • have an easy-to-use structure
  • reflect the text
  • provide multiple entry points.

Indexing is both an art and a science; when we choose what to index, we rely on reason, experience, and intuition. To create better indexes,

1. Consider your audience

Who are the readers, what are their expectations, and what terminology might they look for? Towery suggests getting familiar with terminology by referring to

  • subject dictionaries
  • similar books
  • the author’s previous works
  • the author’s website
  • the press website
  • online searches on the topic

Also refer to the book’s table of contents and introduction, as well as the author’s concept list, if it’s offered. Some indexers refuse to work with an author’s concept list, a position Towery doesn’t understand. To her, it’s easier just to include those terms in the index and keep the author happy.

Add cross-references as soon as you see the opportunity—for example, if the author mentions using two terms interchangeably.

For a long project, Towery encourages immersing yourself completely in the subject, using other books, movies, music, and art.

2. Consider the metatopic

How to treat the metatopic can be controversial among indexers. Towery believes that the metatopic main heading is a keystone to the index and can be a teaching tool, as new users may naturally want to start looking in the metatopic.

A good approach may be to begin with a table of contents structure that points readers to the main headings that will let them find what they need. You may also find mind mapping helpful; there are apps that can turn a mind-map into an outline for you.

Refer to the “indexing” heading in Hans H. Wellisch’s Indexing from A to Z for an example of a metatopic done well.

3. Aim for accuracy

  • Ensure the terms accurately reflect the text (more on that later) and are spelled correctly.
  • If you’re providing dates for events or years for legal cases, make sure those are accurate.
  • Disambiguate similar names—e.g., Smith, John (dentist); Smith, John (doctor).
  • Don’t allow subheadings to make adjectives of main headings.
  • Cross-references have to be accurate and consistent, in spelling, capitalization, and punctuation.
  • Triple-check accuracy of double-postings.
  • Make sure your page locators are accurate—not only that they point to the right pages but also that they are of the right type. Should 2, 3 be 2–3? Are the locators correctly formatted according to press style?
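
The "Should 2, 3 be 2–3?" check can be partly mechanized. The sketch below is only an illustration, not something Towery presented: the function name is made up, and it merely lists runs of adjacent pages for review. Whether a run should become a range still depends on whether the discussion is continuous, which only the indexer can judge.

```python
def adjacent_runs(pages):
    """Return (start, end) pairs for runs of consecutive page numbers."""
    runs = []
    for page in sorted(set(pages)):
        if runs and page == runs[-1][1] + 1:
            runs[-1][1] = page          # extend the current run
        else:
            runs.append([page, page])   # start a new run
    # Keep only runs that span more than one page.
    return [(start, end) for start, end in runs if end > start]


# Locators 2, 3 and 9, 10, 11 are flagged as candidates for ranges.
print(adjacent_runs([2, 3, 7, 9, 10, 11]))
```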

4. Aim for comprehensiveness

“In best-case scenarios,” said Towery, “every index would be comprehensive—that is, it includes all substantive information and provides multiple ways of finding the information.” In reality, however, we face space, time, and wage limitations, so the key is to achieve balance. Indexers have to consider the many ways a user might “name” and search for something.

5. Aim for conciseness

Encapsulate meaning in as few words as possible: avoid using a fifteen-letter word when a five-letter one means the same thing. Balance jargon with everyday language. (Clarity, reflexivity, accuracy, and audience issues are equally important, though.)

In some cases, conciseness may trump specificity. For example, if the heading “railroad development” has only two locators and “railroads” only one, you might want to combine them under one heading.

6. Aim for consistency

Topics of equal weight in the text should be treated similarly in the index, both in depth and specificity. Do they have similar numbers and types of locators and subheadings?

Also ensure that you have consistency in

  • cross-references (Are cross-references from similar entries—e.g., initialisms—treated the same way?)
  • formatting details (e.g., does the text use the serial comma? Where are you placing your cross-references?)
  • structure (i.e., headings should be parallel)

Finally, said Towery, don’t be afraid to be consistently inconsistent. In some situations, you have to bend rules for headings of a certain type. Just ensure that all of the headings in that category are treated the same way, and you won’t confuse your readers.

7. Aim for clarity

“The relationship between the main and the subheading must be instantly obvious,” said Towery. “Indexers shouldn’t have to figure out what’s meant.” That’s why function words like as, of, by, and so on, are so critical. “They should not be used willy-nilly,” said Towery, “but to clarify the main–subheading relationship.” Towery also cautioned that a phrase like “influence of” can be ambiguous. Be clear by specifying “influence on” or “influence by.”

Keep in mind that “words reverberate neurolinguistically differently in different people,” said Towery. Choose between terms like terrorists and revolutionaries carefully, keeping in mind that the headings should reflect the text but also help readers find what they need.

Names may benefit from glosses to clarify who they are or what their relationship is to key players in the text.

Towery also challenges us to “love the alphabet: use the alphabet not only to keep the most important word in front but also to keep logic in the subheadings whenever possible.” Fortunately, birth comes before death in the alphabet, but illness comes after death. You might need to get creative with your wording or force sort for chronology.
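
To make the force-sort idea concrete, here is a minimal sketch. The hidden sort keys are hypothetical; dedicated indexing software has its own way of forcing a sort, which this does not try to reproduce.

```python
subheadings = ["birth", "death", "illness"]

# Plain alphabetical order puts illness after death.
alphabetical = sorted(subheadings)

# Hypothetical hidden sort keys that force chronological order instead.
forced_keys = {"birth": 1, "illness": 2, "death": 3}
chronological = sorted(subheadings, key=forced_keys.get)

print(alphabetical)
print(chronological)
```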

8. Aim for readability

Check out Susan Olason’s “Let’s get usable! Usability studies for indexes” article in The Indexer. The article notes that commas can be confusing, especially for reversals, and that table of contents–styled entries and indented formatting are more user friendly. Interestingly, Caroline Diepeveen mentioned research in the Netherlands that showed run-in subheadings are easier to read. Towery suspects that may be because run-in subheadings force indexers to be clearer and more succinct. Indented styles work better with technical texts; run-in may work better for narratives.

Towery noted the need to chunk information to accommodate the fact that we can keep only a few bits of information in short-term memory at a time.

Don’t use old indexing devices like ff. and passim., which most readers don’t understand.

Tips for readability overlap considerably with the other principles for creating better indexes:

  • Use a visible metatopic structure.
  • Include parallel structure where appropriate.
  • Be consistent in the treatment of topics.
  • Be sure the relationships between main and subheadings are unambiguous.
  • Place the most important word first in the subheadings whenever possible.
  • Avoid jargon, especially in subheadings.
  • Use the alphabet for logical progressions.
  • Clarify subheading meanings with function words.
  • Consider clumping and gathering similar subheadings.
  • Sort out long entries into more approachable chunks.
  • Avoid inversions whenever possible.

Although Olason found that indented styles were more readable, they also have their own problems. If a series of subheadings start with the same function word (e.g., on), they can create what Towery calls “gridlock” or eyeball barriers. Reword the subheading if needed to prevent this problem.

Towery also cautions that our training and work have biased our thought process. We may not be objective about what makes an index usable. If you can, show your index to someone else, a reader who can provide more unbiased feedback.

9. Aim for reflexivity

“An index should reflect the text from which it comes,” said Towery. “The index internalizes the text. But it’s not simply a regurgitation of the text in alphabetical order.” Indexers digest books to create an approachable alphabetical and structured index from the text.

Reflexivity also applies to the tone that characterizes the text. However, Towery says that an index doesn’t need to carry forward the author’s biases. Use neutral headings and subheadings to point to the text, where the author’s voice and opinions can take over.

10. Use common sense

Use natural, everyday language whenever possible. Make sure that the index makes sense to all of its possible audiences and that it’s usable by a variety of people. Again, sometimes you need to break the “rules”—with experience, you’ll get a better sense of when and how you should overrule standard practices to make a better index.

Read other indexes and critically evaluate how well they work, why, and what could be done differently. To evaluate what makes a good index, use the American Society for Indexing’s checklist for its excellence in indexing award.

***

Beyond these ten principles, Towery emphasizes the need for

  • cross-training: “Keep indexing skills fresh by learning other related skills,” she said. She finds that trying to summarize a book in a haiku helps her distill the text into its essence and achieve the precision and conciseness that an index needs.
  • napping: It’s scientifically proven to increase alertness, boost creativity and the ability to see connections, strengthen memory, clarify decision making, and improve productivity. Towery suggests reading Sara Mednick’s Take a Nap!

For an eye opener, Towery recommends reading What Is an Index?, a book Henry Wheatley wrote back in 1878 that discussed a lot of the same issues we are dealing with today.

Food for thought: the expanding universe of cookbook indexing—Gillian Watts (ISC conference 2014)

Gillian Watts, a past president of the Indexing Society of Canada, is an avid cook who’s always been drawn to cookbook indexing. Frustrated by not being able to find what she needed in a Time-Life series of cookbooks she owned called Foods of the World, Watts began cataloguing the recipes and ingredients in the series using index cards. She has since indexed about 140 cookbooks on a variety of topics, from breadmaking to gluten-free recipes to Indian cuisine.

Why index cookbooks?

There’s a big market for cookbooks today, particularly those focusing on healthy foods or cuisine from other countries, as well as those written by celebrity chefs.

Cookbooks are also comparatively easy, if you already know how to index. They’re “not a strain on intellectual faculties,” said Watts, and you can make “quick bucks, though not necessarily big bucks.”

What’s more, cookbooks are fun: every book has a different challenge, a “different world of sensory delights,” although, warned Watts, they “can lead to frequent snacking.”

Indexing approach

As with any index, know your client’s preferences before you begin, although sometimes the publisher doesn’t know what they want. In cookbooks there seems to be a preference for letter-by-letter sorting, and generally you need only one level of subhead. “Only once did I have to go to two levels,” Watts said.
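
For anyone unsure how letter-by-letter sorting differs from word-by-word, the sketch below compares the two on a few invented headings. Both key functions are simplified assumptions: real style guides add rules about commas, parentheses, and leading articles that are ignored here.

```python
import re


def letter_by_letter_key(heading):
    # Ignore spaces and punctuation; compare the letters as one string.
    return re.sub(r"[^a-z0-9]", "", heading.lower())


def word_by_word_key(heading):
    # Compare one word at a time, so a shorter first word sorts first.
    return re.findall(r"[a-z0-9]+", heading.lower())


headings = ["ice cream", "iceberg lettuce", "iced tea", "ice pops"]

print(sorted(headings, key=letter_by_letter_key))
print(sorted(headings, key=word_by_word_key))
```

Letter by letter, "iceberg lettuce" files ahead of "ice cream"; word by word, both "ice" headings come first.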

Some publishers ask indexers to use special formatting, such as italics or bold, for main entries, particular techniques, or images.

“As a matter of practice,” said Watts, “I over-index. It’s easier to cut stuff out later rather than add it back in.” Watts keeps the main headings lowercase singular, to take advantage of her indexing software’s autocomplete function.

Bear in mind that the cookbook author had a reason for giving the recipes the titles they have, so try to preserve the original syntax when indexing. Also, Watts will index any ingredient in a recipe name, even if very little of it is used.

Knowing how to cook is a huge asset to a cookbook indexer; it’s important to understand the flavour profile of ingredients. An experienced cook, for example, would recognize that 1/4 cup of cilantro has more flavour than 1/4 cup of parsley—and that it would have more influence in 2 cups of sauce than in an 8-serving stew.

Cross-references are also important: often fresh and dried ingredients are used very differently.

Watts keeps a “staples list” that sets the threshold at which certain ingredients (e.g., beer, breadcrumbs, butter, carrots) make it into the index, but, she emphasized, you need to be flexible. In books for parents or for people with health problems, foods normally considered staples (e.g., flour) may become important to know about—and hence important to index.

For common cookbook terms, Watts has added a series of abbreviations to her software that autocorrect to the longer word—e.g., ch will render as chocolate. This trick saves her keystrokes and is especially useful for terms with accented characters.
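
The abbreviation trick boils down to a lookup table. This sketch shows the idea outside any particular indexing program. Only the "ch" shortcut comes from Watts's talk; the "cf" shortcut and the function name are invented for illustration.

```python
# "ch" is the example from the talk; "cf" is a made-up shortcut
# showing how accented terms could be handled.
ABBREVIATIONS = {"ch": "chocolate", "cf": "crème fraîche"}


def expand(entry):
    """Replace each whole-word abbreviation with its full term."""
    return " ".join(ABBREVIATIONS.get(word, word) for word in entry.split())


print(expand("ch cake"))
print(expand("cf dip"))
```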

The metatopic can be tricky for books that focus on a particular ingredient. For a book about quinoa that Watts worked on, where every recipe included quinoa, she indexed special forms of quinoa, such as “quinoa flour” and “quinoa flakes,” and implied that anything not listed simply used quinoa.

In cookbooks that have a health component as well as recipes, the index entries sometimes make “awkward bedfellows.” You may end up with “unappealing juxtapositions of symptoms and recipe items” and may need to get creative with wording. In one project she recommended using two separate indexes in order not to ruin the reader’s appetite.

Editing and trimming

Once you’re done data entry, edit the index, eliminating all one-entry headings. Check all cross-references.

The number of entries isn’t the same as the number of lines; some recipes have long, descriptive titles. The number of entries should be about 85 per cent of the lines available.
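
As a quick worked example of the 85 per cent guideline (the function name and the line counts are made up):

```python
def target_entry_count(lines_available, percent=85):
    """Rough entry budget: about 85 per cent of the lines available."""
    return lines_available * percent // 100


# With 400 lines available, aim for roughly 340 entries.
print(target_entry_count(400))
```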

If space is at a premium, get rid of entries beginning with cooking techniques; people look up food, not techniques. Staple products and flavourings are also good candidates for cuts. “Sometimes you have to cut your pet entries,” said Watts, and “it’s important not to clutter the index with trivialities, even if they sound yummy.”

You may also want to group similar ingredients, such as berries, nuts, seafood, and so on, for space. “Sometimes I cheat and use the flavour profile rather than the actual food,” said Watts. For example, the entry “apple” would include applesauce, apple juice—basically anything that tastes like apple.

References

If the universe of cookbook indexing appeals to you, Watts recommends the following resources:

Watts also suggests looking at indexes in your own cookbooks. Which are useful? Which are irritating? And what makes them so?

***

(Related: See my post about cookbook indexing using Microsoft Word.)

Whither the ebook index?—Erin Mallory (ISC conference 2014)

Erin Mallory is the manager of cross-media at House of Anansi Press, which has been publishing ebooks (in addition to its print books) since 2009. Mallory launched the Indexing Society of Canada’s 2014 conference with an overview of the current state of ebook indexing workflows.

Ebook formats

Ebooks come in three main formats:

  • PDFs support some multimedia and interactivity and are easy to create but have limited sales channels. The static format of PDFs makes them popular for technical or reference books but may create a poor reading experience on certain devices (for example, a smartphone).
  • EPUB is the most popular ebook format and is essentially a self-contained website, using XML and CSS. Text is reflowable. EPUB is a neutral, standard format compatible with all current e-readers except the Kindle. EPUB 2, still the most commonly used version, is based on HTML 4 and CSS 2. EPUB 3 is a newer format, with many improvements in functionality, accommodating languages that read vertically or from right to left, as well as MathML.
  • MOBI is also based on XML and CSS but is proprietary to Amazon and is compatible only with Kindle devices and apps.

The main reading engines are:

  • Adobe Reader Mobile SDK, which renders ebooks on Adobe Digital Editions, Kobo, and Nook.
  • WebKit, which renders ebooks on most mobile e-readers, including the iPad, and browser-based e-readers.

Publisher’s considerations

Ebook indexes are really only useful if they are fully hyperlinked. Until recently, hand coding each hyperlink was the only way to create a fully functional ebook index, so publishers had to consider the return on investment. Not only is creating an ebook index time consuming, but proofing the index adds time to the quality-assurance process.

Further, the publisher has to consider what devices its audience is using. First-generation Kindles and Kobos don’t support hyperlinking, and not all e-readers support a “back to” function.

Because of these limitations, Anansi decided when it launched its ebook program in 2009 not to include indexes in ebooks at all. Today, the publisher has adopted a workflow that has streamlined some aspects of ebook index creation.

Recent improvements

Scripts for Adobe Creative Suite 5+ can be very useful; some auto-generate cross-references in a formatted index that are maintained when exported to EPUB. The scripts aren’t perfect, so some (about half) of the links still need to be hand coded. These scripts use styles, so if a designer hasn’t properly styled the index, they won’t work properly.

There are also scripts that convert an external index (for example, one created in Word or a program like Cindex) into an InDesign index that is maintained on PDF export.

The Creative Cloud version of InDesign allows for linked indexes to be exported into EPUB. Publishers can be reluctant to relinquish control of their InDesign files to an indexer, but Mallory acknowledges that if professional indexers can save time by embedding the index, publishers may have to push aside their reluctance and find ways of working with them.

Project considerations

For each project, ask yourself the following:

  • Does your ebook need an index?
  • Does the index have to match the print book?
  • What devices will your readers use?
  • Can the index be adapted to better serve the digital reading experience?
  • Can you change your indexing workflow to simplify the ebook index creation process?
  • What kind of markers do you want to use?

Mallory points out that in an ebook, using page numbers may not make the most sense. Some indexers in the audience remarked that seeing a page range communicates important information about subject coverage. InDesign indexes can allow the range to be listed but link only the first page number.
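
To picture what "list the range but link only the first page" might look like in an EPUB's HTML, here is a rough sketch. It is not InDesign's actual output: the file name, the anchor scheme, and the function names are all assumptions made for illustration.

```python
from html import escape


def href_for_page(page):
    # Hypothetical anchor scheme: one id per print page in a single content file.
    return f"text.xhtml#page{page}"


def linked_entry(heading, locators, href_for_page):
    """Render one index entry as HTML, linking only the first page of each range."""
    parts = []
    for start, end in locators:
        link = f'<a href="{escape(href_for_page(start), quote=True)}">{start}</a>'
        # Show the full range, but leave the end page unlinked.
        parts.append(link if end == start else f"{link}\u2013{end}")
    return f"<p>{escape(heading)}, {', '.join(parts)}</p>"


print(linked_entry("quinoa flour", [(12, 12), (40, 43)], href_for_page))
```

The reader still sees "40–43" and so still gets the cue about depth of coverage, but tapping the locator lands on page 40.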

(On Day 2 of the conference, Judy Dunlop gave an excellent summary of the workflow she used in a recent project doing embedded indexing in InDesign Creative Cloud. Post coming soon!)


Indexing Society of Canada and Editors’ Association of Canada conferences 2014—personal highlights

I’m back from four and a half days in Toronto, where I attended ISC’s and EAC’s national conferences. As in previous years, I’ll be posting summaries of some of the talks I attended—a process that, as I’ve learned, will take me several weeks. Both conferences were excellent, featuring a variety of sessions that appealed to novices as well as seasoned pros and that tackled not only the technical aspects of indexing and editing but also the business side of freelancing. Best of all was being able to see old friends and pick up conversations as if no time had passed, as well as meeting new colleagues and putting faces to names.

My days were packed: I had the privilege of introducing indexing superstar Enid Zafran at her talk about indexer–author relations at the ISC conference, and at the EAC conference I ran a two-part senior editors’ unconference: at a lunchtime session on Saturday, editors shouted out topics they wanted to discuss. I recorded the topics on a flip chart, then, with the help of sticky dots, the editors voted on their favourite ones. I ranked the topics based on votes and created our discussion agenda for our session on Sunday. It was impossible to get through all fourteen of the proposed topics, and it would have been great to have more time, but in general I thought the format worked reasonably well. It also helped that we had a great group; I’m consistently amazed by how much can happen when you just get a bunch of smart people talking to each other about what they know.

The highlight of my week, though, was the EAC banquet. Not only did we learn from Moira White that EAC has established a new award—for a person or organization that has helped advance the editing profession—in memory of our late friend Karen Virag, but we also saw Certification Steering Committee co-chairs Anne Brennan and Janice Dyer acknowledged for their enormous volunteer contributions to the association. Both won the President’s Award for Volunteer Service—a well-deserved and long-overdue recognition of the hours and hours and hours of work they put into steering the certification program. Congratulations go out to all the President’s Award winners, including Lee d’Anjou Award–winning volunteer of the year, Michelle Boulton. (Just as a note to the national executive, I would have loved to hear what these fantastic volunteers had done for EAC, not just their names! Please consider giving a one-sentence summary of each volunteer’s contributions at next year’s banquet.)

Congratulations, also, of course, to Claudette Upton Scholarship winner Daniel Polowin, and to University of Alberta Press’s Peter Midgley, who finally, finally received the Tom Fairley Award for Editorial Excellence he so deserves.

For me, the most exciting part of the evening was being able to present, on behalf of the Certification Steering Committee, designations of Honorary Certified Professional Editor to six pioneers of EAC’s certification program. Without them, the program simply wouldn’t exist. As someone who’s benefited tremendously from certification, both as a CPE and as a CSC member who’s had the privilege to work for the past two and a half years with some of the most brilliant, funniest people I know, I want to thank and congratulate these champions, mentors, and friends for their dedication: Lee d’Anjou, Peter Moskos, Maureen Nicholson, Jonathan Paterson, Frances Peck, and Ruth Wilson. I would not be where I am today without them.

If anyone has any photos of the presentation they could send me, I’d be grateful for them. Believe me—the amount of restraint it took to keep from spilling the beans about this surprise was enormous!

Book review: Starting an Indexing Business

You’ve taken indexing courses. Read the indexing chapter of the Chicago Manual of Style and Nancy Mulvany’s Indexing Books. Bought yourself indexing software.

Now what?

For most would-be indexers hoping to start their own freelancing business (as many of us are now aware), the actual indexing work isn’t the biggest challenge. Getting that work, not to mention managing the financial and administrative details of self-employment, is the tough part, and it’s one that gets very little attention in most indexing reference books. Starting an Indexing Business, edited by Enid Zafran and Joan Shapiro, is a rare exception, offering people who are launching—or considering—a career as a freelance indexer some insider wisdom about running their own business.

The fourth edition of Starting an Indexing Business was published in 2009, but it was recently released as an ebook. With chapters on moonlighting as an indexer while holding down a full-time job (by Melanie Krueger), the business of being in business (by Pilar Wyman), and liability and exposure issues for indexers (by Enid Zafran), this book tries to answer a lot of questions that a freelancer just starting out might have. It’s a quick read, and it’s packed with tips from indexing veterans who have spent years in the trenches. Seeing the issues from different indexers’ perspectives is helpful, and the diversity of contributors shows that, despite having similar traits that make us good at what we do, different indexers take different approaches to running their business. Particularly interesting is the debate about whether to invest in disability insurance, with Wyman advocating for it and Zafran saying she didn’t see the need.

Zafran’s chapter about liability has a lot to offer, spurring the reader to think about how best to protect their business and to assert their copyright to make sure they get paid. A sample letter of agreement for indexing services also appears as an appendix to the book, and it serves as a helpful tool for freelancers to communicate clearly with a new client and start off their working relationship on the right foot.

Although the book has plenty of solid advice for new indexers, much of it will be old hat to people who have had a few projects under their belt. Being five years old, it also needs an update. I suspect that cold calling and mailing out brochures to prospective clients, as marketing strategies recommended by a few of the contributors, have largely given way to email enquiries and websites. I would also hope that a fax machine is no longer a must-have in the home office. Workflow and file transfer technologies have also evolved dramatically since the book’s publication, and ebooks and self-publishing have exploded. Further, the book is geared toward a primarily American audience, with references to health insurance and U.S. taxes that wouldn’t apply to Canadian indexers.

New freelancers may find Starting an Indexing Business helpful, although I wouldn’t call it a must-read. For those with a few years’ experience already, there isn’t much in this book that you won’t already know. And beyond the sample letter, I don’t see much in this book that you would refer to time and again, so I’d be inclined to borrow it from the library, if you can. If you do want to add this title to your collection, I’d suggest waiting for an updated edition, so that the advice better reflects current practices and technology.

Recent innovations in indexing software

ISC conference attendees were treated to a tour of four of the most popular indexing programs.

TExtract

Harry Bego, developer of TExtract, came from the Netherlands to give us a presentation and demo of his “semi-automatic” indexing software. Having been a researcher in natural language processing at Tilburg University, Bego incorporated linguistic and statistical analysis algorithms into TExtract; these identify important terms and compile them all into an initial draft index, taking out a lot of grunt work of data entry. Bego was quick to emphasize that the user is always completely in control. Although TExtract puts together the initial index automatically, the indexer can review each entry and choose whether to accept or discard it. For each entry, the program shows its frequency and “significance score.” Users can adjust the significance threshold of a text to control what kinds of terms are picked out in the initial index, and they can add filters to determine which terms the program should exclude or include.
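
The accept-or-discard workflow is easier to picture with a toy example. To be clear, this is not TExtract's algorithm, which combines linguistic and statistical analysis that the talk did not spell out. The sketch simply uses raw word frequency as a stand-in "significance score", with a threshold and an exclusion filter. The function name and sample text are invented.

```python
import re
from collections import Counter


def draft_terms(text, threshold=2, exclude=()):
    """Return {term: count} for words that occur at least `threshold` times.

    Raw frequency stands in for a significance score here; that is an
    assumption made for illustration only.
    """
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(word for word in words if word not in exclude)
    return {term: n for term, n in counts.items() if n >= threshold}


sample = "Quinoa flour and quinoa flakes; flour keeps well."
print(draft_terms(sample))
print(draft_terms(sample, exclude={"flour"}))
```

Raising the threshold shrinks the draft list and lowering it grows the list. The indexer would still review every candidate.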

TExtract also has a “document replacement” feature that allows the indexer to compare a new version of the text with an old one and update the index accordingly. The entries are linked to the text—a feature that supports in-context navigation and editing.

Although TExtract is in itself a complete indexing program, Bego told us that TExtract outputs can then be fed into any of the other major indexing programs (such as the ones below), if an indexer is more comfortable editing on different software.

SKY Index

SKY Index’s developer, Kamm Schreiner, was unable to join us in person, but he sent a video that showed off some of this program’s features.

SKY has a spreadsheet-like interface and allows you to do data entry on the right-hand pane while a preview pane on the left shows you the index as it’s being built. Schreiner has built in several functions that in other programs might require a macro: SKY Index can easily consume subheadings, swap acronyms, and so on. The program also flags common errors for the indexer (e.g., adding a locator to an entry that already has a See cross-reference).
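
The error flagged in that example, a locator on an entry that already has a See cross-reference, is simple to check for in any index held as structured data. The sketch below assumes a hypothetical dictionary layout. It says nothing about how SKY Index stores entries or runs its checks.

```python
def entries_with_see_and_locators(entries):
    """Return headings that carry both a See cross-reference and page locators."""
    return [
        heading
        for heading, record in entries.items()
        if record.get("see") and record.get("locators")
    ]


# Hypothetical index data: "coriander" wrongly has both a See target and a page.
index = {
    "cilantro": {"locators": [14, 52]},
    "coriander": {"see": "cilantro", "locators": [52]},
    "garbanzo beans": {"see": "chickpeas"},
}
print(entries_with_see_and_locators(index))
```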

SKY Index has an edit view, which allows for quick and efficient editing. You can open up a browse pane that allows you to see and compare two separate sections of the index side by side. The browse pane can be used in data entry view as well, and the program bookmarks where you left off before opening the browse pane.

More information and videos are available on the SKY Software website.

CINDEX

Frances Lennie, a freelance indexer since 1977, established Indexing Research in 1986 to develop CINDEX, which is available on both Windows and Mac. The company also features a publishers’ edition of CINDEX, available only on Windows, which accommodates multi-user production environments, such as legal publishing houses or government publishing houses.

CINDEX uses an index card metaphor and allows up to 15 levels of subheadings. The indexer enters data in the record entry window while the index builds in the background. All records are date and time stamped.

CINDEX is fully Unicode compliant and supports different sorting conventions and spell checking in a variety of languages. Lennie gave us a demonstration of an index created in Hebrew, which reads right to left and so has entries in inverted order (locator on the left, entry on the right).

Lennie showed how CINDEX supports searching specific entries based on a character string, page range, or style attributes. Editing is also easy: CINDEX offers both global and individual editing options, and it also uses a variety of techniques to check the integrity of the index. Finally, the index may be exported in many different formats and file types.

Macrex

Gale Rhoades, the North American publisher of Macrex, gave us a demo of Macrex’s just-released ninth version. Macrex is a Windows-based program that was first developed in 1981. It boasts a seemingly overwhelming list of features, but the secret to using it, explained Rhoades, is to focus only on your current project and not worry about what may seem like overhead. She has worked with many indexers over the years, she explained, and she has always managed to help them configure Macrex to do what they need it to do.

Macrex is big on giving the user control. Entries are written directly in the index; different components (e.g., cross-references, locators) are colour coded, and you can change the colour palette to suit your tastes. Further, you can create a folder with a particular client’s specifications (e.g., sort order, layout, cross-reference format, output format) and use it essentially as a template for all of the work you do for that client.

Rhoades emphasized the client support that you get with Macrex. She hosts a weekly chat session for North American Macrex users to talk about indexing and software issues; she can also connect directly to your computer to troubleshoot Macrex problems.

Louise Spiteri—User-generated metadata: boon or bust for indexing and controlled vocabularies? (ISC conference 2013)

Louise Spiteri is the director of the School of Information Management at Dalhousie University, and she spoke at the ISC conference about social tagging and folksonomies. As a trained cataloguer, Spiteri said to us, “I’m a firm believer in controlled vocabularies, but we have to accept the fact that that’s not what our clients use.” She added, “User-generated metadata is here. Let’s accept it and learn to work with it rather than against it.”

Traditionally, a document’s metadata has been the purview of cataloguers, information architects, and professional indexers. Users could search for an item based on its existing classification, but they couldn’t amend that item’s categorization and organization based on their own needs and understanding.

In recent years, however, many blog and social media platforms have made it possible for users to store and categorize items—blog posts, photos, music, articles, and so on—based on their interests. They can organize these items by adding their own keywords, and in many cases they can add further metadata in the form of ratings or reviews.

Users typically add keywords using tags, which are non-hierarchical. A social dimension to user tagging was popularized by such sites as Delicious, CiteULike, and Flickr, on which users could not only tag information but also share those tags with a wider community. The collective tagging efforts of such a community produce a folksonomy (a portmanteau of “folk” and “taxonomy”): the set of terms that a group of users has used to tag content. Although such a set is open and uncontrolled, some sites offer tag recommendations based on what others have assigned, allowing for the potential for consensus.

User tagging has its limitations, of course—from ambiguity and polysemy (does the tag “port” refer to wine or a computer port or the left side of a ship?) to synonymy (especially in cases of spelling variants and singular versus plural nouns) to variations in the level of their specificity—but it can also be enormously powerful. In some communities, for example, dedicated users—avid fans who are intimately familiar with the content—can generate a set of tags that are more useful and informative than classifications offered by the vendor or a cataloguer, who is more likely to do the minimum level of cataloguing. Social tagging’s major strength is that terms can be individualized to users’ own needs. Further, folksonomies can adapt quickly to changes in user vocabulary, accommodating new terms with virtually no cost to the user or the system. Over time, particularly if the platform supports recommendations for tags, an item’s tags will tend to stabilize into an organically curated set.
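To make the synonymy problem and the stabilizing effect of recommendations a little more concrete, here is a minimal Python sketch. It is not something Spiteri presented; the user names, the tags, and the crude plural rule are all made up for illustration. It collapses trivial variants (capitalization and a simple trailing "s"), counts how many users applied each tag, and suggests the most widely used ones.

```python
from collections import Counter


def normalize(tag: str) -> str:
    """Collapse trivial variants: case, surrounding spaces, a simple trailing 's'."""
    t = tag.strip().lower()
    # Naive plural handling (an assumption for illustration, not real stemming).
    if len(t) > 3 and t.endswith("s") and not t.endswith("ss"):
        t = t[:-1]
    return t


def folksonomy(assignments):
    """Count how many users applied each normalized tag to an item."""
    counts = Counter()
    for user, tags in assignments.items():
        # A set ensures each user counts once per tag.
        counts.update({normalize(t) for t in tags})
    return counts


def recommend(counts, n=3):
    """Suggest the n most widely used tags, ties broken alphabetically."""
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:n]]


# Hypothetical tags that three users gave the same photo.
photo_tags = {
    "ana": ["Ships", "port", "Halifax"],
    "ben": ["ship", "harbour", "halifax"],
    "cy": ["port", "ships"],
}

counts = folksonomy(photo_tags)
suggestions = recommend(counts)
```

Note that this kind of cleanup only addresses spelling and plural variants. It does nothing for polysemy: the tag "port" is still ambiguous no matter how many users agree on it.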

Spiteri also briefly discussed newer forms of social tagging, including hashtags and geotags. Hashtags, common on Twitter, Tumblr, Instagram, and now Facebook, allow users to quickly follow a stream of content about a particular topic. However, they suffer the same problems as uncontrolled vocabularies; Spiteri strongly advocated promoting an official hashtag for a public event so that everyone uses the same one and the conversation isn’t split among multiple streams. Geotags, by contrast, add geographic metadata to information—allowing users to follow location-based news or identify the place a photo was taken, for example—and because they are often given in numerical format, such as latitude, longitude, and altitude, they are likely to be more consistent.

Social tagging, emphasized Spiteri, isn’t going away. How do we indexers work with it? Ideally, we would have a system that combines both controlled vocabularies and tags. On many blogs, for example, you can assign a post to one or more categories, which can be tightly controlled. User tags can then supplement or complement these categories, serving special user-focused functions. For instance, in multi-cultural communities, users can tag an item in their own language. Tags can also connect like-minded users, a function that controlled vocabularies don’t readily support. Most importantly, indexers can learn from user tags, adapting their subject headings to the language of their clients.
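One way to picture such a combined system is the rough Python sketch below. The vocabulary, the variant lists, and the function name are hypothetical, not taken from Spiteri's talk. User tags that match a known variant are mapped to a controlled heading; everything else is kept as a free tag and counted, so an indexer could review the most frequent unmapped tags as candidate terms.

```python
from collections import Counter

# Hypothetical controlled vocabulary: preferred heading -> known variant tags.
CONTROLLED = {
    "Indexing": {"indexing", "indexes", "indices"},
    "Metadata": {"metadata", "meta-data"},
}

# Invert it into a lookup from variant tag to preferred heading.
VARIANT_TO_HEADING = {
    variant: heading
    for heading, variants in CONTROLLED.items()
    for variant in variants
}


def classify(tags):
    """Split user tags into controlled headings and counted free tags."""
    headings, free = set(), Counter()
    for tag in tags:
        key = tag.strip().lower()
        if key in VARIANT_TO_HEADING:
            headings.add(VARIANT_TO_HEADING[key])
        else:
            free[key] += 1
    return sorted(headings), free


headings, free = classify(
    ["Indices", "meta-data", "folksonomy", "Folksonomy", "tagging"]
)
```

Here the free tags do the supplementing, and the count of unmapped tags is where the indexer would look to learn the users' language.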

Caroline Diepeveen—Team indexing: The way forward? (ISC conference 2013)

Caroline Diepeveen led a small team that indexed the five-volume Encyclopedia of Jews in the Islamic World (EJIW), published by Brill. Her efforts, along with those of her co-indexers, Pierke Bosschieter and Jacqueline Belder, won the team the Society of Indexers’ Wheatley Medal in 2011. Most gratifying for Diepeveen was the jury’s remark that they couldn’t tell that this index had been composed as a team.

Indexers are used to working in isolation, Diepeveen said, and some seem averse to the idea of working in a team. But her own experience with EJIW was positive, and in a small survey she conducted about team indexing, with eleven indexers responding, she found that 73% had had good experiences, while 27% said that their experience was okay; nobody had found team indexing particularly negative. The respondents had mostly worked in groups of two or three and used such strategies as constant discussion and a controlled vocabulary to achieve consistency in their work. Many teams had one main indexer who was responsible for putting the team together and ensuring the quality of the final product.

In Diepeveen’s case, team indexing became a necessity because of EJIW’s project deadlines. She had initially signed on as the encyclopedia’s sole indexer. In theory, the encyclopedia would be built one article at a time; the editors expected a steady flow of articles from the authors, and Diepeveen could index at her leisure. In reality, the bulk of the articles came at the end, and the options for the publisher were to extend the deadline or to bring in more indexers.

Fortunately, the encyclopedia itself was compiled using a sophisticated content management system (CMS) with a fine-tuned workflow. Team members were allowed access to only the parts of the CMS that they needed; authors from all over the world contributed articles directly into the CMS, and these articles were then edited by a team of editors and finally released for indexing. With the CMS, articles could easily be assigned to one indexer and then reassigned as needed; there was no need to mail files around. (Brill had attempted to develop a software module that allowed embedded indexing directly in the CMS, but the first version of the indexing module didn’t allow basic indexing features, such as selecting a range, and so was deemed unacceptable. In the end, the index was not fully embedded and instead was compiled using anchors in the text as locators.)

Serving as team captain, Diepeveen not only put together the indexing team but also oversaw her team’s work. She had already done some of the indexing before she brought on the other indexers, so the other team members could use her work as a reference. Helpfully, the articles in the CMS showed all indexed terms highlighted in green, and Diepeveen could easily see whether her teammates were over- or under-indexing and provide feedback as needed. She emphasized the importance of regularly communicating with team members to build trust and a strong working relationship. Geographically separated team members may not be able to meet in person, but teleconferencing and web conferencing go a long way in clarifying roles and tasks, not to mention allowing team members a chance to get to know one another.

To keep the process running smoothly, the team had to lay some groundwork:

  • Diepeveen did a thorough edit of the index near the start of the project so that all team members would have a basic structure to work towards.
  • The team disallowed double postings; cross-references could be converted to double postings at the very end if needed.
  • The team stipulated that all entries must have a subheading. When you see only one part of a publication, you don’t know how much weight or detail is given to a particular subject in another part of the publication. Again, unnecessary subheadings could be edited out at the very end if needed.

Most importantly, Diepeveen said, the team “kept asking questions. EJIW worked almost like peer review on the go. We asked each other, ‘Why did you decide to do things this way?’ We kept each other sharp by asking questions. That improved the quality of the index.”

As larger and larger electronic publications become the norm, Diepeveen said, team indexing will probably become more common. Emerging technological tools may help with the logistics, but the most important aspect of team indexing, she reiterated, was the team itself. It is critical to invest in trust, not only at the beginning of the project but also regularly throughout.

Pilar Wyman—Metadata, marketing, and more (ISC conference 2013)

Pilar Wyman (@pilarw on Twitter) is the immediate past president of the American Society for Indexing, as well as a member of the ASI’s Digital Trends Task Force, and she spoke at the ISC conference about promoting indexes as metadata and showing our clients how our indexes can be used as sales tools for their books.

We’re used to thinking of a book’s metadata as information about the book as a product—its title, author, ISBN, etc.—but a book’s index can also serve as metadata: each index heading and subheading can be thought of as a tag for a chunk of text that we want readers to see. As a result, readers can use this metadata to provide them with a filtered view of the content that reveals specific facets or dimensions of a book.

Indexes, Wyman argues, are as important for ebooks as a search function. They

  • add browsability and help readers find what they need by expanding the number of access points to content
  • serve as a navigational tool
  • offer pre-analysis: indexes give readers a good sense of the range of topics covered and the importance of each
  • provide a conversation with the reader, allowing publishers to show what their product has to offer

Wyman advocates giving away a book’s index for free (as Amazon essentially does with its Look Inside feature) as a marketing strategy, to let readers know what they could be getting. She also showed us the potential of index mashups, in which you combine the indexes of several publications in a collection, allowing users to browse or search across all of them. These mashups could be enormously useful for “scrapbook files”—collections of content from a variety of sources, as you’d find in a university course pack, for example.  Each heading in the mashed-up index is a link, taking you either directly to the content or to a summary screen of available information, with context. Most importantly for publishers, these indexes would offer users a direct link to purchase any of the books included in the mashup.
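At its simplest, a mashup is a merge of several heading-to-locator lists in which each locator remembers which book it came from. The Python sketch below is only an illustration under that assumption; the book titles, headings, and page numbers are invented, and a real mashup would also need an indexer to reconcile differently worded headings and to supply the links Wyman describes.

```python
from collections import defaultdict


def mash_up(indexes):
    """Merge per-book indexes into one: heading -> [(book title, locator), ...]."""
    merged = defaultdict(list)
    for title, index in indexes.items():
        for heading, locators in index.items():
            for locator in locators:
                # Lowercasing is a crude stand-in for real heading reconciliation.
                merged[heading.lower()].append((title, locator))
    # Sort headings alphabetically, and entries by title and then locator.
    return {heading: sorted(merged[heading]) for heading in sorted(merged)}


# Hypothetical indexes for two books in a collection.
books = {
    "Book A": {"Metadata": [12, 40], "Tagging": [7]},
    "Book B": {"metadata": [3], "Folksonomies": [21]},
}

mashup = mash_up(books)
```

Because every entry carries its book title, each one could in principle be turned into a link to the content or to a purchase page for that title.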

To exploit this marketing potential of ebook indexes, whether they are standalone indexes or mashups, publishers should link them—both in to the content and out to further resources or places to buy the book. These linked indexes should be included as back-of-the-file chapters or, better yet, in the front of an ebook so that the index gets searched first. For usability, the index should be accessible wherever you are in the book (just as you can flip to the back of a print book anytime you want), and the “find” tool should bring up the best hits, as identified by the index. Results should show snippets of a term in context, and cross-references should help the reader refine their search terms.

Generic cross-references can often present a dilemma for the indexer (e.g., Does See specific battles really give readers the information they need?), but Wyman’s vision for the EPUB index eliminates this problem: “specific battles” would link to a list of those battles, which would in turn link to the corresponding headings in the index. She also adds that smart use of tagging would allow you to filter not only based on concept but also type of content. For example, many of us already indicate definitions with boldface, images with italics, etc. This “decoration metadata,” as Wyman calls it, can be another layer of information that users can use to narrow their search down to what they need. Wyman also introduced the concept of a reverse index: users can highlight a section of text and discover what terms in the index are associated with it, allowing them to easily jump to other places in the text that discuss the same topic.
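The reverse index can be pictured as a lookup that runs from a location in the text back to the headings that cover it. Here is a minimal Python sketch, assuming locators are stored as ranges of paragraph numbers; the headings, the numbers, and the function names are hypothetical and are not part of any EPUB specification or of Wyman's presentation.

```python
def reverse_lookup(index, start, end):
    """Return headings whose locator ranges overlap the highlighted span."""
    return sorted(
        heading
        for heading, ranges in index.items()
        if any(lo <= end and start <= hi for lo, hi in ranges)
    )


def elsewhere(index, heading, start, end):
    """Other ranges for a heading, excluding any that overlap the highlight."""
    return [(lo, hi) for lo, hi in index[heading] if hi < start or lo > end]


# Hypothetical index: heading -> list of (first, last) paragraph numbers.
index = {
    "battles": [(10, 14), (52, 53)],
    "treaties": [(13, 13), (80, 85)],
    "trade": [(30, 35)],
}

# The reader highlights paragraphs 12 to 13.
found = reverse_lookup(index, 12, 13)
jump_to = elsewhere(index, "battles", 12, 13)
```

The first function answers "what is this passage about, according to the index?" and the second answers "where else is that topic discussed?"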

As indexers, Wyman said, we’re already skilled at figuring out aboutness and can easily apply those skills, especially if we’re already familiar with embedded indexing, to semantic tagging of text. If we can persuade our clients of the value of using our indexes as a sales tool, we can further leverage our expertise.

***

(My take: I think the idea of index mashups is brilliant. My colleagues who work in academic publishing spend huge amounts of time compiling different catalogues for different subject areas and markets. Offering one index mashup of all of their Aboriginal studies titles and another for their women’s studies titles, for example, could allow them to show the breadth and depth of their list to particular target markets, including academics considering course adoptions and subject-specific libraries.)

Nancy Mulvany—The repurposed book index and indexer (ISC conference 2013)

Nancy Mulvany is the author of Indexing Books, the go-to reference for any aspiring or practising indexer. She kicked off this year’s Indexing Society of Canada conference in Halifax with her keynote speech about the changing role of the index and indexer in a digital age.

We all know that we will have to evolve and adapt to this new landscape. But how do we go about it, and what potential obstacles do we face? Mulvany warned us about some of the more insidious forces behind our seeming glut of free information. She quoted Jaron Lanier’s Who Owns the Future?, saying “the dominant principle of the new economy, the information economy, has lately been to conceal the value of information.” Once you start to devalue information, she said, you devalue the people who provide access to information, and that’s us. Companies like Facebook expect us to share information freely while they profit handsomely; Mulvany noted that “making information free is survivable so long as only limited numbers of people are disenfranchised.” Right now it’s easy to get information for free by exploiting our musicians, writers, and artists.

Tracing the history of the book—from the invention of the codex to the development of the printing press and movable type to the first use of pagination—Mulvany offered context for the evolution of our consumption of information. At one time, Mulvany said, knowledge and information were highly prized: books were so valued they were chained to their bookshelves in libraries. Today we have an abundance of books in personal and office libraries, as well as in our computers and e-readers.

Finding information in an ebook, however, can be frustrating. If you go to Amazon and look through the reviews of reference books or non-fiction books that readers are trying to get information from, you’ll discover that those who have the Kindle edition can’t find what they’re looking for. Is there a way we can effectively integrate the codex and the ebook and find the information we need?

Mulvany gave us an example of what’s possible by taking us on a tour of Evernote, a note-taking application that allows you to collect information from a variety of sources—from PDFs and Word files to images and audio files—and keep them in one place. Evernote makes the text and images searchable; it even has the ability to decipher neat handwriting on a scanned or photographed document. Items in Evernote are “indexed,” in that you can assign categories and subcategories to items, then add tags—all of which are searchable. If you add another layer of information by aggregating indexes in Evernote, as Mulvany demonstrated with her collection of cookbook indexes, you can then search across multiple books at once, and she believes that there’s a market for organizing information in this way in all sorts of fields. If you want to make the information in a law office library searchable, for example, “the five-hundred-page book on torts doesn’t have to be scanned—provided there’s a good index.”

The power to aggregate information in applications such as Evernote is an example of the repurposed index, but how do we repurpose the indexer? That’s easier said than done, said Mulvany; many of us are very set in our ways, but we have an amazing skill set that includes the ability to analyze, prioritize, synthesize, and localize. By applying those skills with tools such as Evernote, we could “create a product for a client that provides an incredible depth of access to information—something that the most sophisticated search algorithm can’t provide.” In so doing, Mulvany warns, we have to remember the users, who are getting more and more difficult to anticipate—especially younger people who have never been taught how to use an index.