ISC Conference 2012, Day 2—What is the future of indexing?

Cheryl Landes is a technical writer and indexer who sees a changing role for indexers—one that is rife with possibilities.

Today people are consuming content in four main ways: through print, on e-readers, on tablets, and on smartphones. In the past year, more people have been moving towards tablets and smartphones rather than e-readers, since the former devices offer colour and other functionality. Many software vendors of authoring tools are adding outputs to accommodate tablets, and more and more companies are publishing technical documentation that can be read on tablets or smartphones (for example, Alaska Airlines replaced forty pounds of paper pilots’ manuals with iPads). Despite the movement towards mobile devices, however, Landes doesn’t believe that print will ever go away.

Digital content means users are able to search, but searching doesn’t yield the speed of information retrieval or context that an index offers. Indexers have to be proactive about educating others about the utility and importance of indexes, and emerging technologies are providing many opportunities for indexers to apply their skills beyond the scope of traditional back-of-the-book indexing.

Partnering with content strategists

Indexers can serve as consultants about taxonomies and controlled vocabularies, which are key to finding content. (An example of a taxonomy is the Legislative Assembly of British Columbia’s Index to Debates.)

Database indexing

Growth in this area is anticipated as more companies move their catalogues online, particularly in retail.

Embedded indexing

Embedded indexing tags content directly in a file and allows for single-sourcing, which is ideal for publishers who want print and digital outputs for their content. (Landes echoes Jan Wright in saying that for the past decade technical communicators have been grappling with issues trade publishers are facing now, yet they’re not talking to each other. How do we start that conversation?)
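To make the single-sourcing idea concrete, here is a minimal Python sketch. The `{{idx:term}}` marker syntax, the function name, and the sample sentences are all invented for illustration; real authoring tools each have their own embedded-tag formats. The sketch pulls the embedded tags out of the source text and builds one index that could feed a print output and a digital output alike.

```python
import re
from collections import defaultdict

# Hypothetical marker syntax: {{idx:term}} embedded directly in the text.
MARKER = re.compile(r"\{\{idx:([^}]+)\}\}")

def build_index(paragraphs):
    """Return (clean_paragraphs, index); index maps term -> sorted paragraph numbers."""
    index = defaultdict(set)
    clean = []
    for number, text in enumerate(paragraphs, start=1):
        for term in MARKER.findall(text):
            index[term.strip()].add(number)
        # Remove the markers so the reader never sees them
        clean.append(MARKER.sub("", text))
    return clean, {term: sorted(nums) for term, nums in sorted(index.items())}

# Invented sample content
source = [
    "Tablets{{idx:tablets}} are displacing e-readers{{idx:e-readers}}.",
    "Pilots now carry tablets{{idx:tablets}} instead of paper manuals{{idx:manuals, paper}}.",
]
clean_text, index = build_index(source)
```

Because the tags travel with the content, the same source can be re-run after every revision and the locators stay in step with the text.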

Search engine optimization

Indexers understand what kinds of search terms certain target audiences use. Acting as consultants, they can create strategies for keywording in metadata.

Blog and wiki indexing

This area is likely to grow because more companies are turning to blogs to promote products and services, and they are using wikis for technical documentation.

Social media

Possible consulting opportunities abound in this quickly changing field. Facebook’s Timeline and Twitter’s hashtags are both attempts at indexing in social media, but one can envision the need for more sophisticated methods of retrieving information as more and more content is archived on these platforms.

ISC Conference 2012, Day 1—Getting work with the federal government

I’ve attended a number of “getting work with the feds” presentations, but Marion Soublière’s at the ISC conference was by far the most engaging, informative, and coherent of them all. Soublière is a writer and editor and the author of Getting Work with the Federal Government. In her talk, she outlined short-term, medium-term, and long-term strategies that independent contractors can use to get government work.

Almost two hundred federal departments, agencies, and crown corporations will need the skills of writers, editors, and indexers, so the work is diverse and interesting. Best of all, the compensation can be quite good. With this spring’s round of budget cuts, many civil service positions were eliminated, meaning that the government will have to rely on more contractors to fulfill its needs. What’s more, many baby boomer bureaucrats are in a position to retire, so the opportunities are many.

Soublière dispelled myths that the government works only with large companies (in December 2010, 98% of suppliers were small or medium-sized firms); that you can only work for the government if you go through temp agencies; and that you need to be bilingual (most tenders ask for specialization in one official language or the other).

Public Works and Government Services Canada does 85% of the shopping for more than one hundred departments and agencies, and 93% of these are what’s considered “low-dollar buys” (less than $25,000). PWGSC runs buyandsell.gc.ca, your gateway to a lot of government opportunities. It is also behind the Office of Small and Medium Enterprises, which offers free seminars about doing business with the Government of Canada.

One of the first steps to working with the government is to register as a supplier. Go to buyandsell.gc.ca, register in the Supplier Registration Information (SRI) database, and get your Procurement Business Number (distinct from your GST/HST business number), which you will need in order to get paid. The government uses the SRI for low-dollar buys. Be sure to keep your profile updated every quarter, because it could be deleted if it’s deemed dormant.

Next, join MERX. Even if you don’t want to bid on tenders there, you need to be able to sign in to see who has bid for them, and this information may lead to subcontracting opportunities.

Since bidding on a standing offer can take months, your best bet to get government work is a multi-pronged approach with short-term, medium-term, and long-term strategies.

1. Short term

Search MERX for “temporary help services.”

Try to subcontract to a firm that already has a Government of Canada contract. See the Contract History section of buyandsell.gc.ca to find suppliers who have been awarded contracts.

2. Medium term

See the list of departmental material managers on buyandsell.gc.ca, and ask to be added to their departmental source list.

Search the Government Electronic Directory Services site for managers in your field (e.g., communications) and contact them via email.

Approach other key contacts, including PWGSC procurement officers for your community (find these through the Procurement Allocations Directory on buyandsell.gc.ca) and PWGSC regional offices, which will tell you about opportunities in your area.

Apply to get registered in Professional Services Online, which handles buys up to $78,500. Soublière suggests that if you’re a sole proprietor, list yourself as the consultant as well as the owner; keep your per diems, dates available for work, and years of experience updated; and submit quarterly activity reports whether you received work or not.

3. Long term

Bid on standing offers and supply arrangements. Requests for proposals are usually for a one-time contract, and since bids take a long time to prepare, it’s more cost-effective for you to go for opportunities that offer ongoing work. Check MERX daily, and don’t dismiss multi-million-dollar tenders, which could be appropriate for sole proprietors if multiple standing offers are involved. Businesses can also team up to bid on a job.

Tips

  • Don’t bother bidding if you don’t meet all the mandatory requirements.
  • Don’t use a solicitation document someone emailed you. If you don’t get on the document request list yourself, you could be disqualified.
  • Ask questions; all questions are circulated to everyone who downloaded the solicitation document.
  • Follow the instructions exactly. Mirror wording and sequence in the solicitation document.
  • Keep work samples handy.

If you don’t have security clearance and you have what the government considers “a possibility of a contract,” you can ask in a bid that a buyer sponsor you. Once you have the clearance, don’t let it lapse—it could take up to one year to renew.

Soublière notes that the more proposals you put together, the faster you’ll get, because much of the material becomes boilerplate. She also emphasizes the importance of marketing yourself—on a website, on LinkedIn, on Twitter. List your standing offers, and follow up with contracting authorities to remind them of your availability.

***

This is just a summary of what Soublière covered in her extremely thorough presentation. More information can be found in her book and on buyandsell.gc.ca.

ISC Conference 2012, Day 1—American Society for Indexing’s Digital Trends Task Force

ASI’s David Ream and Jan Wright gave the ISC a report on their work with the Digital Trends Task Force (DTTF), which came into being in the summer of 2011 after the issue of electronic publication indexing was brought up at the ASI conference earlier that year.

The task force actively participated in the International Digital Publishing Forum (IDPF), a consortium of businesses and organizations involved in defining the new EPUB 3.0 standard. By establishing a special indexers’ working group in the IDPF, and with the Australia and New Zealand Society of Indexers’ membership in the IDPF, indexers made their presence known to a much wider community of players driving the future of electronic publishing. (EPUB is the open standard format that can be read on the iPad, Nook, Kobo, Sony readers, and other e-readers. The notable exception is the Kindle, which uses a different format.)

The task force also set out to do industry outreach at such events as the Digital Book World and O’Reilly’s Tools of Change conferences. With this kind of outreach, the ASI could establish itself as an authority on indexing in a digital age. At the latter conference, a recurring concern of electronic publishers was the issue of discovery, since traditional channels, like bookstores and libraries, are now out of the equation. Ream and Wright pointed out that indexing, and by extension indexers, is the gateway to discovery, and because discovery means money, publishers are more likely to listen to indexers if we emphasize discovery. (Interestingly, Amazon did not participate in Tools of Change.)

Wright also presented at the WritersUA conference. WritersUA, based in the U.S., is a group of technical writers who had to deal with the issue of single-sourcing, and a move to XML, years ago. They have experience solving the kinds of problems trade publishers are only now beginning to face.

Wright’s outreach extended to being a guest on #ePrdctn Hour on Twitter, which, as a platform, Wright said, was more powerful than she could have ever imagined. After her Twitter hour, having established herself as an expert in the nascent field of ebook indexing, Wright was able to reach organizations and companies that otherwise would have been much harder to access. For instance, she is now able to talk directly to Adobe engineers about InDesign’s scripts for ebooks.

The ASI is trying to get the Digital Trends Task Force to conferences that indexers don’t usually attend, focusing on the themes of monetization and semantic metadata.

To stay informed about digital trends affecting indexers, Wright and Ream suggest joining the DTTF’s LinkedIn group and following TidBITS, Peter Meyers (@petermeyers on Twitter), and Joe Wikert.

ISC Conference 2012, Day 1—Indexing National Film Board of Canada images

NFB librarian Katherine Kasirer showed ISC conference attendees what’s involved in indexing the National Film Board’s collection, particularly its Stock Shot library.

We all know the National Film Board as a Canadian institution. It was established in 1939 and has about 13,000 titles in its catalogue, including feature-length documentaries and short animated films. Only 2,500 are available through the NFB.ca website, and these are the result of the NFB’s ongoing project to digitize all films and make them available for streaming.

The NFB also has what it calls the Stock Shot library (or the “Images” database), which is a collection of discarded footage (outtakes) that can be used in other productions. The database also includes

  • the Canadian Army Film and Photo Units (CAFPU) collection, deposited in 1946
  • the Associated Screen News collection
  • captured materials from World War II (German war propaganda)
  • the Canadian Government Motion Picture collection

Users might be, say, music video or commercial producers, researchers, or documentary and feature filmmakers. The database has very fine subject indexing to allow users to find exactly what they need. Since filmmakers often have to convey a particular mood or show a specific object or event, the indexing must include a number of elements of information to help users retrieve the desired footage, including

  • subject
  • location
  • shooting conditions (e.g., foggy, sunny)
  • time of day, season
  • camera angles (e.g., close-up, aerial shot)
  • year of production
  • special effects (e.g., underwater, time-lapse)
  • camera operator
  • film (title of film that produced the outtakes)
  • technical details
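As a rough illustration of how facets like these support precise retrieval, here is a small Python sketch. The field names follow the list above, but the record layout, the `find_shots` helper, and the two sample records are invented for the example and are not a description of the NFB's actual database.

```python
from dataclasses import dataclass, field

@dataclass
class StockShot:
    """One indexed piece of footage (hypothetical record layout)."""
    subject: str
    location: str
    shooting_conditions: str
    time_of_day: str
    season: str
    camera_angle: str
    year: int
    special_effects: list = field(default_factory=list)
    camera_operator: str = ""
    source_film: str = ""

def _matches(value, wanted):
    # List-valued facets match if they contain the wanted value
    if isinstance(value, list):
        return wanted in value
    return value == wanted

def find_shots(shots, **facets):
    """Return the shots that match every requested facet value."""
    return [
        shot for shot in shots
        if all(_matches(getattr(shot, name), wanted) for name, wanted in facets.items())
    ]

# Invented sample records
shots = [
    StockShot("harbour", "Halifax", "foggy", "dawn", "autumn", "aerial shot", 1952),
    StockShot("harbour", "Vancouver", "sunny", "midday", "summer", "close-up", 1967,
              special_effects=["time-lapse"]),
]
foggy_aerials = find_shots(shots, shooting_conditions="foggy", camera_angle="aerial shot")
```

Each extra facet narrows the result set, which is what lets a filmmaker ask for a mood ("foggy", "dawn") rather than just a subject.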

The search is, of course, bilingual, and will bring up images and clips, not just a written description. Kasirer’s presentation really drove home how specific and often how nuanced image and footage indexing can be.

ISC Conference 2012, Day 1—Building a bilingual taxonomy for ordinary images indexing

Elaine Ménard gave ISC conference attendees a glimpse into the world of information science research. An assistant professor in the school of information studies at McGill University, Ménard embarked on a project to develop a bilingual taxonomy to see how controlled vocabularies can assist in both indexing and information retrieval. Taxonomies are inherently labour intensive to create, and the bilingualism adds an additional complication.

Ménard’s Taxonomy for Image Indexing And RetrievAl (TIIARA) project consists of three phases:

  1. a best practices review,
  2. development of the taxonomy, and
  3. testing and refinement of the taxonomy.

Phase 3 is currently underway, and she gave us an overview of the first two phases.

In phase 1, Ménard and her team evaluated 150 resources, including 70 image collections held by libraries, museums, image search engines, and commercial stock agencies and 80 image-sharing platforms with user-generated tagging. They discovered that 40% of the metadata dealt with the image’s dimensions, material, and source, and 50% of the metadata addressed copyright information, with the balance devoted to subject classification. This review of best practices constituted the basis of phase 2.

In phase 2, Ménard’s team constructed an image database and developed the top-level categories and subcategories of the taxonomy. To create the database, they solicited voluntary submissions and ended up with a database, called Images DOnated Liberally (IDOL), of over 6,000 photos from 14 contributors. Her taxonomy kept in mind Miller’s Law of 7 ± 2 and featured (after a series of revisions and refinements) nine top-level categories, designed to help users with retrieval while being as broad as possible, and a further forty-three second-level categories.

After the category headings were translated, two volunteers, one anglophone and the other francophone, tested the preliminary taxonomy through a card-sorting game, in which they were instructed to sort the second-level cards according to whatever structure they desired and provide a heading for each sorted group. This pretest showed a polarization of “splitters” and “lumpers” and didn’t provide any practical recommendations for the taxonomy but did suggest revisions to the card-sorting exercise.

Ten participants (five male, five female; five anglophone, five francophone) were recruited to test the taxonomy and expose problematic categories in the structure. Half of the group was instructed to sort the second-level categories according to the existing first-level structure; the other half could sort the second-level categories as they pleased. Through this test Ménard hoped to assess how well each category and subcategory was understood; the differences between the French and English sorts would reveal nuances that had to be taken into account in the translation of the structure.

Results showed that the first-level categories of “Arts,” “Places,” and “Nature” were well understood but that “Abstractions,” “Activities,” and “Business and Industry” were problematic. Feedback from participants helped researchers refine the taxonomic structure down to seven first-level headings. Interestingly, Ménard found fewer disparities between the languages than expected.

The revised TIIARA structure was refined to include second-, third-, and fourth-level subcategories and was simultaneously developed in English and French.

In phase 3, underway now, two indexers (one English, one French) will work to index all images in the IDOL database according to the TIIARA structure. Iterative user testing will be carried out to validate and refine the taxonomy.

So far the study has shown that language barriers still prevent users from easily accessing information, including visual resources, and a bilingual taxonomy is a definite benefit for image searchers. Eventually the aim is to implement TIIARA in an image search engine.

ISC Conference 2012, Day 1—More to come!

There were four other ISC sessions that I attended today, but I haven’t had the chance to write them up. I’ll post them as soon as I can piece together something coherent out of my notes. Thanks for your patience!

UPDATE (Sunday, June 3): Ack. Now I have three and a half days’ worth of conference sessions—for the ISC and the EAC—that I have to summarize and post. I took a heap of notes, and I got the speakers’ permission to post synopses, so I’ll eventually get everything up here, though perhaps not as quickly as I initially imagined. I’m hoping to work my way through the session notes over the next week or so. Right now, however, brain = toast.

ISC Conference 2012, Day 1—The glory and the nothing of a name

Noeline Bridge is the editor of Indexing Names, a book fresh off the press. She spoke today at the ISC conference about proper noun indexing, particularly the tricky problems that arise from people’s names.

Determining the order of the elements of a name with multiple components is the basic problem that a proper noun indexer must solve. For example, the indexer must know that many medieval names and names that indicate a patronymic are typically left as is and that German names with “von” are traditionally indexed under the part that follows “von.” Bridge gave attendees an extremely useful list of resources that guide the practice with respect to inverting names in a variety of languages.
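The two conventions mentioned above can be sketched in a few lines of Python. This is only a toy: the `index_form` function and its `keep_as_is` flag are invented for illustration, the default "surname, given names" inversion is an assumption about the typical case, and a working indexer would rely on the language-specific resources Bridge listed rather than on a rule this simple.

```python
def index_form(name, keep_as_is=False):
    """Return a sortable index heading for a personal name (toy rules only)."""
    # Medieval names and patronymics are typically left in direct order;
    # the indexer decides this, so it is passed in as a flag.
    if keep_as_is:
        return name
    parts = name.split()
    if len(parts) == 1:
        return name
    # German "von": index under the part that follows the particle
    if "von" in parts[:-1]:
        i = parts.index("von")
        surname = " ".join(parts[i + 1:])
        given = " ".join(parts[:i + 1])
        return f"{surname}, {given}"
    # Assumed default: invert to "Surname, Given names"
    return f"{parts[-1]}, {' '.join(parts[:-1])}"

headings = sorted([
    index_form("Wernher von Braun"),
    index_form("Hildegard of Bingen", keep_as_is=True),
    index_form("Noeline Bridge"),
])
```

The flag is deliberate: whether a name counts as medieval or patronymic is a judgment call, so the sketch leaves that decision with the indexer instead of guessing.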

Deciding how much information to include and exclude is also an indexer’s judgment call. We have to be sensitive to what a publisher or author may want. For instance, one of Bridge’s publisher clients insisted that all military titles be included. Bridge occasionally adds glosses with qualifying phrases for added specificity. As an index user, she explains, she likes to know right away which entries refer to human beings and which ones do not, and the glosses help establish that.

Watch for parts of a name that may be titles or honorifics. If an author uses only one name to refer to a person (e.g., Batista, versus Fulgencio Batista y Zaldivar), one school of thought is that that’s all you need to include, but Bridge often prefers to look up and include all components of that person’s name for completeness.

Bridge uses glosses to help distinguish between people with similar names (a situation that comes up often in family histories or local histories) by place, by occupation, or by relationship. She uses these to keep them straight for herself and often simply leaves them in to help the reader. Sometimes she uses a family tree program to keep track of whom the text is referring to if there are many generations of people with the same name.

Changes in name can be a complicated category, because in some cases—for instance, when a writer adopts a pseudonym—the person is adopting a different persona, and an argument can be made to index these separately. In cases where a name evolves, once again, the indexer must use judgment to decide whether to use the most recent name/title or the one used predominantly in the book.

In the case of transliteration and romanization, the decision usually has been made for you by the author regarding spelling. An exception is when you have a collection or anthology with different authors on overlapping topics.

A theme throughout Bridge’s talk was that you must be prepared to yield tactfully to an author’s preferences, and you must be sensitive to context. For example, whereas you would usually index a celebrity under a name by which she is most commonly known, at times it may be appropriate to use her birth name if you’re indexing a book about her family history.

ISC Conference 2012, Day 1—Ebook indexes: the devil is in the details

Jan Wright, a leader in the field of ebook indexing, gave the keynote address at the Indexing Society of Canada’s annual conference this morning. We are witnessing a watershed moment, she says, where we are trying to define what the markup for our content should look like, no matter where it ends up, whether on paper or on a device like an e-reader or smartphone. This development is in its infancy right now, with conflicting formats on different platforms, and Wright is part of a working group of indexers actively involved in shaping the EPUB 3.0 standard to include indexing concerns.

Current ebook indexing is either nonexistent or ineffectual. Ebook indexes may be missing or static, and there are almost no ebook indexes that index at a paragraph level. They are not an integrated navigational tool, they are difficult to get to, and they are hard to browse, especially if they’re typeset in two columns.

Existing platforms try to mimic certain features of indexing, but they don’t provide all of the functionality of a traditional index. For example, iBooks Author conflates an index with a glossary and limits the function of indexes as navigational tools. Amazon’s X-Ray, currently available only on the Kindle Touch, shows all occurrences of a particular term by page, chapter, or book, but it is merely recall—without the precision of an index—and offers terms in chronological order. In other words, it’s a brute force attempt at indexing.

When considering ebook indexes, we have to take into account a reader’s mental patterns and search behaviours. Some readers have never read the book and need to know if it adequately covers a given topic; some have read the book and know that their search topic is in there, but they have to find it. We must also keep in mind that reading styles differ depending on whether you’re reading for education or for pleasure, fiction or nonfiction. Physical cues for locating content, such as the position in a book or location on a page, as well as behaviours like skimming, are disrupted in ebooks. Some platforms attempt to mimic a paper metaphor, but really, paper is just another interface. The key is to figure out what each interface does best and play to those strengths, because the paper metaphor doesn’t carry over well onto a small screen. The danger with today’s ineffective ebook indexes is that they are training readers to believe that indexes are unpredictable and thus to question why they should bother using them at all.

The ideal ebook index has features that have been implemented in other contexts before and so should be completely feasible. Wright gave us a demo of what an effective ebook index should do. It should be accessible from every page; the “Find” feature should reflect the best hits, as identified by the index; it should show the search results with snippets of text to offer context; it should allow cross-references to help you refine search phrasing; and it should remember that you’ve been there before and let you go back to it. Ebooks would also allow for additional functionality, like bringing up all indexed terms in a highlighted swath of text in a kind of “mind map” that offers additional information showing how concepts are connected.

So what can we do now? The first step, Wright says, is to get ready for the eventual use of scripts and anchors in EPUB 3.0. A goal is to develop a way to add anchors or tags to content at the paragraph level, which would allow for hyperlinking directly to the relevant content. Once prototypes of the interactive ebook index have been developed, we must assess their usability to ascertain what’s best for readers.
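For a sense of what paragraph-level anchoring might look like in practice, here is a minimal Python sketch that emits plain HTML. It is not the EPUB 3.0 index markup, which was still being defined at the time of the talk; the function names, the `p1`, `p2` id scheme, and the `chapter1.xhtml` file name are all assumptions made for the example.

```python
from html import escape

def anchor_paragraphs(paragraphs, prefix="p"):
    """Wrap each paragraph in a <p> element with a unique id (p1, p2, ...)."""
    return [
        f'<p id="{prefix}{n}">{escape(text)}</p>'
        for n, text in enumerate(paragraphs, start=1)
    ]

def index_entry(term, paragraph_numbers, page="chapter1.xhtml", prefix="p"):
    """Build one index line whose locators link straight to the anchored paragraphs."""
    links = ", ".join(
        f'<a href="{page}#{prefix}{n}">{n}</a>' for n in paragraph_numbers
    )
    return f"<li>{escape(term)}: {links}</li>"

# Invented sample content
body = anchor_paragraphs(["Indexes & search differ.", "Anchors enable linking."])
entry = index_entry("anchors", [2])
```

The point of the sketch is the granularity: because every paragraph carries its own id, an index locator can land the reader on the exact passage rather than at the top of a chapter or a page that no longer exists on a reflowable screen.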

A big takeaway from this keynote speech is that advocacy and outreach are essential. With the standards at a nascent, malleable stage, this is the time for indexers to have their concerns addressed as the technology develops so that indexers’ workflow can be taken into account. (But more on this in a later post.)