BHL-Europe
Summer Meeting Leiden Aug 2009

Presentation of digitized literature in web interfaces

 Comparison of duplicates
(page-level metadata, scanning quality)

Comparison of viewers
(speed, scrolling, zooming, structure metadata)

Comparison of journal volume summaries

Comparison of search page functionalities

 

 

 

 

 

A typical example of the need for literature in biodiversity research:

http://www.funet.fi/pub/sci/bio/life/: Papilio butterfly genus website

Preconditions the researcher brings along:

Usual way to consult literature:

 

 

 

 

 

 

 

Comparison of duplicates

Page-level metadata   Scanning quality   Speed of web interface

La Cepède 1802: Plates inserted between text pages.

La Cepède, B. G. E. de 1802. Histoire naturelle des poissons. Tome quatrième. - pp. j-xliv [= 1-44], 1-728, Pl. 1-16. Paris. (Plassan).

Gallica ("2007" = 1997?): http://gallica.bnf.fr/ark:/12148/bpt6k97533v
Page-level metadata: Roman pp. yes (Roman), Arabic pp. yes (Arabic), plates yes (Arabic)
Scanning quality: bitonal including figure plates, very low resolution
Speed: quick, 5-7 sec, rarely longer

Harvard (2008): http://www.biodiversitylibrary.org/item/30730
Page-level metadata: Roman pp. no, Arabic pp. yes, plates no
Scanning quality: colour with yellow-orange hue, high resolution, but cropped too closely at the margins (Pl. 16, "text" page after p. 674)
Speed: very slow, 12-20 sec, often 30 sec or longer (JPEG2000)

Smithsonian (2009): http://www.biodiversitylibrary.org/item/44010
Page-level metadata: Roman pp. yes (Arabic), Arabic pp. yes (Arabic), plates no
Scanning quality: colour with yellow-orange hue, resolution regular
Speed: very slow, 12-20 sec, often 30 sec or longer (JPEG2000)
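
The page-level metadata compared above amounts to mapping printed page labels (Roman or Arabic) to numbers, as in "pp. j-xliv [= 1-44]". The following is only a minimal sketch of such a mapping; the function names `roman_to_int` and `classify_page_label` are hypothetical and do not describe how any of the libraries mentioned actually store their metadata. It assumes that a "j" in a Roman page label stands for "i", as in the old French prints cited here.

```python
ROMAN_VALUES = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}


def roman_to_int(label: str) -> int:
    """Convert a Roman page label such as 'xliv' or 'j' to an integer."""
    # Assumption: old prints use 'j' for 'i' (e.g. 'xviij'), so treat it as 'i'.
    text = label.strip().lower().replace("j", "i")
    if not text or any(ch not in ROMAN_VALUES for ch in text):
        raise ValueError(f"not a Roman numeral: {label!r}")
    total = 0
    # Pair each character with its successor; the last one is paired with ' '.
    for ch, nxt in zip(text, text[1:] + " "):
        value = ROMAN_VALUES[ch]
        # Subtractive notation: a smaller value before a larger one is subtracted.
        if nxt != " " and ROMAN_VALUES[nxt] > value:
            total -= value
        else:
            total += value
    return total


def classify_page_label(label: str) -> tuple[str, int]:
    """Return ('arabic', n) or ('roman', n) for a printed page label."""
    text = label.strip()
    if text.isdigit():
        return ("arabic", int(text))
    return ("roman", roman_to_int(text))
```

With such a mapping, a viewer could display "xliv" as the printed label while still knowing that it is the 44th page of the Roman-numbered front matter.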

 

 

 

La Cepède 1800: Plates inserted between text pages.

La Cepède, B. G. E. de 1800. Histoire naturelle des poissons. Tome second. - pp. [1-2], j-lxiv [= 1-64], 1-632, Pl. 1-20. Paris. (Plassan).

Gallica ("2007" = 1997?): http://gallica.bnf.fr/ark:/12148/bpt6k975315
Page-level metadata: same as above
Scanning quality: same as above
Speed: same as above

Göttingen 2008: http://resolver.sub.uni-goettingen.de/purl?PPN574644156 and http://resolver.sub.uni-goettingen.de/purl?PPN574644423
Page-level metadata: Roman pp. yes (Arabic), Arabic pp. yes (Arabic), plates no (only structure metadata)
Scanning quality: colour without hue, very high resolution and solid colour quality
Speed: quick, 5-7 sec, rarely longer 

Harvard 2008: http://www.biodiversitylibrary.org/item/30017
Page-level metadata: Roman pp. no, Arabic pp. yes, plates no
Scanning quality: colour with yellow-orange hue, high resolution, a thumbnail intrudes into the image (plate at p. 492 + 2)
Speed: same as above

Smithsonian 2009: http://www.biodiversitylibrary.org/item/44012
Page-level metadata: Roman pp. yes (Arabic), Arabic pp. yes (Arabic) but incorrect in places (p. 449), plates: where?
Scanning quality: same as above
Speed: same as above
 

 

 

Other examples for page-level metadata:

Good examples:

Missouri BG: http://www.biodiversitylibrary.org/item/9519 (Bonpland 1813): mixed book (pages and plates)

Göttingen: http://resolver.sub.uni-goettingen.de/purl/?PPN600750280 (Esper 1782): plates-only book

Bad examples:

Smithsonian: http://www.biodiversitylibrary.org/item/41550 (Shuttleworth 1878): pages yes, plates no (incorrect OCR)

Smithsonian: http://www.biodiversitylibrary.org/item/34049 (Walckenær 1802): Roman pp. no, Arabic pp. yes, plates no

Smithsonian: http://www.biodiversitylibrary.org/item/35904 (Reeve 1849): plates-only work, nothing shown!

 

 

 

Comparison of viewers
(speed, scrolling, zooming, structure metadata)

 

 

 

Comparison of journal volume summaries

Animalbase journal list page "A"

Linz (Annalen Naturhist. Mus. Wien): volume numbers, years, consistently sorted by year, different journal names! Best example of all.

Bielefeld (Der Naturforscher): volume numbers and year in the viewer itself, consistently sorted, moveable frame! but no direct page access

BHL Harvard (Anzeiger Akad. Wiss. Wien): volume numbers and years, not consistently sorted

BHL Woods Hole (Archiv für Naturgeschichte): volume numbers and years, not consistently sorted by year

BHL Natural History Museum (Archiv für Naturgeschichte): partly without any metadata at all! vol 15??

(Links to the same journal should be united on the same page)

Gallica (Annales des Sciences Naturelles): not very convenient, works with 2 frames, consistently sorted by year, no permalink for the journal page??

Göttingen (Beyträge zur Naturgeschichte): not very convenient, not sorted, no year, too many clicks needed

Cincinnati (Journal Cinc. Soc. Nat. Hist.): not convenient, many clicks needed, no enlargement, no simple page access; frames movable in both directions allow the user to cover both margins and headlines
(and another example of incorrect OCR: vol. 2 (2) p. 99, duryi OCR'd as dujryi)

 

 

 

 

Comparison of search page functionalities

Gallica 

BHL  

BHL-Europe test portal

Archive.org

OPAC SUB Göttingen

GVK system catalogue

Examples for queries:

Cuvier, G. 1817 (http://gallica.bnf.fr/document?O=N003851). Le règne animal distribué d'après son organisation, pour servir de base à l'histoire naturelle des animaux et d'introduction à l'anatomie comparée. Avec figures, dessinées d'après nature. Tome II, contenant les reptiles, les poissons, les mollusques et les annélides. - pp. j-xviij [= 1-18], 1-532. Paris. (Deterville).

 

Detailed comparisons of search pages:

Gallica: generally a bad example in terms of design and function.
Old browsers (IE 5.0) cannot use the search function.
Linked items are not immediately visible, and moving the mouse distorts the page display. The search does not work properly ("cuvier" and "le règne animal" are not found), hits are not sorted, and only 15 results are shown, with too much information per result (up to 20 lines, including parts of the contents!). The search is also weak because it screens the contents as well; this is never necessary and should only be possible in an advanced search mode.
Good: some additional bibliographical information can be displayed immediately on request, without wasting time or space.
Mozilla: 2 results without scrolling; IE5: no search possible.

BHL: generally a good example.
Linked items are immediately visible, everything is arranged clearly (3-4 lines per result), moving the mouse does not disturb the page display, and the query box sits at the very top, in the page header (very user friendly).
The search function itself is very limited ("le règ" does not find "le règne animal", but "le reg" does; "archiv naturgeschichte" does not find "archiv für naturgeschichte"), and titles are usually not found when searching by author (example: cuvier). The search function works well if you know its limits and exactly how to phrase a query. You will also find "le règne animal" by entering "gne ani", which makes it a rather powerful search function for those who know the exact title.
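
The limits described here (accented queries failing, missing words breaking a match) could be avoided by folding accents and matching each query word separately. The following is only a minimal sketch of that idea, not a description of how the BHL search is implemented; the function names `fold` and `title_matches` are hypothetical.

```python
import unicodedata


def fold(text: str) -> str:
    """Lower-case the text, strip accents and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop combining marks, so that 'è' becomes 'e' and 'ü' becomes 'u'.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())


def title_matches(query: str, title: str) -> bool:
    """True if every word of the query occurs somewhere in the folded title."""
    folded_title = fold(title)
    return all(word in folded_title for word in fold(query).split())
```

With this approach "le règ" and "le reg" both find "Le règne animal", "archiv naturgeschichte" finds "Archiv für Naturgeschichte", and partial strings such as "gne ani" still match.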
Sorting of results is good (consistently alphabetical by title, although short words like "the" and "le" are sometimes ignored and sometimes not (why?), example "le reg"); up to 100 results are shown and load very quickly. There are usually 5 lines between the query box and the first result, 3 of them usually carrying no information (authors, names and subjects usually do not return any useful results):
...
Search Results for "archiv f"  (New Search)
Titles found : 16
Authors found : 0
Names found : 0
Subjects found : 0
Titles
...
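
On the inconsistent treatment of short leading words such as "the" and "le" when sorting titles: a consistent rule is easy to state. The following is only a minimal sketch under assumptions; the article list `LEADING_ARTICLES` and the function `title_sort_key` are hypothetical and not taken from BHL. The sample titles are works mentioned elsewhere in this presentation.

```python
# Assumption: an illustrative, incomplete list of articles to skip when sorting.
LEADING_ARTICLES = ("the", "le", "la", "les", "der", "die", "das")


def title_sort_key(title: str) -> str:
    """Sort key that always ignores one leading article."""
    words = title.casefold().split()
    if len(words) > 1 and words[0] in LEADING_ARTICLES:
        words = words[1:]
    return " ".join(words)


titles = [
    "Le règne animal",
    "Der Naturforscher",
    "Histoire naturelle des poissons",
    "Archiv für Naturgeschichte",
    "Annales des Sciences Naturelles",
]
sorted_titles = sorted(titles, key=title_sort_key)
```

Here "Der Naturforscher" is filed under N and "Le règne animal" under R, in every case and not only sometimes.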

Result links go to the bibliography page: good for serials, bad for monographs (there they should link directly to the item page).
Results usually show the most relevant data: title, usually also author and year, and the contributing library on one line.
Mozilla: 6; IE5: 3 results without scrolling.

Archive.org: not only for literature; literature queries compete with non-literature results.
Shows 50 results, with moving thumbnails (these can be turned off; otherwise queries take longer). Moving the mouse does not disturb the page display. The search also screens contents (like Gallica), but the presentation is better than in Gallica. Only one background colour, like BHL. Below the query box there are only 3 lines:
...
Search Results
Results: 1 through 50 of 229 (0.023 secs)
You searched for: archiv naturgeschichte 
...

followed by the results. This is good use of space.
Mozilla: 3; IE5: 2 results without scrolling.
Presentation of results: you can sort by Relevance, Average rating, Download count, Date, or Date added, but sorting by date does not mean that the date shows up in the results. This is not a scientific portal. Sorting at BHL is clearer and more useful for biodiversity researchers.

BHL-Europe test portal: needs to be improved.
Old browsers take very long to load the entry page, and a media player is apparently needed to see results (!); this requirement should be removed.
Too much space is wasted: the logo is too large (50 % of its height would suffice), so the header takes up too much space, and in some browsers not a single result is visible without scrolling down.
Additional space is wasted between the query box and the results, with a second, unnecessary query box and another completely unnecessary header row:
...
#   Sel.   Organisation / Collection  Type Of Resource   Bibliographic citation   Resource
...
Several background colours; moving the mouse distorts the whole page display, including the background colours; linked items are not immediately visible; only 10 results are shown.
Mozilla: 3; IE5: 0 results without scrolling.
I would try to combine the main advantages of BHL with those of the German library catalogue web interfaces shown below.

OPAC SUB Göttingen: library catalogue (many libraries in Germany use this web interface; only the logo and the left-side frame information are replaced). A good example of results-page design, very user friendly.
Very quick and simple search functions; query boxes at the very top of the page, which leaves much space for results; main page and results load quickly (also in old browsers); background uniformly white; font colour very clear; moving the mouse does not distort the page display; all links so clearly visible that it is hardly possible to improve on this.
"regne animal" gives 74 results, but only 10 are displayed.
"archiv für naturg*" gives 52 results, with icons showing the different types of resource (explained on mouse-over).
2-3 lines per result (sometimes more for the linked title, only 1 line for metadata) is a very good example of how literature titles can be presented in such lists. Moreover, there is little space between the lines; this is by far the most effective way to show the results.
Mozilla: 10; IE5: 7 results without scrolling.
I wonder why only 10 results are displayed. This is the only major shortcoming of this presentation.

GVK (a northern German library system catalogue): another very useful and user-friendly example. The design is like OPAC Göttingen, with slightly different functions.
Other library system catalogues use this interface too; only the logo in the upper left corner is replaced.
The main advantages of the OPAC web interface are also visible here: the same icons and the same general page design are used.
Mozilla: 10; IE5: 7 results without scrolling.

 

 

 

 

 

 

Thank you for your attention!