BHL Survey 2010, evaluation of the results
15 March - 03 May 2010
Main tasks addressed in this survey
1 - Not to lose our current user groups (for which it is necessary to understand who they are)
2 - To improve our service so that our current user groups can use BHL to a greater extent and more comfortably than at present
3 - To attract new user groups (for which it is necessary to understand what the users need among the things we could provide)
Total number of participants: 1113 (successful answers for Question 16, the other questions had fewer)
Question 1: use frequencies
Question 1: How often do you use the Biodiversity Heritage Library (BHL, http://www.biodiversitylibrary.org)?
Most participants of the survey used BHL fewer than 2 times a week. This group accounts for approximately 20 % of the page requests at BHL.
1/3 of the participants were frequent users, the others occasional users or had not used BHL before. The "frequent users" group (A and B) constituted by far the most important section among the participants in terms of usage.
There were only minor differences between participants before and after 06 April (the survey was open 16 Mar to 03 May). We tested this to rule out the possibility that after a few weeks responses would be restricted to occasional users.
This question was important for the evaluations of several questions of the
survey.
Frequent users have more experience with BHL functions, so their voice should
be given more weight when questions are asked about the current quality
of BHL.
Occasional users could eventually convert into more frequent users, if BHL is
improved, so their special needs and desires must not be neglected. The answers
of this group are important when questions about new ideas for new functionalities
are asked. See also Q2: occasional users do not work often with digitized literature,
so improving the service has limitations.
Total number: 1026
Frequent users total: 339
Participants after 06 April: 303
Credit points in these questions were calculated with the following
method:
Answers were always given on a 5-point scale (total agreement
to total disagreement), with a neutral option (no opinion or "I don't understand")
which was not considered in the evaluation.
I calculated the % proportions of scale points 1-5 (totalling 100 for all five taken
together), then multiplied the proportion by 4 for scale point 1 (total agreement),
by 3 for point 2 (moderate agreement), by 2 for point 3 (middle option), by
1 for point 4 (moderate disagreement) and by 0 for point 5 (total disagreement),
added up the products and finally divided the sum by 2.
This yielded a maximum value of 200 credit points for total agreement by all
participants, and a minimum value of 0 credit points for total disagreement
by all participants. Values around 150 corresponded to relatively strong agreement,
100 for middle options, 50 to relatively strong disagreement. Green coloured
bars reflect more or less strong levels of agreement, orange and yellow bars
reflect more or less strong levels of disagreement.
Example:
Question 2, bullet point 2D "for researching biology":
Results: 81 "almost always" + 194 "often"
+ 253 "sometimes" + 204 "rarely" +
230 "never" (+ 5 "I don't know", not considered) (= total 967 answers for this bullet point)
% Proportions: 8.4 + 20.2 + 26.3 + 21.2 + 23.9 = 100 %
Credit points: ((8.4 x 4) + (20.2 x 3) + (26.3 x 2) + 21.2) / 2 = 84.0
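The credit point method described above can be written as a short Python sketch. The function name credit_points and its input format (a list of five answer counts, ordered from scale point 1 to scale point 5, with the neutral answers already removed) are assumptions made for this illustration and not part of the original evaluation.

```python
def credit_points(counts):
    """Credit points (0-200) from the answer counts of scale points 1-5.

    counts: five numbers, ordered from total agreement (point 1) to
    total disagreement (point 5). Neutral answers ("I don't know")
    must already be excluded, as in the evaluation described above.
    """
    if len(counts) != 5:
        raise ValueError("expected counts for exactly five scale points")
    total = sum(counts)
    if total == 0:
        raise ValueError("no answers to evaluate")
    weights = (4, 3, 2, 1, 0)
    # Weighted sum of the % proportions, divided by 2.
    weighted = sum(w * c for w, c in zip(weights, counts))
    return 100.0 * weighted / total / 2


# Question 2, bullet point 2D ("for researching biology")
print(round(credit_points([81, 194, 253, 204, 230]), 1))
```

Because the sketch works with unrounded proportions, its result can differ in the last decimal from a calculation made with the rounded percentages.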
Question 2: use functions
Question 2: How often do you use the Biodiversity Heritage Library for the following?
[Scale: almost always, often, sometimes, rarely, never, I don't know/understand]
Results Question 2:
The highest value was obtained for the control question "work with digitized
literature" (177 credits in frequent users).
BHL is mostly used for verifying nomenclatural questions (141 credits in frequent users, a very high value), less frequently for researching the publication history of species and for verifying the correct citations of literature sources. BHL is substantially less frequently used to find illustrations, and even less for research on the biology behind the taxonomic name.
Many users download files at high resolution (124 credits), some provide links to BHL sources (for which they need stable URLs) or use BHL for data mining. The bullet point "fulfil a library user's request" obtained the lowest ranking.
These answers were more or less expected.
The biology behind a name of a species is significantly less frequently
requested than the names and acts as such. Information on the biology is substantially
more quickly outdated, and for looking for information on the biology it is
necessary to consult more recent literature. Indirectly this means that BHL
should provide more literature after 1920.
Finding illustrations is not as highly requested as text (86 credits
by all users), but must not be neglected. The lower rating might have to do with
plate numbers not being given in the page level metadata - this makes it difficult
even for specialists to find illustrations. Even if you know on which plate
the animal is figured, it will take you a long time to find it. General interest
readers (28 persons) were significantly more interested in finding illustrations
than average (132 credit points).
Bullet points H (stable URLs), I (data mining) and J (library requests) addressed particular interest groups, who responded more positively to the points than average (the "all users" value for database providers in bullet point H was 64). The low rates of affirmation do not mean that the service was not requested. They mean that the service is only requested by particular target user groups.
Occasional users would not do different things with BHL, but just less frequently in every single bullet point.
Total number (bullet point A): 1026
Frequent users total (bullet point A): 366
Question 3: levels of satisfaction
Question 3: We would like to know how satisfied you are with various features of the Biodiversity Heritage Library.
[Scale: strongly agree, agree, neither agree or disagree, disagree, strongly disagree, I don't know/understand]
Users were moderately content with all mentioned features; differences between
items were small.
Frequent users were generally slightly more satisfied with the functions.
The highest levels of satisfaction were recorded for the PDF downloading function
(entire book) and - surprisingly - for the scan quality. Slightly lower but
still very high levels of satisfaction were registered for the search function
(bullet points A and B yielded exactly the same values, the only difference
was that A obtained slightly higher values for "I don't understand").
The levels of agreement with OCR download functions, high resolution image
downloads and the taxon finder function ranked lower. The create-my-own-PDF
function and the data mining access received still lower support.
The online viewer had by far the lowest level of agreement.
This question had the highest ratings of answers "I don't understand/I
don't know". This had partly to do with "first" users not knowing
much about BHL. Frequent users knew substantially better what was asked here.
Ratings in % for "I don't understand/I don't know" for bullet points
A-J:
All users: 7, 6, 15, 6, 6, 11, 23, 14, 23, 33
Frequent users: 0, 0, 7, 0, 0, 5, 18, 7, 16, 26
Scan quality. Scan quality was rated as fine or good. Surprised by this result, we asked the community why they had not selected lower ratings of satisfaction for current BHL functions. Being specialists ourselves, we know that the scan quality of current BHL libraries (Smithsonian, Harvard, Missouri, Natural History Museum London) is far below that of other providers outside the current BHL group, and that the image quality of the plate figures by those providers is certainly not sufficient if one really needs to work with these figures.
We obtained feedback that many people were so thankful that BHL provides this free service at all that the idea of complaining about quality simply did not arise. When users were asked more directly about the scan quality, several experienced users affirmed that several things in the scanning process could indeed be improved. We will see if ratings will increase next time.
Conclusions
This question is important as a baseline to be compared with answers obtained after the BHL web presentation has been improved. These are areas where improvements are under way.
Ease of reading a book in the online viewer had the lowest rating. The online viewer would need improvement most urgently.
Total number (bullet point A): 1023
Frequent users total (bullet point A): 373
Question 4: PDF download reasons
Question 4: In this question we would like to understand why you need to download a PDF file.
[Scale: almost always, often, sometimes, rarely, never, I don't know/understand]
PDFs are downloaded for a variety of reasons.
Highest rankings were recorded for shortcomings in the online viewer and the
BHL search function. Computer offline was also important. All other reasons
were less important, but none was of minor importance.
Frequent users had slightly higher concerns about long lasting free access
to BHL contents, which corresponds to higher rates of persons who consulted
BHL contents without downloading PDF files. Occasional users needed PDFs
for printing more frequently than frequent users, and they tended to find books
again even more rapidly on their own hard disk than frequent users.
4E was unique in this block in that 25 % did not understand this question. This also applied to the German survey (25 % there as well), where the meaning of the term "searchable" was worded even more precisely and misunderstandings should have been excluded. The results suggest that searching within the full text of a digitized PDF file is not done by many participants.
4C: Skeptical users (fearing that free access may not be long lasting)
were unevenly distributed among participants (average 77 credits). Frequent
users were generally more skeptical (82 credits). Surprisingly large differences
were recorded between languages.
English-language participants (344 persons) were the least skeptical (52 credits;
North Americans had 55 credits). French, Italians and South Americans (140
participants) were more skeptical than average (93-95 credits), German-language
participants (143 persons) were very skeptical (103 credits), and eastern Europeans
and Russians were the most skeptical (45 persons, 108 credits). Librarians (92
persons) were much less skeptical (49 credits), which influenced the low
value for English participants in general.
4H: suggests that in roughly 30-35 % of the cases PDFs are not downloaded.
Conclusions
Users download PDF files for a variety of reasons. Some reasons can be neutralized
by improving BHL functions (PDF reader against online viewer, making it easier
to find a book again by improving the search functions) or simply by time (long
lasting free access), others not (computer offline).
Bullet point H (I don't download PDFs) received surprisingly high rates, even
higher by frequent users. It will be important to compare the rating for this
bullet point with future surveys.
Total number (bullet point A): 1015
Frequent users total (bullet point A): 350
Question 5: referrers
Question 5: We would like to know from which website you come to the Biodiversity Heritage Library.
[Scale: almost always, often, sometimes, rarely, never, I don't know/understand]
Most users seem to have bookmarked BHL, especially frequent BHL users.
Important referrers were Google, Wikipedia, EoL, occasionally also library catalogs.
Other paths were rarely used. Bing, Yahoo and others can be neglected, and the BHL
blog seems to be used by only few participants. Species 2000 was inserted to
obtain a negative calibration (Species 2000 does not provide links to BHL).
We were surprised that positive responses were obtained at all. It is possible
that Species 2000 is used by some participants who get to BHL indirectly via
other providers to which Species 2000 provides links.
Frequent users have more commonly bookmarked BHL than the others, and
use Internet Archive more frequently, as well as library catalogs. Frequent
users use Google exactly as frequently as do occasional users.
We compared the results with those of Google Analytics in the same time period (15 March - 03 May 2010). The results differed markedly:
Direct traffic: 18.4 %
Google search engine: 49.7 %
Other search engines: 2.5 %
Referring sites: 29 %
This striking difference is difficult to explain. Google Analytics is a service
provided by Google, a commercial company which makes money with such tools.
It is also possible that the results are correct and that Google counts an extremely
high number of useless clicks as successful traffic (people who came from Google,
saw immediately that BHL did not provide what they were looking for and left
the site again quickly).
Conclusions
It will be more important than previously expected to develop strategies
for higher rankings in Google. Even for frequent users the Google search engine
is much more important than we had thought. Other search engines can be neglected,
but it will be important to keep an eye on these, too, since Google's star might
be sinking some day.
Internet Archive is also important. We received feedback that users tend to
look up material found at BHL in Internet Archive in the hope of finding the same
work in higher quality, for example a Google book.
Other search engines and the BHL blog yielded only slightly higher rates than Species2000, so those were close to zero.
The library catalogues item was difficult to evaluate more closely. North Americans had 46 credits, South Americans 24 and Germans 25.
Total number (bullet point A): 1000
Frequent users total (bullet point A): 367
Question 6: search methods
Question 6: Help us to understand your preferred method of searching for books online. Please rate each of the following search strategies.
[Scale: totally prefer, very much prefer, moderately prefer, slightly prefer, not at all prefer, I don't know/understand]
Most users searched for the author, but not always and not exclusively.
There were hardly any differences between frequent and occasional users. Those
who were used to the current BHL default function tended to rank this method
higher. But they also ranked higher the wildcard option search for titles and
authors.
Google-like search was not preferred, even less so by frequent users. Google
returns too many insignificant results and is not able to search for an exact
sequence of letters, and incorrect spellings are automatically corrected (it is
not possible to search for an uncommon spelling of a name). These are shortcomings
in the Google search function and annoying for professional BHL users. BHL users
seem to know exactly what they are looking for.
Not many users like advanced search options with Boolean terms
like "and", "or", "not" etc. This means that
they prefer an effective and quick default search function. If they do not find
immediately what they are looking for, they prefer to start a new search
with other keywords instead of using a Boolean search function. Some comments
in the freetext questions, however, suggest that some users are quite happy with
Boolean search terms - it would be convenient to provide these as an additional
option.
Scientific names of species and genera are extremely often searched for.
This means that linking taxonomic names with literature sources, as done by
uBio tools (taxon finder) or AnimalBase is very important and highly requested.
Common/vernacular names of animals and plants are very rarely looked
for (50 credits). A special analysis of the general interest readers and artists
(29 persons) gave a value of 95 credits - probably significantly higher, but
still not very high.
Conclusions
Following these results, the preferred default search function would be one like this: author, year and some words of the title, with a wildcard option, yielding as few results as possible, and an independent search function where taxonomic names would be found.
The current BHL default should be maintained as a possible option "exact
letter combination in title", so that "relle des mo" could be
inserted and return only extremely few results (of a title "Histoire
naturelle des mollusques").
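The "exact letter combination in title" option can be illustrated with a minimal Python sketch. It only shows the matching principle (a case-insensitive substring test) and is not a description of how the BHL search is actually implemented. The function name find_titles and the sample titles are hypothetical.

```python
def find_titles(titles, fragment):
    """Return the titles that contain `fragment` as an exact,
    uninterrupted sequence of letters (case-insensitive)."""
    needle = fragment.casefold()
    return [title for title in titles if needle in title.casefold()]


# Hypothetical sample catalogue, for illustration only.
TITLES = [
    "Histoire naturelle des mollusques",
    "Histoire naturelle des poissons",
    "Birds and beasts",
]

print(find_titles(TITLES, "relle des mo"))
print(find_titles(TITLES, "and"))
```

Since every character of the fragment is matched literally, a word such as "and" is treated as part of the title and not as a Boolean operator.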
Boolean terms are not preferred and should only be offered as an extra function
on request. The default search function should be able to understand the word
"and" as a word belonging to the title.
Total number (bullet point A): 1055
Frequent users total (bullet point A): 372
Question 7: future developments
Question 7: Help us prioritize developments for the future of the Biodiversity Heritage Library.
[Scale: very important, quite important, moderately important, slightly important, not at all important, no opinion, I don't understand]
Frequent users tended to rank priorities for improvements slightly higher.
Non-English users tended to rank priorities higher.
The highest priorities were given to proposals to submit requests for scanning
literature, improvements of the online viewer and high resolution downloads.
Proposals for improving metadata were rated higher by non-English participants
(121 credits by all non-English users, 92 by English occasional users).
Login functions ranked considerably lower; the majority rated these as not
very important. Among them, 7K (saving favorites) had the highest rates, obviously
in line with responses from other questions that the search function in BHL is not
optimal. The weakest values were obtained for search by collection, quicker download
of low resolution PDFs and (extremely low) tagging content with keywords.
Conclusions and thoughts
Submit requests for scanning literature was the most preferred item
for improving BHL.
The question is how to realize that. We would need a compound catalogue from
which titles could be selected. This would be the bidlist's catalogue, which
would not give results in the default search function, but only in an extra
search function "bidlist/list of desired titles".
Requests for download of high resolution images go hand in hand with improving
the create-my-own-pdf service. The big problem is that high resolution images
will mostly be requested from colour plates - which in turn have not been marked
in the page-level metadata. So there is a question how to realize this. Suggesting
metadata improvements is the next important item - and a prerequisite for downloading
high-res images on demand.
Quick download of low resolution PDFs had low ratings. This is in contrast with personal feedback, and is possibly based on misunderstandings. Users desire a well readable text, independent of the image resolution. Experienced users criticized that text pages scanned by BHL libraries were often brown on tan (Smithsonian style, also Harvard, London, Missouri and the others), instead of black on white. The tan is not needed for understanding the text, neither is the brown colour of the letters. It would be possible to convert these pages into black-on-white (Bielefeld style). This would automatically bring a significant reduction in file size, and also much faster loading times in the online viewer.
Login functions ranked generally lower. Users prefer a powerful service
by default, not restricted to users who are logged in. Login has several disadvantages,
and many think that logins are tedious to manage. Occasional users tend to forget their
passwords. Even if the password is known, it takes time to log in and enter the
password. "Another account, another password", said one participant
in the freetext answers.
Many participants may also have feared that login is the first step for ceasing
the free service. But the results suggest that this was not so. The ratings
for bullet points 7A, 7K and 7L did not differ in the "skeptical users"
group. We analysed 223 persons who responded positively (radio button options
1 or 2) to Question 4C "I am skeptical that free access will be long
lasting". These had 99 credit points in 7A, 103 in 7K and 95 in 7L,
so no visible differences.
Total number (bullet point B): 1029
Frequent users total (bullet point B): 358
Questions 7 and 16: default portal language
Question 7: Help us prioritize developments for the future of the Biodiversity Heritage Library.
[Scale: very important, quite important, moderately important, slightly important, not at all important, no opinion, I don't understand]
Question 16: My language is that of the country where I am living/working.
All user groups ranked English default considerably higher than local language
default (except the 37 Spanish participants who ranked 111/109). German native
speakers ranked local language extremely low. English native speakers ranked
English-default slightly lower than non-English users.
No visible difference was spotted between frequent and occasional users.
Total number (Q7 bullet point G): 1039
13 % of the participants worked in countries where a different language was
official.
This applied less to Italian and English native speakers, and considerably more
to German, Spanish and French native speakers.
The results suggest that from all points of view and in all countries, English
should be the default language of the BHL portal.
Total number (Q16): 1113
Question 8: APIs
Question 8: Are you aware that BHL allows you to download all of the data on taxonomic names and book information (such as titles and authors) through the use of APIs (application programming interfaces) and other exportable formats?
APIs were only used by very few participants. Only 30 persons were recorded
who actually used APIs. 48 % responded that they did not know that BHL offers
APIs.
Frequent users knew slightly better that BHL offered APIs.
English native speakers understood better what APIs were and more frequently
knew that BHL offered APIs, but only 6 % (12 recorded persons) actually used
these APIs.
The ratings for A and B among Germans, French and Italians were below 10 %,
less than average (17.5 %).
53 % of the Italian, Spanish and French participants did not understand this
question, many more than average (35 %).
Total number: 1071
Frequent users total: 364
Questions 9 and 10: freetext answers
Question 9: What features do you find most helpful to use in other digital library websites such as Google Scholar, Google Book Search, Gallica, Botanicus, AnimalBase, etc.?
[Free text answer]
Question 10: Please provide any additional comments you may have about the Biodiversity Heritage Library.
[Free text answer]
These questions were mainly intended to detect new ideas and items we had
failed to ask about in the questions above.
We received several hundred answers. Most participants either praised our work,
or gave a comment that they were not able to give a comment here, or repeated
or refined/explained in more detail aspects or subjects raised in the above
questions.
The latter comments can be important to understand better the results of the
above questions.
Most concerned the search options. The problem with the scan quality
presentation was explained more explicitly: several users expressed that
they would prefer to read a textbook "black on white" instead of "brown
on tan".
New ideas that were brought up by several participants were restricted
to the following points:
Content: the need to fill gaps in serial runs, and the need to expand
the digitized contents to paleontological works.
Better access to articles of serials: metadata should be present for
articles, articles should be searchable/be returned in result sets.
Some other items that were brought up concerned functions that are already available
at BHL but that users just did not know about, which we identified primarily as a
problem of communication.
Question 11: user profile, profession
Question 11: In the context of your research needs, what best describes your profession?
[Check-boxes that allow respondents to select multiple options]
Most participants were bioscientists, either paid or unpaid (many professionals
added comments that they were retired, others worked full-time but were unpaid
due to the lack of funding, and did not fail to complain about that). Multiple
answers were possible in this checkbox question.
The proportion of unpaid amateur researchers was considerably higher among frequent
users (16 % vs. 21 %).
Teachers, librarians and students were moderately important groups. Students
were more important in the frequent users group.
Database providers, librarians and students had higher proportions among frequent
users.
All other target user groups participated in extremely low numbers (artists
and publishers had fewer than 10 persons).
Special figure to highlight the more generally interested readers:
Library staff was a special target user group largely restricted to
North America.
All languages: all users: 11 %, frequent users: 15 %
English natives: all users 19 %, frequent users 25 %
non-English natives: all users 3 %, frequent users 5 %
See also the library staff figure under Question 14.
Total number of answers: 1335
Frequent users: 431
Question 12: user profile, specialisation
Question 12: My special group of organisms:
[Select one]
This question was important to know which kind of literature should be digitized.
Of those who worked on special groups, most participants had only one special
group (97 % of the frequent users). Zoology had 54 %, botany 40 % (among the
frequent users). Other organisms (algae, lichen, fungi, bacteria) taken together
had ratings below 5 % of all users (bacteria only 0.2 %).
Botanists worked mostly on angiosperms (36 % of the frequent users).
Zoologists were specialized in insects (19 % of the frequent users), molluscs
(16 %), vertebrates (12 %), and others (8 %). The proportion of entomologists
was lower than could be expected from the number of species they have to
deal with (75-80 % of the animals). Most entomologists worked on Coleoptera,
but the other insect groups were also important, without exception.
It is possible that BHL is considerably weaker in providing insect literature
than literature for other animal groups. It is also possible that the importance
of pre-1900 literature is lower in insects than it is in vertebrates and molluscs.
Total number: 917
Frequent users total: 311
Question 13: user profile, disciplines
Question 13: My special interest is:
[Check-boxes that allow respondents to select multiple options]
Checkbox options allowed multiple answers. Figures for all users and frequent
users exclude the participants of the German survey (because the German survey
had a scrollbox instead of checkbox options).
The "Other, please specify" option was an open textbox, employed to
identify further potential user groups.
The participants were interested in various different fields of biodiversity
research.
More than 80 % were interested in taxonomy, systematics and nomenclature
(91 % of the frequent users). Many participants selected more than one option
(on average 3.0 options were selected, 2.8 by frequent users, 3.3 by German users
when they had checkbox options).
Next to taxonomy, participants were mainly interested in biogeography, morphology,
history of science, evolution and nature conservation, more rarely in paleontology
and molecular biology, only very few in physiology and other fields (informatics,
ethnobotany, bibliography, horticulture, archaeology/anthropology, developmental
biology and ethology - these disciplines would probably have yielded more responses,
had we explicitly given these as options).
We observed differences between various user groups.
Frequent users were much more interested in taxonomy (and nomenclature)
than occasional users (only 9 % of the frequent users were not interested in
taxonomy, 18 % of all users, 36 % in the 166 participants of the "I have
not used BHL before" group). Besides taxonomy, only in history of science
the proportion was higher in frequent users (26 %) than in the occasional users
group (20 % in all users).
In other words, BHL is also consulted by people interested in biogeography,
ecology, evolution and nature conservation, but visibly less frequently. Those
who are interested in taxonomy and history of science consult BHL more frequently
than the others.
Regional differences in fields of interest:
Germans, South Americans and Italians had a broader range of fields of interest
than average (and selected more checkboxes - this is why the average in each
line is not zero), North Americans selected fewer checkboxes than average in
this question.
North Americans were slightly less interested than average in biogeography,
ecology and nature conservation, slightly more in evolution and phylogeny.
South Americans were more interested than average in evolution, morphology
and biogeography.
Eastern central Europeans and Russians were much more interested in ecology,
also more than average in nature conservation and paleontology, and less than
average in history of science, taxonomy and evolution.
Germans (when they had checkbox options) were more interested than average
in biogeography, morphology, nature conservation and paleontology.
Italians had their special interests in biogeography, ecology, taxonomy
and physiology, and were much less interested than average BHL users in evolution
and phylogeny.
Primary field of interest:
In the German survey no checkbox answers were possible and participants
were forced to select only one item in a scrollbox. This allowed us to determine
the primary field of interest of these researchers.
Many participants (38, = 33 % of the 116 German participants) felt forced to
add other fields of interest in the free text box below, much more than in the
other surveys.
73 % of the Germans saw themselves primarily as being interested in taxonomy/nomenclature
(64 %) and morphology (9 %).
The third most important field was paleontology (8 %), which ranked much lower
in the overall picture above (3-5 %).
The proportions derived from the German survey were taken to recalculate the
other surveys, and to answer the question of what their primary field of
interest was. Among the frequent users group we would thus expect that 74 %
see their primary interest in taxonomy and nomenclature, 8 % would have their
primary interest in morphology, 5 % in evolution/phylogeny, 5 % in paleontology,
2 % in history of science, ecology and molecular biology, and only 1 % in biogeography
and nature conservation.
This suggests that we have four major independent groups among BHL users,
accounting for 93 % of the audience.
1 - Taxonomists (74 %)
2 - Researchers studying morphology, presumably species identification (8 %)
3 - Researchers studying evolutionary biology and phylogeny (5 %)
4 - Paleontologists, paleobotanists (5 %)
Total number of answers: 2212
Frequent users answers total: 731
Total number of users excluding German survey: 920
Total number of frequent users excl. German survey: 311
German users English and international survey: 58 persons, 191 answers
German survey participants: 116
Fields of interest and organism groups
Special analysis: distribution of disciplines (fields of interest)
among specialists of certain organism groups.
Five groups of specialists were selected for a closer analysis to know more
about the distribution of the fields of interest among bioscientists: fishes
(29 persons), birds (34 persons), molluscs (119 persons), coleopteran insects
(73 persons) and angiosperm plants (262 persons).
Table: proportions listed by specialists, "all" means all users average (from the above Q13 figure), in bold proportions recorded above the average values.
Taxonomy: all 83 %, insects 100 %, plants 89 %, molluscs 85 %, fishes 72 %, birds 62 %
Morphology: all 31 %, fishes 38 %, insects 26 %, molluscs 25 %, plants 23 %, birds 15 %
Biogeography: all 39 %, insects 45 %, fishes 38 %, birds 38 %, molluscs 38 %, plants 33 %
Ecology: all 29 %, birds 29 %, molluscs 29 %, insects 26 %, plants 22 %, fishes 21 %
Nature conservation: all 20 %, fishes 31 %, plants 21 %, birds 18 %, insects 16 %, molluscs 15 %
Evolution: all 26 %, molluscs 29 %, plants 25 %, fishes 21 %, insects 21 %, birds 18 %
Paleontol.: all 12 %, molluscs 27 %, fishes 10 %, birds 9 %, insects 7 %, plants 6 %
History of science: all 20 %, birds 24 %, plants 17 %, molluscs 16 %, fishes 10 %, insects 10 %
Verbal interpretation of these data:
Taxonomy: strongest in insects, above average in plants and molluscs,
much less in birds and fishes.
Morphology: above average in fishes, the others slightly below, very
weak in birds.
Biogeography: highest rating in insects, but not much lower in the others,
least in plants.
Ecology: most interesting for birds and molluscs, but not much less for
the others.
Nature conservation: most important in fishes, average in plants, slightly
less in the others.
Evolution and phylogeny: molluscs slightly above average, the others
slightly below.
Paleontology: most interesting and much above average in molluscs, much
less in the other groups. It would presumably also be high among specialists of dinosaurs and trilobites.
History of science: most important in birds, less interesting in the
others.
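The comparison against the all-users average can be reproduced mechanically. The following is a minimal sketch, not part of the original evaluation: the percentages are copied from the table above, and the names TABLE, deviations and above_average are hypothetical helpers introduced only for illustration.

```python
# Percentages copied from the table above; "all" is the all-users average.
TABLE = {
    "Taxonomy":            {"all": 83, "insects": 100, "plants": 89, "molluscs": 85, "fishes": 72, "birds": 62},
    "Morphology":          {"all": 31, "fishes": 38, "insects": 26, "molluscs": 25, "plants": 23, "birds": 15},
    "Biogeography":        {"all": 39, "insects": 45, "fishes": 38, "birds": 38, "molluscs": 38, "plants": 33},
    "Ecology":             {"all": 29, "birds": 29, "molluscs": 29, "insects": 26, "plants": 22, "fishes": 21},
    "Nature conservation": {"all": 20, "fishes": 31, "plants": 21, "birds": 18, "insects": 16, "molluscs": 15},
    "Evolution":           {"all": 26, "molluscs": 29, "plants": 25, "fishes": 21, "insects": 21, "birds": 18},
    "Paleontology":        {"all": 12, "molluscs": 27, "fishes": 10, "birds": 9, "insects": 7, "plants": 6},
    "History of science":  {"all": 20, "birds": 24, "plants": 17, "molluscs": 16, "fishes": 10, "insects": 10},
}


def deviations(table):
    """Per field, map each organism group to its difference from the all-users average."""
    return {
        field: {group: value - row["all"] for group, value in row.items() if group != "all"}
        for field, row in table.items()
    }


def above_average(table):
    """Per field, list the organism groups that lie strictly above the all-users average."""
    return {
        field: sorted(group for group, diff in devs.items() if diff > 0)
        for field, devs in deviations(table).items()
    }


if __name__ == "__main__":
    for field, groups in above_average(TABLE).items():
        print(f"{field}: {', '.join(groups) or 'none'}")
```

Note that in Ecology no group lies strictly above the average: birds and molluscs (29 %) are exactly equal to it, which matches the wording "not much less for the others".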
Fields of interest and professional status
Special analysis: distribution of special interests among professional
and amateur bioscientists. This analysis has the general problem that the
boundary between amateur/hobby scientists and professional scientists is poorly
defined. Retired scientists did not know how to classify themselves (considering
their skills they should have selected professional, but since they were unpaid
many selected amateur/hobby scientist).
Professional bioscientists (paid):
- Taxonomy: 85.9 %
- Morphology: 32.8 %
- Biogeography: 38.2 %
- Ecology: 26.3 %
- Nature conservation: 16.4 %
- Evolution and phylogeny: 29.4 %
- Molecular biology: 12.4 %
- Paleontology: 12.0 %
- History of science: 14.0 %
Amateur/hobby bioscientists (unpaid):
- Taxonomy: 90.7 %
- Morphology: 30.6 %
- Biogeography: 41.5 %
- Ecology: 30.6 %
- Nature conservation: 17.6 %
- Evolution and phylogeny: 16.1 %
- Paleontology: 13.0 %
- Molecular biology: 6.2 %
- History of science: 20.7 %
Major differences (> 5 %) are marked in bold. There were hardly any differences
between the two groups, except that professional scientists were more interested
in phylogeny/evolution and molecular analyses (presumably because they have
the funds for molecular studies), and that amateur scientists were even
more interested in taxonomy, systematics and nomenclature than the professionals.
History of science had a higher rating among amateur scientists, possibly because
many retired scientists classified themselves as amateurs.
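The "> 5 %" criterion can be applied directly to the two lists. This is a minimal sketch under the assumption that the criterion means an absolute difference of more than 5 percentage points; the names PROFESSIONAL, AMATEUR and major_differences are hypothetical and the figures are copied from the lists above.

```python
# Percentages copied from the two lists above.
PROFESSIONAL = {
    "Taxonomy": 85.9, "Morphology": 32.8, "Biogeography": 38.2,
    "Ecology": 26.3, "Nature conservation": 16.4,
    "Evolution and phylogeny": 29.4, "Molecular biology": 12.4,
    "Paleontology": 12.0, "History of science": 14.0,
}
AMATEUR = {
    "Taxonomy": 90.7, "Morphology": 30.6, "Biogeography": 41.5,
    "Ecology": 30.6, "Nature conservation": 17.6,
    "Evolution and phylogeny": 16.1, "Molecular biology": 6.2,
    "Paleontology": 13.0, "History of science": 20.7,
}


def major_differences(professional, amateur, threshold=5.0):
    """Return professional-minus-amateur differences (percentage points)
    whose absolute value exceeds the threshold."""
    result = {}
    for field, pro_value in professional.items():
        diff = round(pro_value - amateur[field], 1)  # round away float noise
        if abs(diff) > threshold:
            result[field] = diff
    return result


if __name__ == "__main__":
    for field, diff in major_differences(PROFESSIONAL, AMATEUR).items():
        print(f"{field}: {diff:+.1f}")
```

Under this reading three fields qualify: evolution and phylogeny, molecular biology, and history of science. The taxonomy difference (4.8 points in favour of amateurs) falls just below the threshold.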
Fields of interest and general interest users
Special analysis: distribution of special interests among general
interest readers and artists (29 persons) (all = all users for comparison).
- Taxonomy: all 83 %, gen. int. = 48 %
- Morphology: all 31 %, gen. int. = 28 %
- Physiology: all 4 %, gen. int. = 17 %
- Biogeography: all 39 %, gen. int. = 52 %
- Ecology: all 29 %, gen. int. = 41 %
- Nature conservation: all 20 %, gen. int. = 41 %
- Evolution and phylogeny: all 26 %, gen. int. = 24 %
- Paleontology: all 12 %, gen. int. = 24 %
- History of science: all 20 %, gen. int. = 40 %
The differences are striking (but note the low number of persons
who defined themselves as general interest readers in this survey). General
interest readers were much less interested in taxonomy/systematics/nomenclature than our average audience,
and much more in nature conservation, history of science, ecology, paleontology,
biogeography and physiology. The needs of this group would be met to a greater
extent if more modern literature containing information on ecology and conservation
status, and more paleontological literature, were provided.
Conclusions
Specialists of different groups of organisms use BHL for slightly different reasons. Those interested in birds use BHL more for information on ecology and the history of science; for those interested in fishes, information on the morphology of the species is very important; malacologists have a broader variety of interests; entomologists are almost all taxonomists with a strong interest in biogeography; and botanists are mainly interested in taxonomy and systematics.
If the data represent the community adequately, this would suggest, for example, that more ornithologists could be attracted if more modern literature with ecological information were available, that more ichthyologists could be attracted by including red data lists and other information on nature conservation, and that more entomologists could be attracted by more efficient analyses of the digitized literature in terms of geographical data.
Question 14: user profile, regions
Question 14: The region where I am working is:
[Select one]
Most participants came from North America and Europe. Europeans tended to consult BHL slightly more frequently than North Americans.
Within Europe, participants came from Britain and from various countries in central
and southern Europe (Germany, Netherlands, France, Spain and Italy). Less participation
was recorded from Scandinavia, eastern Europe, Greece and Turkey.
Most eastern Europeans were occasional users. The proportion of frequent users
was unusually high among participants from Spain and Italy.
We detected two main groups of users: bioscientists and librarians. They were unevenly distributed across the globe.
Librarians came mostly from North America, some from Europe and Australia, while bioscientists came from many different regions: many from Europe, one-third from North America, and also many from South America.
I found no explanation why the proportion of North Americans was so high in the library staff user group.
Total number: 892
Frequent users total: 310
Question 15: user profile, languages
Question 15: My native language is:
[Select one]
50 % of the participants were English native speakers, 50 % spoke other languages.
There were only weak differences between frequent and occasional users.
Among the most frequently recorded other languages were German (16 % all users,
15 % frequent users), French (7 and 9 %), Spanish (5 and 6 %), Italian (5 %),
Dutch (4 and 5 %) and Portuguese (3 %). Czech and Russian had 1-2 %, Scandinavian
languages together 1 %, Chinese 1 %, all others together 5 and 3 %.
We had anticipated such a distribution, so we translated the survey into all frequently spoken languages except Dutch, covering 86 % of the participants. Evidently the only language whose importance we underestimated was Dutch.
Total number: 1082
Frequent users total: 360
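The 86 % coverage figure can be checked by summing the all-users shares of the languages in which the survey was offered. This is a minimal sketch: the shares are the rounded values quoted above, the six language versions are those named in the survey history (English, German, Italian, French, Spanish and Portuguese), and the names ALL_USERS_SHARE, SURVEY_LANGUAGES and coverage are hypothetical.

```python
# Rounded all-users shares (%) quoted in the text above.
ALL_USERS_SHARE = {
    "English": 50, "German": 16, "French": 7, "Spanish": 5,
    "Italian": 5, "Dutch": 4, "Portuguese": 3,
}

# Language versions offered until 06 April, as named in the survey history.
SURVEY_LANGUAGES = ["English", "German", "Italian", "French", "Spanish", "Portuguese"]


def coverage(shares, languages):
    """Sum the shares of native speakers whose language was offered."""
    return sum(shares[language] for language in languages)


if __name__ == "__main__":
    print(f"Covered by a language version: {coverage(ALL_USERS_SHARE, SURVEY_LANGUAGES)} %")
    print(f"Dutch (not offered): {ALL_USERS_SHARE['Dutch']} %")
```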
Survey history: answers by date
Total number of answered surveys: 1877 (1563 until 06 April)
Total number of successfully answered surveys: 1063 (= 57 %) (759 until 06 April (= 49 %))
Due to a bug in the SurveyMonkey program, about 50 % of the answers given until 06 April were not recorded by the program and were lost. We did not find the cause of this bug, and SurveyMonkey support did not know the exact reason either. It was related to the different language versions and to the fact that some questions had to be skipped depending on the language choice. After 06 April we set up one single English survey with some multilanguage components and removed the skip options. This solved the problem: the success rate immediately rose to more than 90 %.
Total number of answered surveys after 06 April: 314
Total number of successfully answered surveys: 304 (Question 1) (= 97 %)
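The success rates follow directly from the counts given above. This is a minimal sketch of the arithmetic; the function name success_rate is hypothetical, and rounding to whole percent is an assumption that matches the figures quoted in the text.

```python
def success_rate(successful, answered):
    """Share of answered surveys that were recorded successfully, in whole percent."""
    return round(100 * successful / answered)


# Counts quoted in the text above.
ANSWERED_TOTAL, SUCCESSFUL_TOTAL = 1877, 1063
ANSWERED_UNTIL_06_APRIL, SUCCESSFUL_UNTIL_06_APRIL = 1563, 759

# The figures after 06 April are the differences of the two counts.
answered_after = ANSWERED_TOTAL - ANSWERED_UNTIL_06_APRIL        # 314
successful_after = SUCCESSFUL_TOTAL - SUCCESSFUL_UNTIL_06_APRIL  # 304

if __name__ == "__main__":
    print("overall:", success_rate(SUCCESSFUL_TOTAL, ANSWERED_TOTAL), "%")
    print("until 06 April:", success_rate(SUCCESSFUL_UNTIL_06_APRIL, ANSWERED_UNTIL_06_APRIL), "%")
    print("after 06 April:", success_rate(successful_after, answered_after), "%")
```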
We analysed the survey to find out whether, and
to what extent, non-English participants had more difficulties in understanding
the questions of the survey.
Each bullet point of questions 2, 3, 4, 5, 6 and 7 had the option to select
"I don't know/I don't understand". (We should perhaps have
inserted somewhere a bullet point that made no sense at all, as a negative
calibration to see how many participants were actually willing to admit that they did not understand.)
34.5 % selected option D in Question 8, "I don't understand/I don't know
what APIs are". Higher rates were recorded for this option among
Italian, Spanish and French participants, but the option also depended on knowledge
of APIs and did not exclusively reflect language.
The highest levels of non-understanding were recorded for bullet
points 3J (data mining, 33 % of all participants), 4E (full text
not searchable in online viewer, 25 %), 3I (create-my-own-PDF, 23 %),
3G (download OCR file, 23 %), 3C (taxon name finding functionality,
15 %), 3H (download high resolution images, 14 %) and 3F (download
PDF, 11 %); all others were below 10 %. Surprisingly, the equivalent bullet
points in Question 7 did not obtain high rates for "I don't know/I don't understand"
(the worst understood in this question was bullet point 7F (tagging BHL content
with keywords like Flickr, 6 %)).
Bullet points 2B-2G, 6A and 7G obtained below 1 % and were best understood.
Average levels of non-understanding by question were as follows:
Question 2 (2 %), Question 3 (14 %), Question 4 (3 %, excluding 4E), Question
5 (2 %), Question 6 (2 %), Question 7 (2 %). Question 3 ("how satisfied
are you with BHL features") was the worst understood.
Levels of understanding of the English survey were analysed by native language of the participants. These analyses showed only very weak effects; hardly any difference was detected between English and non-English native speakers (negative values indicate that the level of understanding was lower, i.e. the proportion of persons who selected "I don't understand" was higher):
Difference English native speakers against average: -0.1 %
Difference Eastern Europeans and Russians against average: -2.8 %
Difference German native speakers against average: +2.6 %
Difference French native speakers against average: -3.5 %
This suggests that French and Eastern Europeans had slightly more difficulties in understanding the English questions, that English native speakers did not understand the bullet points better than non-English natives, and that - surprisingly - the Germans understood the questions better than the others.
The survey was modified on 06 April: until 06 April six language versions had been offered (English, German, Italian, French, Spanish and Portuguese), after 06 April all bullet points were available exclusively in English. The French and German responses were analysed for differences before and after the change.
Germans: German survey (113 persons) against English/international survey (35 persons): +4.0 %
French: French survey (17 persons) against English/international survey (50 persons): -8.0 %
This suggests that 8 % of the French did not understand the questions if these were asked in English (but note that the numbers for the French survey were low). Surprisingly, the Germans seemed to have understood the questions better if they were asked in English instead of in German, whatever this implies.
The general conclusion is that the language skills of participants were not a significantly limiting factor for understanding the survey. German and French native speakers enjoyed the opportunity to use the German and French surveys, but this did not mean that they understood the questions better than if they had been asked in English.
Main final conclusions
1 - The search function must be improved (and we have a precise guide how).
2 - The set of results must be refined, metadata must be improved.
3 - The default language of the portal should be English in all countries.
4 - The online viewer is important and must be improved.
5 - The main target user group is taxonomists (55 % zoologists, 40 % botanists).
6 - Scientific users come from Europe (45 %), North America (35 %), South America (8 %) and Australia (5 %).
7 - For attracting new user groups it is indispensable to scan more recent literature, published after 1920.
8 - The Google search engine is an important referrer.
9 - Google books and Google scholar are important competitors.
Comparison with the previous BHL-Europe survey (Oct-Nov 2009)
- The main results and conclusions were confirmed.
- The BHL-Europe survey gave us slightly more detailed instructions (search
functions, website design) that can be used by our technicians. In view
of the present results the BHL-Europe summary section can serve as a good guideline;
it is up to date.
- Some issues raised in the BHL-Europe survey have already been improved in
the meantime (for example speed).
Outlines for the next survey
1 - We can repeat some questions and compare responses, to see whether any
improvements have been acknowledged.
2 - We will be able to see whether new target user groups have been attracted.
3 - We would not need to set up several different language versions of the survey,
but it would be convenient to allow freetext responses in various languages.
4 - To learn more about our potential to attract general interest readers, and
about our limits, we might ask from which time period users
currently use material digitized by BHL, and from which time period they would
need material.
Data compiled by Francisco Welter-Schultes, in collaboration with Bianca Libscomb, mainly for the presentation at the Vienna meeting 26 May 2010
Links to AnimalBase contributions for Meeting Berlin May 2009 (distribution of languages in early zoological literature)
Meeting Leiden Aug 2009 (comparisons of viewers, proposals for improving the portal)