
The open archives, objectives and issues

From Stellar Deliverable 6.1


1 Historical outlines

“Open access” is a concept that has been dramatically developed and strengthened by the Internet and the deployment of high-speed communication networks. Sharing results and scientific information has been a landmark of scientific research throughout its history. It is therefore no surprise that, when means and willingness converged, open access became a reality to an extent never experienced before. Open access materialized in the form of the so-called Open Archives (OA). We briefly describe the milestones of this development, but we first start with a definition of the concept of Open Access in science as it was proposed at the beginning of this century:

“By ‘open access’ to […] literature, we mean its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. The only constraint on reproduction and distribution, and the only role for copyright in this domain, should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited”. (Budapest declaration for an open access initiative, 2002)

This quotation captures well the expectations of researchers, but also the origin of their fears, which have proven to be the main obstacle to the adoption of open access. Among these fears, those that seem to count most concern the protection of intellectual property rights and the use made of the work once it is available. These fears have been taken into account and answered quite clearly (e.g. by the Open Access Now campaign by BioMed Central).

The origin of the modern open access concept in science goes back to 1991, with the creation of ArXiv by Paul Ginsparg in Los Alamos. This archive grew rapidly and became a reference for researchers in physics, the first discipline to benefit from this service. In 1997, the creation of CogPrints by Stevan Harnad, in the domain of cognitive science, confirmed for another discipline the relevance of this approach and the interest of researchers. The movement has never ceased to develop since, with growing support from scientific organizations, research institutions and a significant part of the scientific community.

There is today a large number of Open Archive repositories, either institutional or disciplinary. The Open Archives Initiative (OAI) registers hundreds of repositories and services all around the world, covering most disciplines. Common standards for tagging (Dublin Core) and harvesting (OAI-PMH) repositories have been created and widely adopted; they allow flexible access to the resources, making open access a fully global concept. This development, which came first from researchers, is now supported by decision makers, with explicit support from the European institutions. There is a clear trend among national research funding agencies and European framework programs to make open access to publications a compulsory condition for research grants. Several declarations bear witness to this, such as the Berlin declaration (October 2003) or the recent statements from the EC DG Research to ensure current and future access to research and innovation. Initiatives like the Driver project have been set up to create a production-quality infrastructure providing advanced services on sources “virtually organized and structured according to the needs of the specific audience of the users communities”.
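To make the role of the two standards concrete, here is a minimal sketch of how a harvester might build an OAI-PMH ListRecords request and read Dublin Core fields from the response. The base URL, the sample record and the helper names (list_records_url, parse_records) are hypothetical and only serve the illustration; the sketch parses an embedded sample rather than contacting a real repository.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces used by OAI-PMH 2.0 responses carrying Dublin Core metadata.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for a repository's OAI-PMH endpoint."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


# Invented sample response; a real harvester would download this document.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1</identifier>
        <datestamp>2009-01-15</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An invented example title</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
          <dc:date>2009</dc:date>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def parse_records(xml_text):
    """Extract identifier, titles and creators from each harvested record."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext(
                "oai:header/oai:identifier", default="", namespaces=NS),
            "titles": [e.text for e in rec.findall(".//dc:title", NS)],
            "creators": [e.text for e in rec.findall(".//dc:creator", NS)],
        })
    return records


if __name__ == "__main__":
    # Hypothetical endpoint, used only to show the shape of the request.
    print(list_records_url("https://repository.example.org/oai"))
    for record in parse_records(SAMPLE_RESPONSE):
        print(record)
```

Because every compliant repository answers the same request with the same record structure, one harvester of this kind can in principle aggregate metadata from many archives, which is what makes services across repositories possible.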

The OAI movement started in physics and developed quickly in mathematics, computer science and the natural sciences. The Human and Social Sciences (HSS) were slower to adopt this type of communication, possibly because the obstacles met in other fields are amplified there; in this sector it is more difficult to manage, for example, issues of quality and trust (plagiarism, authenticity of a resource, etc.). The production and use of scientific information is often more fragmented in HSS because of the variety of approaches to the same question, and because the construction of the discourse participates in the construction of knowledge as well as in its dissemination. The result is a larger number of editorial initiatives, small and regional conferences, and a difficulty in internationalizing the research. One of the first fully trans-disciplinary OAs, the Hyper-Article on-Line (HAL) by the French CNRS, confirms that HSS repositories grow more slowly than those of other domains, although this is rapidly evolving. Moreover, the analysis of the behavior of the initial TeLearn Open Archive shows that researchers in TEL, although often educated as HSS scientists or working closely with HSS, have developed a relationship to publication and to the archives close to that of researchers in technology. Several initiatives have emerged to modify the perception of Open Access in HSS; this is the case, for example, of the OAPEN project, which intends to improve the visibility of European research in the Humanities and Social Sciences while maintaining the quality control ensured by peer-reviewed publications. Open access journals in fact appeared very soon after the OAs developed, but the absence of classical publishers still created some doubt about their reliability and quality.
Beyond the creation of repositories, the practice of researchers is questioned as never before by the “complication” of public availability, which concretely makes each reader a potential challenger of the ideas or results presented; an issue possibly more sensitive in the case of education.

The most recent evolution is a shift of focus from archiving, tagging and retrieving documents to offering, on top of the repository, a set of tools supporting the creation of groups of researchers who share interests and are ready to engage in collaborative research. “Science 2.0” and “iLab” are the new buzzwords that have emerged, symptomatically, to mark this evolution. The most common tools are those characteristic of Web 2.0 technology: blogs, forums, wikis, webinars, repositories, forges, etc. The organization of these communities stimulates the emergence of bottom-up peer reviewing. Readers can participate in discussions about the relevance and quality of a result or a statement in a more transparent way than in the non-digital world, a practice which is complementary, not contradictory, to blind reviews.

High-speed networks, a huge increase in storage capacity and progress in video technology make it possible to store and share video recordings of talks and conferences. This form of communication, which in the TEL sector is very common, even more so than journal publication, can be accessed by a much larger public than the one which could previously attend conferences. Keynotes, which often remain unpublished, and expert panels, with their specific dynamics of argumentation, are made easily and widely accessible. This indeed changes the situation of scientific communication. Moreover, with Web 2.0 tools it is possible to engage in real discussions, unlike at conferences, where it is hardly possible to ask questions or develop an argument. A space can be created around talks by gathering printed material and other evidence supporting the ideas and arguments presented, and by establishing links with other users sharing the same interest.

2 Issues and frequent concerns

Some of the issues about OA are of a technical nature. They essentially have to do with the robustness of the repositories and the capacity to maintain the resources over time. These issues will not be addressed here; it suffices to know that they are looked after by the technical teams and engineers of the institutions which provide such repositories (in most cases universities, and some NGOs). Once guarantees have been given about the sustainability of the OA, the main remaining problems concern the understanding researchers have of their duties and rights, and their sense of ownership. We focus the rest of this section on these questions. Although the scientific community has long known about open access and open archives, many issues and threats prevent researchers, especially in the human and social sciences, from making use of OAs. At present, only about 15% of scientific publications are accessible in open archives, and they are essentially from physics, mathematics, chemistry, biology and computer science. Researchers in the social and human sciences are especially concerned about plagiarism, IPR and copyright. There are no definitive answers to these questions, especially about copyright, which is regulated differently in different countries and depends on publishers’ policies. Several initiatives have been taken to inform and support authors: “e-prints in library and information science” (http://eprints.rclis.org/copyright/) and the Sherpa-Romeo information about “Publishers copyright policies and self-archiving” (http://www.sherpa.ac.uk/romeo/). The current consensus can be represented by the following quote from the LIS site:

“The right to self-archive the refereed postprint is a legal matter because the copyright transfer agreement pertains to that text. But the pre-refereed preprint is self-archived at a time when no copyright transfer agreement exists and so the author holds exclusive and full copyright. In general, when you publish in a journal you transfer copyright to the publisher. Most journals permit self-archiving, but it depends on the publisher's copyright policy.”

Open archiving of pre-prints is not seriously questioned, and in mathematics and physics uploading pre-prints to open archives is very frequent, if not general. Still, there is another threat, which is plagiarism. Against this threat, the best solution is to help authors realize that the more widely their publications are disseminated, the better they are protected.

Institutional archiving ensures that the material uploaded is clearly identified with all the basic data, including the date of the publication. This is an efficient way to fix the ownership, date, and content of a publication. Moreover, the means and the computational power now exist to verify the similarities between the content of papers far more efficiently than was possible until now. Computational techniques like latent semantic analysis (LSA) allow the clustering of documents based on the analysis of their content and the investigation of the proximity of their content. (Excerpt from TeLearn FAQ, telearn.org)
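The idea of measuring the proximity of document content can be illustrated with a deliberately simpler technique than LSA: cosine similarity between raw term-frequency vectors. This is a stand-in chosen because it needs only the standard library; LSA would additionally apply a truncated singular value decomposition to the term-document matrix before comparing documents. The function names and the sample texts are invented for the illustration.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Count lowercase alphabetic tokens; a crude bag-of-words representation."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two term-frequency vectors (0.0 to 1.0)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Invented sample texts, not taken from any archive.
    doc_a = "open archives support open access to research publications"
    doc_b = "open access archives make research publications available"
    doc_c = "video recordings of conference keynotes"
    va, vb, vc = term_vector(doc_a), term_vector(doc_b), term_vector(doc_c)
    print("a vs b:", cosine_similarity(va, vb))
    print("a vs c:", cosine_similarity(va, vc))
```

A score close to 1 between two deposited papers would only flag them for closer human inspection; word overlap alone does not establish plagiarism.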

When initiating an open archive, one discovers that researchers are essentially not aware of their rights and ownership. All sorts of confusion are possible about the relations between the rights of the authors, those of the institutions and those of a publisher. It is therefore necessary to accompany the creation of an open archive with information and training about these rights, taking into account the specificities of the domain and of the local law (e.g. the rights of the employee).

A few years ago, no OA specific to TEL research existed. Researchers used either the archive of their parent discipline or that of their institution. Indeed, we must note that TEL was not present in the seminal ArXiv, either as such or as a sub-domain. This situation changed when the EC created the NoEs as new instruments to structure scientific communities, creating the incentive for common infrastructures. Kaleidoscope, an FP6 NoE, created in 2006 the first OA dedicated to TEL: TeLearn. We present this OA below and give a view of its evolution and use.