Provide a timeline for implementation elaborating or modifying the one proposed in Staging.
The following timeline for implementation is copied and adapted from the RFA. We have accelerated it to reflect what is already completed and in production (http://osf.io/preprints/), moving those milestones to Stage 0.
Stage 0 = [Accelerated from ASAPbio’s 6-month milestone
to 0-month milestone] Milestones to be completed on the project
start date or whatever date the ASAPbio Governing Board (GB) is ready for launch.
- Launch of The Commons, populated by legacy preprints from approved
sources in their original formats (.pdf, .doc, .html, .xml etc.).
- New manuscripts can enter the database by The Commons ingest service.
- Ingest service collects required metadata (what metadata is required
can be updated following determination by the Governing Board).
- The ingest service collects all associated files (supplementary
figures, data, movies, etc.) and assigns DOIs.
- The Commons offers a web interface for browsing and searching
manuscripts.
- The Commons renders ingested preprints in the browser, supporting a
wide variety of original formats.
- Usage metrics are shared openly.
- [Accelerated from 18 month milestone] API for machine
access to preprint metadata is available.
- Service can refer authors to appropriate external repositories if file
size limits are exceeded.
- [Additional feature] Authors can integrate an external
repository with an ingested preprint, choosing from a variety of
repositories (e.g., figshare, Dataverse).
- [Additional feature] Full integration of co-author accounts and
permissions with the ingestion service, including adding unregistered
co-authors and letting unregistered authors claim their accounts.
- [Additional feature] Authors can update preprints to newer versions.
- [Additional feature] Login via institutional credentials
(150+ institutions at start date).
- [Additional feature] Sharing of preprints via social media
is facilitated.
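
As an illustration of the ingest milestone above, in which the set of required metadata can be updated following a Governing Board determination, the following is a minimal sketch and not a description of the production service. The field names (`title`, `authors`, `abstract`, `license`) and the `IngestPolicy` class are hypothetical, chosen only for the example.

```python
from dataclasses import dataclass

# Hypothetical defaults; the real list would be set by the Governing Board.
DEFAULT_REQUIRED_FIELDS = ("title", "authors", "abstract")


@dataclass
class IngestPolicy:
    """Holds the currently required metadata fields for new submissions."""

    required_fields: tuple = DEFAULT_REQUIRED_FIELDS

    def missing_fields(self, metadata: dict) -> list:
        """Return required fields that are absent or empty in a submission."""
        return [name for name in self.required_fields if not metadata.get(name)]


policy = IngestPolicy()
submission = {"title": "Example preprint", "authors": ["A. Author"]}
print(policy.missing_fields(submission))  # ['abstract']

# A later board decision adds a field without changing the validation code.
updated_policy = IngestPolicy(required_fields=DEFAULT_REQUIRED_FIELDS + ("license",))
print(updated_policy.missing_fields(submission))  # ['abstract', 'license']
```

Keeping the required fields as data rather than hard-coding them in the validation logic is one way to let the requirement change without redeploying the ingest service.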
Stage 1 = Milestones to be completed within 6 months of project start
- Indexing of preprint files from other services for full-text search.
- Commons search results display text that has been extracted from PDFs
to show search terms in context. If preprints are displayed, they can
be displayed as PDFs. All pages are tagged with schema.org meta tags
to ensure that content is discoverable.
- Ingestion sources are requested to provide all manuscripts regardless
of whether they passed screening. While rejected manuscripts will not
be displayed to users, they will later be used, possibly in anonymized
form, to train screening systems.
- All preprints are accessible by bulk download.
- Researchers are able to sign up for alert services to be notified by
email (or RSS or Atom feed or other notification system) of new
preprints relating to search terms of interest.
- [Additional feature] Initial release of conversion and
editing system.
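
The Stage 1 milestone of showing search terms in context can be pictured with a short sketch. It assumes the text has already been extracted from the PDF into a plain string; the function name `snippet_in_context` and the `width` parameter are hypothetical and do not refer to any existing Commons code.

```python
import re
from typing import Optional


def snippet_in_context(text: str, term: str, width: int = 40) -> Optional[str]:
    """Return the first match of `term` with up to `width` characters of
    surrounding text on each side, or None if the term does not occur."""
    match = re.search(re.escape(term), text, flags=re.IGNORECASE)
    if match is None:
        return None
    start = max(match.start() - width, 0)
    end = min(match.end() + width, len(text))
    # Mark truncation so readers can tell the snippet is an excerpt.
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return f"{prefix}{text[start:end]}{suffix}"


extracted = "Preprints accelerate the sharing of research results."
print(snippet_in_context(extracted, "Sharing", width=5))
```

A production search service would more likely rely on the highlighting features of its search index; the sketch only shows the intended behavior for a single document.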
Stage 2 = Milestones to be completed 18 months after project start
- Preprint conversion tool is operating. All newly received preprints
are converted to XML. Older preprints for which a compatible authors’
file is available will be converted as well.
- API for machine access to full-text preprints is available.
Stage 3 = Milestones to be completed 3 years after project start
- Automated manuscript screening system is available. Trained on the
corpus of both accepted and rejected manuscripts submitted to the
Preprints Commons, this tool will flag manuscripts that are likely to
require careful human screening. It will be easy for ingestion sources
to use, either locally or by API calls.