Deliverable 4.4 Toolset

Report on Risk, Benefit, Impact and Value.

The Portuguese Web Archive

- See associated risks in HoliRisk.

The Portuguese Web Archive preserves, for future access, information published on the web that is of interest to the Portuguese community. It also provides research resources - for instance in the fields of History, Sociology or Linguistics - and preserves information from the past that is no longer available on the Internet. By creating a system that supports regular crawls of the Portuguese web and their long-term storage and access, it is intended to provide the following services:

As a bonus, the Portuguese Web Archive also strives to achieve the following goals:

Key Partnerships

Key Activities

Value Propositions

Customer Relationships

Customer Segments

Key Resources

Channels

Cost Structure

Revenue Streams

ABC

Training

A - Related to the first Value Proposition (Long-term preservation of AIP)
B - Related to the third Value Proposition (Access to Preserved Information)
C - Related to the second Value Proposition (Resource discovery)


Key Activities

Preservation Planning

"The OAIS functional entity which provides the services and functions for monitoring the environment of the OAIS and which provides recommendations and preservation plans to ensure that the information stored in the OAIS remains accessible to, and understandable by, and sufficiently usable by, the Designated Community over the Long Term, even if the original computing environment becomes obsolete." ([1], Page 1-14)

SIP Ingestion and AIP Generation

The Submission Information Package (SIP) is "an Information Package that is delivered by the Producer to the OAIS for use in the construction or update of one or more AIPs and/or the associated Descriptive Information." ([1], Page 1-15) The Archival Information Package (AIP) is an Information Package, consisting of the Content Information and the associated Preservation Description Information, which is preserved within an OAIS. ([1], Page 1-9) The Ingest Functional Entity is "the OAIS functional entity that contains the services and functions that accept SIPs from Producers, prepares AIPs for storage, and ensures that AIPs and their supporting Descriptive Information become established within the OAIS." ([1], Page 1-12)
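The relationship between a SIP and the AIP built from it can be illustrated with a minimal Python sketch. The class names, field names and the choice of SHA-256 for fixity are assumptions made for illustration only; they are not taken from the OAIS reference model or from the PWA implementation.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class SIP:
    """Hypothetical Submission Information Package delivered by a Producer."""
    producer: str
    url: str
    content: bytes


@dataclass
class AIP:
    """Hypothetical Archival Information Package: content plus its PDI."""
    content: bytes
    pdi: dict = field(default_factory=dict)  # Preservation Description Information


def ingest(sip: SIP) -> AIP:
    """Accept a SIP and prepare an AIP for storage (illustrative only)."""
    if not sip.content:
        raise ValueError("empty SIP rejected")
    pdi = {
        "provenance": sip.producer,  # who produced the content
        "reference": sip.url,        # how the content is identified
        "fixity_sha256": hashlib.sha256(sip.content).hexdigest(),
        "ingested_at": datetime.now(timezone.utc).isoformat(),
    }
    return AIP(content=sip.content, pdi=pdi)
```

The fixity value recorded at ingest is what later allows the archive to check that the stored content has not changed.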

DIP Dissemination

A Dissemination Information Package (DIP) is "an Information Package, derived from one or more AIPs, and sent by Archives to the Consumer in response to a request to the OAIS." ([1], Page 1-11)

Archive Administration

The Administration Functional Entity "provides the services and functions for the overall operation of the Archive system. Administration functions include soliciting and negotiating submission agreements with Producers, auditing submissions to ensure that they meet Archive standards, and maintaining configuration management of system hardware and software. It also provides system engineering functions to monitor and improve Archive operations, and to inventory, report on, and migrate/update the contents of the Archive. It is also responsible for establishing and maintaining Archive standards and policies, providing customer support, and activating stored requests." ([1], Page 4-2)

Data Management

The Data Management Functional Entity "provides the services and functions for populating, maintaining, and accessing both Descriptive Information which identifies and documents Archive holdings and administrative data used to manage the Archive. Data Management functions include administering the Archive database functions (maintaining schema and view definitions, and referential integrity), performing database updates (loading new descriptive information or Archive administrative data), performing queries on the data management data to generate query responses, and producing reports from these query responses." ([1], Page 4-2)

AIP Storage

The Archival Storage Functional Entity "provides the services and functions for the storage, maintenance and retrieval of AIPs. Archival Storage functions include receiving AIPs from Ingest and adding them to permanent storage, managing the storage hierarchy, refreshing the media on which Archive holdings are stored, performing routine and special error checking, providing disaster recovery capabilities, and providing AIPs to Access to fulfil orders." ([1], Page 4-2)
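One common way to implement the routine error checking mentioned in this definition is a fixity check: recompute a checksum for each stored AIP and compare it with the value recorded at ingest. The sketch below assumes an in-memory dictionary as the store and SHA-256 as the checksum; both are illustrative choices, not a description of the PWA storage layer.

```python
import hashlib


def find_corrupted(store: dict, checksums: dict) -> list:
    """Return the identifiers of AIPs whose bytes no longer match their recorded SHA-256.

    `store` maps a hypothetical AIP identifier to its bytes;
    `checksums` maps the same identifier to the hex digest recorded at ingest.
    An AIP with no recorded checksum is reported as well.
    """
    return sorted(
        aip_id
        for aip_id, data in store.items()
        if hashlib.sha256(data).hexdigest() != checksums.get(aip_id)
    )
```

Any identifier returned by such a check would be a candidate for repair from a backup copy.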



Key Partnerships

Software Providers

Software providers develop and maintain software that takes the organization's requirements into consideration. These software products support the value propositions of the organization.

Hardware Providers

Hardware providers sell and maintain the hardware deployed in the organization to support the value propositions of the organization.



Value Propositions

Supply Evidence in Court

One specific value proposition of the PWA is to provide evidence in court: the archive holds a collection of Portuguese web sites captured at different points in time, which might be used as evidence.

Long-term preservation of AIP

"Long Term Preservation is the act of maintaining information, Independently Understandable by a Designated Community, and with evidence supporting its Authenticity, over the Long Term." ([1], Page 1-13) "Long Term may extend indefinitely. In the OAIS reference model there is a particular focus on digital information, both as the primary forms of information held and as supporting information for both digitally and physically archived materials." ([1], Page 1-1)

Resource Discovery

"The access functional entity contains the services and functions which make the archival information holdings and related services visible to Consumers." ([1], Page 1-8) It also provides the services and functions that support Consumers in determining the existence, description, location and availability of information stored in the OAIS. ([1], Page 4-2)

Access to Preserved Information

"Allows Consumers to request and receive information from the archive. Access functions include communicating with Consumers to receive requests, applying controls to limit access to specially protected information, coordinating the execution of requests to successful completion, generating responses (Dissemination Information Packages, query responses, reports) and delivering the responses to Consumers." ([1], Page 4-3)



Customer Relationships

Submission agreement

"The agreement reached between an OAIS and the Producer that specifies a data model, and any other arrangements needed, for the Data Submission Session. This data model identifies format/contents and the logical constructs used by the Producer and how they are represented on each media delivery or in a telecommunication session." ([1], Page 1-15)

Order agreement

"An agreement between the Archive and the Consumer in which the physical details of the delivery, such as media type and format of Data, are specified." ([1], Page 1-13)

Event or Adhoc DIP Dissemination Session

"A delivery of media or a single telecommunications session that provides Data to a Consumer. The Data Dissemination Session format/contents is based on a data model negotiated between the OAIS and the Consumer in the request agreement. This data model identifies the logical constructs used by the OAIS and how they are represented on each media delivery or in the telecommunication session." ([1], Page 1-10) A DIP Dissemination Session can either be Event based or Adhoc. In case it is event based it is "a request that is generated by a Consumer for information that is to be delivered periodically on the basis of some event or events." ([1], Page 1-11) If it is adhoc it means that there is "a request that is generated by a Consumer for information the OAIS has indicated is currently available." ([1], Page 1-9)

Search session

"A session initiated by the Consumer with the Archive during which the Consumer will use the Archive Finding Aids to identify and investigate potential holdings of interest." ([1], Page 1-15)

SIP Submission Session

"A delivery of media or a single telecommunications session that provides Data to an OAIS. The Data Submission Session format/contents is based on a data model negotiated between the OAIS and the Producer in the Submission Agreement. This data model identifies the logical constructs used by the Producer and how they are represented on each media delivery or in the telecommunication session." ([1], Page 1-11)



Key Resources

Preserved AIP

An Archival Information Package is "an Information Package, consisting of the Content Information and the associated Preservation Description Information (PDI), which is preserved within an OAIS." ([1], Page 1-9)

Descriptive Information

"The set of information, consisting primarily of Package Descriptions, which is provided to Data Management to support the finding, ordering, and retrieving of OAIS information holdings by Consumers". ([1], Page 1-11)

Archiving Infrastructure

The Archiving infrastructure contains the services and functions for the ingestion, storage and retrieval of AIPs. ([1], Page 1-9)



Channels

Query Service

The Query service allows consumers to perform queries on the holdings of the archive, to locate, analyse, order or retrieve potential information of interest. ([1], Page 1-15 and Page 1-8)

Order Service

A service (Ordering Aid) that assists the Consumer in discovering the cost of, and in ordering, AIPs of interest. ([1], Page 1-13)

Submission Service

The submission service supports the SIP ingestion. Producers submit SIPs through this service and receive receipt confirmations when the SIP is correctly ingested into the Archive. ([1], Page 4-5)
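A minimal sketch of the submit-and-confirm exchange described above is given here, assuming an in-memory store and an identifier derived from a SHA-256 digest of the payload. The `SubmissionService` and `Receipt` names are invented for illustration and do not correspond to an actual PWA interface.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Receipt:
    """Hypothetical receipt confirmation returned to the Producer."""
    sip_id: str
    accepted: bool
    message: str


class SubmissionService:
    """Illustrative submission endpoint backed by an in-memory dictionary."""

    def __init__(self) -> None:
        self._ingested = {}

    def submit(self, producer: str, payload: bytes) -> Receipt:
        # Derive a short identifier from the payload (an assumption of this sketch).
        sip_id = hashlib.sha256(payload).hexdigest()[:16]
        if not payload:
            return Receipt(sip_id, False, "empty payload")
        if sip_id in self._ingested:
            return Receipt(sip_id, False, "duplicate submission")
        self._ingested[sip_id] = (producer, payload)
        return Receipt(sip_id, True, "ingested")
```

A Producer would treat the SIP as ingested only after receiving a receipt whose `accepted` flag is true.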



Customer Segments

Designated Community

Universities and public organizations are the main consumers of the archive and, as such, have specific requirements for it.

Consumers

Universities and public organizations use the information in the archive for various purposes. Public organizations, such as courts, can use it as evidence in court cases and prosecutions. Universities can use it for research in fields such as sociology or information technology.

Producers

Web site owners are the producers of the information that is ingested into the archive.



Cost Structure

Archive Development

Development is one of the costs of the archive. It covers the initial development of the archive, the new functions and requirements raised by the designated community, and the adoption of new technology when it is needed.

Archive Maintenance

Maintenance is another cost of the archive. It is performed by the software and hardware providers as well as by the archive staff. Periodic procedures are needed to ensure that the archive runs smoothly and that its information remains relevant to the designated community.

Wages and Salaries

The wages and salaries of the staff who support the operation of the archive are another cost. The team includes staff with different qualifications, and the resulting differences in wages and salaries are also taken into consideration.

Storage and Backup

Storage and backups are among the main costs of the archive. Because storage is not outsourced, the hardware that stores the archive's information must be checked and maintained periodically.



Revenue Streams

Public Funding

Most of the archive's revenue comes from public funding. Because the Foundation for National Scientific Computing (FCCN), into which PWA is incorporated, is a public organization, a budget is allocated to PWA as part of the annual government budget.

Training

PWA offers training of human resources in the field of web archiving to enable the maintenance of the archive in the future.



References

[1] The Consultative Committee for Space Data Systems, Space data and information transfer systems - Open archival information system - Reference model - Magenta Book. June 2012. CCSDS 650.0-M-2.



