Digitization of archival documents: stages of digitizing paper archives

I.E. Khvorova

The process of digitizing documents to create an electronic archive

The article is devoted to digitization, the most convenient mechanism for preserving and using information in modern conditions. The author considers the main aspects of digitizing archival materials, including an analysis of the possible ways of converting documents into electronic form, the categories of documents to be digitized, storage formats for electronic documents, and contemporary digitization standards. The article also analyzes the conditions needed to create a virtual archive of a historical person or event. The author pays special attention to how such an electronic resource is organized, examines existing projects of this kind, and analyzes the difficulties a researcher may encounter when creating an electronic archive.

Key words: digitization, archival document, electronic fund of use, virtual archive.

The realities of modern life increasingly require us to make active use of new information technologies. The information age also brings its own changes to the dialogue between archival sources and the researcher. Modern digital technologies make it possible to implement even the most ambitious ideas for an alternative form of storage: keeping documents in a virtual space.

Digitizing archival documents is by far the most convenient tool for storing and using information. It enables safer and more economical storage, easy retrieval and use of information, and quick access to archive materials. The process of digitizing documents is based on the idea of forming a single electronic fund of use (hereinafter EFP). An EFP is a set of electronic copies of archival documents that are recorded on digital media and are intended to be used instead of the original documents [1]. When an EFP is created, it is important that the process of its creation be regulated.

© Khvorova I.E., 2017

According to O.V. Naumov, Deputy Head of the Federal Archival Agency, the main goals of digitizing documents are to expand and simplify access to documents of the archival fund, to ensure the safety of originals by gradually withdrawing them from circulation and providing access to the electronic fund instead, and to speed up the provision of public services [2].

Let us consider the main aspects of the digitization process. First of all, it must be borne in mind that the Archival Fund of the Russian Federation holds an extensive body of documents. Therefore, before scanning the documents themselves, it is advisable to digitize the archive's scientific reference apparatus and create electronic inventories for data retrieval. Digitizing the archive inventories first allows users to see the list of stored documents and order the ones they need over the Internet without leaving home. The case titles may also need scholarly and technical editing: digitization of a fund can begin only once its inventory has been improved.

The categories of documents selected for digitization include not only documents at risk of information loss, but also the most valuable and unique materials and those in high demand. The criteria of uniqueness and value are somewhat blurred, and demand is not constant, so when selecting material the question often arises: which documents should be digitized first? A sounder selection may be easier to achieve if the process involves not only members of the archive's expert commission but also outside professionals: historians, political scientists, sociologists, public figures, etc. Creating such a working group requires clear regulation. It should be noted that although selection criteria have been formulated [3], they are not mandatory, and regional archives have the right to choose which documents to digitize first.

The choice of digitization method is important because the medium and the presentation format of the material also carry information useful to the researcher, so it is essential to convey them as accurately as possible, in a form comparable to the original.

During the digitization itself, it is important to observe all safety measures when handling the original. Special care is needed when scanning books and old documents: lighting, equipment and the operator's conduct during digitization must comply with accepted standards. At this stage the financial resources of the digitization project play a key role, because the choice of scanning equipment determines how safe the process is for the original. Choosing a cheaper device entails a risk of irreparable damage to documents and reduces the chances of creating a comparable, full-color replacement copy [4].

When a document is digitized, at least two copies of the original are made: a working copy and a master copy. Both must be labeled and entered in a special register. This registration system makes searching for a scanned document faster and more convenient for the user, and the registration data make it easy to track the document in the archive's general information system.
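The registration step described above can be illustrated with a minimal sketch. The record fields, the identifier format and the use of a SHA-256 checksum are assumptions made for illustration; the source only states that both copies must be labeled and registered.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import date


@dataclass
class CopyRecord:
    document_id: str     # hypothetical archival reference of the original
    role: str            # "master" or "working"
    file_name: str
    sha256: str          # checksum of the file contents (an added assumption)
    registered_on: date


@dataclass
class CopyRegister:
    records: list = field(default_factory=list)

    def register(self, document_id, role, file_name, content, registered_on=None):
        """Add one digital copy to the register and return its record."""
        if role not in ("master", "working"):
            raise ValueError("role must be 'master' or 'working'")
        record = CopyRecord(
            document_id=document_id,
            role=role,
            file_name=file_name,
            sha256=hashlib.sha256(content).hexdigest(),
            registered_on=registered_on or date.today(),
        )
        self.records.append(record)
        return record

    def find(self, document_id):
        """Return every registered copy of the given original."""
        return [r for r in self.records if r.document_id == document_id]

    def is_complete(self, document_id):
        """True when both a master and a working copy are registered."""
        roles = {r.role for r in self.find(document_id)}
        return {"master", "working"} <= roles


register = CopyRegister()
register.register("F1-Op2-D3", "master", "F1-Op2-D3_master.tif", b"master image bytes")
register.register("F1-Op2-D3", "working", "F1-Op2-D3_working.jpg", b"working image bytes")
```

Looking up a document by its identifier then returns both copies, and `is_complete` flags originals for which one of the two copies is still missing.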

Among image storage formats, TIFF and JPEG are the most widely used by archivists. TIFF owes its popularity to its ability to preserve image quality through lossless compression. JPEG achieves high compression, but at the cost of image quality. The changes may not be visible to the naked eye, but the compressed image may show artifacts at sharp contrast boundaries or visible pixel blocks. JPEG therefore cannot be used as an intermediate format in image processing; it is permissible to save only the final version in JPEG. As for other types of electronic documents, the preferred storage formats are: for text documents, ms-word, txt, pdf, html, xml, rtf; for video, mov, avi, mpeg, mp4; for audio, wav, aiff, mp3; for drawings, autocad; for graphs and diagrams, tiff, pcx; for databases in the form of spreadsheets and relational databases, xls, xml, html, mdb. For photographic documents, pdf is also acceptable. The resulting data can be transmitted over digital channels and stored on digital optical media (write-once CD-R and DVD-R discs; rewritable CD-RW, DVD-RW and DVD-RAM discs), magnetic media (hard drives, floppy disks, magnetic tapes), flash drives, etc.

Requirements for the quality of the resulting digital copies are not clearly formulated. There are no criteria for assessing the quality of electronic copies of paper documents: image parameters, tone reproduction (contrast), brightness, noise, color accuracy, sharpness, resolution, geometric distortion, etc. Several such documents are already in use abroad, and this experience in assessing the quality of digitized documents would be useful for Russia. The US national standard ANSI/AIIM MS44, "Guidelines for quality control of image scanners", defines the basic terminology, parameters and criteria for assessing scanning quality, as well as approaches to measuring them in practice. On its basis, and taking modern requirements into account, international ISO standards for assessing the quality of scanning black-and-white documents were developed in 2000.

After digitization, the original is returned to the archive for storage, and the copies are added to the electronic fund of use and become available to researchers.

The search for a solution to the problem of storage and operational use of archival documents began in the mid-1990s, when the US Library of Congress began the total digitization of the existing collection of microfilms (as an independent collection and as the main carrier of insurance and user funds). In the process of digitization, American specialists were faced with the need to develop unified approaches to the implementation of the processes of transferring information from material media to electronic form, i.e., there was a need to regulate digitization processes.

By level of regulation, modern digitization standards fall into three categories: international standards, national standards and organization standards. The international ISO standard [5] is developed by a group of specialists; its use brings technological, economic and social benefits, but it is not mandatory for any member country of the ISO. A national standard is mandatory for state authorities at various levels; it is developed to improve the digitization procedure, taking into account the specifics of legislation, standards and documentation traditions in a particular country. An organization standard reflects the specifics of a particular company in converting documents into electronic form and in their subsequent storage and use. The best-known national standard regulating digitization is S6: Digitization Standard [6], put into effect by the National Committee for Standardization of Australia and New Zealand in 2006 [7]. Unfortunately, Russia has no similar document regulating digitization and the creation of an EFP.

In 2012, employees of the All-Russian Research Institute of Documentation and Archiving (VNIIDAD) and the Federal Archival Agency (Rosarchiv) developed "Methodological recommendations for electronic copying of archival documents and managing the resulting information array." The model for these recommendations is the S6 Digitization Standard and the FADGI Technical Recommendations. In order to continue work in this direction, it seems appropriate to develop a Strategy for the development and updating of sectoral regulatory and methodological acts regulating various areas of activity in the field of informatization of archives, as well as create a regulation and a long-term plan for its implementation. The developers have identified key points that must be taken into account when preparing a regulatory document.

1. The structure and content of documents developed and put into effect within international and national standardization systems are determined by the specifics of the legal framework of the country where they originated. Therefore, translating them directly and using them in the Russian Federation requires a balanced approach.

2. What is needed is not a single normative document regulating digitization, but a set of legal acts describing the requirements for all stages and aspects of this process.

3. The structure and general content of this regulatory and methodological documentation can be built on foreign analogues chosen by industry experts as the examples closest and most appropriate to Russian conditions. At the same time, domestic documents should fully reflect the specifics of all digitization processes and aspects in Russia (including requirements for equipment, personnel, procedures and quality management) [8].

Digitization of documents is a necessary measure for organizing quick access to archive documents. Once a certain array of digitized documents exists, it becomes possible to create a new, user-friendly and scientifically useful information resource: a virtual archive. Examples of such virtual repositories can be seen on the website of the Russian State Archive of Literature and Art.

The virtual archive of I.A. Bunin is a resource with a convenient classification of documents and a user-friendly interface. The documents are divided into three groups, "Manuscripts", "Cuttings from newspapers and magazines" and "Illustrative materials", and are scanned copies of originals from the funds of the Russian State Archive of Literature and Art and the Archive of the Russian Diaspora in Leeds [9]. The project developers point to the key issues in implementing such projects: funding; the dispersal of archival documents and the need to negotiate with repositories in different institutions and even countries to collect the necessary material; and, finally, legal issues, since copyright also applies to the electronic publication of materials. The resource is a positive example of a project to create a virtual archive. The website of the Russian State Archive of Literature and Art also offers other excellent examples of electronic repositories, such as resources dedicated to documents of the Patriotic War of 1812 and the First World War.

The electronic repository called "The Reunited Virtual Archive of Osip Mandelstam" was created by Oxford University and the Mandelstam Society. The project developers aimed to identify, describe and publish online all, or as many as possible, of the surviving creative and biographical materials of Osip Mandelstam, regardless of their physical location [10]. The project brings together manuscripts, transcripts of texts and commentary on them.

The search for documents is the most important stage in creating such projects. It is complicated by the fact that some previously known and published collections may no longer exist. Having come up for sale in the 1990s, they changed owners, whose names cannot always be established. In other cases, the owners are unable to find particular autographs or documents. This is largely explained by the fact that archival and manuscript funds containing documents of figures of the Russian diaspora were acquired mainly by donation, and not systematically [11]. The developers note that the same thing happens in state archives. For example, for a number of years the National Archives of France could not find the matriculation documents of O.E. Mandelstam (they were rediscovered in April 2008) [12].

It is important to note that the virtual archive of O.E. Mandelstam is not only a good example of the implementation of such a project, but also, thanks to a detailed description of the process of creating such an archive on the site, it is a kind of tutorial for successors and researchers of the digitization process.

Analysis of implemented Russian projects on the creation of virtual repositories of historical materials emphasizes the problem of fragmentation of documents and the complexity of their search. Thus, when preparing a project, it is important to focus on possible cooperation with foreign archives.

Thus, the role of joint work, joint projects to create a single virtual field for storing materials on the same subject is increasing.

When digitizing, attention must be paid to the selection of documents and to the choice of high-quality scanning equipment that keeps the process safe for the originals. At present, however, one of the most serious issues in this area is the need to regulate the digitization process (including a detailed description of how materials are selected and an agreed terminology). Without an appropriate, legally approved standard, creating a virtual archive remains laborious and out of reach for most researchers.

Notes

Guidelines for electronic copying of archival documents and management of the resulting information array. [Electronic resource] URL: http://archives.ru/documents/rekomend_el-copy-archival-documents/section-2.shtml (accessed: 05/13/2016).

Features of digitizing documents in contemporary archives. [Electronic resource] URL: https://www.pcweek.ru/ecm/article/detail.php7ID-154329 (accessed: 05/13/2016).

Yumasheva Yu.Yu. Archives and the "digital arms race" // Historical informatics. 2013. No. 3. P. 93.

ISO - International Organization for Standardization. Developer and publisher of international standards. [Electronic resource] URL: http://www.iso.org/iso/ru/ (accessed: 05/13/2016).

Report on research work on topic 2.2.4 "Development of a draft industry standard for creating electronic copies of archival documents", Plan of research and development work carried out on the basis of the state task of the Federal Archival Agency for 2014 No. 89 dated 26.12.2013 (first stage) "Research and analysis of foreign regulatory and methodological documentation governing the issues of digitization of archival documents" / Yu.Yu. Yumasheva. Moscow: VNIIDAD, 2012. Pp. 84-163.

Ibid. P. 20.

United electronic archive of Ivan Bunin. [Electronic resource] URL: http://www.bunin-rgali.ru/ (accessed: 05/13/2016).

Reunited virtual archive of Osip Mandelstam. [Electronic resource] URL: http://mandelstam-world.info/intro.php (accessed: 05/13/2016).

Popov A.V. Russian Abroad and Archives: Documents of the Russian emigration in the archives of Moscow: problems of identification, acquisition, description and use (Materials for the history of Russian political emigration. Issue 4). Moscow: RGGU, 1998. Pp. 150-151.

Reunited virtual archive of Osip Mandelstam.

SCAN: Technologies

What is digitization?

02.10.2015, Fri, 14:05, Msk

Scanning, retroconversion and related services. Review of technologies for converting documents into electronic form.

There are several ways to organize digitization. It can be done in-house or outsourced, with documents taken off-site or with the work performed on the customer's premises. Office scanners, professional document scanners or planetary scanners can be used. Data can be extracted manually, semi-automatically or automatically, with preliminary archival processing of the paper documents or with classification of the information once it is in electronic form, and so on.

Which way to choose?

The solution depends on the specific task, because each of the "or" choices above determines the quality of the result and the cost of the work. For example, the question of bound documents is a perennial one: is it more cost-effective to scan them slowly in their bound state, or to spend money on unbinding them and then digitize them quickly on document scanners?

The easiest way to choose the path that suits you best is to seek the expertise of an organization that specializes in digitization. Large companies interested in the work will carry out a survey free of charge and identify the best approach for you. Don't miss this opportunity, and don't worry about being pushed into ordering services: most of these companies are just as interested in supplying hardware and software for in-house digitization.

How many documents do you need to scan?

The defining parameter is the volume of documents. For daily scanning of small batches of unbound documents (for example, primary accounting documents), a regular office scanner that can handle several thousand pages a day will do. You only need to supplement it with a convenient indexing program.

Regular scanning of large volumes requires professional equipment: industrial scanners that cost a lot of money (such equipment is used by the Federal Tax Service, the Federal Customs Service and large banks). A framework agreement for periodic digitization services may therefore be a less expensive alternative.

Converting large retrospective arrays to electronic form on your own is not economically justified: in addition to purchasing equipment and training employees, significant labor and time costs will be required. It is definitely more efficient to order a service, since a large company can allocate a large staff and solve the problem quickly.

Where to scan documents?

The defining parameter is how much the documents being scanned are in demand. Does withdrawing documents for scanning affect the organization's activity? This is especially critical when digitizing documents that employees access regularly, that a regulatory authority may suddenly request, or that are needed to resolve emergencies. Examples: financial and personnel documents, technical and operational documentation, registry office books and other industry funds.

If such documents need to be digitized quickly, the traditional approach is to order services with a scanning team working on your premises. This often turns out to be cheaper than delivering documents to the contractor's production site and back, although everything depends on the distance involved. The rules for on-site work provide for scanning an issued case within one or two working days, without withdrawing it from the workflow for long.

Should documents be unbound?

Determining parameters: the condition of the documents and whether they can be unbound. If they can, and the paper is suitable for feeding through a document scanner, then they should be unbound. Scanning bound documents on a planetary (book) scanner is several tens of times slower than sheet-fed digitization, and working time and labor costs increase proportionately. Scanning on document scanners, even including the unbinding, is faster and cheaper.

You can unbind documents yourself or entrust this to the contractor: if a reputable company is chosen, you need not fear losing documents. On the contrary, strict regulation of all processes and high-quality materials allow such companies to insure themselves against additional financial losses and damage to their image. Even Russian courts trust this approach: when scanning is organized, an internal order usually permits unbinding and subsequent re-binding of court cases.

By the way, large companies can simultaneously carry out professional archival processing: firstly, part of the work is already done in preparation for scanning, and secondly, archival processing helps to identify unclaimed documents and reduce the volume of scanned arrays, which can reduce the cost of work.

What quality to choose?

Today, any object can be scanned with high quality: from a small library card to 8A0 maps and theatrical scenery.

Determining parameters: the type of document and the size of the resulting electronic resource. Today's scanning equipment can produce images at resolutions from 200 to 1200 dots per inch (dpi). For works of art, 400-600 dpi is typically used to produce high-quality reproductions. Higher resolution is used only when the image must be enlarged to show the detail of small objects, such as coins.

Detailed and low-contrast drawings, often made on tracing paper or as blueprints, need to be scanned at 300-400 dpi and then processed further in graphic editors. Other documents are usually scanned at 300 dpi, which is enough to print copies without loss of quality. The necessary cropping, geometric correction, color correction and conversion to pdf, tiff, jpeg and other formats can be carried out fully automatically by programs built into the scanning equipment or supplied with it.

In most cases, color mode is used. It is needed for all documents that carry corrections or stamps over the text, to confirm that the electronic copy was taken from the original document with its seal and signature, to keep fading texts legible, and to convey the unique features of the original. The need for color scanning of works of art is beyond question. Grayscale mode is used only in some cases: when the documents contain no color attributes, or when the size of the resulting electronic resource must be reduced.

Scanning can be carried out in-house. The main task is to train employees to work correctly with complex equipment, since the quality of the resulting images matters for subsequent indexing: a poorly scanned document with shadows, glare or other defects can make important information unreadable. This prevents automatic data extraction from being applied and may lead to indexing errors. Uploading erroneous data to some systems (state registers, accounting systems) is not allowed.

Indexing

Scanning alone is rarely enough: in later work, searching a set of image files is only slightly easier than leafing through paper. To make search possible, several attributes (index fields) must be extracted from each document.

Selected attributes can be added to the file name. This practice has developed in Russian courts: in order for the scanning operator not to have access to the internal systems of the court, when digitizing, all the necessary details are entered in the file name. Subsequently, these details are recognized by the judicial system when loading each document separately.
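Encoding index fields in a file name, and recovering them when the file is loaded, might look like the sketch below. The naming scheme (`number__date__type.ext`), the separator and the field set are hypothetical; the source does not describe the format the courts actually use.

```python
from datetime import date

SEPARATOR = "__"  # hypothetical field separator


def build_file_name(number, doc_date, doc_type, extension="pdf"):
    """Encode the index fields of a scanned document in its file name."""
    for value in (number, doc_type):
        # Reject characters that would make the name ambiguous or invalid.
        if SEPARATOR in value or any(ch in value for ch in "/\\."):
            raise ValueError(f"unsupported characters in {value!r}")
    stem = SEPARATOR.join([number, doc_date.isoformat(), doc_type])
    return f"{stem}.{extension}"


def parse_file_name(file_name):
    """Recover the index fields from a name made by build_file_name."""
    stem, _, extension = file_name.rpartition(".")
    number, iso_date, doc_type = stem.split(SEPARATOR)
    return {
        "number": number,
        "date": date.fromisoformat(iso_date),
        "type": doc_type,
        "extension": extension,
    }


name = build_file_name("2-1234-2015", date(2015, 10, 2), "ruling")
fields = parse_file_name(name)
```

The appeal of this design is that the operator needs no access to the receiving system: the file name carries everything required to file the document on import.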

Usually, however, digitized documents are uploaded to an information system in batches, which requires creating a database. If a document only needs to be attached to an existing card in an accounting system, it may be enough to extract a couple of attributes that uniquely identify it - usually a number and a date.
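Attaching scans to existing cards by number and date amounts to a keyed lookup. The sketch below uses plain dictionaries as a stand-in for the accounting system; the card structure and field names are assumptions for illustration only.

```python
from datetime import date


def attach_scans(cards, scans):
    """Attach each scan to the card with the same (number, date) key.

    cards: dict mapping (number, date) to a card dict.
    scans: list of dicts with "number", "date" and "file" keys.
    Returns the scans for which no matching card was found.
    """
    unmatched = []
    for scan in scans:
        card = cards.get((scan["number"], scan["date"]))
        if card is None:
            unmatched.append(scan)
        else:
            card.setdefault("attachments", []).append(scan["file"])
    return unmatched


cards = {
    ("117", date(2015, 3, 4)): {"title": "Supply contract"},
    ("118", date(2015, 3, 5)): {"title": "Invoice"},
}
scans = [
    {"number": "117", "date": date(2015, 3, 4), "file": "scan_0001.pdf"},
    {"number": "999", "date": date(2015, 1, 1), "file": "scan_0002.pdf"},
]
unmatched = attach_scans(cards, scans)
```

Returning the unmatched scans rather than silently skipping them gives the operator a list to check by hand, which matters for systems that do not accept erroneous data.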

If it is necessary to form a search base on the basis of the documents themselves, then the amount of data to be extracted is determined by the task: from a couple of details for searching a file in an electronic archive to transferring all significant information to an analytical database (name, addresses, TIN, KPP, dates, numbers of application documents etc.).

Museums, libraries and archives apply their own indexing rules when digitizing storage units and accounting documents. Vectorization is a separate area of services, used in particular for digitizing well-logging tapes (automatically) and drawings (manual redrawing in CAD systems).

How much data should be extracted? This question is also best answered with expert help, since the number of fields to extract depends on the functional task and largely determines the cost of digitization. In some cases it is enough to form sets of documents, in which electronic images are grouped under a main document (for example, a contract or a register of invoices). In others, all the data contained in the document must be extracted to fill in the information system card.

Data Retrieval Examples

Analysis of orders placed on the zakupki.gov.ru portal by companies with state participation and state institutions (44-FZ, 223-FZ), shows that:

– To link electronic copies of organizational and administrative documents (ORD) to an electronic document management system, the number, date and type of document are sufficient.

– Scanning financial documentation is often accompanied by extracting the number, date, names and details of payers, amounts.

– Digitization of archival documents of municipalities (decrees of administrations, city executive committees, village councils, etc.) for the purpose of providing services and taking inventory of land and property requires extracting the number and date of the document and all full names and addresses. Moreover, the addresses must be checked against the current KLADR/FIAS directories.

– Digitization of documents of the Archival Fund of the Russian Federation is accompanied by strict completion of the scientific reference apparatus (NSA) and description of the funds in accordance with archival legislation.

– Indexing inventories and registers implies the recognition of all ordinal records.

– To work with drawings in electronic form, it is necessary to extract almost all the fields of the title block.

– Scanning composite cases requires not only extracting the details of each document, but also establishing relationships. The most difficult case is design documentation, where the generated database has a multi-level hierarchy and document links.
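The last case above, a multi-level hierarchy with links between documents, can be modeled as a simple tree. The node fields, titles and the path format below are hypothetical and serve only to show the shape of such a database.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    title: str
    attributes: dict = field(default_factory=dict)   # extracted index fields
    children: list = field(default_factory=list)     # nested volumes or documents
    links: list = field(default_factory=list)        # titles of related documents


def flatten(node, path=()):
    """Walk the hierarchy depth-first and return (path, attributes, links) rows."""
    current = path + (node.title,)
    rows = [(" / ".join(current), node.attributes, node.links)]
    for child in node.children:
        rows.extend(flatten(child, current))
    return rows


project = Node("Project 12", {"number": "12"}, children=[
    Node("Volume 1", children=[
        Node("Drawing A-01", {"sheet": "1"}, links=["Specification S-01"]),
        Node("Specification S-01"),
    ]),
])
rows = flatten(project)
```

Flattening the tree into rows with a full path is one way to load such a case into a flat search index while keeping each document's place in the hierarchy.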

In recent years, document digitization services have become very popular with the vast majority of companies. Almost all modern organizations have mastered information technology to some degree and cannot imagine their work without personal computers. Today, documents are first drafted on a computer from templates and then printed out. However, only documents bearing signatures and seals have legal force, so the signed paper versions have to be digitized again.

Archival documentation often has to be consulted, so many companies prefer to keep electronic copies of all their paper documents. Digitizing paper documents can make life easier not only for managers, accountants, economists and secretaries, but also for people in technical and creative professions: designers and fashion designers, builders and architects, engineers and design engineers, and many others. Digitizing archival documents that contain diagrams, drawings, formulas and photographs is more laborious and requires qualified specialists with extensive experience in digitizing documents.

What you need to pay attention to when digitizing documents:

Professional equipment

Digitizing complex documents requires professional equipment with a wide range of technical capabilities. After digitization, a specialist checks the documents against the electronic copy, the computer recognizes the text of the document, and any errors are corrected.

Manual work

When digitizing documents, manual labor is indispensable. It is needed for:

  • preparing documents for scanning: removing paper clips, plastic sleeves and other fasteners;
  • scanning documents manually;
  • returning documents to their folders;
  • entering information into the system;
  • verifying the entered information.

Software

Properly selected software makes document digitization tasks simpler and many times faster.

Experience and speed

The company "Capital Archivist" has many years of experience in digitizing paper documents and guarantees that everything will be done on time and at the best price.

Do you have serious problems storing paper documents? Are the shelves bursting with absurdly swollen folders, and do you spend three hours looking for the right piece of paper? Then it's time to start digitizing documents, which will make your office or apartment tidier and searching easier and more convenient. Create your own electronic library, and edit, copy and move digital files at will. The ability to create digital documents is one of the blessings of civilization. So take advantage of it!

Before you set out to digitize your documents, you should know that there are two ways to store them: as images and as text files. Storing images requires much more hard drive space, but it preserves the look of the original document. Converting scanned images to a text file takes additional time, since it requires optical character recognition, or OCR (strictly speaking, the name is not entirely accurate, since here we are dealing with digital information, but, as often happens, the term has stuck).

How do you choose a format for storing documents? It is very simple: if the original document is handwritten and its character matters to you (a letter from a loved one), or if the document is, for example, a work of art, then save it as an image (sometimes recognizable handwriting is just as important as the written words). Another, more prosaic reason for saving handwritten documents as images is the lack of a commercially available software solution suitable for interpreting handwritten characters. So far, this technology is confined to PDAs and tablets, where it is implemented in a slightly different form from the one we need. On a tablet, you write characters by hand, entering them one after another, and the software converts them into typed text in real time. Recognizing one person's handwriting from a scanned document is a matter for the future.

Scanners

Whether you store your documents as images or as text files, you will need a scanner to digitize them. If you want to digitize a relatively small number of documents, then a multifunction printer or flatbed scanner will be enough for you. Their only drawback is their relatively slow speed. Keep in mind that only more expensive models have an automatic sheet feeder for handling multi-page documents.


Among the best models we will name ScanSnap S1500 from Fujitsu and ScanJet Professional 3000 from HP. The document scanning speed of these devices averages 20 pages per minute or more. The ScanJet Professional 3000 has a more reliable paper feed mechanism, while the ScanSnap S1500 has more advanced software. Both scanners are in roughly the same price range, so the choice is yours.

OCR - software

Most scanners come with software to implement OCR, which is installed on your computer. If you are dissatisfied with the accompanying software or there is none, then such programs are quite common and can be purchased separately. There are the following market offerings:

FineReader 9 Express by ABBYY, $100 for the regular version and $400 for Professional 10;
OmniPage 17 Standard by Nuance, $150 for the regular version and $500 for the professional version;
Acrobat X Standard by Adobe, $299 for the regular version and $449 for the professional version;
PaperPort 12 Standard by Nuance, $100 for the regular version and $200 for the professional version, though it has no OCR feature, only scanned-document management.

Resolution

For documents stored as images, a resolution of 150 to 200 dpi is usually sufficient, but OCR software works much better if images are stored at a higher resolution of 300 dpi. It all depends on what you need. If you just want to keep at least the minimum readability of your scanned document, you can lower the resolution requirements. If high quality is important to you, increase it accordingly.
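The trade-off between resolution and storage space can be estimated with a short calculation. The sketch below is only an illustration: the function name `scan_size`, the A4 sheet (210 x 297 mm) and the assumption of uncompressed 24-bit color (3 bytes per pixel) are choices made here, not figures from the text. Real files are usually smaller because of compression.

```python
MM_PER_INCH = 25.4


def scan_size(width_mm, height_mm, dpi, bytes_per_pixel=3):
    """Return (width_px, height_px, uncompressed_bytes) for one scanned page.

    bytes_per_pixel=3 assumes uncompressed 24-bit color (an assumption).
    """
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px, height_px, width_px * height_px * bytes_per_pixel


if __name__ == "__main__":
    # Compare the resolutions mentioned in the text for an A4 sheet.
    for dpi in (150, 200, 300):
        w, h, size = scan_size(210, 297, dpi)
        print(f"{dpi} dpi: {w} x {h} px, about {size / 1e6:.1f} MB uncompressed")
```

Doubling the resolution quadruples the pixel count, which is why 300 dpi scans take roughly four times the space of 150 dpi scans before compression.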

OCR on the web

There are several online services that recognize scanned documents. Among the best known are the free resources Free OCR, NewOCR and OCR Online. They are suited to small projects, that is, they work only with small documents. You must first scan the original onto your computer and then upload an image of the document to the website. Naturally, each of the resources has its own limitations in terms of both the volume and the content of the document. For instance, web applications recognize only text, without the lines or additional characters that are present on the page.

The Free OCR service is free; however, the uploaded file cannot exceed 2 MB or 5000 pixels, which is about 50 dpi for a regular standard document. Moreover, you can process no more than 10 such documents per hour. The NewOCR website is also free to use; its interface is extremely primitive, but the permitted file size is 2.5 times larger, up to 5 MB. Finally, OCR Online requires a free account, but allows you to upload up to 15 files per hour of up to 4 MB at a resolution of about 200 dpi per page. If such volumes are not enough, you can buy paid access for $3.95 (8 cents per page) and process up to 50 documents at a time, or pay $49.95 to process up to 5000 (1 cent per page). This web application works with both text and graphic elements, but, of course, it falls far short of Acrobat X or FineReader 10.

E-books

Perhaps you, like me, love the smell of a real book, love the feel of thick paper and the look of beautiful graphics. However, today more and more people prefer to deal with e-books, which are read using the so-called special readers, tablets, smartphones, players and other portable devices. A huge number of online stores offer simply gigantic amounts of content. But what if you want to have your own collection of e-books that are not available in digital format?

To convert your favorite "physical" books to e-books, you first need to scan them and then convert them to text format using an OCR program. This is tedious even if you use a very fast FLATBED scanner. Such scanners resemble "copiers", having a pressure cover, so they can scan not only individual sheets, but entire books. If you are ready to “gut” your favorite book, you can use the SHEETFED scanner, which works like a fax, that is, with separate pages (like the ScanSnap S1500 from Fujitsu and the ScanJet Professional 3000 from HP).

After you have converted your documents, textbooks or books into PDF, Word or fb2 format, you can use special programs for organizing, editing or reading electronic documents, for example Calibre or Stanza. Calibre is a free organizer and editor for your e-book collection. The program helps you work with the catalog: organize, classify, comment on, search and save new and old books on your computer's hard drive or in the e-reader's memory.

2. Organization of work on the digitization of archival documents

2.1. Goals of digitizing archival documents

The digitization of archival documents is carried out in order to form an electronic fund of use (EFP).

The electronic fund of use is a set of electronic copies of documents of the Archival Fund, recorded on digital media, and intended for use instead of original documents, which should provide:

    document safety,

    the possibility of forming electronic resources, providing prompt access to the document, incl. using Internet technologies.

Positioning electronic copies of archival documents and the electronic fund of use as an insurance fund of archival documents is unacceptable.

The procedure for creating an electronic fund of use (electronic copies of archival documents) is one of the important tasks of the archive and should be regulated by a specially developed Regulation for the creation of an electronic fund of use (electronic copies of archival documents), approved in due course after its review and approval by the methodological commission and discussion by the directorate of the archive.

2.2. Electronic fund of use

The EFP includes electronic copies of storage units that have been digitized in full.

EFP consists of three arrays of electronic copies:

2.3. Methods for creating EFP

The electronic fund of use is created:

    in a targeted manner within the framework of state, departmental, regional programs and annual (prospective) plans for the work of the archive;

    in a targeted manner for all documents designated for insurance copying;

    in a targeted manner for the most frequently requested documents;

    in the process of fulfilling orders;

    in the course of other work.

The main technological operations for creating electronic copies of archival documents:

    selection of documents for digitization;

    preparation of documents for digitization;

    transfer of documents for scanning / acceptance of documents / registration in accounting documentation;

    choice of a method for digitizing documents on various media (for example, for photographic documents the determining factors are: the kind and type of document carrier (photo paper, film, glass); whether it is a roll or a separate frame; the size (format) of the carrier (paper and photo frame); the characteristics of the document (a separate sheet, a photograph, or a set of documents such as photographs pasted into a photo album); for audio documents: the information carrier, the availability of specialized equipment to reproduce the original, etc.);

    digitizing a document - creating an electronic copy - a master copy;

    double (minimum) recording on media: master copy and working copy;

    labeling of media / registration of media and their contents (master copy and working copy) in accounting documentation;

    transfer of copy media for storage;

    return of original documents to storage.

2.4. Criteria for selecting archival funds for creating electronic copies

In a planned manner, electronic copies of archival documents are created primarily for:

    the most used documents, regardless of the time of their creation, material and manufacturing technique;

    especially valuable and unique documents,

    documents that are in unsatisfactory physical condition with a high degree of destruction of the base, which may lead to the loss of the original;

    documents for which there is a threat of loss of information (for example: for documents on a paper basis - the fading of the text; for phono recordings on a magnetic tape - demagnetization; for color photographic negatives - loss of color, etc.) with a satisfactory physical condition of the carrier;

    fulfillment of requests and orders, preparation of publications and exhibition projects.

Only those collections are subject to digitization for which scientific and technical processing or improvement of inventories (in terms of editing titles) has already passed or is not expected in the future.

Among funds of equal value, priority in copying is given to funds whose documents are in an unsatisfactory physical (technical) condition and are most intensively used, as well as to color photographic documents.

2.5. Planning work on the creation of an electronic fund of use

In order to organize and control the work on the digitization of funds, each archive draws up a long-term digitization plan, which includes the names of funds intended for creating electronic copies within the framework of the entire collection of the archive (Appendix No. 2).

The long-term plan should be monitored and revised annually, based on the results of the implementation of the annual digitization plan fixed in the List of funds to be digitized.

Long-term planning should be carried out by the structural units that are entrusted with functional responsibility for the creation of electronic copies, taking into account proposals from the departments for the use of documents, the departments for ensuring the safety of documents and other structural divisions.

When planning work on digitization, the following columns are included in the planning indicators:

    names and numbers of funds, collections, storage units and names of documents planned for digitization;

    estimated time frame for digitization;

    completion mark.

On the basis of the long-term (Perspective) Plan, a List (or Lists) of funds subject to digitization is created annually, which determines the sequence of digitization of funds within a given year (Appendix No. 3).

The sequence of digitization is determined by the value and informational significance of the documents, their physical condition, the intensity of their use, as well as the availability of technical and personnel capabilities.

The lists are coordinated with the structural divisions of the archive involved in the creation of the EFP (primarily with the preservation department and the archives in which the files to be scanned are stored) and are approved by the director of the archive (archival institution).

In the annual planning of digitization work, the following columns are included in the indicators:

    names and numbers of funds, collections, numbers of inventories, storage units, names of documents;

    the number of documents to be digitized in the respective storage units;

    the volume of storage units for photographic documents - in sheets / frames / units, for phono, film and video documents - in hours / minutes / seconds;

    document format;

    mark of completion - the date of digitization, the number and date of the act on the transfer of the external media for storage, the marking of the external media;

    ciphers for storing electronic master copies;

    ciphers for storing electronic working copies.

    (Note: the last three paragraphs are filled in upon completion of work).

Digitization is carried out by funds in compliance with the systematization of storage units in the inventory.

It is acceptable to maintain the Perspective Plan and the annual Lists of funds to be digitized in the form of a computer database with the creation of a mandatory annual printout of both documents.
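As an illustration of keeping the annual List as a computer database, the columns listed above could be mapped onto a single table. This is a minimal sketch using Python's built-in `sqlite3` module; the table name `annual_list`, the column names and the sample fund "R-100" are hypothetical and not prescribed by the recommendations.

```python
import sqlite3


def create_plan_db(path=":memory:"):
    """Create a table whose columns mirror the annual planning indicators."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS annual_list (
            id INTEGER PRIMARY KEY,
            fund_number TEXT NOT NULL,
            fund_name TEXT NOT NULL,
            inventory_number TEXT NOT NULL,
            storage_unit TEXT NOT NULL,
            document_count INTEGER,      -- documents to be digitized in the unit
            volume TEXT,                 -- sheets/frames/units or h/min/s
            document_format TEXT,
            completion_mark TEXT,        -- filled in upon completion of work
            master_copy_cipher TEXT,     -- filled in upon completion of work
            working_copy_cipher TEXT,    -- filled in upon completion of work
            UNIQUE (fund_number, inventory_number, storage_unit)
        )
        """
    )
    return conn


def pending_units(conn):
    """Return storage units that have no completion mark yet."""
    rows = conn.execute(
        "SELECT fund_number, inventory_number, storage_unit "
        "FROM annual_list WHERE completion_mark IS NULL ORDER BY id"
    )
    return rows.fetchall()


if __name__ == "__main__":
    conn = create_plan_db()
    conn.execute(
        "INSERT INTO annual_list (fund_number, fund_name, inventory_number, "
        "storage_unit, document_count) VALUES (?, ?, ?, ?, ?)",
        ("R-100", "Example fund", "1", "12", 40),  # hypothetical sample row
    )
    print(pending_units(conn))
```

The uniqueness constraint on fund, inventory and storage unit reflects the rule that a document is digitized only once; the mandatory annual printout can be produced from the same table.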

2.6. Structural divisions for the creation of EFP

Work on the creation of an electronic fund of use should be carried out by a specialized department of the archive.

EFP creation is planned and organized as an independent type of work.

EFP creation cannot be treated as an additional duty of employees of other departments.

2.7.-2.8. The approximate composition of the specialists of the structural unit for the creation of EFP.
Employee Qualification Requirements

The composition and functions of specialists in the structural unit of the archive that is entrusted with creating the EFP:

Methodologist

Functionality: acceptance of archival documents for scanning, accounting of the EFP (EFP-1, EFP-2, EFP-3), quality control of work performed when transferring electronic copies for storage, monitoring of rescanning issues, return of archival documents to the archive vaults after scanning, transfer for storage of marked media with the EFP.

Scan operator

Functionality: carrying out scanning operations, assigning ciphers (codes) to electronic copies;

External media writer

Functionality: writing electronic copies to media, labeling media.

Engineer

Functionality: accounting for the use of working copies, maintenance of computer equipment, periodic testing of information carriers with the EFP (EFP-1, EFP-2, EFP-3).

Specialist in graphic processing of digital copies and preparation of copies of the second and subsequent generations (if necessary).

The main criteria for choosing scanner models for creating electronic copies of archival documents are:

    security and safety of the original during the scanning process;

    the quality of the electronic copy;

    scanner table size corresponding to the maximum size of the originals intended for digitization, eliminating the need for fragmentary scanning of documents with subsequent computer "gluing" ("stitching") of images;

    other technical specifications of the equipment.

The problem of the optimal selection of scanning equipment for the digitization of archival documents (according to the "price - quality" criteria) was the subject of research carried out in 2011 by the Research Institute of Reprography (Tula) by order of the Federal Archival Agency. The report "Development of guidelines for the selection of scanning equipment that can meet the needs of Russian archives" was posted on the Archives of Russia portal in January 2012. The portal also hosts the distribution kit of MregForm, a computer program for selecting equipment developed on the basis of multivariate analysis, together with step-by-step instructions for its use.

In 2012, the Research Institute of Reprography (Tula), commissioned by the Federal Archival Agency, developed "Methodological recommendations, software for assessing and monitoring the quality of the functioning of scanning equipment when performing work on the digitization of archival documents in Russian state archives", also posted on the Archives of Russia portal.

The optimal solution in terms of choosing scanning equipment is:

Professional planetary (non-contact) book scanners of at least A2 format, equipped with cold-light lamps or LED illuminators and a book cradle, for scanning color, black-and-white and grayscale originals of archival documents without unbinding them (books, drawings, dilapidated materials, atlases), supplied in the following configuration:

The choice of digital cameras is determined by the size of the sensor (matrix) and the financial capabilities of the archive.

To date, digital cameras are the safest way, as far as the original archival documents are concerned, to create electronic copies. However, their use also has limitations and disadvantages, the main one being the problem of complying with the light regime.

It is possible to combine different equipment to solve the problems of digitizing documents of different formats.

Specifications of computer hardware:

    System unit:

      Minimum requirements:

        CPU with at least 2 cores and a clock speed of at least 2.8 GHz;

        Memory type DDR3, at least 2 GB, HDD at least 500 GB SATA;

        Video card not less than 512 MB, GPU frequency not less than 700 MHz, type GDDR5, Gigabit Ethernet, Multi-DVD.

      Optimal requirements (for streaming digitization mode):

        Chipset - Intel, CPU with at least 2 cores, a cache of at least 6 MB and an operating frequency of at least 3.2 GHz;

        Memory type DDR3, at least 8 GB, expandable up to 32 GB, HDD at least 1000 GB SATA;

        Discrete graphics card with at least 1 GB of memory and a memory bandwidth of at least 25.6 Gb/s;

        The ability to protect information using the built-in hardware module;

        Software pre-installed by the manufacturer for the protection and safe deletion of information.

    Monitor:

      diagonal of at least 19 inches,

      backlight type - LED,

      monitor brightness not less than 250 cd/m2, contrast ratio not less than 1000:1, dynamic contrast ratio not less than 3,000,000:1,

      viewing angles of at least 170 degrees horizontally and 160 vertically.

Technical requirements for server hardware, electronic content storage systems and printing devices are determined based on the actual volume of digital content available, the prospects for its growth and the need for printing electronic copies.

2.12. Basic requirements for the technological premises of the unit
on creating electronic copies of archival documents and workplaces of employees

The premises where archival documents are digitized and electronic copies are created should have both natural and artificial lighting. The orientation of window openings to the north or northeast is desirable. Window openings should be equipped with adjustable blinds or curtains that can completely cover them if necessary.

Workplaces for creating electronic copies are equipped with special tables, attachments, and height-adjustable swivel chairs (armchairs) with adjustable seat and back inclination.

Illumination on the surface of the table in the area where the document is placed should be 300-500 lux, illumination of the screen surface - no more than 300 lux. Lighting should not create glare on the surface of the screen and the scanning table. It is permissible, when using professional scanning equipment equipped with its own lamps, to completely turn off the lighting during the digitizing process.

The area of one workplace should be at least 6 sq.m, and the distance between desktops with video monitors should be at least 1.2 m.

The monitor screen should be no closer than 500 mm from the user's eyes, taking into account the size of alphanumeric characters and symbols.

The room must be well ventilated. Ventilation openings on the equipment must not be blocked.

Indoor plants must not be placed near the equipment.

The room must be equipped with a safe or a lockable cabinet for storing archival documents accepted for digitization.

Premises where work is carried out on the digitization of archival documents and the creation of electronic copies must be placed under guard.
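The numeric limits given in this section lend themselves to a simple checklist. The following sketch is illustrative only: the function `check_workplace` and its parameter names are hypothetical, and it does not model the permitted exception of switching the lighting off when the scanner has its own lamps.

```python
def check_workplace(table_lux, screen_lux, area_m2, desk_gap_m, eye_distance_mm):
    """Return a list of violated requirements (an empty list means all pass)."""
    problems = []
    if not 300 <= table_lux <= 500:
        problems.append("table illumination must be 300-500 lux")
    if screen_lux > 300:
        problems.append("screen illumination must not exceed 300 lux")
    if area_m2 < 6:
        problems.append("workplace area must be at least 6 sq.m")
    if desk_gap_m < 1.2:
        problems.append("distance between desktops must be at least 1.2 m")
    if eye_distance_mm < 500:
        problems.append("screen must be at least 500 mm from the eyes")
    return problems


if __name__ == "__main__":
    # Hypothetical measurements for one workplace.
    for issue in check_workplace(250, 280, 6.5, 1.0, 550):
        print(issue)
```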

2.13. Preparation and submission of documents for digitization

Preparation of documents for carrying out work on the creation of electronic copies of the use fund is carried out in accordance with the procedure for issuing archival documents from archives.

Preparation of documents for work on creating electronic copies includes:

    retrieval of files,

    verification of search data,

    reconciliation of file titles with the inventory,

    sheet numbering check,

    updating the certification sheets.

When preparing files, the physical condition of documents is checked: documents with low contrast and fading texts are identified, as well as documents that require restoration and strengthening of the base. If necessary, specialists in ensuring the preservation of archival documents and specialists in the digitization of documents are involved for consultations in order to prevent the possibility of damage to files during scanning.

Files intended for digitization are, as a rule, not to be unbound; a file may be unbound only in exceptional cases, in agreement with the management of the archive, when copying the bound file is completely impossible.

The decision to unbind a file can be motivated by:

    A) ensuring the safety of documents (the file is tightly sewn, and opening it 180 degrees and using the pressure glass may damage (deform) the documents);

    B) the inability to present all the information of the document on an electronic copy, because some of the information disappears into the spine.

The decision to unbind documents is made only if the archive has the facilities to rebind, after scanning, the files unbound for digitization.

Upon completion of the work, the file must be rebound without fail.

The transfer of documents for digitization to the specialized unit is carried out by the archive staff responsible for the creation of the electronic fund of use, and is documented by an Order (requirement) for the production of electronic copies (approximate form of the Order (requirement): Appendix No. 4), drawn up in accordance with the sequence of scanning of funds fixed in the List of funds intended for digitization.

The Order (requirement) for making copies states:

    Reason for digitization (in the case of planned work - a reference to the position in the Annual List of funds intended for digitization; in the case of an order for other purposes - an indication of the number, date and name of the document on the basis of which the work is performed, the goals of the work, details of the customer ).

    Accounting ciphers (fund number, inventory number, storage unit number, sheet numbers, including versos if necessary).

    The number of sheets/versos to be digitized.

    Resolution, format, media (for orders that are not carried out as part of the archive digitization program).

    Note (indication of special requirements for preservation, the need to use specialized digitizing methods, the possibility of using pressure glass and / or graphic processing (for orders that are not carried out as part of the archive digitization program)).

    Date of transfer for digitization,

    Order execution date;

    Date of receipt of the order (for orders that are not carried out as part of the archive digitization program);

    Date of return of the originals to the vault;

    Cipher and storage location of the electronic master copy (on the built-in media and external media);

    Cipher and storage location of the electronic working copy (on external media);

    Code and place of storage of the second generation copy (if necessary - for orders that are not carried out as part of the archive digitization program).

    The Order (requirement) for the production of electronic copies of archival documents for the fund of use is signed by the director or deputy director (chief custodian of funds).

The order form is drawn up in the required quantity, but not less than 2 copies. One copy is stored in a centralized record in the affairs of funds, the other - in the department for ensuring the safety of documents or in a structural unit in which the centralized storage of the electronic fund for the use of the archive is carried out. The order form is registered in the Register of orders for the creation of electronic copies of documents (Appendix No. 5). The journal is maintained in the structural unit, which is entrusted with the functionality of creating electronic copies.

The journal is drawn up according to the rules for the registration of the accounting documentation of the archive, i.e. its sheets are stitched, numbered; their number is indicated in the certification sheet. Columns in the journal and entries are kept in a spread. It is acceptable to maintain an order register in electronic form.

Performers bear personal responsibility for the safety of the original archival documents during the entire time of working with them.

To avoid re-scanning the same documents (rescan), employees who fill out orders and maintain the journal are obliged, before sending and receiving documents for scanning, to make sure that the documents have not previously been digitized.

If the document has already been digitized, all work on the execution of the order is carried out with its working electronic copy.

Re-scanning (digitizing) (rescan) of documents is unacceptable!
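When the order register is kept in electronic form, the check for earlier digitization can be a simple lookup. The sketch below assumes, purely for illustration, that already digitized storage units are held as a set of (fund, inventory, unit) tuples; the function name `route_order` and the sample ciphers are hypothetical.

```python
def route_order(digitized, fund, inventory, unit):
    """Decide how to fulfil an order for one storage unit.

    `digitized` is a set of (fund, inventory, unit) tuples taken from the
    register; its layout is an assumption of this sketch.
    """
    if (fund, inventory, unit) in digitized:
        # Already digitized: the order is fulfilled from the working copy.
        return "use working copy"
    return "scan"


if __name__ == "__main__":
    register = {("R-100", "1", "12")}  # hypothetical register contents
    print(route_order(register, "R-100", "1", "12"))
    print(route_order(register, "R-100", "1", "13"))
```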

2.14. Digitization (scanning) of documents.
General approaches and requirements

The decision on the method of digital processing (scanning or digital photography) is made by the head of the structural unit that is entrusted with creating electronic copies.

Regardless of the goals, objectives, requirements of orders, etc., an archival document is digitized once.

The result of the digitization process is an electronic master copy of the document.

Requirements for creating a master copy:

General requirements:

Technical requirements for creating a master copy using scanning equipment:

Every day, the work of creating electronic copies should begin with:

    carrying out routine adjustment of scanning equipment using a scanner calibration kit (calibration tables);

    calibrating the computer monitor;

    carrying out procedures for setting up equipment in accordance with the "Methodological recommendations, software for assessing and controlling the quality of the functioning of scanning equipment when performing work on the digitization of archival documents in the Russian state archives".

All three types of adjustment must be carried out after each (any) shutdown of the equipment.

The results of the equipment setup should be recorded daily in the Electronic Copies Creation Log or the Routine Setup Protocol (Appendix No. 6).

Carrying out the scanner setup procedure does not exclude the use, during digitization (creating a master copy of paper-based archival documents), of special test objects (color and gray scales, resolution test targets) intended for subsequent control of the color, contrast and sharpness of the electronic image during its storage. The scales are placed next to the original and must fall within the scanning area. In this case, the electronic master copy must contain images of both the original archival document and the test objects.

However, since test objects and resolution targets are quite expensive, when digitizing archival documents one can limit oneself to the three types of equipment setup described above.

In order to exclude the re-digitization of documents, the creation of a master copy is carried out with the highest possible technical parameters:

Resolution of at least 300 dpi - for digitizing documents in A4 format and more;

Resolution not less than 600 dpi - for digitizing documents of less than A4 format;

True Color mode.

See Table #1 for details.

When scanning photographic documents and documents with fine lines or fine details, such as drawings and maps, as well as documents in poor physical condition, the resolution must be at least 600 dpi.
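The resolution rules above can be expressed as a small decision function. This is a sketch under stated assumptions: the name `min_master_dpi` is hypothetical, and "A4 and more" is interpreted here as both sides being at least 210 x 297 mm, which the text itself does not spell out. The `fine_detail` flag stands for photographic documents, fine lines or details, and poor physical condition.

```python
A4_MM = (210, 297)  # short side, long side


def min_master_dpi(width_mm, height_mm, fine_detail=False):
    """Return the minimum master-copy resolution in dpi for one original."""
    short_side, long_side = sorted((width_mm, height_mm))
    at_least_a4 = short_side >= A4_MM[0] and long_side >= A4_MM[1]
    if fine_detail or not at_least_a4:
        return 600
    return 300


if __name__ == "__main__":
    print(min_master_dpi(210, 297))                    # A4 sheet
    print(min_master_dpi(148, 210))                    # A5 sheet
    print(min_master_dpi(420, 594, fine_detail=True))  # A2 map
```

These are minimum values; Table No. 1 remains the reference for the full set of parameters.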

Table No. 1

The main parameters of the process of creating electronic copies of archival documents

[Table flattened in the source. Recoverable column headings: Media/format; Scan mode (including shades of gray); Compression formats; Master copy. Recoverable rows: Paper (parchment) until the middle of the 19th century ("At least 600" given twice); Standard paper; Thin paper / tracing paper; Photo paper. The remaining cells repeat the values "Necessarily", "If necessary" and "If possible" and cannot be assigned to columns.]

If the document contains faded or hard to read text, two versions of the electronic master copy are created: in color mode and in grayscale mode.

In this case, depending on the capabilities of the scanning equipment, the following methods of solving the problem are possible:

Each individual image (cover, spread, page, reverse side, etc.) of a digitized document is a separate file, which is automatically numbered in order in the digitization program built into the scanning equipment (in the project).

These files, created as a result of scanning and presented in the digitizing program (in the project), are original master copies of the highest quality, since they have not been subjected to external influences (copying, updating, replication, emulation, migration, graphic processing, conversion to another format, etc.). However, the original master copies can be stored in the project only for a limited period of time, so the created files are transferred from the scanning program to a graphic editor, after which, WITHOUT any processing (except for renaming the files), they are written to the built-in storage media (server, data storage system, e-library). These first saved and recorded files are the master copies.

Graphic processing of electronic master copies is not allowed!

Master copies must be kept:

In order to avoid the loss of digital information on the built-in storage media, one-time replication of the received files to external storage media - compact or optical discs (CD-R, DVD-R) is allowed.

Electronic copies are written to CDs or other optical discs in a way that excludes the possibility of subsequently adding information to the medium.

It is mandatory to store each CD or optical disc in its individual primary packaging (preferably in a hard box).

Deletion of files created as a result of scanning and presented in the digitizing program (in the project) from the memory of the scanning device and the working computer before recording

Technical requirements for creating electronic master copies using a digital camera do not fundamentally differ from the requirements for creating master copies using scanners.

However, it is worth emphasizing that digital photography should be done at the highest possible camera resolution (at least 150 DPI), in color mode (exceptions are described above) and using test objects.

To represent the real physical dimensions of the original document, it is mandatory to place rulers next to it.

When documents are digitized with a digital camera, file names are assigned to the images automatically. After transferring the frames from the memory card to the computer's hard disk, the names of the digitized images must be brought into line with the numbers of the original sheets using a graphics editor. Any other graphic processing of images is not allowed.

Deleting images created as a result of digital photography and presented on the camera's memory card before recording received files to built-in storage media and subsequent replication to external storage media is strictly prohibited!
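The rule that source files may be deleted only after they have been recorded elsewhere can be enforced in software. The sketch below is one possible approach, not a procedure taken from the recommendations: it assumes that the built-in storage and the external medium are both reachable as directories, and it uses SHA-256 checksums (an added safeguard not mentioned in the text) to confirm each copy before the source is removed. The function names are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, destinations):
    """Copy `source` into every destination directory and verify each copy."""
    source = Path(source)
    expected = sha256_of(source)
    for dest_dir in destinations:
        dest_dir = Path(dest_dir)
        dest_dir.mkdir(parents=True, exist_ok=True)
        target = dest_dir / source.name
        shutil.copy2(source, target)
        if sha256_of(target) != expected:
            return False
    return True


def release_source(source, destinations):
    """Delete `source` only after every copy has been verified."""
    if not copy_and_verify(source, destinations):
        return False
    Path(source).unlink()
    return True
```

A call such as `release_source("card/IMG_0001.tif", ["/srv/efp/master", "/mnt/external"])` (hypothetical paths) leaves the image on the memory card untouched if any copy fails verification.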

2.15. Quality control of electronic copies

Currently, there are no developed and tested methods for automated quality control of created electronic copies, therefore it is advisable to use a combination of visual control methods listed below:

    sheet-by-sheet viewing and comparison of original documents with electronic copies;

    checking that the number of electronic files matches the number of sheets, and checking the sequence of sheets;

    checking that electronic copies of the reverse sides of document sheets are present;

    analysis of image quality on a monitor screen with a resolution of 1280x1024 pixels, including color reproduction, sharpness, contrast;

    checking the readability of the document at 200% scaling;

    image density estimation;

    analysis of the quality of printouts of selected graphic images created on a printer with a resolution of 600 dpi.

The quality control of electronic copies should be carried out repeatedly at different stages of the creation of EFP-1:

The results of each stage of quality control of electronic copies must be documented and reflected in the control protocol (act) (Appendix No. 7, 8, 9, 10). If an electronic copy is found defective at any stage, this is entered into the protocol (act) of that stage, which is checked against the control protocol (act) of the previous stage and serves as the basis for repeating the work on the electronic master copy.

It is possible to control the quality of electronic copies using:

In this case, a special note is made in the protocols (acts of verification) stating which software tool was used and what results it gave in assessing the quality of the electronic copy.

Quality control of electronic copies should be carried out for each file. Selective (sample-based) control is admissible only in exceptional cases, for sources of the same type (mass sources).
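The count and sequence checks from the list above can be supported by a short script, which supplements but does not replace sheet-by-sheet visual control. The following is a minimal Python sketch; it assumes that the sheet numbers have already been extracted from the file names and that one number is passed per sheet. The function name `check_sheet_sequence` and its arguments are hypothetical and are not part of the guidelines.

```python
def check_sheet_sequence(sheet_numbers, expected_count):
    """Compare sheet numbers taken from file names with the expected sheet count.

    sheet_numbers  - sheet numbers in the order the files are listed (assumption:
                     one entry per sheet, numbering starts at 1)
    expected_count - number of sheets in the original storage unit
    """
    seen = set()
    duplicates = []
    for number in sheet_numbers:
        if number in seen and number not in duplicates:
            duplicates.append(number)
        seen.add(number)

    # Sheets of the original that have no electronic copy
    missing = [n for n in range(1, expected_count + 1) if n not in seen]

    return {
        "count_matches": len(seen) == expected_count and not missing,
        "missing": missing,
        "duplicates": duplicates,
        "ascending": list(sheet_numbers) == sorted(sheet_numbers),
    }
```

A non-empty `missing` or `duplicates` list would be a reason to enter the storage unit into the control protocol (act) and recheck it by hand.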

2.16. Marking of electronic master copies

Each file of the electronic master copy must have a unique cipher-marking name. The development and implementation of a unified marking of electronic copies is aimed at:

    unique identification of the electronic copy;

    the ability to correlate the master copy with the original archival document;

    the possibility of arranging electronic copies of sheets of each digitized case in the catalog structure in ascending order of sheet numbers in order to facilitate their sheet-by-sheet viewing.

The main principle that must be observed when marking electronic copies is the inclusion in the file name structure of all elements of the document's archive cipher. Below is an example of marking an electronic copy of an archival document (the system of archival ciphers of the XX-XXI centuries).

If archival ciphers of documents are built according to a different scheme (which is typical for accounting documentation of the 19th - early 20th centuries), a system that fully reproduces this scheme should be developed and implemented for marking electronic copies.

Traditionally, file names contain the main search data of the archival document, separated by "_" (underscores): the abbreviation of the archive name (or the archive index in the automated system), fund number, inventory number, item (file) number, sheet number, front or back side code (1 - front side; 2 - back side), scanning mode (color - for color; s - for shades of gray), and the storage extension (format).

As additions to this scheme, the marking may also contain:

    P - letter index of the fund

    272 - fund number

    3 - inventory number

    a - letter to the inventory number

    8 - sheet number

    1 - obverse or reverse cipher

    color - scan mode

    tiff - format.

If the file is an image of sheets of a storage unit, digitized in a spread, the file name will look like this:

Example: 01_Р272_3а_964_8_2_9_1_color TIFF

    P - letter index of the fund

    272 - fund number

    3 - inventory number

    a - letter to the inventory number

    964 - number of storage unit (case)

    8 - sheet number

    2 - turnover code

    9 - sheet number

    1 - front side code

    color - scan mode

    tiff - format.

For marking electronic copies of documents of collections (non-stock organization of storage), the following marking scheme is proposed:

Example: 01_photo_3a_964_8_1_color TIFF, where

    Photo - the name of the collection;

    3 - inventory number

    a - letter to the inventory number

    964 - number of storage unit (case)

    8 - document number

    1 (2) - cipher of the front side (turnover)

    color - scan mode

    tiff - format.

The principle of marking should be unified for the entire array of digitized documents.
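Since the marking must be uniform across the whole array, the file name can be assembled from the elements of the archive cipher by one function instead of being typed by hand each time. The sketch below is illustrative only: the class `ArchiveCipher`, the function `build_file_name` and the field names are hypothetical, and the dot before the extension is an assumption (the examples above show the format after a space).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchiveCipher:
    archive: str            # archive abbreviation or index, e.g. "01"
    fund_letter: str        # letter index of the fund, may be empty
    fund: int               # fund number
    inventory: int          # inventory number
    inventory_letter: str   # letter to the inventory number, may be empty
    item: int               # number of storage unit (case)
    sheet: int              # sheet number
    side: int               # 1 - front side, 2 - back side
    mode: str               # scan mode: "color", or "s" for shades of gray
    ext: str = "tiff"       # storage format


def build_file_name(c: ArchiveCipher) -> str:
    """Join the cipher elements with underscores in the order given in the text."""
    if c.side not in (1, 2):
        raise ValueError("side must be 1 (front) or 2 (back)")
    parts = [
        c.archive,
        f"{c.fund_letter}{c.fund}",
        f"{c.inventory}{c.inventory_letter}",
        str(c.item),
        str(c.sheet),
        str(c.side),
        c.mode,
    ]
    # Assumption: the extension is attached with a dot
    return "_".join(parts) + "." + c.ext
```

For example, `build_file_name(ArchiveCipher("01", "P", 272, 3, "a", 964, 8, 1, "color"))` returns `01_P272_3a_964_8_1_color.tiff`.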

The files of electronic copies are marked manually with all signatures. This slows down the creation of electronic copies but makes it possible, if necessary, to organize a separate electronic system for the accounting and storage of arrays of graphic images.

Histogram files (if necessary) are also marked, but in the position "scan mode" the designation "gr" is put.

2.17. Directory structure on built-in storage media
(storage of electronic master copies)

The organization of storage of electronic master copies on built-in storage media (server, data storage system, electronic library, RAID array) must comply with the principles of hierarchical accounting and description of archival documents and consist of a set of subfolders:

    Folder: Fund No.

      Folder: inventory number

        Folder: Item No.

          Folder: Sheet No. (Sheet No. range) of the document

            Folder: Color (color electronic copy)

            Folder: Grayscale (electronic copy made in grayscale mode - if necessary)

              Files in ascending order of sheet numbers

            Folder: (if necessary) histogram files for individual electronic copies

              Files in ascending order of sheet numbers.
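The hierarchy above maps directly onto a path built from the cipher elements. The following sketch shows one way to derive the target folder for a master copy; the function `master_copy_dir`, the folder name prefixes (`Fund_`, `Inventory_`, `Item_`, `Sheets_`) and the `Histogram` folder name are assumptions made for illustration, not names prescribed by the guidelines.

```python
from pathlib import PurePosixPath

# Hypothetical names of the lowest-level folders
ALLOWED_MODES = ("Color", "Grayscale", "Histogram")


def master_copy_dir(root, fund, inventory, item, sheets, mode="Color"):
    """Return the folder for master copies of one sheet range.

    The levels follow the hierarchy fund - inventory - item - sheet range - mode.
    """
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {ALLOWED_MODES}")
    return (
        PurePosixPath(root)
        / f"Fund_{fund}"
        / f"Inventory_{inventory}"
        / f"Item_{item}"
        / f"Sheets_{sheets}"
        / mode
    )
```

Within each such folder the files sort in ascending order of sheet numbers only if the sheet numbers in the file names are compared numerically or are zero-padded, which is worth checking in the viewer that is actually used.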

2.18. Storing electronic master copies on built-in media

Electronic master copies must be stored on the server / in the data storage system / in the electronic library, with the mandatory formation of RAID arrays, which are used to prevent the loss of information and to improve the reliability of data storage.

The structure of storing digital information on a RAID array must completely match the main storage on the server / in the storage system / in the electronic library.

In this case, both electronic master copies (on the server / in the storage system / in the electronic library, and on the RAID array) have the status of inviolable copies, and access to them is restricted as far as possible.

Replication of master copies to the server and RAID array is documented by the Act (Appendix No. 11).
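The requirement that the structure on the RAID array completely match the main storage can be checked before the Act is drawn up. The sketch below compares two directory trees by relative file paths; the function names `relative_files` and `compare_trees` are hypothetical, and the check covers only the presence of files, not their contents.

```python
import os


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            found.add(os.path.relpath(full_path, root))
    return found


def compare_trees(main_root, mirror_root):
    """Return (files only in main storage, files only in the mirror), both sorted."""
    main = relative_files(main_root)
    mirror = relative_files(mirror_root)
    return sorted(main - mirror), sorted(mirror - main)
```

Two empty lists mean that both locations hold the same set of files in the same folder structure.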

2.19. Recording electronic master copies to external storage media

Once the electronic master copies have been created and placed in the corresponding section of the directory on the built-in media and in the RAID array, they must be replicated to external media. This replica serves as the control copy of the master copy and must be stored in another archive division (the document preservation department).

The replication process is documented by an act (Appendix No. 11-a).

At the same time, on the basis of acts, a Log of replication of electronic copies (master copies) is maintained (Appendix No. 12).

2.20. Creation and marking of working electronic copies

To make active use of the created electronic copies possible, a working copy must be made from them; copies of the second and subsequent generations are then created from this working copy.

To do this, re-replication of electronic copies to another set of external storage media is carried out, confirmed by the preparation of an act (Appendix No. 11-a. The form coincides with the form of the Act of replication of electronic master copies to an external media).

At the same time, on the basis of the acts, the Log of replication of electronic working copies is maintained (The form of the log is the same as the form of the Log of replication of electronic master copies (Appendix No. 12)).

Working copies of electronic copies are made from master copies made with the maximum resolution and saved in *tiff format.

Electronic copies are replicated to CDs or other optical discs in a way that excludes any subsequent addition of information to these media.

Table 2. The main parameters of the process of creating working electronic copies of archival documents

Column headings: Media/format; Resolution (DPI); Presentation mode; Compression formats; Working copy; Shades of gray.

Media/format rows: paper (parchment) until the middle of the 19th century; standard paper; thin paper / tracing paper; photo paper.

[Table cells: each row repeats the sequence "Necessarily", "If necessary", "Necessarily"; no resolution values or compression formats are given.]

If later in the process of active use of the working copy it is necessary to replace it, then creating a new working copy (re-replication) is possible only:

In this case a new replication act is drawn up and a new entry is made in the Electronic Working Copy Replication Log. A note about the destruction is made in the old entry (with a reference to the number and date of the Act on the destruction of the external storage medium).

The created new working copy is assigned the marking of the destroyed copy (see clause 2.22. The field “date of copy creation” is filled with updated information).

In the Journal of registration of disks of working copies (EFP-2) (see clause 2.24, Appendix No. 17), a note about the Act on the technical condition of the copy and the Act of its destruction is made in the "Note" field of the entry for the first working copy. The newly created second working copy is registered anew (clause 2.24).

Copying working copies or creating copies of the second and subsequent generations from them in other departments of the archive and without following the described procedure is unacceptable.

The external media with working copies recorded on it remains in the structural unit, which is entrusted with the functionality of creating electronic copies of archival documents for use and creating derivative copies.

2.21. Ensuring the authenticity, reliability and integrity of the electronic copy

According to GOST 15489-1-2007, ensuring the authenticity, reliability and integrity of documents (including electronic copies of any generation) requires implementing and documenting procedures that control the creation, completeness and immutability, receipt, transmission, storage and selection of documents. These procedures ensure that the creators of documents are authorized and identified, and that documents are protected from unauthorized addition, deletion, modification, use and concealment (classification). It follows that these characteristics of electronic copies can be ensured only by developing, implementing and using a system for accounting and managing digital content, in which electronic copies of documents are recorded, tracked and associated "with metadata that reflect the operations performed with them in the course of business activities", and by strictly complying with the relevant digital resource management regulations.
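One common technical means of demonstrating immutability is to record a checksum of each file when it is created and to recompute it later. The guidelines do not prescribe this method; the sketch below is an illustration under that assumption, using SHA-256 from the Python standard library. The function names `sha256_of_file` and `verify_file` are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read until an empty bytes object signals the end of the file
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path, recorded_digest):
    """Return True if the file still matches the digest recorded earlier."""
    return sha256_of_file(path) == recorded_digest
```

The recorded digest could be kept as part of the metadata associated with the electronic copy, so that each later check is itself an operation that can be documented.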

2.22. Labeling of external storage media

For each external storage medium, a separate information insert is created. Inserts for external media (CD-R, DVD-R), to which electronic master copies and working copies have been replicated, must indicate:

    Example:

      GA RF, disk No. 1; 07/22/2012. 523 files, 3.83 GB, EFP-1 - original; F.R-499, op.1, d.1-8

      File name

      File name

External media with working copies are marked in the same way, except that the "disc type" position reads "disc type (electronic copies - working copy (duplicate))".

Insert form - Appendix No. 15.

2.23. Quality control of external storage media

After burning optical discs, it is necessary to test them and control their quality (readability):

    visual control is carried out simultaneously with the recording of the disk and is recorded in the Log of the technical condition and diagnostics of external media (Appendix No. 16);

    checking for read failures using the "Scan Disc" utility;

    checking the readability of the information recorded on the media using software and hardware tools, and assessing the physical condition and safety of the media.

The result of the control is documented in an act (Appendix No. 13, clause 1.1).

Such control must be carried out in the mode of routine maintenance at least once a year for all available external (disk) storage media.

2.24. Media Registration

Each external media must be registered in the External Media Logs:

    Journal of registration of external media (disks) with electronic master copies (EFP-1);

    Journal of registration of external media (disks) of working copies (EFP-2).

    Journal of registration of external media (disks) with copies of the second and subsequent generations (EFP-3).

    (The forms of all three journals differ only slightly from each other. Appendix No. 17)

Registration is carried out by an employee of the structural unit, which is entrusted with the functionality of creating electronic copies upon completion of recording each new disc.

The journals must be stitched and their sheets numbered; the number of sheets is recorded in the certification sheet or certification record at the end of the journal. The journals are rotated.

2.25. Transfer of external storage media

Recorded external media with electronic master copies of archival documents are transferred to the department for ensuring the safety of documents, or to the archive division entrusted with storing the archive's fund of use, under a certificate of acceptance and transfer of master copies for storage (Appendix No. 18) drawn up in the structural division entrusted with creating electronic copies.

External media are accepted as storage units in a single copy: the control copy on which the electronic master copies are recorded.

When electronic copies are received, the employee of the document preservation department, or of the archive department entrusted with storing the archive's fund of use, checks the scientific reference apparatus (inventory (Appendix No. 19), certification sheet for the inventory, cards of electronic media) and other accompanying information, the completeness of the set of electronic copies, and their visual and technical condition.

External media with electronic working copies are stored in the structural unit, which is responsible for the creation of electronic copies, and are used to make copies for various purposes.