
E-book & Streaming Media Management

This guide describes E-book and streaming media collection management resources and tips.

Data & File Requirements

Bibliographic records in a data sync collection must meet the following requirements.

1. Leader and directory
| Leader offset | Leader element in MARC bibliographic format | Valid values in MARC bibliographic format |
| --- | --- | --- |
| 00-04 | Record length | Computer-generated, five-character number equal to the length of the entire record, including itself and the record terminator. The number is right justified and each unused position contains a zero. |
| 05 | Record status | a, c, d, n, p |
| 06 | Type of record | a, c, d, e, f, g, i, j, k, m, o, p, r, t |
| 07 | Bibliographic level | a, b, c, d, i, m, s |
| 08 | Type of control | blank space, a |
| 09 | Character coding scheme | blank space, a |
| 10 | Indicator count | 2 |
| 11 | Subfield code length | 2 |
| 12-16 | Base address of data | Computer-generated, five-character numeric string that indicates the first character position of the first variable control field in a record. The number is right justified and each unused position contains a zero. |
| 17 | Encoding level | blank space, 1, 2, 3, 4, 5, 7, 8, u, z |
| 18 | Descriptive cataloging form | blank space, a, c, i, n, u |
| 19 | Multipart resource record level | blank space, a, b, c |
| 20 | Length of the length-of-field portion | 4 |
| 21 | Length of the starting-character-position portion | 5 |
| 22 | Length of the implementation-defined portion | 0 |
| 23 | Undefined | 0 |
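
The leader rules in the table above can be checked mechanically. The following is a minimal sketch, not part of any official tool: the function name `check_leader` and the rule table `LEADER_RULES` are illustrative names, and the allowed values are simply transcribed from the table. It assumes the leader is available as a 24-character string in which blanks are real spaces.

```python
from typing import List

# Allowed values per leader position, transcribed from the table above.
# A blank space is written as " ".
LEADER_RULES = {
    5: set("acdnp"),            # record status
    6: set("acdefgijkmoprt"),   # type of record
    7: set("abcdims"),          # bibliographic level
    8: set(" a"),               # type of control
    9: set(" a"),               # character coding scheme
    10: {"2"},                  # indicator count
    11: {"2"},                  # subfield code length
    17: set(" 1234578uz"),      # encoding level
    18: set(" acinu"),          # descriptive cataloging form
    19: set(" abc"),            # multipart resource record level
    20: {"4"},                  # length of the length-of-field portion
    21: {"5"},                  # length of the starting-character-position portion
    22: {"0"},                  # length of the implementation-defined portion
    23: {"0"},                  # undefined
}

# Zero-padded, right-justified numbers: (start, end exclusive, name).
NUMERIC_RANGES = [(0, 5, "record length"), (12, 17, "base address of data")]


def check_leader(leader: str) -> List[str]:
    """Return a list of problems found in a 24-character MARC leader."""
    if len(leader) != 24:
        return [f"leader must be 24 characters long, got {len(leader)}"]
    problems = []
    for start, end, name in NUMERIC_RANGES:
        value = leader[start:end]
        if not (value.isascii() and value.isdigit()):
            problems.append(
                f"leader/{start:02d}-{end - 1:02d} ({name}): {value!r} is not numeric"
            )
    for position, allowed in sorted(LEADER_RULES.items()):
        if leader[position] not in allowed:
            problems.append(
                f"leader/{position:02d}: invalid value {leader[position]!r}"
            )
    return problems


print(check_leader("00714cam a2200205 a 4500"))  # a well-formed leader
print(check_leader("00714xam a2200205 a 4500"))  # 'x' is not a valid record status
```

The sketch only checks that positions 00-04 and 12-16 are numeric; it does not verify that they match the actual record length or base address.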
2. Fields and subfields

| Field/Subfield | Name | Requirement |
| --- | --- | --- |
| 008/15-17 | Country codes | Ensure Country code (008/15-17) is not blank. |
| 008/35-37 | Language codes | Codes should all be in lowercase. |
| 035 | System Control Number | Required if available. If available, include an OCLC control number, with valid prefix, in every record. |
| 040$b | Cataloging Source: Language of cataloging | Include a language code if any cataloging data is in a language other than English. If this is not coded, our system will assume the item is cataloged in English. |
| 040$e | Cataloging Source: Description conventions | Include a cataloging description MARC code for rare and archival materials only. |
| 066 | Character Sets Present | Where this field exists, include 880 fields. |
| 245 | Title Statement | This tag is mandatory. Include the title proper. |
| 5xx | Note Fields | Use UTF-8 Unicode or MARC-8 character encoding. |
| 6XX | Subject Fields | Common error: 6xx 2nd indicator 4 (Source not specified). The formulation of the subject added entry conforms to a controlled list, but the source cannot be specified by one of the thesaurus or subject heading systems covered by the other 2nd indicator values or by a code for a specific subject heading list in $2. |
| 6XX | Subject Fields | Common error: 6xx 2nd indicator 7 plus $2 (Source is specified in $2). Subject headings or terms are based on other subject authorities (i.e. on authorities other than those listed here). Identify the source in $2. |
| 880 | Alternate Graphic Representation | Where this field exists, include field 066. |
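
Several of the field requirements above are simple enough to test in code. The sketch below is illustrative only and covers a subset of the table: the 008 country and language positions, the mandatory 245, the 066/880 pairing, and 6xx fields with 2nd indicator 7 but no $2. The record layout is an assumption made for this example: a dictionary mapping each tag to a list of field strings, where data fields start with their two indicators followed by `$`-delimited subfields, and blanks may be written either as spaces or as backslashes. The function name `check_fields` is hypothetical.

```python
from typing import Dict, List


def check_fields(record: Dict[str, List[str]]) -> List[str]:
    """Return a list of problems for one record (assumed layout, see above)."""
    problems = []

    for f008 in record.get("008", []):
        if len(f008) < 38:
            problems.append("008: too short to contain positions 15-17 and 35-37")
            continue
        # 008/15-17: country code must not be blank (blank may be " " or "\").
        if not f008[15:18].replace("\\", " ").strip():
            problems.append("008/15-17: country code is blank")
        # 008/35-37: language code must be lowercase.
        language = f008[35:38]
        if language != language.lower():
            problems.append(f"008/35-37: language code {language!r} is not lowercase")

    # 245 is mandatory.
    if not record.get("245"):
        problems.append("245: title statement is missing")

    # 066 and 880 must appear together.
    if bool(record.get("066")) != bool(record.get("880")):
        problems.append("066/880: one field is present without the other")

    # 6xx with 2nd indicator 7 must name its source in $2.
    for tag, values in record.items():
        if len(tag) == 3 and tag.startswith("6"):
            for value in values:
                if len(value) >= 2 and value[1] == "7" and "$2" not in value:
                    problems.append(f"{tag}: 2nd indicator is 7 but $2 is missing")

    return problems


# A hypothetical record used only to show the call.
example_008 = "240101s2024    " + "nyu" + " " * 17 + "eng" + " d"
example = {
    "008": [example_008],
    "245": ["10$aAn example title"],
    "650": [" 7$aLibraries$2fast"],
}
print(check_fields(example))
```

Requirements that depend on cataloging judgment, such as whether 040$b is needed, are left out of the sketch on purpose.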

 

3. Local system number

The 001 field of each record should contain the MMS ID.

4. File name

The file name should start with the 7-digit collection ID (required), followed by 'CDS' (required), then either the date the file was created or a description, and finally the file type (e.g. '.mrc' or 'marcxml'), with each part separated by a period. (The collection ID is only available after the data sync collection is created.)
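
The naming rule can be expressed as a small check. This is a sketch under stated assumptions, not an official validator: the function name `check_file_name` is hypothetical, the set of accepted file types is taken only from the two examples given above, 'CDS' is treated as case-sensitive, and the sample name `1234567.CDS.20240101.mrc` is made up for illustration.

```python
import re
from typing import List

# Assumption: only the file types mentioned in the guide are accepted.
ALLOWED_FILE_TYPES = {"mrc", "marcxml"}


def check_file_name(name: str) -> List[str]:
    """Return a list of problems with a data sync file name."""
    parts = name.split(".")
    if len(parts) < 4:
        return [
            "expected at least four period-separated parts: "
            "<collection ID>.CDS.<date or description>.<file type>"
        ]
    problems = []
    if not re.fullmatch(r"[0-9]{7}", parts[0]):
        problems.append(f"collection ID {parts[0]!r} is not a 7-digit number")
    if parts[1] != "CDS":
        problems.append(f"second part is {parts[1]!r}, expected 'CDS'")
    if any(part == "" for part in parts[2:-1]):
        problems.append("date or description part is empty")
    if parts[-1].lower() not in ALLOWED_FILE_TYPES:
        problems.append(f"unexpected file type {parts[-1]!r}")
    return problems


print(check_file_name("1234567.CDS.20240101.mrc"))  # hypothetical file name
```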

Export Data from Alma

Since the MMS ID is needed for the data sync collection, we need to export the MARC records from Alma.

  • Create a set of e-book records (e-titles):
    • Using advanced search, search for e-titles by collection name or other criteria
    • Select the records you want to export or click on Select All, then click on Save and Filter Query. Save it to a set.
  • Export the set:
    • Go to Admin > Run a Job, find Export Bibliographic Records, select Next on the top right
    • Find the set you created, select Next on the top right
    • Change format from XML to Binary
    • Click on Next and then Submit
    • Go to Admin Jobs > Monitor Jobs, click on the job you just ran
    • The job report includes a link to the MARC records of the set. Click the link and the MARC file will start downloading
  • Rename the MARC file if the collection ID is available

Evaluate MARC Data

Records selected in WCM usually already meet the requirements above. If they do not, we need to make sure the collection meets those requirements before uploading it to WCM. You may follow these three steps to evaluate the data:

1. Identify non-UTF-8 field(s)

  • Use MarcEdit's MARCValidator to identify non-UTF-8 fields
  • If non-UTF-8 fields are found, you can use the editing tools in MarcEdit to delete the affected fields, subfields, or records
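
For a quick check outside MarcEdit, the same idea can be sketched in Python by walking the leader and directory of each binary MARC record and trying to decode every field as UTF-8. This is an illustration of the record structure, not a description of how MARCValidator works internally. The function name `find_non_utf8_fields` is hypothetical, and the sketch assumes well-formed records (a numeric base address and directory entries of 12 bytes each).

```python
from typing import List, Tuple

RECORD_TERMINATOR = b"\x1d"
LEADER_LENGTH = 24
DIRECTORY_ENTRY_LENGTH = 12  # 3-byte tag + 4-byte field length + 5-byte start position


def find_non_utf8_fields(data: bytes) -> List[Tuple[int, str]]:
    """Return (record index, tag) for every field whose bytes are not valid UTF-8."""
    hits = []
    # Ignore the empty chunk that follows the last record terminator.
    records = [r for r in data.split(RECORD_TERMINATOR) if len(r) >= LEADER_LENGTH]
    for index, record in enumerate(records):
        base = int(record[12:17])  # leader/12-16: base address of data
        # The directory sits between the leader and the data; drop its terminator.
        directory = record[LEADER_LENGTH:base - 1]
        last_entry = len(directory) - DIRECTORY_ENTRY_LENGTH
        for offset in range(0, last_entry + 1, DIRECTORY_ENTRY_LENGTH):
            entry = directory[offset:offset + DIRECTORY_ENTRY_LENGTH]
            tag = entry[:3].decode("ascii", errors="replace")
            length = int(entry[3:7])
            start = int(entry[7:12])
            field = record[base + start:base + start + length]
            try:
                field.decode("utf-8")
            except UnicodeDecodeError:
                hits.append((index, tag))
    return hits
```

To use it, read the exported file in binary mode, for example `open(path, "rb").read()`, and pass the bytes to the function. Records that are legitimately encoded in MARC-8 may also be reported, because MARC-8 bytes for non-ASCII characters are usually not valid UTF-8.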

2. Use MarcEdit to evaluate some of the field and subfield requirements

  • Open MarcEdit and click Tools > Export > Export Tab Delimited Records at the top

            

  • Choose the MARC File and the Save File, select 'Tab' as the delimiter, uncheck Normalized Data (because we want to keep indicators), and add the following fields and subfields:
    • LDR; 008; 035$a; 040$b; 040$e; 066; 245; 5xx; 6xx; 880

            

  • Click Process. MarcEdit will generate a TXT file with the name and location you defined in the previous step.
  • If you want to review the result, we suggest opening it in Google Sheets, as Excel might not read the file correctly
    • Click Import > Upload, select the TXT file, and select Tab as the separator.

3. Run a Python script

The Python script takes the TXT file exported from MarcEdit as its input. Download the Python script into the same folder as the TXT file, open a terminal, and type 'python data_sync_validator.py <filename of the TXT file>' (e.g. python data_sync_validator.py report.txt). If there are no problematic records in the TXT file, the script prints 'All records are valid!' in the terminal. Otherwise, it generates a report and a CSV file of all problematic records in the same folder.
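
The general flow of such a script can be sketched as follows. This is not the actual data_sync_validator.py, whose contents are not shown in this guide. It is a minimal illustration that assumes the TXT file has a header row with the field names as column titles (for example `245`, `066`, `880`), and it applies only two of the checks from the requirements section. The function names and the output file name `problem_records.csv` are hypothetical.

```python
import csv
from typing import Dict, Iterable, List


def find_problem_rows(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Check each tab-delimited row and return one entry per problematic row."""
    problem_rows = []
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    # Row 1 is the header, so data rows start at 2.
    for row_number, row in enumerate(reader, start=2):
        problems = []
        if not (row.get("245") or "").strip():
            problems.append("245 is missing")
        has_066 = bool((row.get("066") or "").strip())
        has_880 = bool((row.get("880") or "").strip())
        if has_066 != has_880:
            problems.append("066 and 880 must appear together")
        if problems:
            problem_rows.append(
                {"row": str(row_number), "problems": "; ".join(problems)}
            )
    return problem_rows


def validate_file(txt_path: str, csv_path: str = "problem_records.csv") -> None:
    """Read the exported TXT file and report problematic rows."""
    with open(txt_path, newline="", encoding="utf-8") as handle:
        problem_rows = find_problem_rows(handle)
    if not problem_rows:
        print("All records are valid!")
        return
    with open(csv_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.DictWriter(out, fieldnames=["row", "problems"])
        writer.writeheader()
        writer.writerows(problem_rows)
    print(f"{len(problem_rows)} problematic record(s) written to {csv_path}")
```

Calling `validate_file("report.txt")` would mirror the behavior described above: a single message when everything passes, or a CSV file listing the rows that need attention.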