Differences

This shows you the differences between two versions of the page.

public:nnels:cataloguing:metadata-cleanup [2019/01/11 10:23]
robert.macgregor
public:nnels:cataloguing:metadata-cleanup [2023/09/20 09:34] (current)
robert.macgregor
Line 1: Line 1:
-====== How to Clean Up Book Metadata in Drupal  ======+=====Fixing a Record in Drupal Step by Step=====
  
-  - Login at https://nnels.ca using your account information +====Logging In and Selecting Records to Edit====
-  - Once you login, you should see a link called ''Shortcuts'' at the top of the page +
-  - Click on ''Summary/Subjects Editor'' in the drop-down menu. This will take you to a list of all our repository items (book records). +
-    - You can change the search criteria to only display books that are missing any or all of: subjects, summaries, genres, literary format, and audience. +
-  - **Select a Record Set** to work on. Refer to the [[public:nnels:publishing:projects:cataloguing:metadata-cleanup:completed-records|Completed Records Set page]].  +
-  - Click on the ''edit'' link (in the far right column) for a particular book in order to edit it. +
-  - This will take you to the book record. The two tabs that are of most concern are ''Genre/Formats'' and ''Subjects/Descriptors''.  +
-  - You want to check that there’s data in all of these fields: +
-    - **Genre** - the terms must be separated by commas; select Genre terms from the [[public:nnels:publishing:projects:cataloguing:metadata-cleanup:genre|NNELS Genre Taxonomy]]. +
-    - **Literary Format** - select one from the drop-down menu +
-    - **Subject** - use FAST headings (below); the subject terms must be separated by commas; if the term contains commas then it must be surrounded by double quotation marks +
-    - **Audience** - select one from the drop-down menu (except "specialised", please don't use "specialised"+
-    - **Abstract** - you can copy and paste from Amazon, WorldCat, or the publisher’s website; clear all formatting when done (highlight text and click that white eraser icon) +
-  - Other fields to check data for: +
-    - Click on the ''Basic Details'' tab. If you notice that the **Title** of a book contains weird formatting or is missing punctuation - please fix this to the correct title! Also, we prefer to use sentence case for titles. +
-    - Click on the ''Creators/Contributors'' tab. If you notice that the **Creator** field of a book is empty - please add the author's name(s) in the Creator field in "Last name, First name" format +
-===== Assigning Genres =====+
  
-Select from the genre terms found on [[public:nnels:publishing:projects:cataloguing:metadata-cleanup:genre|this page]] of the wiki.+  *Log in at https://nnels.ca using your account information
 +  *Once you log in, you should see a link called ''Shortcuts'' at the top of the page
 +  *Click on ''Summary/Subjects Editor'' in the drop-down menu. This will take you to a list of all our repository items (book records). 
 +  * **Select a Record Set** to work on. Refer to the [[public:nnels:cataloguing:metadata-cleanup:completed-records|Completed Records Set page]].
 +  *Click on the ''edit'' link (in the far right column) for a particular book in order to edit it.
  
-===== Assigning Subject Keywords ===== +====Editing Records====
- +
-[[http://classify.oclc.org/classify2/|OCLC Classify]] +
- +
-Use OCLC Classify to help you assign subject keywords to books. Use the **FAST Subject Headings**.  +
- +
-For example, the [[http://classify.oclc.org/classify2/ClassifyDemo?search-title-txt=reluctant%20communist&startRec=0|FAST Subject Headings for The Reluctant Communist]] are: Manners and customs, Korea (North), "Jenkins, Charles Robert, 1940-", Military deserters, United States, Americans, Defectors, "Korean War (1950-1953)" +
- +
-===== Assigning Data to Other Fields===== +
- +
-For assigning missing audience, literary format, and other metadata to NNELS records, we recommend using the following resources to help you out: +
- +
-  * [[http://www.worldcat.org/|WorldCat]] +
-  * Public library catalogues, i.e. [[https://vpl.bibliocommons.com|Vancouver Public Library]] +
-  * Evergreen - you can search for a title in Evergreen and view the MARC records created by many other libraries +
- +
----- +
- +
-=====Fixing a Record in Drupal Step by Step=====+
  
 These fields are all mandatory.  This is the order I do things in. These fields are all mandatory.  This is the order I do things in.
  
 +Select a Record Set to work on. Refer to the [[public:nnels:cataloguing:metadata-cleanup:completed-records|Completed Records Set page]].
  
 ====1 Basic Details==== ====1 Basic Details====
Line 53: Line 24:
   *Only capitalize the first word of the title and proper names.   *Only capitalize the first word of the title and proper names.
   *Use a colon to separate any subtitle from the title (space, colon, space).  Ex:  The hobbit : there and back again   *Use a colon to separate any subtitle from the title (space, colon, space).  Ex:  The hobbit : there and back again
-  *If the title has a reference to the series and it's position in the series, use "book" to identify where it falls in the series (there are a lot of ways these come from the publishers - volume, no., number, book, etc. - this isn't terribly important, but it makes things more uniform).  Ex:  The two towers : the lord of the rings book 2+ 
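The colon rule above is mechanical enough to sketch in code. This is only a minimal Python sketch and not part of the NNELS workflow; the function name ''normalize_subtitle_separator'' is made up for illustration. It does not attempt sentence case, because deciding which words are proper names still needs a person.

```python
def normalize_subtitle_separator(title: str) -> str:
    """Put exactly one space on each side of the first colon in a title.

    Hypothetical helper: only the first colon is treated as the
    title/subtitle separator; later colons are left untouched.
    """
    head, sep, tail = title.partition(":")
    if not sep:
        # No subtitle, so just trim stray whitespace.
        return title.strip()
    return f"{head.strip()} : {tail.strip()}"


print(normalize_subtitle_separator("The hobbit: there and back again"))
```

Running it on ''The hobbit: there and back again'' gives the form used in the example above (space, colon, space).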
  
  
 ====2 Creators / Contributors==== ====2 Creators / Contributors====
  
-This information should already be in the record, but it needs to be looked at quickly to make sure it is actually there, and is in the right format.  If the information needs to be added or if it looks wrong, some sources to find the right information are:  Google, Goodreads, Worldcat, OCLC Classify, publisher's website.+This information should already be in the record, but it needs to be looked at quickly to make sure it is actually there, and is in the right format.  If the information needs to be added or if it looks wrong, some sources to find the right information are:  OCLC Classify (first place to look, as it should have the name authority), Google, Goodreads, WorldCat, and the publisher's website.
  
 ===2.1 Creator=== ===2.1 Creator===
Line 70: Line 41:
   *If the item is a collection of short stories, it is sufficient to put in the editor's name.  Again, one line per editor.   *If the item is a collection of short stories, it is sufficient to put in the editor's name.  Again, one line per editor.
   *Some items are created by non-human entities, like corporations.  This may come up with academic printing presses or conference proceedings.  Use the corporate name.  Ex. University of Alberta Press   *Some items are created by non-human entities, like corporations.  This may come up with academic printing presses or conference proceedings.  Use the corporate name.  Ex. University of Alberta Press
-  *The OCLC website (discussed below) includes authorship - so this might clear up any confusion if their is any.+  *The OCLC website (discussed below) includes authorship - so this might clear up any confusion if there is any.
  
 ===2.2 DC Contributor=== ===2.2 DC Contributor===
Line 84: Line 55:
  
 These are subject headings that will be applied to the item.  Currently we use FAST subject headings and copy catalogue them from OCLC.  The website is:  [[http://classify.oclc.org/classify2/]] These are subject headings that will be applied to the item.  Currently we use FAST subject headings and copy catalogue them from OCLC.  The website is:  [[http://classify.oclc.org/classify2/]]
 +
 +<note>Remove Subject Heading ''Blacks'' from any title. We no longer use the Subject Heading ''Blacks'' as it is a culturally outdated term. We do accept more precise Subject Headings including ''Black race'', ''Author, Black'', ''Women, Black'' etc. Check OCLC or LC for the appropriate Subject Heading to use for each title.</note>
  
   *Search by title.  If it is a pretty generic title you may get a lot of hits (hundreds), in which case include the author's last name in your search.   *Search by title.  If it is a pretty generic title you may get a lot of hits (hundreds), in which case include the author's last name in your search.
Line 94: Line 67:
   *The Usage Count tells you how many libraries use each particular heading.  Sometimes there will be a list of headings that have a Usage Count of 1 (while the others have hundreds or thousands) - if there are a lot of these 1s then they can be omitted if there are a bunch of more used ones.   *The Usage Count tells you how many libraries use each particular heading.  Sometimes there will be a list of headings that have a Usage Count of 1 (while the others have hundreds or thousands) - if there are a lot of these 1s then they can be omitted if there are a bunch of more used ones.
   *If you can't find any Subject headings to copy and paste, try to find something similar and take one or two that fit.  If the item is part of a series, you can probably take one from one of the other books.   *If you can't find any Subject headings to copy and paste, try to find something similar and take one or two that fit.  If the item is part of a series, you can probably take one from one of the other books.
 +  *If a record set comes with BISAC terms, those should be kept. You can find a full list of terms on the BISAC website at [[https://bisg.org/page/BISACEdition|Complete BISAC Subject Headings List, 2021 Edition]].
 +  *LCSH terms can be used if FAST terms are difficult to find, or at the cataloguer's discretion if it would speed up the process significantly (for example, if a large record set comes with robust LCSH terms already attached). Many FAST terms are deconstructed LCSH terms.
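The Usage Count guideline above can be sketched as a small filter. This is a rough Python illustration only: the function name, the ''min_common'' threshold (how many well-used headings count as "a bunch"), and the counts in the example are all assumptions, not values from OCLC Classify.

```python
from typing import Dict, List


def drop_rarely_used(headings: Dict[str, int], min_common: int = 2) -> List[str]:
    """Omit headings with a Usage Count of 1 when enough better-used ones exist.

    `headings` maps a subject heading to its Usage Count.
    `min_common` is an assumed cut-off, not an NNELS rule.
    """
    common = [heading for heading, count in headings.items() if count > 1]
    if len(common) >= min_common:
        return common
    # Too few well-used headings: keep everything and decide by hand.
    return list(headings)


# Hypothetical counts, for illustration only.
example = {"Military deserters": 412, "Defectors": 230, "Manners and customs": 1}
print(drop_rarely_used(example))
```

When there are not enough well-used headings, the sketch keeps the full list so the cataloguer still makes the call.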
 +
 +=== Indigenous Subject Headings ===
 +
 +Replace outdated subject headings with more up-to-date terminology pulled from one of the following sources:
 +  * [[https://docs.google.com/spreadsheets/d/1qWWY5549qnS69_LpHEL7_XeiuyrX3onr58I4u3jrj5c/edit#gid=416984343|GVPL's Indigenous Subject Headings]]
 +  * [[https://xwi7xwa.library.ubc.ca/collections/indigenous-knowledge-organization/|X̱wi7x̱wa Library Subject Headings]]
 +  * [[https://main.lib.umanitoba.ca/indigenous-subject-headings|Manitoba Archival Information Network (MAIN) Subject Headings]]
 +
 +Refer to the Greater Victoria Public Library's (GVPL) list first.
 +
 +Use the following guidelines when working with Indigenous subject headings:
 +  * Use "Indigenous" for original peoples in all areas. 
 +  * In Canada, Indigenous refers to First Nations, Métis, and Inuit collectively. Use First Nations, Métis, or Inuit for materials about those groups specifically and individually.
 +  * First Nations and Métis do not require a geographic qualifier for Canada, as those terms are only in use for people within Canada, but may have a geographic qualifier for provinces or cities.
 +  * Inuit requires a geographic qualifier for Canada, as the term can also apply to peoples in Greenland and Alaska. 
 +  * "Indigenous peoples" should be followed by a geographic qualifier
 +  * If materials identify people specifically as "Métis" or "Michif," use the subject heading "Métis." Otherwise, use "Indigenous peoples -- Mixed descent."
 +  * Use the LCSH "Indian(s)" headings as pattern examples and examples of geographic subdivision
 +  * Whenever possible, add a geographic subdivision, broad (North America) or narrow (British Columbia -- Victoria). Add even if not in original heading (if one can be determined).
 +
 +The following are generalized workflow guidelines:
 +  * Remove the most culturally inappropriate headings from the catalogue. The term "Indians" is commonly encountered in LCSH; "Indigenous" is the more current and appropriate term.
 +  * Remove "Indian" from the names of Indigenous groups (example: use Dene Tha' not Dene Tha' Indians)
 +  * If the name of an Indigenous group is not on any of the three lists above, search for the preferred form of the name as identified by the group itself or by Indigenous reference sources (for example, use Haudenosaunee, not Iroquois)
 +  * This also means if the preferred form of the name as identified by the group includes the term "Indian," we can keep it in place. This may be more common with US Indigenous groups, as "American Indian" is typically viewed as acceptable by many Indigenous nations.
 +  * In some cases, in order to use more appropriate terminology, you may have to replace a subdivision with a first level heading (example might be the use of "Residential schools" as a subdivision)
 +  * Follow established rules in LC for geographic subdivisions; for example, Indigenous peoples -- Civil Rights -- Canada or Indigenous peoples -- Canada -- Claims
 +  * Be mindful when cataloguing items which use "folklore," "mythology," or "legends"; the content found within these items may have spiritual and religious roots and should get a "Religion" subject heading instead.
 +
  
 ===3.2 Audience=== ===3.2 Audience===
Line 114: Line 118:
  
   *If the abstract is empty or isn't right, find a summary and copy and paste it in.  Amazon, WorldCat, Goodreads, and the publisher's site are good places to find them.   *If the abstract is empty or isn't right, find a summary and copy and paste it in.  Amazon, WorldCat, Goodreads, and the publisher's site are good places to find them.
 +  *Delete characters that aren't displaying properly - sometimes there will be squares or other special characters that aren't correct.
 +  *Fix any spacing errors or missing punctuation, and remove any "&nbsp;"s that appear.
 +  *Replace any words or phrases in all caps with sentence case. Screen readers may read each letter rather than the entire word if in all caps. 
   *After pasting in the summary, or if there is one already present, highlight the whole thing and click the Remove Format button (the icon is a white eraser between the quotation marks and the Omega symbol).  This gets rid of any formatting like italics, bold, etc.   *After pasting in the summary, or if there is one already present, highlight the whole thing and click the Remove Format button (the icon is a white eraser between the quotation marks and the Omega symbol).  This gets rid of any formatting like italics, bold, etc.
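Some of the abstract cleanup above can be sketched in code. The following Python is a minimal illustration, not a tool NNELS uses; both function names are made up, and the two "bad" characters it strips (U+FFFD and the white square U+25A1) are assumptions about what the squares usually are. All-caps words are only flagged, not changed, since acronyms are legitimately in capitals and need a person to decide.

```python
import html
import re


def tidy_abstract(text: str) -> str:
    """Remove common paste debris from an abstract (hypothetical helper)."""
    text = html.unescape(text)            # turns "&nbsp;" into U+00A0, "&amp;" into "&", etc.
    text = text.replace("\u00a0", " ")    # non-breaking space -> ordinary space
    text = text.replace("\ufffd", "")     # replacement character
    text = text.replace("\u25a1", "")     # white square
    text = re.sub(r"[ \t]+", " ", text)   # collapse runs of spaces and tabs
    return text.strip()


def all_caps_words(text: str) -> list:
    """List words of two or more capital letters for manual review."""
    return re.findall(r"\b[A-Z]{2,}\b", text)


print(tidy_abstract("A  gripping&nbsp;tale\ufffd of survival. "))
print(all_caps_words("A NEW YORK TIMES bestseller"))
```

The Remove Format button in Drupal is still needed afterwards; this sketch only deals with stray characters and spacing.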
  
Line 119: Line 126:
  
 This is generally fine.  Items that aren't in English may have to have their language properly selected.  When working with a non-English record set, if the first handful of items have the language correctly set, you can assume that the rest are ok. This is generally fine.  Items that aren't in English may have to have their language properly selected.  When working with a non-English record set, if the first handful of items have the language correctly set, you can assume that the rest are ok.
- 
  
 ====4 Genre / Formats==== ====4 Genre / Formats====
Line 129: Line 135:
 Only one category can be picked. Only one category can be picked.
  
-  *Only use Non-fiction, Fiction, Comic strip, Drama, Short story, Poetry.+  *Only use Non-fiction, Fiction, Drama, Short story, Poetry.
   *Don't use the other formats.   *Don't use the other formats.
   *Non-fiction and Fiction should be obvious.   *Non-fiction and Fiction should be obvious.
-  *Comic strip is for comics and graphic novels.  Not for children's picture books. 
   *Drama is for plays.   *Drama is for plays.
   *Short story is for a single short story or a collection of short stories.   *Short story is for a single short story or a collection of short stories.
Line 141: Line 146:
 This field only needs one entry, but can have as many as necessary separated by a comma. This field only needs one entry, but can have as many as necessary separated by a comma.
  
-  *Here is a list of Genre terms with descriptions:  LINK!!!!+  *Here is a list of Genre terms with descriptions: [[public:nnels:cataloguing:metadata-cleanup:genre|NNELS Genre Taxonomy]]
   *It is important to ONLY use those terms.  The field will auto-populate in Drupal.  If an incorrect genre term is used, then Drupal will include that term in the list that it auto-populates from - it is time consuming to get rid of those incorrect terms periodically.   *It is important to ONLY use those terms.  The field will auto-populate in Drupal.  If an incorrect genre term is used, then Drupal will include that term in the list that it auto-populates from - it is time consuming to get rid of those incorrect terms periodically.
   *There are terms specifically for Non-fiction, and terms specifically for Fiction.   *There are terms specifically for Non-fiction, and terms specifically for Fiction.
-  *Sometimes a single genre is fine, sometimes multiple genres are better.  Ex:  Science fiction, Apocalyptic fiction might be better than just Science fiction.  The same applies to nonfiction.  Ex:  I would use History, Science for a history of medicine and medical procedures (in fact I did!). +  *Most of the time a single genre is fine; sometimes multiple genres are better.  Ex:  Science fiction, Apocalyptic fiction might be better than just Science fiction.  The same applies to nonfiction.  Ex:  I would use History, Medicine, Health and Fitness for a history of medicine and medical procedures (in fact I did!).  Just use the fewest genres needed to accurately describe the item.
-  *There are genre terms that should be added to describe the form or type of the item in addition to what it's about.  Ex:  Fantasy fiction, Comics (Graphic works) would be a fantasy graphic novel; Music, Nonfiction comics, Biographies would be a biography about a musician or musical group told in a graphic, comic book style format.+  *There are genre terms that should be added to describe the form or type of the item in addition to what it's about.  Ex:  Fantasy fiction, Comics (Graphic works) would be a fantasy graphic novel; Music, Nonfiction comics, Biographies and autobiographies would be a biography about a musician or musical group told in a graphic, comic book style format.
   *There are genre terms that signify special content that should be added as needed.  These are Canadian fiction, Canadian nonfiction, Canadian drama, Canadian poetry, French language materials, Indigenous materials, Juvenile fiction, Juvenile nonfiction, Young adult fiction, Young adult nonfiction.   *There are genre terms that signify special content that should be added as needed.  These are Canadian fiction, Canadian nonfiction, Canadian drama, Canadian poetry, French language materials, Indigenous materials, Juvenile fiction, Juvenile nonfiction, Young adult fiction, Young adult nonfiction.
-  *Canadian genre terms are for book by Canadian authors or about Canadian subjects.  Same with Indigenous materials+  *Canadian genre terms are for books by Canadian authors or about Canadian subjects.  Same with Indigenous materials.
-  *Special genres should be added redundantly.  Ex:  Poetry, Canadian poetry would allow a patron to find the item using different search avenues - whether they are looking for Canadian poetry or whether they are trying to find poetry in general.+
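Because only terms from the NNELS Genre Taxonomy should be entered, a quick check before saving can catch stray terms. This is a minimal Python sketch under stated assumptions: ''ALLOWED_GENRES'' holds just a handful of terms mentioned on this page (the real list is on the taxonomy page), the function name is made up, and it assumes no genre term itself contains a comma.

```python
# A small subset of genre terms mentioned on this page, for illustration only.
ALLOWED_GENRES = {
    "Science fiction",
    "Apocalyptic fiction",
    "Fantasy fiction",
    "Comics (Graphic works)",
    "Biographies and autobiographies",
    "Literary arts",
    "Juvenile fiction",
    "Picture books",
    "Canadian fiction",
    "Indigenous materials",
}


def unknown_genres(field_value: str, allowed=ALLOWED_GENRES) -> list:
    """Return the comma-separated terms that are not in the allowed list."""
    terms = [term.strip() for term in field_value.split(",") if term.strip()]
    return [term for term in terms if term not in allowed]


print(unknown_genres("Science fiction, Sci-fi"))
```

An empty result means every term in the field matched the list; anything returned should be corrected before it ends up in Drupal's auto-populate list.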
  
-Genre tips:+===Genre tips===
  
-  *Literary arts is used for books about books, criticism, libraries, etc.  It is also for books about authors.  Ex:  Literary arts, Biographies would be a biography about an author. +  *Literary arts is used for books about books, criticism, libraries, etc.  It is also for books about authors.  Ex:  Literary arts, Biographies and autobiographies would be a biography about an author. 
-  *For autobiographies/memoirs, use both Autobiographies and Biographies+  *Biographies and autobiographies covers memoirs as well as biographies
-  *Juvenile fiction can be tough because it's usually a big combination of Humorous fiction, Magical realist fiction, Science fiction, Fantasy fiction, Detective and mystery fiction, etc.  So instead of trying to pin it down, just use Juvenile fiction.  This also prevents juvenile results from showing up when patrons look for adult genre books like mystery or science fiction.  Genres that should be added to Juvenile fiction should be things like Comics (Graphic works), Nonfiction comics, Picture books, Choose-your-own stories.+  *Juvenile fiction can be tough because it's usually a big combination of Humorous fiction, Magical realist fiction, Science fiction, Fantasy fiction, Detective and mystery fiction, etc.  So instead of trying to pin it down, just use Juvenile fiction.  This also prevents juvenile results from showing up when patrons look for adult genre books like mystery or science fiction.  Genres that should be added to Juvenile fiction should be things like Comics (Graphic works), Picture books, Choose-your-own stories, and Canadian fiction and Indigenous materials.
   *Young adult items should be treated like adult books, in that they should get full genre treatment.  This is because young adult material tends to be more focused in its content, and also adults read them.   *Young adult items should be treated like adult books, in that they should get full genre treatment.  This is because young adult material tends to be more focused in its content, and also adults read them.
   *Picture books are specifically for children's picture books.   *Picture books are specifically for children's picture books.
 +  *If unsure, picture books can be identified in the WorldCat description - they are often around 30 pages long, unnumbered, illustrated, and oversized.  Ex from WorldCat Description field:  36 unnumbered pages : colour illustrations ; 24 cm.
   *If you can't figure out the genre, or it doesn't fit any of the categories, use Literature - only for fiction.   *If you can't figure out the genre, or it doesn't fit any of the categories, use Literature - only for fiction.
- 
  
 =====General tips===== =====General tips=====
Line 164: Line 168:
   *Sometimes the record won't save properly when you click on Save.  Click on View changes first, then hit Save.   *Sometimes the record won't save properly when you click on Save.  Click on View changes first, then hit Save.
  
 +====Fix invalid characters in Drupal====
 +
 +Sometimes when a record set is uploaded to Drupal there will be invalid characters (they will generally show up as a string of random nonsense characters).  This has to do with character encoding - MarcEdit uses MARC-8 encoding and Drupal uses UTF-8.  It is rarely a problem, but converting the character encoding should fix it.  This can be done in MarcEdit in MARC Tools when converting MRK to MRC or MRC to XML - just make sure "Default Character Encoding" is set to MARC-8 and the "Translate to UTF8" box is ticked.
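To show why a mismatch produces nonsense characters, here is a small Python illustration. It is only an analogy: Python's standard library has no MARC-8 codec, so Latin-1 stands in for "the wrong encoding". The actual fix for a record set is still the MarcEdit setting described above.

```python
original = "Métis"

# Text saved as UTF-8 bytes...
utf8_bytes = original.encode("utf-8")

# ...but read back as if it were a different encoding (Latin-1 as a stand-in).
garbled = utf8_bytes.decode("latin-1")
print(garbled)  # MÃ©tis

# Reversing the wrong step recovers the text, which is why converting
# the character encoding fixes the record.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired)  # Métis
```

The accented letter takes two bytes in UTF-8, and reading each byte as its own character is what creates the extra junk.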
public/nnels/cataloguing/metadata-cleanup.1547231013.txt.gz · Last modified: 2019/01/11 10:23 by robert.macgregor