Documents Data Miner 3

Searching Strategies









What is it?  Documents Data Miner 3 is a library management system for U.S. government documents.  A web-based data warehousing and data mining tool, DDM3 assists depository libraries in processing, cataloging, and bibliographic control of federal documents.  In addition, DDM3 contains a pilot module for a public access Catalog which can be set to a depository’s name and profile.


Development and Partnership:  DDM3 is based on the original Documents Data Miner, announced in 1998 as a partnership between the Government Printing Office and Wichita State University Libraries.  DDM2, announced in the Fall of 2001 as a pilot project, was a collaboration between the Wichita State University Libraries and Computing Center.  The GPO/WSU Partnership arrangements for DDM3 are in process.  The development team from University Libraries consists of Nan Myers, Associate Professor and Government Documents Librarian, and John Williams, Head of Acquisitions.  John Ellis, Manager of Internet Applications (retired) for the University IT Center, is the programmer for DDM, DDM2 and DDM3.



WHAT’S NEW (October 2015)   


Login is now required.  The Login feature was removed in May 2003 and has been reintroduced with the rollout of DDM3.  Users will have to register using their email address as the userid.


Full Text Indexing was added in spring 2003 and is still used in the MARC LOCATOR, URL LOCATOR and CATALOG modules.  In those modules, DDM3 searches on “words” rather than letters, with search logic.  DDM3 can search for:

  • A word or phrase.
  • The prefix of a word or phrase.
  • A word near another word.
  • A word inflectionally generated from another (for example, the word “drive” is the inflection stem of drives, drove, driving, and driven).
  • A word that has a higher designated weighting than another word.


Key changes:  Title searches in these three modules will require the use of quotes for exact title searching or for groups of words.  Incomplete SuDoc or Item Numbers will require the use of the % sign for a wildcard search, such as C 3%.  (Note: Other DDM3 modules use a “string search,” where there is no concept of words – just of letters.  Automatic left/right truncation is built into these modules.  Thus, a string search on “Kansas” will also bring up “Arkansas.”)
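The difference between a string search and a word search can be illustrated with a minimal Python sketch.  This is not DDM3's code; the function names string_search and word_search and the sample titles are hypothetical and only mimic the behavior described above.

```python
import re

def string_search(term, titles):
    """String search: case-insensitive substring match (automatic left/right truncation)."""
    needle = term.lower()
    return [t for t in titles if needle in t.lower()]

def word_search(term, titles):
    """Word search: the term must match a whole word, not part of a longer word."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return [t for t in titles if pattern.search(t)]

# Made-up sample titles for illustration only.
titles = ["Kansas Water Resources", "Arkansas River Basin Study"]

print(string_search("Kansas", titles))  # both titles, because "Arkansas" contains "kansas"
print(word_search("Kansas", titles))    # only "Kansas Water Resources"
```

In the word-based modules, a partial entry therefore needs an explicit wildcard, while in the string-based modules any fragment matches wherever it occurs.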

Improved Excel Formatting (XMLX Format):  Provides higher sorting capability when exporting into Excel.  Places leading zeros where needed, such as in item numbers or depository numbers.  Export to Excel now requires Excel 2000 with Service Pack 3 or above.  Older software users should select the CSV button for export.


Upgraded Servers.  DDM3 has been upgraded to SQL Server 2012 and IIS (web server) 8.1.  All the search modules are now written in C# using .NET Framework 4.5.  Initial programming was done in Visual Studio 2013.









Modules of DDM3 are selected from the “Search” tab in the menu at the top of the page.  Once users are in a module, such as the SHIPPING LISTS, they should navigate from the Banner at the top of the page or use the “Back” key in the browser.  The Banner offers the options to go HOME, to TOOLS, or to the Search Modules, and to e-mail us from Contact.  All pages in DDM3 have the menu bar at the top of the page for navigation.




·        Union List Configuration:  This is done from the List of Classes query return page.  There is a small dashboard at the top of the page where the user can select a filter (state, region or distance) and then add the parameter for that selection.  The user can then filter the current selection or save the configuration and then filter the selection.  Saved configurations will persist in the user profile until changed and saved again.


·        Exports and Downloads:  Users can build their own in-house databases with files from the TOOLS/downloads page.  A user can also download any selection from the search pages as either an Excel or CSV file.


Difference between “export” and “download”:

Export – will transfer the selected usable tabular data in ASCII text from the database directly to your hard drive.

Download – will import data, saved in a zipped file, to your computer.  You select the point of import (Excel, etc.).  Downloads are much faster.
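For users who build an in-house database from a downloaded file, the following is a minimal Python sketch of loading one CSV file from a zipped download into SQLite.  The file name listclas.csv, the column names, the table name, and the UTF-8 encoding are assumptions made for illustration; the actual download files may be laid out differently.

```python
import csv
import io
import sqlite3
import zipfile

def load_zipped_csv(zip_source, member, conn, table):
    """Load one CSV file from a zip archive into a SQLite table; return the row count."""
    with zipfile.ZipFile(zip_source) as archive:
        with archive.open(member) as raw:
            reader = csv.reader(io.TextIOWrapper(raw, encoding="utf-8", newline=""))
            header = next(reader)
            rows = list(reader)
    # Every column is stored as TEXT so leading zeros in item numbers are kept.
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    conn.commit()
    return len(rows)

# Demo: an in-memory archive with made-up data stands in for a real download.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("listclas.csv", "item_number,sudoc_stem,title\n0001,A 1.1:,Annual Report\n")
buffer.seek(0)

conn = sqlite3.connect(":memory:")
count = load_zipped_csv(buffer, "listclas.csv", conn, "list_of_classes")
print(count)
```

With a real download, the path of the saved zip file would be passed in place of the in-memory buffer.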




·        Communications:  “Contact us” from the Tools page.  Users may also contact us from “Contact” on the menu bar.


·        Reports:  “Additions to and deletions from the List of Classes since 6-1-2002” [6/1/2002-6/1/2003] is available for download. 


·        Agency/Sub-agency List:  Available in Internet Explorer only.  Provides entrée to the sub-agencies by class.  Clicking on class takes the user to the complete List of Classes entries for that sub-agency.




Only official GPO data from the Federal Bulletin Board files is used in DDM3.  DDM3 files are updated as soon as possible after the GPO files are updated at the FBB.  Generally GPO posts new files monthly, on the first Friday of each month, for the List of Classes [listclas], Inactive or Discontinued List [inactlst], Library Directory [profiles], and Item Lister’s Profiles [unionl].  Shipping lists are posted more frequently and are updated several times weekly in DDM3.  All modules display the date of latest refreshing at the bottom of the page, except the Shipping List module, which displays the date at the top.  (The original DDM reflects the latest update on the frame of its homepage.)




Documents Data Miner was originally designed as a collection development tool.  DDM3 retains all the union listing capabilities of DDM.  Selecting TOOLS and then SESSION CONFIGURATION takes a user to the screen where they enter their depository number and filter their depository profile by state, by region, or by specific distance (radius) from themselves.  The default is the state of the Home depository, so if a depository number is entered, the union list is filtered by that state whether a filter is selected or not.


The Union List feature is a powerful tool for collection development, whether building or down-sizing.  The user can look at data in their own profile and then click on the Item Number in the display to determine which other depositories in a certain geographic area also select specific items.
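The distance (radius) filter can be pictured with a short Python sketch.  How DDM3 actually computes distance is not documented here; the sketch assumes a great-circle (haversine) calculation on each depository's latitude and longitude, and the depository labels and coordinates are invented for illustration.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, using the haversine formula."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))

def within_radius(home, others, radius_miles):
    """Return the depositories in `others` that lie within `radius_miles` of `home`."""
    return [
        d for d in others
        if distance_miles(home["lat"], home["lon"], d["lat"], d["lon"]) <= radius_miles
    ]

# Hypothetical depositories; labels and coordinates are made up.
home = {"number": "HOME", "lat": 37.72, "lon": -97.29}
others = [
    {"number": "NEAR", "lat": 38.00, "lon": -97.30},
    {"number": "FAR", "lat": 41.00, "lon": -96.00},
]

print([d["number"] for d in within_radius(home, others, 50)])  # ['NEAR']
```

Only depositories that pass this kind of test would appear when the user clicks an Item Number with a distance filter set.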




Perhaps a library wants to see what they DO NOT select, or what has been dropped from their profile.  This is particularly useful during the annual update cycle in June and July.  By going to the STATUS feature in their depository profile screen, the user may query for five options:


·        Active

·        Inactive

·        Unselected

·        Active + Dropped

·        Active + Dropped + Unselected




Four modules in DDM3 originally appeared in Documents Data Miner, which is still available.  Nothing has changed in the use of these modules.  These are:





·        TOOLS




DDM3 offers six additional modules:




·        SHELF LISTS


·        URL LOCATOR

·        CATALOG








From the LIST OF CLASSES, a user can:

  • Search the current LIST OF CLASSES by field,
  • Search the INACTIVE/DISCONTINUED LIST by field,
  • Or, merge the searches by choosing “all” at the “Status” box.

The search grid offers the following options:

  • Agency:  Search “all” or use the pop-up box for a list of agencies with the sum of active item number stems for each agency.
  • Item Number:  Enter a full item number, which requires exact spacing and punctuation, or enter a partial item number.
  • SuDoc Stem:  Enter a complete or partial SuDoc stem.  The search is on a string, so no truncation symbol is required.   Example:  Search for C, C 1, or C 1.54.  Spacing and punctuation must be exact, but the ending colon is not required.  Wildcard searches are possible.  Example:  D%23%.
  • Title:  Enter an exact title, or words from a title.  Automatic left/right truncation is built in.
  • Format:  A drop-down box allows the following choices – Any Format, Paper, Microfiche, CD-ROM disks, Electronic, and Electronic Library.  Electronic Library (EL) refers to a title which is available online.  These are the formats supplied by GPO.  We added “Unknown” since over 1,000 active records have no format data supplied.
  • Status:  Search for either active, inactive/discontinued, or all.
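The % wildcard in the examples above behaves like a "match any run of characters" symbol.  A minimal Python sketch of that idea follows.  The function name like_match and the sample SuDoc stems are hypothetical, and the sketch matches the whole value from its first character; how DDM3 combines the wildcard with its automatic truncation is not specified here.

```python
import re

def like_match(pattern, value):
    """Return True if `value` matches `pattern`, where % stands for any run of characters."""
    parts = (re.escape(piece) for piece in pattern.split("%"))
    regex = ".*".join(parts)
    return re.fullmatch(regex, value, flags=re.IGNORECASE | re.DOTALL) is not None

# Made-up SuDoc stems for illustration only.
print(like_match("C 3%", "C 3.134:"))    # True: starts with "C 3"
print(like_match("C 3%", "C 13.10:"))    # False: "C 1..." does not start with "C 3"
print(like_match("D%23%", "D 101.23:"))  # True: "D", then anything, then "23"
```

Because spacing and punctuation are matched literally, "C3%" would not match "C 3.134:" in this sketch.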


Results Screens (or Query Returns):  The results screen (called the “Complete Class List”) will state what you requested and offer a list of results.  The array in the LIST OF CLASSES is always in SuDoc order.


Additional features:

  • Click on SuDoc stem for a list of all SuDoc stems assigned to that item number. As part of our de-selection process, we have to determine what other SuDoc stems we will lose if we de-select an Item Number.
  • Click on Item Number for the Union List feature.  If you do not have Union List parameters set at this point, DDM3 will take you to the TOOLS page to set those up.
  • Status of a record – what is “Inactive” and what is “Undefined”?  The LIST OF CLASSES and the INACTIVE/DISCONTINUED LIST are maintained as two separate databases by the GPO.  There are times when an item number falls off the LOC and yet has not been added to the INACTIVE LIST.  Data Miner automatically tags these records as if they had been discontinued.  But to maintain referential integrity, the display in DDM3 at the GPO Status box reflects either:

·        “Inact” - at GPO Status: This item number appears in GPO’s INACTIVE AND DISCONTINUED ITEMS text file edition from the FBB (Nov. 7, 2001).

·        “Undef.” - at GPO Status: Is neither in the GPO’s INACTIVE/ DISCONTINUED LIST nor in the LIST OF CLASSES.

  • The “And” Function: Documents Data Miner 3 allows the user to “build” selections.  To narrow a search, fill in more than one field.  There is no limit to the number of fields you may combine.
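The GPO Status rule described in this list (a record is “Inact” when its item number appears in the Inactive/Discontinued List, and “Undef.” when it appears in neither list) amounts to a simple membership test.  The Python sketch below is only an illustration: the function name, the sample item numbers, and the “Active” label for items still on the List of Classes are assumptions, not DDM3's actual code or labels.

```python
def gpo_status(item_number, list_of_classes, inactive_list):
    """Classify an item number by which GPO list, if any, it appears in."""
    if item_number in list_of_classes:
        return "Active"   # assumed label for items still on the List of Classes
    if item_number in inactive_list:
        return "Inact"    # appears in GPO's Inactive and Discontinued Items file
    return "Undef."       # dropped from the List of Classes, not yet on the inactive list

# Made-up item numbers for illustration only.
list_of_classes = {"0001", "0002"}
inactive_list = {"0003"}

for item in ("0001", "0003", "0004"):
    print(item, gpo_status(item, list_of_classes, inactive_list))
```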





Since this data can be searched from the LIST OF CLASSES module, this module is not used as often as it was when first developed.  It does offer a search limited to data about inactive or discontinued items.  From this screen, you may search by Item Number, SuDoc Stem, or Title.

This module does provide some added value:

  • Inactive Date – Available if the item was made inactive after we began loading files in DDM in October 1997.
  • Notes Field – The annotations for the NOTES were mined from the BDLD in 1997, where they were manually entered from Shipping Lists, Technical Supplements Additions & Changes, and other LPS sources.   The Inactive/Discontinued List data was originally mined from the BDLD prior to a data file being available at the FBB.  Official GPO data has now overlaid the initial data from the BDLD; however, the Notes were retained.




This module merges profile data with List of Classes fields, creating the Union List function.  It also contains the depository directory information and e-mail functions.


Use this point of entry:

  • To search any depository profile
  • To obtain complete depository directory information for all depositories
  • To e-mail another depository
  • To click on URLs for depository homepages

The search parameters at “Depository Selection” are designed to allow varied searches:

  • Enter the depository number
  • Enter an institution or library name — or a partial name,
  • Search for depositories in a certain city,
  • Request depositories for an entire state,
  • Or, search by type of library, such as Community College Libraries, Academic Law Libraries, or State Libraries.  There is a pop-up table for states and types of libraries.


Search Demonstrated:  Search for a depository number, such as 0204A for Wichita State University.  Then, click on submit.  This presents a screen that allows four functions:


1.  Click on the E-Mail Address to send a message to the depository librarian at that institution.  (Since this data is supplied to the GPO by individual depositories, you may occasionally encounter an outdated e-mail or a blank box if a library has not kept the GPO informed of up-to-date information.  Corrections should be sent to the GPO.)

2.  Click on the Home URL to go to that depository’s homepage. 

3.  Click on the Depository Number to search the profile.  This brings up a search grid for that library’s Item Lister profile of depository selections. 

4.  Click on the Depository Name for directory information.


Directory Information:  All the fields available in the GPO database are displayed: names, addresses, phone numbers, e-mail and URL addresses, depository type, library type and size, designation code, year designated as a depository, and congressional district.  We have added:

  • Date of Last Update
  • Selected GPO Item Count
  • Selected GPO Percent
  • Longitude and Latitude
  • DDM Inactive Item Count*


*DDM Inactive Item Count:  This is our tracking mechanism for how many inactive items we have tagged for your depository since July 1998.  Items become “inactive” either when a depository de-selects them or when the GPO removes them from profiles because they have become inactive/discontinued.


Depository Profile Search:  A profile may be searched by:

  • Agency:  Search any agency or use the pop-up box for a list of agencies with the sum of active item numbers for that depository library.
  • Item Number:  Search as you would the LIST OF CLASSES module.
  • SuDoc Stem:  Search as you would the LIST OF CLASSES module.
  • Title:  Search as you would the LIST OF CLASSES module.
  • Formats:  Search as you would the LIST OF CLASSES module.
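  • Status:  See the five profile status options listed earlier in this guide (Active, Inactive, Unselected, Active + Dropped, and Active + Dropped + Unselected).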


Shelf List Feature in the Depository Profile Search:  The query return screen of item data in a profile includes a hotlink to the Shelf List module.  Clicking on Shelf List provides a list of all pieces from that Item Number/SuDoc Number which appeared on shipping lists from the GPO from 1997 to present.

Shelf List records provide hotlinks to the shipping lists on which the pieces appear.






This module represents a searchable publication of the GPO – the 2002 SUPERSEDED LIST: U.S. Documents That May Be Discarded By Depository Libraries, Annotated for Retention by Regional Depositories.  It is searchable by:


  • Agency Name:  Full or partial searches are possible, as are truncated and wild card searches using %.  For example, %aviation% brings up the FAA and the Aviation Medicine Office.
  • Item Number:  Full or partial searches are possible, as are truncated and wild card searches using %.  For example, 00%% brings up all item numbers beginning with the double 00.
  • SuDoc Number:  The SuDoc entry requires exact spacing and punctuation.  However, truncated and wild card searches are also possible.
  • Title:  Exact title or words in a title.  Use % for wild card searches.


The query return provides:  The agency name, SuDoc, Item Number, Title, Instructions, Regional Note, and a filter against a depository profile.





Searchable Shipping Lists:  DDM3 offers the only searchable depository shipping list utility available to the GPO and depository libraries.  Shipping lists may be searched by:

·        Shipping List Number

·        Title

·        Fiscal Year and Month

·        Shipping Year and Month

·        Item Number

·        SuDoc Number

·        Category:  All or filter for Paper, Microfiche, Electronic, Separates.

·        Depository Filter:  This filter eliminates shipping lists with item numbers not selected by the depository.


PDF Links:  Shipping lists for FY2001 - FY2004 are hotlinked to the pdf versions of the official lists from the FDLP Desktop.  PDF versions are created for paper, electronic, and separates lists.  No pdf versions are available for microfiche lists.


MARC Records:  The Shipping List module is linked to the MARC LOCATOR module.  Retrieval of an individual shipping list will attach MARC records to the Title, SuDoc and Item Number data.  MARC records can be viewed and downloaded into a library’s catalog or saved to a disk.  Shipping lists offer:

·        Individual MARC record download, or

·        Bulk download of either all monograph records or all serial records affiliated with a specific shipping list.


The Shipping List module currently warehouses all shipping lists available at the Federal Bulletin Board (8166 GPO shipping lists as of 10-17-03).





  • Warehouses all MARC records created by the GPO Cataloging Division from monthly files posted at the Federal Bulletin Board, a practice that began in December 1998.  In the fall of 2002, all GPO MARC records from 1990 through 1998 were added to the MARC LOCATOR.  As of 10-17-03, 206,924 MARC records are available from DDM3, dating from 1990 to the present.


  • Records are searchable using:

·        OCLC number

·        Item or SuDoc numbers

·        Agency (from 1xx fields)

·        Title

·        Title Key Words

·        Subject (from 6xx fields)

·        Formats


  • Query return provides title, item number, SuDoc number, hotlinked PURLs, OCLC number, access to the MARC view of the OCLC record, the GPO timestamp, and the option to download the record.  If the search is done on “agency,” the agency name also appears.




  • A subset of the GPO MARC Record Locator.

·        Restricted to records with the 856 field for hotlinking to Web resources.

·        Warehouses 38,565 records with PURLs as of 10-17-03.

·        Searchable in the same multiple fields as the MARC Locator records.

·        Query return provides the same data as the MARC Locator records.




  • Ties the individual pieces on the shipping lists to the MARC records and offers the only existing automated shelf-listing of multi-part titles and the general publications classes of the SuDoc class system.
  • Currently holds data elements for 154,100 individually shipped pieces.


The Shelf List module is also linked to the Depository Profiles.  Searches in a specific depository profile produce a query return screen with a column for “Shelf List.”  Clicking on Shelf List provides a list of all pieces from that Item Number/SuDoc Number which appeared on shipping lists from the GPO from 1997 to present.  Shelf List records provide hotlinks to the shipping lists on which the pieces appear.




The DDM3 Catalog is designed as a public access catalog to GPO MARC records, offering both PUBLIC and MARC (staff) views of the records.  This module is still under development. 


  • The Catalog is designed to serve as an individual library’s catalog.  The Depository Number and filter for profile are added by setting up a Session Configuration from TOOLS.


  • Query returns may be arrayed in three different ways by clicking on

·        Year

·        Title

·        Call Number


  • From the index of the query returns, patrons may View each record.  The Public View includes:
    • Title
    • Author
    • Publication
    • Description
    • Subject Headings
    • Hotlinks from PURLs
    • Call Number
    • OCLC Number
    • MARC Revision Date – Date record last updated by GPO Cataloging.


  • The staff view is of the MARC record.  It also includes header data:
    • OCLC number
    • Whether the record is for a monograph or serial
    • MARC Revision Date – Date record last updated by GPO Cataloging.
    • DDM Revision Date – Date loaded into DDM3.
    • The word “Leader” is hotlinked to the Leader explanation in MARC 21 Format.


  • Subject headings can be cut and pasted into a box at the bottom of the record.  Clicking on “Search” provides an index of all records with the same subject heading.






For additional information or feedback, please contact:


Nan Myers

Associate Professor and Government Documents Librarian

Wichita State University, Wichita KS 67260-0068                     

(316) 978-5130 or 1-800-572-8368


October 17, 2003