Difference between revisions of "DMR Post Upgrade 2013"

From Ball State University Libraries Wiki

Revision as of 10:31, 21 June 2013

With the successful implementation of CONTENTdm 6.3 along with a handful of major customizations, the Digital Media Repository should now be in a fairly sustainable state of review and polish.

This wiki article will serve as a single place to list known bugs, keep track of customizations (and their removal if deemed unnecessary), and manage other projects related to the DMR at a technical level.

Nomenclature and Commenting

There are currently three servers running CONTENTdm in the library (soon to be two). To prevent confusion, these will be referred to in this document as follows:

Common Name | System | URL | Description
DMR Server | LIBX | http://libx.bsu.edu/ | Our live DMR server (now running CONTENTdm 6.3)
BoT Server | LIBCDM2 | http://libcdm2.bsu.edu/ | Board of Trustees Minutes Repository (this will soon be folded back into the DMR)
Test Server | LIBCDMTEST | http://libcdmtest.dhcp.bsu.edu/ | Test server for the newest version of CONTENTdm (available on-campus only)

We also have the OLD version of LIBX in offline storage so that we can access it if necessary.

To help make tracking down customizations in the code easier, all customizations shall be paired with a comment (PHP, JavaScript, or HTML style) in the format of: BSU initials mm/dd/yy Brief Description.
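As an illustration of how this comment convention makes customizations easy to locate, here is a minimal Python sketch that scans text for the marker. The regular expression is an assumption about how the format is written in practice (2 to 4 letter initials, two-digit date fields), and the initials "jd" and both sample comments are hypothetical.

```python
import re

# Assumed pattern for "BSU initials mm/dd/yy Brief Description".
# The 2-4 letter initials and strict two-digit date fields are assumptions.
MARKER = re.compile(r"BSU\s+([A-Za-z]{2,4})\s+(\d{2}/\d{2}/\d{2})\s+(.+)")


def find_customizations(text):
    """Return (line_number, initials, date, description) for each marker found."""
    results = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = MARKER.search(line)
        if match:
            initials, date, description = match.groups()
            # Strip a trailing comment closer such as "-->" or "*/".
            description = re.sub(r"\s*(-->|\*/)\s*$", "", description).strip()
            results.append((number, initials, date, description))
    return results


# Hypothetical sample showing a PHP/JavaScript-style and an HTML-style comment.
sample = """\
<?php
// BSU jd 06/21/13 Hypothetical example: hide empty facet list
echo $facets;
?>
<!-- BSU jd 06/21/13 Hypothetical example: add library footer -->
"""

for hit in find_customizations(sample):
    print(hit)
```

The same function could be run over each customized file to cross-check the list kept on the Code Changes page.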

For future reference, we will continue to use the DMR Upgrade Project 2013 Code Changes page to track any and all customizations.


"Index Bug" Timeline

  • Shortly after the upgrade, MADI noticed a handful of collections (especially the Board of Trustees collection) had incorrect item counts. Some of the discrepancies numbered below 10, while others were over 1000. Regular indexing, deleting and re-adding, copying from the old server, and several other attempts to fix the issue didn't work.
    • Copying these collections to the old server and indexing them worked correctly, meaning that the collections themselves were not broken.
  • OCLC was contacted about the item count discrepancies. OCLC suggested that we delete the affected collections, run an indexALL command, then re-add them manually.
  • The fix OCLC recommended was done around 5/24/13 on the five collections identified as having incorrect item counts: BSUBoT, DCFIFB, BSUDlyNws, BSUCmncPrg, GibEdArch.
  • Once this fix was applied these collections (as well as those collections added to the system AFTER the upgrade) started working correctly.
  • Not long after that, we discovered more problems that didn't present as item count discrepancies. Instead, when items were added, edited, or deleted, the collection would sometimes become completely unsearchable, produce duplicate items, or fail with any number of other strange errors. Individual items in the collection still worked, but the collections themselves were essentially unavailable. Also, users couldn't connect to all collections using the Project Client, as it would produce an error.
  • Work was halted on all collections except those previously fixed using the steps from OCLC.
  • The fix OCLC originally recommended was applied to ALL collections on 6/14-6/16/13. (This did not include 5 collections with aliases too long to re-add manually. OCLC is looking into this.)
  • All collections (except for the 5 with long aliases) were deleted using the web interface. The indexALL was then performed. After it completed, all collection folders were renamed to remove the ".delete" portion added to them when deleted, then all collections were added back into the system, one by one, using the web interface.
  • One final indexALL was started and finished relatively quickly with no error.
  • Testing the server afterward produced a NEW bug where the Monitor and Search services were not being installed, uninstalled, or located properly, making the entire system unusable.
  • OCLC was contacted on the morning of 6/17/13 and given all the details about everything that had happened. They planned to meet with their developers on 6/18 at 4pm and get back to us on the morning of 6/19 to talk about any possible solutions.
  • MADI was advised to continue avoiding work on the DMR during this time. All changes were, fortunately, made to an offline copy of the DMR server. During this entire ordeal, the DMR has been available to end-users.
  • OCLC called us back on 6/19. After some confusion (our issue had not been brought up at their developer meeting), OCLC contacted the developers directly to sort out the issue. They increased the priority of the problem at this time.
  • We called OCLC at 4pm on 6/19, but there was still no news. Later that day OCLC informed us the developers had created a plan and wanted us to put it into motion on 6/20.
  • On the morning of 6/20 we were instructed to manually delete the broken Windows services, run a standard CDM service install, run a standard CDM stop, and then run a new indexALL.
  • This indexALL took about 3 hours to complete.
  • We were instructed to run a standard CDM start. Everything seemed to be working but there were 3 incorrect Windows services still hanging around. We manually disabled those and restarted the server.
  • Everything seemed to be working correctly after some tests, and MADI was instructed to begin work again on the morning of 6/21 (after LITS "flipped the switch" to swap offline/online servers).
  • OCLC is going to continue looking into the long alias problem mentioned above.
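
The folder-renaming step in the timeline above (removing the ".delete" portion from each deleted collection folder) is tedious to do by hand for every collection. The following is a minimal Python sketch of how it could be scripted, not the procedure that was actually used. It assumes the ".delete" text appears once somewhere in the folder name; the exact naming scheme, the function names, and the demo folder layout are all hypothetical.

```python
import tempfile
from pathlib import Path


def restored_name(folder_name, marker=".delete"):
    """Return the name with the first occurrence of marker removed, or None if absent."""
    if marker not in folder_name:
        return None
    return folder_name.replace(marker, "", 1)


def restore_collections(root, dry_run=True):
    """Rename folders under root whose names contain ".delete".

    Returns a list of (old_name, new_name) pairs. With dry_run=True
    nothing is renamed, so the plan can be reviewed first.
    """
    renamed = []
    for folder in sorted(Path(root).iterdir()):
        if not folder.is_dir():
            continue
        new_name = restored_name(folder.name)
        if new_name is None:
            continue
        target = folder.with_name(new_name)
        if target.exists():
            # Never overwrite an existing collection folder.
            continue
        if not dry_run:
            folder.rename(target)
        renamed.append((folder.name, new_name))
    return renamed


# Demo against a throwaway directory (hypothetical layout, not the real server path).
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "BSUBoT.delete").mkdir()
    (Path(tmp) / "DCFIFB").mkdir()
    print(restore_collections(tmp, dry_run=True))   # review the plan only
    print(restore_collections(tmp, dry_run=False))  # perform the rename
```

Defaulting to a dry run is a deliberate choice here: on a production copy it is safer to inspect the planned renames before touching any collection folder.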