Playing around with library software

What a way to spend your vacation! I have the uncomfortable feeling that I have lent books to friends and that they are... well, not lost, but... somewhere. To clear up this confusion, I started looking at library cataloging software.

This led me to document library software on the ever-growing Koumbit Wiki. My first reflex was to look for Drupal modules, but that proved quite complicated. Basically, you would need to mash up a few simple modules: open library (or bookpost? these should really be merged, by the way) and the good library module would give you a simple library. But the workflow would suffer: I can't imagine typing in all those books by hand when we have barcode scanners and the like. At the other end of the spectrum, you have monsters like the eXtensible Catalog toolkit and the Millennium OPAC integration project, which connect with real library software sites (through scary acronyms like OPAC and MARC). Honestly, I just gave up: I want software that actually just works, out of the box, without me configuring anything.

Because I felt this could eventually be used by Koumbit (which is growing its own library), I insisted on finding a web-based solution, since that would let people update the database in a distributed fashion. Unfortunately, the other software I could find either seemed aimed at huge library sites or wasn't feature-complete (how do I plug this barcode scanner into that website?!). In general, the problem is that most library software is aimed at actual libraries that will go through the trouble of filling in all the information themselves, because they consider themselves canonical sources of information (like Amazon or the Open Library are). I'm not. I'm a consumer: I want to enter an ISBN code or a title and just get the cover page, the author, the date and all that stuff. I didn't find any web-based software that could do that easily.
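
To make that "consumer" workflow concrete, here is a minimal sketch of an ISBN lookup. It assumes the Open Library Books API (the `bibkeys`, `format` and `jscmd=data` parameters) and a response layout with `title`, `authors`, `publish_date` and `cover` keys; the function names are made up for illustration and none of this comes from the software reviewed here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Open Library Books API.
API = "https://openlibrary.org/api/books"


def lookup_url(isbn):
    """Build the query URL for a single ISBN."""
    query = urlencode({"bibkeys": "ISBN:" + isbn, "format": "json", "jscmd": "data"})
    return API + "?" + query


def summarize(record):
    """Keep only the fields a casual user cares about.

    The key names below are assumptions about the response layout;
    missing keys simply yield None or an empty list.
    """
    return {
        "title": record.get("title"),
        "authors": [author.get("name") for author in record.get("authors", [])],
        "date": record.get("publish_date"),
        "cover": record.get("cover", {}).get("medium"),
    }


def fetch_book(isbn):
    """Query the network and return a summary, or None if nothing matched."""
    with urlopen(lookup_url(isbn)) as response:
        data = json.load(response)
    record = data.get("ISBN:" + isbn)
    return summarize(record) if record else None
```

Calling `fetch_book()` with an ISBN needs network access; `summarize()` can be tried offline on any dictionary with the same shape.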

In the end, I think I will settle for the excellent Tellico Project, which supports barcode scanning (even with a simple webcam!), fetching covers and everything else from the network, and even cataloging things other than your book collection (wines, CDs, MP3 collection, etc.). Pretty amazing stuff. The only problem is that it sits on your desktop, so it's not networked at all (I can use it on only one workstation, which is a problem in my setup). Backups, then, are key.

Unfortunately, during my tests of Tellico 2.2 in Debian Squeeze, Tellico didn't live up to its documented features: I couldn't scan a barcode with my webcam, and even when I entered the code manually, no source could give me the metadata I was looking for. It seems that this version is not sufficient for my needs for now.

GCstar therefore takes the gold in this review because it works well and seems simple enough. It does have some workflow problems and no scanning support, so I am not likely to use it much. I am still considering scanning my whole library with it anyway, especially since I could get a USB barcode scanner that just behaves as a keyboard and could therefore interoperate with GCstar... The problem is that such a scanner costs $50-100, not a small price for an arguably obsolete device I would use just once...
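
Since such a scanner just "types" the code followed by Enter, reading it is no different from reading lines of text. Here is a rough sketch of that idea, assuming the scanner emits one ISBN-13 per line; the function names are hypothetical, and the check is the standard ISBN-13 checksum (alternating weights of 1 and 3, sum divisible by 10).

```python
import sys


def is_valid_isbn13(code):
    """Return True if code is 13 ASCII digits with a correct check digit."""
    if len(code) != 13 or any(c not in "0123456789" for c in code):
        return False
    # Digits at even positions weigh 1, digits at odd positions weigh 3.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(code))
    return total % 10 == 0


def read_scanned_isbns(stream):
    """Yield valid ISBN-13 codes from a line-oriented stream.

    A keyboard-emulating scanner would feed this through sys.stdin;
    hyphens are dropped so hand-typed codes work too.
    """
    for line in stream:
        code = line.strip().replace("-", "")
        if not code:
            continue
        if is_valid_isbn13(code):
            yield code
        else:
            print("skipping invalid code: %s" % code, file=sys.stderr)
```

Looping over `read_scanned_isbns(sys.stdin)` in a terminal would then collect the codes as they are scanned, rejecting misreads before they reach the catalog.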

To see the full list of evaluated software and the analysis, head over to this wiki page and, of course, contribute if you find something new!

Comments

#1 anarcat : Zotero

One piece of software I missed back then (maybe because it wasn't as well developed) is Zotero, which covers a lot of use cases and seems to be all the rage right now.

It can also double as a bookmarks manager, which is a huge plus for me.


#2 anarcat : I actually converted my

I actually converted my GCstar library to Zotero using this script:

#!/usr/bin/python

# first, convert your library to a bibtex export
#
# this can be done by:

# 1. converting to tellico (or importing straight from tellico)
# 2. convert your tellico library to a bibliography
# 3. export to a bibtex file
# 4. import the bibtex file into Zotero

# the "accessDate" field will be empty - this is what this script
# tries to fix

from __future__ import print_function
import time
import xml.etree.ElementTree as ET
import sqlite3
import sys
import glob

# glob.glob() returns a list of matches; sqlite3.connect() needs a single path
conn = sqlite3.connect(glob.glob('/home/anarcat/.zotero/zotero/*.default/zotero/zotero.sqlite')[0])
sql = conn.cursor()

sqlcnt = 0
sqlentries = []
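# collect the titles of the books already in Zotero (itemTypeID 2 and
# fieldID 110 are assumed to mean "book" and "title" in this schema)
for row in sql.execute('SELECT * FROM items i INNER JOIN itemData id ON id.itemID = i.itemID INNER JOIN itemDataValues idv ON idv.valueID = id.valueID WHERE i.itemTypeID = 2 AND (fieldID=110);'):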
    #print row
    sqlentries.append(row[-1])
    sqlcnt += 1
sqlentries.sort()

tree = ET.parse('biblio.gcs')
root = tree.getroot()

biblioentries = []
bibliodates = []
bibliocnt = 0
for item in root:
    try:
        biblioentries.append(item.attrib['title'])
        #print item.attrib['title'], time.strftime('%Y/%m/%d', time.strptime(item.attrib['added'], '%d/%m/%Y'))
        bibliodates.append((item.attrib['title'], time.strftime('%Y-%m-%d', time.strptime(item.attrib['added'], '%d/%m/%Y'))))
        bibliocnt += 1
    except ValueError:
        #print ''
        bibliocnt += 1
    except KeyError:
        pass
biblioentries.sort()

#print bibliodates
if bibliocnt == sqlcnt:
    print("count is good", file=sys.stderr)
else:
    print("count differs, sqlcnt: %d, bibliocnt: %d" % (sqlcnt, bibliocnt), file=sys.stderr)

for entry in bibliodates:
    # XXX: inefficient, O(n^2)
    if entry[0] in sqlentries:
        print(('INSERT INTO itemDataValues (value) VALUES ("%s");' % entry[1]).encode('utf8'))
        print(('INSERT INTO itemData (itemID, fieldID, valueID) VALUES ((SELECT i.itemID FROM items i INNER JOIN itemData id ON id.itemID = i.itemID INNER JOIN itemDataValues idv ON idv.valueID = id.valueID WHERE i.itemTypeID=2 AND id.fieldID=110 AND idv.value="%s"), 27, LAST_INSERT_ROWID());' % entry[0]).encode('utf8'))

        try:
            sql.execute('INSERT INTO itemDataValues (value) VALUES (?);', (entry[1],))
            valueID = 'LAST_INSERT_ROWID()'
        except sqlite3.IntegrityError: # column value is not unique
            sql.execute('SELECT valueID FROM itemDataValues WHERE value = ?', (entry[1],))
            valueID = sql.fetchone()[0]
        try:
            sql.execute('INSERT INTO itemData (itemID, fieldID, valueID) VALUES ((SELECT i.itemID FROM items i INNER JOIN itemData id ON id.itemID = i.itemID INNER JOIN itemDataValues idv ON idv.valueID = id.valueID WHERE i.itemTypeID=2 AND id.fieldID=110 AND idv.value=?), 27, %s);' % valueID, (entry[0],))
        except sqlite3.IntegrityError:
            print((u"book %s already had a date field" % entry[0]).encode('utf-8'), file=sys.stderr)
            # cleanup the duplicate, if it was ours
            try:
                sql.execute('DELETE FROM itemDataValues WHERE valueID=LAST_INSERT_ROWID();')
            except sqlite3.IntegrityError:
                pass
    else:
        print((u"biblio entry %s missing from sql" % entry[0]).encode('utf-8'), file=sys.stderr)

conn.commit()
conn.close()

# XXX: inefficient, O(n^2)
for sqlentry in sqlentries:
    if sqlentry not in biblioentries:
        print((u"sql entry %s missing from biblio" % sqlentry).encode('utf-8'), file=sys.stderr)
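
The two "XXX: inefficient" loops above test membership in plain lists, which makes each lookup linear. A possible way around that, sketched here in Python 3 with a hypothetical helper name and not part of the original script, is to build sets once and use set difference:

```python
def compare_titles(biblio_titles, sql_titles):
    """Return (missing_from_sql, missing_from_biblio) as sorted lists.

    Building the sets is linear and each membership test is constant time
    on average, so the comparison no longer grows as O(n^2).
    """
    biblio = set(biblio_titles)
    sql = set(sql_titles)
    return sorted(biblio - sql), sorted(sql - biblio)
```

In the script, `biblioentries` and `sqlentries` would be the two arguments. Note that sets collapse duplicate titles, which the list-based version also cannot tell apart when matching.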