So after a nearly four-month hiatus from even bothering to look at my coding, I sat down, cleaned up the parsing code I wrote way back when, and got some subroutines going for it. Nothing too fancy, but it should discriminate satisfactorily between RIS- and BibTeX-formatted library exports (based upon file extension). I’ll probably throw in a screener to check the first line or two of the input file just to be on the safe side, but running on extension alone should be fine for now. Click on through for the code!
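For what it's worth, the first-line screener mentioned above could look something like the sketch below. It is only a rough idea, not part of the script as posted: `sniff_format` and its five-line limit are made-up names and numbers, and it assumes RIS records use the usual `TY  - JOUR` tag layout while BibTeX entries open with something like `@article{`.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): guess the citation
# format from the first few non-blank lines rather than the extension.
sub sniff_format {
    my ($path, $max_lines) = @_;
    $max_lines ||= 5;    # assumed limit; tune to taste
    open(my $fh, '<', $path) or die "Could not open file $path: $!\n";
    my $format = 'unknown';
    my $seen   = 0;
    while (my $line = <$fh>) {
        next unless $line =~ /\S/;    # skip blank lines
        # RIS: two-character tag, two spaces, then a hyphen
        if ($line =~ /^[A-Z][A-Z0-9]\s\s-/) { $format = 'ris'; last }
        # BibTeX: entries begin with @type{ or @type(
        if ($line =~ /^\s*\@\w+\s*[{(]/) { $format = 'bibtex'; last }
        last if ++$seen >= $max_lines;
    }
    close($fh);
    return $format;
}

# Print the guess when a file name is given on the command line
print sniff_format($ARGV[0]), "\n" if @ARGV;
```

The return value could then be cross-checked against the extension, falling back to the sniffed format when the two disagree.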

As I’ve already got the parser and uploader mashed together¹, this piece won’t be hard to integrate. Now it’s just time to write another subroutine or two for the other more common citation library formats, and this script will be ready to rock²!

Added perk: I totally just got my <pre> tags to wrap properly within my frames! Woohoo!
<perma-embed in CSS-file />


¹ I haven’t put that code up yet, as I got distracted by life in the interim. But it’s totally functional. Fo’realz!
² Well, on Mac OS X and/or Unix systems at least. I need to verify and/or troubleshoot the script on Windows-based platforms, as I’m unsure how well uri_escape and File::Spec respect the differences between operating systems.
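On the cross-platform worry in that second footnote: one option I haven't tried in the script itself is to let the URI module (the same distribution that ships URI::Escape) convert the `file://` URL into a path, since its `file` method handles the percent-decoding and takes an optional operating system name. A minimal sketch, where `url_to_path` is a made-up helper name and the example URL is invented:

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use URI;

# Hypothetical helper (not in the original script): turn a file:// URL,
# as found in an L1 or local-url field, into a local file path.
sub url_to_path {
    my ($url, $os) = @_;
    my $uri = URI->new($url);
    # Only file: URLs map to local paths; anything else yields undef
    return unless defined $uri->scheme && $uri->scheme eq 'file';
    # file() unescapes the path and applies the naming rules of $os,
    # defaulting to the platform the script is running on
    return defined $os ? $uri->file($os) : $uri->file;
}

# Invented example URL, converted with Unix rules
print url_to_path('file://localhost/Users/me/My%20Paper.pdf', 'unix'), "\n";
```

Passing `'win32'` instead of `'unix'` asks for Windows-style names, which would be one way to test the conversion without leaving the Mac. I haven't verified how it treats drive letters, so consider that part an assumption.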

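#!/usr/bin/env perl
use strict;
# "use warnings" replaces the -w switch, which env does not pass through reliably on every system
use warnings;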
use URI::Escape;
use File::Spec;

# Three-argument open, reporting the OS error on failure
open(INFILE, '<', $ARGV[0]) || die("Could not open file $ARGV[0]: $!\n");
my %paths = ();
# Crude discriminator of file extension for citation library format
if (substr($ARGV[0], -4) eq ".ris") { %paths = ris() }
if (substr($ARGV[0], -4) eq ".bib") { %paths = bibtex() }
close(INFILE);
# Just for troubleshooting purposes, and to demonstrate proper return of the subroutine data
foreach my $idnum (keys(%paths)) {
    print STDOUT ("ID: $idnum\tPath: $paths{$idnum}\n");
}

sub ris {
    my ($id, %lists);
    # Read the library export line by line from the already-open handle
    while (<INFILE>) {
        chomp;
        # Obtain article ID from library
        if (/^ID\s\s/) { $id = substr($_, 6) }
        # Obtain file path for PDF (if available)
        if (/^L1\s\s/) {
            $_ = uri_unescape(substr($_, 6));
            s/file:\/\/localhost//;
            # Deposit ID and filename to a hash
            $lists{$id} = $_;
        }
    }
    # Return the hash to the primary routine for processing
    return %lists;
}

sub bibtex {
    my (@id, %lists);
    # Read the library export line by line from the already-open handle
    while (<INFILE>) {
        chomp;
        # Obtain article ID from library
        if (/pmid\s/) { @id = split(/[{},]/) }
        # Obtain file path for PDF (if available)
        if (/local-url\s/) {
            my @li = split(/[{}]/, $_);
            $_ = uri_unescape($li[1]);
            s/file:\/\/localhost//;
            # Deposit ID and filename to a hash
            $lists{$id[1]} = $_;
        }
    }
    # Return the hash to the primary routine for processing
    return %lists;
}
__END__
