[Lvlug] A bit of fun with this script

dann s washko dann at thelinuxlink.net
Wed Dec 7 18:59:48 EST 2005


I was adding some tables to a calendar database today to include the
sports schedules at work.  Normally I'd get printed copies of the
schedules and have to type the information in manually.  The athletic
director uses a program called Schedule Star and recently moved to the
online version, which publishes the schedules on the web.  We also offer
the schedules on our website, with door-to-door directions, and want to
keep them there.  Furthermore, by including them in the database we can
do more "fun" things down the line.

Schedule Star offers an RSS feed of the schedules, but only as a daily
or weekly feed.  I cannot get the entire schedule for a team, which is a
bummer, or I would pull the information out of the XML.

I'm sick of re-typing the information, so I found that if I save the
webpage and open it in OpenOffice 2.0 I can copy the table out into a
spreadsheet and export it to a CSV.  The file is mangled and looks like this:

***********************

Friday,03/31/2005,Home,,,,,,,,,,,,,,,,,,,,,,-- ,03:45:00 PM,-- ,-- ,--
,,,,,,,Catasauqua ,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Monday,04/03/2005,Home,,,,,,,,,,,,,,,,,,,,,,-- ,04:00:00 PM,-- ,-- ,--
,,,,Notre Dame GP ,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Wednesday,04/05/2005,Away ,,,,,,,,,,,,,,,,,,,,,,-- ,04:00:00 PM,-- ,-- ,--
,,,,,,,,,,,,Bangor,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Friday,04/07/2005,Away ,,,,,,,,,,,,,,,,,,,,,,-- ,04:00:00 PM,-- ,-- ,--
,,,,,,,,,,,,,Wilson,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Monday,04/10/2005,Home,,,,,,,,,,,,,,,,,,,,,,-- ,04:00:00 PM,-- ,-- ,--
,,,,,,,,,Palmerton ,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Wednesday,04/12/2005,Away ,,,,,,,,,,,,,,,,,,,,,,-- ,04:00:00 PM,-- ,-- ,--
,,,,,,,,,Palisades,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Monday,04/17/2005,Home,,,,,,,,,,,,,,,,,,,,,,-- ,11:00:00 AM,-- ,-- ,--
,,,,,,,,Quakertown ,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Wednesday,04/19/2005,Home,,,,,,,,,,,,,,,,,,,,,,-- ,07:00:00 PM,-- ,-- ,--
,,,,,Saucon Valley ,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Friday,04/21/2005,Home,,,,,,,,,,,,,,,,,,,,,,-- ,07:00:00 PM,-- ,-- ,--
,,,,,,,,,,,Salisbury ,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Monday,04/24/2005,Home,,,,,,,,,,,,,,,,,,,,,,-- ,07:00:00 PM,-- ,-- ,--
,,,,Northern Lehigh ,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Wednesday,04/26/2005,Away ,,,,,,,,,,,,,,,,,,,,,,-- ,04:00:00 PM,-- ,-- ,--
,,,,,,,Catasauqua,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Friday,04/28/2005,Away ,,,,,,,,,,,,,,,,,,,,,,-- ,04:00:00 PM,-- ,-- ,--
,,,,Notre Dame GP,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Monday,05/01/2005,Home,,,,,,,,,,,,,,,,,,,,,,-- ,07:00:00 PM,-- ,-- ,--
,,,,,,,,,,,,Bangor ,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Wednesday,05/03/2005,Away ,,,,,,,,,,,,,,,,,,,,,,-- ,04:00:00 PM,-- ,-- ,--
,,,,,,Northwestern,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Friday,05/05/2005,Home,,,,,,,,,,,,,,,,,,,,,,-- ,07:00:00 PM,-- ,-- ,--
,,,,,,,,,,Pen Argyl ,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Monday,05/08/2005,Home,,,,,,,,,,,,,,,,,,,,,,-- ,07:00:00 PM,-- ,-- ,--
,,,,,,,,,Palisades ,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Wednesday,05/10/2005,Away ,,,,,,,,,,,,,,,,,,,,,,-- ,04:00:00 PM,-- ,-- ,--
,,,,,Saucon Valley,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Friday,05/12/2005,Away ,,,,,,,,,,,,,,,,,,,,,,-- ,04:00:00 PM,-- ,-- ,--
,,,,,,,,,,,Salisbury,,,,,,,,,,,,,,,,,


***************************

My Perl skills are almost non-existent (I should have stuck with
learning it), so I used some standard text processing commands to
format the data the way I want it.  This is the output of my script:

***********************************

vbbaseb,2005-03-31,Catasauqua ,H,03:45:00,dis,dep,dir
vbbaseb,2005-04-03,Notre Dame GP ,H,04:00:00,dis,dep,dir
vbbaseb,2005-04-05,Bangor,A,04:00:00,dis,dep,dir
vbbaseb,2005-04-07,Wilson,A,04:00:00,dis,dep,dir
vbbaseb,2005-04-10,Palmerton ,H,04:00:00,dis,dep,dir
vbbaseb,2005-04-12,Palisades,A,04:00:00,dis,dep,dir
vbbaseb,2005-04-17,Quakertown ,H,11:00:00,dis,dep,dir
vbbaseb,2005-04-19,Saucon Valley ,H,07:00:00,dis,dep,dir
vbbaseb,2005-04-21,Salisbury ,H,07:00:00,dis,dep,dir
vbbaseb,2005-04-24,Northern Lehigh ,H,07:00:00,dis,dep,dir
vbbaseb,2005-04-26,Catasauqua,A,04:00:00,dis,dep,dir
vbbaseb,2005-04-28,Notre Dame GP,A,04:00:00,dis,dep,dir
vbbaseb,2005-05-01,Bangor ,H,07:00:00,dis,dep,dir
vbbaseb,2005-05-03,Northwestern,A,04:00:00,dis,dep,dir
vbbaseb,2005-05-05,Pen Argyl ,H,07:00:00,dis,dep,dir
vbbaseb,2005-05-08,Palisades ,H,07:00:00,dis,dep,dir
vbbaseb,2005-05-10,Saucon Valley,A,04:00:00,dis,dep,dir
vbbaseb,2005-05-12,Salisbury,A,04:00:00,dis,dep,dir

*******************************************

dis = dismissal time
dep = departure time
dir = direction code for the directions form

I don't know of any way to automate adding these values, as they
differ for each item and are not on the Schedule Star schedules.

Also, the time needs to be converted to 24-hour format for the database
(other than Perl, I am not sure how this could be done with some simple
text processing commands).
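
One possible approach uses plain awk.  This is only a sketch: the
function name to24h and the default field number 3 are assumptions made
for illustration, not part of the script in this post.  It rewrites one
"hh:mm:ss AM/PM" field of each CSV line as 24-hour time and leaves lines
without such a field untouched.

```shell
#!/bin/bash
# to24h [FIELD]: hypothetical helper, reads CSV lines on stdin and converts
# the given field (default 3) from "hh:mm:ss AM/PM" to 24-hour "hh:mm:ss".
to24h() {
    awk -F, -v OFS=, -v col="${1:-3}" '
    $col ~ /^[0-9]+:[0-9]+:[0-9]+ +[AP]M/ {
        split($col, parts, " ")      # parts[1] = clock, parts[2] = AM or PM
        split(parts[1], hms, ":")    # hms[1..3] = hour, minute, second
        hour = hms[1] + 0
        if (parts[2] == "PM" && hour < 12) hour += 12   # 1 PM..11 PM -> 13..23
        if (parts[2] == "AM" && hour == 12) hour = 0    # 12 AM -> 00
        $col = sprintf("%02d:%s:%s", hour, hms[2], hms[3])
    }
    { print }'
}
```

For example, the line "03/31/2005,Home,03:45:00 PM,Catasauqua" piped
through "to24h 3" comes out as "03/31/2005,Home,15:45:00,Catasauqua".
The helper would have to run before the AM/PM markers are stripped, since
it needs them to decide whether to add 12 hours.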

This is my script.  It is a bit convoluted and I am sure it could be
trimmed down.  I was wondering what changes others might make to compact
it or do this better:

*******************************

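#!/bin/bash
# Usage: ./processor infile team_abbreviation

# Join the continuation lines onto each game row, collapse the comma runs,
# and keep opponent, Home/Away and time.
sed -e :a -e '$!N;s/\n,/ /;ta' -e 'P;D' "$1" | sed 's/,,*,/,/g' |
  sed 's/-- / /g' | cut -d "," -f2,3,5,9 | sed 's/Home/H/g' |
  sed 's/Away/A/g' | sed 's/AM//g' | sed 's/PM//g' | sed 's/ ,/,/g' |
  awk -F, '{OFS=","; print $1,$4,$2,$3}' | cut -d, -f2,3,4 > "$1.input.tmp"

# Same pipeline again, this time keeping only the date and turning
# mm/dd/yyyy into yyyy-mm-dd.
sed -e :a -e '$!N;s/\n,/ /;ta' -e 'P;D' "$1" | sed 's/,,*,/,/g' |
  sed 's/-- / /g' | cut -d "," -f2,3,5,9 | sed 's/Home/H/g' |
  sed 's/Away/A/g' | sed 's/AM//g' | sed 's/PM//g' | sed 's/ ,/,/g' |
  awk -F, '{OFS=","; print $1,$4,$2,$3}' | cut -d, -f1 | sed 's/\//-/g' |
  awk -F- '{OFS="-"; print $3,$1,$2}' > "$1.date.tmp"

# Put the date back in front of the other columns.
paste -d, "$1.date.tmp" "$1.input.tmp" > "$1.paste.tmp"

# Prefix the team abbreviation and append the placeholder columns.
cat "$1.paste.tmp" | xargs -I {} echo "$2,{},dis,dep,dir" > "$2.final.csv"

rm "$1".*.tmp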

*********************************************************************

This was pieced together one step at a time and I did not go back and 
remove the redundancy yet.  It's functional to the point that I can now 
bring it back into OpenOffice, add the missing information, export it to 
csv again and import it into the database.
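
As a rough sketch of how the redundancy might be removed, the two
pipelines could be collapsed into a single awk pass.  The function name
process_schedule is made up for illustration, and the sketch assumes the
layout shown in the sample above: a game row starting with the day name,
followed by a continuation row whose only field containing letters is the
opponent.  It is meant to produce the same lines as the script, trailing
spaces after some team names included.

```shell
#!/bin/bash
# process_schedule FILE TEAM: hypothetical single-pass version of the script.
process_schedule() {
    awk -F, -v OFS=, -v team="$2" '
    $1 != "" {
        # Game row: day name, mm/dd/yyyy, Home/Away, ..., "hh:mm:ss AM/PM".
        split($2, d, "/")
        day = d[3] "-" d[1] "-" d[2]        # mm/dd/yyyy -> yyyy-mm-dd
        loc = substr($3, 1, 1)              # "Home" -> H, "Away " -> A
        clock = ""
        for (i = 4; i <= NF; i++)
            if ($i ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]/) {
                clock = substr($i, 1, 8)    # drop the AM/PM marker
                break
            }
        next
    }
    {
        # Continuation row: print once the field holding the opponent is found.
        for (i = 1; i <= NF; i++)
            if ($i ~ /[A-Za-z]/) {
                print team, day, $i, loc, clock, "dis", "dep", "dir"
                break
            }
    }' "$1"
}
```

It would be called the same way as the script, for example
"process_schedule vbbaseb.csv vbbaseb > vbbaseb.final.csv", and needs no
temporary files.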

I execute:  ./processor infile team_abbreviation

./processor vbbaseb.csv vbbaseb

-- 
Dann S. Washko
The Linux Link Tech Show
Check Us Out Weekly:  Live/Stream/Podcast

      http://www.thelinuxlink.net

The Linux Link Web Radio Portal
    TLLTS -- LUGRadio -- The LinuxBox Show


get slack (www.slackware.com) and get happy


