[Lvlug] A bit of fun with this script
dann s washko
dann at thelinuxlink.net
Wed Dec 7 18:59:48 EST 2005
I was adding some tables to a calendar database today to include the
sports schedules at work. Normally I'd get printed copies of the
schedules and have to type the information in manually. The athletic
director uses a program called Schedule Star and recently moved to the
online version, which publishes the schedules on the web. We also offer
the schedules on our website, with door-to-door directions, and want to
keep them there. Furthermore, by including them in the
database we can do more "fun" things down the line.
Schedule Star offers an RSS feed of the schedules, but it only provides a
daily or weekly feed. I cannot get the entire schedule for a team, which
is a bummer, or I would pull the information out of the XML.
I'm sick of retyping the information, so I found that if I save the
webpage and open it in OpenOffice 2.0 I can copy the table out into a
spreadsheet and export it to a CSV file. The file is mangled and looks
like this:
***********************
Friday,03/31/2005,Home,,,,,,,,,,,,,,,,,,,,,,-- ,03:45:00 PM,-- ,-- ,--
,,,,,,,Catasauqua ,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Monday,04/03/2005,Home,,,,,,,,,,,,,,,,,,,,,,-- ,04:00:00 PM,-- ,-- ,--
,,,,Notre Dame GP ,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Wednesday,04/05/2005,Away ,,,,,,,,,,,,,,,,,,,,,,-- ,04:00:00 PM,-- ,-- ,--
,,,,,,,,,,,,Bangor,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Friday,04/07/2005,Away ,,,,,,,,,,,,,,,,,,,,,,-- ,04:00:00 PM,-- ,-- ,--
,,,,,,,,,,,,,Wilson,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Monday,04/10/2005,Home,,,,,,,,,,,,,,,,,,,,,,-- ,04:00:00 PM,-- ,-- ,--
,,,,,,,,,Palmerton ,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Wednesday,04/12/2005,Away ,,,,,,,,,,,,,,,,,,,,,,-- ,04:00:00 PM,-- ,-- ,--
,,,,,,,,,Palisades,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Monday,04/17/2005,Home,,,,,,,,,,,,,,,,,,,,,,-- ,11:00:00 AM,-- ,-- ,--
,,,,,,,,Quakertown ,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Wednesday,04/19/2005,Home,,,,,,,,,,,,,,,,,,,,,,-- ,07:00:00 PM,-- ,-- ,--
,,,,,Saucon Valley ,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Friday,04/21/2005,Home,,,,,,,,,,,,,,,,,,,,,,-- ,07:00:00 PM,-- ,-- ,--
,,,,,,,,,,,Salisbury ,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Monday,04/24/2005,Home,,,,,,,,,,,,,,,,,,,,,,-- ,07:00:00 PM,-- ,-- ,--
,,,,Northern Lehigh ,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Wednesday,04/26/2005,Away ,,,,,,,,,,,,,,,,,,,,,,-- ,04:00:00 PM,-- ,-- ,--
,,,,,,,Catasauqua,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Friday,04/28/2005,Away ,,,,,,,,,,,,,,,,,,,,,,-- ,04:00:00 PM,-- ,-- ,--
,,,,Notre Dame GP,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Monday,05/01/2005,Home,,,,,,,,,,,,,,,,,,,,,,-- ,07:00:00 PM,-- ,-- ,--
,,,,,,,,,,,,Bangor ,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Wednesday,05/03/2005,Away ,,,,,,,,,,,,,,,,,,,,,,-- ,04:00:00 PM,-- ,-- ,--
,,,,,,Northwestern,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Friday,05/05/2005,Home,,,,,,,,,,,,,,,,,,,,,,-- ,07:00:00 PM,-- ,-- ,--
,,,,,,,,,,Pen Argyl ,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Monday,05/08/2005,Home,,,,,,,,,,,,,,,,,,,,,,-- ,07:00:00 PM,-- ,-- ,--
,,,,,,,,,Palisades ,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Wednesday,05/10/2005,Away ,,,,,,,,,,,,,,,,,,,,,,-- ,04:00:00 PM,-- ,-- ,--
,,,,,Saucon Valley,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Friday,05/12/2005,Away ,,,,,,,,,,,,,,,,,,,,,,-- ,04:00:00 PM,-- ,-- ,--
,,,,,,,,,,,Salisbury,,,,,,,,,,,,,,,,,
***************************
My Perl skills are almost non-existent (I should have stuck with
learning it), so I used some standard text-processing commands to
format the data the way I want it. This is the result of my script:
***********************************
vbbaseb,2005-03-31,Catasauqua ,H,03:45:00,dis,dep,dir
vbbaseb,2005-04-03,Notre Dame GP ,H,04:00:00,dis,dep,dir
vbbaseb,2005-04-05,Bangor,A,04:00:00,dis,dep,dir
vbbaseb,2005-04-07,Wilson,A,04:00:00,dis,dep,dir
vbbaseb,2005-04-10,Palmerton ,H,04:00:00,dis,dep,dir
vbbaseb,2005-04-12,Palisades,A,04:00:00,dis,dep,dir
vbbaseb,2005-04-17,Quakertown ,H,11:00:00,dis,dep,dir
vbbaseb,2005-04-19,Saucon Valley ,H,07:00:00,dis,dep,dir
vbbaseb,2005-04-21,Salisbury ,H,07:00:00,dis,dep,dir
vbbaseb,2005-04-24,Northern Lehigh ,H,07:00:00,dis,dep,dir
vbbaseb,2005-04-26,Catasauqua,A,04:00:00,dis,dep,dir
vbbaseb,2005-04-28,Notre Dame GP,A,04:00:00,dis,dep,dir
vbbaseb,2005-05-01,Bangor ,H,07:00:00,dis,dep,dir
vbbaseb,2005-05-03,Northwestern,A,04:00:00,dis,dep,dir
vbbaseb,2005-05-05,Pen Argyl ,H,07:00:00,dis,dep,dir
vbbaseb,2005-05-08,Palisades ,H,07:00:00,dis,dep,dir
vbbaseb,2005-05-10,Saucon Valley,A,04:00:00,dis,dep,dir
vbbaseb,2005-05-12,Salisbury,A,04:00:00,dis,dep,dir
*******************************************
dis = dismissal time
dep = departure time
dir = direction code for the directions form
I don't know of any way to automate adding these values, as they
differ for each item and are not on the Schedule Star schedules.
Also, the time needs to be converted to 24-hour format for the database
(other than Perl, I am not sure how this could be done with some simple
text-processing commands).
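For what it's worth, awk on its own looks like it could handle the
12-hour to 24-hour conversion. Here is a minimal sketch, assuming every
time looks like "HH:MM:SS AM" or "HH:MM:SS PM" as in the sample above.
The function name to24 is made up for illustration and is not part of
my script:

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): read times of
# the form "HH:MM:SS AM" or "HH:MM:SS PM" on stdin, one per line, and
# print them in 24-hour "HH:MM:SS" form.
to24() {
  awk '{
    split($1, t, ":")                     # t[1]=hour, t[2]=minute, t[3]=second
    h = t[1] + 0                          # force the hour to a number
    if ($2 == "PM" && h != 12) h += 12    # 04 PM becomes 16, 12 PM stays 12
    if ($2 == "AM" && h == 12) h = 0      # 12 AM becomes 00
    printf "%02d:%s:%s\n", h, t[2], t[3]
  }'
}

echo "03:45:00 PM" | to24
echo "11:00:00 AM" | to24
```

The AM/PM marker has to still be on the line when this runs, so it
would need to go before the sed steps that strip "AM" and "PM".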
This is my script. It is a bit convoluted and I am sure it could be
trimmed down. I was wondering what changes others might make to compact
it or do this better:
*******************************
#!/bin/bash
# Usage: ./processor infile team_abbreviation
# Pass 1: opponent, H/A and time go to $1.input.tmp
sed -e :a -e '$!N;s/\n,/ /;ta' -e 'P;D' "$1" | sed 's/,,*,/,/g' |
  sed 's/-- / /g' | cut -d "," -f2,3,5,9 | sed 's/Home/H/g' |
  sed 's/Away/A/g' | sed 's/AM//g' | sed 's/PM//g' | sed 's/ ,/,/g' |
  awk -F, '{OFS=","; print $1,$4,$2,$3}' | cut -d, -f2,3,4 > "$1.input.tmp"
# Pass 2: same pipeline, but keep only the date and turn MM/DD/YYYY
# into YYYY-MM-DD
sed -e :a -e '$!N;s/\n,/ /;ta' -e 'P;D' "$1" | sed 's/,,*,/,/g' |
  sed 's/-- / /g' | cut -d "," -f2,3,5,9 | sed 's/Home/H/g' |
  sed 's/Away/A/g' | sed 's/AM//g' | sed 's/PM//g' | sed 's/ ,/,/g' |
  awk -F, '{OFS=","; print $1,$4,$2,$3}' | cut -d, -f1 | sed 's/\//-/g' |
  awk -F- '{OFS="-"; print $3,$1,$2}' > "$1.date.tmp"
# Glue the two halves together, then add the team code and the
# placeholder columns
paste -d, "$1.date.tmp" "$1.input.tmp" > "$1.paste.tmp"
cat "$1.paste.tmp" | xargs -I {} echo "$2,{},dis,dep,dir" > "$2.final.csv"
rm "$1".*.tmp
*********************************************************************
This was pieced together one step at a time and I have not yet gone back
to remove the redundancy. It is functional to the point that I can now
bring the output back into OpenOffice, add the missing information,
export it to CSV again and import it into the database.
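One possible way to trim it down would be to keep the sed line join and
do everything else in a single awk pass. This is only a sketch, and it
assumes that each joined record holds exactly five real values in this
order: weekday, date, Home/Away, time, opponent, with everything else
empty or "--". The function name process_schedule is made up. Unlike my
script it also trims the trailing space after the opponent name, and it
still leaves the time in 12-hour form with the AM/PM dropped:

```shell
#!/bin/bash
# Hypothetical single-pass variant; process_schedule is an invented name.
# Usage: process_schedule infile team_abbreviation
process_schedule() {
  # Join continuation lines (those starting with a comma) onto the
  # previous line, exactly as the original script does.
  sed -e :a -e '$!N;s/\n,/ /;ta' -e 'P;D' "$1" |
  awk -F, -v team="$2" '{
    n = 0
    for (i = 1; i <= NF; i++) {
      f = $i
      gsub(/^ +| +$/, "", f)                 # trim surrounding spaces
      if (f != "" && f != "--") v[++n] = f   # keep only real values
    }
    if (n < 5) next                          # skip incomplete records
    # v[1]=weekday v[2]=MM/DD/YYYY v[3]=Home or Away
    # v[4]=time with AM/PM v[5]=opponent
    split(v[2], d, "/")                      # d[1]=MM d[2]=DD d[3]=YYYY
    split(v[4], t, " ")                      # t[1]=HH:MM:SS, t[2]=AM or PM
    printf "%s,%s-%s-%s,%s,%s,%s,dis,dep,dir\n",
           team, d[3], d[1], d[2], v[5], substr(v[3], 1, 1), t[1]
  }'
}
```

It would be called as: process_schedule vbbaseb.csv vbbaseb > vbbaseb.final.csv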
I execute it as: ./processor infile team_abbreviation
For example: ./processor vbbaseb.csv vbbaseb
--
Dann S. Washko
The Linux Link Tech Show
Check Us Out Weekly: Live/Stream/Podcast
http://www.thelinuxlink.net
The Linux Link Web Radio Portal
TLLTS -- LUGRadio -- The LinuxBox Show
get slack (www.slackware.com) and get happy