[Lvlug] Re: Writing an awk or Ruby script
Randy Kramer
rhkramer at gmail.com
Mon Jan 8 16:36:34 EST 2007
Ok, never mind--I got something working (sort of a proof of concept) in awk.
I may have some questions, but I'll get back to the list if/when I do.
Randy Kramer
On Sunday 07 January 2007 06:35 pm, Randy Kramer wrote:
> This was mentioned in another thread, and I was going to start by
> converting my existing nedit macro and then extending it, but I decided to
> make it simpler by keeping the existing nedit macro and using it for the
> (one-time) partial conversion. The following steps will be required
> "continuously" in the future (i.e., not just for the one time conversion).
>
> As mentioned in the other post, I'll consider either an awk or a Ruby
> approach to doing this.
>
> Given a file that looks much like an mbox file, that is, with a header for
> each record something like this:
>
> <example record>
> From rhk Wed Dec 27 00:05:00 2006
> Date: 27 Dec 2006 00:00:00 -0500
> From: rhk
> To: rhk
> Subject: *00000001* A record title
>
> UniqFN: "2006 12 27 00:00:00 *00000001*.aml"
> <some other (non-email-like) headers>
>
> <record content (text)>
>
> </example record>
>
> Break the file apart so that each record is in a separate file (complete
> with the "From " record separator).
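A minimal Ruby sketch of the splitting step, assuming (as in mbox) that a line beginning with "From " only ever marks the start of a record and that body lines starting with "From " have been escaped. The name split_records is hypothetical, not part of the original macro.

```ruby
# Split mbox-like text into an array of records.
# Assumption: every line starting with "From " begins a new record.
# Any lines before the first "From " line are kept as a leading record
# rather than being dropped.
def split_records(text)
  records = []
  text.each_line do |line|
    # Start a new record at each separator (or at the very first line).
    records << String.new if line.start_with?("From ") || records.empty?
    records.last << line
  end
  records
end
```

Each returned string still begins with its "From " separator line, so writing it out unchanged gives one complete record per file.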
>
> The name of the file should be either:
>
> The content of the UniqFN "field", e.g.: "2006 12 27 00:00:00
> *00000001*.aml"
>
> or:
>
> The text from the Subject: "field" plus a .aml extension, e.g.: "A record
> title.aml"
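The two naming rules might be sketched in Ruby as below. The helper names are hypothetical. Dropping the leading *00000001* marker from the Subject is inferred from the "A record title.aml" example above; replacing "/" with "_" is an added assumption, since "/" cannot appear in a filename.

```ruby
# Return the UniqFN value without its surrounding quotes, or nil if absent.
def uniqfn_name(record)
  m = record.match(/^UniqFN:[ \t]*"([^"\n]+)"/)
  m && m[1]
end

# Return "<subject text>.aml", or nil if there is no usable Subject: line.
def subject_name(record)
  m = record.match(/^Subject:[ \t]*(.*)$/)
  return nil unless m
  # Drop a leading "*digits*" marker, as in the example record.
  title = m[1].sub(/\A\*\d+\*\s*/, "").strip
  # Assumption: "/" is replaced because it is not valid in a filename.
  title = title.tr("/", "_")
  title.empty? ? nil : "#{title}.aml"
end
```

Both helpers return the first matching line in the record, so a "Subject:" line later in the record content is not picked up as long as the header has one.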
>
> In the case of the UniqFN filename, there are no duplicates (at least
> during this initial mass conversion step)--I will have to be prepared to
> deal with possible duplicates in the future (see the procedure outlined
> below for the Subject filename).
>
> In the case of the Subject filename, there is a reasonable chance there
> will be duplicates, and I want a good/user friendly way of handling them,
> perhaps something like this:
>
> * Check the directory the file would be written to, to see whether a
> file with that name already exists.
> * If so, warn the user and display the proposed filename in an edit box
> where the user can change it to a different filename.
> * Check the directory again to confirm the edited filename is not a
> duplicate, and repeat the previous step if necessary.
> * [Ideally, the Subject: field in the record should be changed to
> reflect the edited title.]
> * Write the file to the directory with that (non-duplicate) filename.
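The check, warn, edit, and re-check loop above could look roughly like this in Ruby. This is only a sketch: ask is a hypothetical stand-in for the edit box (any callable that takes the proposed name and returns the edited one), and retitle assumes the new title is the filename minus ".aml", with any leading *digits* marker in the Subject left in place.

```ruby
# Keep asking for a new name until it does not collide with a file in dir.
# `ask` is a hypothetical callable standing in for the edit box.
def resolve_duplicate(dir, name, ask:)
  while File.exist?(File.join(dir, name))
    warn "File already exists: #{name}"
    name = ask.call(name)
  end
  name
end

# Rewrite the first Subject: line to reflect the chosen filename,
# keeping a leading "*digits*" marker if the line has one.
def retitle(record, filename)
  title = File.basename(filename, ".aml")
  record.sub(/^(Subject:[ \t]*(?:\*\d+\*[ \t]*)?).*$/) { "#{$1}#{title}" }
end
```

For a terminal-only version, ask could be something like ->(n) { print "New name [#{n}]: "; $stdin.gets.chomp }. Note that the existence check and the later write are separate steps, so another process could still create the file in between.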
>
> An alternative could be to let the script append a (serial) number to any
> duplicate filename to make it unique--this eliminates the need for
> operator intervention. As above, the Subject: field should be modified to
> reflect the new title.
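One possible Ruby sketch of the serial-number alternative. The " (2)", " (3)" suffix format and the name unique_name are assumptions made for illustration; the post does not say how the number should be appended.

```ruby
# Return name unchanged if it is free in dir; otherwise append the first
# free serial number before the extension, e.g. "Title (2).aml".
# Assumption: the " (n)" suffix style is illustrative only.
def unique_name(dir, name)
  return name unless File.exist?(File.join(dir, name))
  ext  = File.extname(name)
  base = File.basename(name, ext)
  n = 2
  n += 1 while File.exist?(File.join(dir, "#{base} (#{n})#{ext}"))
  "#{base} (#{n})#{ext}"
end
```

The Subject: line would then need to be updated to the returned name minus its extension, so the record and its filename stay in step.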
>
> Randy Kramer
>
> PS: I don't think it matters here, but my current plan is that these files
> will have multiple filenames created via hardlinks, so any particular file
> will have a UniqFN and one or more mnemonic file names (from one or more
> titles). I'll start with one filename per file, and don't care too much
> which. (I'll always have a way of finding either from the other.)
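For the hard-link plan in the PS, Ruby's File.link creates an additional name for an existing file. A small sketch, assuming both names live in the same directory (hard links cannot cross filesystems); add_alias is a hypothetical helper name.

```ruby
# Give an existing file a second name in the same directory via a hard link.
# File.link raises Errno::EEXIST if new_name is already taken.
def add_alias(dir, existing, new_name)
  File.link(File.join(dir, existing), File.join(dir, new_name))
end
```

After the call, File.identical? on the two paths returns true, which is one way of confirming that a mnemonic name and a UniqFN name refer to the same file.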