
MMS zip download link problem -- file naming

Posted: Sun Jan 21, 2007 9:25 pm
by whitingeric
First -- thanks for the new gospel library navigation. Tons better...

Simple problem: ... -1,00.html
The April 2006 conference entry has some bad HTML -- its link references a J: drive path with 'wade' in the filename, so the link fails.

<td valign="top" class="featurestext"><a href="" class="featureslink">General Conference, April 2006</a><font color="#316398">  (200 KB)</font></td>
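A link like this could be caught automatically. Here is a minimal Perl sketch, assuming a broken link shows up either as an empty href or as one starting with a Windows drive letter. The sub name `suspicious_hrefs` and the sample markup are made up for illustration; they are not taken from the actual page.

```perl
use strict;
use warnings;

# Return the href values in $html that look broken: empty, or starting
# with a Windows drive letter such as "J:".
sub suspicious_hrefs {
    my ($html) = @_;
    my @bad;
    while ($html =~ /href="([^"]*)"/gi) {
        my $href = $1;
        push @bad, $href if $href eq '' || $href =~ /^[A-Za-z]:[\\\/]/;
    }
    return @bad;
}

# Hypothetical markup for illustration only; the path is invented.
my $sample = '<a href="J:\\example\\file.zip">x</a> '
           . '<a href="http://example.com/a.zip">y</a>';
print "suspicious: $_\n" for suspicious_hrefs($sample);
```

A regex is enough for a quick check on one page; a real HTML parser would be more robust against unusual quoting.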

More difficult problem:
I used a simple Perl script to wget the HTML and then wget all the zip files referenced on the main page. I discovered that the file naming conventions are interesting...

Code:

# Fetch the index page (URL truncated in this post) and wget every linked .zip.
foreach $_ (`wget [url],18495,344-1-1,00.html[/url]  -q -O -`) {
    next unless (/\.zip/);
    if (/href="(\S+\.zip)/) {
        my $link = $1;   # URL captured by the regex above
        `wget $link`;
    }
}

The case is a little mixed up, and hyphens and underscores are not used consistently. Not a big deal at all, but it makes the file listing interesting. :) I think there are at least 6 different ways that 'Mark My Scriptures' is spelled:


Also the Ensign files are all named differently.
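One way to count spelling variants like these is to reduce each name to a comparison key and group on it. This is a rough Perl sketch, assuming the variants differ only in case, hyphens, underscores and spaces. The sub names `normalize_name` and `group_variants` and the sample names are hypothetical, not taken from the real listing.

```perl
use strict;
use warnings;

# Reduce a file name to a comparison key: lowercase, with hyphens,
# underscores and spaces removed, so spelling variants collapse together.
sub normalize_name {
    my ($name) = @_;
    my $key = lc $name;
    $key =~ s/[-_ ]//g;
    return $key;
}

# Group the given names by normalized key; returns a hash reference
# mapping each key to the list of original names that share it.
sub group_variants {
    my @names = @_;
    my %groups;
    push @{ $groups{ normalize_name($_) } }, $_ for @names;
    return \%groups;
}

# Hypothetical names for illustration only, not the real file listing.
my @sample = ('Mark_My_Scriptures.zip', 'mark-my-scriptures.zip',
              'MarkMyScriptures.zip');
my $groups = group_variants(@sample);
for my $key (sort keys %$groups) {
    printf "%s: %d variant(s)\n", $key, scalar @{ $groups->{$key} };
}
```

To run it against the downloaded files, replace `@sample` with `glob('*.zip')`. Names that also differ in other ways (dates, abbreviations) would still land in separate groups.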

Here is the output of ls *.zip after the script runs (try to find all the Ensign files, in any order):