[Techtalk] website link checking that does orphans?

Miriam English mim at miriam-english.org
Tue Apr 14 06:35:05 UTC 2015


Very cool! I used your weborphans on my site and found 2 bad links that 
I didn't know about (left over from a sitewide reorganisation when I 
moved servers).

However, it did get stuck in an infinite loop, bouncing between two of my 
pages -- chapters 7 and 8 of one of my online novels, "Companions".

Substituted characters, recombined to 
http://miriam-english.org/stories/books/Companions/7%20-%20Interview.html
Substituted characters, recombined to 
http://miriam-english.org/stories/books/Companions/8%20-%20Picnic.html
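
A ping-pong like this between two pages that link to each other (previous
chapter / next chapter) is what a crawler tends to do when it either keeps
no record of pages already visited, or records them under a different
spelling than the one it looks up (for example "7 - Interview.html" versus
"7%20-%20Interview.html"). That is only a guess at the cause, not a reading
of the weborphans source. A minimal sketch of the usual guard follows; the
names normalize, crawl and get_links are made up for illustration.

```python
from urllib.parse import quote, unquote, urldefrag, urljoin, urlsplit, urlunsplit


def normalize(base_url, href):
    """Resolve href against base_url and return one canonical spelling."""
    # Make the link absolute and drop any "#fragment" part.
    absolute, _fragment = urldefrag(urljoin(base_url, href))
    parts = urlsplit(absolute)
    # Decode, then re-encode the path, so that "7 - Interview.html" and
    # "7%20-%20Interview.html" end up as the same string.
    path = quote(unquote(parts.path))
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, parts.query, ""))


def crawl(start_url, get_links):
    """Visit every reachable page exactly once.

    get_links(url) is a hypothetical callback that returns the raw href
    strings found on that page.
    """
    seen = set()
    todo = [normalize(start_url, "")]
    while todo:
        url = todo.pop()
        if url in seen:
            continue  # already handled, so no bouncing back and forth
        seen.add(url)
        for href in get_links(url):
            target = normalize(url, href)
            if target not in seen:
                todo.append(target)
    return seen
```

The point of normalizing before the "seen" check is that both the stored
and the looked-up form go through the same function, so two spellings of
one chapter cannot be queued as two different pages.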

I tried using linklint and gave up after several fruitless minutes of 
trying to get it to work. I'll try it again later when I have more patience.

Cheers,

	- Miriam

On 14/04/15 11:56, Akkana Peck wrote:
> Miriam English writes:
>> Interesting problem. I checked back on my notes about linklint. I remember
>> it worked quite differently to how I expected and I really struggled with
>> it. In my notes I wrote to myself that it was "ridiculously difficult to
>> program".
>
> Thanks for the suggestions about using linklint.
>
> But looking through the suggestions, and thinking about how I'd have
> to wget the whole site with special arguments to save to a temporary
> directory every time I wanted to do a link check, I decided that if
> for some bizarre reason this tool didn't already exist, it should.
> So I wrote it.
>
> https://github.com/akkana/scripts/blob/master/weborphans
>
> The hard part turned out to be turning all links into absolute
> links, then turning those into equivalent paths on the local
> filesystem. That sounded easy but turned out to have a lot of tricky
> aspects (I'm still working on some edge cases).  But it's good
> enough that I was able to find the 10 bad links and 606 orphaned
> files on this website I inherited.
>
>          ...Akkana
> _______________________________________________
> Techtalk mailing list
> Techtalk at linuxchix.org
> http://mailman.linuxchix.org/mailman/listinfo/techtalk
>
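
The two steps Akkana describes (make every link absolute, then turn it into
a path on the local filesystem) can be sketched roughly as below. This is
not taken from weborphans; the function name, the "/var/www" document root
in the usage note, and the rule that a URL ending in "/" means index.html
are all assumptions for the sake of the example.

```python
import posixpath
from pathlib import Path
from urllib.parse import unquote, urljoin, urlsplit


def link_to_local_path(page_url, href, site_url, doc_root):
    """Map a link found on page_url to a file path under doc_root.

    Returns None for links that point off the site (other hosts, mailto:).
    """
    # Step 1: resolve relative links such as "../index.html" against the page.
    absolute = urljoin(page_url, href)
    link, site = urlsplit(absolute), urlsplit(site_url)
    if link.netloc.lower() != site.netloc.lower():
        return None
    # Step 2: take the path only (urlsplit keeps the query and fragment
    # separate), and decode %20 and friends back to real characters.
    path = unquote(link.path) or "/"
    if path.endswith("/"):
        path += "index.html"  # assumption: directories serve index.html
    # Collapse any leftover "." or ".." and make it relative to doc_root.
    relative = posixpath.normpath(path).lstrip("/")
    return Path(doc_root) / relative
```

With a hypothetical document root of /var/www, a link to
"8%20-%20Picnic.html#top" on the chapter 7 page would map to
/var/www/stories/books/Companions/8 - Picnic.html. A link whose mapped file
does not exist is a bad link, and a file on disk that no link maps to is an
orphan candidate.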

-- 
If you don't have any failures then you're not trying hard enough.
  - Dr. Charles Elachi, director of NASA's Jet Propulsion Laboratory
-----
Website: http://miriam-english.org
Blogs:   http://miriam-e.dreamwidth.org
          http://miriam-e.livejournal.com



