2009-02-11 by crccheck 3 comments

Using XPath to find email address links

I wanted to write a Greasemonkey script to modify all mailto: links on a page, but to do that, I have to find them first.

Here’s my first attempt:

//a[@href]/text()[contains(.,"@")]

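It’s pretty bad: it selects text nodes inside links when the text contains “@”, not the links themselves, and it misses any mailto: link whose visible text isn’t an email address. I wanted to find links starting with “mailto:” but couldn’t figure out how to operate on the href attribute.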

Second attempt:

//a/@href[contains(.,"mailto:")]/..

Here you can see that I managed to operate on the href attribute and then back up to the a node. I learned how to select the attribute itself, instead of only using the attribute to select the node.

Third attempt:

//a[contains(@href,"mailto:")]

Simplified even further!

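Fourth attempt:

//a[starts-with(@href,"mailto:")]

I knew there was a starts-with function… I just had to look it up. It’s also stricter than contains, which would match an href with “mailto:” anywhere in it, not just at the beginning.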
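
For what it’s worth, here is a rough sketch of how that last expression might be run from a userscript. It assumes the page exposes the standard DOM method document.evaluate; the helper name findMailtoLinks and the constant names are made up for illustration and are not from the original script.

```javascript
// Sketch: collect every <a> whose href starts with "mailto:".
// `doc` is expected to behave like a browser Document (it needs evaluate()).
// findMailtoLinks is a hypothetical helper name used only for this example.

// Numeric value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE in the DOM XPath API.
const ORDERED_NODE_SNAPSHOT_TYPE = 7;

const MAILTO_XPATH = '//a[starts-with(@href,"mailto:")]';

function findMailtoLinks(doc) {
  const snapshot = doc.evaluate(
    MAILTO_XPATH,
    doc,   // context node: search the whole document
    null,  // no namespace resolver needed for plain HTML
    ORDERED_NODE_SNAPSHOT_TYPE,
    null   // no existing result object to reuse
  );

  // Copy the snapshot into a plain array so it is easy to loop over.
  const links = [];
  for (let i = 0; i < snapshot.snapshotLength; i++) {
    links.push(snapshot.snapshotItem(i));
  }
  return links;
}
```

In a userscript, calling findMailtoLinks(document) should give back an array of the matching a elements in document order, ready to be modified one by one.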

3 Comments on "Using XPath to find email address links"

ion
April 11, 2011 1:55 pm

Just try strpos(…) mailto after the spider process

Paresh
July 11, 2014 12:19 pm

What if there is no link, or rather no hyperlink, present for the email ID?

How do I find the email ID in this case?

Email finder
August 25, 2019 11:02 am

I’d suggest trying strpos or an email regex to capture all the email addresses on a page. It would create an array of all the emails it can find on that page, even if there’s no hyperlink to that email, which is the case for many websites out there.
