I wanted to write a Greasemonkey script to modify all mailto: links on a page, but to do that, I had to find them first.
Here’s my first attempt:
//a[@href]/text()[contains(.,"@")]
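It’s pretty bad: it selects the text nodes of links whose visible text contains “@”, and never looks at where the link points. I wanted to find links starting with “mailto:” but couldn’t figure out how to operate on the href attribute.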
Second attempt:
//a/@href[contains(.,"mailto:")]/..
Here you can see that I managed to operate on the href attribute and then step back up to the a node. I learned how to select the attribute itself, instead of only using the attribute to select the node.
Third attempt:
//a[contains(@href,"mailto:")]
Simplified even further!
Fourth attempt:
//a[starts-with(@href,"mailto:")]
I knew there was a starts-with function… I just had to look it up.
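For anyone who wants to drop the final expression into a userscript, here is a minimal sketch of how it might be evaluated with the browser's standard document.evaluate call. The helper name findMailtoLinks is made up for illustration, and the function takes the document as a parameter only so it can be tried outside a browser; treat it as one possible shape, not the script I ended up with.

```javascript
// Sketch: collect every <a> whose href starts with "mailto:" via XPath.
// `findMailtoLinks` is a hypothetical helper name, not from the original post.
const MAILTO_XPATH = '//a[starts-with(@href,"mailto:")]';

// 7 is the numeric value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE.
const ORDERED_NODE_SNAPSHOT_TYPE = 7;

function findMailtoLinks(doc) {
  // A snapshot result is a static list, so it stays valid while links are modified.
  const snapshot = doc.evaluate(MAILTO_XPATH, doc, null, ORDERED_NODE_SNAPSHOT_TYPE, null);
  const links = [];
  for (let i = 0; i < snapshot.snapshotLength; i++) {
    links.push(snapshot.snapshotItem(i));
  }
  return links;
}

// In a userscript, `document` is the page being viewed.
if (typeof document !== "undefined") {
  for (const link of findMailtoLinks(document)) {
    console.log(link.href);
  }
}
```

The snapshot type is used rather than an iterator because an iterator result becomes invalid as soon as the page is changed, which is exactly what a link-rewriting script does.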
Just try strpos(…) for “mailto” after the spider process.
What if there is no link, that is, no hyperlink to the email ID?
How would I find the email ID in that case?
I’d suggest trying strpos or an email regex to capture all the email addresses on a page. A regex match gives you an array of every email it can find on that page, even if there’s no hyperlink to that email, which is the case for many websites out there.
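The regex idea can be sketched roughly as follows. strpos is a PHP function; this version is JavaScript to match the Greasemonkey setting of the post. The function name extractEmails and the pattern are assumptions made for illustration, and the pattern is deliberately simple, so it will miss some valid addresses and may accept some invalid ones.

```javascript
// Sketch: pull email-like strings out of plain page text, linked or not.
// `extractEmails` and EMAIL_PATTERN are illustrative assumptions; the pattern
// is intentionally simple and does not cover every valid address.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+/g;

function extractEmails(text) {
  // String.prototype.match with a global regex returns all matches, or null.
  const matches = text.match(EMAIL_PATTERN) || [];
  // A Set drops duplicates while keeping first-seen order.
  return [...new Set(matches)];
}

const sampleText = "Write to jane@example.com or bob.smith@mail.example.org. No link here.";
console.log(extractEmails(sampleText));
```

In a userscript the same function could be fed document.body.textContent, which covers addresses that appear only as visible text.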