Thunderstone Search Appliance Manual

Data From Field Example - Subfetch to use PDF contents for a Web Page

Subfetches let you use content from other URLs to populate the current URL's record. For example, a site may publish articles where each article has a web page describing it and a link to a PDF of the article itself. We'd like searches that match the article's contents to lead to the web page, not to the PDF.

If the web page has a meta header called "pdfLink" containing the URL of the article PDF, we can replace the web page's body with the text of the PDF using two Data from Field rules like this:

Rule 1:
    REX Search       (Empty)
    Replace          (Empty)
    From Field       Meta Field
    From Meta Field  pdfLink
    To Field         Subfetch

Rule 2:
    REX Search       .+
    Replace          (Empty)
    From Field       Text
    From Meta Field  (Empty)
    To Field         Body

The first rule, whose To Field is Subfetch, fetches the URL given in the pdfLink meta header. Fetching the PDF changes nothing in the web page's record by itself. The second rule then reads the fetched PDF's Text field and stores it as the Body of the current web page, so searches that match the article's contents return the web page.
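
The flow of the two rules can be sketched in Python. This is only an illustrative model, not the appliance's implementation: the `apply_rules` function, the `FAKE_FETCH` table, the record layout, and the example URLs are all hypothetical. It also assumes that an empty REX Search passes the value through unchanged, that an empty Replace keeps the matched text, and that rules following a Subfetch read from the fetched document. REX patterns are approximated here with Python regular expressions.

```python
import re

# Hypothetical stand-in for the crawler's fetcher: URL -> extracted text.
FAKE_FETCH = {
    "http://example.com/articles/42.pdf": "Full text of article 42.",
}

# The two rules from the example, as plain dictionaries (assumed layout).
RULES = [
    {"search": "", "from_field": "Meta Field",
     "from_meta_field": "pdfLink", "to_field": "Subfetch"},
    {"search": ".+", "from_field": "Text",
     "from_meta_field": "", "to_field": "Body"},
]


def apply_rules(record, rules, fetch):
    """Apply the rules in order and return an updated copy of the record."""
    record = dict(record)
    subfetched = None  # fields of the most recently subfetched document
    for rule in rules:
        # Choose the source value: a meta header of the page, or a field
        # of the subfetched document (falling back to the page itself).
        if rule["from_field"] == "Meta Field":
            value = record["meta"].get(rule["from_meta_field"], "")
        else:
            source = subfetched if subfetched is not None else record
            value = source.get(rule["from_field"], "")
        # Assumption: an empty search means "use the value unchanged".
        if rule["search"]:
            match = re.search(rule["search"], value, re.DOTALL)
            if not match:
                continue
            value = match.group(0)
        if not value:
            continue
        if rule["to_field"] == "Subfetch":
            # Fetch the other URL; the page's record is not modified yet.
            subfetched = {"Text": fetch.get(value, "")}
        else:
            record[rule["to_field"]] = value
    return record


page = {
    "Url": "http://example.com/articles/42.html",
    "meta": {"pdfLink": "http://example.com/articles/42.pdf"},
    "Body": "Summary page for article 42.",
}
result = apply_rules(page, RULES, FAKE_FETCH)
print(result["Url"])
print(result["Body"])
```

In this model the record keeps the web page's URL while its Body now holds the PDF's text, which is the behavior the two rules are meant to produce.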


Copyright © Thunderstone Software     Last updated: Jul 28 2017
