Introduction

The Search Internet feature in the Sherlock application lets users perform Internet searches using one or more Internet search engines. Each search engine Sherlock uses is represented by a plug-in file that describes the format the engine expects for queries and the format it produces in its responses. These files are stored in the Internet Search Sites folder in the System Folder.

Developers can create a new plug-in to add a search engine to Sherlock's repertoire if they know how to interpret the HTML files that underlie search engine web pages and if they're proficient with tools such as BBEdit, a flexible text editor available from Bare Bones Software, and ResEdit, a free utility from Apple.

As a quick example, we'll create a simple plug-in to search the Apple web site. (You can learn more by experimenting and looking at the source text for the Sherlock plug-ins in the Internet Search Sites folder. Also take a look at Technote 1141: Extending and Controlling Sherlock.)

1. Open the System Folder, then open the Internet Search Sites folder.
2. Select one of the plug-in files. For this example, we'll use the AltaVista.src file.
3. From the File menu, choose Duplicate.
4. Rename the "AltaVista.src copy" file to Apple.src. (The copy will retain the AltaVista icons; you can remove or change them with ResEdit.)
5. Using a text editor such as BBEdit, open Apple.src. (To do so, select Any File from the pop-up menu in the BBEdit Open dialog. These are text files, but their file types are set to issp, which Mac OS 8.5 recognizes as Sherlock plug-ins.)
This is what you'll see:
    # © 1998 Apple Computer, Inc.
    <search
        name = "AltaVista"
        action = "http://www.altavista.com/cgi-bin/query"
        update="http://si.info.apple.com/updates/AltaVista.src.hqx"
        updateCheckDays = 3
        method = get
    >
    <input name="pg" value="q">
    <input name="what" value="web">
    <input name="fmt" value=".">
    <input name="kl" value="en">
    <input name="q" user>
    <interpret
        bannerStart="</layer>"
        bannerEnd="<BODY>"
        resultListStart="RealName (sm)"
        resultListEnd="Pages: <b>"
        resultItemStart="<dl><dt>"
        resultItemEnd="</dl>"
    >
    </search>
Explanation

Looks like HTML, doesn't it? It's not, but the syntax for a Sherlock search block is similar. Here is a brief explanation of the values you see (uppercase and lowercase do not matter).

A search block begins with a <SEARCH ...> tag containing a number of attributes, described in the following table, and ends with a </SEARCH> tag. A typical search block describing an Internet search site contains one or more INPUT tags and an INTERPRET tag. The SEARCH tag attributes describe the search site, how it is to be accessed, and where to find updates to the search plug-in file.
Attribute Name: Description

name: Name of the search plug-in.

method: Specifies what HTTP command to use for communications with the HTTP server. Currently, either get or post.

action: Specifies the full URL for the search server. Any relative links in the result list will be localized using this URL.

update: Optional attribute specifying where to find the most recent version of the search plug-in file. If provided, the Sherlock application will periodically check this URL for changes. If the file at this URL is more recent than the one currently installed, Sherlock will prompt the user to download the new file and automatically install it. Preferably, the file located at this address should be in BinHex format (but not otherwise compressed or encoded).

updateCheckDays: Optional attribute specifying the number of days between times when the update URL is checked for more recent versions of the search plug-in file. If this attribute is not present, the default value of 30 days is used.

description: Optional attribute containing text describing the search engine, its capabilities, and the content type of the search results. This text may be used for display in user interface facilities.

bannerImage: Optional attribute specifying a URL for an image that will be displayed in the details pane when any result from a query using this search plug-in is selected. Note: the banner properties of the ...

bannerLink: Optional attribute specifying a URL that will be loaded when the banner image is clicked. Note: the banner properties of the ...
Let's look at an example

To begin, we'll look at the HTML source for a page containing the Find button. Here's what we find on the Apple home page:

    <!-- FIND FEATURE -->
    <CENTER>
    <FORM METHOD="POST" ACTION="http://search.apple.com/cgi-bin/nph-apple_search.pl">
    <INPUT TYPE="hidden" NAME="qparser" VALUE="simple">
    <INPUT TYPE="hidden" NAME="POINTER" VALUE="FRONT">
    <STRONG>Find:</STRONG>
    <INPUT TYPE="text" NAME="query" SIZE=20>
    <INPUT TYPE="submit" NAME="buttonshort" VALUE="Shortcut">
    <INPUT TYPE="submit" NAME="buttonsearch" VALUE="Search"><BR>

This gives us the information we need to change the first section of the plug-in. We use the ACTION attribute of the form on the Apple home page as the action attribute of the plug-in. We'll skip the optional update and updateCheckDays lines, which tell Sherlock where and when to look for the latest version of the plug-in. Depending on what the search engine requires, the method line contains either get or post. In this case, the form on the Apple home page uses METHOD="POST", so we use post. Now we can modify the beginning of the file as follows:
    # © 1998 Apple Computer, Inc.
    <search
        name = "Apple"
        action = "http://search.apple.com/cgi-bin/nph-apple_search.pl"
        method = post
    >
The INPUT tags in the Apple home page form contain the information we need for the plug-in's input tag. After a bit of head-scratching and experimenting (you experiment by trying different input names with Sherlock until the search succeeds), we find that the one we want is the text field, <INPUT TYPE="text" NAME="query" SIZE=20>. Its NAME value becomes the name of the plug-in's input tag, and the user keyword marks it as the field that receives the text the user types. Now we can expand the plug-in to this:
    # © 1998 Apple Computer, Inc.
    <search
        name = "Apple"
        action = "http://search.apple.com/cgi-bin/nph-apple_search.pl"
        method = post
    >
    <input name = "query" user>
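To picture what these three values do, here is a minimal Python sketch of how a search block's action, method, and input names could map onto an ordinary HTTP form submission. It is an illustration only, not Sherlock's code: the build_request function and its parameter names are hypothetical, and the exact encoding and field order Sherlock uses are assumptions.

```python
from urllib.parse import urlencode


def build_request(action, method, user_field, search_terms, fixed_inputs=None):
    """Return (url, body) for one search; body is None for a get request.

    Hypothetical helper: it mimics a standard HTML form submission built
    from the plug-in's action, method, and input values. How Sherlock
    really encodes the request is an assumption.
    """
    # Fixed inputs are the <input name=... value=...> tags; the user field
    # is the <input name=... user> tag that carries the typed search terms.
    fields = dict(fixed_inputs or {})
    fields[user_field] = search_terms
    encoded = urlencode(fields)

    if method.lower() == "get":
        # get: the fields travel in the URL's query string.
        return action + "?" + encoded, None
    # post: the fields travel in the request body.
    return action, encoded.encode("ascii")


if __name__ == "__main__":
    url, body = build_request(
        "http://search.apple.com/cgi-bin/nph-apple_search.pl",
        "post",
        "query",
        "Java",
    )
    print(url)
    print(body)
```

With method = get, as in the AltaVista plug-in, the same fields would be appended to the action URL instead of being sent in the body.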
(Some search engines work differently. Read Technote 1141: Extending and Controlling Sherlock for more information. Also experiment, and look carefully at the HTML and at the URL the page generates when it sends a search query.)

To proceed, we need to look at the HTML source for a page of search results and identify unique text that marks the beginning of the list, the end of the list, and the beginning and end of each item. Search for "Java", then look at the source of the results page to find out how the results are formatted. Here is a fragment of that source, showing the beginning of the list and the beginning and end of each line containing a match:
    <H3>Page Matches</H3>
    <BLOCKQUOTE>Found <B>1599</B> pages with your term in 265665 Apple web pages.
    <P>Duplicates may have been removed...</BLOCKQUOTE>
    <dl><dt><tt> 1. </tt><IMG SRC="http://www.apple.com/find/.../FONT><p><p></dd>
    <dt><tt> 2. </tt><IMG SRC="http://www.apple.com/find/.../FONT><p><p></dd>
    <dt><tt> 3. </tt><IMG SRC="http://www.apple.com/find/.../FONT><p><p></dd>
    <dt><tt> 4. </tt><IMG SRC="http://www.apple.com/find/.../FONT><p><p></dd>
    <dt><tt> 5. </tt><IMG SRC="http://www.apple.com/find/.../FONT><p><p></dd>
    <dt><tt> 6. </tt><IMG SRC="http://www.apple.com/find/.../FONT><p><p></dd>
    <dt><tt> 7. </tt><IMG SRC="http://www.apple.com/find/.../FONT><p><p></dd>
    <dt><tt> 8. </tt><IMG SRC="http://www.apple.com/find/.../FONT><p><p></dd>
    <dt><tt> 9. </tt><IMG SRC="http://www.apple.com/find/.../FONT><p><p></dd>
    <dt><tt> 10. </tt><IMG SRC="http://www.apple.com/find/.../FONT><p><p></dd>
    </dl>...
    <CENTER>...

<H3> marks the beginning of the search results list and </dl> marks the end. Each result line begins with <tt> and ends with </dd>.
The final result

So here's the final plug-in code:

    # © 1998 Apple Computer, Inc.
    <search
        name = "Apple"
        action = "http://search.apple.com/cgi-bin/nph-apple_search.pl"
        method = post
    >
    <input name = "query" user>
    <interpret
        resultListStart = "<H3>"
        resultListEnd = "</dl>"
        resultItemStart = "<tt>"
        resultItemEnd = "</dd>"
    >
    </search>
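To see why these four markers are enough, here is a minimal Python sketch of one way the interpret values could be applied to a results page: first cut out the region between resultListStart and resultListEnd, then collect the text between each resultItemStart and resultItemEnd inside it. This is an assumption about the general technique, not Sherlock's actual parser. The extract_items function is hypothetical, the sample page is a shortened stand-in for the real results, and matching is case-sensitive here for simplicity.

```python
def extract_items(page, list_start, list_end, item_start, item_end):
    """Return the text found between each item_start/item_end pair that lies
    inside the region bounded by list_start and list_end.

    Hypothetical helper: a plain substring scan, case-sensitive, with the
    markers themselves left out of the returned items.
    """
    # Step 1: isolate the result list.
    start = page.find(list_start)
    if start == -1:
        return []
    start += len(list_start)
    end = page.find(list_end, start)
    if end == -1:
        end = len(page)
    region = page[start:end]

    # Step 2: walk the list and pull out each item.
    items = []
    pos = 0
    while True:
        begin = region.find(item_start, pos)
        if begin == -1:
            break
        finish = region.find(item_end, begin + len(item_start))
        if finish == -1:
            break
        items.append(region[begin + len(item_start):finish])
        pos = finish + len(item_end)
    return items


# Shortened, made-up results page in the same shape as the fragment above.
SAMPLE_PAGE = (
    "<H3>Page Matches</H3><BLOCKQUOTE>Found <B>2</B> pages</BLOCKQUOTE>"
    "<dl><dt><tt> 1. </tt>first hit<p><p></dd>"
    "<dt><tt> 2. </tt>second hit<p><p></dd></dl><CENTER>"
)

if __name__ == "__main__":
    for item in extract_items(SAMPLE_PAGE, "<H3>", "</dl>", "<tt>", "</dd>"):
        print(item)
```

The sketch also shows why the markers must be unique: if the text chosen for resultListStart appeared earlier in the page, the scan would begin in the wrong place and pick up unrelated items.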