Oracle® Secure Enterprise Search Administrator's Guide 10g Release 1 (10.1.6) Part Number B19002-02
Oracle Secure Enterprise Search (SES) provides uniform search capabilities over multiple repositories.
Oracle SES uses a crawler to collect data from these repositories, which are called sources. The crawler supports a number of built-in source types, as well as a published plug-in architecture for adding new types. Multiple Oracle SES instances can also share content through the federated source type.
Oracle SES supports the following built-in source types:
Web: A Web source represents the content on a specific Web site. Web sources facilitate maintenance crawling of specific Web sites.
Table: A table source represents content in an Oracle database table or view.
File: A file source is the set of documents that can be accessed through the file protocol.
E-mail: An e-mail source derives its content from e-mails sent to a specific e-mail address. When Oracle SES crawls an e-mail source, it collects e-mail from all folders set up in the e-mail account, including Drafts, Sent Items, and Trash e-mails.
Mailing list: A mailing list source derives its content from e-mails sent to a specific mailing list.
OracleAS Portal: An OracleAS Portal source allows users to search across multiple OracleAS Portal repositories, such as Web pages, files on disk, and pages on other OracleAS Portal instances.
Federated: A federated source is a repository that maintains its own index. Oracle SES can issue a search, and the repository can return results.
User-defined: You can implement a crawler plug-in to crawl and index a proprietary document repository, such as Lotus Notes or Documentum.
[Diagram: Oracle SES architecture]
Oracle SES includes the following components:
The Oracle SES crawler is a Java process activated by a set schedule. When activated, the crawler spawns a configurable number of processor threads that fetch information from various sources and index the documents. This index is used for searching sources.
The crawler maps links and analyzes relationships. Whenever the crawler encounters an embedded, non-HTML, or non-textual document while crawling, it automatically detects the document type, then filters and indexes the document.
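The crawler behavior described above (a configurable number of threads that fetch documents and add them to a shared index) can be sketched as follows. This is a minimal illustration, not the Oracle SES implementation: the actual crawler is a Java process, and the in-memory `DOCUMENTS` repository, the `fetch` and `crawl` functions, and the simple word-level index are all assumptions made for the example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
import threading

# Hypothetical in-memory repository standing in for a real source: URL -> text.
DOCUMENTS = {
    "http://example.com/a": "enterprise search crawler",
    "http://example.com/b": "secure search index",
    "http://example.com/c": "crawler threads fetch documents",
}


def fetch(url):
    """Stand-in for a network or file fetch; returns the document text."""
    return DOCUMENTS[url]


def crawl(urls, num_threads=2):
    """Fetch every URL with a pool of worker threads and build an inverted index."""
    index = defaultdict(set)  # term -> set of URLs that contain the term
    lock = threading.Lock()   # serializes updates to the shared index

    def fetch_and_index(url):
        text = fetch(url)
        with lock:
            for term in text.lower().split():
                index[term].add(url)

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # list() waits for all workers and re-raises any worker exception.
        list(pool.map(fetch_and_index, urls))
    return dict(index)


index = crawl(sorted(DOCUMENTS), num_threads=2)
print(sorted(index["crawler"]))
```

The `num_threads` argument plays the role of the configurable thread count; the lock keeps concurrent workers from corrupting the shared index.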
Use the Oracle Secure Enterprise Search administration tool to manage and monitor Oracle SES components. For example:
Define sources and crawling scope
Configure the query application
Monitor crawl progress and query performance
Oracle Secure Enterprise Search provides several APIs. For example, the Crawler Plug-in API enables you to create your own secure crawler plug-in to meet your requirements. With the Web Services API, you can customize your search interface.
Oracle SES also provides an out-of-the-box query application.
Information in an enterprise can be spread across Web pages, databases, mail servers or other collaboration software, document repositories, file servers, and desktops. Oracle SES searches all your data through the same interface. Oracle SES is fully globalized and works with 27 languages, including Chinese, Japanese, Korean, Arabic, and Hebrew.
This section introduces a few of the features in Oracle SES.
See Also: Chapter 3, "Understanding Crawling and Searching" for more features relating to the crawler.
Much of the information within an organization is publicly accessible: anyone is allowed to view it. It is therefore relatively easy for a crawler to find and index that information.
Other sources, however, are protected and may be viewable only by certain users or groups of users. For example, while users can search within their own e-mail folders, they should not be able to search anyone else's e-mail.
For protected sources, the Oracle SES crawler indexes each document together with its access control list (ACL). When end users perform a search, Oracle SES returns only the documents that they have privileges to view.
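The idea of storing an access control list with each indexed document and filtering results at query time can be sketched as follows. This is an illustrative model only: the `IndexedDoc` class, the `user:` and `group:` principal prefixes, and the `secure_search` function are assumptions for the example and do not reflect how Oracle SES stores or evaluates ACLs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedDoc:
    title: str
    text: str
    # Principals allowed to view the document; an empty set means public.
    acl: frozenset = frozenset()


def can_view(doc, user, groups=()):
    """Return True if the document is public or the user matches its ACL."""
    if not doc.acl:
        return True
    principals = {f"user:{user}"} | {f"group:{g}" for g in groups}
    return bool(doc.acl & principals)


def secure_search(index, term, user, groups=()):
    """Return titles of matching documents that the user is allowed to view."""
    term = term.lower()
    return [
        doc.title
        for doc in index
        if term in doc.text.lower().split() and can_view(doc, user, groups)
    ]


INDEX = [
    IndexedDoc("Holiday schedule", "company holiday schedule"),
    IndexedDoc("Alice inbox: budget", "budget schedule draft", frozenset({"user:alice"})),
    IndexedDoc("Finance report", "quarterly budget report", frozenset({"group:finance"})),
]

print(secure_search(INDEX, "budget", "alice"))
print(secure_search(INDEX, "budget", "bob", groups=["finance"]))
print(secure_search(INDEX, "schedule", "bob"))
```

Each caller sees only the subset of matches permitted by the ACLs: the same query for "budget" returns different results for alice and for a member of the finance group.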
Oracle Secure Enterprise Search can search multiple Oracle SES applications, each with its own document repositories and indexes. It provides a unified query framework for searching document repositories that are crawled, indexed, and maintained separately. Federated search runs a single query across all of these indexes and aggregates the search results into one result list for the user. User credentials are passed along with the query so that each remote application can authenticate the user against its own document repository.
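The federated flow described above (one query sent to several separately maintained indexes, credentials passed along, results merged into one list) can be sketched as follows. This is a simplified model under stated assumptions: the `SearchInstance` class, its per-instance list of authorized users, the term-count score, and the merge order are all hypothetical and are not the Oracle SES federation protocol or ranking method.

```python
class SearchInstance:
    """Hypothetical stand-in for one search application with its own index."""

    def __init__(self, name, docs, authorized_users):
        self.name = name
        self.docs = docs  # title -> document text
        self.authorized_users = set(authorized_users)

    def search(self, query, user):
        """Authenticate the user locally, then return (score, instance, title) hits."""
        if user not in self.authorized_users:
            return []
        term = query.lower()
        hits = []
        for title, text in self.docs.items():
            score = text.lower().split().count(term)  # naive term-frequency score
            if score:
                hits.append((score, self.name, title))
        return hits


def federated_search(instances, query, user):
    """Send one query (with the user's identity) to every instance and merge the hits."""
    merged = []
    for instance in instances:
        merged.extend(instance.search(query, user))
    # Highest score first; ties broken by instance name, then title.
    merged.sort(key=lambda hit: (-hit[0], hit[1], hit[2]))
    return merged


hr = SearchInstance(
    "hr",
    {"Leave policy": "leave policy leave form", "Payroll": "payroll calendar"},
    authorized_users=["alice", "bob"],
)
eng = SearchInstance("eng", {"Build guide": "build policy"}, authorized_users=["alice"])

print(federated_search([hr, eng], "policy", "alice"))
print(federated_search([hr, eng], "policy", "bob"))
```

Because each instance checks the user against its own repository, bob receives results only from the instance that recognizes him, while alice sees a merged list from both.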
[Diagram: Oracle SES federation architecture]
Oracle SES offers a Web services API that lets you build a custom query application.
Oracle SES provides an extensible crawler plug-in framework that lets you crawl and index proprietary document repositories.
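The general shape of a plug-in framework can be sketched as follows: the crawler depends only on a small contract, and each proprietary repository supplies its own implementation. This is a conceptual illustration, not the Oracle SES Crawler Plug-in API. The `SourcePlugin` contract, the `NotesArchivePlugin` class, the `notes://` identifiers, and the `crawl_with_plugin` function are all hypothetical names invented for the example.

```python
from abc import ABC, abstractmethod


class SourcePlugin(ABC):
    """Hypothetical plug-in contract; NOT the actual Oracle SES interface."""

    @abstractmethod
    def list_documents(self):
        """Yield (doc_id, text) pairs for every document in the repository."""


class NotesArchivePlugin(SourcePlugin):
    """Example plug-in for an imaginary proprietary note store."""

    def __init__(self, notes):
        self.notes = notes  # note id -> note body

    def list_documents(self):
        for note_id, body in self.notes.items():
            yield f"notes://{note_id}", body


def crawl_with_plugin(plugin):
    """Index whatever the plug-in exposes, without knowing the repository type."""
    index = {}
    for doc_id, text in plugin.list_documents():
        for term in sorted(set(text.lower().split())):
            index.setdefault(term, []).append(doc_id)
    return index


plugin = NotesArchivePlugin({"n1": "Quarterly plan", "n2": "Plan review"})
index = crawl_with_plugin(plugin)
print(index["plan"])
```

Supporting another repository type in this model means writing one more `SourcePlugin` subclass; the indexing code stays unchanged.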