Should a site use Apache Solr for its search?
A question that regularly comes up from our clients is: is it worth changing our site to use Apache Solr for the site's search, or should we stick with the site's built-in search?
With a monolithic search system, the search subsystem provides the search pages, indexing system, configuration, etc. To use a different search system, like Apache Solr or Elasticsearch, people built monolithic modules that had their own UI, their own indexing logic, their own configuration, etc. This was common practice with Drupal 5 and 6 and resulted in one module for working with Apache Solr, one for working with Elasticsearch, and so on, each with different functionality, stability, and quality. It also led to maintenance problems: the duplicated functionality had to be maintained separately, so bug fixes or UX improvements to one module didn't make their way to the others.
In Drupal 7 and, to a larger extent, Drupal 8, the community has combined efforts around a modular system in the Search API module suite. Instead of one system for the results display, the indexing, the querying, etc., it is broken out into separate components that can be replaced. For example, if a site doesn't need facets then it doesn't need to install that piece, and if it needs to work with Elasticsearch rather than Apache Solr, or even Xapian or Sarnia or Sphinx, then it just installs that server integration piece.
We standardized on Search API for all search functionality on Drupal 7 and 8 sites. With this approach, the search "engine" is demoted to being a data storage system, i.e. a search index storage server. This means that the search results theming, the search page management, the facets, and most of the configuration are not dependent upon using Apache Solr; most of the same functionality can be achieved using the database as the index storage.
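To illustrate the idea of the engine as a swappable index storage server, here is a minimal Python sketch. It is not Drupal or Search API code: `SearchBackend`, `DatabaseBackend`, `RemoteIndexBackend`, and `search_page` are hypothetical names invented for illustration, and both backends are in-memory stand-ins. The only point is that the results page depends on an interface rather than on a particular engine.

```python
from __future__ import annotations

from typing import Protocol


class SearchBackend(Protocol):
    """Hypothetical interface: anything that can store and query documents."""

    def index(self, doc_id: int, text: str) -> None: ...

    def search(self, keyword: str) -> list[int]: ...


class DatabaseBackend:
    """Stand-in for database index storage: keeps rows and checks each one."""

    def __init__(self) -> None:
        self.rows: dict[int, str] = {}

    def index(self, doc_id: int, text: str) -> None:
        self.rows[doc_id] = text.lower()

    def search(self, keyword: str) -> list[int]:
        word = keyword.lower()
        return sorted(i for i, text in self.rows.items() if word in text.split())


class RemoteIndexBackend:
    """Stand-in for an external index server; here just a term-to-IDs map."""

    def __init__(self) -> None:
        self.terms: dict[str, set[int]] = {}

    def index(self, doc_id: int, text: str) -> None:
        for term in text.lower().split():
            self.terms.setdefault(term, set()).add(doc_id)

    def search(self, keyword: str) -> list[int]:
        return sorted(self.terms.get(keyword.lower(), set()))


def search_page(backend: SearchBackend, keyword: str) -> str:
    """Results display: identical no matter which backend is plugged in."""
    hits = backend.search(keyword)
    return f"{len(hits)} result(s) for '{keyword}': {hits}"
```

Swapping `DatabaseBackend()` for `RemoteIndexBackend()` changes where the index lives, but `search_page` and everything built on it stay the same.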
The issue of whether or not to use Apache Solr (or Elasticsearch, or...) as the search index server boils down to a few aspects:
Apache Solr is designed around fast indexing and retrieval of data, whereas the normal database index is not optimized for such queries. Anecdotally, one Drupal 7 facet-based search site with approximately 150,000 nodes went from ~30 second page load times for search to near-instantaneous responses from a third-party hosted Apache Solr service.
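One reason a dedicated engine can answer keyword queries quickly is the inverted index used by Lucene, the library Solr is built on. The toy Python sketch below contrasts checking every document with a single inverted-index lookup. It is an illustration only, not how Solr or Drupal's database backend is actually implemented, and the sample data and function names are made up.

```python
from collections import defaultdict

# Made-up sample documents, keyed by ID.
docs = {
    1: "apache solr search",
    2: "drupal database search",
    3: "site backups",
}


def scan_search(docs, term):
    """Check every document: the work grows with the number of documents."""
    examined = 0
    hits = []
    for doc_id, text in docs.items():
        examined += 1
        if term in text.split():
            hits.append(doc_id)
    return hits, examined


def build_inverted_index(docs):
    """Map each term to the IDs of the documents containing it (built once)."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        for term in set(text.split()):
            index[term].append(doc_id)
    return index


def index_search(index, term):
    """One dictionary lookup, regardless of how many documents exist."""
    return index.get(term, [])
```

The scan touches all three documents to answer a query for `search`, while the inverted index pays that cost once at indexing time and then answers with a single lookup.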
- Separation of concerns
In addition to Apache Solr itself being faster than a regular database for querying data, it can also be separated onto a different server. This means that any performance problems that might occur with that server will not negatively affect other parts of the site.
Search indexes can become very large in comparison to the size of the source data. When this data is in the main database it greatly expands the size of the site's regular database backups, making backing up the site more cumbersome. As a result, moving the search index to a different server means that the database backups will be noticeably smaller and more efficient.
Today a good many web hosting providers make Apache Solr, or another indexing system, available as a standard feature, so our clients using Pantheon, Acquia, etc. can get set up with very little work. Where this isn't available, it is easy to hook up to a hosted search solution, like Hosted Apache Solr, OpenSolr, Elasticsearch, etc., and these can work really well. Lastly, people who have the server chops can always install it on their own server(s) directly, though that adds a bit more maintenance to deal with.
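Part of why a hosted service is easy to hook up is that Solr is queried over plain HTTP. On a Drupal site the server integration module handles this, so the Python sketch below is only meant to show the shape of such a request. The host `solr.example.com` and the core name `mysite` are placeholders, `build_solr_select_url` is a hypothetical helper, and real providers typically add their own authentication on top.

```python
from urllib.parse import urlencode


def build_solr_select_url(base_url: str, core: str, keywords: str, rows: int = 10) -> str:
    """Build a URL for Solr's standard /select query handler.

    base_url and core are placeholders; substitute the values your
    hosting provider gives you.
    """
    params = urlencode({"q": keywords, "rows": rows, "wt": "json"})
    return f"{base_url.rstrip('/')}/solr/{core}/select?{params}"


# Example with placeholder values (no request is actually sent here).
url = build_solr_select_url("https://solr.example.com", "mysite", "apache solr", rows=5)
```

Fetching that URL with any HTTP client would return the matching documents as JSON, assuming the core exists and the request is authorized.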
So yes, it is well worth using Apache Solr or another search index software, especially if a site's hosting provider has it available.