17 April 2014

The Ruby Reflector

Topic

Lucene

By Seth Falcon of Chef Blog 2 years ago.

The search outage was the result of Lucene doing a merge of its index that took minutes. Analysis of our Lucene index revealed that we had well over 900,000 fields in our index. If you look hard, you can find benchmarks on Lucene using 30 fields. Our system was using Lucene wrong.

In addition to mapping each flattened key into a field, the indexer was "expanding" all compound keys by inserting an 'X' to allow a kind of wild card searching. For example, the JSON: {"a": …

opscode.com
By Peter Cooper of Ruby Inside over 2 years ago.

…features, communicating by JSON over RESTful HTTP, based on Lucene, written in Java.

Queue Classic: A Powerful PostgreSQL-Backed Queueing Library

Queue Classic is a PostgreSQL-backed queueing library built by Ryan Smith of Heroku that focuses on concurrent job locking, minimizing database load and providing a simple, intuitive user experience.

Ruby Jobs of the Week

Sr (Agile) Software Engineer at Apple [Cupertino, CA]

Apple is seeking the …

rubyinside.com
By Todd Hoff of High Scalability almost 3 years ago.

…to over 2B requests a day, and a data warehouse of over 20TB that is used to drive email campaigns, SEM, and general reporting. We are a Linux/Java/Apache/Tomcat/Postgres/Lucene shop, and have built our own distributed computing architecture. We also maintain duplicate data centers (one active, one standby) for redundancy and maintenance purposes.

Too bad, it sounds like it would have been a good article.

highscalability.com
On Scout ~ The Blog 3 years ago.

CouchDB. We also use the Scout CouchDB Database and CouchDB Lucene monitoring plugins to keep an eye on the size of our databases and Lucene indexes. Our messaging database, which holds information about every SMS message that we have ever sent, is a whopping 130GB. The Lucene index alone for this database is over 13GB. Being able to see the sizes of these data stores, and their growth over time, helps us plan for the future.

Performance-wise, are there any high-level metrics …

scoutapp.com
By Maurício Linhares of Codeshooter's Weblog over 3 years ago.

…Query Parser Syntax (Solr is, more or less, a web interface to Lucene) you'll see that you can use the "*" operator to perform a partial match. We could then search for "battle*" and this would yield the results we expect, but this kind of partial matching is slow and could become a bottleneck for your application, so we have to figure out another way to do this.

When all you need is prefixed partial matching, the solr.EdgeNGramFilterFactory is …
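
To make the idea behind edge n-grams concrete, here is a minimal sketch in plain Python. It is not Solr's implementation and does not call any Solr API; the function names (`edge_ngrams`, `build_index`, `prefix_search`) and the sample titles are made up for illustration, and the `min_gram`/`max_gram` parameters only loosely mirror the filter's gram-size settings. The point is that every leading prefix of a token is stored at index time, so a prefix query becomes an exact lookup instead of a wildcard scan.

```python
def edge_ngrams(token, min_gram=1, max_gram=15):
    """Return the leading-edge n-grams (prefixes) of a token, shortest first."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


def build_index(titles, min_gram=1, max_gram=15):
    """Map every prefix of every lowercased token to the ids of matching titles."""
    index = {}
    for doc_id, title in enumerate(titles):
        for token in title.lower().split():
            for gram in edge_ngrams(token, min_gram, max_gram):
                index.setdefault(gram, set()).add(doc_id)
    return index


def prefix_search(index, prefix):
    """Exact dictionary lookup of the prefix; no wildcard expansion at query time."""
    return sorted(index.get(prefix.lower(), set()))


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the original article.
    titles = ["Battlefield 2", "Battlestar Galactica", "Bat Country"]
    index = build_index(titles)
    print(prefix_search(index, "battle"))
    print(prefix_search(index, "bat"))
```

The trade-off in this sketch is a larger index (one entry per prefix) in exchange for cheap lookups, which is why capping the maximum gram size matters.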

codeshooter.wordpress.com
By Jonathan Ellis of Spyced over 3 years ago.

…also saw Lucandra, which implements a Cassandra back end for Lucene and is used in several high volume production sites, grow up into Solandra, embedding Solr and Cassandra in the same JVM for even more performance.

Community

Cassandra hit its stride in 2010, starting with graduation from the ASF incubator in April. 2010 saw 1025 tickets resolved, nearly twice as many compared to 2009 (565).

Like many Apache projects, Cassandra…

spyced.blogspot.com
By Ilya Grigorik of igvita.com over 3 years ago.

…or use Java clients to access your Lucene indexes. Solr and Lucene began as independent projects, but just this past year both teams decided to merge their efforts, which is great news for both communities. If you haven't already, definitely take Solr for a spin.

Real-time Search with Lucene

Real-time search was a big theme at Lucene Revolution. Unlike many other IR toolkits, Lucene has always supported incremental index updates, but unfortunately …

igvita.com
On paperplanes over 3 years ago.

The awesome dudes at Basho released Riak 0.13 and with it their first version of Riak Search yesterday. This is all kinds of exciting, and I'll tell you why. Riak Search is (why down below) based on Lucene, both the library and the query interface. It mimics the Solr web API for querying and indexing. Just as you'd expect from something coming out of Basho, you can add and remove nodes at any time, scaling up and down as you go. I've seen an introduction …

paperplanes.de
By Brian Doll of New Relic over 3 years ago.

…search the wiki, the source code, javadocs, mailing lists and website content. Of course my books Lucene in Action and Solr in Action are a great place to start, too!

What's on the horizon for Lucene/Solr in the near future?

Lucene is getting improved in so many areas, including at its very core, it's hard to keep up. Lucene has always been very fast, but amazingly its developers are still finding ways to make it go faster. Solr clearly benefits from that directly. …

newrelic.com
By Todd Hoff of High Scalability over 3 years ago.

Hadoop, RabbitMQ, Zookeeper, Thrift, HDFS and Lucene. We're rewriting Digg from the ground up and we need amazing developers to join our world-class team. If you think you are up for the challenge, or you know someone who might be, take a look at our jobs page for more information.

CloudSigma

Instantly Scalable European Cloud Servers. Create virtual servers in the cloud that are fully scalable and adaptive. Control your servers via our web console or API. CloudSigma…

highscalability.com