I needed to run a diagnostic test to see if I could use Elastic Search on the products that I’m working on currently, so I’m putting up this brief resource on my experience with Elastic Search for the purposes of testing. The scripts that I used are available at the end of this article for download.
Host System Mac Running Elastic Search
Guest System Virtual Box networked via Bridged Networking
Test Data Set & Elastic Search Configuration:-
One Million Indexes Written using two concurrent writers. One Writing index (zero to five hundred thousand). Second Writing index(five hundred thousand to one million)
Indexes set with a duplicate limit 1 i.e. feed each other with your data, multimaster configuration
Tool written in Python to Insert Mock Data Containing Searchable information ID, Numeral Text etc.
- Test Tool written as a wrapper over curl jusing JSON api. Not intended for blazing fast speed. Things can be faster via native library and even more so with native Java protocol, but explicit is alway better than implicit and premature optimization is the root of all evil.
- Version of Elastic Search Used is 0.17.8
- Basic Replicating Scenario. Stats Attached in Excel File. I would love to run anyone through this.
Everything went Smooth. No Issues Encountered. Please See Attached xlsx file for results
- Virtual Machine Powered off and restarted and relatched into Cluster
Everything resynced fine and very fast. No issues Encountered.
Approx 50 ms querying times on string search exact with concurrent index being manipulated by two concurrent threads writing to index.
Approx 150 ms querying times on search containing substring and 3 OR’s being manipulated by two concurrent threads writing to index.
Highlights SOLR’s deficiency when it comes to realtime writes and reads decreasing performance of single Index.
Useful Video that highlights Elastic Search Internals.
Test Tool Command Line Options:
Usage: writer.py [options] Options: -h, --help show this help message and exit -o HOST, --host=HOST ip/hostname of es instance e.g. stor.mystormachine.com, 192.168.0.3, localhost, etc -p PORT, --port=PORT port of stor instance use 9200 if unsure -n NAME, --name=NAME specifies namespace e.g. test -l LOWER, --lower=LOWER specifies lower number of range -u UPPER, --upper=UPPER specifies upper number of range -t TEST, --test=TEST true means to run command, ignore means ignore
Links to Scripts / Utilities from this post
P.S. I’m copying this from my own wiki, so if something is formatted to look overly dramatic, please forgive me 🙂