Thursday, September 19, 2013

YCSB on HBase

* This post covers YCSB on HBase 0.94.11 and Hadoop 1.2.1. For YCSB on HBase 0.96 and Hadoop 2.2, please go to this post.

YCSB (Yahoo! Cloud Serving Benchmark) is a benchmark tool with a common set of workloads for evaluating the performance of different “key-value” and “cloud” serving stores. HBase is one of the targets that can be benchmarked using YCSB.

The first step in using this benchmark is to download the source from the YCSB git repository:
git clone http://github.com/brianfrankcooper/YCSB.git

Although the site says you can download a prebuilt binary, that binary will not work when your HBase server version differs from the HBase client version bundled in it. You will most likely get an error like the one below:

java.lang.IllegalArgumentException: Not a host:port pair: 

Once cloning finishes, cd into the newly created YCSB directory and edit the following files using your favorite editor.

-YCSB/hbase/pom.xml
Set the HBase and Hadoop versions to the ones you have in your environment. In my case, HBase is 0.94.11 and Hadoop is 1.2.1.


-YCSB/pom.xml
Change the slf4j version to 1.4.3.


-YCSB/elasticsearch/pom.xml
Change the slf4j version to 1.4.3.


The changes to the last two pom.xml files make sure HBase and YCSB use the same version of slf4j. Without them, you might hit the problem shown below when running ycsb.

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoopuser/ycsb-0.1.4/hbase-binding/lib/hbase-binding-0.1.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoopuser/ycsb-0.1.4/hbase-binding/lib/slf4j-log4j12-1.4.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: slf4j-api 1.6.x (or later) is incompatible with this binding.
SLF4J: Your binding is version 1.5.5 or earlier.
SLF4J: Upgrade your binding to version 1.6.x. or 2.0.x
Exception in thread "Thread-1" java.lang.NoSuchMethodError: org.slf4j.impl.StaticLoggerBinder.getSingleton()Lorg/slf4j/impl/StaticLoggerBinder;
        at org.slf4j.LoggerFactory.bind(LoggerFactory.java:128)
        at org.slf4j.LoggerFactory.performInitialization(LoggerFactory.java:108)
        at org.slf4j.LoggerFactory.getILoggerFactory(LoggerFactory.java:279)
        at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:252)
        at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:265)
        at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:94)
        at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.<init>(RecoverableZooKeeper.java:98)
        at org.apache.hadoop.hbase.zookeeper.ZKUtil.connect(ZKUtil.java:127)
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:153)
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:127)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:1507)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.ensureZookeeperTrackers(HConnectionManager.java:716)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:986)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:961)
        at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:227)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:170)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:129)
        at com.yahoo.ycsb.db.HBaseClient.getHTable(HBaseClient.java:118)
        at com.yahoo.ycsb.db.HBaseClient.update(HBaseClient.java:302)
        at com.yahoo.ycsb.db.HBaseClient.insert(HBaseClient.java:357)
        at com.yahoo.ycsb.DBWrapper.insert(DBWrapper.java:148)
        at com.yahoo.ycsb.workloads.CoreWorkload.doInsert(CoreWorkload.java:461)
        at com.yahoo.ycsb.ClientThread.run(Client.java:269)
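
Before rebuilding, it can help to confirm which slf4j jars actually ended up next to the HBase binding. The following is a minimal Python sketch, not part of YCSB: the helper names slf4j_versions and has_version_mismatch are hypothetical, and it assumes the jars follow the usual artifact-version.jar naming seen in the log above (for example slf4j-log4j12-1.4.3.jar). It only looks at file names, so it will not notice slf4j classes packed inside another jar.

```python
import re
from pathlib import Path

# Matches jar names such as slf4j-api-1.6.1.jar or slf4j-log4j12-1.4.3.jar.
# The naming pattern is an assumption based on the jar names in the log above.
SLF4J_JAR = re.compile(r"^(slf4j-[a-z0-9]+)-(\d+(?:\.\d+)*)\.jar$")


def slf4j_versions(lib_dir):
    """Return (artifact, version) pairs for every slf4j jar in lib_dir."""
    found = []
    for jar in sorted(Path(lib_dir).glob("slf4j-*.jar")):
        match = SLF4J_JAR.match(jar.name)
        if match:
            found.append((match.group(1), match.group(2)))
    return found


def has_version_mismatch(versions):
    """True when the slf4j jars found do not all share one version."""
    return len({version for _, version in versions}) > 1
```

Pointing slf4j_versions at a directory such as ycsb-0.1.4/hbase-binding/lib lists the slf4j artifacts found there, and has_version_mismatch flags the case where, say, slf4j-api and slf4j-log4j12 carry different versions.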

From the YCSB directory, run mvn clean package to build the package. Output like the following means the build succeeded.

[INFO] YCSB Root ......................................... SUCCESS [40.653s]
[INFO] Core YCSB ......................................... SUCCESS [46.852s]
[INFO] Cassandra DB Binding .............................. SUCCESS [44.413s]
[INFO] HBase DB Binding .................................. SUCCESS [1:49.114s]
[INFO] Hypertable DB Binding ............................. SUCCESS [45.091s]
[INFO] DynamoDB DB Binding ............................... SUCCESS [38.011s]
[INFO] ElasticSearch Binding ............................. SUCCESS [3:22.121s]
[INFO] Infinispan DB Binding ............................. SUCCESS [2:43.266s]
[INFO] JDBC DB Binding ................................... SUCCESS [13.182s]
[INFO] Mapkeeper DB Binding .............................. SUCCESS [8.313s]
[INFO] Mongo DB Binding .................................. SUCCESS [5.941s]
[INFO] OrientDB Binding .................................. SUCCESS [15.621s]
[INFO] Redis DB Binding .................................. SUCCESS [4.171s]
[INFO] Voldemort DB Binding .............................. SUCCESS [14.630s]
[INFO] YCSB Release Distribution Builder ................. SUCCESS [13.381s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 12:45.433s
[INFO] Finished at: Thu Sep 12 18:25:37 SGT 2013
[INFO] Final Memory: 65M/165M
[INFO] ------------------------------------------------------------------------


You should find the ycsb-0.1.4.tar.gz file inside the YCSB/distribution/target directory. Copy this file to a directory where you have write permission and untar it. Then copy the hbase-site.xml file from your HBase conf directory to your ycsb-0.1.4/hbase-binding/conf/ directory.

Before you can run the test, you need to start HDFS (start-dfs.sh) and HBase (start-hbase.sh). Then open the HBase shell and create a table called usertable with a column family called family (create 'usertable', 'family').


After creating the table and the column family, you can start loading data into your database:

$ ~/ycsb-0.1.4/bin/ycsb load hbase -P ~/ycsb-0.1.4/workloads/workloada -p columnfamily=family -p recordcount=10000 -p threadcount=4 -s | tee -a workloada_load.dat

Start running the benchmark with the command below:

$ ~/ycsb-0.1.4/bin/ycsb run hbase -P ~/ycsb-0.1.4/workloads/workloada -p columnfamily=family -p operationcount=10000 -p recordcount=10000 -p threadcount=4 -s | tee -a workloada_run.dat
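
The load and run commands share almost all of their arguments, so it is easy for them to drift apart. Here is a minimal Python sketch, assuming a hypothetical helper named ycsb_args (not part of YCSB) and paths relative to the ycsb-0.1.4 directory, that builds both argument lists from the same values. It only uses the flags already shown in the commands above.

```python
def ycsb_args(phase, workload, recordcount, operationcount=None,
              columnfamily="family", threadcount=4):
    """Build the argument list for a ycsb load or run against HBase.

    Hypothetical helper: it only assembles the flags used in this post
    (-P, -p, -s) and does not execute anything.
    """
    if phase not in ("load", "run"):
        raise ValueError("phase must be 'load' or 'run'")
    args = [
        "bin/ycsb", phase, "hbase",
        "-P", workload,
        "-p", f"columnfamily={columnfamily}",
        "-p", f"recordcount={recordcount}",
        "-p", f"threadcount={threadcount}",
    ]
    # operationcount only matters for the run phase.
    if phase == "run" and operationcount is not None:
        args += ["-p", f"operationcount={operationcount}"]
    args.append("-s")  # same status flag as in the commands above
    return args


RECORDS = 10000  # one value shared by both phases
load_cmd = ycsb_args("load", "workloads/workloada", RECORDS)
run_cmd = ycsb_args("run", "workloads/workloada", RECORDS, operationcount=10000)
```

Either list could then be passed to subprocess.run with cwd set to the unpacked ycsb-0.1.4 directory.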

The steps above are a very simple validation using workloada, with only 10000 records loaded into the database and 10000 operations during the run. Note that for the run phase (especially for read and update tests), you also need to specify recordcount to match your test database size. If you omit it, YCSB uses the default value from the workload file (1000), so the test executes its 10000 operations over and over on the same 1000 records, and the remaining 9000 records are never accessed.
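
The effect of a mismatched recordcount is simple arithmetic. As a rough Python sketch, assuming the run phase only picks keys from the first recordcount records (the function name reachable_fraction is made up for illustration):

```python
def reachable_fraction(loaded_records, run_recordcount):
    """Fraction of the loaded records that the run phase can ever touch.

    Assumes the run phase draws keys only from the first run_recordcount
    records, as described in the paragraph above.
    """
    if loaded_records <= 0 or run_recordcount <= 0:
        raise ValueError("record counts must be positive")
    return min(loaded_records, run_recordcount) / loaded_records


# 10000 records loaded, run left at the workload file default of 1000.
print(reachable_fraction(10000, 1000))   # 0.1
# 10000 records loaded, run given -p recordcount=10000.
print(reachable_fraction(10000, 10000))  # 1.0
```

With the default of 1000, only a tenth of the 10000 loaded records can ever be read or updated, which matches the 9000 untouched records mentioned above.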

For more details on the available workloads, refer to the official git site.

Wednesday, September 11, 2013

Bad Table Rendering When Converting Word Document to PDF

It is always frustrating to see something that is formatted nicely in your Word document get messed up when converted to PDF. One such problem is table formatting. The example below shows the same table in Word and in the PDF.
The table displays nicely in Office Word

What a mess after converting to PDF

It turns out this is caused by the cell margins. Open your table properties and go to the Cell tab as shown below. Click on the Options... button.


This will bring up the Cell Options window as shown below. Notice that the top and bottom margins are not zero. Change them to zero and click OK.


Convert your document to PDF again and you will see that the table renders correctly now.

The only problem now is that the cell margins are gone. To solve this, use the Line Spacing Options as shown below to create the margin you like.



Finally, you have the table format you want in your PDF file.