The first step in using this benchmark is to download the source from the YCSB Git repository:
git clone http://github.com/brianfrankcooper/YCSB.git
Once the clone is done, you will see a folder called YCSB in your current path. cd into the newly created YCSB directory and edit the following files using your favorite editor.
-YCSB/hbase/pom.xml
Edit the corresponding lines as shown below to reflect the changes.
For HBase, you need to change the artifactId from hbase to hbase-client and the version to 0.96.0-hadoop2.
Hadoop 2.2 no longer has hadoop-core, so for YCSB to work, change the artifactId to hadoop-common and the version to 2.2.0.
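A quick way to confirm that both edits landed is to grep the pom for the new coordinates. This is only a minimal sketch and not part of YCSB: the helper name pom_has_dep is my own, and it assumes the version element sits within the three lines after the matching artifactId and holds the literal version string rather than a Maven property.

```shell
# Hypothetical helper: succeeds if the pom ($1) declares artifact ($2) with version ($3).
# Assumption: <version> appears within 3 lines after the matching <artifactId>.
pom_has_dep() {
  pom=$1; artifact=$2; version=$3
  grep -F -A3 "<artifactId>${artifact}</artifactId>" "$pom" \
    | grep -qF "<version>${version}</version>"
}

# Usage, run from inside the YCSB checkout (skipped if the pom is not there).
if [ -f hbase/pom.xml ]; then
  pom_has_dep hbase/pom.xml hbase-client 0.96.0-hadoop2 || echo "hbase-client not updated"
  pom_has_dep hbase/pom.xml hadoop-common 2.2.0 || echo "hadoop-common not updated"
fi
```

If your pom keeps the version in a property instead, the check will report a miss even though the edit is fine, so treat it as a convenience only.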
-YCSB/pom.xml
I'm pretty sure this is optional, but for completeness you can also choose to change the following:
If you refer to my previous post, you will notice I no longer change the slf4j version. This is because HBase 0.96.0 uses the same version as the one stated in the original pom.xml file, which is 1.6.4.
cd into the YCSB directory and run mvn clean package to build the package. Once you see the following output, the build is successful.
[INFO] YCSB Root ......................................... SUCCESS [40.653s]
[INFO] Core YCSB ......................................... SUCCESS [46.852s]
[INFO] Cassandra DB Binding .............................. SUCCESS [44.413s]
[INFO] HBase DB Binding .................................. SUCCESS [1:49.114s]
[INFO] Hypertable DB Binding ............................. SUCCESS [45.091s]
[INFO] DynamoDB DB Binding ............................... SUCCESS [38.011s]
[INFO] ElasticSearch Binding ............................. SUCCESS [3:22.121s]
[INFO] Infinispan DB Binding ............................. SUCCESS [2:43.266s]
[INFO] JDBC DB Binding ................................... SUCCESS [13.182s]
[INFO] Mapkeeper DB Binding .............................. SUCCESS [8.313s]
[INFO] Mongo DB Binding .................................. SUCCESS [5.941s]
[INFO] OrientDB Binding .................................. SUCCESS [15.621s]
[INFO] Redis DB Binding .................................. SUCCESS [4.171s]
[INFO] Voldemort DB Binding .............................. SUCCESS [14.630s]
[INFO] YCSB Release Distribution Builder ................. SUCCESS [13.381s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 12:45.433s
[INFO] Finished at: Thu Sep 12 18:25:37 SGT 2013
[INFO] Final Memory: 65M/165M
[INFO] ------------------------------------------------------------------------
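If you would rather not scan the reactor summary by eye, you can save the Maven output and grep it for the success banner. A minimal sketch, assuming you ran the build as mvn clean package | tee build.log; the file name build.log and the helper build_succeeded are my own choices, not something Maven or YCSB provides.

```shell
# Hypothetical helper: succeeds if the saved Maven log ($1) contains the success banner.
build_succeeded() {
  grep -qF "[INFO] BUILD SUCCESS" "$1"
}

# Usage (only meaningful after: mvn clean package | tee build.log)
if [ -f build.log ]; then
  if build_succeeded build.log; then
    echo "build OK"
  else
    echo "build failed, check build.log"
  fi
fi
```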
You should find the ycsb-0.1.4.tar.gz file inside the YCSB/distribution/target directory. Copy this file to a directory where you have access permission and untar it. Once it is extracted, copy the hbase-site.xml file from your HBase conf directory to your ycsb-0.1.4/hbase-binding/conf/ directory. You should also copy hadoop-auth-2.2.0.jar from your Hadoop installation directory to your ycsb-0.1.4/hbase-binding/lib/ directory. If you don't, you might see the following error when you try to run YCSB:
Exception in thread "Thread-3" java.lang.NoClassDefFoundError: org/apache/hadoop/util/PlatformName
    at org.apache.hadoop.security.UserGroupInformation.getOSLoginModuleName(UserGroupInformation.java:303)
    at org.apache.hadoop.security.UserGroupInformation.<clinit>(UserGroupInformation.java:348)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hbase.util.Methods.call(Methods.java:39)
    at org.apache.hadoop.hbase.security.User.call(User.java:414)
    at org.apache.hadoop.hbase.security.User.callStatic(User.java:404)
    at org.apache.hadoop.hbase.security.User.access$200(User.java:48)
    at org.apache.hadoop.hbase.security.User$SecureHadoopUser.<init>(User.java:221)
    at org.apache.hadoop.hbase.security.User$SecureHadoopUser.<init>(User.java:216)
    at org.apache.hadoop.hbase.security.User.getCurrent(User.java:139)
    at org.apache.hadoop.hbase.client.HConnectionKey.<init>(HConnectionKey.java:67)
    at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:240)
    at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:187)
    at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:149)
    at com.yahoo.ycsb.db.HBaseClient.getHTable(HBaseClient.java:118)
    at com.yahoo.ycsb.db.HBaseClient.update(HBaseClient.java:303)
    at com.yahoo.ycsb.db.HBaseClient.insert(HBaseClient.java:358)
    at com.yahoo.ycsb.DBWrapper.insert(DBWrapper.java:148)
    at com.yahoo.ycsb.workloads.CoreWorkload.doInsert(CoreWorkload.java:461)
    at com.yahoo.ycsb.ClientThread.run(Client.java:269)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.util.PlatformName
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
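The two copy steps described before the stack trace can be wrapped in a small shell function. This is a sketch under a few assumptions: the function name install_hbase_binding_files is my own, the HBASE_HOME and HADOOP_HOME variables in the usage comment are placeholders for wherever your installations live, and the jar is located with find because I am not assuming where it sits inside the Hadoop tree.

```shell
# Hypothetical helper: copy the HBase client config and the hadoop-auth jar
# into an already extracted YCSB directory.
#   $1 = extracted ycsb-0.1.4 directory
#   $2 = HBase conf directory (must contain hbase-site.xml)
#   $3 = Hadoop installation directory (searched for the jar)
install_hbase_binding_files() {
  ycsb_dir=$1; hbase_conf=$2; hadoop_home=$3
  cp "$hbase_conf/hbase-site.xml" "$ycsb_dir/hbase-binding/conf/" || return 1
  # Search for the jar instead of hard-coding its location.
  jar=$(find "$hadoop_home" -name 'hadoop-auth-2.2.0.jar' | head -n 1)
  if [ -z "$jar" ]; then
    echo "hadoop-auth-2.2.0.jar not found under $hadoop_home" >&2
    return 1
  fi
  cp "$jar" "$ycsb_dir/hbase-binding/lib/"
}

# Usage (paths are placeholders, adjust to your setup):
#   tar xzf ycsb-0.1.4.tar.gz -C "$HOME"
#   install_hbase_binding_files "$HOME/ycsb-0.1.4" "$HBASE_HOME/conf" "$HADOOP_HOME"
```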
Before you can run the test, you need to start HDFS (start-dfs.sh) and HBase (start-hbase.sh). Go into the hbase shell and create a table called usertable with a column family called family. You can ignore the warning message; refer to this link for more information.
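For reference, the table creation is a single statement in the hbase shell. The sketch below pipes it in from the command line when the hbase launcher is on your PATH, and otherwise just prints the statement so you can type it inside the shell yourself. The variable name CREATE_STMT is my own.

```shell
# HBase shell statement: table 'usertable' with one column family 'family'.
CREATE_STMT="create 'usertable', 'family'"

# Pipe the statement into the HBase shell if the launcher is available.
if command -v hbase >/dev/null 2>&1; then
  echo "$CREATE_STMT" | hbase shell
else
  echo "hbase not on PATH; run this inside 'hbase shell': $CREATE_STMT"
fi
```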
After creating the table and the column family, you can start loading data into your database:
$ ~/ycsb-0.1.4/bin/ycsb load hbase -P ~/ycsb-0.1.4/workloads/workloada -p columnfamily=family -p recordcount=10000 -p threadcount=4 -s | tee -a workloada_load.dat
Start running the benchmark with the command below:
$ ~/ycsb-0.1.4/bin/ycsb run hbase -P ~/ycsb-0.1.4/workloads/workloada -p columnfamily=family -p operationcount=10000 -p recordcount=10000 -p threadcount=4 -s | tee -a workloada_run.dat
The steps above are a very simple validation using workloada, with only 10000 records loaded into the database and 10000 operations during the run. Please note that for the run operation (especially for read and update tests), you also need to specify recordcount to match your test database size. If you don't, the run uses the default value specified in the workload file (1000), so your test executes its 10000 operations again and again on only 1000 records, and the remaining 9000 records are never accessed.
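The arithmetic behind that warning is simple enough to sketch in shell. The helper untouched_records is hypothetical and only illustrates the numbers above: it takes the number of records loaded and the recordcount the run actually uses, and prints how many loaded records the run can never reach.

```shell
# Hypothetical helper: loaded records ($1) that a run cannot reach
# when its effective recordcount ($2) is smaller than what was loaded.
untouched_records() {
  loaded=$1; run_recordcount=$2
  if [ "$run_recordcount" -ge "$loaded" ]; then
    echo 0
  else
    echo $((loaded - run_recordcount))
  fi
}

untouched_records 10000 1000    # run falls back to the workload default: prints 9000
untouched_records 10000 10000   # run passes -p recordcount=10000: prints 0
```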
For more details on the available workloads, you can refer to the official Git site.