General ideas for HBase troubleshooting

2011-05-02  Source: original to this site  Category: Internet

1 How to find the problem
In our cluster, HBase errors are reported through Splunk and the Nagios alerting mechanism. When a service behaves abnormally, for example when it exits or crashes, or when the master or a regionserver throws an exception, the administrator receives a message.

2 Issue tracking method
The HBase book gives a general approach to handling problems at http://hbase.apache.org/book.html#trouble.general:
1. Search for the exception directly on Google or search-hadoop.com. Google is a tool engineers cannot do without.
2. HBase problems are often not isolated. The log may contain many exceptions, and the most direct approach is to find the first one. Java problems are generally solved this way. Do not just grep for ERROR messages, though, because HBase's log levels are somewhat inconsistent: serious errors are sometimes logged at INFO. The book also recommends grepping for "Dump", because the RegionServer may print out some of its metrics there.
3. Pay attention to the ulimit and xcievers (dfs.datanode.max.xcievers) settings. A RegionServer may also exit on its own because of a ZooKeeper session timeout, which was discussed in an earlier post on this blog.
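The advice in step 2 can be sketched as a small Python script. This is only an illustration under stated assumptions: the function names `first_exception` and `dump_lines` are made up for this example, and the heuristic (any token ending in "Exception" or "Error", at any log level) is a guess at what is worth looking at, not something the HBase book prescribes.

```python
import re
import sys

# Match Java-style class names ending in Exception or Error, e.g.
# java.io.IOException. The log level is deliberately ignored, because
# serious problems are sometimes logged at INFO.
EXCEPTION_RE = re.compile(r"\b[\w.]*(?:Exception|Error)\b")


def first_exception(lines):
    """Return (line_number, line) for the first exception-like line, or None."""
    for number, line in enumerate(lines, start=1):
        if EXCEPTION_RE.search(line):
            return number, line.rstrip("\n")
    return None


def dump_lines(lines):
    """Return every line that mentions "Dump" (e.g. a metrics dump)."""
    return [line.rstrip("\n") for line in lines if "Dump" in line]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical): python first_exception.py <path to a log file>
    with open(sys.argv[1], errors="replace") as log:
        content = log.readlines()
    print(first_exception(content))
    for line in dump_lines(content):
        print(line)
```

Starting from the first hit and reading forward usually says more than the last stack trace in the file, since later exceptions are often consequences of the first one.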
3 Log location
NameNode: $HADOOP_HOME/logs/hadoop-<user>-namenode-<hostname>.log
DataNode: $HADOOP_HOME/logs/hadoop-<user>-datanode-<hostname>.log
JobTracker: $HADOOP_HOME/logs/hadoop-<user>-jobtracker-<hostname>.log
TaskTracker: $HADOOP_HOME/logs/hadoop-<user>-tasktracker-<hostname>.log
HMaster: $HBASE_HOME/logs/hbase-<user>-master-<hostname>.log
RegionServer: $HBASE_HOME/logs/hbase-<user>-regionserver-<hostname>.log
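The naming pattern above is regular enough to build the path in code. The following Python sketch assumes the default layout listed here and a POSIX-style path. The `log_path` helper and the `DAEMONS` table are names invented for this example, and a cluster with a custom log directory would need a different base path.

```python
import getpass
import os
import socket

# Daemon name -> (environment variable with the install dir, file prefix).
DAEMONS = {
    "namenode": ("HADOOP_HOME", "hadoop"),
    "datanode": ("HADOOP_HOME", "hadoop"),
    "jobtracker": ("HADOOP_HOME", "hadoop"),
    "tasktracker": ("HADOOP_HOME", "hadoop"),
    "master": ("HBASE_HOME", "hbase"),
    "regionserver": ("HBASE_HOME", "hbase"),
}


def log_path(daemon, user=None, hostname=None, env=None):
    """Build <home>/logs/<prefix>-<user>-<daemon>-<hostname>.log."""
    env = os.environ if env is None else env
    home_var, prefix = DAEMONS[daemon]
    # Fall back to the current user and host when none are given.
    user = user or getpass.getuser()
    hostname = hostname or socket.gethostname()
    return f"{env[home_var]}/logs/{prefix}-{user}-{daemon}-{hostname}.log"
```

For example, `log_path("regionserver")` on a RegionServer host should give the file to tail, provided HBASE_HOME is set in the environment.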
4 Some important tools
a) search-hadoop.com
b) tail
c) top
d) jps
e) jstack
f) OpenTSDB. I have not used it. If you have, please leave a comment.
g) clusterssh + top. This is a good idea. ssh $host top can be used to collect information from the other machines in the cluster, which turns it into a monitoring tool.
h) $ ./bin/hbase hbck
It returns OK or INCONSISTENCY. If it reports INCONSISTENCY, run it a few more times, because the cluster may not have fully started yet or a region may be in the middle of a split. The -fix option may be able to repair the inconsistency. (I have not tried it. When I get the chance I will see whether it really works. The documentation is not definite about it either.)
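The "run it a few more times" advice for hbck can be sketched in Python. Several things here are assumptions rather than facts from this post: that the report contains a line beginning with "Status:", that the command lives at ./bin/hbase, and the retry count and delay. The function names are hypothetical. The sketch never passes -fix.

```python
import subprocess
import time


def run_hbck(hbase_bin="./bin/hbase"):
    """Run `hbase hbck` and return its combined stdout and stderr as text."""
    result = subprocess.run(
        [hbase_bin, "hbck"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout


def hbck_status(output):
    """Return the text after the last "Status:" line, or None if absent.

    Assumption: the report ends with a line such as "Status: OK".
    """
    status = None
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("Status:"):
            status = line[len("Status:"):].strip()
    return status


def check_with_retries(run=run_hbck, attempts=3, delay_seconds=30,
                       sleep=time.sleep):
    """Re-run the check until it reports OK or the attempts are used up."""
    status = None
    for attempt in range(attempts):
        status = hbck_status(run())
        if status == "OK":
            break
        # A starting cluster or a splitting region may settle after a pause.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return status
```

Passing `run` and `sleep` as parameters keeps the retry logic testable without a live cluster. A result other than "OK" after all attempts is the point at which to look at the report by hand.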

In addition, the commands for evaluating server performance described in the article at http://fuliang.iteye.com/blog/1024360 are very useful for maintaining HBase.
