python-Hbase
Simple cluster setup and issue log
HDFS
VM overview
OS: Red Hat
Disk: 150 GB
Memory: 10 GB
Processors: 4
IP addresses:
hadoop0: 192.168.56.108
hadoop1: 192.168.56.109
hadoop2: 192.168.56.110
1. Software versions
Hadoop 3.3: /opt/module/hadoop-3.3
JDK 1.8.0_201: /opt/module/jdk1.8
2. System environment variables
vim /etc/hosts
192.168.56.108 hadoop0
192.168.56.109 hadoop1
192.168.56.110 hadoop2

Keep the file identical on all three nodes; /etc/hosts takes effect immediately, no source needed.
vim /etc/profile
export JAVA_HOME=/opt/module/jdk1.8
export JAVA_CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
export HADOOP_HOME=/opt/module/hadoop-3.3
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export SQOOP_SERVER_EXTRA_LIB=$SQOOP_HOME/extLib   # requires SQOOP_HOME to be set elsewhere
export HADOOP_COMMON_HOME=/opt/module/hadoop-3.3/share/hadoop/common
export HADOOP_HDFS_HOME=/opt/module/hadoop-3.3/share/hadoop/hdfs
export HADOOP_MAPRED_HOME=/opt/module/hadoop-3.3/share/hadoop/mapreduce
export HADOOP_YARN_HOME=/opt/module/hadoop-3.3/share/hadoop/yarn
Keep it identical on all three nodes, then run source /etc/profile to apply.
3. Passwordless SSH login

1. Disable the firewall
# iptables/chkconfig apply to RHEL/CentOS 6; on 7+ use the systemctl firewalld commands shown in the ZooKeeper section
service iptables status
service iptables stop
chkconfig iptables off
# Disable SELinux
vim /etc/selinux/config
# comment out:
#SELINUX=enforcing
#SELINUXTYPE=targeted
# add:
SELINUX=disabled
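Editing /etc/selinux/config only takes effect after a reboot. A small addition, not from the original notes: setenforce turns SELinux off for the current session as well.

# disable SELinux immediately, no reboot needed (the config change above keeps it off after reboot)
setenforce 0
# verify: should print Permissive (or Disabled after a reboot)
getenforce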
2. Set up passwordless SSH

Part 1: passwordless login to the local machine

The steps below configure hadoop-master as the example; repeat them on each of the hadoop-slave1~3 child nodes.

1) Generate a key pair

ssh-keygen -t rsa

2) Append the public key to "authorized_keys"

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

3) Set permissions

chmod 600 .ssh/authorized_keys

4) Verify that the local machine can log in without a password

ssh hadoop-master

Finally, configure hadoop-slave1~3 the same way.

Part 2: passwordless login from hadoop-master to hadoop-slave1, hadoop-slave2 and hadoop-slave3, using hadoop-slave1 as the example:

1) Log in to hadoop-slave1 and copy hadoop-master's public key "id_rsa.pub" to hadoop-slave1's "/root" directory.

scp root@hadoop-master:/root/.ssh/id_rsa.pub /root/

2) Append hadoop-master's public key (id_rsa.pub) to hadoop-slave1's authorized_keys.

cat id_rsa.pub >> .ssh/authorized_keys
rm -f id_rsa.pub

3) Test from hadoop-master

ssh hadoop-slave1

Part 3: passwordless login from hadoop-slave1~hadoop-slave3 to hadoop-master

The steps below use hadoop-slave1 as the example; repeat them for hadoop-slave2 and hadoop-slave3.

1) Log in to hadoop-master and copy hadoop-slave1's public key "id_rsa.pub" to hadoop-master's "/root/" directory.

scp root@hadoop-slave1:/root/.ssh/id_rsa.pub /root/

2) Append hadoop-slave1's public key (id_rsa.pub) to hadoop-master's authorized_keys.

cat id_rsa.pub >> .ssh/authorized_keys
rm -f id_rsa.pub    # delete id_rsa.pub

3) Test from hadoop-slave1

ssh hadoop-master

Repeat for hadoop-slave2 and hadoop-slave3. (A shorter route using ssh-copy-id is sketched below.)
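A hedged shortcut, not from the original notes: ssh-copy-id bundles the scp/cat/chmod steps into one command, so the full mesh between this cluster's three nodes can be set up roughly like this:

# run on each of hadoop0, hadoop1, hadoop2
ssh-keygen -t rsa                 # accept the defaults
for host in hadoop0 hadoop1 hadoop2; do
    ssh-copy-id root@$host        # asks for that host's password once
done
# verify from any node
ssh hadoop1 hostname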
3. Hadoop deployment

1. On hadoop-master, unpack the installation package and create the base directories. (Note: the snippet below downloads Hadoop 2.7.3 into /usr/local, while this cluster actually runs hadoop-3.3 under /opt/module; adjust the version and paths accordingly.)

# download
wget http://apache.claz.org/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz
# unpack
tar -xzvf hadoop-2.7.3.tar.gz -C /usr/local
# rename
mv hadoop-2.7.3 hadoop
2. ./hadoop/etc/hadoop/core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop0:9888</value>
        <description>NameNode RPC address (the port HDFS clients connect to, not the web UI)</description>
    </property>

    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/opt/module/hadoop-3.3/tmp</value>
        <description>Base directory for temporary files</description>
    </property>

    <property>
        <name>hadoop.proxyuser.root.hosts</name>
        <value>*</value>
        <description>Allow proxy access from all hosts</description>
    </property>
    <property>
        <name>hadoop.proxyuser.root.groups</name>
        <value>*</value>
        <description>Allow proxy access for all groups</description>
    </property>
</configuration>
3. ./hadoop/etc/hadoop/hdfs-site.xml
<configuration>

    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>

    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/opt/module/hadoop-3.3/tmp/namenode</value>
        <description>Path on the local filesystem where the NameNode persistently stores the namespace and transaction logs</description>
    </property>

    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/opt/module/hadoop-3.3/tmp/datanode</value>
        <description>Path on the local filesystem where the DataNode stores its blocks</description>
    </property>

    <property>
        <name>dfs.namenode.http-address</name>
        <value>hadoop0:9870</value>
        <description>NameNode web UI address</description>
    </property>

    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop1:9868</value>
        <description>SecondaryNameNode web UI address</description>
    </property>

</configuration>
4. ./hadoop/etc/hadoop/mapred-site.xml

# Hadoop 2.x ships only a template; on Hadoop 3.x mapred-site.xml already exists and the cp step can be skipped
cp ./hadoop/etc/hadoop/mapred-site.xml.template ./hadoop/etc/hadoop/mapred-site.xml
vim ./hadoop/etc/hadoop/mapred-site.xml
<configuration>

    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
        <description>MapReduce execution framework</description>
    </property>

    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hadoop0:19888</value>
        <description>JobHistory server web UI address</description>
    </property>

</configuration>
5. ./hadoop/etc/hadoop/yarn-site.xml

<configuration>

<!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>

    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop2</value>
        <description>Designates the YARN ResourceManager (master) node</description>
    </property>

    <property>
        <name>yarn.application.classpath</name>
        <value>/opt/module/hadoop-3.3/etc/hadoop:/opt/module/hadoop-3.3/share/hadoop/common/lib/*:/opt/module/hadoop-3.3/share/hadoop/common/*:/opt/module/hadoop-3.3/share/hadoop/hdfs:/opt/module/hadoop-3.3/share/hadoop/hdfs/lib/*:/opt/module/hadoop-3.3/share/hadoop/hdfs/*:/opt/module/hadoop-3.3/share/hadoop/mapreduce/*:/opt/module/hadoop-3.3/share/hadoop/yarn:/opt/module/hadoop-3.3/share/hadoop/yarn/lib/*:/opt/module/hadoop-3.3/share/hadoop/yarn/*</value>
    </property>

</configuration>
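A side note, not from the original: the value for yarn.application.classpath does not have to be typed by hand. Hadoop prints the exact classpath string for the local installation, which can be pasted in.

# prints the classpath of this installation
/opt/module/hadoop-3.3/bin/hadoop classpath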
6. masters and slaves
workers
hadoop0
hadoop1
hadoop2

Hadoop 3.3 only has the workers file (the masters/slaves files are gone).
Configure the Hadoop environment on the slaves

The steps below configure hadoop-slave1 as the example; repeat them for hadoop-slave2~3.

1) Copy hadoop to the hadoop-slave1 node

scp -r /usr/local/hadoop hadoop-slave1:/usr/local/

Log in to hadoop-slave1 and delete the slaves file content

rm -rf /usr/local/hadoop/etc/hadoop/slaves

2) Configure environment variables

vi /etc/profile
## contents
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin

Make the hadoop command take effect in the current terminal:

source /etc/profile

Repeat for the other slave nodes.
Cluster startup and initialization

1. Format the HDFS filesystem

On the master, enter the ~/hadoop directory and run:

bin/hadoop namenode -format

This formats the NameNode; run it only once, before the first startup, never afterwards. (On Hadoop 3.x the non-deprecated form is bin/hdfs namenode -format.)
2. Then start Hadoop:

sbin/start-dfs.sh
sbin/start-yarn.sh

3. Check the processes with jps

# on the master, run jps
25928 SecondaryNameNode
25742 NameNode
26387 Jps
26078 ResourceManager

# on a slave, run jps
24002 NodeManager
23899 DataNode
24179 Jps

4. Check the state of the whole cluster

jps only shows whether the HDFS and MapReduce daemons came up; it says nothing about the cluster as a whole. hadoop dfsadmin -report quickly shows which nodes are down, total and used HDFS capacity, and each node's disk usage. (On Hadoop 3.x the non-deprecated form is hdfs dfsadmin -report.)

hadoop dfsadmin -report

5. Restarting Hadoop

sbin/stop-all.sh
sbin/start-all.sh
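A quick smoke test after startup (my addition, not in the original notes) confirms HDFS actually accepts reads and writes, not just that the daemons are up:

# create a directory, write a file from stdin, read it back, clean up
hdfs dfs -mkdir -p /tmp/smoke
echo hello | hdfs dfs -put - /tmp/smoke/hello.txt
hdfs dfs -cat /tmp/smoke/hello.txt     # should print "hello"
hdfs dfs -rm -r /tmp/smoke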
HIVE
Environment variables

Modify the environment variables:

Run: vi /etc/profile

export JAVA_HOME=/usr/local/software/jdk1.8.0_66
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export HADOOP_HOME=/usr/local/software/hadoop_2.7.1
export HBASE_HOME=/usr/local/software/hbase_1.2.2
export HIVE_HOME=/usr/local/software/apache-hive-2.3.0-bin
export PATH=.:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HBASE_HOME/bin:$HIVE_HOME/bin:$PATH

Run: source /etc/profile to refresh the environment variables.
(Source: https://blog.csdn.net/yuan_xw/article/details/78197917; note its versions and paths differ from this cluster's.)
conf
hive-site.xml
Zookeeper
conf
zoo.cfg
cp zoo_sample.cfg zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/opt/module/apache-zookeeper-3.7.1-bin/data
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1

## Metrics Providers
#
# https://prometheus.io Metrics Exporter
#metricsProvider.className=org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider
#metricsProvider.httpPort=7000
#metricsProvider.exportJvmInfo=true

# The hostnames match the /etc/hosts mappings configured earlier.
# In host:2888:2889, 2888 is the data-sync/messaging port and 2889 the leader-election port.
server.0=hadoop0:2888:2889
server.1=hadoop1:2888:2889
server.2=hadoop2:2888:2889
Issue log

ZooKeeper fails to start: Error: Could not find or load main class org.apache.zookeeper.server.quorum.QuorumPeer

The package downloaded from the official site was the source tarball, which is not compiled; it either has to be built by hand or swapped for the official pre-built release (the pre-built tarball carries a "bin" marker in its name). Re-downloading the pre-built package and unpacking it made startup succeed.

Searching Baidu never turned up the right fix, so the answer came from the official ZooKeeper documentation: under "Standalone Operation" it says "The server is contained in a single JAR", i.e. the release is expected to ship the compiled JAR.
File permission issue
org.apache.zookeeper.server.quorum.QuorumPeerConfig$ConfigException: Address unresolved: hadoop0:2889
This error is caused by a trailing space after the port number on the server.1 line.

Each server.N entry must correspond one-to-one with the myid file on that node (see the sketch below).
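A minimal sketch of that correspondence (my addition; the path assumes the dataDir from zoo.cfg above): the number after "server." must equal the content of dataDir/myid on that host.

# on hadoop0 (server.0)
echo 0 > /opt/module/apache-zookeeper-3.7.1-bin/data/myid
# on hadoop1 (server.1)
echo 1 > /opt/module/apache-zookeeper-3.7.1-bin/data/myid
# on hadoop2 (server.2)
echo 2 > /opt/module/apache-zookeeper-3.7.1-bin/data/myid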
java.net.BindException: Address already in use
Cannot open channel to 2 at election address hadoop2/192.168.56.110:2889
java.net.ConnectException: Connection refused (Connection refused)
After starting all three nodes, status reports everything as normal.

The other servers had not started the service yet, so naturally the connection was refused.

In zoo.cfg on each node, replace that node's own IP/hostname with 0.0.0.0.

Disable the firewall:

systemctl stop firewalld
systemctl disable firewalld
systemctl status firewalld
HBase
Install HBase

Install and configure it on hadoop-master first, then copy it to the slave nodes. (The snippet below is from an HBase 1.3.1 guide; this cluster actually uses hbase-2.4.17 under /opt/module.)

wget http://mirror.bit.edu.cn/apache/hbase/1.3.1/hbase-1.3.1-bin.tar.gz
# unpack
tar -xzvf hbase-1.3.1-bin.tar.gz -C /usr/local/
# rename
mv hbase-1.3.1 hbase

Environment variable configuration
vim /etc/profile
# contents
export HBASE_HOME=/opt/module/hbase-2.4.17
export PATH=$HBASE_HOME/bin:$PATH
# apply immediately
source /etc/profile

Raise the open-files limit (ulimit)
ulimit -n 10240
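ulimit -n only affects the current shell session. A side note, not from the original: to make the higher open-files limit survive logout and reboot, it is usually persisted in /etc/security/limits.conf.

# append to /etc/security/limits.conf; applies at the next login
* soft nofile 10240
* hard nofile 10240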
Configuration files
/hbase/conf
hbase-env.sh
# contents (paths below are from the reference tutorial; on this cluster use /opt/module/jdk1.8, /opt/module/hbase-2.4.17 and /opt/module/hadoop-3.3)
export JAVA_HOME=/usr/lib/jvm/jre-1.7.0-openjdk.x86_64
export HBASE_CLASSPATH=/usr/local/hbase/conf
# let HBase manage its own ZooKeeper rather than a standalone one
export HBASE_MANAGES_ZK=true
export HBASE_HOME=/usr/local/hbase
export HADOOP_HOME=/usr/local/hadoop
# HBase log directory
export HBASE_LOG_DIR=/usr/local/hbase/logs

hbase-site.xml
<configuration>
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://hadoop-master:9000/hbase</value>
    </property>
    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>
    <property>
        <name>hbase.master</name>
        <value>hadoop-master:60000</value>
    </property>
    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>hadoop-master,hadoop-slave1,hadoop-slave2,hadoop-slave3</value>
    </property>
</configuration>

1. Hadoop port: the host:port in hbase.rootdir must match fs.defaultFS from core-site.xml (on this cluster: hdfs://hadoop0:9888/hbase, quorum hadoop0,hadoop1,hadoop2).

ZooKeeper port: defaults to 2181, matching clientPort in zoo.cfg.
regionservers
hadoop0
hadoop1
hadoop2

Copy HBase to the slave nodes

scp -r /opt/module/hbase-2.4.17 hadoop1:/opt/module/hbase-2.4.17
scp -r /opt/module/hbase-2.4.17 hadoop2:/opt/module/hbase-2.4.17

Cluster startup

Start HBase

Starting it on the master node alone is enough:

~/hbase/bin/start-hbase.sh
Processes on the master

[hadoop@master ~]$ jps
6225 Jps
2897 SecondaryNameNode # hadoop process
2710 NameNode # hadoop master process
3035 ResourceManager # hadoop process
5471 HMaster # hbase master process
2543 HQuorumPeer # zookeeper process

Error 1: SLF4J found duplicate bindings on the classpath, which prevented a clean start; deleting one of the two jars fixes it.

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/module/hadoop-3.3/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/hbase-2.4.17/lib/client-facing-thirdparty/slf4j-reload4j-1.7.33.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]

Processes on a slave

[hadoop@slave1 ~]$ jps
4689 Jps
2533 HQuorumPeer # zookeeper process
2589 DataNode # hadoop slave process
4143 HRegionServer # hbase slave process
If a standalone ZooKeeper is installed:

Start order: hadoop -> zookeeper -> hbase
Stop order: hbase -> zookeeper -> hadoop

With the bundled ZooKeeper:

Start order: hadoop -> hbase
Stop order: hbase -> hadoop
Issue log

Starting the HBase shell

3210 Jps
[root@hadoop0 hbase-2.4.17]# ssh hadoop1
Last login: Thu May 4 09:20:38 2023 from hadoop0
[root@hadoop1 ~]# jps
1680 SecondaryNameNode
1846 HQuorumPeer
2088 Jps
1609 DataNode
[root@hadoop1 ~]# exit
logout
Connection to hadoop1 closed.
[root@hadoop0 hbase-2.4.17]# ./bin/hbase shell
LoadError: load error: irb/completion -- java.lang.NoSuchMethodError: jline.console.completer.CandidateListCompletionHandler.setPrintSpaceAfterFullCompletion(Z)V
  require at org/jruby/RubyKernel.java:974
  require at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:54
  <main> at classpath:/jar-bootstrap.rb:42
Still caused by the slf4j jar incompatibility.

Back up HBase's jars to a safe directory first!

Replace HBase's jars with Hadoop's versions.

Replace the slf4j jar.

hbase-site.xml configuration
org.apache.hadoop.hbase.PleaseHoldException: Master is initializing

The error says the Master is still initializing. Possible causes:
1. The cluster nodes' clocks are out of sync. Run date on each node to compare; for syncing time on an offline cluster see https://blog.csdn.net/m0_46413065/article/details/116378004
2. If that does not help, stale HBase data in HDFS and in ZooKeeper was never deleted and has to be removed (see the sketch below). Note: deleting the /hbase znode requires ZooKeeper to be running, otherwise the client cannot connect.
(Source: https://blog.csdn.net/weixin_43648549/article/details/123615758)
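The source says "the commands are as follows" without actually listing them; a hedged reconstruction of the usual cleanup (stop HBase first; this wipes all HBase data):

# remove the stale HBase directory from HDFS
hdfs dfs -rm -r /hbase
# open a ZooKeeper client session
/opt/module/apache-zookeeper-3.7.1-bin/bin/zkCli.sh -server hadoop0:2181
# then, at the zkCli prompt (deleteall needs ZooKeeper 3.5+; older releases use rmr):
deleteall /hbase
quit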
Creating a table succeeded after using method 2.

NTP: aligning cluster time

Install NTP

ntp must be installed on every machine in the cluster.

3.1 Check the installed version

yum list installed | grep ntp

3.2 Install ntp

yum -y install ntp

4. Configure the NTP server

Set the machine's local time to the standard time zone first.
vim /etc/ntp.conf
# For more information about this file, see the man pages
# ntp.conf(5), ntp_acc(5), ntp_auth(5), ntp_clock(5), ntp_misc(5), ntp_mon(5).

driftfile /var/lib/ntp/drift

# Permit time synchronization with our time source, but do not
# permit the source to query or modify the service on this system.
restrict default nomodify notrap nopeer noquery

# Permit all access over the loopback interface. This could
# be tightened as well, but to do so would effect some of
# the administrative functions.
restrict 127.0.0.1
restrict ::1

# Hosts on local network are less restricted.
#restrict 192.168.1.0 mask 255.255.255.0 nomodify notrap

# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst
server 127.127.1.0
fudge 127.127.1.0 stratum 10
#broadcast 192.168.1.255 autokey # broadcast server
#broadcastclient # broadcast client
#broadcast 224.0.1.1 autokey # multicast server
#multicastclient 224.0.1.1 # multicast client
#manycastserver 239.255.254.254 # manycast server
#manycastclient 239.255.254.254 autokey # manycast client

# Enable public key cryptography.
#crypto

includefile /etc/ntp/crypto/pw

# Key file containing the keys and key identifiers used when operating
# with symmetric key cryptography.
keys /etc/ntp/keys

# Specify the key identifiers which are trusted.
#trustedkey 4 8 42

# Specify the key identifier to use with the ntpdc utility.
#requestkey 8

# Specify the key identifier to use with the ntpq utility.
#controlkey 8

# Enable writing of statistics records.
#statistics clockstats cryptostats loopstats peerstats

# Disable the monitoring facility to prevent amplification attacks using ntpdc
# monlist command when default restrict does not include the noquery flag. See
# CVE-2013-5211 for more details.
# Note: Monitoring will not be disabled with the limited restriction flag.
disable monitor
service ntpd start
systemctl enable ntpd.service

Check which time server this node synchronizes against:

ntpq -p

Check the port ntp publishes after startup (default: 123):

netstat -anp | grep ntp
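The note above configures only the server side (serving its local clock at stratum 10, presumably on one node such as hadoop0). A hedged sketch of the client side on the other nodes, so they actually follow the master:

# one-shot manual sync against the master (-u uses an unprivileged port, so it works even if ntpd holds port 123)
ntpdate -u hadoop0
# for continuous sync, point /etc/ntp.conf at the master instead of the public pool:
#   server hadoop0 iburst
# then restart and enable the daemon
service ntpd restart
systemctl enable ntpd.service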
Web UI

http://192.168.56.108:60010/master-status
(60010 was the default before HBase 1.0; on the HBase 2.4.17 used here the master UI defaults to 16010, i.e. http://192.168.56.108:16010/master-status)
Python remote interface
thrift
Install the thrift service on the HBase master node

Install thrift

Download thrift: wget http://mirror.bit.edu.cn/apache/thrift/0.10.0/thrift-0.10.0.tar.gz

tar zvxf thrift-0.10.0.tar.gz
cd thrift-0.10.0/
./configure
sudo make && make install

Note: if the build fails with "g++: error: /usr/lib64/libboost_unit_test_framework.a: No such file or directory", run:

yum install boost-devel-static
no acceptable C compiler found in $PATH

This means no C compiler is installed.

Run yum -y install gcc-c++ to install one, then run gcc -v to verify the installation succeeded.
Start the thrift service:
./bin/hbase-daemon.sh start thrift

Verify with jps

Install the related packages on the Python client machine
pip install thrift
pip install happybase
command 'gcc' failed: No such file or directory
yum install gcc
pip install hbase-python
Issue log

1. import hbase fails with "can't find google" (the google.protobuf package is missing)
pip install protobuf
TypeError: Descriptors cannot not be created directly.
 If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
 If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).
pip install protobuf==3.20.1
TTransportException: TSocket read 0 bytes
Fix: change the connection protocol to TCompactProtocol (see the script example below).

Script example
from thrift import Thrift
from thrift.transport import TSocket, TTransport
from thrift.protocol import TBinaryProtocol, TCompactProtocol
from hbase import Hbase
from hbase.ttypes import *

class hbaseUtils(object):
    __slots__ = ['transport', 'client']

    def __init__(self):
        # server address and port: the host is the HMaster (i.e. the thrift server), 9090 is the default thrift port
        transport = TSocket.TSocket('192.168.56.108', 9090)
        # optional timeout
        transport.setTimeout(5000)
        # transport layer (TFramedTransport or TBufferedTransport)
        self.transport = TTransport.TFramedTransport(transport)
        # protocol layer
        protocol = TCompactProtocol.TCompactProtocol(self.transport)
        # build the client
        self.client = Hbase.Client(protocol)

HB = hbaseUtils()
HB.transport.open()
HB.client.getTableNames()
Docker
Quick install

Check the system and kernel versions

Docker supports 64-bit CentOS 7, CentOS 8 and later, and requires a Linux kernel of at least 3.10.

Two handy ways to check the Linux release: lsb_release -a or cat /etc/redhat-release (here the system is CentOS 7).

Three ways to check the kernel version:

cat /proc/version
uname -a
uname -r
One-line install from the domestic daocloud mirror:
curl -sSL https://get.daocloud.io/docker | sh

Official one-line install:
curl -fsSL https://get.docker.com | bash -s docker --mirror Aliyun

sudo systemctl start docker

Common Docker commands after installation:

Search the registry for an image: docker search <name>
Pull an image: docker pull <name>
List running containers: docker ps
List all containers: docker ps -a
Remove a container: docker rm container_id
List images: docker images
Remove an image: docker rmi image_id
Start a (stopped) container: docker start <container ID>
Stop a container: docker stop <container ID>
Restart a container: docker restart <container ID>
Run a new container: docker run -it ubuntu /bin/bash
Enter a container: docker attach <container ID> or docker exec -it <container ID> /bin/bash (the latter is recommended).

More commands are listed by docker help.
JupyterNotebook
https://jupyter-docker-stacks.readthedocs.io/en/latest/using/selecting.html

jupyter/tensorflow-notebook

docker pull jupyter/base-notebook:latest    # pull the image
docker run --rm -p 8888:8888 jupyter/base-notebook:latest    # run the image
3. The notebook is reachable now, but the notebooks root should be mapped to a local directory. From the startup log, the default root is /home/jovyan, which contains many hidden files that cause errors when mounted over, so the notebook working directory has to be changed.
4. Command format for overriding settings: docker run -p 8888:8888 jupyter/base-notebook start-notebook.sh --NotebookApp.password='sha1:74ba40f8a388:c913541b7ee99d15d5ed31d4226bf7838f83a50e'
That is, append start-notebook.sh followed by the parameters and values.

Parameter list: https://jupyter-notebook.readthedocs.io/en/stable/config.html
(Source: https://blog.csdn.net/Qwertyuiop2016/article/details/120439121)

NotebookApp.password: notebook access password
NotebookApp.allow_password_change: whether the password may be changed remotely
NotebookApp.allow_remote_access: unclear what exactly it does, but added every time anyway
NotebookApp.open_browser: whether to open a browser; defaults to False inside a container, so it can be omitted
NotebookApp.notebook_dir: notebook working directory
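Putting the pieces together (a hedged sketch: the host directory ./notebooks and the working dir /home/jovyan/work are my choices, not from the original): mount a host directory into the container and point the notebook working directory at it.

# map ./notebooks from the host and use it as the notebook root
docker run -p 8888:8888 \
    -v "$PWD/notebooks":/home/jovyan/work \
    jupyter/base-notebook \
    start-notebook.sh --NotebookApp.notebook_dir=/home/jovyan/work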
conda
https://repo.anaconda.com/archive/

$ sh Anaconda3-2022.05-Linux-x86_64.sh

source /dellfsqd2/ST_LBI/USER/myname/app/conda/anaconda3/bin/activate

$ conda init
$ conda create --name snowflakes
PIP
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple package_name

Tsinghua: https://pypi.tuna.tsinghua.edu.cn/simple
Aliyun: http://mirrors.aliyun.com/pypi/simple/
USTC: https://pypi.mirrors.ustc.edu.cn/simple/
HUST: http://pypi.hustunique.com/
Shandong University of Technology: http://pypi.sdutlinux.org/
Douban: http://pypi.douban.com/simple/
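A side note, not from the original: instead of passing -i on every install, the mirror can be persisted through pip's own config (pip 10+).

# make the Tsinghua mirror the default index
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple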