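An Introduction to Common crmsh Commands for corosync + pacemaker

Author: 和风细雨  Published: 2014-04-26 08:15:17

----Outline

What corosync and pacemaker are

Common high-availability cluster solutions

Installing corosync and pacemaker

pacemaker resource manager (CRM) command reference

Worked example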

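I. What corosync and pacemaker are

corosync provides the communication service in a high-availability environment. It sits at the bottom of the HA cluster architecture (the Message Layer) and carries heartbeat messages between the nodes.

pacemaker is an open-source high-availability cluster resource manager (CRM). It sits at the resource-management layer of the HA cluster architecture, where it drives the resource agents (RA). It cannot carry heartbeat messages itself; to communicate with its peers it relies on the underlying messaging service. It is usually combined with corosync in one of two ways:

  • pacemaker runs as a corosync plugin;

  • pacemaker runs as an independent daemon.

    Notes:

    • Early versions of corosync have no voting capability, so the total number of nodes in the cluster should be odd and greater than 2.

    • corosync 1.x has no vote-counting (votes) function of its own; corosync 2.0 introduced votequorum.

    • cman (DC) + corosync: if you want to use pacemaker together with cman, cman can only be used as a corosync plugin.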
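The advice to use an odd number of nodes follows from simple majority arithmetic. Here is a minimal sketch, assuming one vote per node and a plain majority rule. The function names quorum_needed and has_quorum are made up for this illustration and are not part of corosync or pacemaker.

```shell
#!/bin/sh
# Votes needed for quorum: strictly more than half of all votes.
quorum_needed() {
    total_votes=$1
    echo $(( total_votes / 2 + 1 ))
}

# Succeeds when the surviving partition still holds a majority.
has_quorum() {
    total_votes=$1
    live_votes=$2
    [ "$live_votes" -ge "$(quorum_needed "$total_votes")" ]
}

# Show how many node failures each cluster size can tolerate.
for n in 2 3 5; do
    q=$(quorum_needed "$n")
    echo "nodes=$n quorum=$q tolerated_failures=$(( n - q ))"
done
```

With two nodes the threshold equals the node count, so losing either node loses quorum. Three nodes can lose one node and still hold a majority.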

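II. Common high-availability cluster solutions

    • heartbeat + crm

    • cman + rgmanager

    • cman + pacemaker

    • corosync + pacemaker (pacemaker as the resource manager)

III. Installing corosync and pacemaker

      # yum install -y corosync pacemaker

      The configuration files live under /etc/corosync/; the template is corosync.conf.example.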

      # Please read the corosync.conf.5 manual page
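      compatibility: whitetank # backward compatibility with openais 0.80.z (whitetank)
      totem {
          version: 2 # totem configuration version (2 is the only valid value)
          secauth: off # whether authentication is enabled; best turned on
          threads: 0 # number of parallel threads used when authentication is enabled
          interface {
              ringnumber: 0 # ring number; on a host with several NICs it keeps heartbeat traffic from looping back
              bindnetaddr: 192.168.1.1 # network address of the nodes' subnet (for a /24 this is normally written 192.168.1.0)
              mcastaddr: 226.94.1.1 # multicast address
              mcastport: 5405    # port used for multicast
              ttl: 1 # heartbeats travel only one hop, which avoids multicast loops
          }
      }
      # totem defines how the cluster nodes communicate with each other. totem is a protocol used by corosync for inter-node communication, and the protocol is versioned.
      logging {
          fileline: off
          to_stderr: no # whether to send log messages to standard error (no)
          to_logfile: yes # whether to write a log file
          to_syslog: yes # whether to log to syslog as well --> these entries end up in /var/log/messages
          logfile: /var/log/cluster/corosync.log # log file location
          debug: off # unless you are troubleshooting, keep debug off; it logs in great detail and generates a lot of disk I/O.
          timestamp: on # timestamp each log entry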
          logger_subsys {
              subsys: AMF
              debug: off
          }
      }
      amf {
          mode: disabled
      }

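      To run pacemaker as a corosync plugin, add the following to corosync.conf:

      service {
          ver: 0
          name: pacemaker
      }
      # corosync starts pacemaker automatically once it is up
      aisexec {
          user: root
          group: root
      }
      # the identity the ais executive runs as; the default is root, so the aisexec section can be omitted
      • Step 1: Generate the key file. Run corosync-keygen to create the key (it is built from random data read from /dev/random; the key is used when secauth is on). The command writes an authkey file into the configuration directory. Copy authkey and corosync.conf to every cluster node.

        # scp -p authkey corosync.conf 192.168.1.111:/etc/corosync/

      • Step 2: Start corosync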

        [root@essun corosync]# ssh essun.node2.com 'service corosync start'
        Starting Corosync Cluster Engine (corosync): [  OK  ]
        [root@essun corosync]# service corosync start
        Starting Corosync Cluster Engine (corosync):               [  OK  ]

        Check the log to confirm that corosync started correctly (do this on every node):

        # tail -40 /var/log/cluster/corosync.log
        Apr 25 23:12:01 [2811] essun.node3.com       crmd:     info: update_attrd:  Connecting to attrd... 5 retries remaining
        Apr 25 23:12:01 [2806] essun.node3.com        cib:     info: cib_process_replace:   Digest matched on replace from essun.node2.com: cb225a22df77f4f0bfbf7bd73c7d4160
        Apr 25 23:12:01 [2806] essun.node3.com        cib:     info: cib_process_replace:   Replaced 0.4.1 with 0.4.1 from essun.node2.com
        Apr 25 23:12:01 [2806] essun.node3.com        cib:     info: cib_process_request:   Completed cib_replace operation for section 'all': OK (rc=0, origin=essun.node2.com/crmd/24, version=0.4.1)
        Apr 25 23:12:01 [2806] essun.node3.com        cib:     info: cib_process_request:   Forwarding cib_delete operation for section //node_state[@uname='essun.node3.com']/transient_attributes to master (origin=local/crmd/9)
        Apr 25 23:12:01 [2811] essun.node3.com       crmd:     info: do_log:    FSA: Input I_NOT_DC from do_cl_join_finalize_respond() received in state S_PENDING
        Apr 25 23:12:01 [2811] essun.node3.com       crmd:   notice: do_state_transition:   State transition S_PENDING -> S_NOT_DC [ input=I_NOT_DC cause=C_HA_MESSAGE origin=do_cl_join_finalize_respond ]
        Apr 25 23:12:01 [2809] essun.node3.com      attrd:   notice: attrd_local_callback:  Sending full refresh (origin=crmd)
        Apr 25 23:12:01 [2806] essun.node3.com        cib:     info: write_cib_contents:    Wrote version 0.3.0 of the CIB to disk (digest: 02ededba58f5938f53dd45f5bd06f577)
        Apr 25 23:12:01 [2806] essun.node3.com        cib:     info: cib_process_request:   Completed cib_apply_diff operation for section nodes: OK (rc=0, origin=essun.node2.com/crmd/26, version=0.5.1)
        Apr 25 23:12:01 [2807] essun.node3.com stonith-ng:     info: cib_process_diff:  Diff 0.4.1 -> 0.5.1 from local not applied to 0.3.1: current "epoch" is less than required
        Apr 25 23:12:01 [2807] essun.node3.com stonith-ng:   notice: update_cib_cache_cb:   [cib_diff_notify] Patch aborted: Application of an update diff failed, requesting a full refresh (-207)
        Apr 25 23:12:01 [2806] essun.node3.com        cib:     info: cib_process_request:   Completed cib_apply_diff operation for section status: OK (rc=0, origin=essun.node2.com/crmd/29, version=0.5.2)
        Apr 25 23:12:01 [2806] essun.node3.com        cib:     info: cib_process_request:   Completed cib_apply_diff operation for section status: OK (rc=0, origin=essun.node2.com/crmd/31, version=0.5.3)
        Apr 25 23:12:01 [2806] essun.node3.com        cib:     info: cib_process_request:   Completed cib_query operation for section 'all': OK (rc=0, origin=local/crmd/4, version=0.5.3)
        Apr 25 23:12:01 [2807] essun.node3.com stonith-ng:     info: cib_process_diff:  Diff 0.5.1 -> 0.5.2 from local not applied to 0.5.3: current "num_updates" is greater than required
        Apr 25 23:12:01 [2807] essun.node3.com stonith-ng:   notice: update_cib_cache_cb:   [cib_diff_notify] Patch aborted: Application of an update diff failed (-206)
        Apr 25 23:12:01 [2806] essun.node3.com        cib:     info: cib_process_request:   Completed cib_query operation for section 'all': OK (rc=0, origin=local/crmd/5, version=0.5.3)
        Apr 25 23:12:01 [2807] essun.node3.com stonith-ng:     info: cib_process_diff:  Diff 0.5.2 -> 0.5.3 from local not applied to 0.5.3: current "num_updates" is greater than required
        Apr 25 23:12:01 [2807] essun.node3.com stonith-ng:   notice: update_cib_cache_cb:   [cib_diff_notify] Patch aborted: Application of an update diff failed (-206)
        Apr 25 23:12:01 [2806] essun.node3.com        cib:     info: cib_process_request:   Completed cib_query operation for section 'all': OK (rc=0, origin=local/crmd/6, version=0.5.3)
        Apr 25 23:12:01 [2806] essun.node3.com        cib:     info: cib_process_request:   Completed cib_apply_diff operation for section cib: OK (rc=0, origin=essun.node2.com/crmd/34, version=0.5.4)
        Apr 25 23:12:02 [2809] essun.node3.com      attrd:   notice: attrd_trigger_update:  Sending flush op to all hosts for: probe_complete (true)
        Apr 25 23:12:02 [2806] essun.node3.com        cib:     info: cib_process_request:   Completed cib_query operation for section //cib/status//node_state[@id='essun.node3.com']//transient_attributes//nvpair[@name='probe_complete']: No such device or address (rc=-6, origin=local/attrd/2, version=0.5.4)
        Apr 25 23:12:02 [2806] essun.node3.com        cib:     info: cib_process_request:   Completed cib_query operation for section /cib: OK (rc=0, origin=local/attrd/3, version=0.5.4)
        Apr 25 23:12:02 [2809] essun.node3.com      attrd:   notice: attrd_perform_update:  Sent update 4: probe_complete=true
        Apr 25 23:12:02 [2806] essun.node3.com        cib:     info: cib_process_request:   Completed cib_apply_diff operation for section status: OK (rc=0, origin=essun.node2.com/attrd/4, version=0.5.5)
        Apr 25 23:12:02 [2806] essun.node3.com        cib:     info: cib_process_request:   Forwarding cib_modify operation for section status to master (origin=local/attrd/4)
        Apr 25 23:12:02 [2806] essun.node3.com        cib:     info: cib_process_request:   Completed cib_query operation for section //cib/status//node_state[@id='essun.node3.com']//transient_attributes//nvpair[@name='probe_complete']: No such device or address (rc=-6, origin=local/attrd/5, version=0.5.5)
        Apr 25 23:12:02 [2806] essun.node3.com        cib:     info: cib_process_request:   Completed cib_query operation for section /cib: OK (rc=0, origin=local/attrd/6, version=0.5.5)
        Apr 25 23:12:02 [2809] essun.node3.com      attrd:   notice: attrd_perform_update:  Sent update 7: probe_complete=true
        Apr 25 23:12:02 [2806] essun.node3.com        cib:     info: cib_process_request:   Forwarding cib_modify operation for section status to master (origin=local/attrd/7)
        Apr 25 23:12:02 [2806] essun.node3.com        cib:     info: retrieveCib:   Reading cluster configuration from: /var/lib/pacemaker/cib/cib.dnz3rc (digest: /var/lib/pacemaker/cib/cib.dOgpug)
        Apr 25 23:12:02 [2806] essun.node3.com        cib:     info: cib_process_request:   Completed cib_apply_diff operation for section status: OK (rc=0, origin=essun.node2.com/attrd/4, version=0.5.6)
        Apr 25 23:12:02 [2806] essun.node3.com        cib:     info: write_cib_contents:    Archived previous version as /var/lib/pacemaker/cib/cib-2.raw
        Apr 25 23:12:02 [2806] essun.node3.com        cib:     info: write_cib_contents:    Wrote version 0.5.0 of the CIB to disk (digest: 420e9390e2cb813eebbdf3bb73416dd2)
        Apr 25 23:12:02 [2806] essun.node3.com        cib:     info: retrieveCib:   Reading cluster configuration from: /var/lib/pacemaker/cib/cib.kgClFd (digest: /var/lib/pacemaker/cib/cib.gQtyTi)
        Apr 25 23:12:14 [2806] essun.node3.com        cib:     info: crm_client_new:    Connecting 0x1d8dc80 for uid=0 gid=0 pid=2828 id=2dfaa45a-28c4-4c7e-9613-603fb1217e12
        Apr 25 23:12:14 [2806] essun.node3.com        cib:     info: cib_process_request:   Completed cib_query operation for section 'all': OK (rc=0, origin=local/cibadmin/2, version=0.5.6)
        Apr 25 23:12:14 [2806] essun.node3.com        cib:     info: crm_client_destroy:    Destroying 0 events
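Reading forty lines by eye on every node gets tedious, so a small severity tally can help. The sketch below assumes the log line layout shown above, where the severity (info, notice, and so on) follows the daemon name. The function name log_severity_counts is made up for this illustration.

```shell
#!/bin/sh
# Count log lines per severity in a pacemaker/corosync log file.
log_severity_counts() {
    logfile=$1
    for level in error warning notice info; do
        # grep -c prints the number of matching lines (0 when none match).
        count=$(grep -c -E "[[:space:]]${level}:[[:space:]]" "$logfile")
        echo "$level=$count"
    done
}
```

On a node you would call it as log_severity_counts /var/log/cluster/corosync.log. A non-zero error count is the cue to read the log in detail.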

        Once corosync is running normally, crm status shows the current cluster node information:

        [root@essun corosync]# crm status
        Last updated: Fri Apr 25 23:18:11 2014
        Last change: Fri Apr 25 23:12:01 2014 via crmd on essun.node2.com
        Stack: classic openais (with plugin)
        Current DC: essun.node2.com - partition with quorum
        Version: 1.1.10-14.el6_5.3-368c726
        2 Nodes configured, 2 expected votes
        0 Resources configured
        Online: [ essun.node2.com essun.node3.com ]

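        Two nodes are currently online: node2 and node3.

IV. pacemaker resource manager (CRM) command reference

        1. crm has two working modes:

           • batch mode: type the whole command on the shell command line (as with crm status above);

           • interactive mode: enter crmsh (the prompt becomes crm(live)#) and run commands interactively.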
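Batch mode is what makes crm usable from scripts. Here is a minimal sketch that chains three read-only subcommands, all of which appear in the help output in this article. The helper name cluster_overview and the CRM variable are conventions of this sketch, not crmsh features.

```shell
#!/bin/sh
# CRM is the command used to reach crmsh. Setting CRM="echo crm" before
# calling the function prints the commands instead of running them.
CRM=${CRM:-crm}

# Hypothetical helper: run three read-only subcommands in batch mode.
cluster_overview() {
    $CRM status            # same as typing "status" at crm(live)#
    $CRM configure show    # same as "show" at crm(live)configure#
    $CRM node show         # same as "show" at crm(live)node#
}
```

Each batch-mode call is the interactive path flattened onto one command line: the subcommand levels become arguments.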

        2. crm commands

        Top-level subcommands

        [root@essun corosync]# crm
        crm(live)# help
        This is crm shell, a Pacemaker command line interface.
        Available commands:
            cib              manage shadow CIBs # CIB sandbox
            resource         resources management # manage the state of resources (start, stop, migrate, ...)
            configure        CRM cluster configuration # edit the cluster configuration
            node             nodes management # cluster node management
            options          user preferences # user preference settings
            history          CRM cluster history
            site             Geo-cluster support
            ra               resource agents information center # resource agent subcommand (everything related to resource agents lives here)
            status           show cluster status # show the current cluster status
            help,?           show help (help topics for list of topics) # list the commands available at the current level
            end,cd,up        go back one level # go back up one level
            quit,bye,exit    exit the program # leave crm(live) interactive mode
        • resource subcommand

          • All resource states are controlled here

            crm(live)resource# help
            Available commands:
                    status           show status of resources # show resource status
                    start            start a resource # start a resource
                    stop             stop a resource # stop a resource
                    restart          restart a resource # restart a resource
                    promote          promote a master-slave resource # promote a master-slave resource
                    demote           demote a master-slave resource # demote a master-slave resource
                    manage           put a resource into managed mode
                    unmanage         put a resource into unmanaged mode
                    migrate          migrate a resource to another node # move a resource to another node
                    unmigrate        unmigrate a resource to another node
                    param            manage a parameter of a resource # manage resource parameters
                    secret           manage sensitive parameters # manage sensitive parameters
                    meta             manage a meta attribute # manage meta attributes
                    utilization      manage a utilization attribute
                    failcount        manage failcounts # manage failure counters
                    cleanup          cleanup resource status # clean up resource status
                    refresh          refresh CIB from the LRM status # refresh the CIB (Cluster Information Base) from the LRM (Local Resource Manager)
                    reprobe          probe for resources not started by the CRM # probe for resources that were not started by the CRM
                    trace            start RA tracing # enable resource agent (RA) tracing
                    untrace          stop RA tracing # disable resource agent (RA) tracing
                    help             show help (help topics for list of topics) # show help
                    end              go back one level # go back one level (to crm(live)#)
                    quit             exit the program # leave the interactive program
            • configure subcommand

              • All resources are defined under this subcommand

                crm(live)configure# help
                Available commands:
                        node             define a cluster node # define a cluster node
                        primitive        define a resource # define a resource
                        monitor          add monitor operation to a primitive # add a monitor operation to a resource (e.g. timeout, action on failure)
                        group            define a group # define a group (bundles several resources together)
                        clone            define a clone # define a clone (you can set the total number of clones and how many may run on each node)
                        ms               define a master-slave resource # define a master-slave resource (by default only one node runs the master; the slaves act as standbys)
                        rsc_template     define a resource template # define a resource template
                        location         a location preference # define a location constraint (which node a resource prefers; it runs on the node with the higher preference score)
                        colocation       colocate resources # colocation constraint (how strongly resources should run together)
                        order            order resources # the order in which resources start
                        rsc_ticket       resources ticket dependency
                        property         set a cluster property # set a cluster property
                        rsc_defaults     set resource defaults # set resource defaults (e.g. stickiness)
                        fencing_topology node fencing order # node fencing order
                        role             define role access rights # define role access rights
                        user             define user access rights # define user access rights
                        op_defaults      set resource operations defaults # set defaults for resource operations
                        schema           set or display current CIB RNG schema
                        show             display CIB objects # display CIB objects
                        edit             edit CIB objects # edit CIB objects (in a vim-style editor)
                        filter           filter CIB objects # filter CIB objects
                        delete           delete CIB objects # delete CIB objects
                        default-timeouts set timeouts for operations to minimums from the meta-data
                        rename           rename a CIB object # rename a CIB object
                        modgroup         modify group # modify a resource group
                        refresh          refresh from CIB # re-read the CIB
                        erase            erase the CIB # erase the CIB
                        ptest            show cluster actions if changes were committed
                        rsctest          test resources as currently configured
                        cib              CIB shadow management
                        cibstatus        CIB status management and editing
                        template         edit and import a configuration from a template
                        commit           commit the changes to the CIB # write the pending changes to the CIB
                        verify           verify the CIB with crm_verify # validate the CIB
                        upgrade          upgrade the CIB to version 1.0
                        save             save the CIB to a file # export the current CIB to a file (saved in the directory you were in before entering crm)
                        load             import the CIB from a file # load the CIB from a file
                        graph            generate a directed graph
                        xml              raw xml
                        help             show help (help topics for list of topics) # show help
                        end              go back one level # go back one level (to crm(live)#)
                        quit             exit the program  # leave crm interactive mode
                • node subcommand

                  • Node management and status commands

                    crm(live)resource# cd ..
                    crm(live)# node
                    crm(live)node# help
                    Node management and status commands.
                    Available commands:
                        status           show nodes status as XML # show node status as XML
                        show             show node # show node status in plain-text form
                        standby          put node into standby # simulate the given node going offline (standby must be followed by the node name, here the FQDN)
                        online           set node online # bring the node back online
                        maintenance      put node into maintenance mode
                        ready            put node into ready mode
                        fence            fence node # fence a node
                        clearstate       Clear node state # clear node state information
                        delete           delete node # delete a node
                        attribute        manage attributes
                        utilization      manage utilization attributes
                        status-attr      manage status attributes
                        help             show help (help topics for list of topics)
                        end              go back one level
                        quit             exit the program
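                    • ra subcommand

                      • Resource agent classes are all found here

                        crm(live)node# cd ..
                        crm(live)# ra
                        crm(live)ra# help
                        Available commands:
                                classes          list classes and providers # list resource agent classes and providers
                                list             list RA for a class (and provider) # list the resource agents available in a class
                                meta             show meta data for a RA # show the parameters a resource agent accepts (e.g. meta ocf:heartbeat:IPaddr2)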
                                providers        show providers for a RA and a class
                                help             show help (help topics for list of topics)
                                end              go back one level
                                quit             exit the program

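                        Note:

                        The words used in these commands are simple, but I have still annotated the ones I use most often. Having just learned them, I remember them clearly now, but if one of them turns into a blind spot some day, that would be painful. (Never overestimate your memory; it will fool you when you least expect it!)

V. Worked example

                        Note:

                        • Prerequisites for configuring high availability

                          • time synchronization

                          • passwordless login

                          • host name resolution

                            This section only demonstrates how the commands are used; it is not a production configuration.

                            1. Local environment

                            System:

                            CentOS 6.5 x86_64

                            Nodes:

                            essun.node2.com 192.168.1.111

                            essun.node3.com 192.168.1.108

                            Software and resources required on each node

                            One virtual IP: 192.168.1.100

                            Install the httpd service on both nodes and add a default test page; once it has been tested, disable it from starting at boot, because the cluster will start and stop it.

                            Mount the NFS resource; the host providing NFS is 192.168.1.110

                            2. Defining resources

                            • Disable stonith-enabled (if you are not sure which parameters exist, press Tab twice for command completion; cd .. returns to the parent level)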

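                              crm(live)configure# property stonith-enabled=false # (disable STONITH: assume a failed node is already safely powered off instead of fencing it)
                              crm(live)configure# verify # (no output means the configuration is valid)
                              crm(live)configure# commit # (the change can now be committed)
                              crm(live)configure# show  # (display the committed, active configuration)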
                              node essun.node2.com
                              node essun.node3.com
                              primitive webip ocf:heartbeat:IPaddr \
                                  params ip="192.168.1.100"
                              property $id="cib-bootstrap-options" \
                                  dc-version="1.1.10-14.el6_5.3-368c726" \
                                  cluster-infrastructure="classic openais (with plugin)" \
                                  expected-quorum-votes="2" \
                                  stonith-enabled="false"
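                              • Ignore the quorum rule (in a two-node cluster the surviving node cannot hold a majority on its own, so without this setting it would stop its resources)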

                                crm(live)configure# property no-quorum-policy=ignore
                                crm(live)configure# verify
                                crm(live)configure# commit
                                • Define a virtual IP

                                  crm(live)configure# primitive webip ocf:heartbeat:IPaddr params ip=192.168.1.100
                                  crm(live)configure# verify
                                  crm(live)configure# commit
                                  crm(live)configure# show
                                  node essun.node2.com
                                  node essun.node3.com
                                  primitive webip ocf:heartbeat:IPaddr \
                                      params ip="192.168.1.100"
                                  property $id="cib-bootstrap-options" \
                                      dc-version="1.1.10-14.el6_5.3-368c726" \
                                      cluster-infrastructure="classic openais (with plugin)" \
                                      expected-quorum-votes="2" \
                                      stonith-enabled="false"
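                                  • Notes: the statement above has four parts:

                                    1. primitive: the command that defines a resource;

                                    2. webip: the name given to the resource;

                                    3. ocf:heartbeat:IPaddr: the resource agent to use, written as class:provider:agent (in crm ra, list followed by one of the four RA classes shows which agents are available and who provides them);

                                    4. params: introduces the resource's parameters; ip is the parameter name.

                                    Define a filesystem mount

                                    First go into ra and look up the resource agent used for filesystems: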

                                    crm(live)configure ra# classes
                                    lsb
                                    ocf / heartbeat pacemaker
                                    service
                                    stonith
                                    crm(live)configure ra# list ocf
                                    CTDB             ClusterMon       Delay            Dummy            Filesystem       HealthCPU
                                    HealthSMART      IPaddr           IPaddr2          IPsrcaddr        LVM              MailTo
                                    Route            SendArp          Squid            Stateful         SysInfo          SystemHealth
                                    VirtualDomain    Xinetd           apache           conntrackd       controld         dhcpd
                                    ethmonitor       exportfs         mysql            named            nfsserver        pgsql
                                    ping             pingd            postfix          remote           rsyncd           symlink
                                    crm(live)configure ra# providers Filesystem
                                    heartbeat

                                    So the Filesystem resource agent is provided by ocf:heartbeat.

                                    Show the parameters this resource agent accepts:

                                    crm(live)configure ra# meta ocf:heartbeat:Filesystem
                                    Manages filesystem mounts (ocf:heartbeat:Filesystem)
                                    Resource script for Filesystem. It manages a Filesystem on a
                                    shared storage medium.
                                    The standard monitor operation of depth 0 (also known as probe)
                                    checks if the filesystem is mounted. If you want deeper tests,
                                    set OCF_CHECK_LEVEL to one of the following values:
                                    10: read first 16 blocks of the device (raw read)
                                    This doesn't exercise the filesystem at all, but the device on
                                    which the filesystem lives. This is noop for non-block devices
                                    such as NFS, SMBFS, or bind mounts.
                                    20: test if a status file can be written and read
                                    The status file must be writable by root. This is not always the
                                    case with an NFS mount, as NFS exports usually have the
                                    "root_squash" option set. In such a setup, you must either use
                                    read-only monitoring (depth=10), export with "no_root_squash" on
                                    your NFS server, or grant world write permissions on the
                                    directory where the status file is to be placed.
                                    Parameters (* denotes required, [] the default):
                                    device* (string): block device
                                        The name of block device for the filesystem, or -U, -L options for mount, or NFS mount specification.
                                    directory* (string): mount point
                                        The mount point for the filesystem.
                                    fstype* (string): filesystem type
                                        The type of filesystem to be mounted.
                                    ........... (remaining output omitted) .......

                                    Parameters marked with * are required. Now we can define the resource:

                                    crm(live)configure# primitive webnfs ocf:heartbeat:Filesystem params device="192.168.1.110:/share" directory="/var/www/html" fstype="nfs"  op monitor interval=60s timeout=60s op start timeout=60s op stop timeout=60s
                                    crm(live)configure# verify
                                    crm(live)configure# commit
                                    crm(live)configure# show
                                    node essun.node2.com
                                    node essun.node3.com
                                    primitive webip ocf:heartbeat:IPaddr \
                                        params ip="192.168.1.100"
                                    primitive webnfs ocf:heartbeat:Filesystem \
                                        params device="192.168.1.110:/share" directory="/var/www/html" fstype="nfs" \
                                        op monitor interval="60s" timeout="60s" \
                                        op start timeout="60s" interval="0" \
                                        op stop timeout="60s" interval="0"
                                    property $id="cib-bootstrap-options" \
                                        dc-version="1.1.10-14.el6_5.3-368c726" \
                                        cluster-infrastructure="classic openais (with plugin)" \
                                        expected-quorum-votes="2" \
                                        stonith-enabled="false"

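                                    Notes:

                                    primitive # the command that defines a resource

                                    webnfs # resource ID

                                    ocf:heartbeat:Filesystem # resource agent (RA)

                                    params device="192.168.1.110:/share" # the exported NFS directory

                                    directory="/var/www/html" # mount point

                                    fstype="nfs" # filesystem type

                                    op monitor # monitor the webnfs resource

                                    interval=60s # monitoring interval

                                    timeout=60s # monitoring timeout

                                    op start timeout=60s # start timeout

                                    op stop timeout=60s # stop timeout

                                    Define the web service resource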

                                    crm(live)configure# primitive webserver lsb:httpd
                                    crm(live)configure# verify
                                    crm(live)configure# commit
                                    crm(live)configure# show
                                    node essun.node2.com
                                    node essun.node3.com
                                    primitive webip ocf:heartbeat:IPaddr \
                                        params ip="192.168.1.100"
                                    primitive webnfs ocf:heartbeat:Filesystem \
                                        params device="192.168.1.110:/share" directory="/var/www/html" fstype="nfs" \
                                        op monitor interval="60s" timeout="60s" \
                                        op start timeout="60s" interval="0" \
                                        op stop timeout="60s" interval="0"
                                    primitive webserver lsb:httpd
                                    property $id="cib-bootstrap-options" \
                                        dc-version="1.1.10-14.el6_5.3-368c726" \
                                        cluster-infrastructure="classic openais (with plugin)" \
                                        expected-quorum-votes="2" \
                                        stonith-enabled="false"

                                    Group the resources together (so that they always run together on the same node)

                                    crm(live)configure# group webservice webip webnfs webserver
                                    crm(live)configure# verify
                                    crm(live)configure# commit
                                    crm(live)configure# show
                                    node essun.node2.com
                                    node essun.node3.com
                                    primitive webip ocf:heartbeat:IPaddr \
                                        params ip="192.168.1.100"
                                    primitive webnfs ocf:heartbeat:Filesystem \
                                        params device="192.168.1.110:/share" directory="/var/www/html" fstype="nfs" \
                                        op monitor interval="60s" timeout="60s" \
                                        op start timeout="60s" interval="0" \
                                        op stop timeout="60s" interval="0"
                                    primitive webserver lsb:httpd
                                    group webservice webip webnfs webserver
                                    property $id="cib-bootstrap-options" \
                                        dc-version="1.1.10-14.el6_5.3-368c726" \
                                        cluster-infrastructure="classic openais (with plugin)" \
                                        expected-quorum-votes="2" \
                                        stonith-enabled="false"
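A group keeps its members on the same node and starts them in the listed order. As a rough sketch of what that implies, the same placement and ordering could be spelled out with explicit colocation and order constraints instead of the group. The constraint IDs and the helper name explicit_constraints are made up for this illustration, and the CRM variable is only a dry-run convention of this sketch. The exact constraint syntax accepted may differ between crmsh versions, so treat the lines as an assumption to check against your own crm help output.

```shell
#!/bin/sh
# Set CRM="echo crm" to print the commands instead of running them.
CRM=${CRM:-crm}

# Hypothetical alternative to "group webservice webip webnfs webserver".
explicit_constraints() {
    # Placement: webnfs follows webip, webserver follows webnfs.
    $CRM configure colocation webnfs_with_webip inf: webnfs webip
    $CRM configure colocation webserver_with_webnfs inf: webserver webnfs
    # Start order: webip, then webnfs, then webserver.
    $CRM configure order webip_before_webnfs Mandatory: webip webnfs
    $CRM configure order webnfs_before_webserver Mandatory: webnfs webserver
}
```

For three resources that always travel together the group is the shorter form. Explicit constraints become useful once the relationships stop being a simple chain.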

                                    Look at the active resources another way:

                                    crm(live)configure# cd ..
                                    crm(live)# status
                                    Last updated: Sat Apr 26 01:51:45 2014
                                    Last change: Sat Apr 26 01:49:54 2014 via cibadmin on essun.node3.com
                                    Stack: classic openais (with plugin)
                                    Current DC: essun.node2.com - partition with quorum
                                    Version: 1.1.10-14.el6_5.3-368c726
                                    2 Nodes configured, 2 expected votes
                                    3 Resources configured
                                    Online: [ essun.node2.com essun.node3.com ]
                                     Resource Group: webservice
                                         webip  (ocf::heartbeat:IPaddr):    Started essun.node2.com
                                         webnfs (ocf::heartbeat:Filesystem):    Started essun.node2.com
                                         webserver  (lsb:httpd):    Started essun.node2.com

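                                    The output above shows that all resources are running on node2, whose IP address is 192.168.1.111. Access it with curl to see the result. The test page on the NFS share contains the text 来自于NFS文件系统 ("served from the NFS file system").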

                                    [root@bogon share]# ifconfig eth0
                                    eth0      Link encap:Ethernet  HWaddr 00:0C:29:63:4A:25 
                                              inet addr:192.168.1.110  Bcast:255.255.255.255  Mask:255.255.255.0
                                              inet6 addr: fe80::20c:29ff:fe63:4a25/64 Scope:Link
                                              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
                                              RX packets:2747 errors:0 dropped:0 overruns:0 frame:0
                                              TX packets:1161 errors:0 dropped:0 overruns:0 carrier:0
                                              collisions:0 txqueuelen:1000
                                              RX bytes:212090 (207.1 KiB)  TX bytes:99626 (97.2 KiB)
                                    [root@bogon share]# curl http://192.168.1.111
                                    来自于NFS文件系统

                                    Now simulate a failure of node2 by putting it into standby, and see whether the resources move

                                    crm(live)node# standby essun.node2.com
                                    crm(live)# status
                                    Last updated: Sat Apr 26 02:05:24 2014
                                    Last change: Sat Apr 26 02:04:17 2014 via crm_attribute on essun.node3.com
                                    Stack: classic openais (with plugin)
                                    Current DC: essun.node2.com - partition with quorum
                                    Version: 1.1.10-14.el6_5.3-368c726
                                    2 Nodes configured, 2 expected votes
                                    3 Resources configured
                                    Node essun.node2.com: standby
                                    Online: [ essun.node3.com ]
                                     Resource Group: webservice
                                         webip  (ocf::heartbeat:IPaddr):    Started essun.node3.com
                                         webnfs (ocf::heartbeat:Filesystem):    Started essun.node3.com
                                         webserver  (lsb:httpd):    Started essun.node3.com

                                    Run curl again

                                    [root@bogon share]# curl http://192.168.1.111
                                    curl: (7) couldn't connect to host
                                    [root@bogon share]# curl http://192.168.1.100
                                    来自于NFS文件系统
                                    [root@bogon share]# curl http://192.168.1.108
                                    来自于NFS文件系统

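                                    Notes:

                                    The first curl shows that the httpd service is no longer running on node2.

                                    The second curl shows that the page on the mounted share is still reachable through the VIP, so the service did not stop when node2 went offline.

                                    The third curl shows that the service is also reachable through node3's own IP, which confirms that the service is running on node3.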
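The curl checks infer the active node from which addresses answer. As a cross-check, the node can also be read straight from the status text. The sketch below is a hypothetical helper (the name `resource_node` is not part of crmsh) and assumes the status lines keep the layout shown in the transcripts of this article.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the node on which a resource is reported as Started.
# Reads status text on stdin; the line layout is assumed to match the
# transcripts in this article ("name (agent): Started node").
resource_node() {
    local name="$1"
    # NF >= 4 guards against blank or short lines before fields are inspected.
    awk -v rsc="$name" 'NF >= 4 && $1 == rsc && $(NF-1) == "Started" { print $NF }'
}

# Sample lines copied from the status output taken after node2 went to standby.
sample_status='Online: [ essun.node3.com ]
 Resource Group: webservice
     webip  (ocf::heartbeat:IPaddr):    Started essun.node3.com
     webnfs (ocf::heartbeat:Filesystem):    Started essun.node3.com
     webserver  (lsb:httpd):    Started essun.node3.com'

printf '%s\n' "$sample_status" | resource_node webserver
```

On a live cluster the same function could presumably be fed from `crm status` instead of the sample text, provided the output format matches.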

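                                    At this point, bringing node2 back online does not move the service back to node2. To have the service fail back once node2 returns, use a location constraint to give node2 a higher score.

                                    Next, constrain the resources the second way, with constraints instead of a group. First delete the group definition: run edit at the crm(live)configure# prompt to edit the CIB and remove the group entry.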

                                    crm(live)node# online essun.node2.com
                                    crm(live)# status
                                    Last updated: Sat Apr 26 02:20:13 2014
                                    Last change: Sat Apr 26 02:19:29 2014 via crm_attribute on essun.node2.com
                                    Stack: classic openais (with plugin)
                                    Current DC: essun.node2.com - partition with quorum
                                    Version: 1.1.10-14.el6_5.3-368c726
                                    2 Nodes configured, 2 expected votes
                                    3 Resources configured
                                    Online: [ essun.node2.com essun.node3.com ]
                                     Resource Group: webservice
                                         webip  (ocf::heartbeat:IPaddr):    Started essun.node3.com
                                         webnfs (ocf::heartbeat:Filesystem):    Started essun.node3.com
                                         webserver  (lsb:httpd):    Started essun.node3.com

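                                    As expected, the service did not move back. Here is how to bring it back.

                                    Step 1: delete the group definition. The edit command is the most convenient way, but the following commands work as well.

                                    crm(live)resource# stop webservice # the group name
                                    crm(live)configure# delete webservice # delete the group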
                                    crm(live)configure# verify
                                    crm(live)configure# commit
                                    crm(live)configure# show
                                    node essun.node2.com \
                                        attributes standby="off"
                                    node essun.node3.com
                                    primitive webip ocf:heartbeat:IPaddr \
                                        params ip="192.168.1.100"
                                    primitive webnfs ocf:heartbeat:Filesystem \
                                        params device="192.168.1.110:/share" directory="/var/www/html" fstype="nfs" \
                                        op monitor interval="60s" timeout="60s" \
                                        op start timeout="60s" interval="0" \
                                        op stop timeout="60s" interval="0"
                                    primitive webserver lsb:httpd
                                    property $id="cib-bootstrap-options" \
                                        dc-version="1.1.10-14.el6_5.3-368c726" \
                                        cluster-infrastructure="classic openais (with plugin)" \
                                        expected-quorum-votes="2" \
                                        stonith-enabled="false" \
                                        no-quorum-policy="ignore" \
                                        last-lrm-refresh="1398450597"

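                                    The group definition is now gone, so the "plan" can proceed.

                                    Define a colocation constraint (how strongly the resources should be kept together)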

                                    crm(live)configure# colocation webserver-with-webnfs-webip inf: webip webnfs webserver
                                    crm(live)configure# verify
                                    crm(live)configure# commit
                                    crm(live)configure# show
                                    node essun.node2.com \
                                        attributes standby="off"
                                    node essun.node3.com
                                    primitive webip ocf:heartbeat:IPaddr \
                                        params ip="192.168.1.100"
                                    primitive webnfs ocf:heartbeat:Filesystem \
                                        params device="192.168.1.110:/share" directory="/var/www/html" fstype="nfs" \
                                        op monitor interval="60s" timeout="60s" \
                                        op start timeout="60s" interval="0" \
                                        op stop timeout="60s" interval="0"
                                    primitive webserver lsb:httpd
                                    colocation webserver-with-webnfs-webip inf: webip webnfs webserver
                                    property $id="cib-bootstrap-options" \
                                        dc-version="1.1.10-14.el6_5.3-368c726" \
                                        cluster-infrastructure="classic openais (with plugin)" \
                                        expected-quorum-votes="2" \
                                        stonith-enabled="false" \
                                        no-quorum-policy="ignore" \
                                        last-lrm-refresh="1398450597"

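                                    Notes:

                                    colocation: the colocation constraint command

                                    webserver-with-webnfs-webip: the constraint name (ID)

                                    inf: the score, that is, how strongly the resources are tied together; inf means they are always kept together, and a numeric value may be used instead

                                    webip webnfs webserver: the resource names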
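To make the difference between `inf` and a numeric score concrete, here is a minimal sketch of how Pacemaker-style scores combine. It assumes the rules described in the Pacemaker documentation as I understand them: INFINITY is 1,000,000, any value plus INFINITY is INFINITY, and -INFINITY wins over everything else. The function names are hypothetical and exist only for this illustration.

```shell
#!/usr/bin/env bash
# Sketch of Pacemaker-style score addition (an illustration, not Pacemaker code).
# "inf" and "-inf" stand for +INFINITY and -INFINITY, assumed to be 1,000,000.
INF=1000000

to_number() {            # map inf / -inf / integer text to an integer
    case "$1" in
        inf|+inf)  printf '%s\n' "$INF" ;;
        -inf)      printf '%s\n' "-$INF" ;;
        *)         printf '%s\n' "$1" ;;
    esac
}

score_add() {
    local a b sum
    a=$(to_number "$1")
    b=$(to_number "$2")
    # -INFINITY wins over everything, including +INFINITY.
    if [ "$a" -le "-$INF" ] || [ "$b" -le "-$INF" ]; then echo "-inf"; return; fi
    # Otherwise +INFINITY absorbs any finite value.
    if [ "$a" -ge "$INF" ] || [ "$b" -ge "$INF" ]; then echo "inf"; return; fi
    sum=$((a + b))
    # Finite sums are clamped to the infinity bounds.
    if [ "$sum" -ge "$INF" ]; then echo "inf"
    elif [ "$sum" -le "-$INF" ]; then echo "-inf"
    else echo "$sum"
    fi
}

score_add 100 200     # prints 300
score_add 500 inf     # prints inf
score_add inf -inf    # prints -inf
```

Under these assumptions a finite preference such as 500 can never outweigh an `inf` colocation, which is why `inf` behaves as "must run together" rather than "should run together".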

                                    Define the order in which the resources start

                                    crm(live)configure# order ip_before_webnfs_before_webserver mandatory: webip webnfs webserver
                                    crm(live)configure# verify
                                    crm(live)configure# commit
                                    crm(live)configure# show
                                    node essun.node2.com \
                                        attributes standby="off"
                                    node essun.node3.com
                                    primitive webip ocf:heartbeat:IPaddr \
                                        params ip="192.168.1.100"
                                    primitive webnfs ocf:heartbeat:Filesystem \
                                        params device="192.168.1.110:/share" directory="/var/www/html" fstype="nfs" \
                                        op monitor interval="60s" timeout="60s" \
                                        op start timeout="60s" interval="0" \
                                        op stop timeout="60s" interval="0"
                                    primitive webserver lsb:httpd
                                    colocation webserver-with-webnfs-webip inf: webip webnfs webserver
                                    order ip_before_webnfs_before_webserver inf: webip webnfs webserver
                                    property $id="cib-bootstrap-options" \
                                        dc-version="1.1.10-14.el6_5.3-368c726" \
                                        cluster-infrastructure="classic openais (with plugin)" \
                                        expected-quorum-votes="2" \
                                        stonith-enabled="false" \
                                        no-quorum-policy="ignore" \
                                        last-lrm-refresh="1398450597"

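                                    Notes:

                                    order: the order constraint command

                                    ip_before_webnfs_before_webserver: the constraint ID

                                    mandatory: the kind of ordering (three kinds are available: Mandatory, Optional, and Serialize); in the show output above, this crmsh version prints it as inf:

                                    webip webnfs webserver: the resource names; the order in which they are written matters, because the resources are started from left to right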
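The effect of the written order can be sketched as follows. This assumes the default symmetrical behaviour of order constraints, where resources stop in the reverse of their start order; the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: sequences implied by "order ... webip webnfs webserver",
# assuming the constraint is symmetrical (the default).

# Start order is the order in which the resources are written.
start_sequence() { echo "$*"; }

# Stop order is the reverse of the start order.
stop_sequence() {
    local out="" r
    for r in "$@"; do
        out="$r${out:+ $out}"   # prepend each resource to the result
    done
    echo "$out"
}

echo "start: $(start_sequence webip webnfs webserver)"
echo "stop:  $(stop_sequence webip webnfs webserver)"
```

So the IP address comes up first, then the NFS mount, then httpd, and on shutdown httpd is stopped before the filesystem it serves from is unmounted.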

                                    Define a location constraint

                                    crm(live)configure# location webip_and_webnfs_and_webserver webip 500: essun.node2.com
                                    crm(live)configure# verify
                                    crm(live)configure# commit
                                    crm(live)configure# show
                                    node essun.node2.com \
                                        attributes standby="off"
                                    node essun.node3.com
                                    primitive webip ocf:heartbeat:IPaddr \
                                        params ip="192.168.1.100"
                                    primitive webnfs ocf:heartbeat:Filesystem \
                                        params device="192.168.1.110:/share" directory="/var/www/html" fstype="nfs" \
                                        op monitor interval="60s" timeout="60s" \
                                        op start timeout="60s" interval="0" \
                                        op stop timeout="60s" interval="0"
                                    primitive webserver lsb:httpd
                                    location webip_and_webnfs_and_webserver webip 500: essun.node2.com
                                    colocation webserver-with-webnfs-webip inf: webip webnfs webserver
                                    order ip_before_webnfs_before_webserver inf: webip webnfs webserver
                                    property $id="cib-bootstrap-options" \
                                        dc-version="1.1.10-14.el6_5.3-368c726" \
                                        cluster-infrastructure="classic openais (with plugin)" \
                                        expected-quorum-votes="2" \
                                        stonith-enabled="false" \
                                        no-quorum-policy="ignore" \
                                        last-lrm-refresh="1398450597"

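                                    Notes:

                                    location: the location constraint command

                                    webip_and_webnfs_and_webserver: the constraint name

                                    webip 500: essun.node2.com: which resource gets which score on which node (here, webip gets a score of 500 on essun.node2.com)

                                      Define default resource attributes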

                                      crm(live)configure# rsc_defaults resource-stickiness=100
                                      crm(live)configure# verify
                                      crm(live)configure# commit
                                      crm(live)configure# show
                                      node essun.node2.com \
                                          attributes standby="off"
                                      node essun.node3.com
                                      primitive webip ocf:heartbeat:IPaddr \
                                          params ip="192.168.1.100"
                                      primitive webnfs ocf:heartbeat:Filesystem \
                                          params device="192.168.1.110:/share" directory="/var/www/html" fstype="nfs" \
                                          op monitor interval="60s" timeout="60s" \
                                          op start timeout="60s" interval="0" \
                                          op stop timeout="60s" interval="0"
                                      primitive webserver lsb:httpd
                                      location webip_and_webnfs_and_webserver webip 500: essun.node2.com
                                      colocation webserver-with-webnfs-webip inf: webip webnfs webserver
                                      order ip_before_webnfs_before_webserver inf: webip webnfs webserver
                                      property $id="cib-bootstrap-options" \
                                          dc-version="1.1.10-14.el6_5.3-368c726" \
                                          cluster-infrastructure="classic openais (with plugin)" \
                                          expected-quorum-votes="2" \
                                          stonith-enabled="false" \
                                          no-quorum-policy="ignore" \
                                          last-lrm-refresh="1398450597"
                                      rsc_defaults $id="rsc-options" \
                                          resource-stickiness="100"

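                                      Notes:

                                      This sets the default stickiness of every resource in the cluster. Stickiness is the score a resource gets for staying on the node where it is currently running, so it only comes into play when the cluster considers moving the resource elsewhere. Three resources are defined here (webip, webnfs, webserver), each with a stickiness of 100, which adds up to 300. The location constraint defined earlier gives node2 a score of 500. Because 500 is greater than 300, the resources switch back to node2 when it comes online again after going down.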
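The comparison in the note above can be written out as a small calculation. This is a simplified model of the article's own reasoning (total stickiness versus location score), not Pacemaker's actual placement algorithm, and `failback_decision` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Simplified model of the reasoning above, not Pacemaker's real placement logic.
# Arguments: stickiness per resource, number of resources, location score
# of the preferred node.
failback_decision() {
    local stickiness="$1" resources="$2" location_score="$3"
    local stay=$((stickiness * resources))   # total score for staying put
    if [ "$location_score" -gt "$stay" ]; then
        echo "move back (location $location_score > stickiness $stay)"
    else
        echo "stay (location $location_score <= stickiness $stay)"
    fi
}

failback_decision 100 3 500   # the values used in this article
failback_decision 200 3 500   # a higher stickiness would keep the resources in place
```

With the article's values the location score wins, which matches the failback observed in the status output. Raising resource-stickiness above roughly 166 per resource would, under this model, keep the resources on node3.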

                                      Finally, check the status: all resources are running on node2. Then put node2 into standby to simulate a failure.

                                      crm(live)# status
                                      Last updated: Sat Apr 26 03:14:30 2014
                                      Last change: Sat Apr 26 03:14:19 2014 via cibadmin on essun.node3.com
                                      Stack: classic openais (with plugin)
                                      Current DC: essun.node2.com - partition with quorum
                                      Version: 1.1.10-14.el6_5.3-368c726
                                      2 Nodes configured, 2 expected votes
                                      3 Resources configured
                                      Online: [ essun.node2.com essun.node3.com ]
                                       webip  (ocf::heartbeat:IPaddr):    Started essun.node2.com
                                       webnfs (ocf::heartbeat:Filesystem):    Started essun.node2.com
                                       webserver  (lsb:httpd):    Started essun.node2.com
                                      crm(live)# node
                                      crm(live)node# standby essun.node2.com

                                      The resources are now running on node3

                                      crm(live)# status
                                      Last updated: Sat Apr 26 03:18:17 2014
                                      Last change: Sat Apr 26 03:15:20 2014 via crm_attribute on essun.node3.com
                                      Stack: classic openais (with plugin)
                                      Current DC: essun.node2.com - partition with quorum
                                      Version: 1.1.10-14.el6_5.3-368c726
                                      2 Nodes configured, 2 expected votes
                                      3 Resources configured
                                      Node essun.node2.com: standby
                                      Online: [ essun.node3.com ]
                                       webip  (ocf::heartbeat:IPaddr):    Started essun.node3.com
                                       webnfs (ocf::heartbeat:Filesystem):    Started essun.node3.com
                                       webserver  (lsb:httpd):    Started essun.node3.com

                                      Run curl twice more

                                      [root@bogon share]# ifconfig eth0
                                      eth0      Link encap:Ethernet  HWaddr 00:0C:29:63:4A:25 
                                                inet addr:192.168.1.110  Bcast:255.255.255.255  Mask:255.255.255.0
                                                inet6 addr: fe80::20c:29ff:fe63:4a25/64 Scope:Link
                                                UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
                                                RX packets:2747 errors:0 dropped:0 overruns:0 frame:0
                                                TX packets:1161 errors:0 dropped:0 overruns:0 carrier:0
                                                collisions:0 txqueuelen:1000
                                                RX bytes:212090 (207.1 KiB)  TX bytes:99626 (97.2 KiB)
                                      [root@bogon share]# curl http://192.168.1.100
                                      来自于NFS文件系统
                                      [root@bogon share]# curl http://192.168.1.108
                                      来自于NFS文件系统
                                      [root@bogon share]#

                                      Bring node2 back online and see whether the resources return

                                      crm(live)node# online essun.node2.com
                                      crm(live)node# cd ..
                                      crm(live)# status
                                      Last updated: Sat Apr 26 03:21:46 2014
                                      Last change: Sat Apr 26 03:21:36 2014 via crm_attribute on essun.node3.com
                                      Stack: classic openais (with plugin)
                                      Current DC: essun.node2.com - partition with quorum
                                      Version: 1.1.10-14.el6_5.3-368c726
                                      2 Nodes configured, 2 expected votes
                                      3 Resources configured
                                      Online: [ essun.node2.com essun.node3.com ]
                                       webip  (ocf::heartbeat:IPaddr):    Started essun.node2.com
                                       webnfs (ocf::heartbeat:Filesystem):    Started essun.node2.com
                                       webserver  (lsb:httpd):    Started essun.node2.com

                                      Run curl three more times

                                      [root@bogon share]# ifconfig eth0
                                      eth0      Link encap:Ethernet  HWaddr 00:0C:29:63:4A:25 
                                                inet addr:192.168.1.110  Bcast:255.255.255.255  Mask:255.255.255.0
                                                inet6 addr: fe80::20c:29ff:fe63:4a25/64 Scope:Link
                                                UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
                                                RX packets:2747 errors:0 dropped:0 overruns:0 frame:0
                                                TX packets:1161 errors:0 dropped:0 overruns:0 carrier:0
                                                collisions:0 txqueuelen:1000
                                                RX bytes:212090 (207.1 KiB)  TX bytes:99626 (97.2 KiB)
                                      [root@bogon share]# curl http://192.168.1.100
                                      来自于NFS文件系统
                                      [root@bogon share]# curl http://192.168.1.108
                                      curl: (7) couldn't connect to host
                                      [root@bogon share]# curl http://192.168.1.111
                                      来自于NFS文件系统
                                      [root@bogon share]#

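                                      Notes:

                                      192.168.1.100 is the cluster's virtual IP

                                      192.168.1.108 is essun.node3.com

                                      192.168.1.111 is essun.node2.com

                                      This confirms that the resources did move back to node2.

                                      ======================= End of the introduction to common crmsh commands for corosync+pacemaker ===========

                                      PS:

                                      My English is not great, so some of the annotations may be imprecise. Thank you for bearing with me.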
