Heat stuck at bastion #397

@ghost

Description

Hi Everybody,

I am trying to deploy OCP 3.5 (I also tried 3.7) on Red Hat OSP 11.
When I run the Heat templates, the stack is created, all the necessary networks are created, and the bastion is created and goes through the usual cloud-init provisioning steps (adding repos, updating, installing basic packages). cloud-init then sends the finished signal and gets an HTTP 200.

After that, it gets stuck at:

$> openstack stack resource list -n 2 ocp2 | grep -i progress
| bastion_host                    | 98bd1fee-87c3-4360-bd4b-549e39d1345e | file:///Users/myself/projects/openshift-on-openstack/bastion.yaml                                              | CREATE_IN_PROGRESS | 2017-12-21T16:00:41Z | ocp2                                                     |
| deployment_write_templates      | c8be1435-3125-4e06-8234-b620dd556fa8 | OS::Heat::SoftwareDeployment                                                                                        | CREATE_IN_PROGRESS | 2017-12-21T16:01:12Z | ocp2-bastion_host-n4vsl5fz4maw                           |
| deployment_update_node_count    | 79327e5c-579d-4a95-a0b4-e93c52385afd | OS::Heat::SoftwareDeployment                                                                                        | CREATE_IN_PROGRESS | 2017-12-21T16:01:12Z | ocp2-bastion_host-n4vsl5fz4maw                           |
| deployment_tune_ansible         | a705f997-3cf0-44aa-90f1-af21e3a23ca1 | OS::Heat::SoftwareDeployment

If I force the signal with openstack heat resource signal ... it goes on to the next step, but I see that the Ansible template isn't created and the files that are usually pushed aren't present.
The /etc/os-collect-config.conf points to the correct endpoint:

$> cat /etc/os-collect-config.conf
[DEFAULT]
command = os-refresh-config
collectors = ec2
collectors = cfn
collectors = local

[cfn]
metadata_url = https://10.1.3.11:13005/v1/
stack_name = ocp2-bastion_host-n4vsl5fz4maw
secret_access_key = 7e7214750d1a48c9a4cad81010fe2173
access_key_id = 494ab1ed83b441168423aec7d868267c
path = host.Metadata
$> openstack endpoint list | grep heat
| 1b24a4cf65a74e38992c4d8230a6e7da | regionOne | heat-cfn     | cloudformation | True    | internal  | http://172.17.1.16:8000/v1               |
| 2f666c5f3f25445682d8cc6ca51f9488 | regionOne | heat         | orchestration  | True    | admin     | http://172.17.1.16:8004/v1/%(tenant_id)s |
| 557a1fc9ff2549a8bc142bd305ac26bb | regionOne | heat-cfn     | cloudformation | True    | public    | https://10.1.3.11:13005/v1               |
| 622df692e35b424b93cd24f54c577df4 | regionOne | heat         | orchestration  | True    | public    | https://10.1.3.11:13004/v1/%(tenant_id)s |
| da4ed879390b4b6c9d97e114aa011f49 | regionOne | heat         | orchestration  | True    | internal  | http://172.17.1.16:8004/v1/%(tenant_id)s |
| fba19a090ed6437f86513a91e9cdc0ba | regionOne | heat-cfn     | cloudformation | True    | admin     | http://172.17.1.16:8000/v1
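
To compare the two outputs above without copying values by hand, the metadata_url can be pulled out of the [cfn] section and probed from the bastion itself. This is only a minimal sketch, assuming awk and curl are available on the bastion; the function names cfn_metadata_url and probe_url are made up for illustration and are not part of os-collect-config or the openshift-on-openstack templates.

```shell
#!/bin/sh
# Print the metadata_url value from the [cfn] section of an
# os-collect-config style file (INI format, "key = value" lines).
cfn_metadata_url() {
    awk -F ' *= *' '
        /^\[/ { in_cfn = ($0 == "[cfn]") }          # track the current section
        in_cfn && $1 == "metadata_url" { print $2; exit }
    ' "$1"
}

# Probe a URL and print only the HTTP status code.
# curl exits non-zero on connection, timeout, or TLS errors and,
# because of --show-error, prints the reason on stderr.
probe_url() {
    curl --silent --show-error --max-time 10 --output /dev/null \
         --write-out '%{http_code}\n' "$1"
}

# Usage on the bastion (config path taken from the output above):
#   probe_url "$(cfn_metadata_url /etc/os-collect-config.conf)"
```

Any HTTP status code printed means the bastion can reach the endpoint and complete the TLS handshake; an error from curl instead would point at a network or certificate problem between the bastion and that URL.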

After a few hours, it times out and the stack fails.

Does anyone have a clue why?

Thanks a lot for your support
P.

parameters.yaml

parameters:
  ssh_key_name: myself
  bastion_image: rhel-guest-image-7.2-20160302.0.x86_64
  bastion_flavor: m1.medium
  master_image: rhel-guest-image-7.2-20160302.0.x86_64
  master_flavor: m1.medium
  infra_image: rhel-atomic-cloud-7.2-10.x86_64
  infra_flavor: m1.medium
  node_image: rhel-atomic-cloud-7.2-10.x86_64
  node_flavor: m1.medium
  loadbalancer_image: rhel-atomic-cloud-7.2-10.x86_64
  loadbalancer_flavor: m1.medium
  ocp_version: 3.5
  osp_version: 11

  external_network: internet_access
  container_subnet: 192.168.1.0/24
  loadbalancer_type: neutron

  dns_nameserver: 8.8.4.4,8.8.8.8
  node_count: 2

  rhn_username: ""
  rhn_password: "."
  rhn_pool: ""
  extra_rhn_pools: ""
  deployment_type: openshift-enterprise
  domain_name: "example.com"
  master_hostname: "openshift-master"
  node_hostname: "openshift-node"
  ssh_user: cloud-user
  master_docker_volume_size_gb: 25
  infra_docker_volume_size_gb: 25
  node_docker_volume_size_gb: 25

  system_update: false

resource_registry:
  #OOShift::LoadBalancer: ../openshift-on-openstack/loadbalancer_dedicated.yaml
  OOShift::LoadBalancer: ../openshift-on-openstack/loadbalancer_neutron.yaml
  OOShift::ContainerPort: ../openshift-on-openstack/sdn_openshift_sdn.yaml
  OOShift::IPFailover: ../openshift-on-openstack/ipfailover_keepalived.yaml
  OOShift::DockerVolume: ../openshift-on-openstack/volume_docker.yaml
  OOShift::DockerVolumeAttachment: ../openshift-on-openstack/volume_attachment_docker.yaml
  OOShift::RegistryVolume: ../openshift-on-openstack/registry_ephemeral.yaml
