 This page is work in progress and is not ready for review.

High Level Functional Overview and Design Specifications

Problem Statement

Background

CoprHD supports provisioning of compute resources on Vblock infrastructure. The goal of this project is to:


    • simplify the provisioning of Vblock compute resources through CoprHD for users by
      • reducing the need to depend on element managers for frequently used operations
      • reducing the number of compute virtual pools users need to set up
    • increase the usability of catalog services by improving pre-check validations and error messages
    • improve the reliability of Vblock catalog services

What is the problem, and why is it important?


    • CoprHD does not provide a mechanism to execute certain management operations, such as moving a host from one UCS blade to another.
    • CoprHD does not allow multiple service profile templates to be used with the same compute virtual pool, which forces users to create multiple pools.
    • Vblock support does not have sufficient validations to prevent certain misconfigurations that could lead to order failures.

Addressing the above issues would improve the usability and reliability of the product for Vblock users.


Functional Requirements

JIRA ID | 1-Line Description | Comments, notes, etc.
COP-27294 | Release blade/activate blade service | Epic
COP-31758 | Implement solution for Release blade/activate service | Story
COP-31243 | Leia: vBlock Improvements and Quality Enhancements | Epic
COP-29765 | Ability to provision from same CVP using multiple service profile templates |
COP-29757 | User wants boot volume datastore name to match boot volume name |
COP-28947 | Fail host provision if ServiceProfile name in use |
COP-28922 | Check ServiceProfileTemplate is ok for blades before provisioning |
COP-28744 | As a storage admin, I'd like to get a better error message when the OS fails to install on a UCS blade so I can fix the issue myself rather than open a ticket. | All information we have is already being displayed. So we will reject this?
COP-23466 | Not able to update privileges for the UCS user in ViPR | Should this be changed to a bug?



Design Approach / High Level Implementation Details


    • Hash-out the design and high level implementation ideas
    • Are there any alternative designs one should consider? Is this the optimal one?
    • What are the user visible changes? Will this change break users' scripts or automation code? Will it impact CoprHD QE automation scripts?
    • What are the changes to CoprHD components and component interaction? Are the teams responsible for these other components on board with this change?
    • What are the changes to persistent data? Will it require special consideration during upgrades?
    • Please use consistent and precise nomenclature / terminology.
    • Please include diagrams and illustrations if applicable.


Story 1: Ability to disassociate a host from one compute blade and re-assign to another blade

Requirement:

UCS provides the ability to move a host (a service profile on the UCS) from one physical blade to another. This is used when a blade becomes inoperable or degraded, or needs maintenance or reconfiguration, and the host provisioned on it must be moved to another available blade.

Currently, CoprHD provides no mechanism for this, and users depend on the element manager (UCS Manager) to do it.

Assumption:

The user will select a specific blade to move the host to; i.e., automatic selection of a blade from a CVP will not be needed. If users request this, it can be implemented in the future.

Design:

A new Vblock catalog service will be created. It will have options to:

  • release the blade currently associated with a host (the host will be shut down and the blade will become available)
  • OR associate a blade with a host that is currently shut down and not associated with any blade
  • OR do both of the above, i.e., release the currently associated blade and associate the host with another blade.

The process to release a blade from a host is:

  1. Put the host in maintenance mode (if running VMs are detected and only one host is available, this operation should fail until the VMs are shut down or moved).
  2. Shut down the server/host (or fail and force the user to shut down the OS).
  3. Disassociate the ServiceProfile from the blade in UCS.
  4. Release the blade back to the pool.

The process of associating a blade to a shutdown host is:

  1. Either the user manually or ViPR automatically chooses an available blade to bind the host to.
  2. Associate the host's service profile with the chosen blade.
  3. Boot up the host.
  4. Take the host out of maintenance mode.

OR should we just validate that the host is already in maintenance mode in vCenter and powered down?
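The release and associate sequences above can be sketched as two small functions. This is only an illustrative model, not CoprHD code: the `Host` class, `release_blade`, `associate_blade` and `BladeOperationError` are hypothetical names, the blade pool is a plain set, and "only one host available" is read here as "the cluster has no other host that could take the VMs".

```python
from dataclasses import dataclass
from typing import Optional, Set


class BladeOperationError(Exception):
    """Raised when a precondition of release/associate is not met (hypothetical)."""


@dataclass
class Host:
    name: str
    blade: Optional[str] = None   # blade the host's service profile is bound to
    powered_on: bool = False
    in_maintenance: bool = False
    running_vms: int = 0


def release_blade(host: Host, hosts_in_cluster: int, free_blades: Set[str]) -> str:
    """Release steps 1-4: maintenance mode, shut down, disassociate, return blade."""
    if host.blade is None:
        raise BladeOperationError(f"{host.name} has no blade to release")
    # Step 1: fail if running VMs have no other host to move to.
    if host.running_vms > 0 and hosts_in_cluster <= 1:
        raise BladeOperationError(f"{host.name} has running VMs and no peer host")
    host.in_maintenance = True
    host.running_vms = 0          # assumed: VMs were moved off by maintenance mode
    # Step 2: shut down the host.
    host.powered_on = False
    # Step 3: disassociate the service profile from the blade.
    blade, host.blade = host.blade, None
    # Step 4: release the blade back to the pool.
    free_blades.add(blade)
    return blade


def associate_blade(host: Host, blade: str, free_blades: Set[str]) -> None:
    """Associate steps 1-4 for a host that is shut down and has no blade."""
    if host.blade is not None or host.powered_on:
        raise BladeOperationError(f"{host.name} must be shut down and unbound")
    if blade not in free_blades:
        raise BladeOperationError(f"{blade} is not available")
    free_blades.remove(blade)     # Step 1: the chosen blade leaves the pool.
    host.blade = blade            # Step 2: bind the service profile to it.
    host.powered_on = True        # Step 3: boot the host.
    host.in_maintenance = False   # Step 4: leave maintenance mode.
```

Doing "both of the above" is then a call to `release_blade` followed by `associate_blade` with the newly chosen blade. Because every precondition is checked before any state is changed, a rejected request leaves the host untouched.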

Appropriate locking needs to be implemented so that the HostToComputeElementMatcher that runs as part of UCS, VCenter and ESX discovery does not remove the blade mapping of the host that is being provisioned.
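One way to picture this locking requirement is a per-host lock that both the blade-change operation and the matcher must hold before touching a host's blade mapping. The sketch below is an in-process stand-in only: `HostLocks`, `HostBusy` and `matcher_pass` are hypothetical names, and the real HostToComputeElementMatcher and whatever locking facility CoprHD would use are not modelled.

```python
import threading
from contextlib import contextmanager


class HostBusy(Exception):
    """Raised when another operation already holds the host (hypothetical)."""


class HostLocks:
    """Tracks hosts whose blade mapping is currently being changed."""

    def __init__(self):
        self._guard = threading.Lock()
        self._busy = set()

    @contextmanager
    def hold(self, host_id):
        # Claim the host atomically, or refuse if someone else has it.
        with self._guard:
            if host_id in self._busy:
                raise HostBusy(host_id)
            self._busy.add(host_id)
        try:
            yield
        finally:
            with self._guard:
                self._busy.discard(host_id)


def matcher_pass(blade_by_host, discovered, locks):
    """Stand-in for a discovery-time matcher run.

    Overwrites each host's blade with what discovery reported (None if no
    blade was found), but skips hosts that are held by another operation.
    Returns the list of skipped host ids.
    """
    skipped = []
    for host_id in list(blade_by_host):
        try:
            with locks.hold(host_id):
                blade_by_host[host_id] = discovered.get(host_id)
        except HostBusy:
            skipped.append(host_id)
    return skipped
```

The blade-change workflow would wrap its release/associate steps in `with locks.hold(host_id):`, so a matcher run that starts in the middle leaves that host's mapping alone and picks it up on the next pass.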

<Flow chart to represent the workflow to be added>

Impact on other Catalog services:

  1. No impact on Vblock Provisioning Catalog services - provisioning hosts to new or existing OS or bare metal clusters will not be impacted by another host being shut down and its blade association removed.
  2. No impact on Vblock Decommissioning Catalog services - the Decommission Host API is already designed to handle hosts that have only a service profile and no blade association. So there is no impact on host or cluster decommissioning catalog services.
  3. Impact on non-Vblock Catalog services: Block and vCenter catalog services that export a volume or datastore to a host or cluster need to be validated to determine that they are not affected by a host being shut down and not having a blade association.


Improvement 2: Ability to provision from same compute pool using multiple service profile templates

CoprHD uses UCS service profile templates as the basis of the server and network configuration for the host to be provisioned. The service profile template (SPT) needs to be specified in the compute virtual pool and is used during provisioning.

Problem

Customers often want to use different service profile templates that differ in policies on how disks are configured, in flags to be set on ports, and so on. Since CoprHD allows only one service profile template from a UCS to be selected per compute virtual pool, a user is forced to create multiple compute virtual pools, one per SPT they want to use, and to split blades with similar capabilities across these pools.


Compute Virtual Pool: Service Profile Template Selection





Provision Cluster Catalog UI



Proposed Solution

To maintain backward compatibility and not change anything for current users who do not need this feature, Compute Virtual Pools will continue to have an association to a maximum of one SPT per UCS. However, advanced users who have admin privileges will have the option at the time of provisioning to choose another SPT and thus override the SPT selection in the virtual pool. This will allow multiple SPTs to be used with the same compute virtual pool.

A new field to select a Service Profile Template at the time of provisioning will be added to the following 4 provisioning catalog services:

  1.  Provision OS Cluster
  2.  Add Host To Cluster
  3.  Provision Bare Metal Cluster
  4.  Add bare metal host to cluster

If the new field is populated, the service profile template specified here will override the one specified in the compute virtual pool.

If the Compute Virtual Pool includes blades from multiple UCS systems, the service profile template selection made in this screen will limit the blade selection algorithm to only select blades from the UCS system that this template belongs to.
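The override and the resulting restriction on blade selection can be illustrated with a short sketch. The `Blade` and `Spt` types and the `candidate_blades` function are hypothetical and only model the rule described above; the actual blade selection algorithm is not shown here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Spt:
    name: str
    ucs: str      # UCS system the template belongs to


@dataclass(frozen=True)
class Blade:
    name: str
    ucs: str      # UCS system the blade belongs to


def candidate_blades(
    pool_blades: List[Blade],
    pool_spts: List[Spt],
    override: Optional[Spt] = None,
) -> List[Tuple[Blade, Spt]]:
    """Pair each eligible blade with the template that would be applied to it.

    Without an override, a blade uses the pool's template for its own UCS
    (the pool holds at most one per UCS). With an override, only blades on
    the override template's UCS remain eligible.
    """
    if override is not None:
        return [(b, override) for b in pool_blades if b.ucs == override.ucs]
    by_ucs = {s.ucs: s for s in pool_spts}
    return [(b, by_ucs[b.ucs]) for b in pool_blades if b.ucs in by_ucs]
```

For a pool with blades on two UCS systems, passing an override template that lives on the first system yields only that system's blades, each paired with the override instead of the pool's own template.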



Proposed Provision Cluster Catalog UI Change




Improvement 3: Boot volume datastore name should match boot volume name

When a host is provisioned through Vblock catalog services, CoprHD creates a boot volume from which the host can boot. When the ESX image is installed on this volume, a datastore named "datastore1" is created on this boot volume. The requirement is to change the default datastore name "datastore1" to the name of the boot volume.

<This can be achieved by renaming the datastore in the first boot script.>
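As a rough sketch of the first-boot approach, the helper below renders the single shell line that a first boot script could run. It assumes the rename is done with `vim-cmd hostsvc/datastore/rename`, a command commonly used in ESXi kickstart first-boot sections; this should be verified against the ESX versions CoprHD installs. The function name `firstboot_rename_line` is hypothetical, and how CoprHD actually generates its first boot script is not covered by this page.

```python
import shlex

# Default name given to the datastore on the boot volume, per the text above.
DEFAULT_DATASTORE = "datastore1"


def firstboot_rename_line(boot_volume_name: str) -> str:
    """Return a shell line that renames the default datastore.

    The volume name is shell-quoted so that names containing spaces or
    other special characters are passed as a single argument.
    """
    if not boot_volume_name.strip():
        raise ValueError("boot volume name must not be empty")
    return "vim-cmd hostsvc/datastore/rename {} {}".format(
        DEFAULT_DATASTORE, shlex.quote(boot_volume_name)
    )
```

Any restrictions ESX places on datastore names (length, allowed characters) would still need to be checked against the boot volume naming rules before the line is emitted.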

Improvement 4: Fail host provisioning if service profile name is already in use.

When a host is provisioned through Vblock catalog service, a service profile with the same name as the host being provisioned is created on the UCS.

With the current logic, if a service profile with the same name as the host being provisioned already exists on the UCS, CoprHD appends "_1" to the name and creates the service profile.

The proposal is to change this behavior so that provisioning fails if a service profile with the same name already exists on the UCS. To continue, the user can either delete the existing service profile on the UCS or change the name of the host being provisioned.
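The difference between the current and the proposed behavior can be shown side by side. Both functions below are illustrative only; `current_profile_name`, `proposed_profile_name` and `ServiceProfileNameInUse` are hypothetical names, and the set of existing names stands in for a lookup on the UCS.

```python
from typing import Set


class ServiceProfileNameInUse(Exception):
    """Raised when the requested service profile name already exists (hypothetical)."""


def current_profile_name(host_name: str, existing: Set[str]) -> str:
    """Current behavior as described above: silently append "_1" on a clash."""
    return host_name + "_1" if host_name in existing else host_name


def proposed_profile_name(host_name: str, existing: Set[str]) -> str:
    """Proposed behavior: refuse to provision when the name is taken."""
    if host_name in existing:
        raise ServiceProfileNameInUse(
            f"service profile '{host_name}' already exists on the UCS; "
            "delete it or choose a different host name"
        )
    return host_name
```

With the proposed variant the clash surfaces as an order failure that names the conflicting profile, rather than as a service profile whose name no longer matches the host.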

 

Improvement 5: Pre-check during provisioning to validate service profile template being used is okay.

Currently, CoprHD validates that the selected service profile template is valid for use with the specified varrays:

    1. at the time of creation of the compute virtual pool.
    2. at the time of provisioning, while selecting blades from the compute virtual pool as part of the CREATEHOST task.

The ask is to validate the service profile template as part of the provisioning pre-check.

It is not yet certain that this pre-check adds much value: the order currently fails in the createHost task; with the pre-check, it will fail before the task is initiated. Of course, if the pre-check validation covers more than the validation in createHost, there will be value. This needs investigation.

Exclusions and Limitations

       The following are being excluded:


      • Adding or removing VLANs on already provisioned hosts
      • Setting up ESX networking for hosts
      • Automatic selection of blades from selected compute virtual pool during Change Compute Element For Host operation. Blade needs to be manually selected.

Future Work and Related Projects

This project is neither dependent on other projects nor are other projects dependent on this.

Implementation Strategy

Implementation Timeframe


      • What are the milestones (code complete, code ready for review, code ready to commit)?
      • How much time will be needed to test this change and address all the issues after the initial commit?
      • Is this a multi-phase project?

Virtual Team

Testing Strategy

        • Testing will be carried out on real devices.
        • A UCS simulator is being developed and, if available, will be used in automation and for dev testing.
        • Test beds will use VMAX/VNX arrays and VMAX + VPLEX.
        • High level scenarios planned for testing are:
          • Dis-associate/associate Compute Element (CE) to another CE of same/different model/chassis/UCS
          • Dis-associate/associate Compute Element (CE) to another CE while discovery is running for vCenter/UCS
          • Dis-associate/associate Compute Element (CE) to another CE from same/different Compute Virtual Pool (CVP), manual/automatic selection of CVP
          • Dis-associate/associate Compute Element (CE) to another CE with equal/higher/lower (single or range value) qualifiers within the same/a different CVP.
          • Dis-associate/associate Compute Element (CE) to another CE running with initial/updating service profile templates.
          • Boot volume datastore names, verification of existing service profile names and the pre-check for service profile templates during provisioning will be verified for all applicable VCE Vblock catalog services.
          • After modification of UCS local account privileges (non-admin to admin, read-only to admin, inactive to active), provisioning will be verified for all applicable VCE Vblock catalog services.
          • Selection of multiple service profile templates from the same vPool during provisioning will be verified for sys-admin, tenant-admin and normal tenant users.

        • Regression will be run on areas having code churn from bug fixes and also selected high priority test cases for the new implemented stories/improvements.
        • Existing set of all automated test cases will be run as part of regression.
        • Upgrade testing will be carried out for N-1 and N-2 releases of ViPR
          • Upgrade test will be focused on areas having schema changes and selected high priority test cases for the new implemented stories/improvements.
          • Automation will also be run from existing set of test cases.

        • Tests will be focused around
          • VCE catalog services:
            • Provision Cluster
            • Add host with OS to cluster
            • Prepare Bare Metal Cluster
            • Add Bare Metal Hosts to Cluster
            • Update vCenter Cluster
            • Decommission Host From Cluster
            • Decommission Cluster

          • As part of regression, non-VCE catalog services will also be verified, such as:
            • Block Services for VMware vCenter (create/remove volume and datastore)
            • Block Services for Hosts (create/remove volume)

          Test Automation Strategy:


          • Automation test code will be written for COP-29724 (Release/activate blade), COP-29757 (boot volume datastore name) and COP-29765 (multiple service profile template selection from the same CVP).
          • We also plan to automate test cases from the existing prioritized list if bandwidth is available.

Documentation Strategy


      • What are the necessary changes to CoprHD documentation?

Impact Evaluation

Public APIs and public functionality (end-user impact)


      • Is this change backward compatible? Could incompatibilities be avoided? If not, what changes will customers and CoprHD QE have to apply to their automation software developed against the old APIs?
      • New APIs to be developed for releaseBlade and associateBlade operations.
      • API to be modified for allowing specifying an SPT at the time of provisioning

Other components and component interaction

    • NA

Persistence model



      • Are schema changes needed in dbsvc or coordinatorsvc?
      • If so, what is the upgrade procedure? Is the schema migration (conversion) going to be reliable and scalable?
      • Does this change affect disk usage?

Upgrades


      • Are there any other special considerations regarding upgrades? Think twice - this is rarely obvious.

      • Will we need any new tools to debug or repair system if the upgrade code specific to this change fails?

Performance


      •  NA

Scalability and resource consumption

    • NA

Security

    • NA

Deployment and serviceability


      • Will any error messages associated with this feature include "friendly" names for the elements involved, rather than just opaque URIs?

Developer impact, dependencies and conditional inclusion


      • What is the impact on build time?
      • Is there any impact on integration with IDEs (Eclipse and IntelliJ)?
      • Are there any new third party dependencies? If so, please list these dependencies here:
Package Name | Package Version | License | URL | Description of the package and what it is used for
















      • Shall this feature be included with a special build-time flag?
      • Shall this feature be enabled only under certain run-time condition?

Reviewed and Approved By

Name | Date | Vote | Comments, remarks, etc.

































    • The members of the CoprHD Technical Steering Committee (TSC) have been automatically added to your reviewer list
      • A majority of them must approve this document before the feature can be merged to a project integration branch (e.g. integration-*, release-*, master)
      • Please un-check the boxes next to their names when your document is ready for their review
    • Please refer to Design Approval Guidance for information on additional reviewers who should be added and the approval process