High Level Functional Overview and Design Specifications

Problem Statement

Overview

    • When used in OpenStack environments, CoprHD acts as a storage system: it ships a Cinder driver and therefore becomes a sub-system of Cinder.
    • Ideally, we would like to position CoprHD as a storage controller rather than a sub-system of Cinder. CoprHD has several high-availability and scalability features that are masked when it is used as a 'child device driver' under Cinder. Exposing the API directly allows us to surface those capabilities to the OpenStack user.
    • The goal of this project is to implement the OpenStack 'Block Storage API' natively within CoprHD, making CoprHD Cinder-compatible and giving customers the choice of using either Cinder or CoprHD.

Current Solution

         

 

Rationale

 

    • OpenStack is growing rapidly: there is plenty of evidence for this, e.g. attendance at OpenStack summits is growing at around 40% year on year, ETD is planning to come out with an ECI SKU specifically aimed at OpenStack, and VMware recently announced an OpenStack distribution. A recent report estimated that the OpenStack market will grow to about $1.7B by 2016.

    • The CoprHD REST API is very similar to the Cinder API.

    • With moderate effort, we should be able to support a Cinder-compatible REST API.

    • CoprHD is probably in a unique position to do so, compared to any other storage array or storage software.

    • Being API compatible allows CoprHD to position itself as a common management access point for all storage, not just another array.

Functional Requirements

    • API compatibility: Implement the Block Storage API of OpenStack within CoprHD. Both API version 1 and version 2 should be supported. The OpenStack admin should be able to perform storage operations via the Horizon UI and the Cinder/Nova CLI. All basic volume operations should be supported. No changes should be required to other components of OpenStack such as Keystone or Nova.
    • Tenants mapping: There should be an ability to map OpenStack tenants to CoprHD tenants. Volumes created by OpenStack tenants should appear under the right tenant in CoprHD.
    • Filtering services per tenant: The CoprHD admin should be able to either allow or deny a particular block virtual pool to OpenStack tenants.
    • Volume Types mapping: Each permitted block virtual pool on CoprHD should appear as a volume type in OpenStack. Important attributes of the vpool should appear as QoS attributes in the Horizon UI.
    • Keystone interface: Ability to add Keystone as an authentication provider in CoprHD. Ability to automatically register CoprHD in Keystone.
    • Volume operations: Create, Delete, Rename, List and Show volumes
    • Export operations: Attach/Detach volume; FC single/multi initiators; iSCSI single/multi initiators
    • Snapshot operations: Create, Delete, List, Show
    • Clone operations: Create volume from Snapshot/Volume 
JIRA ID | 1-Line Description | Comments, notes, etc.

Design Approach / High Level Implementation Details

Background

    • The term 'Cinder' is used in multiple contexts to mean different things: it is both the block storage API in OpenStack and its reference implementation in Python. In this project, CoprHD will support the Cinder API but replace the Cinder Python implementation.

    • The existing CoprHD REST API and the OpenStack Cinder API will operate in parallel. The native REST API supports all operations, whereas the Cinder API supports only a subset; in particular, no 'system admin' or 'security monitor' type of operations will be supported via the Cinder API.

High Level Design

In a typical OpenStack environment, all services talk to each other over REST APIs; for example, Nova and Cinder communicate using REST, as do all other services. Keystone is the authentication provider in OpenStack, analogous to the AD and LDAP authentication and authorization services in CoprHD. Keystone essentially keeps the list of tenants, tenant users and their associated roles, as well as the catalog of services (such as Cinder, Neutron, etc.) and their endpoint information. For CoprHD to act as the OpenStack Cinder service, all that is required is to register CoprHD as the Cinder service in Keystone in place of OpenStack Cinder. The following diagram depicts CoprHD acting as Cinder in an OpenStack environment.
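As an illustrative sketch only (not part of the design itself), the following Python snippet shows how an OpenStack client typically resolves the block storage endpoint from Keystone's v2.0 service catalog; the Keystone URL and credentials below are placeholders.

```python
import requests

# Placeholder Keystone v2.0 endpoint and credentials (assumptions for illustration).
KEYSTONE_URL = "http://10.11.12.13:5000/v2.0"
AUTH_BODY = {
    "auth": {
        "tenantName": "demo",
        "passwordCredentials": {"username": "demo", "password": "secret"},
    }
}

# Authenticate and fetch the service catalog in one call.
resp = requests.post(f"{KEYSTONE_URL}/tokens", json=AUTH_BODY)
resp.raise_for_status()
access = resp.json()["access"]
token = access["token"]["id"]

# Find the block storage ("volumev2") endpoint; with CoprHD registered in
# Keystone, this public URL would point at the CoprHD virtual IP.
volume_endpoints = [
    svc["endpoints"][0]["publicURL"]
    for svc in access["serviceCatalog"]
    if svc["type"] == "volumev2"
]
print("Block storage endpoint:", volume_endpoints[0] if volume_endpoints else "not registered")
```

Once CoprHD is registered as the 'volumev2' service, the public URL returned here points at the CoprHD virtual IP rather than at OpenStack Cinder.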

 

To achieve this, CoprHD needs to do the following:

 

    • Implement the Cinder-compatible API natively: CoprHD becomes a central component of the OpenStack solution (not just a driver). This means CoprHD will provide a native implementation of all the Cinder APIs listed here.
    • Use Keystone as an authentication provider for all requests coming in for Cinder service.
    • Define a clear mapping between OpenStack and CoprHD constructs.
    • Remove the VMware dependency for CoprHD deployment.

Internals

      

 

Low Level Design

 

Mapping between OpenStack and CoprHD constructs 

 

OpenStack Construct | CoprHD Construct | Mapping
Tenant | Tenant | Mapped
Project | Project | Mapped
Role | Role | Admin in OpenStack will be treated as Tenant Admin in CoprHD; users will have the default User privileges in CoprHD.
Token | Token | CoprHD should honour tokens issued by Keystone.
Volume Type | Virtual Storage Pool (vPool) | Mapped
Availability Zone | Virtual Array | Mapped
Region | VDC | Mapped
Quota | Quota | Mapped
Volume | Volume | Mapped
Snapshot | Snapshot | Mapped
Consistency Group | Consistency Group | Mapped
Volume Clone | Volume Clone | Mapped
Node | Host | Mapped

 

Authentication/Authorization 

 

    • The existing LDAP/AD model will be extended with a new authentication provider of type "keystone", referred to as the 'Keystone Authentication Provider'.
    • There will be only one Keystone provider at any point in time (singleton).
    • The tenant token will be persisted in Cassandra.
    • For every request that arrives on the Cinder API port, we look for the X-AUTH-TOKEN header. If a token is found, we validate it with Keystone. If the token is valid, we obtain the user and their roles from Keystone and map them to the corresponding CoprHD roles (a minimal validation sketch is shown below).
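The actual validation logic lives in the CoprHD Java API service; the Python sketch below only illustrates, conceptually, the Keystone v2.0 token-validation call and role mapping involved. The admin URL, admin token and role map are placeholder assumptions.

```python
import requests

# Placeholder Keystone v2.0 admin endpoint and a previously obtained admin token (assumptions).
KEYSTONE_ADMIN_URL = "http://10.11.12.13:35357/v2.0"
ADMIN_TOKEN = "<admin-token>"

# Hypothetical mapping of OpenStack roles to CoprHD roles (see the mapping table above).
ROLE_MAP = {"admin": "TENANT_ADMIN"}
DEFAULT_ROLE = "USER"

def validate_user_token(user_token):
    """Validate an X-AUTH-TOKEN with Keystone and derive the CoprHD roles."""
    resp = requests.get(
        f"{KEYSTONE_ADMIN_URL}/tokens/{user_token}",
        headers={"X-Auth-Token": ADMIN_TOKEN},
    )
    if resp.status_code != 200:
        raise PermissionError("Keystone rejected the token")

    access = resp.json()["access"]
    user = access["user"]["name"]
    openstack_roles = [role["name"] for role in access["user"]["roles"]]
    coprhd_roles = {ROLE_MAP.get(r, DEFAULT_ROLE) for r in openstack_roles}
    return user, coprhd_roles
```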

The Keystone authentication provider is similar in many respects to LDAP and AD; it will use the common infrastructure for authentication providers. The following parameters will be recorded in the CoprHD database when adding the Keystone authentication provider:

    • Keystone URL – of the form 'http://10.11.12.13:5000/v2.0'
    • Credentials – tenant admin credentials that are valid on Keystone
    • Domain – the domain that the OpenStack cluster will be associated with in CoprHD

The UI screenshot of the 'Add authentication provider' screen is shown below:

 

Automatic Registration of CoprHD as a Block Storage Service (Cinder) in Keystone

Background:
  • In OpenStack, Keystone is the authentication provider, and it also maintains a catalog of service endpoints and descriptions. For example, the "volumev2" service endpoint is registered in Keystone, and when an instance needs volume services, it first asks Keystone for the endpoint address.

  • In order for CoprHD to provide block storage services for OpenStack, it must be registered in Keystone.

Design Overview (see flowchart below):
  • When Keystone is being added as an authentication provider, there will be an option (checkbox) to "Register CoprHD as the 'volumev2' service in Keystone".

  • If selected:
    • CoprHD will use Keystone's REST interface to update the volumev2 service to point to CoprHD's VIP (a registration sketch is shown after this list).
    • CoprHD will also have a text input box for the OpenStack Project ID/Name.
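A minimal sketch of this registration step, using python-keystoneclient against the Keystone v2.0 admin API, is shown below; the service name, region, VIP, port and credentials are placeholder assumptions, and the production code path is the CoprHD API service itself.

```python
from keystoneclient.v2_0 import client as ks_client

# Placeholder admin credentials and CoprHD virtual IP (assumptions for illustration).
keystone = ks_client.Client(
    username="admin",
    password="secret",
    tenant_name="admin",
    auth_url="http://10.11.12.13:35357/v2.0",
)
COPRHD_VIP = "coprhd.example.com"

# Remove any existing volumev2 service/endpoints first (Keystone v2.0 has no in-place update).
for svc in keystone.services.list():
    if svc.type == "volumev2":
        for ep in keystone.endpoints.list():
            if ep.service_id == svc.id:
                keystone.endpoints.delete(ep.id)
        keystone.services.delete(svc.id)

# Register CoprHD as the volumev2 (Cinder-compatible) block storage service.
service = keystone.services.create(
    name="coprhd-block",
    service_type="volumev2",
    description="CoprHD Block Storage (Cinder compatible)",
)
url = "https://" + COPRHD_VIP + ":8776/v2/%(tenant_id)s"
keystone.endpoints.create(
    region="RegionOne",
    service_id=service.id,
    publicurl=url,
    adminurl=url,
    internalurl=url,
)
```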
Tenant Mapping in CoprHD to OpenStack Projects
  • CoprHD currently requires the project ID of an OpenStack project to be added as a key/value pair named "tenant_id" to any tenant that is going to be used with OpenStack.

    • This will be done automatically during the tenant creation phase by using Keystone's REST interface to obtain project IDs (see the sketch below). NOTE: This will not work with existing CoprHD tenants.

  • CoprHD also requires that a project is "tagged" with the OpenStack project ID in order to function with OpenStack. This would also be done in the tenant creation phase.
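How the OpenStack project IDs would be obtained is sketched below with python-keystoneclient; the credentials are placeholders, and the actual tagging of CoprHD tenants and projects goes through CoprHD's own API (omitted here).

```python
from keystoneclient.v2_0 import client as ks_client

# Placeholder admin credentials (assumptions for illustration).
keystone = ks_client.Client(
    username="admin",
    password="secret",
    tenant_name="admin",
    auth_url="http://10.11.12.13:35357/v2.0",
)

# Enumerate OpenStack projects (called tenants in Keystone v2.0 terminology).
for tenant in keystone.tenants.list():
    # The CoprHD tenant created for this project would get the key/value pair
    # {"tenant_id": tenant.id}, and the CoprHD project would be tagged with the
    # same OpenStack project id (the CoprHD API calls are omitted here).
    print(tenant.name, "->", tenant.id)
```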

NGINX

A new "server" block will be added in nginx.conf, this will expose the port 8776 and direct all APIs starting with the URI "/v1" or "/v2" to the apisvc. Since "/v1" is the legacy API, we will implement it by simply redirecting it "/v2" using nginx redirect/rewrite feature.


REST API

Within the API service, a new module called 'cinder service' will be added. This is a package under 'com.emc.storageos.api.service.impl.services.cinder'.

Services to be implemented include

    • Volume services (create, delete, update, list and show volumes, create volume from snapshot/volume)
    • Export services (attach and detach volumes)
    • Snapshot services (create, delete, list and show snapshots)
    • Volume Types (list and show volume types)
    • QoS
    • Miscellaneous

Every API operation will follow these steps:

    1. As part of every API call, the first operation is to authenticate the user with Keystone. The 'X-AUTH-TOKEN' header parameter is verified with Keystone; if the verification succeeds, the operation proceeds.
    2. The next step is to validate the request parameters and return an error if any parameters are missing or invalid.
    3. Next, the parameters are translated into those required by the native CoprHD REST API.
    4. The native CoprHD REST API is then invoked. For asynchronous operations, an object ID is returned (e.g. the ID of the newly created volume). The upper layers of OpenStack (e.g. Nova) typically poll on this object ID until the volume status is 'ready' or whatever state they expect. A client-side sketch of this flow is shown below.
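The following is a hedged client-side sketch of the flow above, issuing raw HTTP requests against the Cinder v2 API that CoprHD will expose; the endpoint, tenant ID, token and volume type are placeholders.

```python
import time
import requests

# Placeholders: the block storage endpoint (CoprHD VIP), tenant id and user token.
ENDPOINT = "https://coprhd.example.com:8776/v2/<tenant_id>"
HEADERS = {"X-Auth-Token": "<user-token>", "Content-Type": "application/json"}

# Steps 1-3 happen server-side; the client simply posts a volume-create request.
body = {"volume": {"size": 1, "name": "demo-vol", "volume_type": "<vpool-name>"}}
resp = requests.post(f"{ENDPOINT}/volumes", json=body, headers=HEADERS, verify=False)
resp.raise_for_status()
volume_id = resp.json()["volume"]["id"]

# Step 4: the caller (e.g. Nova) polls the returned object id until the
# asynchronous CoprHD operation completes.
while True:
    vol = requests.get(f"{ENDPOINT}/volumes/{volume_id}",
                       headers=HEADERS, verify=False).json()["volume"]
    if vol["status"] in ("available", "error"):
        break
    time.sleep(5)
print("volume", volume_id, "is", vol["status"])
```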

 

Quality Of Service Specification ( QoS )

 

Quality of Service Specs in OpenStack define sets of storage performance capabilities. QoS specs entities can be mapped to Volume Types to indicate specific QoS policies enforced on volumes. Users that request volumes with associated QoS specs are guaranteed to get requested capabilities on these volumes. In OpenStack QoS can be created separately and then assigned to existing Volume Types.

In CoprHD, QoS is a set of parameters that describe the behaviour of a Virtual Pool and of the volumes created from that Virtual Pool. In our case, those parameters are chosen from specific attributes of the Virtual Pool. Quality of Service plays an informational role only: it is extended information about a Volume Type in the OpenStack dashboard. A Quality of Service specification will be created/updated/deleted along with its Virtual Pool and will pertain only to that particular Virtual Pool.

Below is a list of OpenStack QoS API calls along with CoprHD implementation support:

 

Supported

    • List QoS specs
    • Show QoS specification details
    • Get all associations for QoS specification

Not supported

    • Create QoS specification (1)
    • Set or unset keys in QoS specification (1)
    • Delete QoS specification (1)
    • Associate QoS specification with volume type (2)
    • Disassociate QoS specification from volume type (2)
    • Disassociate QoS specification from all associations (3)

 

(1) The OpenStack dashboard will be read-only for the admin account.

(2) Quality of Service acts as information about a particular Volume Type (Virtual Pool). There is no need to associate or disassociate QoS specs because the parameters are already tied to Virtual Pools.

(3) One set of QoS specs will be associated with at most one Volume Type.



Quality of Service parameters are made up of the following Virtual Pool parameters (a mapping sketch follows the list):

  1. Hardware
    • Provisioning Type
    • Protocol
    • Drive Type
    • System Type
    • Is expandable
    • Multi volume consistency
    • Raid Levels

  2. SAN Multi Path
    • Max / min Paths
    • Paths per initiator

  3. High Availability
    • High Availability

  4. Data Protection
    • Maximum Snapshots
    • Max Block Mirrors
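The sketch below shows how such an informational QoS specification could be composed from Virtual Pool attributes; the attribute names and the shape of the vpool object are illustrative assumptions rather than the actual CoprHD (Java) data model.

```python
def qos_specs_from_vpool(vpool: dict) -> dict:
    """Build the informational QoS spec for one Virtual Pool (illustrative only)."""
    return {
        "name": "specs-" + vpool["name"],
        "consumer": "back-end",
        "specs": {
            # Hardware
            "Provisioning Type": vpool.get("provisioning_type"),
            "Protocol": vpool.get("protocols"),
            "Drive Type": vpool.get("drive_type"),
            "System Type": vpool.get("system_type"),
            "Is expandable": vpool.get("expandable"),
            "Multi volume consistency": vpool.get("multi_volume_consistency"),
            "Raid Levels": vpool.get("raid_levels"),
            # SAN Multi Path
            "Max Paths": vpool.get("max_paths"),
            "Min Paths": vpool.get("min_paths"),
            "Paths per initiator": vpool.get("paths_per_initiator"),
            # High Availability
            "High Availability": vpool.get("high_availability"),
            # Data Protection
            "Maximum Snapshots": vpool.get("max_snapshots"),
            "Max Block Mirrors": vpool.get("max_mirrors"),
        },
    }

# Example: one QoS spec per Virtual Pool, associated with exactly one Volume Type.
example = qos_specs_from_vpool({"name": "block-gold", "drive_type": "SSD", "protocols": "FC"})
```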

Class Diagram

          

Flow Charts


 

Exclusions and Limitations

The following functionality is not intended to be part of this project:

    • Ingest of volumes created via the cinder driver into the API model

    • Supporting the file API (manila)
    • Allowing OpenStack admin to create volume types
    • Allowing the OpenStack admin to also perform storage admin functions of CoprHD
    • Allowing non-OpenStack users to use the Keystone authentication provider

Future Work and Related Projects

    • Supporting the file API (Manila)
    • Full support for Keystone as an authentication provider (ie. allow use in non-OpenStack environments)
    • Volume stats
    • Backup
    • Replication

Implementation Strategy

Implementation Timeframe

Phase 1 (Basic API support)

This phase has already been implemented by EMC. The code is ready as of October 2015. Scope includes

    • Volume operations (create, delete, expand volume)
    • Snapshot operations
    • Volume Types (list and show only, not creation)
    • Add Keystone (v2) as authentication provider
    • Volume attach and detach (FC and iSCSI)
    • Create volume from snapshot, volume
    • Create volume from image (partially complete)
    • Consistency groups (partially complete)

Phase 2 (QoS and Service registration automation)

This phase will be implemented mainly by Intel. Target completion is Jan 2016 (CoprHD summit). Scope includes

    • QoS API: return important attributes of block virtual pool as QoS attributes. QoS to be visible in the Horizon UI.
    • Automatic service registration in Keystone: automation of the steps outlined above under 'service registration in Keystone'

Phase 3 (API completion)

This phase will be implemented mainly by Intel. Target completion is the Yoda release of CoprHD. Scope includes

    • Glance interface: required for supporting 'create volume from image' and 'create image from volume'.
    • Create Image from volume
    • Keystone v3 interface support
    • Testing using tempest


 

Phase 4 (Advanced features)

 

As the OpenStack Block API evolves, we need to add support for new operations in CoprHD. This mainly depends on how the API evolves. Currently identified scope:

 

    • Volume stats
    • Replication
    • Backup

 

 

Virtual Team

    • Who will work on this change?
    • Does the virtual team span multiple physical teams? If so, how are you going to share responsibilities and ensure consistency? 

Testing Strategy

    • How are you going to ensure good quality?
    • Does this change require any specific performance and scale testing?
    • Does this change require security scans?

Documentation Strategy

    • What are the necessary changes to CoprHD documentation?

Impact Evaluation

Public APIs and public functionality (end-user impact)

As of today, Cinder has v1, v2 and v3 APIs. v1 and v2 are stable and widely used; v3 was introduced most recently and will take its own time to be adopted. The immediate, obvious choice is to support v1 and v2 first; later releases will look at v3.

New APIs

S.No | METHOD | URI | DESCRIPTION

API VERSIONS
1 | GET | / | Lists information about all Block Storage API versions.
2 | GET | /v2 | Shows details for Block Storage API v2.
3 | GET | /v2/{tenant_id}/extensions | Lists Block Storage API extensions.

VOLUMES
4 | POST | /v2/{tenant_id}/volumes | Creates a volume.
5 | GET | /v2/{tenant_id}/volumes{?sort} | Lists summary information for all Block Storage volumes that the tenant who submits the request can access.
6 | GET | /v2/{tenant_id}/volumes/detail{?sort} | Lists detailed information for all Block Storage volumes that the tenant who submits the request can access.
7 | GET | /v2/{tenant_id}/volumes/{volume_id} | Shows information about a specified volume.
8 | PUT | /v2/{tenant_id}/volumes/{volume_id}{?description,name} | Updates a volume.
9 | DELETE | /v2/{tenant_id}/volumes/{volume_id} | Deletes a specified volume.
10 | POST | /v2/{tenant_id}/volumes/{volume_id}/action | Extends the size of a specified volume to a new size requested in GB.

VOLUME TYPES
11 | GET | /v2/{tenant_id}/types | Lists volume types.
12 | GET | /v2/{tenant_id}/types/{volume_type_id} | Shows information about a specified volume type.
13 | DELETE | /v2/{tenant_id}/types/{volume_type_id} | Deletes a specified volume type.

SNAPSHOTS
14 | POST | /v2/{tenant_id}/snapshots{?snapshot,volume_id,force,name,description} | Creates a snapshot, which is a point-in-time complete copy of a volume. You can create a volume from the snapshot.
15 | GET | /v2/{tenant_id}/snapshots | Lists summary information for all Block Storage snapshots that the tenant who submits the request can access.
16 | GET | /v2/{tenant_id}/snapshots/detail | Lists detailed information for all Block Storage snapshots that the tenant who submits the request can access.
17 | GET | /v2/{tenant_id}/snapshots/{snapshot_id} | Shows information for a specified snapshot.
18 | PUT | /v2/{tenant_id}/snapshots/{snapshot_id}{?description,name} | Updates a specified snapshot.
19 | DELETE | /v2/{tenant_id}/snapshots/{snapshot_id} | Deletes a specified snapshot.

LIMITS EXTENSION (limits)
20 | GET | /v2/{tenant_id}/limits | Shows absolute limits for a tenant.

CONSISTENCY GROUPS
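As a usage illustration of entry 10 above, the volume-extend operation is modelled as an action on the volume resource with the new size (in GB) in the request body; the endpoint, token and IDs below are placeholders.

```python
import requests

# Placeholders for the CoprHD-exposed Cinder v2 endpoint and a valid user token.
ENDPOINT = "https://coprhd.example.com:8776/v2/<tenant_id>"
HEADERS = {"X-Auth-Token": "<user-token>", "Content-Type": "application/json"}

# Extend an existing volume to 2 GB via the Cinder v2 "os-extend" action.
volume_id = "<volume_id>"
resp = requests.post(
    f"{ENDPOINT}/volumes/{volume_id}/action",
    json={"os-extend": {"new_size": 2}},
    headers=HEADERS,
    verify=False,
)
resp.raise_for_status()  # Cinder returns 202 Accepted for this asynchronous action
```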
    

Modified APIs

S.No | Method | URI | Description
1 | GET | /vdc/admin/authnproviders/{id} | Show authentication provider
2 | GET | /vdc/admin/authnproviders | List authentication providers
3 | POST | /vdc/admin/authnproviders | Create an authentication provider
4 | PUT | /vdc/admin/authnproviders/{id} | Update authentication provider
5 | DELETE | /vdc/admin/authnproviders/{id} | Delete authentication provider


Other components and component interaction

  • UI: for adding the Keystone authentication provider and enabling the automatic registration of CoprHD into Keystone.
  • CLI: for all modified APIs.
  • Geo (Multi-VDC): for the new authentication provider. We have tested the Keystone authentication provider in a multi-VDC setup and found it to work equivalently to the other authentication providers (AD and LDAP).

Persistence model

      • Column Family "QuotaOfCinder"
      • Column Family "QosSpecification"

Upgrades

      • QosSpecification will be considered for upgrades: there will be a QosSpecification for each Virtual Pool present in the system. The upgrade must ensure that a QosSpecification gets created for every Virtual Pool coming from earlier versions (a sketch of this step is shown below).
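A sketch of that upgrade step, expressed in Python for brevity (the real migration would be a CoprHD Java upgrade callback), with the data-access callables being hypothetical stand-ins for the CoprHD database accessors:

```python
def ensure_qos_specs_exist(list_virtual_pools, list_qos_specs, create_qos_spec):
    """Hypothetical upgrade callback: create a QosSpecification for every
    Virtual Pool from an earlier version that does not yet have one."""
    existing = {spec["vpool_id"] for spec in list_qos_specs()}
    for vpool in list_virtual_pools():
        if vpool["id"] not in existing:
            create_qos_spec(vpool)  # derive the specs from the vpool attributes
```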

Performance

      •  Can this change adversely affect performance? If so, how are you going to test it? Is there a way to test this early, before the entire implementation is ready?

Scalability and resource consumption

      • Will it scale? How long will essential operations take at the scale of O(10,000,000)? How are you going to test it?
      • Will specific performance at scale testing be required?
      • Does this change have impact on memory and CPU usage? How does memory and CPU usage scale with the number of objects?

Security

      • Are there any implications for CoprHD security?
      • Will new security scans be required?

Deployment and serviceability

The deployment will work something like this: 

    • The user deploys OpenStack in the usual way.
    • The user deploys CoprHD in the usual way.
    • The sys admin then adds 'Keystone' as an authentication provider in CoprHD.
      • This step provides details such as the IP address and credentials of Keystone.
    • We provide a new option to 'Deploy CoprHD in OpenStack'.
      • In this step, we modify the Keystone configuration to register CoprHD as a Cinder-compatible endpoint.
      • We restart the Keystone service.
      • After this point, all OpenStack services have visibility into CoprHD.
      • CoprHD's vpools will show up as 'volume types' in OpenStack (a verification sketch is shown after this list).
      • OpenStack users can now provision volumes from these vpools.
      • All services should work seamlessly.
    • If new virtual storage pools are created in CoprHD, they will show up automatically in OpenStack as volume types.
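Once registration is complete, a quick way to verify the wiring from the OpenStack side is to list volume types, which should show the permitted CoprHD vpools; the sketch below uses python-cinderclient with placeholder credentials.

```python
from cinderclient import client as cinder_client

# Placeholder OpenStack credentials; the volumev2 endpoint is resolved via Keystone,
# so this call lands on CoprHD once it is registered as the block storage service.
cinder = cinder_client.Client(
    "2",
    "demo",                          # username
    "secret",                        # password
    "demo",                          # project / tenant name
    "http://10.11.12.13:5000/v2.0",  # Keystone auth URL
)

# Each permitted CoprHD block virtual pool should appear here as a volume type.
for vtype in cinder.volume_types.list():
    print(vtype.name)
```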

Developer impact, dependencies and conditional inclusion

    • What is the impact on build time?
    • Is there any impact on integration with IDEs (Eclipse and IntelliJ)?
    • Are there any new third party dependencies? If so, please list these dependencies here:
Package Name | Package Version | License | URL | Description of the package and what it is used for

    • Shall this feature be included with a special build-time flag?
    • Shall this feature be enabled only under certain run-time condition?

Design Review Recordings   

      If you missed the design review meeting, you can view the recordings:
          Streaming recording link: https://emccorp.webex.com/emccorp/ldr.php?RCID=f901894b9b5238865c99d35401813d31
          Download recording link: https://emccorp.webex.com/emccorp/lsr.php?RCID=02f64d37b9fda88e25e362d6527f77c3


Reviewed and Approved By

 

Name | Date | Vote | Comments, remarks, etc.
 | 15 Dec 2015 | +1 | Approved
 | 17 Dec 2015 | +1 |
 | 14 Dec 2015 | +1 | Approved
 | 09 Dec 2015 | +1 | Approved
 | 16 Dec 2015 | +1 | Approved
 | 11 Dec 2015 | +1 | Approved
 | 21 Dec 2015 | +1 | Approved





 

 


5 Comments

  1. From a security perspective, could you explain a little more on:

    1. There is a batch tenant-import step from OpenStack when registering it as a provider; do we have error handling for it? That is, what do we do if only part of the tenants were imported before a failure? What do we do with the tenants that were already imported?
    2. According to the last review meeting, tenant import is a one-time operation; for now there is no plan to import tenants created after registering the OpenStack provider. Please specify this in the design so users are not confused by it.
    3. When deleting the OpenStack provider, what do we do with the OpenStack tenants?
    4. The imported tenants don't have user mapping; I am a little concerned about whether this breaks the current UI/user-mapping logic. I believe a thorough regression test is needed.
  2. Fred Zheng, thank you for taking the time to review and comment.

     

    For #1 and #2

    Yes, the implementation is in progress; Curt Bruns from Intel is working on it. There will definitely be error handling; Curt can add more details.

     

    For #3, the OpenStack provider will only be allowed to be deleted when there are no dependent tenants.

     

    For #4, yes, there is no user mapping; however, we strictly rely on the OpenStack user token to perform any operation. Only after Keystone validates that token do we go ahead and perform the operation.

    1. For #4, the concern is about introducing tenants with no user mapping. The change will affect existing operations in the UI and API, so we need a thorough regression test, like the QE weekly sanity.

    • Tenant token will be persisted in Cassandra. => Why do we need this? Can we drop this requirement?
    • Pictures (high-level) need to include Manila and Glance and indicate that the current focus is the Cinder/Block API.
    • Glance interfaces - 'create volume from image' and 'create image from volume' - are these APIs sufficient to cover all scenarios? When you use a common storage backend for images and volumes (which is the normal use case for Ceph), I think it has a different flow, if I remember correctly.
    • Fabric management - what are the design considerations in CoprHD?
    • Ingesting existing volumes - do you see a use case where it doesn't make sense? This needs to be part of the scope in my view, or at least a mechanism to say we can't manage storage systems that are already being managed by Cinder.
    • Scale testing - please include burst and normal scenarios where CoprHD is managing hundreds of storage systems, and treat this as a very key focus.
    • Failure/recovery - please cover all failure/recovery scenarios as part of testing.
  3. Reddy:
    Thank you for taking the time to review and comment. Kindly see the responses below.
    • The admin token is persisted in Cassandra as an optimization: Keystone tokens, once issued, are valid for several hours, so this acts as a cache that improves performance. If the admin token becomes invalid, we obtain it again. This admin token is used to validate the user tokens sent by users with each request.
    • Agreed, will do. Manila is already shown; we will show the Glance interaction the way Keystone is shown.
    • These are the two main APIs needed currently. 'Create volume from image' and 'create image from volume' are scoped and considered. We shall consider various storage backends for Glance later.
    • Fabric management is already built into CoprHD, and no further fabric-management enhancements are needed for this feature.
    • Agreed. Ingesting existing volumes is a great value-add, and we are looking to consider it in a later release.
    • Sure, agreed. Scale testing has been planned; we will add the details.
    • Agreed, will do.