Re: [PATCH v5 06/14] hw/block/nvme: Add support for active/inactive namespaces

From: Dmitry Fomichev
Subject: Re: [PATCH v5 06/14] hw/block/nvme: Add support for active/inactive namespaces
Date: Sun, 4 Oct 2020 23:54:13 +0000
User-agent: Evolution 3.36.5 (3.36.5-1.fc32)
On Wed, 2020-09-30 at 13:50 +0000, Niklas Cassel wrote:
> On Mon, Sep 28, 2020 at 11:35:20AM +0900, Dmitry Fomichev wrote:
> > From: Niklas Cassel <niklas.cassel@wdc.com>
> >
> > In NVMe, a namespace is active if it exists and is attached to the
> > controller.
> >
> > CAP.CSS (together with the I/O Command Set data structure) defines what
> > command sets are supported by the controller.
> >
> > CC.CSS (together with Set Profile) can be set to enable a subset of the
> > available command sets. The namespaces belonging to a disabled command set
> > will not be able to attach to the controller, and will thus be inactive.
> >
> > E.g., if the user sets CC.CSS to Admin Only, NVM namespaces should be
> > marked as inactive.
> >
> > The identify namespace, the identify namespace CSI specific, and the
> > namespace list commands each have two versions: one that only shows
> > active namespaces, and one that shows existing namespaces regardless
> > of whether the namespace is attached or not.
> >
> > Add an attached member to struct NvmeNamespace, and implement the
> > missing CNS commands.
> >
> > The added functionality will also simplify the implementation of namespace
> > management in the future, since namespace management can also attach and
> > detach namespaces.
>
> Following my previous discussion with Klaus,
> I think we need to rewrite this commit message completely:
>
> Subject: hw/block/nvme: Add support for allocated CNS command variants
>
> Many CNS commands have "allocated" command variants.
> These include a namespace as long as it is allocated
> (i.e., a namespace is included regardless of whether it is
> active (attached) or not).
>
> While these commands are optional (they are mandatory only for
> controllers supporting the namespace attachment command), supporting
> these CNS values makes our QEMU implementation more complete.
>
> However, since our QEMU model currently does not support the namespace
> attachment command, these new allocated CNS commands will return the same
> result as the active CNS command variants.
>
> In NVMe, a namespace is active if it exists and is attached to the
> controller.
>
> CAP.CSS (together with the I/O Command Set data structure) defines what
> command sets are supported by the controller.
>
> CC.CSS (together with Set Profile) can be set to enable a subset of the
> available command sets.
>
> Even if a user configures CC.CSS to e.g. Admin only, NVM namespaces
> will still be attached (and thus marked as active).
> Similarly, if a user configures CC.CSS to e.g. NVM, ZNS namespaces
> will still be attached (and thus marked as active).
>
> However, any operation from a disabled command set will result in
> an Invalid Command Opcode status.
>
> Add an attached struct member to struct NvmeNamespace,
> so that we lay the foundation for namespace attachment
> support. Also implement logic in the new CNS values to
> include/exclude namespaces based on this new struct member.
> The only thing missing is hooking up the actual Namespace
> Attachment command opcode, which would allow a user to toggle
> the attached variable per namespace. The reason for not hooking
> up this command completely is that the NVMe specification
> requires that the Namespace Management command be supported
> if the Namespace Attachment command is supported.
>
OK, putting this in.
>
> > Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
> > Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
> > ---
> > hw/block/nvme-ns.h | 1 +
> > hw/block/nvme.c | 60 ++++++++++++++++++++++++++++++++++++++------
> > include/block/nvme.h | 20 +++++++++------
> > 3 files changed, 65 insertions(+), 16 deletions(-)
> >
> > diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
> > index cca23bc0b3..acdb76f058 100644
> > --- a/hw/block/nvme-ns.h
> > +++ b/hw/block/nvme-ns.h
> > @@ -22,6 +22,7 @@
> > typedef struct NvmeNamespaceParams {
> > uint32_t nsid;
> > uint8_t csi;
> > + bool attached;
> > QemuUUID uuid;
> > } NvmeNamespaceParams;
> >
> > diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> > index 4ec1ddc90a..63ad03d6d6 100644
> > --- a/hw/block/nvme.c
> > +++ b/hw/block/nvme.c
>
> We need to add an additional check in nvme_io_cmd()
> that returns Invalid Command Opcode when CC.CSS == Admin only.
>
I think Keith has this addition already queued.
> > @@ -1523,7 +1523,8 @@ static uint16_t nvme_identify_ctrl_csi(NvmeCtrl *n,
> > NvmeRequest *req)
> > return NVME_INVALID_FIELD | NVME_DNR;
> > }
> >
> > -static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req)
> > +static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req,
> > + bool only_active)
> > {
> > NvmeNamespace *ns;
> > NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
> > @@ -1540,11 +1541,16 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n,
> > NvmeRequest *req)
> > return nvme_rpt_empty_id_struct(n, req);
> > }
> >
> > + if (only_active && !ns->params.attached) {
> > + return nvme_rpt_empty_id_struct(n, req);
> > + }
> > +
> > return nvme_dma(n, (uint8_t *)&ns->id_ns, sizeof(NvmeIdNs),
> > DMA_DIRECTION_FROM_DEVICE, req);
> > }
> >
> > -static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req)
> > +static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req,
> > + bool only_active)
> > {
> > NvmeNamespace *ns;
> > NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
> > @@ -1561,6 +1567,10 @@ static uint16_t nvme_identify_ns_csi(NvmeCtrl *n,
> > NvmeRequest *req)
> > return nvme_rpt_empty_id_struct(n, req);
> > }
> >
> > + if (only_active && !ns->params.attached) {
> > + return nvme_rpt_empty_id_struct(n, req);
> > + }
> > +
> > if (c->csi == NVME_CSI_NVM) {
> > return nvme_rpt_empty_id_struct(n, req);
> > }
> > @@ -1568,7 +1578,8 @@ static uint16_t nvme_identify_ns_csi(NvmeCtrl *n,
> > NvmeRequest *req)
> > return NVME_INVALID_FIELD | NVME_DNR;
> > }
> >
> > -static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req)
> > +static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req,
> > + bool only_active)
> > {
> > NvmeNamespace *ns;
> > NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
> > @@ -1598,6 +1609,9 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n,
> > NvmeRequest *req)
> > if (ns->params.nsid < min_nsid) {
> > continue;
> > }
> > + if (only_active && !ns->params.attached) {
> > + continue;
> > + }
> > list_ptr[j++] = cpu_to_le32(ns->params.nsid);
> > if (j == data_len / sizeof(uint32_t)) {
> > break;
> > @@ -1607,7 +1621,8 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n,
> > NvmeRequest *req)
> > return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE, req);
> > }
> >
> > -static uint16_t nvme_identify_nslist_csi(NvmeCtrl *n, NvmeRequest *req)
> > +static uint16_t nvme_identify_nslist_csi(NvmeCtrl *n, NvmeRequest *req,
> > + bool only_active)
> > {
> > NvmeNamespace *ns;
> > NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
> > @@ -1631,6 +1646,9 @@ static uint16_t nvme_identify_nslist_csi(NvmeCtrl *n,
> > NvmeRequest *req)
> > if (ns->params.nsid < min_nsid) {
> > continue;
> > }
> > + if (only_active && !ns->params.attached) {
> > + continue;
> > + }
> > list_ptr[j++] = cpu_to_le32(ns->params.nsid);
> > if (j == data_len / sizeof(uint32_t)) {
> > break;
> > @@ -1700,17 +1718,25 @@ static uint16_t nvme_identify(NvmeCtrl *n,
> > NvmeRequest *req)
> >
> > switch (le32_to_cpu(c->cns)) {
> > case NVME_ID_CNS_NS:
> > - return nvme_identify_ns(n, req);
> > + return nvme_identify_ns(n, req, true);
> > case NVME_ID_CNS_CS_NS:
> > - return nvme_identify_ns_csi(n, req);
> > + return nvme_identify_ns_csi(n, req, true);
> > + case NVME_ID_CNS_NS_PRESENT:
> > + return nvme_identify_ns(n, req, false);
> > + case NVME_ID_CNS_CS_NS_PRESENT:
> > + return nvme_identify_ns_csi(n, req, false);
> > case NVME_ID_CNS_CTRL:
> > return nvme_identify_ctrl(n, req);
> > case NVME_ID_CNS_CS_CTRL:
> > return nvme_identify_ctrl_csi(n, req);
> > case NVME_ID_CNS_NS_ACTIVE_LIST:
> > - return nvme_identify_nslist(n, req);
> > + return nvme_identify_nslist(n, req, true);
> > case NVME_ID_CNS_CS_NS_ACTIVE_LIST:
> > - return nvme_identify_nslist_csi(n, req);
> > + return nvme_identify_nslist_csi(n, req, true);
> > + case NVME_ID_CNS_NS_PRESENT_LIST:
> > + return nvme_identify_nslist(n, req, false);
> > + case NVME_ID_CNS_CS_NS_PRESENT_LIST:
> > + return nvme_identify_nslist_csi(n, req, false);
> > case NVME_ID_CNS_NS_DESCR_LIST:
> > return nvme_identify_ns_descr_list(n, req);
> > case NVME_ID_CNS_IO_COMMAND_SET:
> > @@ -2188,8 +2214,10 @@ static void nvme_clear_ctrl(NvmeCtrl *n)
> >
> > static int nvme_start_ctrl(NvmeCtrl *n)
> > {
> > + NvmeNamespace *ns;
> > uint32_t page_bits = NVME_CC_MPS(n->bar.cc) + 12;
> > uint32_t page_size = 1 << page_bits;
> > + int i;
> >
> > if (unlikely(n->cq[0])) {
> > trace_pci_nvme_err_startfail_cq();
> > @@ -2276,6 +2304,22 @@ static int nvme_start_ctrl(NvmeCtrl *n)
> > nvme_init_sq(&n->admin_sq, n, n->bar.asq, 0, 0,
> > NVME_AQA_ASQS(n->bar.aqa) + 1);
> >
> > + for (i = 1; i <= n->num_namespaces; i++) {
> > + ns = nvme_ns(n, i);
> > + if (!ns) {
> > + continue;
> > + }
> > + ns->params.attached = false;
> > + switch (ns->params.csi) {
> > + case NVME_CSI_NVM:
> > + if (NVME_CC_CSS(n->bar.cc) == CSS_NVM_ONLY ||
> > + NVME_CC_CSS(n->bar.cc) == CSS_CSI) {
> > + ns->params.attached = true;
> > + }
> > + break;
> > + }
> > + }
> > +
>
> Considering that the controller doesn't attach/detach
> namespaces belonging to command sets that it doesn't
> support, I think a nicer way is to remove this for-loop,
> and instead, in nvme_ns_setup() or nvme_ns_init(),
> always set attached = true. (Since we currently don't
> support namespace attachment command).
>
> The person that implements the last piece of namespace
> management and namespace attachment will have to deal
> with reading "attached" from some kind of persistent state
I did some spec reading on this topic and it seems that
this logic is necessary precisely because there is no
attach/detach command available. Such a command would
prevent attachment of a zoned namespace if CC.CSS is
NVM_ONLY, right? But since we have a static config, we
need to do this IMO.
Also, section 6.1.5 of the spec says that any operation that uses
an inactive NSID shall fail with Invalid Field. I am
adding a few bits to fail all I/O commands and Set/Get
Features attempted on inactive namespaces.
> and setting it accordingly.
>
> > nvme_set_timestamp(n, 0ULL);
> >
> > QTAILQ_INIT(&n->aer_queue);
> > diff --git a/include/block/nvme.h b/include/block/nvme.h
> > index 4587311783..b182fe40b2 100644
> > --- a/include/block/nvme.h
> > +++ b/include/block/nvme.h
> > @@ -804,14 +804,18 @@ typedef struct QEMU_PACKED NvmePSD {
> > #define NVME_IDENTIFY_DATA_SIZE 4096
> >
> > enum NvmeIdCns {
> > - NVME_ID_CNS_NS = 0x00,
> > - NVME_ID_CNS_CTRL = 0x01,
> > - NVME_ID_CNS_NS_ACTIVE_LIST = 0x02,
> > - NVME_ID_CNS_NS_DESCR_LIST = 0x03,
> > - NVME_ID_CNS_CS_NS = 0x05,
> > - NVME_ID_CNS_CS_CTRL = 0x06,
> > - NVME_ID_CNS_CS_NS_ACTIVE_LIST = 0x07,
> > - NVME_ID_CNS_IO_COMMAND_SET = 0x1c,
> > + NVME_ID_CNS_NS = 0x00,
> > + NVME_ID_CNS_CTRL = 0x01,
> > + NVME_ID_CNS_NS_ACTIVE_LIST = 0x02,
> > + NVME_ID_CNS_NS_DESCR_LIST = 0x03,
> > + NVME_ID_CNS_CS_NS = 0x05,
> > + NVME_ID_CNS_CS_CTRL = 0x06,
> > + NVME_ID_CNS_CS_NS_ACTIVE_LIST = 0x07,
> > + NVME_ID_CNS_NS_PRESENT_LIST = 0x10,
> > + NVME_ID_CNS_NS_PRESENT = 0x11,
> > + NVME_ID_CNS_CS_NS_PRESENT_LIST = 0x1a,
> > + NVME_ID_CNS_CS_NS_PRESENT = 0x1b,
> > + NVME_ID_CNS_IO_COMMAND_SET = 0x1c,
> > };
> >
> > typedef struct QEMU_PACKED NvmeIdCtrl {
> > --
> > 2.21.0