From: Michael Roth
Subject: Re: [Qemu-ppc] [Qemu-devel] [PATCH 04/12] spapr_pci: add set-indicator RTAS interface
Date: Wed, 26 Nov 2014 00:27:58 -0600
User-agent: alot/0.3.4
Quoting Bharata B Rao (2014-11-25 22:54:12)
> On Wed, Oct 1, 2014 at 8:00 PM, Alexander Graf <address@hidden> wrote:
> >
> >
> > On 01.10.14 00:08, Michael Roth wrote:
> >> Quoting Alexander Graf (2014-08-26 06:36:57)
> >>> On 19.08.14 02:21, Michael Roth wrote:
> >>>> From: Mike Day <address@hidden>
> >>>>
> >>>> Signed-off-by: Mike Day <address@hidden>
> >>>> Signed-off-by: Michael Roth <address@hidden>
> >>>> ---
> >>>> hw/ppc/spapr_pci.c | 119 +++++++++++++++++++++++++++++++++++++++++++++++++
> >>>> include/hw/ppc/spapr.h | 3 ++
> >>>> 2 files changed, 122 insertions(+)
> >>>>
> >>>> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> >>>> index 924d488..23a3477 100644
> >>>> --- a/hw/ppc/spapr_pci.c
> >>>> +++ b/hw/ppc/spapr_pci.c
> >>>> @@ -36,6 +36,16 @@
> >>>>
> >>>> #include "hw/pci/pci_bus.h"
> >>>>
> >>>> +/* #define DEBUG_SPAPR */
> >>>> +
> >>>> +#ifdef DEBUG_SPAPR
> >>>> +#define DPRINTF(fmt, ...) \
> >>>> + do { fprintf(stderr, fmt, ## __VA_ARGS__); } while (0)
> >>>> +#else
> >>>> +#define DPRINTF(fmt, ...) \
> >>>> + do { } while (0)
> >>>> +#endif
> >>>> +
> >>>> /* Copied from the kernel arch/powerpc/platforms/pseries/msi.c */
> >>>> #define RTAS_QUERY_FN 0
> >>>> #define RTAS_CHANGE_FN 1
> >>>> @@ -47,6 +57,31 @@
> >>>> #define RTAS_TYPE_MSI 1
> >>>> #define RTAS_TYPE_MSIX 2
> >>>>
> >>>> +/* For set-indicator RTAS interface */
> >>>> +#define INDICATOR_ISOLATION_MASK 0x0001 /* 9001 one bit */
> >>>> +#define INDICATOR_GLOBAL_INTERRUPT_MASK 0x0002 /* 9005 one bit */
> >>>> +#define INDICATOR_ERROR_LOG_MASK 0x0004 /* 9006 one bit */
> >>>> +#define INDICATOR_IDENTIFY_MASK 0x0008 /* 9007 one bit */
> >>>> +#define INDICATOR_RESET_MASK 0x0010 /* 9009 one bit */
> >>>> +#define INDICATOR_DR_MASK 0x00e0 /* 9002 three bits */
> >>>> +#define INDICATOR_ALLOCATION_MASK 0x0300 /* 9003 two bits */
> >>>> +#define INDICATOR_EPOW_MASK 0x1c00 /* 9 three bits */
> >>>> +
> >>>> +#define INDICATOR_ISOLATION_SHIFT 0x00 /* bit 0 */
> >>>> +#define INDICATOR_GLOBAL_INTERRUPT_SHIFT 0x01 /* bit 1 */
> >>>> +#define INDICATOR_ERROR_LOG_SHIFT 0x02 /* bit 2 */
> >>>> +#define INDICATOR_IDENTIFY_SHIFT 0x03 /* bit 3 */
> >>>> +#define INDICATOR_RESET_SHIFT 0x04 /* bit 4 */
> >>>> +#define INDICATOR_DR_SHIFT 0x05 /* bits 5-7 */
> >>>> +#define INDICATOR_ALLOCATION_SHIFT 0x08 /* bits 8-9 */
> >>>> +#define INDICATOR_EPOW_SHIFT 0x0a /* bits 10-12 */
> >>>> +
> >>>> +#define DECODE_DRC_STATE(state, m, s) \
> >>>> + ((((uint32_t)(state) & (uint32_t)(m))) >> (s))
> >>>> +
> >>>> +#define ENCODE_DRC_STATE(val, m, s) \
> >>>> + (((uint32_t)(val) << (s)) & (uint32_t)(m))
> >>>> +
> >>>> static sPAPRPHBState *find_phb(sPAPREnvironment *spapr, uint64_t buid)
> >>>> {
> >>>> sPAPRPHBState *sphb;
> >>>> @@ -402,6 +437,80 @@ static void rtas_ibm_query_interrupt_source_number(PowerPCCPU *cpu,
> >>>> rtas_st(rets, 2, 1);/* 0 == level; 1 == edge */
> >>>> }
> >>>>
> >>>> +static void rtas_set_indicator(PowerPCCPU *cpu, sPAPREnvironment *spapr,
> >>>> + uint32_t token, uint32_t nargs,
> >>>> + target_ulong args, uint32_t nret,
> >>>> + target_ulong rets)
> >>>> +{
> >>>> + uint32_t indicator = rtas_ld(args, 0);
> >>>> + uint32_t drc_index = rtas_ld(args, 1);
> >>>> + uint32_t indicator_state = rtas_ld(args, 2);
> >>>> + uint32_t encoded = 0, shift = 0, mask = 0;
> >>>> + uint32_t *pind;
> >>>> + sPAPRDrcEntry *drc_entry = NULL;
> >>>
> >>> This rtas call does not have any idea what a PHB is. Why does it live in
> >>> spapr_pci.c?
> >>
> >> spapr_rtas.c does seem like a better home
> >>
> >>>
> >>>> +
> >>>> + if (drc_index == 0) { /* platform indicator */
> >>>> + pind = &spapr->state;
> >>>> + } else {
> >>>> + drc_entry = spapr_find_drc_entry(drc_index);
> >>>> + if (!drc_entry) {
> >>>> + DPRINTF("rtas_set_indicator: unable to find drc_entry for %x",
> >>>> + drc_index);
> >>>> + rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
> >>>> + return;
> >>>> + }
> >>>> + pind = &drc_entry->state;
> >>>> + }
> >>>> +
> >>>> + switch (indicator) {
> >>>> + case 9: /* EPOW */
> >>>> + shift = INDICATOR_EPOW_SHIFT;
> >>>> + mask = INDICATOR_EPOW_MASK;
> >>>> + break;
> >>>> + case 9001: /* Isolation state */
> >>>> + /* encode the new value into the correct bit field */
> >>>> + shift = INDICATOR_ISOLATION_SHIFT;
> >>>> + mask = INDICATOR_ISOLATION_MASK;
> >>>> + break;
> >>>> + case 9002: /* DR */
> >>>> + shift = INDICATOR_DR_SHIFT;
> >>>> + mask = INDICATOR_DR_MASK;
> >>>> + break;
> >>>> + case 9003: /* Allocation State */
> >>>> + shift = INDICATOR_ALLOCATION_SHIFT;
> >>>> + mask = INDICATOR_ALLOCATION_MASK;
> >>>> + break;
> >>>> + case 9005: /* global interrupt */
> >>>> + shift = INDICATOR_GLOBAL_INTERRUPT_SHIFT;
> >>>> + mask = INDICATOR_GLOBAL_INTERRUPT_MASK;
> >>>> + break;
> >>>> + case 9006: /* error log */
> >>>> + shift = INDICATOR_ERROR_LOG_SHIFT;
> >>>> + mask = INDICATOR_ERROR_LOG_MASK;
> >>>> + break;
> >>>> + case 9007: /* identify */
> >>>> + shift = INDICATOR_IDENTIFY_SHIFT;
> >>>> + mask = INDICATOR_IDENTIFY_MASK;
> >>>> + break;
> >>>> + case 9009: /* reset */
> >>>> + shift = INDICATOR_RESET_SHIFT;
> >>>> + mask = INDICATOR_RESET_MASK;
> >>>> + break;
> >>>> + default:
> >>>> + DPRINTF("rtas_set_indicator: indicator not implemented: %d",
> >>>> + indicator);
> >>>> + rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
> >>>> + return;
> >>>> + }
> >>>> +
> >>>> + encoded = ENCODE_DRC_STATE(indicator_state, mask, shift);
> >>>> + /* clear the current indicator value */
> >>>> + *pind &= ~mask;
> >>>> + /* set the new value */
> >>>> + *pind |= encoded;
> >>>> + rtas_st(rets, 0, RTAS_OUT_SUCCESS);
> >>>> +}
> >>>> +
> >>>> static int pci_spapr_swizzle(int slot, int pin)
> >>>> {
> >>>> return (slot + pin) % PCI_NUM_PINS;
> >>>> @@ -624,6 +733,14 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
> >>>> sphb->lsi_table[i].irq = irq;
> >>>> }
> >>>>
> >>>> + /* make sure the platform EPOW sensor is initialized - the
> >>>> + * guest will probe it when there is a hotplug event.
> >>>> + */
> >>>> + spapr->state &= ~(uint32_t)INDICATOR_EPOW_MASK;
> >>>> + spapr->state |= ENCODE_DRC_STATE(0,
> >>>> + INDICATOR_EPOW_MASK,
> >>>> + INDICATOR_EPOW_SHIFT);
> >>>> +
> >>>> if (!info->finish_realize) {
> >>>> error_setg(errp, "finish_realize not defined");
> >>>> return;
> >>>> @@ -1056,6 +1173,8 @@ void spapr_pci_rtas_init(void)
> >>>> spapr_rtas_register(RTAS_IBM_CHANGE_MSI, "ibm,change-msi",
> >>>> rtas_ibm_change_msi);
> >>>> }
> >>>> + spapr_rtas_register(RTAS_SET_INDICATOR, "set-indicator",
> >>>> + rtas_set_indicator);
> >>>> }
> >>>>
> >>>> static void spapr_pci_register_types(void)
> >>>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> >>>> index 0ac1a19..fac85f8 100644
> >>>> --- a/include/hw/ppc/spapr.h
> >>>> +++ b/include/hw/ppc/spapr.h
> >>>> @@ -72,6 +72,9 @@ typedef struct sPAPREnvironment {
> >>>>
> >>>> /* state for Dynamic Reconfiguration Connectors */
> >>>> sPAPRDrcEntry drc_table[SPAPR_DRC_TABLE_SIZE];
> >>>> +
> >>>> + /* Platform state - sensors and indicators */
> >>>> + uint32_t state;
> >>>
> >>> Do you think it'd be possible to create a special DRC device that
> >>> contains all of its tables and global state and also exposes sensors and
> >>> indicators? That device could then get linked via qom links to the PHBs
> >>> for their slots.
> >>
> >> Sorry for the delay, I've been going back through the code with this
> >> suggestion in mind and there does seem to be a lot of state that
> >> can be nicely encapsulated by modeling the DR Connectors as a QOM
> >> "device" (though I haven't gone as far as to make them actual
> >> DeviceState's since it's more of a firmware abstraction than real
> >> hardware)
> >>
> >> I'm not sure what the best way to plumb things together is. As a first
> >> run, since each DRC must have an index (drc_index) as per spec, I've
> >> put them under /machine/DRConnector as a flat list, where top-level
> >> PHB/CPU/MEMORY DRCs would be allocated statically during sPAPR machine
> >> init (since the corresponding DRC indexes/types/etc are hard-coded into
> >> the top-level of the boot-time DT anyway, though I guess we could also
> >> allocate these on the fly...seems messier though than just plugging new
> >> resources into existing DRCs)
> >>
> >> PHB's in turn will associate themselves with a DRC via an attach/detach
> >> method as part of realize (and in the future, hotplug hooks, though
> >> that's not part of the series). The PHBs in turn will create a DRC for each
> >> hotpluggable PCI slot.
> >>
> >> Creation is via:
> >>
> >> sPAPRDRConnector *spapr_dr_connector_new(sPAPRDRConnectorType type,
> >> uint32_t id);
> >>
> >> where the code computes the drc index based on <type> (one of phb, cpu, pci,
> >> memory, etc) and <id>, and sticks them under
> >> /machine/dr-Connector/<drc_index>
> >>
> >> Any pci/phb/cpu hotplug hooks can then fetch the DRC via type/id,
> >> and hotplug/unplug via attach()/detach() methods. attach() adds
> >> the attached/hotplugged DeviceState as a link property of the
> >> DRC object, and sets the initial sensor state.
> >>
> >> rtas calls can fetch DRCs via drc_index, and set/get sensor state
> >> via DRC sensor get/set methods.
> >>
> >> Hotplug event delivery still lives outside of DRC implementation for now. I
> >> thought of moving them into DRC, but decisions like whether we should
> >> emit events during coldplug/initial boot seemed to require pushing
> >> a lot of general machine state into DRCs and making the encapsulation
> >> seem superficial.
> >>
> >> Things end up looking like this (2xxxxxxx are PHBs, 4xxxxxxx are PCI
> >> slots):
> >>
> >> address@hidden:~/w/qom/machine/dr-connector$ ls
> >> 20000000 40000018 40000038 40000058 40000078 40000098 400000b8 400000d8 400000f8
> >> 40000000 40000020 40000040 40000060 40000080 400000a0 400000c0 400000e0 type
> >> 40000008 40000028 40000048 40000068 40000088 400000a8 400000c8 400000e8
> >> 40000010 40000030 40000050 40000070 40000090 400000b0 400000d0 400000f0
> >> address@hidden:~/w/qom/machine/dr-connector$ cd 40000000/
> >> address@hidden:~/w/qom/machine/dr-connector/40000000$ ls -l
> >> total 0
> >> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31 1969 allocation-state
> >> lrwxr-xr-x 2 mdroth mdroth 4096 Dec 31 1969 device -> ../../../machine/peripheral/hp0
> >> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31 1969 drc-index
> >> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31 1969 entity-sense
> >> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31 1969 indicator-state
> >> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31 1969 isolation-state
> >> -rw-r--r-- 1 mdroth mdroth 4096 Dec 31 1969 type
> >> address@hidden:~/w/qom/machine/dr-connector/40000000$ cat allocation-state
> >> 1
> >> address@hidden:~/w/qom/machine/dr-connector/40000000$ cat indicator-state
> >> 1
> >> address@hidden:~/w/qom/machine/dr-connector/40000000$ cat ../../../machine/peripheral/hp0/type
> >> virtio-net-pci
> >> address@hidden:~/w/qom/machine/dr-connector/40000000$
> >>
> >> Hopefully this is sort of the approach you were thinking of?
> >
> > This looks quite neat so far, looking forward to the patches :).
>
> Michael,
>
> Do you have this code/patches anywhere that I could use? I have got
> the initial working versions of both CPU and memory hotplug now for
> sPAPR guests based on top of your old PCI hotplug patchset and it
> would be good to rebase them on top of your DR connector device work.
Hi Bharata,
Here's the latest branch:
https://github.com/mdroth/qemu/commits/spapr-pci-hotplug-ppc-next-cleanup4.2
The sPAPRDrcEntry stuff is now modeled by the sPAPRDRConnector QOM object in
hw/ppc/spapr_drc.c, which manages the device's life-cycle based on
rtas-set-sensor-state calls from the guest. As part of qemu-side hotplug/unplug,
you use the attach/detach methods of the DRC to associate DT bits and callbacks
with it, for things like device cleanup or RTAS calls that fetch a DT node from
the device associated with a particular DRC.
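
To make the attach/detach flow a bit more concrete, here is a rough standalone
sketch. It is not the code from the branch: the Device and DRConnector structs,
the drc_* helpers and the mark_cleaned_up callback are all hypothetical
stand-ins, and the rule that cleanup is deferred until the guest isolates the
connector is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, not the real types from hw/ppc/spapr_drc.c. */
typedef struct Device {
    const char *name;
    bool cleaned_up;
} Device;

typedef void (*DetachCb)(Device *dev);

typedef enum { DRC_ISOLATED = 0, DRC_UNISOLATED = 1 } IsolationState;

typedef struct DRConnector {
    uint32_t drc_index;
    IsolationState isolation;
    Device *dev;            /* attached device, NULL when the DRC is empty */
    const void *fdt;        /* device tree fragment describing dev */
    DetachCb detach_cb;     /* cleanup to run once the device is released */
    bool awaiting_release;  /* detach requested while guest still uses dev */
} DRConnector;

/* qemu-side hotplug: associate the device and its DT bits with the DRC. */
static void drc_attach(DRConnector *drc, Device *dev, const void *fdt)
{
    assert(drc->dev == NULL);
    drc->dev = dev;
    drc->fdt = fdt;
    drc->awaiting_release = false;
}

/* Drop the association and run the cleanup callback, if one was given. */
static void drc_release(DRConnector *drc)
{
    Device *dev = drc->dev;

    drc->dev = NULL;
    drc->fdt = NULL;
    drc->awaiting_release = false;
    if (drc->detach_cb) {
        drc->detach_cb(dev);
        drc->detach_cb = NULL;
    }
}

/* qemu-side unplug: release now if isolated, otherwise wait for the guest. */
static void drc_detach(DRConnector *drc, DetachCb cb)
{
    assert(drc->dev != NULL);
    drc->detach_cb = cb;
    if (drc->isolation == DRC_ISOLATED) {
        drc_release(drc);
    } else {
        drc->awaiting_release = true;
    }
}

/* Guest-driven state change, as an RTAS handler might forward it. */
static void drc_set_isolation_state(DRConnector *drc, IsolationState state)
{
    drc->isolation = state;
    if (state == DRC_ISOLATED && drc->awaiting_release) {
        drc_release(drc);
    }
}

/* What an RTAS call fetching the DT node for this DRC could return. */
static const void *drc_get_fdt(const DRConnector *drc)
{
    return drc->dev ? drc->fdt : NULL;
}

/* Example cleanup callback: just records that cleanup happened. */
static void mark_cleaned_up(Device *dev)
{
    dev->cleaned_up = true;
}
```

In this sketch a hotplug hook would call drc_attach(), the guest would
unisolate the connector and use the device, and a later drc_detach() would
only run mark_cleaned_up() once the guest sets the connector back to isolated.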
I still need to fix endian issues, and am realizing the DR connectors and DT
bits for PHBs are not actually a prereq for PCI hotplug, so I may be pulling
that out to a separate series specific to enabling PHB hotplug (namely for
VFIO hotplug). I realize your CPU/MEM patches sort of depend on the top-level
PHB device tree code, so I'm not sure how best to deal with that. Worst case,
we'd roll the initial code into your series and base a follow-up series on top
of that instead.
Let me know if you have any questions.
>
> Regards,
> Bharata.