Ppc based weighted work conserving round robin arbiter

 
 
Verified

The weight functionality of the round robin arbiter allows each requestor to be granted a number of acknowledges according to its predefined weight, so using this arbiter results in a balanced number of acknowledges for each valid requestor.Round robin arbitration is a scheduling scheme which gives to each requestor its share of using a common resource for a limited time or data elements. The basic algorithm indicates that once a requestor has been serves he would “go around” to the end of the line and be the last to be served again. Using a parallel prefix computation (PPC) speeds up the next granted requestor calculation and improves the circuit timing.

This Component contains the verified verilog RTL code for a work conserving round robin arbiter, meaning there are no cycles spent on requestors that are inactive.

weighted PPC based work conserving round robin arbiter block diagram

Block diagram

The basic block diagram shows the weighted PPC based work conserving round robin arbiter:

Features

  • Work conserving round robin arbitration with weighted share of the number of grants per each requestor setting strict ratio of grants between all constantly active requests.
  • Weight setting for each of the requestors. A weight of 0 indicates that the requestor is masked, so its request will be ignored.
  • No time slots wasted on inactive requestors.
  • ack signal, for downstream stall of the arbiter, the grant vector will change only if ack is asserted. For a new grant every cycle; the ack signal can be kept asserted.
  • Fast parallel computation using PPC method.
  • Valid output indicating the arbiter grant is valid, meaning there are valid requests asserted.
  • Wrap around, meaning that the round robin reaches the end of the vector, it will return to the requestors at the beginning without losing cycles.
  • Parameterized output stage, providing either one-hot, binary decoded or thermometer decoded output for usage in different applications.

Optional features

  • Optional features are available through different downloadable files
  • synchronous and asynchronous reset.

Deliverables

  • verified RTL code
  • Access to detailed design documentation
  • 1 year support and customization service.

 

parameters table

parameters for typical Weighted round robin arbiter

Parameter

Valid values

description

WEIGHT_S

any

the width of the weight vectors

WIDTH

>7

Number of requestors

OUTPUT_TYPE

0,1,2

0 – ceil (log(WIDTH)) binary encoded index of the     granted requestor.

1 -  WIDTH one-hot encoding of the granted requestor

2 – WIDTH thermometer encoding of the granted requestor, all bits lower than the selected are 0.

Interface table

Signal name

Direction/width

description

clk

Input

Clock signal

resetb

Input

Active low reset signal, when asserted, mask would be set 0.

ack

input

Indication that the previous grant has been consumed by the downstream logic and the arbiter may progress to the next requestor.

valid

Output

 

Indication that the current grant is valid and is a result of a valid requestor.

request

Input [WIDTH – 1:0]

Request input vector

grant

Output [WIDTH – 1:0]

Grant output, either binary decoded or one-hot.

In case of binary decoding the width of the grant would be log2(request WIDTH) rounded up.

Weight[WIDTH – 1:0]

Input [WEIGHT_S – 1:0]

The number of weights in the weight vector is equal to the number of requestors. The WEIGHT_S parameter is the width of the weight input vectors.

Round Robin Arbitration

Round robin arbitration is a scheduling scheme which gives to each requestor its share of using a common resource for a limited time or data elements. The basic algorithm is that once a requestor has been serves he would “go around” to the end of the line and be the last to be served again.

The simplest form of round robin arbiter is based on assignment of a fixed time slot per requestor; this can be implemented using a circular counter. Alternatively, the arbiter can be defined to allow a specific number X of data elements per each requestor, in which case X data elements from each requestor would be processed before moving to the next one.

Round robin arbiters are typically used for arbitration for shared resources, load balancing, queuing systems, and resource allocation. Any application requiring a minimal fairness where none of the requestors is suffering from starvation is a valid application for a round robin arbiter.

Work conserving round robin

The term “work conserving” refers to the ability of a round robin arbiter to skip requestors that do not have any data to process and move directly to the next requestor who has valid data to process. This method of arbitration is more efficient as it does not “waste” time on idle requestors.

The work conserving round robin (WCRRB) arbiter is therefore required to constantly look ahead into the next available requestors and enable the next valid requestor to be granted immediately after the current requestor is complete.

The function of looking constantly into all requests in order to calculate the next valid grant, may be costly in logic, and as the number of requestors get bigger, faster computation methods must be applied in order to meet timing and limit the size of the logic.

Weighted PPC work conserving round robin arbiter

The option to assign weight for each of the requestors to the arbiter enables to create a priority between the requestors so that the ratio of processed data items from each requestor would be according to its weight. For example a weight = 7 requestor will have 7 times more time slots than a weight = 1 requestor.

This mechanism allows the prioritization of some flows over others in an arbitration context.

The function of the weight is achieved by blocking the incoming request once the predefined number of grants from the specific requestor is complete or the requestor has the requests signal de-asserted.

Parallel prefix computation (PPC)

Parallel prefix computation is a type of parallel calculation serving to reduce the number of stages it takes to calculate a function on multiple inputs by using parallel computations where possible. The example below is for a OR computation resulting in a thermometer decoded vector. From the first bit which is set onwards, all bits in the output vector would be set.

The OR parallel prefix computation enables the arbitration logic to find quickly the first request which is asserted, resulting in the ability to find the next valid requestor with the minimal possible logic cones.

The function of an OR PPC is demonstrated in the following table and drawing:

PPC parallel prefix computation


Request vector

PPC output

0001

1111

0010

1110

1100

1100

0100

1100

1000

1000

0011

1111

1001

1111

1110

1110

 

PPC Work conserving round robin arbiter

Using the PPC computation for a work conserving round robin uses a mask to block the currently granted requestor, enabling the next selection of the PPC to indicate the next requestor in line.

To generate the requestor ID from the PPC output vector the vector is binary decoded, resulting with the ID of the granted requestor.

The arbiter also handles the wrap-around of the requesting vector. At some point, one of the highest bits in the requestor’s vector is selected and the next selection would have to be at the lower requestors.  The wrap-around operation is performed without loss of cycles.

The simple form of the work conserving round robin arbiter is well suited for fair arbitration between equal priority and equal processing time requestors, ensuring that the same number of grants will go to each valid requestor.

The arbiter outputs would also indicate that the current result is valid. This would enable downstream logic to ignore the arbiter output in the situation where no requests are asserted.

There is one cycle delay between the assertion of a request and the grant for it, so the arbiter can be integrated into a block as a single pipe stage.

Valid signal function

The purpose of the Arbiter valid signal is to enable downstream logic to distinguish between the case where no requests are asserted and the case where request 0 is asserted in a binary decoded arbiter. Those two cases would result in a binary decoded output of 0 and the valid bit would indicate that request 0 is granted.

Acknowledge signal function

The ack signal serves for stalling the arbiter in the case where downstream logic is unable to process the grant. This function enables the integration of the arbiter in flow controlled pipeline implementation.

The PPC mask, responsible for masking the current request and causing the arbiter to grant the next requestor, will only change if the ack signal is asserted and the PPC output is valid.

Output type

The arbiter can have three types of output, depending on the requirement of the surrounding logic.

A one-hot output type outputs a vector where only the bit associated with the selected requestor is asserted in the grant vector. The grant vector in this case would have the width of the request vector.

A binary decoded output would have the index of selected requestor, decoded using the number of bits required to decode the highest requestor number. The lowest requestor index is 0. Therefore, the valid bit should be used to determine if the arbiter output is valid.

The Thermometer decoding option would output a thermometer decoder vector. This can be used by downstream logic for further calculation or serve other scheduling schemes using this round robin arbiter.

 

Output Type

Parameter setting

description

Binary decoded

0

ceil (log(WIDTH)) binary encoded index of the granted requestor

One-hot

1

WIDTH one-hot encoding of the granted requestor

Thermometer

2

WIDTH one-hot encoding of the granted requestor

Weight logic

The weight of each requestor is actually the maximal number of grants the requestor will receive in the current round. If the request signal is de-asserted, the arbiter would move to the next requestor.

A weight of 0 serves to mask off the requestor, indicating to the arbiter that this request is to be ignored.

The weight of multiple requestors may be the identical and not all weight options must be used.

The minimal weight used, is a weight of 1 so each valid requestor would have at least one grant before it is blocked by the weight logic.

 

Test Items

the below table summarizes the test items that were checked for the arbiter

Item name

Description

Check resulting arbiter distribution

Run many requests where all requestors have constant valid request asserted. See that the number of grants fit the weight  definition

Use same weight for all

Check equal distribution

De-asserted request

De-asserting a request so it cannot fill its allocated weight, see no effect on rest of requestors

Use weight 0

See that the requestor is ignored

Use constantly asserted ack

Check distribution

 

 

 

 

 

Usability

The weighted work conserving round robin arbiter is designed to be used in the context of pipelined decision blocks and within multiple queue or FIFO systems where its advantages are:

  1. Enable fair arbitration, granting all requestors the number of times corresponding to their weight.
  2. Simple integration into FIFO systems.
  3. Fast PPC selection logic

Care must be taken in the following issues:

  1. When the request vector is fast changing and has bits that get de-asserted, overall fairness may be compromised

Integrating with a FIFO system

 

The typical usage of a round robin arbiter for a FIFO system with multiple FIFOs and an arbiter to select which of them will be extracted is implemented through the following connections:

  1. Connect the request vector to the ~empty or each of the FIFOs so when the FIFO is not empty, a request to the arbiter will be asserted.
  2. Connect the weight vector to external configuration registers, setting the priority for each queue.
  3. Connect a one-hot grant vector (OUTPUT_TYPE=1) to the extract of the FIFOs. This would be masked with the downstream flow control signaling.
  4. Connect the arbiter ack signal to the downstream flow control signal so each time there is no flow control, the ack would be asserted providing that the arbiter valid is asserted.