Sorry, just saw this now.
If you use a CPU's adder to implement a counter, but the value to be incremented lives in a PIO, then the CPU must use IORD to read the value, increment it, then use IOWR to write it back. That is a three-step read-modify-write, so if multiple CPUs want to manipulate the counter, they first need to acquire a mutex to guarantee mutually exclusive access to it.
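To make the software approach concrete, here is a rough sketch of that read-modify-write. The IORD/IOWR definitions, COUNTER_PIO_BASE, and the counter_lock/counter_unlock helpers are stand-ins I made up so the snippet compiles on a host; on the board, IORD/IOWR come from the HAL's io.h and the lock would be a hardware mutex shared by the CPUs.

```c
#include <stdint.h>

/* Stand-ins so the sketch compiles on a host. On the target, IORD/IOWR
   come from the HAL's io.h and the base address from system.h. */
static volatile uint32_t sim_pio[1];
#define COUNTER_PIO_BASE          sim_pio
#define IORD(base, regnum)        ((base)[(regnum)])
#define IOWR(base, regnum, data)  ((base)[(regnum)] = (data))

/* Hypothetical lock; a real system would use a hardware mutex
   that all CPUs can reach. */
static int lock_taken;
static void counter_lock(void)   { lock_taken = 1; }
static void counter_unlock(void) { lock_taken = 0; }

/* Read-modify-write: three steps that must not interleave with
   the same sequence running on another CPU. */
static void sw_counter_increment(void)
{
    counter_lock();
    uint32_t value = IORD(COUNTER_PIO_BASE, 0);   /* 1. read      */
    IOWR(COUNTER_PIO_BASE, 0, value + 1);         /* 2-3. add, write back */
    counter_unlock();
}
```

Without the lock, two CPUs could both read the same old value and both write back old + 1, losing one increment.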
What we are asking you to do here is to implement the counter directly in hardware, such that a CPU's write to, for example, address 0 of the counter causes the value to increment automatically, and a write to, for example, address 4 of the counter causes the value to decrement automatically.
With such a counter, the CPUs don't actually need a mutex: the whole update happens inside the peripheral in response to a single write, only one CPU can access the counter at a given time, and the Avalon bus arbitrates between CPUs if they ever try to access it simultaneously. So if you don't need to read the value back, but just increment/decrement it, then such a hardware primitive is much faster than a software-based exclusive access policy for a shared variable.
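If it helps, here is a small behavioural model in C of what the hardware counter does. This is only a sketch for reasoning about it, not the HDL you are asked to write; hw_counter_t, hw_counter_write, and the offset names are hypothetical, and the offsets 0 and 4 are just the example addresses mentioned above.

```c
#include <stdint.h>

/* Hypothetical register map, using the example addresses from the text. */
#define COUNTER_INC_OFFSET 0u  /* any write here increments the counter */
#define COUNTER_DEC_OFFSET 4u  /* any write here decrements the counter */

/* Software model of the peripheral. The bus delivers one write at a
   time, so each call below stands for one arbitrated bus transfer. */
typedef struct {
    uint32_t value;
} hw_counter_t;

static void hw_counter_write(hw_counter_t *c, uint32_t offset, uint32_t data)
{
    (void)data;  /* the written data is ignored; the address selects the action */
    if (offset == COUNTER_INC_OFFSET) {
        c->value++;
    } else if (offset == COUNTER_DEC_OFFSET) {
        c->value--;
    }
}
```

On the CPU side, each update would then be a single write with no lock around it, for instance something like IOWR_32DIRECT(base, 0, 0) to increment and IOWR_32DIRECT(base, 4, 0) to decrement, where base is whatever name your system assigns to the counter. Note that IOWR takes a register (word) index rather than a byte offset, so address 4 corresponds to register 1 there.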