# AN ENHANCED OPS ARCHITECTURE WITH OPTICAL BUFFERS

Tawfik Ismail<sup>a</sup>, Haitham S. Hamza<sup>b</sup>, Khaled Elsayed<sup>c</sup> and Jala Elazab<sup>a</sup> Department of EAL National Institute Of Laser Enhanced Sciences<sup>a</sup> Department of Information Technology Faculty of Computers and Information<sup>b</sup> Department of Electronics and Communication Faculty of Engineering <sup>c</sup> Cairo University

Abstract—Optical Packet Switching (OPS) is a promising technology to enable next-generation high-speed IP networks. One of the main components in an OPS network is the optical switch architecture, which provides the basic functionality of switching packets from input ports to the desired output ports while keeping the data in the optical domain. In asynchronous OPS networks, contention arises when two or more packets need to be directed to the same output port at the same time, leading to packet loss and thus lower switching performance. Optical buffering, implemented with fiber delay lines (FDLs), is one of the approaches used to resolve contention. In this paper, we focus on the design of a new FDL-based switch architecture that resolves packet contention in asynchronous OPS networks and achieves the same performance as the best known architectures with reduced hardware complexity. Our analysis shows that the proposed architecture has some interesting properties compared to existing designs. For example, for the same packet loss level, the proposed architecture requires fewer switch ports than the generic architecture.

## I. INTRODUCTION

Wavelength Division Multiplexing (WDM) technology has emerged as a cost-efficient method to increase transmission capacity. In WDM, many optical carrier signals are multiplexed on one optical fiber by assigning a different wavelength to each optical carrier. This multiplexing increases capacity without installing more fibers. Theoretically, the total capacity can reach 1.6 Tbit/s over one fiber where each fiber can carry 1000 wavelengths. WDM systems also permit unidirectional and bidirectional communication over a single fiber [1], [2], [3]. WDM technology is evolving from optical circuit switching to optical burst switching and optical packet switching.

Optical Circuit Switching (OCS) is the first generation of optical WDM network architectures. It is the simplest approach to designing an optical network and relies on wavelengths for routing. Circuits or connections are established end-to-end; these connections reserve wavelengths in each fiber and are known as lightpaths between all pairs of end-nodes. Each connection goes through a setup phase and a release phase. For the duration of the connection, the wavelength is reserved for this lightpath on a particular link and cannot be used for any other connection until the lightpath is released. Therefore, OCS is suitable for very high-speed, continuous traffic but inefficient for data traffic, because the coarse granularity of the aggregated traffic reduces the utilization of the OCS network [4], [5], [6], [7].

In an Optical Burst Switching (OBS) network, the control and data information travel separately on different channels. Control packets are sent first to reserve the resources in the intermediate nodes along the end-to-end path. The data burst follows the control packet after some offset time [4], [5], [6], [7], [8], [9].

Optical Packet Switching (OPS) is the next step in this evolution and uses packets as the switching unit in an optical network. It offers fine granularity and more effective use of bandwidth in large-capacity systems operating at Tb/s speeds. In an OPS network, the control and data information are sent simultaneously on the same channel, and each intermediate node converts and processes the control headers to make the switching decisions while the packets themselves remain in the optical domain. All-optical packet switching is expected to mature and to be combined with WDM systems to form the high-speed transport network of this century [2], [10]. The packets can have either fixed or variable length. A fixed-length packet is like an ATM cell, and a variable-length packet is like an IP packet.

In this paper, we focus on the design of an efficient WDM-OPS architecture, with variable packet length as the first consideration in our design. We choose WDM-OPS because it is the future generation of WDM networks, provides fine switching granularity, and is cost-efficient and well suited to Internet traffic based on IP packets.

In asynchronous optical packet switching, packet contention is a major issue. Contention occurs when two or more packets with the same wavelength arrive from two or more input ports and require the same output port at the same time. In such a case, only one packet can be forwarded to the output while the other packets are dropped, which lowers the utilization of the packet switch. Contention can be resolved by any of the following three approaches: optical buffering, wavelength conversion, and space deflection. A combination of more than one approach is called a hybrid approach.

In this paper, we present an enhanced OPS architecture that uses optical buffers to resolve packet contention. We show that, when the number of wavelengths per fiber (W) is larger than the number of fibers (F, denoted M in Section III) and the number of FDLs (B) is larger than F, the proposed architecture exhibits lower packet loss and requires fewer ports than existing architectures [12], [15], [19].

The rest of this paper is organized as follows. Section II gives a brief overview of optical buffering architectures. Section III presents the proposed architecture. Section IV analyzes its performance via simulation and compares it with existing solutions. The conclusion is provided in Section V.

## II. OPTICAL FIBER BUFFER

The optical buffer in optical packet switching networks is used to resolve packet contention. It is implemented as a set of Fiber Delay Lines (FDLs) made from optical fibers with different delay times, so that contending packets can be delayed by different amounts. A contending packet is forwarded to the FDL whose delay is sufficient to resolve the contention.

Let D be the basic delay unit of the FDLs, called the granularity, and let B be the number of FDLs. The length of the ith FDL is $L_i$, where $L_i = i \times D$ and $1 \le i \le B$. Each D represents one fiber loop in the FDL; thus the first FDL has a length of 1D, the second 2D, the third 3D, and so on. The data circulates through these loops for a variable period of time according to the delay required for contention resolution.
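As a small illustration of this definition, the following Python sketch lists the delays of a bank of B delay lines. The function name `fdl_delays` and the example values are assumptions chosen for illustration and do not come from the paper.

```python
def fdl_delays(num_fdls, granularity):
    """Return the delays L_i = i * D for i = 1..B, in the same unit as D."""
    if num_fdls < 1 or granularity <= 0:
        raise ValueError("need B >= 1 and D > 0")
    return [i * granularity for i in range(1, num_fdls + 1)]


# Hypothetical example: B = 4 delay lines with a granularity of D = 128 bytes.
print(fdl_delays(4, 128))  # [128, 256, 384, 512]
```

The longest delay available is therefore B × D, which is the quantity that bounds how long a contending packet can be held.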

According to the position of the buffers, optical buffering can be classified into different types: buffers dedicated to a wavelength, shared per port, and shared per node [8], [9], [16], [17]. In each type, optical buffering can be arranged in a Feed-Forward (FF), Feed-Back (FB) (or recirculation), or hybrid fashion. In FF architectures, each delay line connects an output port of the switch to an input port of the buffer switch. The contending packet is forwarded to the appropriate delay line and then leaves the buffer regardless of whether it succeeds in accessing the output port. Thus, a packet has only one chance to use the buffer. In FB architectures, on the other hand, if an attempt to access the output port fails after buffering, the packet can be re-circulated in the buffer until its transport succeeds or the maximum number of re-circulations is reached. Thus, this buffer provides more chances to access the output ports, at the cost of increased buffering delay and signal degradation. In hybrid architectures, the FF and FB buffering schemes are combined.

In buffers dedicated to a wavelength, each output channel has a buffer dedicated to a single wavelength. If two or more packets with the same wavelength try to access the same fiber, the contending packets are delayed in their dedicated buffer for the duration of the contention and then forwarded to the output port. This method gives a lower blocking probability, but it requires more buffers and a larger switch fabric. On the other hand, shared buffering can achieve good performance for packet switching and can reduce the total number of buffers in a switch while achieving a suitable level of packet loss [15], [19]. In the Shared per Port (SP) scheme, all output ports of a fiber can access only the buffer dedicated to that fiber, while in the Shared per Node (SN) scheme, all output ports of the switch can access the same buffers. The number of FDLs directly affects the blocking performance and the average delay.

In our study, we focus on the shared per port technique because it has good performance, is suitable for asynchronous packets, and is simple to manage compared to the shared per node technique. In addition, we focus on asynchronous optical packet switching with variable packet length and a feed-forward architecture, which is more suitable for IP traffic [19].

## III. PROPOSED ARCHITECTURE

In this section, we propose a new feed-forward shared per port optical buffering architecture and study the routing algorithm for this architecture.

### A. Proposed Design

Several optical packet switch architectures based on the output-buffer scheme have been proposed. In the generic architecture, feed-forward shared per port, a set of optical buffers is shared by many wavelengths on a single output port. This architecture requires two stages of switching elements, as shown in Fig. 1. In the first stage, B ports are needed for buffers and W ports are needed for wavelengths, so a total of [B + W] ports are needed for each output port. Therefore, the size of the switching fabric is $MW \times M[B + W]$, where M is the number of input fibers. In the second stage, a switching fabric of size $[B \times W]$ and MB FDLs are required.

In the proposed architecture, we use the output-buffer technique as shown in Fig. 2. The difference between the proposed architecture shown in Fig. 2 and the generic architecture shown in Fig. 1 is that the buffer bank and the buffer switch are swapped. In the generic architecture, the main switch is connected to the buffer bank, then to the buffer switch, which is finally connected to the output fiber. In the proposed architecture, the main switch is connected to the buffer switch, which is connected to the buffer bank. The buffer bank is connected to a multiplexer and demultiplexer, which are connected to the output fiber. The output-buffer switch consists of $(M-1) \times W$ ports while the main switch consists of $MW \times [(M-1) + W]$.

### B. Routing Algorithm

In an OPS equipped with optical buffers, a scheduling algorithm is needed to direct packets to the FDLs. When several packets with the same wavelength arrive simultaneously, the control scheduler drives every packet to an output port. If the needed wavelength on the output port is not in use, the control unit forwards the packet to the output port directly. Otherwise, the packet is directed to the proper FDL. A list of buffering schemes used in asynchronous communication is proposed in [18].



Fig. 1. Generic shared optical buffers per port architecture.



Fig. 2. Proposed optical buffers architecture.

In this study, we focus on the Round Robin (RR) scheduling algorithm, which is the simplest scheduling algorithm for packet queue handling in an optical network and is also easy to implement in software [19].

RR scheduling serves each packet in equal portions and in order, giving all packets the same priority. The packets are served according to the order of the switch input ports: the packet arriving at port 1 is served first, then the packet arriving at port 2, and so on. Once a route is determined for a packet, the next packet is served. RR scheduling is non-preemptive; thus, once a packet is given an FDL, the FDL cannot be taken away.

The procedure used in the selection of the FDLs is shown in Fig. 3. This procedure is based on the Round Robin scheduling algorithm [18]. When a packet reaches the input port and requires a certain output port, the buffer manager checks whether this port is available, i.e., no packet with the same wavelength is accessing this port at this time. If the wavelength is available, the packet is sent to the required port directly; otherwise, the minimum required number of delay units $\Delta$ is calculated. If the FDL with the calculated number of units is not available, the system steps up the number of units until a free FDL is found or the maximum FDL length is exceeded. In the latter case, the packet is discarded and counted as packet loss.
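The selection procedure described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `select_fdl`, the `busy` set used to track occupied delay lines, and the convention of returning 0 for direct forwarding are all assumptions. The sketch indexes the FDLs 1..B as in Section II, so a packet is dropped only when the required index exceeds B.

```python
import math


def select_fdl(required_delay, granularity, busy, num_fdls):
    """Pick the shortest free FDL whose delay covers `required_delay`.

    busy: set of FDL indices (1..B) that are currently occupied
          (hypothetical bookkeeping, not specified in the paper).
    Returns 0 if no buffering is needed, an FDL index in 1..B,
    or None if the packet has to be discarded.
    """
    if required_delay <= 0:
        return 0  # wavelength is free: forward directly
    # Minimum number of delay units that covers the required delay.
    delta = math.ceil(required_delay / granularity)
    # Step up until a free FDL is found or the longest FDL is exceeded.
    while delta <= num_fdls and delta in busy:
        delta += 1
    return delta if delta <= num_fdls else None


# Hypothetical example: D = 10 time units, B = 4, FDL 3 already in use.
print(select_fdl(25, 10, {3}, 4))  # 4
```

Stepping up by one unit at a time keeps the extra delay as small as possible, at the price of leaving a gap on the output channel between the previous packet and the delayed one.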

Let $l_m^{(i)}$ denote the length of a packet observed at port m in the $i^{th}$ cycle, and let $t_m^{(i)}$ denote the time difference between the starting time of the $i^{th}$ cycle and the packet arrival at port m. The buffer manager calculates the total time needed for this packet from its arrival information: the length of each packet and the arrival gap (the time delay between two consecutively generated packets). It maintains the buffer occupancy $q^{(i)}$, which represents the time at which the packets stored in the buffer in the $i^{th}$ cycle have departed and the buffer becomes idle. This time is defined relative to the starting time of the $i^{th}$ cycle. Hereafter we use $q_m^{(i)}$ to denote the buffer occupancy in the $i^{th}$ cycle at port m. The buffer manager calculates the delays for new packets coming from all ports during each cycle time T by scheduling ports $1, 2, \ldots, M$ sequentially. When two or more packets arrive in the same cycle, the buffer manager calculates the delay for the packet at the port with the smallest index first and updates the buffer occupancy for the remaining packets. The buffer occupancy is updated whenever a packet enters the buffer.

Theoretically, $q_{m-1}^{(i)} - t_m^{(i)}$ is a suitable delay for each packet to avoid packet collision. However, because of the discrete-time nature of the FDL buffer, whose delays are multiples of *D*, the delay given to a packet must be $\Delta_m^{(i)} D$, where $\Delta_m^{(i)} = \lceil (q_{m-1}^{(i)} - t_m^{(i)}) / D \rceil$ and $\lceil X \rceil$ means the smallest integer greater than or equal to X. A packet enters delay line $\Delta_m^{(i)}$ if $\Delta_m^{(i)} < B$; otherwise it is discarded. When a packet enters the delay line, the buffer occupancy changes to $q_m^{(i)} \leftarrow t_m^{(i)} + l_m^{(i)} + \Delta_m^{(i)} \times D$ so that packets at the subsequent ports are handled appropriately. After calculating the packet delays for all ports, the buffer manager is ready to calculate new packet delays during the next cycle. The pseudo-code for this process is shown in Fig. 4.
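The per-cycle computation described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the names `schedule_cycle`, `DROPPED` and `busy` are hypothetical, the delay is clamped at zero when the buffer is already idle on arrival (a case the text does not spell out), and `busy` is treated as a read-only set of unavailable delay lines. The acceptance test $\Delta < B$ follows Fig. 4.

```python
import math

DROPPED = "dropped"  # hypothetical marker for a discarded packet


def schedule_cycle(arrivals, q_prev, D, B, busy=frozenset()):
    """Assign FDL delays to the packets of one cycle, serving ports in order.

    arrivals: list of (t_m, l_m) for ports m = 1..M; l_m == 0 means no packet.
    q_prev:   buffer occupancy carried into the cycle, relative to its start.
    busy:     indices of delay lines that are unavailable (assumption).
    Returns (decisions, q): per port, the delay granted, DROPPED, or None
    when the port has no packet, plus the final buffer occupancy.
    """
    q = q_prev
    decisions = []
    for t, length in arrivals:
        if length == 0:
            decisions.append(None)
            continue
        # Smallest multiple of D that clears the current occupancy.
        delta = max(0, math.ceil((q - t) / D))
        # Step up while the chosen delay line is unavailable.
        while delta < B and delta in busy:
            delta += 1
        if delta < B:
            q = t + length + delta * D  # occupancy seen by later ports
            decisions.append(delta * D)
        else:
            decisions.append(DROPPED)
    return decisions, q


# Hypothetical cycle: D = 10, B = 4, four ports, port 3 idle.
print(schedule_cycle([(0, 25), (5, 10), (0, 0), (8, 30)], 0, 10, 4))
# ([0, 20, None, 30], 68)
```

In the example, the packet at port 2 arrives at time 5 while the output is occupied until time 25, so it needs 20 time units of delay, which is exactly two delay units.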



Fig. 3. The routing procedure for the proposed architecture.

## IV. SIMULATION AND ANALYSIS

This section is divided into two main parts. The first part validates the proposed architecture against the architecture in ref. [15] by studying the effect of changing the design parameters (D, average load $\rho$, and B) on the switch performance in terms of *packet loss and average delay*. The second part compares the proposed architecture to two main architectures: one is shown in Fig. 1, and the other is the one referred to in ref. [19].

| for $m := 1$ to $M$ do |
|---|
| begin |
| if $(l_m^{(i)} \neq 0)$ then |
| begin |
| $\Delta_m^{(i)} := \left\lceil (q_{m-1}^{(i)} - t_m^{(i)}) / D \right\rceil$; |
| while $\Delta_m^{(i)} < B$ and FDL $\Delta_m^{(i)}$ is not available do |
| $\Delta_m^{(i)} := \Delta_m^{(i)} + 1$; |
| if $\Delta_m^{(i)} < B$ then |
| begin |
| $q_m^{(i)} := t_m^{(i)} + l_m^{(i)} + \Delta_m^{(i)} \times D$; |
| packet m is given delay $\Delta_m^{(i)} D$; |
| end |
| else packet m is discarded; |
| end |
| end |

Fig. 4. The pseudo-code for the proposed architecture.

In our simulation, we use a uniform distribution to generate the packet length for each channel per fiber, where packet generation for channel *i* is independent of packet generation for channel *j* for $i \neq j$, and the packet inter-arrival time has an exponential distribution. The parameters used in the simulation are given in Table I [18], [19].
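A traffic source of this kind could be sketched in Python as follows. The function name `generate_packets`, its parameters, and the way the arrival rate is derived from the load are assumptions made for illustration; the paper does not describe its generator in this detail. Note that a uniform draw between the Table I bounds has a mean of 1,052 bytes, so the sketch derives the load from that mean rather than from the 512-byte average listed in Table I; how the paper reconciles the two is not stated.

```python
import random


def generate_packets(n, load, min_len=64, max_len=2040, rate_bps=40e9, seed=None):
    """Return n (arrival_time_s, length_bytes) tuples for one channel.

    Lengths are uniform on [min_len, max_len]; inter-arrival times are
    exponential with a rate chosen so the offered load equals `load`.
    """
    if load <= 0:
        raise ValueError("load must be positive")
    rng = random.Random(seed)  # independent generator per channel
    mean_len = (min_len + max_len) / 2
    mean_service_s = mean_len * 8 / rate_bps   # mean transmission time
    arrival_rate = load / mean_service_s       # packets per second
    t = 0.0
    packets = []
    for _ in range(n):
        t += rng.expovariate(arrival_rate)     # exponential inter-arrival gap
        packets.append((t, rng.randint(min_len, max_len)))
    return packets


# Hypothetical usage: 5 packets on one channel at a load of 0.8 Erlang.
for arrival, length in generate_packets(5, 0.8, seed=1):
    print(f"{arrival * 1e9:9.1f} ns  {length} bytes")
```

Creating one `random.Random` instance per channel, each with its own seed, is one simple way to keep the channels statistically independent.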

TABLE I. Simulation Parameters

| Parameter                         | Value                              |
|-----------------------------------|------------------------------------|
| Number of Fibers (M)              | 16                                 |
| Number of Channels/Fiber (W)      | 128                                |
| Channel Speed                     | 40.0 Gbps                          |
| Minimum Packet Length $(L_{min})$ | 64 bytes                           |
| Maximum Packet Length $(L_{max})$ | 2,040 bytes                        |
| Average Packet Length             | 512 bytes                          |
| Number of FDLs (B)                | 16, 32, 64                         |
| Unit Length of FDLs (D)           | 0.1 to 1 × Average Packet Length   |
| Guard-band distance               | 12.5 bytes                         |
| Clock Speed                       | 78.2 MHz                           |
| Average Load $(\rho)$             | 0.4, 0.6 and 0.8 Erlang            |
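Because the delays reported later are given in nanoseconds while Table I lists lengths in bytes, it can help to convert the table entries to time at the 40 Gbps channel speed. The short Python sketch below does this; the helper name `bytes_to_ns` is an assumption, and the choice of D = 0.25 is simply the granularity value used in the experiments that follow.

```python
def bytes_to_ns(num_bytes, rate_gbps=40.0):
    """Transmission time in nanoseconds of `num_bytes` at `rate_gbps` Gbit/s."""
    return num_bytes * 8 / rate_gbps  # bits / (Gbit/s) = ns


print(bytes_to_ns(64))          # minimum packet:  12.8 ns
print(bytes_to_ns(512))         # average packet:  102.4 ns
print(bytes_to_ns(2040))        # maximum packet:  408.0 ns
print(bytes_to_ns(12.5))        # guard band:      2.5 ns
print(bytes_to_ns(0.25 * 512))  # D = 0.25 of the average packet: 25.6 ns
```

With D = 0.25, one delay unit therefore corresponds to about 25.6 ns of fiber delay.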

### A. Simulation Validation and Analysis

#### Packet Loss

First, we validate *the effect of the granularity on the packet loss* in the proposed architecture of Section III as compared with the architecture proposed in ref. [15], and then study the effect of the number of FDLs (B) and the average load $\rho$ on the packet loss. Fig. 5 shows that the minimum loss is obtained when the granularity is between 0.20 and 0.40 of the average packet length in both simulations. Moreover, the packet loss of the proposed architecture is less than the packet loss produced by the architecture in ref. [15].

Fig. 5. Packet loss versus normalized granularity (D) with (a) B = 50 and (b) B = 80, at an average load of 0.8 Erlang.

As for the effect of the number of FDLs on the packet loss, Fig. 6 shows the simulation results for shared buffering with different numbers of FDLs (B), where the granularity of the FDLs equals 0.25. We can conclude that when the number of FDLs increases, the packet loss decreases. For example, the packet loss is around 5% when B equals 20, while it declines to 0.05% when B equals 100. To study the effect of the average load ($\rho$) on the packet loss, Fig. 7 shows the simulation results for shared buffering with different $\rho$ and for various numbers of FDLs (16, 32 and 48). The granularity of the FDLs (D) is set to 0.25. As expected, the packet loss rises as $\rho$ increases. For example, when B equals 16 and $\rho$ is about 0.4 Erlang, the packet loss is approximately 0.3%, while the packet loss can decrease to 0.0001%, and it becomes 0.8% when $\rho$ rises to 0.8 Erlang.

#### Average Delay

Second, we validate the effect of the granularity on the average packet delay in the proposed architecture against that of the architecture proposed in ref. [15], and then study the effect of the number of FDLs (B) and the average load ($\rho$) on the average delay. Fig. 8 shows that the minimum is obtained when the granularity is between 0.20 and 0.40 of the average packet length in both simulations, and the curve behavior is also the same.

As for the effect of the number of FDLs on the average delay, Fig. 9 shows the simulation results for shared buffering with different numbers of FDLs (B). We can conclude that as the number of FDLs increases, the average delay increases sharply until it becomes steady when B is above 100. For example, the average delay reaches 200 ns when B equals 20 and increases to 450 ns when B reaches 100.

As for the effect of the average load on the average delay, Fig. 10 shows the simulation results for shared buffering with different average loads $\rho$ and different numbers of FDLs (B = 16, 32, and 48). As can be seen, the average delay increases as the average load increases. For example, when B equals 16, the average delay is 100 ns if $\rho$ equals 0.4 Erlang and increases to 190 ns if $\rho$ changes to 0.8 Erlang. If B rises to 48, the average delay is 106 ns when $\rho$ equals 0.4 Erlang and increases to 400 ns when $\rho$ moves to 0.8 Erlang.







Fig. 7. Packet loss versus average load.

### B. Comparison With Existing Architectures

This section compares the results of the proposed architecture shown in Fig. 2 with those reported in the literature for the same class of output-buffering shared per port architectures: the generic architecture shown in Fig. 1 and the architecture proposed in ref. [12]. The architecture proposed in ref. [12] is used to resolve contention in optical burst switching networks. We can make this comparison based on ref. [19], which treated both optical burst switching and optical packet switching networks as optical-based switching networks and used the same architecture to resolve contention. Fig. 11 shows this comparison. It indicates that the packet loss of the proposed architecture is slightly larger than the packet loss produced by ref. [12], while the average delay is approximately the same in both architectures. On the other hand, the generic architecture has a packet loss slightly larger than that of the other two architectures.

Fig. 8. Average delay versus granularity with (a) B = 50 and (b) B = 80, at an average load of 0.8 Erlang.

Fig. 9. Average delay versus number of FDLs (B).

We can conclude that the packet loss of the proposed architecture lies between that of the generic architecture and that of the architecture proposed in ref. [12] when the average load equals 0.6, while the proposed architecture is better in some regions when the average load increases to 0.8. In addition, the proposed architecture can reduce the number of switch ports when the number of wavelengths per fiber and the total number of FDLs are larger than the number of fibers.



Fig. 10. Average delay versus average load.





Fig. 11. Packet loss versus granularity with average load = (a) 0.6 and (b) 0.8 Erlang.

As for the main switch and buffer switch ports, the generic architecture requires $M \times [W + B]$ output ports for the main switch and BW ports for the buffer switch. The proposed architecture requires $M[W + M - 1]$ output ports for the main switch and $B \times [M - 1]$ ports for the buffer switch. As can be seen, the proposed architecture reduces the main switch ports by the ratio $[W + M - 1]/[W + B]$ if $B > M - 1$. The proposed architecture also reduces the buffer switch ports by the ratio $[M - 1]/W$ if $W > M - 1$. For example, with W = 128, M = 16 and B = 100, the main switch of the proposed architecture needs 0.63 of the ports of the generic architecture's main switch, while its buffer switch needs 0.12 of the ports of the generic architecture's buffer switch.
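The arithmetic behind these ratios can be checked with a short Python sketch. The function name `port_counts` and the dictionary keys are assumptions introduced for illustration; the formulas are the ones stated in the paragraph above.

```python
def port_counts(M, W, B):
    """Return (generic, proposed) port counts for M fibers, W wavelengths, B FDLs."""
    generic = {"main_out": M * (W + B), "buffer": B * W}
    proposed = {"main_out": M * (W + M - 1), "buffer": B * (M - 1)}
    return generic, proposed


generic, proposed = port_counts(M=16, W=128, B=100)
main_ratio = proposed["main_out"] / generic["main_out"]
buffer_ratio = proposed["buffer"] / generic["buffer"]

print(generic)    # {'main_out': 3648, 'buffer': 12800}
print(proposed)   # {'main_out': 2288, 'buffer': 1500}
print(f"{main_ratio:.2f} {buffer_ratio:.2f}")  # 0.63 0.12
```

The main-switch ratio simplifies to (W + M - 1)/(W + B) = 143/228 and the buffer-switch ratio to (M - 1)/W = 15/128, which matches the 0.63 and 0.12 quoted in the text.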

## V. CONCLUSIONS

In this paper, we introduced a modified FDL-based OPS architecture with reduced complexity. We discussed a shared FDL buffering architecture that resolves contention in asynchronous optical switches. The architecture is based on FDLs only, and we used it to evaluate the packet loss and delay performance of shared buffers. By exploiting the granularity of the FDLs, we observed that shared buffers can significantly reduce the packet loss with much smaller switching fabrics and far fewer FDLs.

## REFERENCES

- [1] B. Mukherjee. "WDM optical communication networks: progress and challenges," *IEEE J. Sel. Areas Commun.*, vol. 18, no. 10, pp. 1810-1824, Oct. 2000.
- [2] R. V. Caenegem, J. M. Martinez, D. Colle, M. Pickavet, P. Demeester, F. Ramos and J. Martí. "From IP over WDM to all-optical packet switching: economical view," *J. Lightw. Technol.*, vol. 24, no. 4, pp. 1638-1645, Apr. 2006.
- [3] C.-H. Chang, M. R. Perati, J. Wu and S.-K. Shao. "Performance study of various packet scheduling algorithms for variable-packet-length feedback type WDM optical packet switches," *IEEE Workshop on High Performance Switching and Routing*, June 2006.
- [4] S. J. Ben Yoo. "Optical packet and burst switching technologies for the future photonic internet," *J. Lightw. Technol.*, vol. 24, pp. 4468-4492, Dec. 2006.
- [5] M. Klinkowski, D. Careglio and J. Solé-Pareta. "Wavelength vs burst vs packet switching: comparison of optical network models," *Advanced Broadband Communication Center CCABA Jordi Girona*, Barcelona, 2002.
- [6] P. Bayvel. "Wavelength routing and optical burst switching in the design of future optical network architectures," *Optical Comm.*, 2001.
- [7] J. Y. Jeong and J.-M. Jeong. "Asynchronous variable-length optical packet switch with delay-line loop buffers," *IEICE Trans. Comm.*, Sep. 2007.
- [8] F. Callegati. "Optical buffers for variable length packets," *IEEE Commun. Lett.*, vol. 4, no. 9, pp. 292-294, Sep. 2000.
- [9] K. K. Merchant, J. E. McGeehan, A. E. Willner, S. Ovadia, P. Kamath, J. D. Touch and J. A. Bannister. "Analysis of an optical burst switching router with tunable multiwavelength recirculating buffers," *J. Lightw. Technol.*, vol. 23, no. 10, pp. 3302-3312, Oct. 2005.
- [10] L. Wosinska and J. Chen. "Contention resolution in an asynchronous all-optical packet switch," *IEEE Inter. Conf. Photonics in Switching*, pp. 1-3, Oct. 2000.
- [11] V. Eramo. "An analytical model for TOWC dimensioning in a multifiber optical-packet switch," *J. Lightw. Technol.*, vol. 24, no. 12, pp. 4799-4810, Dec. 2006.
- [12] T. Zhang, K. Lu and J. P. Jue. "An analytical model for shared fiber-delay line buffers in asynchronous optical packet and burst switches," *IEEE Inter. Comm. Conf. (ICC)*, vol. 3, pp. 1636-1640, 2005.
- [13] S. L. Danielsen, C. Joergensen, B. Mikkelsen and K. E. Stubkjaer. "Optical packet switched network layer without optical buffers," *IEEE Photonics Technol. Lett.*, vol. 10, no. 6, pp. 896-898, Jun. 1998.
- [14] D. K. Hunter, M. H. M. Nizam, M. C. Chia, I. Andonovic, K. M. Guild, A. Tzanakaki, M. J. O'Mahony, L. D. Bainbridge, M. F. C. Stephens, R. V. Penty and I. H. White. "WASPNET: a wavelength switched packet network," *IEEE Commun. Mag.*, vol. 37, no. 3, pp. 120-129, Mar. 1999.
- [15] T. Zhang, K. Lu and J. P. Jue. "Shared fiber delay line buffers in asynchronous optical packet switches," *IEEE J. Sel. Areas Commun.*, vol. 24, no. 4, pp. 118-127, Apr. 2006.
- [16] I. Chlamtac, A. Fumagalli and C.-J. Suh. "Multibuffer delay line architectures for efficient contention resolution in optical switching nodes," *IEEE Trans. Comm.*, vol. 48, no. 12, pp. 2089-2098, Dec. 2000.
- [17] H. Harai and M. Murata. "High-speed buffer management for 40 Gb/s-based photonic packet switches," *IEEE/ACM Trans. Networking*, vol. 14, no. 1, pp. 191-204, Feb. 2006.
- [18] F. Callegati and W. Cerroni. "Wavelength allocation algorithms in optical buffers," *IEEE Inter. Conf.*, vol. 2, pp. 499-503, 2001.
- [19] J. Choi and M. Kang. "Service differentiation using hybrid shared optical buffers in transparent optical networks," *OSA Optics Express*, vol. 14, no. 12, pp. 5079-5091, Jun. 2006.