Solving a Capacitated Waste Collection Problem using an Open-source Tool


Introduction
Waste generation is growing exponentially, and solid waste management systems are now under pressure to manage the high content of waste. Advances in the industry combined with population growth in cities are increasing the complexity of municipal solid waste (MSW) streams worldwide. Due to the increased complexity shown by the waste, management costs are becoming a problem of public concern. Investments in this sector are often left aside in low and middle-income countries, which can lead to environmental and health problems of inappropriate MSW management [20]. In this sense, solid waste management systems (SWMS) need solutions to manage this high increase in both complexity and amount of waste [11]. An alternative could be to reduce the costs associated with collection and transport, to have more funds available for the assembly of facilities with more sophisticated technology to deal with the waste. Collection and transport represent the first instance in SWMS and constitute 60-80% of total costs [14].
The fleet of vehicles responsible for waste collection and transport over a specific area can also include different types of vehicles, each one assigned to particular kinds of waste [1]. Further challenges are associated with waste transport fleets, including route planning, cost-management, labor allocation, traffic congestion (for big cities), periodic maintenance of the fleet, and task assignments. Traditionally, solid waste collection is carried out without prior analysis related to the demand for planning collection routes. In this strategy, drivers are responsible for route planning, and there is a predetermination of the number of trips per week that trucks will make [7]. This approach has several limitations since waste generation, fundamentally stochastic, is often considered a constant entity. The inefficient collection of solid waste gives rise to unnecessary expenditures, and can also have a negative impact on air quality around waste dumpsters, which can disturb the population (obnoxious effect) [15,13].
In this regard, cities can benefit from innovative solutions achievable by using Information and Communication Technologies (ICT). The use of these technologies can upgrade the city's infrastructure to the smart city level, saving resources that could be used to improve other aspects of SWMS. Practical examples would be the optimization of collection routes using search algorithms, and also the use of Internet of Things (IoT) technology. With such a tool, it is possible to apply vehicle-guided techniques to improve the efficiency of operations. IoT can be defined, in general terms, as an extension of traditional computer networks, in which devices and gadgets capable of sensing, computing, and communicating are used. Several types of smart objects can be used in a collaborative approach to build a system for smart cities, improve decision-making with real-time information, and apply algorithms to solve mathematical problems instantly [2,3,21].
Before collection routes can be optimized with a combined system of optimization algorithms and IoT, the approach that has shown the best results in the literature, the route optimization algorithms themselves need to be applied and validated. In this work, three metaheuristics available in Google OR-Tools are assessed for optimizing waste collection from 20 paper waste dumpsters in the city of Bragança, Portugal.
The level of waste during the test days was determined with a stochastic approach that takes into account the region in which each dumpster is located.
The rest of the paper is organized as follows: Sect. 2 brings the related literature and real data regarding municipal solid waste generation; Sect. 3 presents the methodology employed; Sect. 4 summarizes the results; and finally, Sect. 5 shows the main findings of the present study and future work.

Related Literature
In this section, the most relevant and recent literature on waste collection is presented. Documents were obtained by searching the Web of Science (WoS) and Scopus databases for the keywords "waste collection" and "optimization" in the fields "Abstracts, keywords, and Titles". This first search found 995 documents in WoS and 1333 in Scopus. After removing duplicates with ScientoPy [22], an open-source Python-based scientometric analysis tool, the WoS entries were reduced to 631 and the Scopus entries to 780, which represents 63.4% and 58.5% of the initial databases, respectively. Fig. 1 presents the cumulative number of documents associated with the most relevant keywords over the years. It shows the sharp growth (mainly from 2012) of keywords such as solid waste, vehicle routing problem, smart cities, and specific algorithms, which reveals the priority and applicability of this research.

Solid Waste Generation
According to the latest available data, global waste generation reached 2.01 billion tons in 2016, which corresponds to 0.74 kg of daily waste generated per capita. This value can vary between 0.11 kg and 4.54 kg, depending on the income level of the country [10]. Countries in the Pacific, East Asia, Europe, and Central Asia regions account for 43% of the world's waste generation. The Sub-Saharan, Middle East, and North Africa regions have the lowest waste generation, accounting for 15% of the world's waste. The overall scenario shows that waste generation is closely connected to economic development.
Municipal solid waste represents an average of 7-10% of the total waste generation in EU countries, being one of the most complex to manage [10]. On the other hand, in Portugal, this type of waste has a high share of the total waste generated. The way that a particular European country deals with this stream is an indicator of the overall quality of waste management systems in that country. In fact, Portugal is the EU Country with the highest share of municipal solid waste (32.8%), followed by Latvia (32.6%) and Croatia (23.3%). The waste generated in cities presents a complex composition, has a very high public visibility, and has a direct impact on human health. Municipal solid waste data shows that each European citizen was responsible for an average of 505 kg of municipal waste generation in 2020. Eurostat data show an increase of 8.2% compared to the first record in 1995 for the total waste generated in Europe. In this scenario, Portugal is one of the countries with the highest increase in waste generation, with 45.7% more urban waste generated compared to the first record [10].
The company responsible for waste collection in the Northeast region of Portugal is Resíduos do Nordeste (see www.residuosdonordeste.pt), covering 13 municipalities and around 134000 inhabitants. In this context, Bragança is the city with the largest number of inhabitants and therefore with the highest amount of waste generated (27.4% of the total waste collected in 2020). In 2020 the company collected a total of 56973.74 tons of waste, of which 4719 tons came from the selective collection of waste. Despite the low amount of waste collected in the selective collection, data records show an increasing number of recycling points installed by the company. Each point is composed of 3 dumpsters, for the deposition of cardboard and paper (blue), glass (green), and metal and plastic (yellow). The number of recycling points increased from 616 to 939 from 2016 to 2020, and the amount of waste collected increased from 3039 to 4719 tons. Figure 2 shows the evolution of the most significant types of waste collected by the company in selective collection from 2014 to 2020. The trend shows that paper is the most generated type of waste and has the highest increase in collection among the types considered.

Waste Collection Approaches
From the database containing 1411 records, those with the keyword "vehicle routing problem" (see Fig. 1) or similar were filtered for further analysis of literature, which resulted in a small database composed of 88 documents. From these 88, 35 papers were found with the potential to contribute to a brief literature review on the topic of vehicle routing problems in a municipal solid waste collection from 2017 to 2021. A fundamental step for solving the vehicle routing problem is the mathematical formulation. Each study considers specific objective functions and constraints in an attempt to mimic real-case scenarios in specific regions. The basis for most of the problems reported in the literature is the vehicle routing problem (VRP), a variant of the classic traveling salesman problem (TSP). The difference between TSP and VRP is the number of vehicles scheduled to perform the task, and in this regard, VRP necessarily uses more than one vehicle to collect items. Other important issues related to the solid waste management system are presented in the literature. Chao et al. [12] for example addressed the optimization of collection routes in urban areas, considering the recycling operation in the sorting facility. A multi-objective approach was used to maximize the total profits of the system and balance the workload of recycling operations at the sorting facility over time. Zhang et al. [25] studied the optimization of waste transport routes with the objective of minimizing total distances traveled and maximizing the satisfaction of residents. Resident satisfaction was used in their work as the penalty cost against a time window constraint, as the activity frequency of residents is highly correlated with the time of the day.
For waste collection problems, authors generally consider capacity constraints on trucks and time windows for collection. These considerations are widely known in the VRP literature, and the respective problem variants are the Capacitated Vehicle Routing Problem (CVRP), the Vehicle Routing Problem with Time Windows (VRPTW), and the CVRPTW (both constraints considered). However, VRP can be used in several situations of scheduled collection, which gives rise to several innovative formulations of the problem that can be seen in [17]. Among these innovative formulations, one is of particular interest in this work: the Waste Collection Problem (WCP).
This class of problem was first introduced by Beltrami and Bodin in 1974, using a heuristic algorithm to address the collection of waste in New York City. WCP is an extension of VRP modeled to be closer to the problem faced by companies/municipalities regarding route planning for the collection of solid wastes. In the WCP, a set of homogeneous or heterogeneous fleets is designed to collect waste from multiple dumpsters, with a single depot. The problem is then to collect the highest amount of waste in the shortest distance. Therefore, three main agents can be identified in this formulation: the fleet, the central depot, and the dumpsters. At an early stage, to properly address the problem faced by one company, it is necessary to acquire information about how the system works in a real scenario so that the formulation is as complete as possible.
Similar to the classic VRP, WCP has different variations based on the type of waste to be collected. For instance, practices adopted for residential waste collection are different from those used for commercial waste. Some constraints used for VRP problems can be considered in the context of waste collection, which generates problems such as WCP with time windows or time-dependent WCP. An extensive literature review on the objective functions and constraints used in solid waste collection optimization can be found in [8].

Optimization Algorithms
Due to the large problem sizes and numerous constraints in waste collection problems, the optimization of collection routes in real-life waste management systems requires the use of non-exact methods to generate near-optimal routes. The most popular approach to solving waste collection optimization is through metaheuristics. These methods are able to provide a sufficiently good solution to an optimization problem using limited computational capacity or incomplete information. The mechanisms behind these algorithms often mimic nature, since they are inspired by biological evolution or the physical sciences. Metaheuristic algorithms are classified into trajectory-based approaches (for example, simulated annealing and tabu search) and population-based approaches (for example, ant colony optimization and genetic algorithms).
Ant Colony Optimization (ACO) is based on the behavior of the ants following the path of other ants due to the deposition of pheromones and is recognized to be advantageous in terms of convergence speed when compared to other algorithms. Mancera-Galván et al. [16] employed two ACO algorithms to study waste collection routes, achieving improvements in route collection considering objective length minimization and routes re-design.
Genetic Algorithms (GA) are a widely known class of metaheuristics inspired by the evolutionary theory of the species, namely natural selection, the underlying concepts of survival of the fittest, and the inheritance of characteristics from parents to offspring by reproduction. Three operators are often used for this purpose, namely crossover, reproduction, and mutation. GA is simple to implement, and for this reason, several works have explored its use to solve VRP in a solid waste collection context [9,4].
Simulated annealing (SA) is a widely used optimization technique inspired by annealing in metallurgy. Babaee et al. [5] addressed the vehicle routing problem through SA with the goal of minimizing the total operational cost considering time windows. Tabu Search (TS) is a metaheuristic that uses memory structures (a tabu list) to store ineligible candidate solutions while generating other candidates, prohibiting the exploration of any solution in the tabu list. A waste collection synchronization mechanism was developed by Shao et al. [23] using TS to achieve higher profits. Guided Local Search (GLS) is a metaheuristic search method that operates by associating a cost and a penalty with each solution, using properties of both the TS and SA algorithms. In the work done by Barbucha et al. [6], GLS was combined with an asynchronous team concept to build an agent-based GLS algorithm to solve the CVRP.

Methodology
The main goal of this work is to evaluate the performance of different optimization algorithms available in the open-source tool (Google OR-tools) to find optimal routes for paper waste collection in the city of Bragança, located in the Northeast region of Portugal (inland city). The study was carried out considering a period of Da days and k dumpsters (real locations provided by the company). The daily level inside each dumpster was determined using a demographic factor, calculated by analyzing the individual dumpsters nearby. The locations of dumpsters j that needed to be collected on the day i along with waste levels were used as input to the system, and the output was the total distances traveled and the load carried using different search strategies to find the best route. Figure 3 shows a representation of the system, its input and, consequently, its output.

Capacitated Waste Collection Problem
The type of waste with the highest generation in selective collection is paper, which is why paper waste dumpsters were chosen as the focus of this study. The problem addressed here can be summarized as a capacitated waste collection vehicle routing problem (CWCP). This is a variant of VRP in which vehicles with limited carrying capacity need to pick up or deliver items at multiple locations. The items have a quantity, such as weight or volume, and the vehicles have a maximum capacity that they can carry. The problem is to pick up or deliver the items at the lowest cost without ever exceeding the capacity of the vehicles. These routes can be described using a graph in which the dumpsters are the vertices and the roads are the arcs. A cost is associated with each arc, which can be the distance (the cost adopted in this approach) or the travel time. Each arc (l, j) is associated with a non-negative value $c_{lj}$, which corresponds to the distance between vertices l and j, in terms of cost. Each location l has a demand $d_l \ge 0$ (with $d_0 = 0$) such that $d_l \le Q$, assuming that all trucks in the set K have the same capacity Q. Thus, the CWCP solution must take into account the following assumptions:
-Trucks have limited capacity Q.
-The number of trucks is limited (K) and the fleet is homogeneous.
-Every location is visited exactly once by exactly one truck.
-Trucks only collect one type of waste (i.e. paper).
-Each route begins and ends at the central depot.
-Dumpsters must be served once every 2 days.
-Waste must be transported to the central depot when the full capacity of the truck is reached.
-The routes of trucks and their sequence order are calculated during the construction of the solution, depending on the algorithm used.
To solve the CWCP described above, the mathematical formulation of the problem is based on a general formulation found in the literature [24]. In order to find the order of visits to dumpster locations, the decision variables are defined as follows:
$$x_{ljk} = \begin{cases} 1, & \text{if truck } k \text{ visits location } j \text{ after location } l \\ 0, & \text{otherwise} \end{cases}$$
$$y_{lk} = \begin{cases} 1, & \text{if truck } k \text{ visits location } l \\ 0, & \text{otherwise} \end{cases}$$
The mathematical formulation of the CWCP consists of the objective function (1) subject to the constraints (2)-(9). The objective function (1) minimizes the total distance while respecting the constraints. The constraints (2) ensure that every location is visited once and is left by the same truck, while the set of constraints (3) guarantees that every truck leaves the depot only once. In turn, constraints (4) guarantee continuity of the route, i.e., the number of trucks arriving at every location and entering the depot is equal to the number of trucks leaving. Constraints (5) state the capacity limits, making sure that the sum of the demands of the locations visited in a route is less than or equal to the capacity of the truck performing the service. The subtour elimination restrictions are expressed by constraints (6), which ensure that the solution contains no cycles disconnected from the depot. Additional decision variables, $u_{lk}$, are used in the subtour elimination constraints and represent the load of truck k after visiting location l (7). Finally, (8) and (9) specify the domains of the decision variables $x_{ljk}$ and $y_{lk}$.
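For reference, a standard three-index CVRP formulation that matches the description of (1)-(9) is sketched below. It is an assumption based on textbook formulations and is not necessarily the exact model of [24]. In particular, the node set $V = \{0, 1, \dots, n\}$ with depot 0, the exclusion of self-loops ($x_{llk} = 0$), the inequality in (3) (which lets a truck stay idle), and the Miller-Tucker-Zemlin form of (6) are choices made here for illustration.

```latex
\begin{align}
\min \quad & \sum_{k=1}^{K} \sum_{l \in V} \sum_{j \in V} c_{lj}\, x_{ljk} \tag{1}\\
\text{s.t.} \quad
& \sum_{k=1}^{K} y_{lk} = 1, && \forall\, l \in V \setminus \{0\} \tag{2}\\
& \sum_{j \in V \setminus \{0\}} x_{0jk} \le 1, && \forall\, k \tag{3}\\
& \sum_{j \in V} x_{ljk} = \sum_{j \in V} x_{jlk} = y_{lk}, && \forall\, l \in V,\ \forall\, k \tag{4}\\
& \sum_{l \in V} d_l\, y_{lk} \le Q, && \forall\, k \tag{5}\\
& u_{lk} - u_{jk} + Q\, x_{ljk} \le Q - d_j, && \forall\, l, j \in V \setminus \{0\},\ l \ne j,\ \forall\, k \tag{6}\\
& d_l \le u_{lk} \le Q, && \forall\, l \in V \setminus \{0\},\ \forall\, k \tag{7}\\
& x_{ljk} \in \{0, 1\},\quad x_{llk} = 0, && \forall\, l, j \in V,\ \forall\, k \tag{8}\\
& y_{lk} \in \{0, 1\}, && \forall\, l \in V,\ \forall\, k \tag{9}
\end{align}
```

In (6), setting $x_{ljk} = 1$ forces $u_{jk} \ge u_{lk} + d_j$, so the load grows along a route and a cycle that does not pass through the depot becomes infeasible.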

Waste Level throughout Days
Due to the lack of data, it is not yet possible to build models that accurately predict the waste levels over the days in the city of Bragança. However, as an initial approach, the initial level within the chosen dumpsters can be drawn from a uniform probability distribution. Over the period studied, changes in waste level were treated as dynamic and determined by analyzing the surroundings of each dumpster with the software Google Earth to find the populated area. The filling velocity parameter ($fv$) was calculated from this image analysis. This parameter is the key to determining the waste level over the days, as it gives the daily change in waste level as a percentage of the dumpster volume. Eq. (10) gives the expressions used to determine the filling velocity ($fv_j$) of dumpster j, j = 1, ..., dumpsters, the daily waste level (L), and the amount of waste (Lc) in m³.
where $FA_j$ represents the area filled with buildings around dumpster j, $TA$ is the total area around each dumpster, and $\sigma_M$ is the parameter that introduces stochastic behavior in the waste oscillation. $L_{i,j}$ and $Lc_{i,j}$ represent the waste level and the amount of waste on day i for dumpster j, respectively, and $TV$ represents the total volume of the dumpster. In Fig. 4, all collection points are illustrated on the map of the city of Bragança (left side, marked in yellow) along with an example of how $FA_j$ was determined for one location (right side).

Open Source Solver -Google OR-Tools
Developing algorithms to solve classic VRP problems is difficult, and the available solutions, whether online software or closed-source packages, impose specific requirements and barriers to learning and/or application to real data and scenarios.
An open-source tool that offers several libraries and solvers for different problem variants and/or types of VRP is Google OR-Tools [19]. Developed by researchers and programmers at Google, it offers, among other things, vehicle routing libraries, a flexible programming environment, and ease of use in different programming languages (such as Python). Furthermore, this tool has some pre-defined algorithms, such as (meta-)heuristics, and several strategies for defining objectives, variables, constraints, and/or parameters. In addition, integration with external systems or online services is possible (Google Maps API, Distance Matrix API, and others), as well as with visualization systems.
Google OR-Tools is a fast and portable package, extremely practical to solve complex combinatorial optimization problems, allowing tests and/or applications in real-world problems. Google OR-Tools can solve many types of VRP, including problems with pickups and deliveries, and/or multiple capacity dimensions, initial loads, skills, scheduling problems, and so on. Finally, there is extensive documentation available online about OR-Tools, full of examples and libraries, which reveals its promising capacity in the field of operational research systems.
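To make the workflow concrete, the following minimal sketch sets up a small capacitated routing problem with the OR-Tools routing library and solves it with a selectable metaheuristic. It assumes the classic `ortools.constraint_solver` Python interface. The distance matrix, the demands (expressed in litres because OR-Tools dimensions use integers), the fleet size, and the function name `solve_cvrp` are made up for illustration and are not the data or code of this study.

```python
"""Minimal CVRP sketch with OR-Tools; all data below are hypothetical."""

# Hypothetical symmetric distance matrix in metres; node 0 is the depot.
DISTANCE_MATRIX = [
    [0, 900, 1400, 1100, 1600],
    [900, 0, 700, 1300, 1500],
    [1400, 700, 0, 800, 1200],
    [1100, 1300, 800, 0, 600],
    [1600, 1500, 1200, 600, 0],
]
# Hypothetical loads in litres; the depot has no load.
DEMANDS = [0, 1500, 2000, 1000, 1800]
NUM_TRUCKS = 2
TRUCK_CAPACITY = 4000  # litres, illustrative only


def solve_cvrp(distance_matrix, demands, num_trucks, capacity,
               metaheuristic="GUIDED_LOCAL_SEARCH", time_limit_s=1):
    """Return (total_distance, routes), or None if no solution is found."""
    # Imported lazily so the data above can be inspected without OR-Tools.
    from ortools.constraint_solver import pywrapcp, routing_enums_pb2

    manager = pywrapcp.RoutingIndexManager(len(distance_matrix), num_trucks, 0)
    routing = pywrapcp.RoutingModel(manager)

    def distance_callback(from_index, to_index):
        # Convert solver indices to node indices before the lookup.
        return distance_matrix[manager.IndexToNode(from_index)][
            manager.IndexToNode(to_index)]

    transit = routing.RegisterTransitCallback(distance_callback)
    routing.SetArcCostEvaluatorOfAllVehicles(transit)

    def demand_callback(from_index):
        return demands[manager.IndexToNode(from_index)]

    demand = routing.RegisterUnaryTransitCallback(demand_callback)
    # Capacity dimension: no slack, same capacity per truck, start at zero.
    routing.AddDimensionWithVehicleCapacity(
        demand, 0, [capacity] * num_trucks, True, "Capacity")

    params = pywrapcp.DefaultRoutingSearchParameters()
    params.first_solution_strategy = (
        routing_enums_pb2.FirstSolutionStrategy.PATH_CHEAPEST_ARC)
    params.local_search_metaheuristic = getattr(
        routing_enums_pb2.LocalSearchMetaheuristic, metaheuristic)
    # Metaheuristics do not stop on their own, so a time limit is required.
    params.time_limit.seconds = time_limit_s

    solution = routing.SolveWithParameters(params)
    if solution is None:
        return None

    routes = []
    for truck in range(num_trucks):
        index = routing.Start(truck)
        route = []
        while not routing.IsEnd(index):
            route.append(manager.IndexToNode(index))
            index = solution.Value(routing.NextVar(index))
        route.append(manager.IndexToNode(index))  # back at the depot
        routes.append(route)
    return solution.ObjectiveValue(), routes


if __name__ == "__main__":
    for name in ("GUIDED_LOCAL_SEARCH", "TABU_SEARCH", "SIMULATED_ANNEALING"):
        print(name, solve_cvrp(DISTANCE_MATRIX, DEMANDS, NUM_TRUCKS,
                               TRUCK_CAPACITY, name))
```

Passing the metaheuristic by name keeps the model definition identical across runs, so that differences in the returned distance can be attributed to the search strategy alone.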

Route Optimization
The main goal of this work is to study the performance of three optimization algorithms, more specifically three metaheuristics, available in the open-source Google OR-Tools to optimize paper waste collection routes in the city of Bragança. The proposed collection mechanism was to collect all dumpsters once every 2 days, to put the algorithms under stress. The metaheuristics to be assessed are Guided Local Search, Tabu Search, and Simulated Annealing. Algorithm 1 illustrates the complete procedure adopted to perform the study.
Algorithm 1
    ...
        L_{i,j} = L_{i-1,j} + fv_j
        if mod(i, 2) ≠ 0 then
            Lc_{i,j} = L_{i,j} TV / 100
            L_{i,j} = 0
            col_{i,j} = j
        end if
      end for
    end for
    for i in days do
        DM_i ← distance matrix(col_{i,j})
        TD_i, load_i ← GLS, TS, SA(DM_i, Lc_{i,j})
    end for
To invoke the metaheuristics GLS, TS, and SA within Algorithm 1, two input values are required: the distance matrix and the load inside the dumpsters. The parameter $col_{i,j}$ carries the information of which dumpsters should be collected on each day and is used by the distance matrix() function to return $DM_i$, the distance matrix for day i. The load inside each dumpster on collection days is stored in $Lc_{i,j}$, which is calculated by multiplying the waste level $L_{i,j}$ (in %) by the total volume $TV$ and dividing by 100. The maximum level that each dumpster can have on day 0 is given by $CM$. Daily changes can range from $C_{min}$ to $C_{max}$, which represent the lower and upper thresholds for waste generation, respectively.
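The level-update part of Algorithm 1 can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the code used in the study: the filling velocities are passed in as fixed hypothetical values (the stochastic term is not modelled), the function name `simulate_levels` is invented here, and days are numbered from 1 with collection on even days so that the first collection falls on day 2, as reported in the results.

```python
import random


def simulate_levels(filling_velocity, days=30, total_volume=2.5,
                    initial_max=80.0, seed=0):
    """Simulate fill levels (%) and return {day: {dumpster: load in m^3}}.

    filling_velocity: daily level increase per dumpster, in percentage points
    (hypothetical inputs; the study derives them from image analysis).
    """
    rng = random.Random(seed)
    # Day-0 level drawn uniformly up to the maximum initial level CM.
    levels = [rng.uniform(0.0, initial_max) for _ in filling_velocity]
    collections = {}
    for day in range(1, days + 1):
        # Level update: L[i][j] = L[i-1][j] + fv[j]
        levels = [lvl + fv for lvl, fv in zip(levels, filling_velocity)]
        if day % 2 == 0:  # collection day (assumed day-indexing convention)
            # Load in m^3: Lc[i][j] = L[i][j] * TV / 100
            collections[day] = {
                j: lvl * total_volume / 100.0 for j, lvl in enumerate(levels)
            }
            levels = [0.0] * len(levels)  # dumpsters are emptied
    return collections


if __name__ == "__main__":
    loads = simulate_levels([5.0, 10.0], days=6)
    for day, per_dumpster in loads.items():
        print(day, per_dumpster)
```

The returned loads for a given day are what the routing step would receive as demands, together with the distance matrix of the dumpsters listed for that day.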

Numerical Results
The results obtained by optimizing the routes with Guided Local Search, Tabu Search, and Simulated Annealing as available in Google OR-Tools, using default parameter values, are presented and discussed in this section. The collection policy was chosen to put the algorithms under stress: on collection days, the algorithm has to find the best route considering all collection points. Accordingly, trucks need to collect all waste once every 2 days.
A period of 30 days (days = 30) was considered. The fleet is composed of 3 trucks (K = 3) with a maximum capacity of Q = 16 m³, and each dumpster (dumpsters = 20) has a maximum capacity of TV = 2.5 m³. Both capacity values are real, and the dumpsters/truck ratio in this system is close to that of the real operations of the company responsible for the collection. The maximum level for each dumpster, CM, was defined as 80%, C_min is 20%, and C_max was chosen to be 80%. The area of a circle with a radius of 150 m defines the value of TA (TA = 70 685.83 m²).
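As a quick check of these figures, the value of TA follows from the area of a circle of radius 150 m, and the fleet capacity can be compared with the combined dumpster volume. The comparison in the second line is an illustrative observation and not a result reported in the study.

```latex
\begin{align*}
TA &= \pi r^2 = \pi \,(150\ \text{m})^2 \approx 70\,685.83\ \text{m}^2,\\
K \cdot Q &= 3 \times 16\ \text{m}^3 = 48\ \text{m}^3,
\qquad 20 \times TV = 20 \times 2.5\ \text{m}^3 = 50\ \text{m}^3 .
\end{align*}
```

Since the fleet capacity (48 m³) is slightly below the combined volume of all dumpsters (50 m³), the fleet as a whole can only run out of capacity when the average fill level on a collection day approaches 96%.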
To obtain the real road distances between collection points, the distance matrix DM_i is built with the Google Maps API module, taking as input the information in col_{i,j}, which contains the latitude and longitude of each dumpster location j that needs collection on day i.
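The study obtains road distances from the Google Maps API. As an offline stand-in that requires no API key, the sketch below builds a distance matrix from great-circle (haversine) distances instead. This is a different technique that underestimates road distances, and it is shown only to illustrate the shape of the data passed to the solver. The coordinates and the names `haversine_m` and `distance_matrix` are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def distance_matrix(points):
    """Square matrix of rounded distances (integers, as routing solvers expect)."""
    return [[round(haversine_m(p, q)) for q in points] for p in points]


# Hypothetical coordinates in the Bragança area; index 0 plays the depot.
POINTS = [(41.8061, -6.7567), (41.8102, -6.7601), (41.8025, -6.7489)]
DM = distance_matrix(POINTS)

if __name__ == "__main__":
    for row in DM:
        print(row)
```

Only the dumpsters scheduled for collection on a given day need to be included in `POINTS`, which keeps the matrix small.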

Results and Discussion
The proposed waste collection system collects all dumpsters once every 2 days, so the amount of waste collected on each collection day is the same regardless of the metaheuristic used. The waste levels are still part of the decision-making, since the metaheuristics take them into consideration to find the optimal route. However, the total distance of the route obtained by each metaheuristic differs, and this is the focus of the discussion. The total distance traveled and the total execution time in each approach are represented in Fig. 5.
Traveled distances demonstrate that GLS is the algorithm that returns the shortest path for the collection of waste in the system considered here. For instance, the result for GLS was 1.77% and 3.42% lower than the results obtained with TS and SA, respectively. GLS and SA present similar results in terms of execution time, with a slight advantage for GLS. This parameter is highly important for future work, since the real system will be composed of a higher number of locations, increasing the computational time needed to find the best route. Google OR-Tools enables the user to visualize the path that each vehicle is assigned to follow, along with the load collected at distinct points. With this in mind, a graphical representation can show the practical differences between the routes. Thus, Fig. 6 illustrates the routes found using the three metaheuristics on the first collection day (day 2). The representation shows that each algorithm finds its own routes to collect the waste on the first day of collection. Due to the amount of waste to be collected on this day, only 2 trucks were assigned for collection by all algorithms.

Comparison with Real Scenario
The results obtained for route optimization show only small differences between the algorithms. However, it is important to remember that all algorithms perform route optimization, which should be enough to outperform real-case scenarios of traditional waste collection. To verify that all algorithms deliver an optimized route, a comparison with real data from the company can be performed.
Real data provided by the company include the average distance traveled by the trucks, the amount of fuel they spent, and the annual amount of waste paper collected in 2020. The real system is composed of a larger number of recycling points and trucks, and the same fleet collects paper and plastic on different days. For this reason, a correlation parameter named collection cost (CC) was necessary for comparison purposes. The CC of the real system was calculated based on the average cost in one month and the amount of waste collected in m³. The amount of waste collected was provided in mass, so the conversion was performed based on data on paper waste density [18]. The results obtained for the real scenario are shown along with the CC obtained for the optimized routes in Fig. 7. The difference between the CC estimated from real data and the optimized results is striking; however, it is worth mentioning that this result was calculated from estimates of the average collection of waste and a density value that may differ from the real one. Despite this caveat, the results demonstrate how powerful route optimization can be for saving resources in an SWMS. Furthermore, the average gas price in 2020 was considered for this analysis.
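One plausible reading of the collection cost described above is a cost per unit volume collected. The symbols below ($m$ for the mass of paper collected in the month, $\rho$ for the paper waste density, $V$ for the corresponding volume, and $C$ for the average monthly cost) are introduced here for illustration and are assumptions, not the notation of the study.

```latex
\begin{align*}
V &= \frac{m}{\rho}, &
CC &= \frac{C}{V} .
\end{align*}
```

Under this reading, CC is sensitive to the density value used in the mass-to-volume conversion: a higher assumed density gives a smaller volume and therefore a higher CC for the same cost.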

Conclusions
This research shows the extreme usefulness of a recent and popular open-source optimization tool, Google OR-Tools, for solving a real capacitated waste collection vehicle routing problem, CWCP, in the city of Bragança.
In the developed approach three metaheuristics were used, GLS, SA, and TS, to optimize the waste collection of 20 paper waste dumpsters once every 2 days. Additionally, the tool allowed the evaluation and graphical representation of each route solution. The experimental results showed the quality of solutions achieved by the three approaches. In terms of performance, it is possible to assess that GLS was the best optimization algorithm to find the shortest path in the waste collection system. Additionally, a cost comparison with a real scenario and the optimized scenario showed an effective cost-saving, proving the potential application of OR-Tools for route optimization in SWMS.
After the optimization procedure is properly implemented and validated in the lower-scale system used in this study, the next step is to jump to a study in a real large-scale waste collection system in the city of Bragança.