SET-MRTS:An Empirical Experiment Tool for Real-Time Scheduling and Synchronization

2022-04-19 05:48ZeWeiChenHangLeiMaoLinYangYongLiao

Journal of Electronic Science and Technology 2022年1期

Ze-Wei Chen | Hang Lei | Mao-Lin Yang | Yong Liao

Abstract—In the real-time scheduling theory,schedulability and synchronization analyses are used to evaluate scheduling algorithms and real-time locking protocols,respectively,and the empirical synthesis experiment is one of the major methods to compare the performance of such analyses.However,since many sophisticated techniques have been adopted to improve the analytical accuracy,the implementation of such analyses and experiments is often time-consuming.This paper proposes a schedulability experiment toolkit for multiprocessor real-time systems(SET-MRTS),which provides a framework with infrastructures to implement the schedulability and synchronization analyses and the deployment of empirical synthesis experiments.Besides,with well-designed peripheral components for the input and output,experiments can be conducted easily and flexibly on SET-MRTS.This demonstration further proves the effectiveness of SET-MRTS in both functionality and availability.

Index Terms—Experiment,real-time system,schedulability,software tools,synchronization.

1.lntroduction

Real-time systems have been broadly used in critical systems,such as telecom,industrial control,automotive,and avionics systems,and the scheduling that guarantees the time constraints in a system is essential to real-time systems.Although real-time scheduling algorithms have been well studied for uniprocessors[1],[2],the problem for multiprocessors is still unfathomed,especially when shared resources and intra-task parallelism,i.e.,parallel tasks,are taken into consideration.

In general,there are three performance metrics to compare the effectiveness of different schedulability analyses:The utilization bound,resource augmentation bound (speed-up bound),and empirical analyses[3].The utilization bound for a scheduling algorithm guarantees the schedulability,if the total utilization of a considered task set does not exceed the bound.However,it is usually impossible to obtain a utilization bound for parallel systems.A scheduler provides a resource augmentation bound ofb≥ 1 if it can successfully schedule any task set onmprocessors of speedbas long as the ideal scheduler can schedule the task set onmprocessors of speed 1.Nevertheless,the ideal scheduler is only a hypothetical scheduler that may not exist.Therefore,we cannot conduct schedulability tests based on the speed-up bound.The empirical analysis studies the schedulable conditions with respect to specific task sets and conducts the schedulability test with randomly generated task sets.Since the utilization bound and the speed-up bound focus on high-level abstractions,the schedulability analysis conducted by these approaches is usually too pessimistic.Moreover,it is still challenging to derive non-trivial theoretical bounds for tasks with shared resources[4]-[7].Therefore,the empirical analysis has been widely utilized to compare the scheduling algorithms and locking protocols.

As there is no existing benchmark of synthetic task sets for real-time schedulability tests,researchers have to implement their own experiment setups,which consist of the implementation of system models,the generation of unbiased task sets,a wide range of configurations to guarantee a justified comparison,and the result analysis.However,conducting empirical experiments could be time-consuming.On the one hand,although many analyses have been detailedly introduced in their original work,the experiment setups are often private such that the implementation cannot be reused.On the other hand,the increasingly sophisticated analytical techniques and the consideration of intra-task parallelism,i.e.,parallel real-time tasks,can also complicate the process of reproduction.Therefore,a public tool that facilitates the experiment setups and supports reusability is desirable in the literature.

To facilitate the process of the empirical experiments,we designed and implemented an open-sourced schedulability experiment toolkit for multiprocessor real-time systems (SET-MRTS,available from https://github.com/RTLAB-UESTC/SET-MRTS-public) and introduced the essential parts of SET-MRTS in this paper.The proposed SET-MRTS provides an efficient framework to build up schedulability experiments,which supports the empirical measure for independent and synchronized (with shared resources) task sets,sequential and parallel task models,and homogeneous and heterogeneous multiprocessor models.The algorithm library of SET-MRTS contains various state-of-the-art schedulability analyses of multiprocessor scheduling algorithms and locking protocols[6]-[13].The already-implemented algorithms can be easily reused since all analysis algorithms are inheritable,and SET-MRTS also provides shared utilities.Further,SETMRTS supports the deployment on a distributed environment such that large-scale experiments can be conducted more efficiently.

The remainders of this paper are as follows:The backgrounds are introduced in Section 2;the design and the framework of SET-MRTS are presented in Section 3;a demonstration is provided in Section 4;Section 5 refers to the related work;Section 6 gives the summary of this work.

2.Backgrounds

In this section,we briefly review the multiprocessor real-time system model in SET-MRTS.

2.1.Tasks

In real-time systems,the workload is typically generated by a set of recurrent processes or tasks.Each instance of a task is considered as a job.A task set is said to be sporadic if the recurrent interval is not strictly determined.According to the standard sporadic task model,the taskτican be modeled by a three-tuple model (Ci,Ti,andDi),whereCidenotes the worst-case execution time (WCET),Tiis the minimum time interval between successive jobs,andDiis the relative deadline.The utilization of taskτiis defined byUi=Ci/Ti,and the total utilization of the task sets is the summation of all tasks’ utilization,i.e.,Usum=∑Ui.Furthermore,we assert a task set to be an implicit deadline task set iff all tasks haveTi=Di,a constraineddeadline task set iff all tasks haveTi≥Di,and an arbitrary-deadline task set for the rest conditions.

Regarding the parallel tasks,many parallel task models have been proposed to characterize the intra-task parallelism of tasks,such as the Gang task model[14],the synchronous parallel task model[15],the fork-join task model[16],and the directed acyclic graph (DAG) task model[17].

2.2.Processors

In the real-time scheduling theory,multiprocessor platforms are often classified into three categories.

1) Identical:For the identical multiprocessor platform,all processors in the system are the same.

2) Uniform:For the uniform multiprocessor platform,each processor is characterized by its speed-up factor,and a processor with a speed-up factor ofαcan finishαtimes of workloads compared with a 1-speed processor in a unit time.

3) Unrelated:For an unrelated multiprocessor platform,a task may execute at different speeds on specific processors.In other words,the execution speed is related to not only processors but also tasks.

2.3.Shared Resources

In addition to the processors,the tasks share a set of other serially reusable resources,such as shared memories,critical data,and input/output (I/O) ports.Without loss of generality,we model the shared resources in high-level abstractions.We assume each job of taskτirequires a set of shared resources at runtime.In caseτirequires a shared resourcelq,we assumeτican requestlqfor at mostNi,qtimes,and each request requires at mostLi,qunits of time to finish.We consider the lock-based schemes to handle shared resource contention,including spinlocks and semaphores.

3.Framework

SET-MRTS,as presented inFig.1,is mainly comprised of a kernel module,which maintains the system model and the algorithm library,and peripherals for the experiment setup.In particular,we adopt the objectoriented programming (OOP) paradigm to implement the modules,such that each module can be reused and extended by inheritance.The system models include the task model,processor model,and shared resource model,as described in Section 2.The algorithm library consists of the implementation of analysis algorithms.The peripherals include the parameter generator,configuration module,plotting tool,and I/O convertor.In the following,we further introduce SET-MRTS by its kernel and peripherals,respectively.Besides,since SET-MRTS is an open-sourced project that keeps tracking the state-of-the-art research and the code is fully accessible,we mainly introduce the design concept and functionality in this section.

Fig.1.Architecture of SET-MRTS.

3.1.Kernel Module

The kernel module consists of the system models and the algorithm library,which are illustrated as follows.

3.1.1.System Model

Since SET-MRTS adopts the OOP paradigm,each model is implemented as a class,of which an object represents an instance of the model.For example,an object of the task model class represents a specific task in the system.

SET-MRTS supports the schedulability and synchronization analyses for both sequential tasks and parallel tasks.The attributes of a task,e.g.,WCET,are stored in class members,and the frequently used calculations,e.g.,utilization,are provided by the class functions.Concerning the parallel tasks,we use the DAG task model to depict the intra-task parallelism since it is the generalization of other parallel task models.In addition,since the connection (dependency) between the vertices in a DAG task is usually sparse,DAG of a parallel task in SET-MRTS is represented by an adjacency list.

The objects of different models can have an association with each other.For instance,a task maintains a set of shared resources it requires and vice versa.Therefore,SET-MRTS can easily support partition-based analyses in both task-and resource-oriented approaches[6]-[9].

3.1.2.Algorithm Library

In the real-time scheduling theory,many schedulability analyses have been proposed for several scheduling paradigms,i.e.,global scheduling,partitioned scheduling,semi-partitioned scheduling,and federated scheduling.Besides,the growing importance of parallel tasks also brings more challenges to the literature.In fact,the state-of-the-art schedulability analyses for parallel tasks take the precedence constraints of the vertices into account to derive less pessimistic bounds for the interfering workload[18]-[20].

In addition,many real-time locking protocols have been studied based on different scheduling paradigms,such as the priority inheritance protocol (PIP)[21],flexible multiprocessor locking protocol (FMLP)[22],distributed priority ceiling protocol (DPCP)[23],and multiprocessor priority ceiling protocol (MPCP)[24].Furthermore,to improve the analytical accuracy,researchers have proposed many sophisticated techniques,such as the optimal priority analysis (OPA)[25]and linear programming (LP) optimization[8]-[10],[26].These techniques may require additional mathematical library support.For instance,SET-MRTS utilizes the GNU linear programming kit (GLPK)[27]to solve the LP-based analysis and encapsulates the trivial formulations in only a few invocations,which otherwise can be very time-consuming for the implementation.To evaluate a locking protocol’s performance,researchers need to provide a synchronization analysis based on the existing schedulability analysis under a specific scheduling algorithm.However,the implementation and experiment setups of existing work are often conducted privately,which cannot be reused for the improvement to the existing method or study based on existing work.Therefore,a reusable algorithm library that can speed up the process of empirical experiments is desirable in the literature.

To facilitate the empirical experiments,SET-MRTS provides many reusable infrastructures,i.e.,system models and mathematical support,for the schedulability and synchronization analyses under most scheduling paradigms for both sequential and parallel tasks.Similar to the system models,each analysis algorithm is an inheritable class in SET-MRTS.A new analysis algorithm can be efficiently implemented by inheriting from an existing class with a few overrides.Owing to the reusability,users only have to focus on the implementation of improvements.

3.2.Peripherals

This part presents the peripheral components of SET-MRTS,and the following subsections further explain how SET-MRTS can facilitate the empirical experiments.

3.2.1.Configuration

Roughly clothed, and living in the simplest way, the girls regretted unceasingly the luxuries and amusements of their former life; only the youngest tried to be brave and cheerful

SET-MRTS reads an extensible markup language (XML) configuration file for system modeling in default,and a light-weighted and efficient C++XML parser called TinyXML-2 is utilized[28].In particular,an XML configuration file stores the parameters in specific tags,and each tag may contain multiple elements representing different values for the parameter.Each element may have several attributes to provide additional markup for the content.For example,SET-MRST selects the analysis algorithms by adding elements in the tags pair,and we can use the attributes of each element (algorithm) to customize the name and line style presented on the plot.

In addition,empirical experiments usually require large-scale configurations for decent coverage,but manually modifying the configuration file could be tedious.To improve the efficiency,SET-MRTS supports the combination of parameters with multiple elements in a configuration file such that a large-scale experiment can be configured in only a few configuration files.For instance,the evaluation in the recently published work contains 768 configurations[29],which can be conducted by SET-MRTS with only two configuration files.Besides,the experiment results are exported to individual folders.Thus,the result of a specific configuration is easy to locate.

3.2.2.Generator

Because no benchmark exists for real-time scheduling and synchronization analyses,the task set generation becomes one of the essential parts for conducting a justified evaluation or comparison.SET-MRTS supports several unbiased and widely used algorithms for the utilization generation,e.g.,UUniFast[30],UUniFast-Discard[3],and RandFixedSum[31].The former two algorithms are widely used to generate a set of task utilization,in which the task utilization cannot exceed 1,while the RandFixedSum algorithm does not have such a limit.Hence,the RandFixedSum algorithm can generate high-utilization tasks,which is widely used for the parallel tasks’ generation.

In addition,concerning the DAG structures of parallel tasks,the Erdös-Rény algorithm[32]is used as a default method to construct a DAG task,and simpler parallel models,like the synchronous task model and the fork-join task model,are also supported in SET-MRTS.In addition,the parameters for the aforementioned algorithms are configurable in the XML configuration file,such as the number of tasks and the range of vertices in a DAG task.

3.2.3.Plotting

In empirical experiments,the schedulability of an analysis algorithm is often affected by many factors.To fathom the influence of each related factor,we usually evaluate the schedulability under one varying parameter while keeping the other parameters fixed,i.e.,control variables.For example,we usually compare several analysis algorithms under the growing total utilization indicating the increasing contention on the processors.However,checking the result at each point in an experiment is tedious.Therefore,the visualized output that provides an intuitive presentation is desired for large-scale experiments.

To achieve a more intuitive presentation,SET-MRTS utilizes the MathGL library[33]to make high-quality two-dimensional scientific graphics for the experiments.Besides,as mentioned in the configuration subsection,SET-MRTS supports customized plotting,such as the axis title,legend position,figure resolution,and export format.

3.2.4.Convertor

Instead of generating task sets with built-in algorithms,SET-MRTS can also load task sets from external files.That is,users can generate task sets with their own generators and convert them to files in a specific format.TakingFig.1for example,SET-MRTS can load pre-generated task sets from an external file,i.e.,TskSet.load.This feature has two major benefits.On the one hand,users can implement their own generators with specifically designed purposes without modifying the default generator.On the other hand,with the growing complexity of schedulability or synchronization analyses,like the LP-based analysis,empirical experiments are often very time-consuming.A reloadable task set file helps to compare the previous results with new proposed analyses,which can largely improve the efficiency of empirical experiments.For now,SET-MRTS only supports reloadable sequential tasks.Each reloadable file is a plain text file consisting of tasks presented in a determined order of parameters.Hence,any external generator can easily export a reloadable file.

Although SET-MRTS integrates a plotting library,figures exported by SET-MRTS may not be directly used for manuscripts due to the style,size,or format.Thus,further adjustments may be required after experiments.Along with the original figures,SET-MRTS exports raw data of the experiment results in the commaseparated value (CSV).The raw data are the unprocessed results of all schedulability tests under a specific configuration,which allows a finer-grained post-hoc analysis.

4.Demonstration

We demonstrated SET-MRTS by conducting several empirical experiments.Note that the demonstration did not cover a wide range of configurations in the hyperspace evaluation since this paper did not intend to present the all-round comparison.Instead,we tested several state-of-the-art analyses in moderate configurations for sequential and parallel tasks,respectively.

In the demonstration,we further implemented the synthesis experiments in a distributed environment to improve the evaluation’s efficiency.In particular,since the schedulability test is independent for each randomly generated task set,a simple client-server architecture is satisfied with our purpose.Fig.2shows the flow charts of the server and client,respectively.As shown inFig.2(a),a server takes charge of the work assignment and result collection.Meanwhile,it also maintains the transmission control protocol (TCP)connection to the clients through a heartbeat mechanism.On the other side,a client (Fig.2(b)) reads a configuration and initializes its sockets for connections.Once a client successfully connects or sends the result to the server,it repeatedly receives instructions from the server and generates random task sets according to the configuration.Finally,when a client receives a termination code from the server,it will release all resources and quit.Moreover,each client is an individual process such that the nodes in a distributed environment can have multiple clients executing according to the computing capacities,i.e.,central processing units (CPUs).When the network latency is negligible to the cost of a schedulability test,e.g.,the LP-based analysis,the distributed approach can significantly reduce the experiment time,especially for large-scale configurations.In addition,since both the server and client need to read the same configuration file to ensure consistency,the synchronization of the configuration file is required as a preliminary.

Fig.2.Flow charts of (a) server and (b) client for the experiment.

4.1.Selected Analyses

With regard to the schedulability and synchronization analyses,we selected several state-of-the-art analyses for parallel tasks as follows.

1) GFP-Fonseca:The schedulability analysis for global fixed-priority scheduling proposed by Fonsecaet al.[19].

2) GEDF-Chwa:The schedulability analysis for global earliest deadline first scheduling proposed by Chwaet al.[20].

3) FED-Li:The schedulability analysis for federated scheduling proposed by Liet al.[34].

4) Spin-FIFO and spin-PRIO:The synchronization analyses for parallel tasks with non-preemptive spinlocks using first-in-first-out and priority-ordered pending queues,respectively[13].

5) LPP:The synchronization analysis for the limited pending protocol proposed by Jianget al.[35].

6) POMLP and POMIP: The synchronization analyses for theO(m) locking protocol and theO(m)independence preserving protocol for parallel tasks proposed by Jianget al.[36].

7) DPCP-p:The synchronization analysis for DPCP for parallel tasks proposed by Yanget al.[37].

4.2.Configuration

Concerning the generation of parallel tasks,we used the Erdös-Rény algorithmG(|V|,ρ) to generate the task graphs,with |V| being the number of vertices randomly chosen in [20,40] andρ=0.1 being the probability of creating an edge between any two vertices.WCET of each vertex in taskτiwas randomly chosen in [300,1500].We used the critical path utilizationθito denote the ratio of the critical path length over the period of taskτi,i.e.,Li/Ti,which was randomly selected in [0.25,0.50] for plausibility.The period was calculated byLi/θiafterLiandθiwere determined.Besides,we used the normalized utilizationUnormto denote the total utilizationUsumover the number of processorsm,which was set toUsum/Unormfor each schedulability test.

Regarding the synchronization analysis,we conducted the test with 4 shared resources.Concerning the requests ofτifor a shared resourcelq,the maximum request numberNi,qwas uniformly chosen in [1,15],and the maximum request lengthLi,qwas uniformly chosen in [15,30].Since the evaluation was synthetic,we omitted the unit of time in the configurations.

At each point in an experiment,we randomly generated 1000 task sets,and the selected analyses were evaluated by the acceptance ratio,which is the number of schedulable task sets over the total task set number.For simplicity,we generated implicit-deadline tasks,and the task priority under fixed-priority scheduling was given by the Rate-Monotonic algorithm.The experiments were deployed on a distributed environment based on the cloud virtual machine (CVM) with multiple Intel Xeon E5-2680 (2.4 GHz) cores.

4.3.Experiment Results

The experiment results are presented inFig.3.In particular,Fig.3 (a)presents the comparison of three state-of-the-art schedulability analyses for parallel tasks,andFig.3 (b)presents the evaluation of several state-of-the-art synchronization analyses for parallel tasks with different locking protocols.

Since we mainly conducted the experiments to demonstrate the functionality and compatibility of SET-MRTS instead of thoroughly comparing the selected analyses,we did not cover a wide range of configurations.According to the results,many state-of-the-art analyses can be implemented and compared by SET-MRTS,and the results are in line with the corresponding studies.Note that the difference between POMIP and LPP is marginal inFig.3 (b).

Fig.3.Experiment results exported by SET-MRTS: Comparison of state-of-the-art (a) schedulability and(b) synchronization analyses.

Concerning the efficiency of the client-server architecture,we evaluated the runtime cost for GFP-Fonseca under an increasing number of tasks (i.e.,from 2 to 16) with different numbers of clients.In particular,the normalized utilization was 0.5,and we recorded the cumulative runtime cost for testing 1000 task sets at each point in an experiment.We conducted the test with {2,4,8,16} clients,and the results are shown inFig.4.

As shown inFig.4,the runtime cost grows with the increasing number of tasks since the algorithm needs to test the schedulability for each task.In addition,it is observed that the runtime cost is inversely proportional to the number of clients.It is mainly because that the schedulability tests for the randomly generated tasks are independent,and increasing the number of clients is equivalent to decreasing the number of task sets to be tested on each client,which is almost linearly dependent when the miscellaneous cost,e.g.,network latency,is marginal.Hence,the simple client-server architecture provided by SET-MRTS can effectively improve the efficiency of empirical experiments.

Fig.4.Runtime cost with different numbers of clients.

5.Related Work and Comparison

Regarding the experiment tools,several related tools exist in the literature.The algorithm in [38] is a set of MATLAB functions for real-time analyses,including the task set generation and schedulability analysis for sequential tasks under fixed-priority and earliest-deadline-first scheduling.The schedulability test collection and toolkit (SchedCAT)[39]is a library implemented in Python/C++for real-time schedulability experiments.Since the algorithm proposed in [38] does not support more complicated synchronization analyses,SchedCAT is comprehensively used in recent years.In light of SchedCAT,the original SET-MRTS[40]is a redesigned project,which provides a system model for sequential tasks and a complete experiment setup.However,the original version of SET-MRTS does not include the design for the algorithm library and mathematical support,i.e.,an LP solver,and the peripherals cannot support large-scale experiments and the analysis for parallel tasks.In this paper,refactored SET-MRTS is presented.One of the differences between SchedCAT and SET-MRTS is that SchedCAT cannot be used to conduct experiments directly since SchedCAT does not provide a complete experiment setup,i.e.,only the kernel part is provided,while SET-MRTS is developed to integrate all utilities for schedulability experiments.On the other hand,present SET-MRTS supports analyses for both sequential tasks and parallel tasks,which is not a trivial modification from the original version,while SchedCAT mainly focuses on sequential tasks.In fact,with the growing importance of the study of parallel real-time tasks,the graph-based analysis becomes a necessity in the literature,and a convenient and efficient experiment tool is also desired for the evaluation and comparison.Therefore,SET-MRTS is developed to satisfy such an academic purpose.Besides,SET-MRTS has been used in several articles published in influential journals and conferences[26],[29],[41],[42],which further validates the practicality of this work.

Several simulation tools have also developed for the study of real-time scheduling algorithms[43],[44].Unfortunately,these projects have not been updated for many years and cannot provide a schedulability test.Beyond the synthetic evaluation,case studies in practical systems are also important to validate a proposed method in real-world environments.The Linux testbed for multiprocessor scheduling in real-time systems(LITMUS-RT)[45],[46]is a real-time extension of the Linux kernel,which is admissive to be a trustworthy testbed in the literature.LITMUS-RT provides abstractions and interfaces to help the prototyping of multiprocessor real-time scheduling and locking protocols,and an affiliated trace tool,i.e.,the Feather-Trace tool,has also been developed to provide statistics of system overhead.However,the limitation of LITMUS-RT includes the incapability of schedulability tests.In fact,the schedulability of sporadic tasks is hard to be verified by practical systems.Therefore,the synthetic evaluation is complementary to that conducted in a real-world environment.

6.Conclusion

In summary,SET-MRTS is compatible with mainstream synthesis models and is suitable for a variety of schedulability and synchronization analyses.Besides,SET-MRTS effectively helps to conduct empirical experiments with built-in components or external tools.The algorithm library consists of many implemented algorithms or analyses,which can be reused to reduce redundant coding in future research,and with comprehensively supported models,the new proposed analysis can be implemented efficiently.Furthermore,the efficiency of synthesis experiments can also be improved by deploying in a distributed environment.

Disclosures

The authors declare no conflicts of interest.

Journal of Electronic Science and Technology2022年1期

Journal of Electronic Science and Technology的其它文章: Linearly Polarized Polarization-Maintaining Er3+-Doped Fluoride Fiber Laser in the Mid-lnfrared; Total Transmission from Deep Learning Designs; lnvestigation on Surface Plasmon Polaritons and Localized Surface Plasmon Production Mechanism in Micro-Nano Structures; An in situ Digital Background Calibration Algorithm for Multi-Channel R-βR Ladder DACs; Retraining Deep Neural Network with Unlabeled Data Collected in Embedded Devices; Exploration of the Relation between lnput Noise and Generated lmage in Generative Adversarial Networks