An iterative software pipelining method promotes instructions of a program loop to previous loop iterations and then reschedules the instructions until either (1) the resultant schedule is optimal … Warp is a systolic array built out of custom, high-performance processors, each of which can … Monica Lam is a coauthor of the classic compiler textbook popularly known as the Dragon Book. The trick is to replicate the body of the loop after it has been scheduled, allowing different registers to be used for different values of the same variable when they have to be … This is automated design-space exploration for matrix multiplication. Global software pipelining with iteration preselection. The basic idea behind software pipelining was first developed by Patel and Davidson for scheduling hardware pipelines. This paper shows that software pipelining is an effective and viable scheduling technique for VLIW processors. US patent: Software pipelining a hyperblock loop. The implementations of the ideas in this thesis would not have …
Rau and Glaeser, "Some scheduling techniques and an easily schedulable horizontal architecture for high performance scientific computing," in Proceedings of … Software pipelining: an effective scheduling technique for VLIW machines. Rau and Glaeser were the first to use software pipelining in a compiler for a machine with specialized hardware designed to support software pipelining. Pipelining on a Gold member of the Mill family. The elements of a pipeline are often executed in parallel or in time-sliced fashion. The preprocessing step ensures that each original instruction in the loop body can be over-executed as many times as necessary. Emerging architectures often have support for software pipelining. COMP 412, Fall 2010: Instruction Scheduling III, Software Pipelining. Lam's software pipelining algorithm is used in commercial systems for instruction-level parallelism. In software pipelining, iterations of a loop in the source program are continuously initiated at constant intervals, before the preceding iterations complete.
Time optimal software pipelining of loops with control flows. A report on compilation techniques for short-vector … In fact, Monica Lam presents an elegant solution to this problem in her thesis, A Systolic Array Optimizing Compiler (1989), ISBN 0898383005. Software pipelining (SWPL) is a performance-enhancing technique that exploits instruction-level parallelism on a wide instruction word to achieve execution speedup. The execution of a software-pipelined loop goes through three phases. Traditionally, this parallelism internal to the data path of a processor is only available to the microcode programmer, and the problems of minimizing the execution time of the microcode within and across basic blocks are known as local and global compaction, respectively. Pipelining and parallel functional units are common optimization techniques used in high-performance processors.
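The three phases mentioned above are commonly called the prologue, the steady state (or kernel), and the epilogue. The following is a minimal source-level sketch, not taken from any of the cited papers: the function names, the three-stage split (load, compute, store), and the arithmetic are all assumptions chosen for illustration. Python executes the kernel statements one after another, but each statement belongs to a different iteration and reads only values produced in an earlier pass, which is the property that would let a wide-issue machine run them together.

```python
def plain_loop(a):
    """Reference loop: each iteration runs load, compute, store in order."""
    out = [0] * len(a)
    for i in range(len(a)):
        x = a[i]          # stage 1: load
        y = x * 2 + 1     # stage 2: compute (illustrative arithmetic)
        out[i] = y        # stage 3: store
    return out


def pipelined_loop(a):
    """Same result, restructured as prologue / kernel / epilogue.

    Assumption: len(a) >= 2, because the prologue starts two iterations.
    """
    n = len(a)
    assert n >= 2
    out = [0] * n

    # Prologue: fill the pipeline.
    x = a[0]              # load(0)
    y = x * 2 + 1         # compute(0)
    x = a[1]              # load(1)

    # Kernel (steady state): one stage from each of three iterations.
    for i in range(n - 2):
        out[i] = y        # store(i)
        y = x * 2 + 1     # compute(i + 1)
        x = a[i + 2]      # load(i + 2)

    # Epilogue: drain the iterations still in flight.
    out[n - 2] = y        # store(n - 2)
    y = x * 2 + 1         # compute(n - 1)
    out[n - 1] = y        # store(n - 1)
    return out
```

In this sketch the kernel runs n - 2 times; the remaining work of the first and last two iterations is carried by the prologue and epilogue.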
Software pipelining, CMU School of Computer Science. Improving software pipelining by hiding memory latency. I'm pretty sure that CDC and FPS hand coders used the technique. Tolerating Latency Through Software-Controlled Data Prefetching: a dissertation submitted to the Department of Electrical Engineering and the Committee on Graduate Studies …
Lam, Computer Science Department, Stanford University. Automated empirical optimization of software and the ATLAS project. Method for software pipelining of irregular conditional … In Proceedings of the ACM SIGPLAN 1988 Conference on Programming Language Design and Implementation, pp. … A large number of pipeline stages (up to 20 in the Pentium 4) increases the branch penalty unless branch prediction is accurate.
BP2: combining branch predictor, by Scott McFarling, 1993. During the preprocessing stage, each instruction in the loop body is processed in turn. Lam's research team created the first widely adopted research compiler, SUIF. One approach involves improving the speed of the processor. Monica Lam is a professor in the Computer Science Department at Stanford.
We address the problem of time optimal software pipelining of loops with control flows, one of the most difficult open problems in the area of parallelizing compilers. Explicit compiler-based memory management for out-of-core … This has often been cited as a reason that software pipelining cannot be effectively implemented on conventional architectures. Monica Lam, Department of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania 152. Abstract: This paper shows that software pipelining is an effective and viable scheduling technique for VLIW processors.
First, the processors in the array cooperate at a fine granularity of parallelism; interaction between processors must be considered in the generation of code for individual processors. Branch predictor and predication. BP1: two-level branch predictor paper by Tse-Yu Yeh and Yale Patt in ISCA-19, 1992. Lam's contributions in compiler optimization include software pipelining, data locality, and parallelization. Software pipelining has been known to assembly language programmers of machines with instruction-level parallelism since such architectures existed. Of course, if the loop iterates fewer than k times at runtime, then the code must not enter the software-pipelined version. Software pipelining, University of Wisconsin-Madison. The advantage of software pipelining is that optimal performance can be achieved with compact object code.
Often, a test must be performed beforehand which jumps to an alternative, non-software-pipelined version of the loop in these cases. Toward a software pipelining framework for many-core chips, by Juergen Ributzka. The standard technique for dealing with them on a general-register machine, as originated by Monica Lam, is to unroll and inject register-to-register copies. Software pipelining: an effective scheduling technique for VLIW machines, by Monica Lam. She is the faculty director of the Open Virtual Assistant Lab (OVAL). An efficient scheduling technique for VLIW machines. Lam showed that special hardware is unnecessary for effective modulo scheduling. An extended scheduling technique for software pipelining. A Systolic Array Optimizing Compiler (1987), by M. Lam. Monica Lam is a professor in the Computer Science Department at Stanford University, and the faculty director of the Stanford MobiSocial Laboratory. Some computer architectures have explicit support for software pipelining, notably Intel's IA-64 architecture. All-new chapter on interprocedural analysis, written by world-renowned computer scientist Monica S. Lam.
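The guard test for short trip counts can be sketched as follows. This is a hedged illustration only: `MIN_TRIPS`, `dot_plain`, `dot_pipelined`, and `dot` are hypothetical names, and the dot-product loop is an assumed example rather than code from any cited source. The pipelined version starts two iterations in its prologue, so it is only safe when the loop runs at least twice; the dispatcher falls back to the plain loop otherwise.

```python
MIN_TRIPS = 2  # hypothetical "k": iterations the prologue below starts


def dot_plain(a, b):
    """Non-pipelined fallback; correct for any length, including 0."""
    total = 0
    for i in range(len(a)):
        total += a[i] * b[i]
    return total


def dot_pipelined(a, b):
    """Pipelined version; the caller must guarantee len(a) >= MIN_TRIPS."""
    n = len(a)
    x, y = a[0], b[0]            # load(0)
    p = x * y                    # multiply(0)
    x, y = a[1], b[1]            # load(1)
    total = 0
    for i in range(n - 2):       # kernel
        total += p               # accumulate(i)
        p = x * y                # multiply(i + 1)
        x, y = a[i + 2], b[i + 2]  # load(i + 2)
    total += p                   # accumulate(n - 2)
    total += x * y               # multiply(n - 1) and accumulate(n - 1)
    return total


def dot(a, b):
    """Guard test performed before entering the pipelined loop."""
    if len(a) < MIN_TRIPS:
        return dot_plain(a, b)   # too few iterations: take the plain loop
    return dot_pipelined(a, b)
```

Calling `dot_pipelined` directly on an empty or one-element list would index past the end of the input in the prologue, which is why the check sits in front of it.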
Monica Lam, A Systolic Array Optimizing Compiler (1989), ISBN 0898383005. Modulo renaming. There are many approaches for improving the execution time of an application program. In Proceedings of the SIGPLAN '88 Symposium on Programming Language Design and Implementation, pages 318-328. In computer science, software pipelining is a technique used to optimize loops, in a manner that parallels hardware pipelining. Software pipelining is a type of out-of-order execution, except that the reordering is done by a compiler (or, in the case of hand-written assembly code, by the programmer) instead of the processor. Special thanks to Monica Lam, Bob Rau and Richard Hu. Presents the five methods for translation to explain syntax-directed translation. She led the SUIF project, which produced one of the most popular research compilers, and pioneered numerous compiler techniques used in industry.
Software pipelining: an effective scheduling technique for VLIW machines, PLDI 1988. The microarchitecture breaks the instructions into MIPS-like micro-operations, but this complicates the design and wastes silicon. Removing anti dependences by repairing. It documents the research and results of the compiler technology developed for the Warp machine. Software pipelining is an excellent method for improving the parallelism in loops even when other methods fail. An experimental study of an ILP-based exact solution. Section 1 introduces terms common to many techniques.
Illustrates new techniques for dataflow analysis that emphasize the unity of code optimization and other program analysis software. Effective compiler generation of such code dates to the invention of modulo scheduling by Rau and Glaeser [1]. Monica Lam is part of Stanford Profiles, official site for faculty, postdocs, students and staff information (expertise, bio, research, publications, and more). In the meantime, trace scheduling was touted to be the scheduling technique of choice for VLIW (very long instruction word) machines. Basic instruction scheduling and software pipelining. Monica Lam obtained her BS degree in computer science from the University of British Columbia, and her PhD degree in computer science from Carnegie Mellon University in 1987.
The article would be stronger with better references to SWP pre-modulo scheduling. Lam, Anoop Gupta. In Proceedings of the Fifth International Conference on Architectural Support for Programming Languages and Operating Systems, 1992. Software-controlled data prefetching is a promising technique for improving the performance of the memory subsystem to match today's high-performance processors. Software pipelining is a design pattern for programming a processor with multiple …
Software pipelining: an effective scheduling technique for VLIW machines, in Proceedings of the ACM SIGPLAN '88 Conference on Programming Language Design and Implementation (PLDI '88), July 1988, pages 318-328. And today, software pipelining is used in all advanced compilers for machines with instruction-level parallelism, none of which, except the Intel Itanium, relies on any specialized support for software pipelining. Software pipelining is a practical and efficient loop scheduling technique used in generating efficient code for VLIW architectures, superscalar processors and microcode compaction for horizontal microarchitectures. … Satyanarayanan, Garth Gibson, and Monica Lam for their input and guidance on this dissertation and the research behind it. Software pipelining allows exploitation of parallelism inside and across loop iterations. Figure 4: Reducing the ramp-up/ramp-down effect with software pipelining. Modulo variable expansion: even though John Ruttenberg wrote the words, and he knows a lot more than I do about software pipelining (and certainly where the bodies are buried), I corrected "modulo renaming" to "modulo variable expansion," as that's what Monica Lam calls it in her paper and in her thesis.
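Modulo variable expansion, as described on this page, replicates the scheduled loop body so that different registers can hold different values of the same variable. The sketch below is an assumed, simplified illustration and not Lam's algorithm as published: the function names are hypothetical, the Python variables `r0` and `r1` stand in for machine registers, each loaded value is assumed to stay live for two kernel passes, and the input length is assumed to be even and at least 2 so that no cleanup code is needed for leftover iterations.

```python
def add_one_plain(a):
    """Reference loop: load a value, then store value + 1."""
    out = [0] * len(a)
    for i in range(len(a)):
        r = a[i]
        out[i] = r + 1
    return out


def add_one_expanded(a):
    """Kernel unrolled twice so two in-flight values get separate registers.

    Assumptions for this sketch: len(a) is even and >= 2.
    """
    n = len(a)
    assert n >= 2 and n % 2 == 0
    out = [0] * n

    # Prologue: two loads are in flight, each in its own register.
    r0 = a[0]
    r1 = a[1]

    # Kernel: the first copy of the body uses r0, the second uses r1.
    # With a single register, load(j + 1) would overwrite the value of
    # load(j) before iteration j had consumed it.
    for j in range(0, n - 2, 2):
        out[j] = r0 + 1        # finish iteration j
        r0 = a[j + 2]          # start iteration j + 2, reusing r0
        out[j + 1] = r1 + 1    # finish iteration j + 1
        r1 = a[j + 3]          # start iteration j + 3, reusing r1

    # Epilogue: drain the last two iterations.
    out[n - 2] = r0 + 1
    out[n - 1] = r1 + 1
    return out
```

The alternative to unrolling would be a single kernel copy that shifts values between registers on every pass, which is the register-to-register copying that the unrolled form avoids.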
This is completely unnecessary on the Mill, because the drops from prior iterations are still on the belt and can be addressed directly without unrolling or copying. A compilation technique for software pipelining of loops with … Novel parallel Givens QR decomposition implementation on … In computing, a pipeline is a set of data processing elements connected in series, where the output of one element is the input of the next one. Monica Lam has been a professor in the Computer Science Department at Stanford University since 1988. A Guide to the Theory of NP-Completeness (1979), by M. R. Garey and D. S. Johnson. In Proceedings of the ACM SIGPLAN '88 Conference on Programming Language Design and Implementation, pages 318-328, Atlanta, GA, July 1988. Our understanding of software pipelining subsequently deepened with the work of many others. Anti dependences (write-after-read dependences) constrain the reordering of instructions and limit the effectiveness of instruction scheduling and software pipelining techniques for superscalar and VLIW processors.
I tried to clarify the history of software pipelining, distinguish between SWP and modulo scheduling, and generally to credit Rau, Lam, and Gao with their true accomplishments. A method for software pipelining of irregular conditional control loops, including preprocessing the loops so they can be safely software pipelined. Software pipelining has been widely accepted as an efficient technique for scheduling instructions in a loop body for VLIW and superscalar processors. Hierarchical reduction complements the software pipelining technique, permitting a consistent performance improvement to be obtained. Software pipelining is a technique to improve the performance of a loop by overlapping the execution of several iterations. This thesis details the implementation of swing modulo scheduling, a software pipelining technique, that is both e… Lam is a professor of computer science at Stanford University, was the chief scientist at Tensilica and the founding CEO of Moka5. As the name implies, pipelining here is not primarily achieved through hardware, but rather refers to the interleaved execution of distinct iterations of a software loop.