Computer Architecture & Engineering (ARC)

Computer Architecture is a focal point for research into various aspects of computer design and performance.

Computer architecture is a set of rules and methods that describe the functionality, organization, and implementation of computer systems. Some definitions treat architecture as describing the capabilities and programming model of a computer but not a particular implementation. In other definitions, computer architecture involves instruction set architecture design, microarchitecture design, logic design, and implementation.

Topics:


Branch Prediction
Many techniques have been developed to improve the efficiency of branch handling. Review some of these: include some data showing the efficiency gains obtained. Find out which modern commercial processors use these techniques and describe the strategies they use.
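As a starting point for gathering efficiency data, one of the simplest dynamic schemes can be simulated in a few lines. The sketch below models a table of 2-bit saturating counters indexed by the low bits of the branch address. The class name, table size, initial counter value and the loop trace are all assumptions made for illustration, not a description of any particular commercial processor.

```python
class TwoBitPredictor:
    """Table of 2-bit saturating counters indexed by low bits of the branch address."""

    def __init__(self, index_bits=4):
        self.mask = (1 << index_bits) - 1
        # Counter values 0 and 1 predict "not taken"; 2 and 3 predict "taken".
        # Starting every counter at 1 (weakly not taken) is an arbitrary choice.
        self.counters = [1] * (1 << index_bits)

    def predict(self, pc):
        return self.counters[pc & self.mask] >= 2

    def update(self, pc, taken):
        i = pc & self.mask
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def accuracy(predictor, trace):
    """Fraction of (pc, taken) outcomes in the trace that were predicted correctly."""
    correct = 0
    for pc, taken in trace:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct / len(trace)


# Hypothetical trace: a loop-closing branch at address 0x40 that is taken
# nine times and then falls through, with the whole loop executed ten times.
loop_trace = ([(0x40, True)] * 9 + [(0x40, False)]) * 10
loop_accuracy = accuracy(TwoBitPredictor(), loop_trace)
```

On this made-up trace the predictor gets 89 of the 100 branches right: it misses the first taken branch while warming up and then misses only the loop exit on each pass. Real traces from benchmark programs would be needed for meaningful figures.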
MultiScalar Architectures
The ability to put 10^7 or more transistors on a chip poses a new problem for computer architects. Now that it's relatively easy to fit a large register file, decent sized cache, large TLB, branch prediction unit, all the floating point hardware an engineer needs, etc, how is it best to use all the transistors? Superscalar processors duplicate ALUs and associated functional units and attempt to issue more than one instruction in each cycle, but this needs complex hazard detection logic. Another approach is to provide multiple, essentially independent processors - the multiscalar approach. Review this approach: describe some of the proposed architectures and their expected performance.
Read K Olukotun et al, The Case for a Single-Chip Multiprocessor, Computer Arch News, 24, Oct 1996, 2-11 (Proc ASPLOS-VII, Cambridge, 1996) for an introduction to this concept, and check recent proceedings of ISCA, Supercomputing, etc conferences for additional references. Useful names are J E Smith, G Sohi, ..
Cache Coherence
Most modern high performance processors include support for coherent caches so that they can be used in shared memory clusters. Describe the capabilities of the cache coherence units in some of the commercially available processors. Find some papers dealing with the problem of keeping caches coherent across network-connected processors and discuss how this affects the potential for building very large scale shared memory processors.
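To get a feel for what a coherence unit has to do, it may help to trace a simple invalidation protocol by hand. The sketch below follows one cache line through the textbook MSI (Modified, Shared, Invalid) states on a snooping bus. The class and method names are hypothetical, and commercial processors generally use richer protocols than this one.

```python
class MSIBus:
    """Tracks the MSI state of a single cache line in each of n caches."""

    def __init__(self, n_caches):
        self.state = ["I"] * n_caches
        self.bus_transactions = 0

    def read(self, cpu):
        if self.state[cpu] == "I":
            # Read miss: the request appears on the bus. A cache holding the
            # line Modified supplies the data and downgrades to Shared.
            self.bus_transactions += 1
            self.state = ["S" if s == "M" else s for s in self.state]
            self.state[cpu] = "S"
        # A read hit in S or M needs no bus traffic.

    def write(self, cpu):
        if self.state[cpu] != "M":
            # Write miss or upgrade: every other copy is invalidated.
            self.bus_transactions += 1
            self.state = ["I"] * len(self.state)
            self.state[cpu] = "M"
        # A write hit in M needs no bus traffic.


bus = MSIBus(3)
bus.read(0)   # cache 0 loads the line
bus.read(1)   # cache 1 shares it
bus.write(0)  # cache 0 upgrades, cache 1 is invalidated
bus.read(2)   # cache 0 downgrades to Shared, cache 2 loads the line
```

Counting `bus_transactions` for different access patterns gives a rough idea of why coherence traffic becomes a concern as the number of processors grows.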
Distributed Shared Memory Systems
Distributed memory systems constructed from powerful workstations linked by networks are a cheap way to build a parallel computer. However, it's generally considered that shared memory models are easier to program, so there is much interest in providing an illusion of shared memory on such systems. Review the techniques used to emulate a shared memory machine on one with physically distributed memory. Find some data demonstrating the performance of such systems. The paper by Protic et al (see references) should be useful.
Memory Technology
Memory performance affects overall system performance more than processor speed in some applications. Review alternative memory technologies: RamBus, Synchronous DRAM, optical (holographic) memory, etc.
IEEE Micro, vol 17(6), Nov/Dec 1997 will be a good starting point.
Memory Interfaces
The bandwidth to main memory is a limiting factor in single processor systems. In a system with multiple processors sharing the same bus, the strain on available bandwidth increases: not only do you have transactions originating from more than one processor, you now have cache coherence transactions as well! This is making manufacturers consider cross-bar switches between the CPUs and the memory. Find out what is done in the latest multi-processor "clusters" to provide sufficient processor-memory bandwidth for all the processors on the bus. Get some technical data on the SGI Challenge architectures as a starting point. Looking at the supercomputers (eg Cray) and how they approach this problem would also be enlightening!
Intelligent Memories
Some authors have proposed performing some simple calculations in the memory sub-systems as a way of relieving the demand for the processor-memory bus. Review some of these proposals.
Starting source: Papers by David Patterson.
Asynchronous Processors
The problems with global clock distribution are encouraging researchers to look at processors which use asynchronous logic. Review some of the techniques and circuits used in asynchronous processors. Avoiding the need for a global clock is only one of the advantages offered by asynchronous processors: there are at least three others - what are they?
Arithmetic
Although you might think that almost everything that there is to know about binary arithmetic has been discovered by now, new ideas are still appearing! Review some techniques for building fast adders, multipliers, dividers, etc. Alternatively, you could review techniques for implementing more complex calculations such as FFTs, FIR filters, etc directly in hardware.
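One classic fast-adder technique is carry lookahead, which replaces the rippling carry chain with carries computed directly from per-bit generate and propagate signals. The bit-level model below is only a sketch of that idea: the function name and default width are assumptions, the carry into bit 0 is taken to be 0, and the nested loops stand in for logic that hardware would evaluate in parallel.

```python
def carry_lookahead_add(a, b, width=8):
    """Add two unsigned width-bit integers; return (sum mod 2**width, carry_out)."""
    # Per-bit generate (both inputs 1) and propagate (exactly one input 1) signals.
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]

    # Each carry is written directly in terms of g and p, never in terms of
    # another carry: c[i+1] = g[i] | p[i]g[i-1] | p[i]p[i-1]g[i-2] | ...
    carries = [0]  # carry into bit 0 is assumed to be 0
    for i in range(width):
        c = 0
        for j in range(i + 1):
            term = g[j]
            for k in range(j + 1, i + 1):
                term &= p[k]  # a carry generated at bit j must propagate through bits j+1..i
            c |= term
        carries.append(c)

    total = 0
    for i in range(width):
        total |= (p[i] ^ carries[i]) << i  # sum bit = propagate XOR carry-in
    return total, carries[width]
```

For example, `carry_lookahead_add(200, 100)` returns `(44, 1)`, since 300 = 256 + 44. The number of product terms grows quickly with the width, which is why practical designs group bits into blocks; comparing such trade-offs would be a natural part of the review.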
Graphics Accelerators
Intel's MMX technology has managed to attract a fair bit of publicity, but it's by no means the first attempt to add hardware to a system to enhance graphics handling. Review the architectural enhancements needed to accelerate the processing of images. Describe some commercially available and some research processors for this purpose.
IEEE Micro, vol 16(4), Aug 1996 is a useful starting point. 
3D graphics processing is a possible focus here: see Talisman, IEEE Micro, 17(2), 11 (1997).
VLIW
VLIW architectures are not a new idea, but since Intel and HP chose a VLIW style for the new Merced processor, they have taken on major significance! Review VLIW architectures: identify their strengths and drawbacks; describe some VLIW machines which have been produced commercially. (Whether Merced fits this description remains to be seen, but that shouldn't stop you writing about it!)
Dedicated Processors
Many architectures have been developed or proposed for specific purposes, eg MPEG encoding/decoding, other video compression techniques, Neural Nets, encryption, etc.
Select one application area and review the architecture of special purpose processors designed primarily for this application.
Optical Technologies
Optical techniques have the potential to overcome some of the limitations which copper-based systems exhibit. There is scope for a number of essays here: optical interconnects, optical processors, optical memory, etc.