dc.contributor.author: Qin, Xiaohan, 1964- (en_US)
dc.date.accessioned: 2009-10-06T16:53:03Z
dc.date.available: 2009-10-06T16:53:03Z
dc.date.issued: 1997 (en_US)
dc.identifier.other: b41658292 (en_US)
dc.identifier.other: 39983105 (en_US)
dc.identifier.other: Thesis 46618 (en_US)
dc.identifier.uri: http://hdl.handle.net/1773/6925
dc.description: Thesis (Ph. D.)--University of Washington, 1997 (en_US)
dc.description.abstract: Two recent trends are affecting the design of medium-scale shared-memory multiprocessors. The first is the use of nodes that themselves consist of clusters of processors. Clusters, already available as commodity parts, not only make powerful nodes but also let the system scale up gracefully. The second trend is the use of programmable protocol processors and software for maintaining cache coherence, which shortens the hardware design cycle and provides flexibility and extensibility.
One problem arising from software cache coherence is that remote memory accesses suffer a longer latency than with a pure hardware scheme. Another issue raised by software schemes in cluster environments is contention on the protocol processor, caused by the high service demand placed on this device.
Our solution to the first problem offers users or compiler writers a set of explicit communication primitives that provide hints for moving data properly and promptly. The communication primitives, running on protocol processors, introduce a flavor of message passing and permit protocol optimization. For the second issue, we investigate three architectural choices that strive to achieve resource balance: (1) selecting an appropriate cluster size to control resource sharing, (2) adding a remote cache (per node) to keep remote data in clusters, and (3) adding forwarding logic to reduce the load on the protocol processor and to speed up the processing of simple messages.
This dissertation studies how the overhead of a software scheme and its contention on the protocol processor can be reduced by various combinations of the design options, and how the software overhead can be further hidden by the communication primitives. In the absence of communication primitives, we employ an MVA-based analytical model to estimate the protocol processor's contention and overall performance with a fast turnaround. When communication primitives are present, we use simulation.
We find that the software implementation supplemented with a remote cache and forwarding logic can deliver performance competitive with the rigid, pure hardware scheme. With the judicious use of communication primitives, the enhanced software scheme can improve performance beyond the limit of the hardware implementation. In addition, software cache coherence is more flexible, more scalable, and easier to optimize. (en_US)
dc.format.extent: x, 125 p. (en_US)
dc.language.iso: en_US (en_US)
dc.rights: Copyright is held by the individual authors. (en_US)
dc.rights.uri: (en_US)
dc.subject.other: Theses--Computer science and engineering (en_US)
dc.title: On the use and performance of communication primitives in software controlled cache-coherent cluster architectures (en_US)
dc.type: Thesis (en_US)

