Parallel Distributed Implementation of GHT on an Ethernet Multicluster

Abstract

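In an Ethernet cluster, scaling up distributed processing is physically constrained by the maximum number of ports per switch (currently 48). In this study, to extend the scale of distributed processing of the generalized Hough transform (GHT) on an MPI-based Ethernet cluster, we built a multicluster from multiple Ethernet switches, analyzed the communication overhead caused by the expansion with a parallel distributed execution-time analysis model and a communication performance model, and then implemented an accelerated version. We evaluated the task partitioning policies that are feasible in a multicluster distributed processing environment and applied a modified accumulator array partitioning (AAP) policy for the Hough space to minimize both the number of internode communications and the communication time. As the number of nodes increases, the range of data partitioned under the AAP policy is enlarged, and a matching load balancing algorithm was implemented. To reduce the intercluster communication delay, which suffers from a single-link bottleneck, as far as possible, we used a linear pipeline broadcast to distribute work and a linear flat tree to gather the small result messages, so that computation and communication overlap in time as much as possible. We analyzed the proposed parallel distributed GHT asymptotically and tested it experimentally on an Ethernet multicluster: on a 128-node MPI-based multicluster built with four Fast Ethernet switches, we observed nearly linear speedup.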
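As a rough illustration of the idea behind accumulator array partitioning, the sketch below splits a one-dimensional range of accumulator cells into contiguous, near-equal slices, one per node. The function name `partition_accumulator` and the equal-size rule are assumptions made for illustration; the abstract does not specify the modified AAP policy or the load balancing algorithm actually used.

```python
def partition_accumulator(num_cells, num_nodes):
    """Split accumulator indices 0..num_cells-1 into contiguous slices.

    Hypothetical helper: each node gets a half-open range (start, end),
    and slice sizes differ by at most one cell.
    """
    base, extra = divmod(num_cells, num_nodes)
    slices = []
    start = 0
    for rank in range(num_nodes):
        # The first `extra` ranks absorb one leftover cell each.
        size = base + (1 if rank < extra else 0)
        slices.append((start, start + size))
        start += size
    return slices


if __name__ == "__main__":
    for rank, (lo, hi) in enumerate(partition_accumulator(10, 4)):
        print(f"node {rank}: cells [{lo}, {hi})")
```

With this kind of split each node votes only into its own slice, so only the small per-node results need to be exchanged afterwards.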

Abstract (other language)

Extending the scale of distributed processing in a single Ethernet cluster is physically restricted by the maximum number of ports per switch. This paper presents an implementation of an MPI-based multicluster consisting of multiple Ethernet switches for extending the scale of distributed processing, and an asymptotic analysis of the communication overhead through an execution-time analysis model. To determine an optimum task partitioning, we analyzed the processing time for various partitioning schemes, and the accumulator array partitioning (AAP) scheme was finally chosen to minimize the overall communication overhead. The scope of the data partitioned in AAP was modified to fit the increased number of nodes, and a suitable load balancing algorithm was implemented. We alleviated the communication overhead by using pipelined broadcast and flat-tree based result gathering, and by overlapping communication with computation. We used the linear pipeline broadcast to reduce the overhead of intercluster communication, which is carried over a single link. Experimental results show nearly linear speedup for the proposed parallel distributed GHT implemented on an MPI-based Ethernet multicluster with four 100 Mbps Ethernet switches and up to 128 Pentium PC nodes.
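To see why a linear pipeline broadcast can help over a bottleneck link, the sketch below compares two textbook cost estimates under a simple latency/bandwidth model. It is not the paper's communication performance model: the parameters `alpha` (per-message startup cost), `beta` (transfer cost per byte), and `k` (number of segments) are assumptions introduced for illustration.

```python
def sequential_broadcast_time(p, m, alpha, beta):
    """Root sends the whole m-byte message to each of the p-1 receivers in turn."""
    return (p - 1) * (alpha + beta * m)


def pipelined_broadcast_time(p, m, k, alpha, beta):
    """Message is cut into k segments and forwarded along a chain of p nodes.

    The last node receives the final segment after k + p - 2 segment steps.
    """
    return (k + p - 2) * (alpha + beta * m / k)


if __name__ == "__main__":
    p, m, alpha, beta = 128, 1_000_000, 1e-4, 8e-8  # illustrative values only
    print("sequential:", sequential_broadcast_time(p, m, alpha, beta))
    for k in (1, 16, 256):
        print(f"pipelined k={k}:", pipelined_broadcast_time(p, m, k, alpha, beta))
```

In this model more segments shorten the time to fill the chain but add startup cost per segment, so the best `k` depends on the ratio of `alpha` to `beta * m`.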

Keywords

multicluster, parallel processing, speedup, GHT, MPI
