Number of cores affecting multithreading in the OS
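I'm trying to learn something about how my PC works. I have a dual-core PC and am testing it with code I wrote: the program multiplies two matrices using threads (in Java), with each thread handling a share of the matrix rows equal to the row count divided by the number of threads. Testing my code on two 1024 x 1024 matrices, I got the following results (each is the median of 10 runs):
1 thread: 9.878 seconds
2 threads: 5.944 seconds
3 threads: 5.062 seconds
4 threads: 4.895 seconds
5 to 1024 threads: the time varies between 4.8 and 5.3 seconds
What I'm trying to figure out is why the time decrease gets smaller with each of the first four threads. Shouldn't the work be split evenly between the cores? I would expect 1 thread to take 10 seconds, 2 threads to take 5 seconds, and anything beyond that to take longer, since I only have 2 cores and adding more threads just creates more context switching.
The second thing I can't understand: assuming that beyond the 4th thread my PC is just switching between threads, which doesn't really split the work but only changes which thread is doing a given job, shouldn't the time increase dramatically with 1024 threads, simply because I'm making it do so much context switching?
Thanks in advance for any response on this matter.
Adding the code: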
/**
 * A class representing matrix-multiplying threads; implements Runnable.
 * Used to test how the running time changes with the number of
 * threads used in the program.
 *
 * @author R.G
 */
public class MatrixMultThread implements Runnable {
    // Thread fields and constants
    private static final String INT_ERROR = "An error has occurred during thread join";
    private static final String THREAD_COUNT_ERROR = "Invalid number of threads";
    static final int MATRIX_ROW = 1024;
    static final int MATRIX_COL = 1024;
    static final int UPPER_THREAD_LIMIT = MATRIX_ROW;
    private final int startRow;
    private final int endRow;
    private final float[][] target;
    private final float[][] sourceTwo;
    private final float[][] sourceOne;

    /**
     * MatrixMultThread constructor - constructs the threads that handle multiplication.
     *
     * @param startRow - the row this thread should start calculating from
     * @param endRow - the row this thread should stop calculating at (included in calc)
     * @param sourceOne - first matrix in the multiplication
     * @param sourceTwo - second matrix in the multiplication
     * @param target - result matrix
     */
    public MatrixMultThread(int startRow, int endRow, float[][] sourceOne, float[][] sourceTwo, float[][] target) {
        this.startRow = startRow;
        this.endRow = endRow;
        this.target = target;
        this.sourceOne = sourceOne;
        this.sourceTwo = sourceTwo;
    }

    /**
     * Thread run method, performing the actual calculation for
     * this thread's rows.
     */
    @Override
    public void run() {
        for (int row = startRow; row <= endRow; row++) {
            for (int j = 0; j < MATRIX_COL; j++) {
                // Accumulate in a float: an int accumulator would silently
                // truncate the fractional part on every addition.
                float sum = 0f;
                for (int i = 0; i < MATRIX_ROW; i++) {
                    sum += sourceOne[row][i] * sourceTwo[i][j];
                }
                target[row][j] = sum;
            }
        }
    }

    /**
     * A method used for multiplying two matrices by threads.
     *
     * @param a - first source matrix
     * @param b - second source matrix
     * @param threadCount - number of threads to use in the multiplication
     * @return the product matrix
     */
    public static float[][] mult(float[][] a, float[][] b, int threadCount) {
        if (threadCount > UPPER_THREAD_LIMIT || threadCount < 1) {
            System.out.println(THREAD_COUNT_ERROR);
            System.exit(1);
        }
        // Result matrix
        float[][] result = new float[MATRIX_ROW][MATRIX_COL];
        Thread[] threadList = new Thread[threadCount];
        // Creating the threads
        int rowsPerThread = MATRIX_ROW / threadCount;
        for (int i = 0; i < threadCount; i++) {
            int firstRow = i * rowsPerThread;
            int lastRow = ((i + 1) * rowsPerThread) - 1;
            // In case the row count does not divide exactly, the last thread
            // does a bit of extra work to cover the remaining matrix rows.
            if ((i + 1) == threadCount) {
                lastRow = MATRIX_ROW - 1;
            }
            Thread singleThread = new Thread(new MatrixMultThread(firstRow, lastRow, a, b, result));
            threadList[i] = singleThread;
            singleThread.start();
        }
        // Join loop
        for (int i = 0; i < threadCount; i++) {
            try {
                threadList[i].join();
            } catch (InterruptedException e) {
                System.out.println(INT_ERROR);
                System.exit(1);
            }
        }
        return result;
    }

    /**
     * Main method: multiplies two matrices using a given number of threads
     * and times the multiplication.
     *
     * @param args unused
     */
    public static void main(String[] args) {
        // Thread number and timers for milliseconds calculation.
        int numberOfThreads = 1024;
        long startTimer, endTimer;
        // Initializing matrices with random values in the range [0, 101)
        float[][] a = new float[MATRIX_ROW][MATRIX_COL];
        float[][] b = new float[MATRIX_ROW][MATRIX_COL];
        for (int i = 0; i < MATRIX_ROW; i++) {
            for (int j = 0; j < MATRIX_COL; j++) {
                a[i][j] = (float) (Math.random() * ((100 - 0) + 1));
                b[i][j] = (float) (Math.random() * ((100 - 0) + 1));
            }
        }
        // Timing the multiplication.
        startTimer = System.currentTimeMillis();
        mult(a, b, numberOfThreads);
        endTimer = System.currentTimeMillis();
        System.out.println("Matrices multiplied in " + (endTimer - startTimer) + " milliseconds");
    }
}
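Your program is CPU-bound, which means each thread consumes its entire scheduler quantum. The context-switch overhead is therefore relatively small: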
overhead = ((consumed_quanta + context_switch_time) / consumed_quanta) - 1
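To get a feel for the formula, here is a minimal sketch that plugs in some numbers. The 10 ms quantum, the 5 µs switch cost and the 20 µs early-yield time are purely assumed values for illustration, not measurements from any real scheduler, and the class and method names are hypothetical.

```java
public class ContextSwitchOverhead {

    /** Relative overhead per the formula above: ((quantum + switch) / quantum) - 1. */
    static double overhead(double consumedQuantumMicros, double contextSwitchMicros) {
        return ((consumedQuantumMicros + contextSwitchMicros) / consumedQuantumMicros) - 1.0;
    }

    public static void main(String[] args) {
        double switchCost = 5.0;       // assumed: 5 microseconds per context switch
        double fullQuantum = 10_000.0; // assumed: a CPU-bound thread uses a full 10 ms quantum
        double earlyYield = 20.0;      // assumed: a thread that blocks after only 20 microseconds

        System.out.println("CPU-bound thread:         " + overhead(fullQuantum, switchCost));
        System.out.println("Thread that blocks early: " + overhead(earlyYield, switchCost));
    }
}
```

With these assumed numbers the CPU-bound case loses a small fraction of a percent to switching, while the thread that gives up the CPU early loses a quarter of its useful time, which is the contrast the formula is meant to show.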
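Context-switch overhead is larger for processes that leave the CPU voluntarily. For example, two threads that continuously pass the same message back and forth (one thread sends the message while the other reads it, then the second sends it back to the first, and so on) would have very high context-switch overhead.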
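A rough sketch of such a message-passing pair, in case you want to observe it yourself. It uses `java.util.concurrent.SynchronousQueue` so that every hand-off makes one thread wait for the other; the class name `PingPong`, the method `pingPong` and the round-trip count are hypothetical choices, and the timing you see will depend heavily on your OS and JVM (the queue may spin briefly before actually blocking).

```java
import java.util.concurrent.SynchronousQueue;

public class PingPong {

    /** Bounces a token between two threads and returns the elapsed time in nanoseconds. */
    static long pingPong(int roundTrips) {
        SynchronousQueue<Integer> toEcho = new SynchronousQueue<>();
        SynchronousQueue<Integer> fromEcho = new SynchronousQueue<>();

        // The echo thread receives each message and sends it straight back.
        Thread echo = new Thread(() -> {
            try {
                for (int i = 0; i < roundTrips; i++) {
                    fromEcho.put(toEcho.take());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        echo.start();

        try {
            long start = System.nanoTime();
            for (int i = 0; i < roundTrips; i++) {
                toEcho.put(i);               // hand the message over, waiting for the receiver
                int reply = fromEcho.take(); // wait until it comes back
                if (reply != i) {
                    throw new IllegalStateException("unexpected reply " + reply);
                }
            }
            long elapsed = System.nanoTime() - start;
            echo.join();
            return elapsed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted during ping-pong", e);
        }
    }

    public static void main(String[] args) {
        int roundTrips = 100_000; // arbitrary illustrative count
        long nanos = pingPong(roundTrips);
        System.out.println("Average round trip: " + (nanos / roundTrips) + " ns");
    }
}
```

Neither thread does any real computation between hand-offs, so nearly all of the measured time is synchronization and scheduling cost. That is the opposite of the matrix multiplication, where each thread computes for its whole time slice.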
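SMT (HyperThreading in x86 land) allows a single core to service multiple threads as if it were several logical cores. Since a CPU often has to wait for external resources (for example, it needs data from the cache), letting another thread run during those dead periods can improve performance with relatively little extra circuitry compared to adding another core. The figure typically quoted for the performance gain from HT in real systems (as opposed to synthetic benchmarks) is around 10-20%, but your mileage may vary: HT can make performance worse in some edge cases and can give a much larger improvement in others.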
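One quick way to check whether SMT might be in play is to ask the JVM how many processors it sees. `Runtime.getRuntime().availableProcessors()` reports the processors available to the JVM, which normally counts logical (SMT) processors rather than physical cores. The class and method names below are hypothetical, and whether the machine in the question actually has HyperThreading is only an assumption; a result of 4 on a dual-core PC would be consistent with it.

```java
public class LogicalCores {

    /** Number of processors the JVM can schedule threads on (logical, not physical). */
    static int logicalProcessors() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("Logical processors visible to the JVM: " + logicalProcessors());
    }
}
```

If this prints 4 on a machine with two physical cores, the modest gains from the 3rd and 4th threads in the measurements above would fit the 10-20% range quoted for HT.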