
How to compute complex vectors' inner product using cublas or thrust?

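After searching for a long time, I still cannot solve this problem.

I have two vectors: x = [a1,...,aN], y = [b1,...,bN].

I want to compute their inner product: <x, y> = a1 * conj(b1) + ... + aN * conj(bN). (conj(.) denotes complex conjugation.)

I tried cublasCdotu, which only computes a1 * b1 + ... + aN * bN.

cublasCdotc returns conj(a1) * conj(b1) + ... + conj(aN) * conj(bN).

Finally, I tried thrust::inner_product, which also computes a1 * b1 + ... + aN * bN.

My thrust code is as follows: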

typedef thrust::complex<float> comThr;
thrust::host_vector< comThr > x( vec_size );
thrust::generate(x.begin(), x.end(), rand);

thrust::host_vector< comThr > y( vec_size );
thrust::generate(y.begin(), y.end(), rand);
comThr z = thrust::inner_product(x.begin(), x.end(), y.begin(), comThr(0.0f,0.0f) );

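Could you give me some advice on this problem? Thanks!

You can do this with thrust::inner_product. All that is required is a user-defined binary function that implements a * conj(b), where conj is the complex conjugate. The Thrust library includes all the complex operators required, so implementing an operator like this is straightforward: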

  __host__ __device__
  comThr operator()(comThr a, comThr b) const
  {
    return a * thrust::conj(b);
  }

A complete working example:

#include <iostream>
#include <cstdlib>               // rand, RAND_MAX
#include "thrust/host_vector.h"
#include "thrust/functional.h"
#include "thrust/complex.h"
#include "thrust/generate.h"     // thrust::generate
#include "thrust/inner_product.h"
#include "thrust/random.h"

typedef thrust::complex<float> comThr;

// Binary functor computing a * conj(b). No base class is required:
// thrust::inner_product only needs a callable with this signature.
struct a_dot_conj_b
{
  __host__ __device__
  comThr operator()(comThr a, comThr b) const
  {
    return a * thrust::conj(b);
  }
};

__host__ static __inline__ comThr rand_comThr()
{
    return comThr((float)rand()/RAND_MAX, (float)rand()/RAND_MAX);
}

int main()
{
    const int vec_size = 16;

    thrust::host_vector< comThr > x( vec_size );
    thrust::generate(x.begin(), x.end(), rand_comThr);

    thrust::host_vector< comThr > y( vec_size );
    thrust::generate(y.begin(), y.end(), rand_comThr);

    comThr z = thrust::inner_product(x.begin(), x.end(), y.begin(), comThr(0.0f,0.0f),
                                     thrust::plus<comThr>(),  a_dot_conj_b());

    comThr zref(0.0,0.0);
    for(int i=0; i<vec_size; i++) {
        comThr val = x[i] * thrust::conj(y[i]);
        std::cout << i << " " << x[i] << " op " << y[i] << " = " << val  << std::endl;
        zref += val;
    }

    std::cout << "z = " << z << " zref = " << zref << std::endl;

    return 0;
}

It compiles and runs like this:

$ nvcc -arch=sm_52 -o dotprod_thrust dotprod_thrust.cu 

$ ./dotprod_thrust 

0 (0.394383,0.840188) op (0.296032,0.61264) = (0.631482,0.00710744)
1 (0.79844,0.783099) op (0.524287,0.637552) = (0.917879,-0.0984784)
2 (0.197551,0.911647) op (0.972775,0.493583) = (0.642147,0.78932)
3 (0.76823,0.335223) op (0.771358,0.292517) = (0.690638,0.0338566)
4 (0.55397,0.277775) op (0.769914,0.526745) = (0.572826,-0.0779383)
5 (0.628871,0.477397) op (0.891529,0.400229) = (0.751725,0.173921)
6 (0.513401,0.364784) op (0.352458,0.283315) = (0.284301,-0.0168827)
7 (0.916195,0.95223) op (0.919026,0.807725) = (1.61115,0.135091)
8 (0.717297,0.635712) op (0.949327,0.0697553) = (0.725294,0.553463)
9 (0.606969,0.141603) op (0.0860558,0.525995) = (0.126716,-0.307077)
10 (0.242887,0.0163006) op (0.663227,0.192214) = (0.164222,-0.0358752)
11 (0.804177,0.137232) op (0.348893,0.890233) = (0.40274,-0.668025)
12 (0.400944,0.156679) op (0.020023,0.0641713) = (0.0180824,-0.0225919)
13 (0.108809,0.12979) op (0.0630958,0.457702) = (0.0662707,-0.0416127)
14 (0.218257,0.998924) op (0.970634,0.23828) = (0.449871,0.917584)
15 (0.839112,0.512932) op (0.85092,0.902208) = (1.17679,-0.32059)
z = (9.23213,1.02127) zref = (9.23213,1.02127)
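
On the cuBLAS side, the cuBLAS documentation describes cublasCdotc as conjugating the elements of its first vector argument only, which differs from the behaviour reported in the question. Assuming that documented definition holds, the desired inner product follows from swapping the two vectors, with no extra conjugation. This is a derivation sketch, not something verified in the original answer; cdotc(u, v) is shorthand for the value cublasCdotc returns for vectors u and v:

```latex
\begin{aligned}
\operatorname{cdotc}(u, v) &= \sum_{i=1}^{N} \overline{u_i}\, v_i \\
\langle x, y \rangle &= \sum_{i=1}^{N} a_i\, \overline{b_i}
  = \sum_{i=1}^{N} \overline{b_i}\, a_i
  = \operatorname{cdotc}(y, x)
\end{aligned}
```

So passing y as the first vector and x as the second should return the sum directly. It is worth checking the result against a host loop like the zref computation in the example.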
