How to process feature vectors with different dimensions in machine learning?

Question

I'm a beginner in machine learning, and I'm trying to use a data set to train a log-linear classifier. The data set contains five features, and each feature is a vector, but the dimensions of the features differ: they are 3, 1, 6, 2, and 2 respectively. I tried using PCA in scikit-learn to reduce each feature to one dimension, but it didn't work well. How should I process the features so they fit a log-linear classifier such as logistic regression?

Answer 1

A simple way to do this is to flatten all of your features into a single vector per sample and then feed that into your classifier.

An example:

features = [... 
          [[0, 1, 3], [5], [2, 6, 4, 7, 8, 9], [1, 0], [0, 1]],  # for one sample
          ...]

Use a list comprehension to flatten each list inside features:

flattened_features = [[i for k in f for i in k] for f in features]

This will turn features into something like this:

    flattened_features
    [... 
    [0,1,3,5,2,6,4,7,8,9,1,0,0,1], #for one sample
    ...]
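Before converting, it may be worth checking that every flattened row has the same length, since the per-feature dimensions 3, 1, 6, 2, and 2 should always add up to 14. Here is a minimal sketch of that check; the second sample and the name `expected_width` are made up for illustration and are not from the question:

```python
from itertools import chain

# Hypothetical samples: five feature vectors each, with dimensions 3, 1, 6, 2, 2.
features = [
    [[0, 1, 3], [5], [2, 6, 4, 7, 8, 9], [1, 0], [0, 1]],
    [[1, 0, 2], [4], [3, 3, 1, 0, 5, 2], [0, 1], [1, 0]],
]

expected_width = 3 + 1 + 6 + 2 + 2  # 14 values per sample

# chain.from_iterable concatenates the sub-vectors of one sample into one flat row.
flattened_features = [list(chain.from_iterable(sample)) for sample in features]

# Every row must have the same length before the list can become a 2-D array.
for row in flattened_features:
    if len(row) != expected_width:
        raise ValueError(f"expected {expected_width} values, got {len(row)}")
```

This does the same flattening as the list comprehension above; the only addition is the length check, which catches a sample with a missing or extra value early.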

Now you can convert this into a NumPy array (one row per sample, 14 columns in this case) and feed it into your model.
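As a rough sketch of that last step, assuming scikit-learn's `LogisticRegression` is the model and that the labels are binary (the sample rows and labels below are invented purely for illustration):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical flattened samples (14 values each) and made-up binary labels.
flattened_features = [
    [0, 1, 3, 5, 2, 6, 4, 7, 8, 9, 1, 0, 0, 1],
    [1, 0, 2, 4, 3, 3, 1, 0, 5, 2, 0, 1, 1, 0],
    [2, 2, 1, 6, 0, 1, 9, 8, 7, 3, 1, 1, 0, 0],
    [0, 3, 0, 1, 5, 5, 2, 2, 4, 6, 0, 0, 1, 1],
]
labels = [0, 1, 0, 1]

X = np.asarray(flattened_features, dtype=float)  # shape: (n_samples, 14)
y = np.asarray(labels)

# max_iter is raised only to give the solver room on unscaled data.
model = LogisticRegression(max_iter=1000)
model.fit(X, y)

predictions = model.predict(X)  # one predicted label per sample
```

With real data you would fit on a training split and predict on held-out samples; predicting on the training rows here only shows the call shape.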

solution1
0 ACCEPTED 2018-04-13 03:09:32