
Python Data Optimal Spreading of Parameters

Let's assume that I have a certain number of parameters that describe a system:

i.e.

position, velocity, mass, length, width

Now every parameter has an associated upper and lower bound:

position = [0,100]
velocity = [10,300]
mass = [50,200]
length = [2,10]
width = [2,10]

A data point is defined by a particular combination of these parameters, e.g.:

data_point = [10,250,50,4,2]

Now, the question is: is there a Python package or algorithm that lets me initialize a certain number of data points (e.g. 5) so that they are optimally spread over the parameter space?

Side Note:

Yes, I know "optimally spread" is not well defined, but I am really not sure how to go about this. One possible definition could be:

maximize the distance between the data-points (Euclidean distance between vectors)

EDIT:

Using linspace is a very good idea. However, I quickly noticed an issue with my data. I actually forgot to mention constraints:

Some data points are not feasible, i.e.:

constraints = [length*2-width, position-velocity]

...if these values are all greater than or equal to zero, the data point is considered feasible.

So my question is: How can I include constraints in a smart way?

Using linspace, you will see that velocity is always greater than position, so we get no feasible data point.

import numpy as np

position = [0,100]
velocity = [10,300]
mass = [50,200]
length = [2,10]
width = [2,10]

# Find Samples 
start = [s[0] for s in [position, velocity, mass, length, width]]
end = [s[1] for s in [position, velocity, mass, length, width]]

num_samples = 5
samples = np.linspace(start, end, num_samples)

print(samples)

This is the output:

[[  0.   10.   50.    2.    2. ]
 [ 25.   82.5  87.5   4.    4. ]
 [ 50.  155.  125.    6.    6. ]
 [ 75.  227.5 162.5   8.    8. ]
 [100.  300.  200.   10.   10. ]]

Now, let's check the constraints:

def check_constraint(samples):
    """Return only the samples that satisfy every constraint."""
    checked_samples = []
    for dimensions in samples:
        position, velocity, mass, length, width = dimensions

        # A sample is feasible only if every constraint value is >= 0.
        if all(c >= 0 for c in [length*2-width, position-velocity]):
            checked_samples.append(dimensions)

    return checked_samples

samples_checked = check_constraint(samples)
print(samples_checked)

These would be the samples left after checking the constraints:

[]

You could do something like this to get evenly spaced points:

import numpy as np

...

start = [s[0] for s in [position, velocity, ...]]
end = [s[1] for s in [position, velocity, ...]]

num_samples = 5
samples = np.linspace(start, end, num_samples)

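This will return points evenly spaced along the straight line from the lower-bound corner to the upper-bound corner of the parameter space, so all parameters increase in lockstep.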
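If "optimally spread" is read as the question suggests (points as far apart as possible in Euclidean distance), one possible approach is greedy farthest-point selection over random candidates. The following is only a rough sketch under stated assumptions: `spread_points`, `num_candidates`, and `seed` are hypothetical names, the bounds are copied from the question, and distances are measured in coordinates rescaled to [0, 1] so that no single parameter dominates. It gives a reasonable spread, not a provably optimal one.

```python
import numpy as np

def spread_points(lower, upper, num_points, num_candidates=2000, seed=0):
    """Greedily pick num_points well-separated points (hypothetical helper)."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    rng = np.random.default_rng(seed)

    # Draw candidates in the unit cube so every parameter counts equally
    # in the Euclidean distance, regardless of its physical range.
    unit = rng.random((num_candidates, lower.size))

    # Start from an arbitrary candidate, then repeatedly add the candidate
    # whose distance to its nearest already-chosen point is largest.
    chosen = [0]
    min_dist = np.linalg.norm(unit - unit[0], axis=1)
    for _ in range(num_points - 1):
        nxt = int(np.argmax(min_dist))
        chosen.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(unit - unit[nxt], axis=1))

    # Map the chosen points back to the original parameter ranges.
    return lower + unit[chosen] * (upper - lower)

# Bounds from the question: position, velocity, mass, length, width.
start = [0, 10, 50, 2, 2]
end = [100, 300, 200, 10, 10]
samples = spread_points(start, end, num_points=5)
print(samples)
```

Raising `num_candidates` usually improves the spread at the cost of more distance computations.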

Edit: To include constraints, it might be better to do something like:

start = ...
end = ...
num_results = 5
results = []

while len(results) < num_results:
    sample = np.random.uniform(start, end)
    if is_valid(sample):
        results.append(sample)

That way you can define the is_valid function to check any conditions you like. The resulting points should be uniformly distributed over the feasible part of the parameter space.
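For the two constraints in the question, the rejection-sampling loop could be filled in roughly as follows. This is a sketch, not a definitive implementation: the bounds and constraints are taken from the question, while `max_attempts` and the fixed seed are assumptions added here so the loop always terminates and the run is repeatable.

```python
import numpy as np

def is_valid(sample):
    """Feasibility check using the two constraints from the question."""
    position, velocity, mass, length, width = sample
    return length * 2 - width >= 0 and position - velocity >= 0

# Bounds from the question: position, velocity, mass, length, width.
start = [0, 10, 50, 2, 2]
end = [100, 300, 200, 10, 10]
num_results = 5
max_attempts = 100_000  # assumed cap so the loop cannot run forever
rng = np.random.default_rng(0)

results = []
for _ in range(max_attempts):
    # Draw one point uniformly from the bounding box.
    sample = rng.uniform(start, end)
    # Keep it only if it satisfies every constraint.
    if is_valid(sample):
        results.append(sample)
        if len(results) == num_results:
            break

results = np.array(results)
print(results)
```

Only a minority of random draws satisfy position >= velocity with these bounds, so most draws are rejected. If the feasible region were much smaller, sampling directly inside it would be more efficient than rejecting.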
