Adding rows to a Pandas dataframe from another dataframe

Question

I'm trying to sort a dataframe around a randomly selected row. What I want to do is pick a row at random, which I will call the centroid, and then arrange the dataframe so that the rows which are less than the centroid are above it and the rows which are greater than the centroid are below it. However, I am not sure how to do that. I have given the code that builds the dataframe below, as well as the function I use to compare rows. I decide whether a row is less than or greater than the centroid by summing the values in the row and comparing that to the sum of the centroid's values.

Is there a good way to do this?

Any advice is appreciated.

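import numpy as np
import pandas as pd

def compareRows(arr1, arr2):
    # True when the first row's sum is greater than the second row's sum
    return sum(arr1) > sum(arr2)

data = np.array(pd.read_csv('https://raw.githubusercontent.com/gsprint23/cpts215/master/progassignments/files/cancer.csv',  header=None))
data = data.T
# After the transpose, data[0] supplies the column labels and data[1:] the values
df = pd.DataFrame(data[1:], columns=data[0], dtype=float).T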

If you need any more information, please let me know.

Thank you for reading

Answer 1

- Grab one row at random with pd.DataFrame.sample.
  - note: this returns a one-row dataframe
- Create a temporary dataframe d without the random row.
- Create a boolean series gt that marks which of the other rows are greater than the random row.
- Concatenate three pieces in order: the rows of d that are not greater than the random row, the random row itself, and the rows of d that are greater than it.

sampled = df.sample(1)
d = df.drop(sampled.index)
gt = d.apply(compareRows, 1, arr2=sampled.squeeze())

pd.concat([d[~gt], sampled, d[gt]])
# Older equivalent: d[~gt].append(sampled).append(d[gt])
# (DataFrame.append was removed in pandas 2.0, so prefer pd.concat)
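
For anyone who wants to try this without downloading the CSV, here is a minimal self-contained sketch of the same idea. The toy dataframe, the function name partition_around_random_row, and the random_state argument are assumptions made for illustration and are not part of the original question. It also swaps the row-wise apply for a vectorized comparison of row sums, which should give the same mask as compareRows, and it assumes the index labels are unique so that drop removes only the sampled row.

```python
import pandas as pd

# Hypothetical toy data standing in for the cancer dataset.
df = pd.DataFrame(
    {"a": [5, 1, 3, 2, 4], "b": [5, 1, 3, 2, 4]},
    index=["r0", "r1", "r2", "r3", "r4"],
)

def partition_around_random_row(frame, random_state=None):
    """Put rows whose sum is <= the sampled row's sum above it, the rest below."""
    sampled = frame.sample(1, random_state=random_state)  # one-row DataFrame
    rest = frame.drop(sampled.index)                      # all other rows
    centroid_sum = sampled.sum(axis=1).iloc[0]            # scalar sum of the sampled row
    gt = rest.sum(axis=1) > centroid_sum                  # vectorized form of compareRows
    ordered = pd.concat([rest[~gt], sampled, rest[gt]])
    return ordered, sampled.index[0]

result, centroid_label = partition_around_random_row(df, random_state=0)
print(result)
print("centroid:", centroid_label)
```

Note that this only partitions the rows around the centroid; the rows within each side keep their original relative order rather than being fully sorted.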
