Adding rows between the values of a fully factorial grid and interpolating the values

  interpolation, pandas, python, python-3.x

I have a dataframe with 3 columns: 2 of them are fully factorial, and the third is easily calculated from them plus some noise. For the sake of argument, we'll drop the noise. The following is a reproducible example:

from itertools import product
import pandas as pd
mylist = list(product([0,10,20], [0,10,20]))
df = pd.DataFrame(data=mylist, columns=["A", "B"])
df['C'] = df['A'] **2 + df['B'] ** 2
print(df.head(10))

which results in:

    A   B    C
0   0   0    0
1   0  10  100
2   0  20  400
3  10   0  100
4  10  10  200
5  10  20  500
6  20   0  400
7  20  10  500
8  20  20  800

However, I would like to increase the resolution, e.g. go in steps of 1 instead of steps of 10.

Currently, I create a completely new dataframe with the same code as above but with finer ranges, so that I get a row for each possible fully factorial combination of 'A' and 'B'.
Afterwards, I fit a polynomial regression to column 'C' and apply it to the new dataframe.

My current code looks something like this:

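from sklearn import linear_model
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# a_list and b_list are the original factor levels (assumed here to be taken from df)
a_list = df['A'].unique().tolist()
b_list = df['B'].unique().tolist()
new_a_list = list(range(min(a_list), max(a_list) + 1, 1))
new_b_list = list(range(min(b_list), max(b_list) + 1, 1))
my_new_list = list(product(new_a_list, new_b_list))
new_df = pd.DataFrame(data=my_new_list, columns=["A", "B"])
# Fit a degree-2 polynomial on the coarse grid and predict on the fine grid
model = make_pipeline(PolynomialFeatures(2), linear_model.LinearRegression())
model.fit(df[['A', 'B']], df['C'])
new_df['C'] = model.predict(new_df[['A', 'B']])
print(new_df.head(20))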

This results in the following output:

    A   B             C
0   0   0  1.136868e-13
1   0   1  1.000000e+00
2   0   2  4.000000e+00
3   0   3  9.000000e+00
4   0   4  1.600000e+01
5   0   5  2.500000e+01
6   0   6  3.600000e+01
7   0   7  4.900000e+01
8   0   8  6.400000e+01
9   0   9  8.100000e+01
10  0  10  1.000000e+02
11  1   0  1.000000e+00
12  1   1  2.000000e+00
13  1   2  5.000000e+00
14  1   3  1.000000e+01
15  1   4  1.700000e+01
16  1   5  2.600000e+01
17  1   6  3.700000e+01
18  1   7  5.000000e+01
19  1   8  6.500000e+01

I do know that there is interpolation built into pandas, but I am unaware of how to add the rows in between existing rows in an efficient manner. Therefore I don't even reach the part where I could apply my interpolation know-how.
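For reference, one way the in-between rows could be added with pandas alone is sketched below. It is only a sketch under stated assumptions: the data is pivoted to a wide layout, both axes are reindexed onto the finer levels (which inserts NaN rows and columns), and the gaps are filled with linear interpolation along each axis. The names new_a, new_b, wide and fine are illustrative and not part of my code above.

```python
from itertools import product

import numpy as np
import pandas as pd

# Coarse fully factorial grid, as in the example above.
df = pd.DataFrame(list(product([0, 10, 20], [0, 10, 20])), columns=["A", "B"])
df["C"] = df["A"] ** 2 + df["B"] ** 2

# Finer levels for each factor (step of 1 instead of 10).
new_a = pd.Index(np.arange(df["A"].min(), df["A"].max() + 1), name="A")
new_b = pd.Index(np.arange(df["B"].min(), df["B"].max() + 1), name="B")

# Wide layout: one row per A level, one column per B level.
wide = df.pivot(index="A", columns="B", values="C")

# Reindexing adds the missing levels as NaN rows and columns.
wide = wide.reindex(index=new_a, columns=new_b).astype(float)

# Fill the gaps linearly, first down the rows (A), then across the columns (B).
# The default "linear" method assumes equally spaced points, which holds here.
wide = wide.interpolate(axis=0).interpolate(axis=1)

# Back to long format: one row per (A, B) combination.
fine = wide.stack().rename("C").reset_index()
print(fine.head(20))
```

Note that this fills the grid with piecewise-linear values, so it does not reproduce the quadratic exactly: at A=5, B=5 it gives 100 where A**2 + B**2 is 50. If the shape of C is known, the regression approach may still be the better fit.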

Thanks for all your input.

Source: Python Questions
