Category : numpy-ndarray

I am trying to count the number of elements in a rolling window of a DataFrame that satisfy a condition. Let's say I have a column with the data 1, 2, 3, 4, 5, 6, 7, 8, 9, 10. I want to count the number of elements above their mean (i.e. 5.5). So far I have tried this: df["count"] = df["O"].shift().rolling(10).apply(lambda x: x[x >= x.mean()].count()) Expected value: 5 ..
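
A minimal sketch of the rolling count described above, assuming the ten values sit in a column named "O" as in the excerpt. The helper name count_above_mean is hypothetical. The shift() call from the excerpt is left out here on the assumption that it is not part of the intent: with only ten rows it puts a NaN into the single full window, so every result would be NaN.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"O": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]})

def count_above_mean(window: np.ndarray) -> float:
    """Count the values in one window that are strictly above that window's mean."""
    return float((window > window.mean()).sum())

# raw=True passes each window to the function as a NumPy array.
# Rows without a full window of 10 values stay NaN.
df["count"] = df["O"].rolling(10).apply(count_above_mean, raw=True)

print(df["count"].iloc[-1])  # 5.0 (the values 6, 7, 8, 9, 10 exceed the mean 5.5)
```

The strict comparison `>` matches "above their mean"; the excerpt's `>=` would also count values equal to the mean, which makes no difference for this data.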

I have a dataset and need to make a K-Means figure using machine learning. The following is my code: from sklearn.cluster import KMeans import pandas as pd import numpy as np import matplotlib.pyplot as plt from sklearn import datasets df = pd.read_csv('../6_to_12_Month_All_Data.csv', encoding='utf-8') X = df[['Yellow', 'Blue']].values X KM = KMeans(n_clusters=3, init='random', random_state=5) KM.fit(X) KM.predict(X) plt.figure(figsize=(25, 6)) plt.style.use('ggplot') plt.scatter(X[:, 0], X[:, 1], c=KM.predict(X)) The ..
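
A hedged sketch of the same clustering plot. The CSV file from the excerpt is not available, so the data below is a synthetic stand-in for the 'Yellow' and 'Blue' columns. The explicit n_init=10, the marked cluster centers, and the non-interactive backend are assumptions added for illustration, not part of the original code. The model is fitted once with fit_predict and the labels are reused, instead of calling predict repeatedly.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for df[['Yellow', 'Blue']].values (hypothetical data).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])

# Same settings as the excerpt, plus an explicit n_init (assumption).
km = KMeans(n_clusters=3, init="random", n_init=10, random_state=5)
labels = km.fit_predict(X)  # fit once and keep the cluster label of every row

fig, ax = plt.subplots(figsize=(8, 6))
ax.scatter(X[:, 0], X[:, 1], c=labels)
# Mark the fitted cluster centers.
ax.scatter(km.cluster_centers_[:, 0], km.cluster_centers_[:, 1],
           c="black", marker="x")
ax.set_xlabel("Yellow")
ax.set_ylabel("Blue")
```

To view the figure, remove the matplotlib.use("Agg") line and call plt.show(), or write it to disk with fig.savefig.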

My data consists of genome sequences, each basically a long string such as "AAATTGCCAA…AA". Here is a pic of my DataFrame. I converted the data into a NumPy array using the function given below: def seq_conversion(matrix): vectorSize = 29907 print('vectorSize', vectorSize) out_data = [] for i in range(len(matrix)): sample = np.zeros(vectorSize) for j in range(0, len(matrix[i])): if (matrix[i][j] == 'C'): sample[j] = 0.25 elif (matrix[i][j] == 'T'): ..
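
One possible way to write the conversion above without the character-by-character inner loop, sketched under assumptions. Only the code for 'C' (0.25) is visible in the excerpt and the remaining codes are cut off, so the mapping is passed in by the caller. The parameter names codes and vector_size are hypothetical, and the sequences are assumed to be plain ASCII strings.

```python
import numpy as np

def seq_conversion(sequences, codes, vector_size=29907):
    """Encode each sequence as a fixed-length float vector.

    Characters missing from `codes`, and positions past the end of a
    sequence, stay at 0.0, matching the np.zeros initialisation.
    """
    # Lookup table indexed by byte value, so each sequence is encoded
    # with a single indexing step instead of a Python loop per character.
    lookup = np.zeros(256, dtype=np.float64)
    for base, value in codes.items():
        lookup[ord(base)] = value

    out = np.zeros((len(sequences), vector_size), dtype=np.float64)
    for i, seq in enumerate(sequences):
        raw = np.frombuffer(seq[:vector_size].encode("ascii"), dtype=np.uint8)
        out[i, : raw.size] = lookup[raw]
    return out

# Only 'C' -> 0.25 comes from the excerpt; the other bases are left out
# here and therefore encode as 0.0.
encoded = seq_conversion(["ACCT", "CC"], codes={"C": 0.25}, vector_size=6)
print(encoded)
```

Sequences longer than vector_size are truncated and shorter ones are zero-padded, so every row has the same length.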
