2016-08-17 26 views

答えて

1

あなたはNaN値の後に行をカウントされ、マスクでboolean indexingを使用することができます。

df = pd.DataFrame(np.arange(120).reshape(30,4),columns=list('ABCD')) 
df.ix[2:5,'A'] = np.nan 
df.ix[20:23,'A'] = np.nan 
print (df) 
     A B C D 
0  0.0 1 2 3 
1  4.0 5 6 7 
2  NaN 9 10 11 
3  NaN 13 14 15 
4  NaN 17 18 19 
5  NaN 21 22 23 
6 24.0 25 26 27 
7 28.0 29 30 31 
8 32.0 33 34 35 
9 36.0 37 38 39 
10 40.0 41 42 43 
11 44.0 45 46 47 
12 48.0 49 50 51 
13 52.0 53 54 55 
14 56.0 57 58 59 
15 60.0 61 62 63 
16 64.0 65 66 67 
17 68.0 69 70 71 
18 72.0 73 74 75 
19 76.0 77 78 79 
20 NaN 81 82 83 
21 NaN 85 86 87 
22 NaN 89 90 91 
23 NaN 93 94 95 
24 96.0 97 98 99 
25 100.0 101 102 103 
26 104.0 105 106 107 
27 108.0 109 110 111 
28 112.0 113 114 115 
29 116.0 117 118 119 
a = df.A.notnull() 
b = (a.cumsum()-a.cumsum().where(~a).ffill().fillna(0)).where(df.A.isnull().cumsum() != 0) 
print (b) 
0  NaN 
1  NaN 
2  0.0 
3  0.0 
4  0.0 
5  0.0 
6  1.0 
7  2.0 
8  3.0 
9  4.0 
10  5.0 
11  6.0 
12  7.0 
13  8.0 
14  9.0 
15 10.0 
16 11.0 
17 12.0 
18 13.0 
19 14.0 
20  0.0 
21  0.0 
22  0.0 
23  0.0 
24  1.0 
25  2.0 
26  3.0 
27  4.0 
28  5.0 
29  6.0 
Name: A, dtype: float64 
#get rows without 5 below NaN and without NaN in column A 
print (df[b > 5]) 
     A  B  C  D 
11 44.0 45.0 46.0 47.0 
12 48.0 49.0 50.0 51.0 
13 52.0 53.0 54.0 55.0 
14 56.0 57.0 58.0 59.0 
15 60.0 61.0 62.0 63.0 
16 64.0 65.0 66.0 67.0 
17 68.0 69.0 70.0 71.0 
18 72.0 73.0 74.0 75.0 
19 76.0 77.0 78.0 79.0 
29 116.0 117.0 118.0 119.0 
1

借入jezraelのセットアップ

df = pd.DataFrame(np.arange(120).reshape(30,4),columns=list('ABCD')) 
df.ix[2:5,'A'] = np.nan 
df.ix[20:23,'A'] = np.nan 

使用pd.concatany

# grab null rows 
isnull = df.A.isnull() 

# get subsequent 10 rows after null 
mask = pd.concat([isnull.shift(i) for i in range(11)], axis=1).any(1) 

# boolean mask 
df[~mask] 

enter image description here

+0

今私は確信していない...私は私たちが見たと思った... – piRSquared

+0

ありがとう、それは非常に、それは働く – Pramod

関連する問題