Source code: org/apache/derby/impl/store/raw/data/ReclaimSpaceHelper.java
/*

Derby - Class org.apache.derby.impl.store.raw.data.ReclaimSpaceHelper

Copyright 1998, 2004 The Apache Software Foundation or its licensors, as applicable.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

*/

package org.apache.derby.impl.store.raw.data;

import org.apache.derby.impl.store.raw.data.BasePage;
import org.apache.derby.impl.store.raw.data.ReclaimSpace;


import org.apache.derby.iapi.services.daemon.DaemonService;
import org.apache.derby.iapi.services.daemon.Serviceable;
import org.apache.derby.iapi.services.sanity.SanityManager;
import org.apache.derby.iapi.error.StandardException;

import org.apache.derby.iapi.store.access.TransactionController;

import org.apache.derby.iapi.store.raw.ContainerKey;
import org.apache.derby.iapi.store.raw.ContainerHandle;
import org.apache.derby.iapi.store.raw.LockingPolicy;
import org.apache.derby.iapi.store.raw.Page;
import org.apache.derby.iapi.store.raw.PageKey;
import org.apache.derby.iapi.store.raw.RecordHandle;
import org.apache.derby.iapi.store.raw.Transaction;

import org.apache.derby.iapi.store.raw.xact.RawTransaction;
import org.apache.derby.iapi.store.raw.data.RawContainerHandle;


/**
This class helps a BaseDataFileFactory reclaim unused space.

Space needs to be reclaimed in the following cases:
<BR><NL>
<LI> Row with long columns or overflow row pieces is deleted
<LI> Insertion of a row that has long columns or overflows to other row pieces is rolled back
<LI> Row is updated and the head row or some row pieces shrunk
<LI> Row is updated and some long columns are orphaned because they are updated
<LI> Row is updated and some long columns are created but the update rolled back
<LI> Row is updated and some new row pieces are created but the update rolled back
</NL> <P>

We could implement many optimizations if we knew that the btree does not
overflow. However, since that is not the case and Raw Store cannot tell if it
is dealing with a btree page or a heap page, they all have to be treated
gingerly. E.g., in a heap page, once a head row is deleted (via a delete
operation or via a rollback of insert), all the long rows and long columns can
be reclaimed - in fact, most of the head row can be removed and reclaimed, only
a row stub needs to remain for locking purposes. But in the btree, a deleted
row still needs to contain the key values so it cannot be cleaned up until the
row is purged.

<P><B>
Row with long columns or long row is deleted
</B><BR>

When Access purges a committed deleted row, the purge operation will see if the
row has overflowed row pieces or if it has long columns. If it has, then all
the long columns and row pieces are purged before the head row piece can be
purged. When a row is purged from an overflow page and it is the only row on
the page, then the page is deallocated in the same transaction. Note that
non-overflow pages are removed by Access but overflow pages are removed by Raw
Store. Note that page removal is done in the same transaction and not post
commit. This is, in general, dangerous because if the transaction does not
commit for a long time, the uncommitted deallocated page slows down page
allocation for this container. However, we know that Access only purges
committed deleted rows in Access post commit processing so we know the
transaction will tend to commit relatively fast. The alternative is to queue
up a post commit ReclaimSpace.PAGE to reclaim the page after the purge commits.
In order to do that, the time stamp of the page must also be remembered because
post commit work may be queued more than once, but in this case, it can only be
done once. Also, doing the page deallocation post commit adds to the overall
cost and tends to fill up the post commit queue. <BR>

This approach is simple but has the drawback that the entire long row and all
the long columns are logged in the purge operation. The alternative is more
complicated: we can remember all the long columns on the head row piece and
where the row chain starts and clean them up during post commit. During post
commit, because the head row piece is already purged, there is no need to log
the long column or the long rows, just wipe the page or just reuse the page if
that is the only thing on the page. The problem with this approach is that we
need to make sure the purging of the head row does indeed commit (the
transaction may commit but the purging may be rolled back due to savepoint).
So, we need to find the head row in the post commit and only when we cannot
find it can we be sure that the purge is committed. However, in cases where
the page can reuse its record Id (namely in btree), a new row may reuse the
same recordId. In that case, the post commit can purge the long columns or the
rest of the row piece only if the head piece no longer points to it. Because
of the complexity of this latter approach, the first simple approach is used.
However, if the performance cost of the extra logging becomes unbearable, we
can consider implementing the second approach.

<P><B>
Insertion of a row with long column or long row is rolled back.
</B><BR>

Insertion can be rolled back with either delete or purge. If the row is rolled
back with purge, then all the overflow column pieces and row pieces are also
rolled back with purge. When a row is purged from an overflow page and it is
the only row on the page, then a post commit ReclaimSpace.PAGE work is queued
by Raw Store to reclaim that page.<BR>

If the row is rolled back with delete, then all the overflow column pieces and
row pieces are also rolled back with delete. Access will purge the deleted row
in due time, see above.

<P><B>
Row is updated and the head row or some row pieces shrunk
</B><BR>

Every page that an update operation touches will see if the record on that page
has any reserve space. If it does, and if the reserve space plus the record
size exceed the minimum record size, then a post commit ROW_RESERVE work will
be queued to reclaim all unnecessary row reserved space for the entire row.

<P><B>
Row is updated and old long columns are orphaned
</B><BR>

The ground rule is, whether a column is a long column or not before an update
has nothing to do with whether a column will be a long column or not after the
update. In other words, update can turn a non-long column into a long column,
or it can turn a long column into a non-long column, or a long column can be
updated to another long column and a non-long column can be updated to a
non-long column. The last case - update of a non-long column to another
non-long column - is only of concern if it shrinks the row piece it is on (see
above).<BR>

So update can be looked at as 2 separate problems: A) a column is a long column
before the update and the update will "orphan" it. B) a column is a long
column after the update and the rollback of the update will "orphan" it if it
is rolled back with a delete. This section deals with problem A, the next
section deals with problem B.<BR>

Update specifies a set of columns to be updated. If a row piece contains one
or more columns to be updated, those columns are examined to see if they are
actually long column chains. If they are, then after the update, those long
column chains will be orphaned. So before the update happens, a post commit
ReclaimSpace.COLUMN_CHAIN work is queued which contains the head row's id, the
column number, the location of the first piece of the column chain, and the
time stamp of the first page of the column chain. <BR>

If the update transaction commits, the post commit work will walk the row until
it finds the column number (note that it may not be on the page where the
update happened because of subsequent row splitting), and if it doesn't point
to the head of the column chain, we know the update operation has indeed
committed (versus rolled back by a savepoint). If a piece of the column
chain takes up an entire page, then the entire page can be reclaimed without
first purging the row because the column chain is already orphaned.<BR>

We need the page time stamp of the first page of the column chain because if
the post commit ReclaimSpace.COLUMN_CHAIN is queued more than once, as can
happen in repeated rollback to savepoint, then after the first time the column
is reclaimed, the pages in the column chain can be reused. Therefore, we cannot
reclaim the column chain again. Since there is no back pointer from the column
chain to the head row, we need the timestamp to tell us if that column chain
has already been touched (reclaimed) or not.

<P><B>
Row is updated with new long columns and update is rolled back.
</B><BR>

When the update is rolled back, the new long columns, which got there by
insertion, are rolled back either by delete or by purge. If they were rolled
back with delete, then they would be orphaned and need to be cleaned up with
post abort work. Therefore, insertion of long columns due to update must be
rolled back with purge.<BR>

This is safe because the moment the rollback of the head row piece happens, the
new long column is orphaned anyway and nobody will be able to get to it. Since
we don't attempt to share long column pages, we know that nobody else could be
on the page and it is safe to deallocate the page.

<P><B>
Row is updated with new long row piece and update is rolled back.
</B><BR>

When the update is rolled back, the new long row piece, which got there by
insertion, is rolled back either by delete or by purge. Like an update with a
new long column, it should be rolled back with purge. However, there is a
problem in that the insert log record does not contain the head row handle. It
is possible that another long row emanating from the same head page overflows
to this page. That row may since have been deleted and is now in the middle of
a purge, but the purge has not committed. To the code that is rolling back the
insert (caused by the update that split off a new row piece) the overflow page
looks empty. If it went ahead and deallocated the page, then the transaction
which purged the row piece on this page would not be able to roll back. For
this reason, the insert of a long row piece due to update must be rolled back
with delete. Furthermore, there is no easy way to lodge a post termination
work to reclaim this deleted row piece so it will be lost forever.
<BR>

RESOLVE: need to log the head row's handle in the insert log record, i.e., any
insert due to update of long row or column piece should have the head row's
handle on it so that when the insert is rolled back with purge, and there are
no more rows on the page, it can file a post commit to reclaim the page safely.
The post commit reclaim page needs to lock the head row and latch the head page
to make sure the entire row chain is stable.

*/
public class ReclaimSpaceHelper
{
    /**
        Reclaim space based on work.
     */
    public static int reclaimSpace(BaseDataFileFactory dataFactory,
                                   RawTransaction tran,
                                   ReclaimSpace work)
        throws StandardException
    {

        if (work.reclaimWhat() == ReclaimSpace.CONTAINER)
            return reclaimContainer(dataFactory, tran, work);

        // Else, not reclaiming container. Get a no-wait shared lock on the
        // container regardless of how the user transaction had the
        // container opened.

        LockingPolicy container_rlock =
            tran.newLockingPolicy(LockingPolicy.MODE_RECORD,
                                  TransactionController.ISOLATION_SERIALIZABLE,
                                  true /* stricter OK */ );

        if (SanityManager.DEBUG)
            SanityManager.ASSERT(container_rlock != null);

        ContainerHandle containerHdl =
            openContainerNW(tran, container_rlock, work.getContainerId());

        if (containerHdl == null)
        {
            tran.abort();

            if (SanityManager.DEBUG)
            {
                if (SanityManager.DEBUG_ON(DaemonService.DaemonTrace))
                {
                    SanityManager.DEBUG(
                        DaemonService.DaemonTrace, " aborted " + work +
                        " because container is locked or dropped");
                }
            }

            if (work.incrAttempts() < 3) // retry a few times before giving up
                return Serviceable.REQUEUE;
            else
                return Serviceable.DONE;
        }

        // At this point, container is opened with IX lock.

        if (work.reclaimWhat() == ReclaimSpace.PAGE)
        {
            // Reclaiming a page - called by undo of insert which purged the
            // last row off an overflow page. It is safe to reclaim the page
            // without first locking the head row because unlike post commit
            // work, this is post abort work. Abort is guaranteed to happen
            // and to happen only once, if at all.
            Page p = containerHdl.getPageNoWait(work.getPageId().getPageNumber());
            if (p != null)
                containerHdl.removePage(p);

            tran.commit();
            return Serviceable.DONE;
        }

        // We are reclaiming row space or long column. First get an xlock on the
        // head row piece.
        RecordHandle headRecord = work.getHeadRowHandle();

        if (!container_rlock.lockRecordForWrite(
                tran, headRecord, false /* not insert */, false /* nowait */))
        {
            // cannot get the row lock, retry
            tran.abort();
            if (work.incrAttempts() < 3)
                return Serviceable.REQUEUE;
            else
                return Serviceable.DONE;
        }

        // The exclusive lock on the head row has been gotten.

        if (work.reclaimWhat() == ReclaimSpace.ROW_RESERVE)
        {
            // This row may benefit from compaction.
            containerHdl.compactRecord(headRecord);

            // This work is being done post commit, so no user transaction
            // depends on the commit being sync'd. It is safe to
            // commitNoSync() because one of 2 things will happen:
            //
            // 1) if any data page associated with this transaction is
            //    moved from cache to disk, then the transaction log
            //    must be sync'd to the log record for that change and
            //    all log records including the commit of this xact must
            //    be sync'd before returning.
            //
            // 2) if the data page is never written then the log record
            //    for the commit may never be written, and the xact will
            //    never make it to disk. This is ok as no subsequent action
            //    depends on this operation being committed.
            //
            tran.commitNoSync(Transaction.RELEASE_LOCKS);

            return Serviceable.DONE;
        }
        else
        {
            if (SanityManager.DEBUG)
                SanityManager.ASSERT(work.reclaimWhat() == ReclaimSpace.COLUMN_CHAIN);

            // Reclaiming a long column chain due to update. The long column
            // chain being reclaimed is the before image of the update
            // operation.
            //
            long headPageId = ((PageKey)headRecord.getPageId()).getPageNumber();
            StoredPage headRowPage =
                (StoredPage)containerHdl.getPageNoWait(headPageId);

            if (headRowPage == null)
            {
                // Cannot get page no wait, try again later.
                tran.abort();
                if (work.incrAttempts() < 3)
                    return Serviceable.REQUEUE;
                else
                    return Serviceable.DONE;
            }

            try
            {
                headRowPage.removeOrphanedColumnChain(work, containerHdl);
            }
            finally
            {
                headRowPage.unlatch();
            }

            // This work is being done post commit, so no user transaction
            // depends on the commit being sync'd. It is safe to
            // commitNoSync() because one of 2 things will happen:
            //
            // 1) if any data page associated with this transaction is
            //    moved from cache to disk, then the transaction log
            //    must be sync'd to the log record for that change and
            //    all log records including the commit of this xact must
            //    be sync'd before returning.
            //
            // 2) if the data page is never written then the log record
            //    for the commit may never be written, and the xact will
            //    never make it to disk. This is ok as no subsequent action
            //    depends on this operation being committed.
            //
            tran.commitNoSync(Transaction.RELEASE_LOCKS);

            return Serviceable.DONE;
        }
    }

    private static int reclaimContainer(BaseDataFileFactory dataFactory,
                                        RawTransaction tran,
                                        ReclaimSpace work)
        throws StandardException
    {
        // When we want to reclaim the whole container, get an exclusive
        // lock (XLock) on the container and wait for the lock.

        LockingPolicy container_xlock =
            tran.newLockingPolicy(LockingPolicy.MODE_CONTAINER,
                                  TransactionController.ISOLATION_SERIALIZABLE,
                                  true /* stricter OK */ );

        if (SanityManager.DEBUG)
            SanityManager.ASSERT(container_xlock != null);

        // Try to just get the container thru the transaction.
        // Need to do this to transition the transaction to active state.
        RawContainerHandle containerHdl = tran.openDroppedContainer(
                                            work.getContainerId(),
                                            container_xlock);

        // If the lock can be obtained but the container is not deleted or
        // has already been deleted, the work is done.
        if (containerHdl == null ||
            containerHdl.getContainerStatus() == RawContainerHandle.NORMAL ||
            containerHdl.getContainerStatus() == RawContainerHandle.COMMITTED_DROP)
        {
            if (containerHdl != null)
                containerHdl.close();
            tran.abort(); // release xlock, if any

            if (SanityManager.DEBUG)
            {
                if (SanityManager.DEBUG_ON(DaemonService.DaemonTrace))
                {
                    SanityManager.DEBUG(
                        DaemonService.DaemonTrace, " aborted " + work);
                }
            }
        }
        else
        {
            // we got an xlock on a dropped container. Must be committed.
            // Get rid of the container now.
            ContainerOperation lop = new
                ContainerOperation(containerHdl, ContainerOperation.REMOVE);

            // mark the container as pre-dirtied so that if a checkpoint
            // happens after the log record is sent to the log stream, the
            // cache cleaning will wait for this change.
            containerHdl.preDirty(true);
            try
            {
                tran.logAndDo(lop);
            }
            finally
            {
                // in case logAndDo fails, make sure the container is not
                // stuck in preDirty state.
                containerHdl.preDirty(false);
            }


            containerHdl.close();
            tran.commit();

            if (SanityManager.DEBUG)
            {
                if (SanityManager.DEBUG_ON(DaemonService.DaemonTrace))
                {
                    SanityManager.DEBUG(
                        DaemonService.DaemonTrace, " committed " + work);
                }
            }
        }

        return Serviceable.DONE;

    }


    /**
        Open container shared no wait
     */
    private static ContainerHandle openContainerNW(Transaction tran,
        LockingPolicy rlock, ContainerKey containerId)
        throws StandardException
    {
        ContainerHandle containerHdl = tran.openContainer
            (containerId, rlock,
             ContainerHandle.MODE_FORUPDATE |
             ContainerHandle.MODE_LOCK_NOWAIT);

        return containerHdl;
    }

}
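
The retry handling that reclaimSpace repeats after each failed no-wait lock or latch request (abort, then requeue until the attempt counter reaches 3) can be sketched in isolation. This is a minimal illustration under stated assumptions, not Derby code: RetrySketch, Work, service and runUntilDone are hypothetical names, the DONE and REQUEUE values are stand-ins for the Serviceable constants, and incrAttempts is assumed to return the counter after incrementing it.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for the retry handling in reclaimSpace; none of
// these names are Derby classes.
final class RetrySketch {
    // Stand-ins for Serviceable.DONE and Serviceable.REQUEUE (values assumed).
    static final int DONE = 1;
    static final int REQUEUE = 2;
    static final int MAX_ATTEMPTS = 3;

    static final class Work {
        private int attempts;

        // Assumption: returns the attempt count after incrementing it.
        int incrAttempts() { return ++attempts; }
    }

    // One service call: finish if the lock is free, otherwise retry or give up.
    static int service(Work work, boolean lockAvailable) {
        if (lockAvailable) {
            return DONE; // the work would be performed and committed here
        }
        // Lock not granted: the real code aborts the transaction first,
        // then requeues until the attempt limit is reached.
        return (work.incrAttempts() < MAX_ATTEMPTS) ? REQUEUE : DONE;
    }

    // Drives one unit of work through a queue, as a daemon might, and
    // returns how many times it was serviced.
    static int runUntilDone(Work work, boolean lockAvailable) {
        Deque<Work> queue = new ArrayDeque<>();
        queue.add(work);
        int serviced = 0;
        while (!queue.isEmpty()) {
            Work next = queue.poll();
            serviced++;
            if (service(next, lockAvailable) == REQUEUE) {
                queue.add(next);
            }
        }
        return serviced;
    }

    public static void main(String[] args) {
        System.out.println(runUntilDone(new Work(), false)); // lock never free
        System.out.println(runUntilDone(new Work(), true));  // lock free at once
    }
}
```

Under these assumptions a unit of work whose lock is never available is serviced three times and then dropped, which bounds how long a blocked reclaim request can occupy the post commit queue at the cost of leaving that space unreclaimed.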