IGNITE-23655 Introduce CacheIdleVerifyCancelCommand #11673

Merged
merged 47 commits into from
Jan 13, 2025
Changes from 22 commits
Commits
9e0fc30
IGNITE-23655 CacheIdleVerifyCancelCommand v1.
vladnovoren Nov 21, 2024
f528ca4
IGNITE-23655 Test for idle_verify --cancel v1 (DRAFT)
vladnovoren Nov 27, 2024
0616a87
IGNITE-23655 CacheIdleVerifyCancelJob is not throwing if no running i…
vladnovoren Nov 28, 2024
55768ed
IGNITE-23655 Removed Serializable
vladnovoren Dec 5, 2024
ae2d98f
Merge branch 'master' into ignite-23655
vladnovoren Dec 5, 2024
3afb342
IGNITE-23655 Review fixes (codestyle + futures + runAsync), also fixe…
vladnovoren Dec 9, 2024
e05d35d
IGNITE-23655 Added check that idle_verify is done and message about i…
vladnovoren Dec 10, 2024
3995807
test case: running job not found
vladnovoren Dec 16, 2024
6025b85
InterruptedException draft
vladnovoren Dec 23, 2024
bd299a1
IGNITE-23655 assertTrue(idleVerifyFut.isDone()) fails
vladnovoren Dec 23, 2024
54a82ad
IGNITE-23655 Added assert for waitForCondition, removed assert !job.i…
vladnovoren Dec 23, 2024
1bf8664
IGNITE-23655 Save
vladnovoren Dec 24, 2024
743877b
IGNITE-23655 Save
vladnovoren Dec 24, 2024
531cfbf
IGNITE-23655 Removed debug prints & logs, leaved simple test that doe…
vladnovoren Dec 24, 2024
f583f41
IGNITE-23655 Catch of IgniteInterruptedCheckedException in VerifyBack…
vladnovoren Dec 24, 2024
2427079
IGNITE-23655 CacheIdleVerifyCancelJob#cancelJob: removed iteration ov…
vladnovoren Dec 25, 2024
602e674
Fixed TASKS_VIEW and JOBS_VIEW tests.
vladnovoren Dec 25, 2024
188a7d7
IGNITE-23655 Added log listener to condiguration. Now log checks are …
vladnovoren Dec 25, 2024
38b3245
IGNITE-23655 Added testIdleVerifyCancelCommandOnCheckpoint, it passes…
vladnovoren Dec 26, 2024
49769b3
IGNITE-23655 Overriden ForkJoinPool#submit method with correct signat…
vladnovoren Jan 9, 2025
65d0508
IGNITE-23655 Some minor and checkstyle fixes.
vladnovoren Jan 9, 2025
017d5ee
IGNITE-23655 Removed unnecessary sleep, added await timeout + minor f…
vladnovoren Jan 9, 2025
7d3be43
IGNITE-24175 Test reproducing issue
nizhikov Jan 9, 2025
8b9dff4
IGNITE-24175 Fix
nizhikov Jan 9, 2025
febd532
IGNITE-24175 Rework to syncRunningJobs map
nizhikov Jan 9, 2025
d9a9769
IGNITE-24175 Test for cancel internal task added.
nizhikov Jan 9, 2025
c4f218a
IGNITE-24175 Test for cancel internal task added.
nizhikov Jan 9, 2025
8b82bba
IGNITE-24175 Test for cancel internal task added.
nizhikov Jan 9, 2025
af34c22
IGNITE-23655 Fixed cancel tests: added missing assertTrue on waitForC…
vladnovoren Jan 10, 2025
e10b268
IGNITE-23655 Merged nizhikov changes (Show internal jobs in system vi…
vladnovoren Jan 10, 2025
3375ad6
IGNITE-23655 Removed unnecessary catch of IgniteInterruptedCheckedExc…
vladnovoren Jan 10, 2025
696a079
IGNITE-23655 Minor fixed.
vladnovoren Jan 10, 2025
fff390f
IGNITE-23655 Minor fixes + docs.
vladnovoren Jan 10, 2025
189daef
IGNITE-23655 Removed unnecessary catch of IgniteInterruptedCheckedExc…
vladnovoren Jan 10, 2025
35d98e9
IGNITE-23655 Added list of tasks to cancel in CacheIdleVerifyCancelTask.
vladnovoren Jan 10, 2025
f4a99c2
IGNITE-23655 Checkpoint listener for all nodes + isCancelled() check …
vladnovoren Jan 10, 2025
e5a513b
Merge branch 'master' into ignite-23655
vladnovoren Jan 10, 2025
50aae0d
IGNITE-23655 Codestyle fixes.
vladnovoren Jan 10, 2025
61945a2
IGNITE-23655 Added patch by nizhikov: Fix cancel during partition ite…
vladnovoren Jan 10, 2025
e21c63e
Add test to check --check-crc canceled
nizhikov Jan 10, 2025
c2c617c
Rename constance #1
nizhikov Jan 10, 2025
e6db06e
Rename constant #2
nizhikov Jan 10, 2025
efad094
Update GridCommandHandlerTest.java
nizhikov Jan 10, 2025
386adc4
Update CacheIdleVerifyCancelTask.java
nizhikov Jan 10, 2025
aa19573
Update VerifyBackupPartitionsTaskV2.java
nizhikov Jan 10, 2025
4d9fa97
Update IdleVerifyUtility.java
nizhikov Jan 10, 2025
6ed3187
Update GridCommandHandlerTest.java
nizhikov Jan 10, 2025
@@ -39,7 +39,10 @@
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.UUID;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
@@ -84,6 +87,7 @@
import org.apache.ignite.internal.client.util.GridConcurrentHashSet;
import org.apache.ignite.internal.management.cache.FindAndDeleteGarbageInPersistenceTaskResult;
import org.apache.ignite.internal.management.cache.IdleVerifyDumpTask;
import org.apache.ignite.internal.management.cache.IdleVerifyTaskV2;
import org.apache.ignite.internal.management.cache.VerifyBackupPartitionsTaskV2;
import org.apache.ignite.internal.management.tx.TxInfo;
import org.apache.ignite.internal.management.tx.TxTaskResult;
@@ -98,6 +102,7 @@
import org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal;
import org.apache.ignite.internal.processors.cache.persistence.CheckpointState;
import org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager;
import org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointListener;
import org.apache.ignite.internal.processors.cache.persistence.db.IgniteCacheGroupsWithRestartsTest;
import org.apache.ignite.internal.processors.cache.persistence.diagnostic.pagelocktracker.dumpprocessors.ToFileDumpProcessor;
import org.apache.ignite.internal.processors.cache.persistence.file.FileIO;
@@ -135,7 +140,10 @@
import org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi;
import org.apache.ignite.spi.metric.LongMetric;
import org.apache.ignite.spi.metric.Metric;
import org.apache.ignite.spi.systemview.view.ComputeJobView;
import org.apache.ignite.spi.systemview.view.ComputeTaskView;
import org.apache.ignite.testframework.GridTestUtils;
import org.apache.ignite.testframework.ListeningTestLogger;
import org.apache.ignite.testframework.LogListener;
import org.apache.ignite.testframework.junits.WithSystemProperty;
import org.apache.ignite.transactions.Transaction;
@@ -173,6 +181,8 @@
import static org.apache.ignite.internal.processors.cache.persistence.snapshot.SnapshotRestoreProcess.SNAPSHOT_RESTORE_METRICS;
import static org.apache.ignite.internal.processors.cache.verify.IdleVerifyUtility.GRID_NOT_IDLE_MSG;
import static org.apache.ignite.internal.processors.diagnostic.DiagnosticProcessor.DEFAULT_TARGET_FOLDER;
import static org.apache.ignite.internal.processors.job.GridJobProcessor.JOBS_VIEW;
import static org.apache.ignite.internal.processors.task.GridTaskProcessor.TASKS_VIEW;
import static org.apache.ignite.testframework.GridTestUtils.assertContains;
import static org.apache.ignite.testframework.GridTestUtils.assertNotContains;
import static org.apache.ignite.testframework.GridTestUtils.assertThrows;
@@ -219,13 +229,16 @@ public class GridCommandHandlerTest extends GridCommandHandlerClusterPerMethodAb
/** */
protected static File customDiagnosticDir;

/** */
protected ListeningTestLogger listeningLog = new ListeningTestLogger(log);

/** {@inheritDoc} */
@Override protected void beforeTest() throws Exception {
super.beforeTest();

initDiagnosticDir();
cleanPersistenceDir();

Contributor review comment: Do we need this change? Let's revert it.

cleanDiagnosticDir();
initDiagnosticDir();
}

/** {@inheritDoc} */
@@ -235,6 +248,15 @@ public class GridCommandHandlerTest extends GridCommandHandlerClusterPerMethodAb
cleanDiagnosticDir();
}

/** {@inheritDoc} */
@Override protected IgniteConfiguration getConfiguration(String igniteInstanceName) throws Exception {
IgniteConfiguration cfg = super.getConfiguration(igniteInstanceName);

cfg.setGridLogger(listeningLog);

return cfg;
}

/**
* @throws IgniteCheckedException If failed.
*/
@@ -752,6 +774,173 @@ public void testIdleVerifyOnInactiveClusterWithPersistence() throws Exception {
assertContains(log, testOut.toString(), "The check procedure has finished, no conflicts have been found.");
}

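/**
* Checks that {@code idle_verify --cancel} cancels an idle_verify run that is held at its checkpoint.
*
* @throws Exception If failed.
*/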
@Test
public void testIdleVerifyCancelCommandOnCheckpoint() throws Exception {
final int gridsCnt = 4;

IgniteEx srv = startGrids(gridsCnt);

srv.cluster().state(ACTIVE);

CountDownLatch beforeCancelLatch = new CountDownLatch(1);
CountDownLatch afterCancelLatch = new CountDownLatch(1);

GridCacheDatabaseSharedManager dbMgr = (GridCacheDatabaseSharedManager)grid(1).context().cache().context().database();

dbMgr.addCheckpointListener(new CheckpointListener() {
@Override public void beforeCheckpointBegin(Context ctx) {
if (ctx.progress().reason().equals("VerifyBackupPartitions"))
beforeCancelLatch.countDown();
}

@Override public void afterCheckpointEnd(Context ctx) throws IgniteCheckedException {
if (ctx.progress().reason().equals("VerifyBackupPartitions")) {
try {
afterCancelLatch.await(30, TimeUnit.SECONDS);
}
catch (InterruptedException e) {
throw new IgniteInterruptedCheckedException(e);
}
}
}

@Override public void onMarkCheckpointBegin(Context ctx) {
// No-op.
}

@Override public void onCheckpointBegin(Context ctx) {
// No-op.
}
});

List<LogListener> listeners = registerIdleVerifyCancelListeners();

Contributor review comment: Looks like registering and checking listeners can be moved to cancelIdleVerifyAndCheck.

IgniteCache<Integer, Integer> cache = srv.createCache(new CacheConfiguration<Integer, Integer>(DEFAULT_CACHE_NAME).setBackups(3));

for (int i = 0; i < 100; i++)
cache.put(i, i);

cancelIdleVerifyAndCheck(beforeCancelLatch, afterCancelLatch, gridsCnt, listeners);
}
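
The test above coordinates with the running command through two latches: one tells the test that the work has reached the hook, the other holds the work there until the cancel has been issued. A minimal stand-alone sketch of that handshake follows. It uses only the JDK; the class name, the `cancelled` flag and the 30 second timeouts are assumptions made for illustration and are not part of Ignite or of this PR.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-alone sketch; none of these names exist in Ignite. */
class LatchHandshakeSketch {
    /** @return {@code true} if the worker observed the cancel flag after being released. */
    static boolean run() throws Exception {
        // Worker -> test: "I have reached the hook".
        CountDownLatch beforeCancel = new CountDownLatch(1);
        // Test -> worker: "the cancel has been issued, you may continue".
        CountDownLatch afterCancel = new CountDownLatch(1);
        // Stands in for the effect of "idle_verify --cancel" (an assumption of this sketch).
        AtomicBoolean cancelled = new AtomicBoolean();

        CompletableFuture<Boolean> worker = CompletableFuture.supplyAsync(() -> {
            beforeCancel.countDown();

            try {
                if (!afterCancel.await(30, TimeUnit.SECONDS))
                    throw new IllegalStateException("Timed out waiting for the cancel");
            }
            catch (InterruptedException e) {
                Thread.currentThread().interrupt();

                throw new IllegalStateException(e);
            }

            return cancelled.get();
        });

        // Fail loudly instead of silently continuing if the worker never reaches the hook.
        if (!beforeCancel.await(30, TimeUnit.SECONDS))
            throw new IllegalStateException("Worker never reached the hook");

        cancelled.set(true);
        afterCancel.countDown();

        return worker.get(30, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        System.out.println("worker saw cancel: " + run());
    }
}
```

Checking the boolean returned by `await(timeout, unit)` is a deliberate choice in the sketch: a timed wait that expires returns `false` rather than throwing, so an unchecked call can let a test proceed without the handshake having happened.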

nizhikov marked this conversation as resolved.

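/**
* Checks that {@code idle_verify --cancel} cancels an idle_verify run that is held in {@link ForkJoinPool#submit(Callable)}.
*
* @throws Exception If failed.
*/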
@Test
public void testIdleVerifyTrackedForkJoinPool() throws Exception {
final int gridsCnt = 4;

IgniteEx srv = startGrids(gridsCnt);

srv.cluster().state(ACTIVE);

List<LogListener> listeners = registerIdleVerifyCancelListeners();

IgniteCache<Integer, Integer> cache = srv.createCache(new CacheConfiguration<Integer, Integer>(DEFAULT_CACHE_NAME).setBackups(3));

for (int i = 0; i < 100; i++)
cache.put(i, i);

CountDownLatch beforeCancelLatch = new CountDownLatch(1);
CountDownLatch afterCancelLatch = new CountDownLatch(1);

ForkJoinPool forkJoinPool = new ForkJoinPool() {
@Override public <T> ForkJoinTask<T> submit(Callable<T> task) {
beforeCancelLatch.countDown();

ForkJoinTask<T> submitted = super.submit(task);

try {
afterCancelLatch.await();
}
catch (InterruptedException e) {
throw new RuntimeException(e);
}

return submitted;
}
};

VerifyBackupPartitionsTaskV2.poolSupplier = () -> forkJoinPool;

Contributor review comment, suggested change. Replace:

ForkJoinPool forkJoinPool = new ForkJoinPool() {
@Override public <T> ForkJoinTask<T> submit(Callable<T> task) {
beforeCancelLatch.countDown();
ForkJoinTask<T> submitted = super.submit(task);
try {
afterCancelLatch.await();
}
catch (InterruptedException e) {
throw new RuntimeException(e);
}
return submitted;
}
};
VerifyBackupPartitionsTaskV2.poolSupplier = () -> forkJoinPool;

with:

VerifyBackupPartitionsTaskV2.poolSupplier = () -> new ForkJoinPool() {
@Override public <T> ForkJoinTask<T> submit(Callable<T> task) {
beforeCancelLatch.countDown();
ForkJoinTask<T> submitted = super.submit(task);
try {
afterCancelLatch.await();
}
catch (InterruptedException e) {
throw new RuntimeException(e);
}
return submitted;
}
};


cancelIdleVerifyAndCheck(beforeCancelLatch, afterCancelLatch, gridsCnt, listeners);
}
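
The test above hooks into task submission by subclassing `ForkJoinPool` and overriding `submit(Callable)`. The following stand-alone sketch shows the same override in isolation, counting submissions instead of blocking on latches. `TrackedPoolSketch` and `CountingPool` are hypothetical names used only here; how `VerifyBackupPartitionsTaskV2` actually uses its pool is not shown in this diff and is not assumed.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-alone sketch; none of these names exist in Ignite. */
class TrackedPoolSketch {
    /** Pool that counts calls to {@code submit(Callable)} before delegating to the real pool. */
    static final class CountingPool extends ForkJoinPool {
        final AtomicInteger submitted = new AtomicInteger();

        @Override public <T> ForkJoinTask<T> submit(Callable<T> task) {
            submitted.incrementAndGet();

            return super.submit(task);
        }
    }

    /** @return Two-element array: the task result and the number of observed submissions. */
    static int[] run() throws Exception {
        CountingPool pool = new CountingPool();

        try {
            // Typed explicitly so the Callable overload, the one overridden above, is chosen.
            Callable<Integer> job = () -> 42;

            int res = pool.submit(job).get();

            return new int[] {res, pool.submitted.get()};
        }
        finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int[] r = run();

        System.out.println("result=" + r[0] + ", submissions=" + r[1]);
    }
}
```

Only the `Callable` overload is intercepted here; callers that use `submit(Runnable)`, `execute` or `invoke` would bypass the hook, so a test relying on this technique depends on which overload the code under test calls.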

/**
* @param beforeCancelLatch Latch the test awaits before issuing the cancel command.
* @param afterCancelLatch Latch released after the cancel command returns, unblocking idle_verify.
* @param gridsCnt Grids count.
* @param listeners Log listeners.
*/
private void cancelIdleVerifyAndCheck(
CountDownLatch beforeCancelLatch,
CountDownLatch afterCancelLatch,
int gridsCnt,
List<LogListener> listeners
) throws InterruptedException, IgniteCheckedException {
IgniteInternalFuture<Integer> idleVerifyFut = GridTestUtils.runAsync(() -> execute("--cache", "idle_verify"));

assertTrue(beforeCancelLatch.await(30, TimeUnit.SECONDS));

assertEquals(EXIT_CODE_OK, execute("--cache", "idle_verify", "--cancel"));

afterCancelLatch.countDown();

checkSystemViewsNoIdleVerify(gridsCnt);

idleVerifyFut.get(getTestTimeout());

for (LogListener listener : listeners)
assertTrue(listener.check());
}

/**
* @param gridsCnt Grids count.
*/
private void checkSystemViewsNoIdleVerify(int gridsCnt) throws IgniteInterruptedCheckedException {
for (int i = 0; i < gridsCnt; i++) {
int finalI = i;

assertTrue(waitForCondition(() -> {
for (ComputeTaskView taskView : grid(finalI).context().systemView().<ComputeTaskView>view(TASKS_VIEW)) {
if (IdleVerifyTaskV2.class.getName().equals(taskView.taskName()))
return false;
}

return true;
}, 1000));

assertTrue(waitForCondition(() -> {
for (ComputeJobView jobView : grid(finalI).context().systemView().<ComputeJobView>view(JOBS_VIEW)) {
if (IdleVerifyTaskV2.class.getName().equals(jobView.taskName()))
return false;
}

return true;
}, 1000));
}
}
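
The helper above polls the system views until no idle_verify task or job remains. A polling wait of this kind reports its outcome through a boolean return value. The sketch below is a plain-JDK illustration of that contract, not Ignite's `GridTestUtils.waitForCondition`; the class name, the `waitFor` method and the 10 ms poll interval are assumptions.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

/** Hypothetical stand-alone sketch; it does not reproduce Ignite's implementation. */
class WaitForConditionSketch {
    /**
     * Polls {@code cond} until it holds or the timeout elapses.
     *
     * @return {@code true} if the condition held before the deadline, {@code false} otherwise.
     */
    static boolean waitFor(BooleanSupplier cond, long timeoutMs) throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;

        while (true) {
            if (cond.getAsBoolean())
                return true;

            // Overflow-safe comparison of nanoTime values.
            if (System.nanoTime() - deadline >= 0)
                return false;

            Thread.sleep(10);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger polls = new AtomicInteger();

        // Becomes true on the third poll.
        System.out.println(waitFor(() -> polls.incrementAndGet() >= 3, 1000));
        // Never becomes true, so the wait times out.
        System.out.println(waitFor(() -> false, 50));
    }
}
```

Because a timeout is signalled by `false` rather than by an exception, a caller that ignores the result cannot distinguish "condition met" from "gave up waiting".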

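/**
* Registers log listeners for the messages expected when idle_verify is cancelled.
*
* @return Registered listeners.
*/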
private List<LogListener> registerIdleVerifyCancelListeners() {
List<LogListener> listeners = new ArrayList<>();

listeners.add(LogListener.matches("Idle verify was cancelled.").build());

listeners.add(LogListener.matches("Cancel request sent to VerifyBackupPartitionsJobV2.").build());

for (LogListener listener : listeners)
listeningLog.registerListener(listener);

return listeners;
}
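
The listeners registered above are checked after the cancel to confirm that the expected messages were logged. As a rough illustration of that idea outside Ignite, the sketch below attaches a `java.util.logging` handler that remembers whether a message containing a given substring was published. It is not Ignite's `LogListener` or `ListeningTestLogger`; `LogMatchSketch`, `MatchHandler` and the logger name are hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Hypothetical sketch built on java.util.logging; it is not Ignite's LogListener. */
class LogMatchSketch {
    /** Handler that remembers whether any published message contained the expected text. */
    static final class MatchHandler extends Handler {
        private final String expected;
        private final AtomicBoolean matched = new AtomicBoolean();

        MatchHandler(String expected) {
            this.expected = expected;
        }

        @Override public void publish(LogRecord rec) {
            String msg = rec.getMessage();

            if (msg != null && msg.contains(expected))
                matched.set(true);
        }

        @Override public void flush() {
            // No-op.
        }

        @Override public void close() {
            // No-op.
        }

        boolean check() {
            return matched.get();
        }
    }

    /** @return Two-element array: whether the logged message matched, and whether an unlogged one did. */
    static boolean[] run() {
        Logger log = Logger.getLogger("log-match-sketch");

        // Keep the sketch quiet on the console.
        log.setUseParentHandlers(false);

        MatchHandler hit = new MatchHandler("Idle verify was cancelled.");
        MatchHandler miss = new MatchHandler("never logged");

        log.addHandler(hit);
        log.addHandler(miss);

        try {
            log.info("Idle verify was cancelled.");
        }
        finally {
            log.removeHandler(hit);
            log.removeHandler(miss);
        }

        return new boolean[] {hit.check(), miss.check()};
    }

    public static void main(String[] args) {
        boolean[] r = run();

        System.out.println("hit=" + r[0] + ", miss=" + r[1]);
    }
}
```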

/**
* Test deactivation works via control.sh
*
@@ -0,0 +1,49 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.ignite.internal.management.cache;

import java.util.Collection;

import org.apache.ignite.internal.client.GridClientNode;
import org.apache.ignite.internal.management.api.ComputeCommand;
import org.apache.ignite.internal.management.api.NoArg;

/**
* Cancels idle_verify command.
*/
public class CacheIdleVerifyCancelCommand implements ComputeCommand<NoArg, Void> {
/** {@inheritDoc} */
@Override public Class<CacheIdleVerifyCancelTask> taskClass() {
return CacheIdleVerifyCancelTask.class;
}

/** {@inheritDoc} */
@Override public String description() {
return "Cancels idle_verify command";
}

/** {@inheritDoc} */
@Override public Class<NoArg> argClass() {
return NoArg.class;
}

/** {@inheritDoc} */
@Override public Collection<GridClientNode> nodes(Collection<GridClientNode> nodes, NoArg arg) {
return nodes;
}
}