[Fixed in 1.10.2] Devices stuck in Discover or Backup
Posted: Mon May 13, 2019 9:38 pm
We have several instances of Unimus running. It seems that when one instance has more than 1000 devices, some devices fail to finish discovery or backup. Once a device is stuck in this state, no further action can be attempted on it. Restarting the Unimus instance, or deleting and re-adding the device, seems to resolve the issue.
Some of our backups may take a very long time, as we have some switches and routers with very large VLAN tables.
We are currently running version 1.10.1.
Below is from the error log when this appears to occur.
Code: Select all
prod_unimus_unimus.1.xvshfhsqctw3@test.example.com | 2019-04-30 03:07:18.274 WARN 1 --- [ discovery-106] net.unimus.core.api.CoreImpl : Error during discovery of 10.109.41.5
prod_unimus_unimus.1.xvshfhsqctw3@test.example.com |
prod_unimus_unimus.1.xvshfhsqctw3@test.example.com | java.lang.IllegalStateException: Can't start StopWatch: it's already running
prod_unimus_unimus.1.xvshfhsqctw3@test.example.com | at org.springframework.util.StopWatch.start(StopWatch.java:127)
prod_unimus_unimus.1.xvshfhsqctw3@test.example.com | at org.springframework.util.StopWatch.start(StopWatch.java:116)
prod_unimus_unimus.1.xvshfhsqctw3@test.example.com | at net.unimus.core.util.metrics.JobDurationMetrics.startMeasuring(JobDurationMetrics.java:27)
prod_unimus_unimus.1.xvshfhsqctw3@test.example.com | at net.unimus.core.api.CoreImpl$DiscoveryExecutor.doRun(CoreImpl.java:348)
prod_unimus_unimus.1.xvshfhsqctw3@test.example.com | at net.unimus.core.api.CoreImpl$ErrorHandlingExecutor.run(CoreImpl.java:302)
prod_unimus_unimus.1.xvshfhsqctw3@test.example.com | at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
prod_unimus_unimus.1.xvshfhsqctw3@test.example.com | at java.util.concurrent.FutureTask.run(FutureTask.java:266)
prod_unimus_unimus.1.xvshfhsqctw3@test.example.com | at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
prod_unimus_unimus.1.xvshfhsqctw3@test.example.com | at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
prod_unimus_unimus.1.xvshfhsqctw3@test.example.com | at java.lang.Thread.run(Thread.java:745)
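For what it's worth, the exception message itself can be reproduced in isolation. The sketch below is only a guess at the failure mode, not Unimus code: it assumes a per-device timer is reused between jobs and that an earlier job never reached its stop call. `SimpleStopWatch` and `StopWatchReuseDemo` are hypothetical stand-ins written for this post; they only mimic the "already running" check that the trace shows in `StopWatch.start`. If something like this is what happens, it would also fit with a restart clearing the problem, since the timer state would only live in memory.

```java
public class StopWatchReuseDemo {

    /** Hypothetical stand-in for a start/stop timer; NOT Spring's or Unimus' actual class. */
    static final class SimpleStopWatch {
        private boolean running;
        private long startNanos;

        void start() {
            if (running) {
                // Same message as in the log above.
                throw new IllegalStateException("Can't start StopWatch: it's already running");
            }
            running = true;
            startNanos = System.nanoTime();
        }

        long stop() {
            if (!running) {
                throw new IllegalStateException("Can't stop StopWatch: it's not running");
            }
            running = false;
            return System.nanoTime() - startNanos;
        }
    }

    /** Assumed scenario: the first job never calls stop(), then a second job reuses the timer. */
    static String secondStartMessage() {
        SimpleStopWatch shared = new SimpleStopWatch();
        shared.start(); // first job starts and hangs or aborts before stop()
        try {
            shared.start(); // next job on the same device fails immediately
            return null;
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    /** Stopping in a finally block leaves the timer reusable for the next job. */
    static boolean startsCleanlyAfterStop() {
        SimpleStopWatch watch = new SimpleStopWatch();
        watch.start();
        try {
            // the job's work would run here
        } finally {
            watch.stop();
        }
        watch.start(); // no exception: the previous run was stopped
        watch.stop();
        return true;
    }

    public static void main(String[] args) {
        System.out.println(secondStartMessage());
        System.out.println(startsCleanlyAfterStop());
    }
}
```

In this sketch every later start on the same timer fails until the state is reset, which would look like a device that can never be retried.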