We have a physical ExtraHop Discover unit and three virtual Explore units (running as VMs on ESXi), which I recently upgraded from 8.5 to 8.6. The previous two upgrades went fine, but this one didn’t go so smoothly. The Discover unit upgraded quickly without problems and the second Explore unit took around half an hour, but the first Explore unit took just under ten and a half hours and the third took over 11 hours. The entire time I was waiting, the browser showed ‘Uploading…6%’ and very slowly ticked up. I didn’t want to cancel the operation in case it caused issues. I did email ExtraHop at the time, but there’s been no response.
Incredibly, after all that time they did update successfully, and there have been no issues since, although the Discover unit is a bit slower to come up the first time. I’m wondering if there’s anything I can check on the Explore units to stop this happening again. I followed the steps I could find on this site: I disabled record ingest on the Explore units, upgraded the Discover unit first and then the Explore units, and made sure all the Explore units were on the same firmware. I haven’t restarted the devices themselves, cleared the cache, or anything like that.
One concern I have is the disk usage on the Explore units, which is as follows:
Explore Unit 1 - 509GB/718GB (71% used)
Explore Unit 2 - 609GB/718GB (85% used)
Explore Unit 3 - 523GB/718GB (73% used)
Replication is set to level 1, and unit 2, which was by far the quickest to upgrade, has the highest disk usage. I’ve had a hunt through the documentation, but I can’t find what should be done long term to manage disk capacity, since it can’t just keep climbing.
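As a rough cross-check of the figures above, here is a minimal Python sketch that recomputes the percentage used and the free space left on each unit. It is plain arithmetic on the numbers listed, not anything ExtraHop-specific; the `usage` helper and the `units` table are names made up for illustration.

```python
# Disk figures in GB, copied from the list above: (used, total).
units = {
    "Explore Unit 1": (509, 718),
    "Explore Unit 2": (609, 718),
    "Explore Unit 3": (523, 718),
}


def usage(used_gb, total_gb):
    """Return (percent used, rounded to a whole number; free space in GB)."""
    return round(100 * used_gb / total_gb), total_gb - used_gb


for name, (used, total) in units.items():
    pct, free = usage(used, total)
    print(f"{name}: {pct}% used, {free}GB free")
```

On these numbers unit 2 has roughly 109GB free, about half the headroom of the other two, yet it was the fastest to upgrade.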
Any recommendations on what to check would be appreciated.