VMFS-6 read file error – Found stale lock

Once upon a time there was a power cut and an ESXi system went down.

… for those in a rush, scroll down to the “Fix” section…

After power-up, no details could be seen for the VM that had been running at the time of the power cut. The VM list page showed only the VMFS path to the file, and that was it.

CLI-level investigation showed that neither the VM.vmx nor the VM.vmdk file could be read; both raised an error.

Fun…. fun… fun….

Disclaimer:
Everything below goes against good practice and anything VMware would recommend, as it overwrites sensitive VMFS metadata.
That said, VMware does not provide any tool to fix this as of today; the only official route is to contact VMware support.
Everything below is for educational purposes only and done at your own risk.

First steps

The first step was the voma tool, run as follows:

voma -m vmfs -f check -d <path_to_device>

It returned output similar to that posted at the link below (taken from the internet, as no screenshots were taken at the time of the issue):

#https://www.reddit.com/r/vmware/comments/9j8k6o/esxi_vmfs6_datastore_corruption_after_host_reboot/

VOMA unfortunately does not support VMFS-6 in fix mode (as of 2018.11.02 on ESXi 6.5 and 6.7).

vmfs-tools (https://glandium.org/projects/vmfs-tools/) does not support VMFS-6 either.

This left me in a cul-de-sac… almost.

Big, big thanks to Ulli, aka continuum at communities.vmware.com, who helped to solve the issue.

Fix

1. Dump the heartbeat section of the VMFS-6 volume in question (the .vh.sf file is in the root folder of the VMFS-6 volume)

2. Verify that the file contains only locks from your system. This needs to be done on another system, as strings is not available on ESXi; scp, or any other way of getting the file off your host, is your friend:

3. If the above is confirmed, generate a clean heartbeat section using the same build of ESXi and dump it to a file:

4. Transfer that clean file to your ESXi server and incorporate it into the VMFS-6 volume with issues:

Within a minute or two the previously locked files should be accessible. If not, try rebooting your ESXi host.
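
The check in step 2 can be sketched roughly as below. This is a minimal sketch and not the exact command used at the time: the function name list_unique_strings, the file name hb.bin and the minimum string length of 6 are assumptions made purely for illustration. It prints each distinct printable string found in the dump once, so that anything not belonging to your own host stands out. It uses strings where available and falls back to tr otherwise.

```shell
#!/bin/sh
# Hypothetical helper (name and defaults are assumptions, not from the post):
# print every distinct printable string of at least MINLEN characters
# found in a binary dump, one per line, sorted.
# Usage: list_unique_strings FILE [MINLEN]
list_unique_strings() {
    file=$1
    minlen=${2:-6}
    if command -v strings >/dev/null 2>&1; then
        # -a scans the whole file, -n sets the minimum string length
        strings -a -n "$minlen" "$file"
    else
        # Fallback: turn every non-printable byte into a newline,
        # then keep only the lines that are long enough
        LC_ALL=C tr -c '[:print:]' '\n' < "$file" |
            awk -v n="$minlen" 'length($0) >= n'
    fi | LC_ALL=C sort -u
}

# Example (hb.bin is an assumed name for the dump copied off the host):
# list_unique_strings hb.bin
```

Which strings identify a lock owner is not spelled out here, so compare the output against identifiers you know belong to your own host before going any further.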

 

Links:

Locked files with VMFS 6

Create a VMFS-Header-dump using an ESXi-Host in production

https://communities.vmware.com/thread/597513

Apple censorship – posts on LSIs (liquid indicators) removed by Apple

Dear,

There’s a lot of noise about Apple products and how Apple refuses warranty repairs by abusing LSIs.

Apple has gone one step further: it censors and tries to mute all discussion of this subject.

Below is an example of a post they removed, claiming it was “inappropriate” for the forum, even though it was posted in a thread where LSIs were the subject.

Apple’s removal notice, which includes the original post, is below.

Hi (SecTec),

Thanks for participating in the Apple Support Communities.

We’ve removed your post macbook pro problem – “water damage”- Really? because it contained either product feedback or a feature request that was not constructive.

To read our terms and conditions for using the Communities site, see this page:  Apple Support Communities  – Terms of Use

We hope you’ll keep using our Support Communities. You can find more information about participating here:  Apple Support Communities  – Tutorials

If you have comments about any of our products, we welcome your feedback:  Apple – Feedback

We’ve included a copy of your original post below.

Thanks,
Apple Support Communities Staff


Original – removed post:

I’ll add a bit of my own story, much the same as the above, which just highlights that there’s something wrong with the MacBook Pro design and/or the LSIs (liquid indicators).

The story started back in 2014, when I bought a new MacBook Pro, the best available model. I used it professionally, traveling a lot by plane, fully aware of how to handle computer equipment, as I used to assemble PCs in the past and have been in the industry for 20+ years. It’s worth noting that before the MacBook Pro I used all sorts of other vendors (Dell, Lenovo, HP), never had any problems with the equipment, and only replaced it when it was too old/slow.

Back to the main story: the new MacBook was put into a Speck enclosure to protect it from body damage; this was ca. September 2014.

I also have additional insurance covering liquid spills, etc., so I’m not really bothered from the expense side, but more by the fact that I’ve never spilled any liquid on any of my laptops. I learnt my lesson with coffee over a keyboard back in the desktop PC days, so I’m very careful these days and never have any food/liquids around laptops.

 

Fast forward to January 2015: I was traveling to Florida, this time for leisure. I rarely used the system, but on the very last day I was checking the flight plan for the day when the SSD died. The system hung and, when I tried to reboot, reported a missing drive.

It was delivered to the service center. I made the technician start tests in front of me, which didn’t make them happy. There were no complaints about any damage to the body. I had to leave the system with them.

To my huge surprise, a couple of days later I received a call saying that LSIs had been triggered.

I asked to see that, and at that point discovered that:

a) a USB port was damaged (sic!), which the technician claimed had been the case from the beginning,

b) an LSI was indeed triggered (red).

 

What caught my attention was that only one LSI was triggered and none of the others. This led me to ask the technician how in the world this one could be triggered if no others were.

This led him to seek an exceptional approval from Apple.

The SSD was replaced as a result.

 

In April 2015, the system became slow (like really slow, think of a PC with an i386 CPU) and started to report that the battery needed replacement; it was powering off seconds after the charger was disconnected.

 

The system was then delivered to an Apple store. It was inspected by a technician, and I received a call saying that the system had been exposed to water and that this stopped any further activity on their side.

It took some effort to collect the documentation from the previous case and show them that nothing had happened since the last repair and that no other LSIs had been triggered. This technician also admitted that it was not the first time he had seen a single LSI triggered in a way that would be difficult to explain with none of the others triggered.

The system was repaired, this time the battery, which meant replacing the whole lower body due to the new battery design.

 

Moving forward to December 2015, the very same Mac failed again. Same issue as previously, meaning a slow system and charging problems.

 

This meant a couple of things for me:

a) The previous slowness happened just before a business trip. As it was the night before, I thought it was just a slow system and a battery issue, so I took the Mac with me, only to discover that it was useless. This forced me to buy a new system with a similar specification to the Mac, since due to my profession I have to have systems with 16GB RAM, SSD, etc. So, fun, fun… a lot of expense.

b) This time it failed during a business trip and, guess what, I didn’t have the previously acquired spare system with me. Due to the importance, length and requirements of the trip, I had to acquire yet another system (sic!). Two spare systems in total, thank you Apple.

c) The system was out of the standard Apple warranty and, given the above, I was sick of the issues with this system, so I submitted a claim to have the system replaced.

 

A couple of days later, I received a call from a technician asking questions about the issue, history, etc. Good, they had started to work on it.

To my surprise, there was then no sign of life from their side for a very long time; when they called back, it was 20 days after I had reported the issue.

Unsurprisingly, I was told that there were LSIs in red and that, as a result, they were denying all my claims due to water, etc. This was certainly a nice excuse and a nice try on their side, as these repairs are at the seller’s cost and not necessarily considered a “warranty” repair. I’m unsure how Apple plays with their cost centers, etc… that is outside of this story.

 

Their pushback was so mean that it made me decide not to exercise my insurance.

It took quite a few additional days to get it executed.

In March 2016, I had a new MacBook Pro in hand.

 

I was so ****** by that time, and so used to my new Lenovo, that I kept using it, happily… no issues.

The MacBook was happily sleeping in its unsealed, original Apple box.

For some reason I decided to start using the system again around December, as the Retina display provides nicer colors compared to the Lenovo display.

So, I was a happy Mac user again.

Just until March 2017… so that makes 3 months at most!!!

 

The Mac failed again. This time it had made close to zero trips and stayed at home. It also failed in a different way: it was left on the desk and went to sleep (display off). The next day I tried to wake it up, with no reaction. I tried all the combinations, SMC reset, etc. I noticed that the Mag plug was showing a green indicator. Since MacBooks are so great at communicating their state to the user, it was my only way to see whether it reacted to anything. I unplugged it, plugged it back in and zero, nothing, no light… just gray… no orange, no green.

OK… now it was dead like never before. I called Apple, and they pushed me through the standard routine, check the connector, etc…

 

I went to the service center again. The technician happily noted: no scratches, nothing, mint condition. Given past experience, I pushed them to sign that there was no physical damage. Since I was in a rush, it was late and I had to prepare for a trip the next day, I couldn’t wait or come back to the service center when they would be able to open the system in front of me.

The day after, I received mail from the service center: $800 for the motherboard, $1000 for a 512GB SSD, plus some peanuts for fans.

According to the Apple technician, one LSI was triggered and, to my even bigger surprise, they claimed there were traces of some liquid inside!!! I can honestly say that this system didn’t have any contact with any liquid. I had been super duper careful with it, much more than with any earlier system, due to my earlier experience.

Yet, Apple once again claims the system had contact with liquid.

 

I’m sorry Apple, it is either a scam or something is wrong with the design and/or the LSIs. What I can say is that no liquid was spilled. The system was always used indoors: office, hotel, living room, airport lounge, plane. No bathroom, no kitchen, no outdoors, nothing. Always dry air, no temperature shock; the system was always in a sleeve and then in a rolling travel bag (very well protected from everything).

When leaving the system at Apple, I didn’t think they could say anything about LSIs, especially this time, when I had taken that much care of the system.

 

What is interesting in the whole story is that, over the same period, the Lenovo system I’m typing on now was used for much more time than the last MacBook. Earlier I used the good old ThinkPad W530, etc. and never had an issue.

What did I like about the MacBook Pro?… The hardware design, as it is nice, lightweight and small compared to the T530 tank.

As this MacBook has failed again, am I going to stay with Apple? Rather not…

I’ll try to understand which LSIs were triggered and will see what Apple has to say, but for the sake of time I’ll probably exercise the insurance this time unless Apple is helpful.

Depending on that, I’ll either walk away from Apple or might stay… not really sure.

 

This will have some, potentially minor, impact on Apple sales, but it will be there: I’ll advise internally within my company to stop purchasing Apple, so we are talking about potentially another 200-300 units which would otherwise be purchased within the next 3 years or so.

I can’t recommend Apple to any business now.

What could change that is Apple admitting some design issues and taking this more seriously.


This is a send-only account. Replies received at this address are automatically deleted.


TM and copyright © 2017 Apple Inc. 1 Infinite Loop, MS 96-DM. Cupertino, CA 95014.