
Are all SSDs Unreliable?

by russ_watters
Tags: ssds, unreliable
russ_watters
#1
Feb13-14, 03:45 PM
Mentor
P: 22,239
I have a Crucial M4 SSD that is 2.5 years old and still in warranty. The drive has a nasty habit: If it loses power, it disappears from my laptop. Crucial has a procedure for bringing it back to life prominently displayed on its support site:

1. Remove the drive from my laptop.
2. Connect it to my desktop's power supply for 20 minutes.
3. Remove power and wait 30 seconds.
4. Repeat step 2.
http://forum.crucial.com/t5/Solid-St...tem/ta-p/65215

The procedure works. Unfortunately, the result is often a corrupted Windows installation. Crucial has issued firmware fixes attempting to mitigate the issue, but if anything it appears to be getting worse: it has happened 3 times in the past 4 months or so, and even at that, the drive has only been in my system for half the time!

If you google "ssd disappear", the majority of the hits you get are about this issue on the M4.

Crucial won't replace it because they say it is normal(!?):
Justin _M : It is normal for an SSD to need to power cycle after a sudden power failure, but if its locking up just after normal use that would be a cause of concern.

Russell Watters: I disagree. No computer is perfectly stable and most occasionally lock-up/crash. I've never owned another hard drive that required such a procedure to bring it back to life.

Russell Watters: I'm lucky enough to be computer savvy enough to know how to replace a hard drive (which I suppose is normal for a replacement HD buyer), otherwise I'd be screwed!

Justin _M : Its just due to the nature of the caching technology that this can happen when the computer crashes. You shouldn't be needed to reinstall windows each time this happens, but the power cycle will restart the garbage controller back in the SSD so it can get back to work on actively maintaining itself.

Russell Watters: I still disagree and point out that Crucial wouldn't be trying to address the issue with firmware fixes if it was actually normal. I'll certainly check to see if this is common to other vendors' drives (note, I already own another drive from another manufacturer and have yet to have this happen). Crucial needs to fix its caching technology.

Russell Watters: In any case, I'm not really here to argue: are you telling me Crucial will not under any circumstances replace a drive that is displaying the "disappear" problem?

Justin _M : That's not entirely true, but if the troubleshooting fixes it, I wouldn't want to send out a new drive to you to make you think it will be different, when this is the nature of the SSD as a whole.
[emphasis added]
Digging further, though, they may not be wrong, although that is not an easy question to answer. Googling "kingston ssd disappear" turns up, aside from links where people say they are going to replace their disappeared Crucial M4 with a new Kingston, some threads that describe the same issue happening with Kingston drives:
http://www.hardwarecanucks.com/forum...y-what-do.html
This involved a "SandForce" controller, and the forums suggest it was a firmware bug that was fixed but nevertheless took down the company:
http://en.wikipedia.org/wiki/SandForce

But the M4 doesn't use the SandForce controller.

Worse, this paper implies that Crucial may be correct:
...but a report from the 11th Usenix Conference on File and Storage Technologies (FAST 13), given early this year, suggests most models have a fundamental problem with sudden power loss. While the paper came out in mid-February, I only recently came across it, after a reader asked if I'd look into a rather puzzling recovery program recommended by Crucial for its M4 SSD line....

Baffled, I began to poke at this further, then stumbled across the aforementioned report from early this year.

Researchers working with the University of Ohio rounded up 15 different SSDs from five different vendors, as well as a brace of HDDs, and put them through a series of tests designed to measure how they responded to sudden power failures. No vendors are identified, but the drives in question incorporate both MLC and SLC. Some (the SLC versions) are explicitly enterprise drives. Some include supercapacitors, which are designed to mitigate catastrophic power failure.

Of the 15 drives (10 different models, from five vendors), only one drive model, from one vendor, had no failures of any sort. One device failed completely (SSD #1), while one-third of SSD #3 became unusable due to metadata corruption. The other SSDs all exhibited various types of data corruption when they unexpectedly lost power, including the high-end enterprise SSDs with SLC NAND and supercapacitors. According to the research team, part of the problem is that virtually none of the devices actually behave as expected under fault conditions. While all the drives claim to use ECC RAM, for example, many exhibited single-bit errors of the kind of errors that ECC is meant to prevent. While one of the two included hard drives also developed errors, the HDDs are both far cheaper and showed no sign of the disastrous failures that characterized the SSDs.
[emphasis added]
http://www.extremetech.com/computing...ling-your-ssds

I quote/link the article about the paper instead of the paper itself because the paper is a bit over my head. I really don't know what to do here. I have a $400 SSD that I'd rather not have to throw in the trash, but I also would rather not waste a day reinstalling Windows every time a minor issue causes it to disappear and become corrupted.

Anyone have experience with this issue? Comments? Recommendations?
Chronos
#2
Feb13-14, 05:30 PM
Sci Advisor
PF Gold
P: 9,377
I'm at a disadvantage. I have a 256 Samsung SSD [~$200] and have not experienced such an issue. It has data migration software that can be used to clone the OS from a hard drive, which is considerably less aggravating than a fresh install.
Ben Niehoff
#3
Feb13-14, 05:58 PM
Sci Advisor
P: 1,588
I had this problem in a drive, but the solution is not nearly so complicated. What I did was:

1. Power on laptop and press whatever key to enter the BIOS screen.

2. Let it sit in BIOS, plugged in, for about 10 minutes. (This way, the drive is getting power, but is not in use).

At this point, either the drive shows up in BIOS (yay!), or it will after rebooting.

It turns out, however, that my drive was having other problems. It would die while my laptop was in sleep mode, even if my laptop was plugged in. I forget the exact details, but since I was using Linux, I would see errors in dmesg about the drive.

I asked Corsair to RMA the drive, and they did so quickly and painlessly. I have not noticed any problems since*. So you may just have a bad drive; I'd contact Crucial while you're still in warranty.

* However, I received the RMA around the same time I got my Surface Pro 2, so I haven't used that laptop as much since then either. In fact, since it has been sitting with no power for a few months now, I can go turn it on later and let you know if there is any issue with the drive booting up.


Some further suggestions: Do you have any regular method of data backup? I built myself a NAS server in mirror mode and I regularly copy things there, so generally speaking I am not concerned if the drive on my laptop fails, or if its OS gets corrupted, except for the temporary inconvenience.

For even more peace of mind, you can use Clonezilla to make a clone of your OS drive. I can tell you from experience that it works beautifully. While my SSD was being RMA'd, I cloned the OS to my old HDD and continued to use the laptop, and then cloned it back to the new SSD. You should use Gparted to shrink your partitions slightly (it can shrink Windows partitions safely, including moving "immovable" files), because Clonezilla cannot clone a drive to a *smaller* drive, and not all 250 GB drives contain exactly the same number of bytes.
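
As a rough illustration of the size check behind that advice, the sketch below computes how many bytes a source layout would have to give up before it fits on a target drive. The function name and the byte counts are hypothetical, chosen only for illustration; real sizes would come from a partition tool such as Gparted.

```python
def bytes_to_shrink(source_bytes, target_bytes, margin_bytes=0):
    """Return how many bytes the source layout must lose to fit on the target.

    margin_bytes is optional extra slack to leave free on the target.
    A result of 0 means the source already fits.
    """
    return max(0, source_bytes + margin_bytes - target_bytes)


# Hypothetical sizes for two drives that are both sold as "250 GB".
old_drive = 250_059_350_016
new_drive = 250_000_000_000

shortfall = bytes_to_shrink(old_drive, new_drive)
print(f"Shrink the source partitions by at least {shortfall} bytes")
```

Shrinking by a bit more than the computed shortfall leaves some room for differences in partition alignment.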

Edit to correct: My SSD is a Corsair, not Crucial. I was very impressed with Corsair's handling of the issue.

russ_watters
#4
Feb13-14, 06:54 PM
Mentor
P: 22,239

Quote by Chronos:
I'm at a disadvantage. I have a 256 Samsung SSD [~$200] and have not experienced such an issue.

Have you ever had to shut down your computer by holding down the power button due to a lock-up or failed shutdown?

Quote by Chronos:
It has data migration software that can be used to clone the OS from a hard drive, which is considerably less aggravating than a fresh install.

Yeah, if I start using the SSD again, I'm going to do something like that.
AlephZero
#5
Feb13-14, 07:32 PM
Engineering
Sci Advisor
HW Helper
Thanks
P: 6,957
It's interesting the way usage of SSDs has changed over time. Back in the early days of SSDs on supercomputers, nobody would have even thought about using them for "permanent" file storage. They were strictly for fast access to scratch files, and/or as another level of memory paging where the application could pre-fetch the data it knew would be needed next, rather than letting an OS's virtual memory logic keep trying to play catch-up with the CPU.

I don't have any experience either way with modern "consumer level" SSDs though. But I suspect they are more "my computer boots faster than yours" bragging rights for many users, rather than something actually useful, though the reduced power consumption and mechanical reliability are obviously real benefits if you need them.
Ben Niehoff
#6
Feb13-14, 07:56 PM
Sci Advisor
P: 1,588
Mechanical reliability is a big deal to me. I lost an HDD once because I was using my laptop on an airplane. Once I realized what was happening, I shut it down and I was able to get most things off of it later before the drive became unusable. Luckily this happened on the way home; my trip would have been a disaster if it had happened on the way out.

This is actually the main reason I bought my SSD.
Chronos
#7
Feb14-14, 01:37 AM
Sci Advisor
PF Gold
P: 9,377
Yes, I have had to hard boot many times, Russ. The Samsung has so far handled it with ease. Of course, that could change tomorrow.
Psinter
#8
Feb25-14, 01:15 AM
P: 94
Quote by russ_watters:
Anyone have experience with this issue? Comments? Recommendations?

I have a comment (useless, but I still want to comment). It's weird. When I go to Amazon I see many bad reviews for SSDs, while hard disks, which are cheaper, have far fewer negative reviews. Failing hard disks usually give some warning and can have their data recovered, but according to the reviews SSDs appear to fail terminally in an instant, without warning. They also seem to need many tricks to make them work. (I think things should just work, with no tricks whatsoever like unplugging power, waiting minutes, and so on.)

SSDs are claimed to be wonders of technology in the media, but why do so many people say they die quickly? Why does the user have to unplug power, wait some time, reinstall operating systems, do magic tricks, and so on with this technology? According to the reviews, they also appear to last less than hard disks in the long run. That's what has kept me from getting one. It really scares me to see so many people in those reviews saying their drives died in a few months, plus the fact that they have to do acrobatics and magic tricks with their computers to make them work.
Chronos
#9
Feb25-14, 01:56 AM
Sci Advisor
PF Gold
P: 9,377
The 'magic' of an SSD is its instant-on ability. It is also a critical weakness when it fails. It is not difficult to restore if you know what to do. The only 'trick' is to boot off a backup HDD with an uncorrupted OS. You can then clone the OS from the HDD to the SSD with proper software.
rcgldr
#10
Feb28-14, 01:48 AM
HW Helper
P: 7,034
Part of the issue is the way SSDs try to distribute writes to the solid-state memory somewhat evenly: they use a mapping scheme to map "logical" sectors onto "physical" sectors, and that map needs to be stored somewhere, usually also in the solid-state memory. If there's a power loss during a map update, the drive can lose a lot of data. I would assume they use some sort of self-archiving, like keeping dual maps and not reusing sectors removed from the previous map until the current map update has completed. I don't know how sophisticated the mapping schemes in current SSDs are.
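
To make the dual-map idea concrete, here is a toy Python sketch. It is not how any particular SSD controller works; the class, its methods, and the use of JSON plus a CRC are all assumptions made for illustration. Two slots stand in for two reserved flash regions, each commit goes to the slot that does not hold the newest valid map, and recovery picks the newest map whose length and checksum both check out.

```python
import json
import zlib


class DualMapStore:
    """Toy logical-to-physical map kept in two alternating slots (hypothetical)."""

    def __init__(self):
        # Each slot holds (payload_bytes, expected_length, crc32) or None.
        self.slots = [None, None]

    @staticmethod
    def _valid_record(slot):
        """Return the decoded record if the slot is intact, else None."""
        if slot is None:
            return None
        payload, length, crc = slot
        if len(payload) != length or zlib.crc32(payload) != crc:
            return None  # torn or corrupted write
        return json.loads(payload)

    def _newest(self):
        """Return (slot_index, record) for the newest intact map."""
        best_index, best = None, None
        for index, slot in enumerate(self.slots):
            record = self._valid_record(slot)
            if record is not None and (best is None or record["seq"] > best["seq"]):
                best_index, best = index, record
        return best_index, best

    def commit(self, mapping, fail_midway=False):
        """Write a new map without touching the newest intact one."""
        index, record = self._newest()
        sequence = 1 if record is None else record["seq"] + 1
        target = 0 if index is None else 1 - index
        payload = json.dumps({"seq": sequence, "map": mapping}).encode()
        crc = zlib.crc32(payload)
        if fail_midway:
            # Simulate power loss: only half of the payload reaches the slot.
            self.slots[target] = (payload[: len(payload) // 2], len(payload), crc)
            return
        self.slots[target] = (payload, len(payload), crc)

    def recover(self):
        """Return the newest intact map, or None if neither slot is usable."""
        _, record = self._newest()
        if record is None:
            return None
        # JSON turns integer keys into strings, so convert them back.
        return {int(k): v for k, v in record["map"].items()}


store = DualMapStore()
store.commit({0: 10, 1: 11})
store.commit({0: 20, 1: 11}, fail_midway=True)  # interrupted update
print(store.recover())  # falls back to the previous, intact map
```

The point of the design is that an interrupted update can only damage the slot being written, so the previous map is still there to fall back on. Data written under the lost update is still gone, but the drive stays readable.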

Maybe someday the number of writes in the "lifespan" of solid-state memory will increase so that a mapping scheme is no longer needed (or maybe some SSDs are already there?).
SixNein
#11
Feb28-14, 03:52 AM
PF Gold
P: 194
Quote by russ_watters:
Anyone have experience with this issue? Comments? Recommendations?

Change your settings for battery power.

http://www.dummies.com/how-to/conten...ows-7-or-.html

Raise the cutoff point so that the laptop shuts down before it runs completely out of power.


An RMA may also be a good idea.
enorbet
#12
Apr25-14, 10:50 PM
P: 138
Greetings
This is a bit of a grey area for me as I have yet to adopt SSDs, but I fix a few. It's good practice to be sure you have TRIM enabled. Some SSD manufacturers ship their own versions, but at the very least the version built into Windows 7 or newer should be enabled. Here's an example - http://lifehacker.com/5640971/check-...e-in-windows-7
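
For reference, on Windows 7 and later the built-in setting can be read with `fsutil behavior query DisableDeleteNotify`, where a value of 0 means TRIM is enabled. A minimal Python sketch of that check follows. The helper names are my own, and because the exact output text differs between Windows versions, the parser only looks for the `DisableDeleteNotify = 0` or `= 1` part. The command may need an elevated prompt.

```python
import re
import subprocess


def trim_enabled_from_output(text):
    """Interpret fsutil output: True if TRIM is on, False if off, None if unknown."""
    match = re.search(r"DisableDeleteNotify\s*=\s*([01])", text)
    if match is None:
        return None
    # DisableDeleteNotify = 0 means delete notifications (TRIM) are NOT disabled.
    return match.group(1) == "0"


def query_trim():
    """Run fsutil (Windows only) and report whether TRIM is enabled."""
    result = subprocess.run(
        ["fsutil", "behavior", "query", "DisableDeleteNotify"],
        capture_output=True,
        text=True,
        check=True,
    )
    return trim_enabled_from_output(result.stdout)


# The parser can be tried anywhere; query_trim() only works on Windows.
print(trim_enabled_from_output("DisableDeleteNotify = 0"))
```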
Psinter
#13
Apr26-14, 02:29 PM
P: 94
According to what I've read on NAND technology it is better to buy SLC NAND technology SSDs. Yet it appears most SSDs at affordable prices use 3-bit MLC (TLC) and that is what I suspect is the culprit in people's problems with them.

I read this piece of a white paper to conclude that: https://www.samsung.com/global/busin...tepaper03.html

Even though the paper comes from an SSD manufacturer that itself puts lots of SSDs with 3-bit MLC technology on the market, I applaud their honesty on this subject. My interpretation of their honesty:
Quote by My Interpretation:
"We and everyone else are selling you crappy stuff (consumer grade), but hey, at least we are not lying to you. It's just industry trends and we give you full details on the subject. If you want good stuff, pay 3-5 times the price of an MLC SSD. Then you will get good stuff."
I suppose I will save a lot of money and then buy a good one in one shot.
Vanadium 50
#14
Apr26-14, 05:26 PM
Mentor
P: 16,178
Russ, I have had four SSDs.

#1 was a Kingston 128. It failed. Kingston didn't believe me at first, but I was able to demonstrate that it was the drive by writing and rereading a block and showing the readback failed. They RMA'ed it and sent back a 120 - after several weeks. I strongly suspect that the drive was the same model, but the firmware was altered to use 8 GB as spares. The 120 works OK, but I am not going to buy another Kingston SSD.
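
A write-and-readback check of that general kind can be sketched in a few lines of Python. This is only an illustration, not the exact procedure used above: the function name, the file path, and the block sizes are assumptions. It also has a known limitation, since the read may be served from the operating system's cache rather than the drive itself, so a real test would need to bypass or flush the cache, or power-cycle between the write and the read.

```python
import os


def write_and_verify(path, block_size=4096, blocks=16):
    """Write random blocks to `path`, sync them, read back, and list bad blocks.

    Returns the indexes of blocks whose readback differs from what was written.
    """
    data = [os.urandom(block_size) for _ in range(blocks)]

    with open(path, "wb") as f:
        for block in data:
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to the device

    mismatches = []
    with open(path, "rb") as f:
        for index, expected in enumerate(data):
            if f.read(block_size) != expected:
                mismatches.append(index)
    return mismatches


# Hypothetical test file on the drive under suspicion.
bad = write_and_verify("readback_check.bin")
print("mismatched blocks:", bad)
```

An empty list means every block read back as written. Any index in the list is a block worth showing to the vendor.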

I have a Crucial C300 in my desktop. Works fine - been using it for years.

The fourth drive is in my laptop. It's a Samsung 128 and it also works fine.
DHF
#15
May20-14, 12:57 PM
P: 65
I have a Kingston 120 GB drive that I bought for work about 2 years ago, and it worked flawlessly up to the day... well, that it didn't. No warning, just poof. After much necromancy I managed to revive it for seconds at a time, but never long enough to copy even a single folder off of it. Although it has a 3-year warranty, I am in kind of a hard spot because it holds years' worth of sensitive company info. I had an image dated the day before, so no work was lost, but I can't return the drive to Kingston for a replacement because of the information on it. So I am in the market for a new drive. I did enjoy the speed, but I am not sure I want to take another chance on SSDs.
B. Elliott
#16
Jun8-14, 07:44 PM
PF Gold
P: 397
Something that wasn't mentioned is the conditions under which the SSD is being used, i.e., Virtual Memory, Write Caching, AHCI vs IDE, and Prefetch/SuperFetch settings.

I've been using three different SSDs for quite some time now and have installed quite a few in other people's computers. So far I have yet to personally experience or hear of any problems with them. Then again, I do get a bit meticulous with the settings that pertain to the SSDs.

