Solar Troubleshooting: Hunting SMA String Arc Faults


One of the downsides of having done your own solar is that when something goes wrong, there’s no “call someone to fix it” resolution path - though other people’s success with that method seems limited. One of the major upsides to ground mount solar, though, is that troubleshooting is far easier than on a roof mount system!

I’ve spent part of early summer troubleshooting an arc fault warning on my system, and I’ve learned some things that you might find useful for troubleshooting arc faults on solar arrays - especially if you’ve got a SMA Sunny Boy. They have… some interestingly wrong error messages.

SMA Sunny Boy Arc Faults

If you’re reading this from my normal blog posts, enjoy.

If you’re reading this from Google search results, I’m sorry. You’re in for what is ideally a simple set of fixes, and at worst a really rough time hunting the fault. As there’s not much more than generic articles on this matter, hopefully something a bit more in depth and hands on gives you some ideas if you find yourself here.

SMA Sunny Boy Event 4301 or 4302

Serial el.arc in String A detected by AFCI mod.
Serial el.arc in String B detected by AFCI mod.
Serial el.arc in String C detected by AFCI mod.

Again, I’m sorry. You’re not having a good day/week/month if you’ve found this page from search results.

Write Useful Error Messages

Actually, I lied. As I understand things, you’ll never see an “el.arc in String B,” or String C. Why? Because according to both the tech support people I’ve talked with, and my own experiments, the error message that says “String A” really just means “DC arc detected, somewhere on the DC side.” I believe some big inverters will split things into a few different detection realms, but for the smaller Sunny Boy inverters? It’ll say String A, but mean anything DC.

Seriously. I was a bit confused, and started calling tech support when I got endless warnings about String A arc faults with things wired like this - it’s pretty well impossible to have an arc fault on a disconnected string unless something inside the inverter has failed. Or, of course, unless the error messages are nonsense.

So: SMA? Shame on you. First, you wrote a defective error message that’s both “specific” and “wrong.” Then, you didn’t document the quirk on your website to guide people who are getting it to the reasonable step of “Check all your strings.” Finally, even some of your Level 2 techs don’t understand this! One of them told me that it was string-specific, and after I found myself in the nonsense situation, it took another one to tell me that, no, it said String A but didn’t mean String A, go check the rest of my strings. This is not helpful.

So, anyway, if you’re here because you cannot find a fault on your String A? Go check the rest.

What’s a DC Arc Fault and Why Should I Care?

If you know what an arc fault is, you can skip this section. If not, you should read it, because they’re Very Bad - if they’re actually happening. Arc fault detection is more of an art than a science, and there are plenty of things that can be detected as an arc fault that either aren’t an arc fault, or sort of are but aren’t a problem. Like vacuum cleaner motors with brushes. But this isn’t the case with DC solar strings. DC arc faults are bad.

Have you ever seen an electric welder in action? Bright, blazing blue arc, melting metal to join other metal things together. You shouldn’t look into them, because they emit a wide range of light, in and out of the visible spectrum, and at intensities that are very harmful to look directly at. That’s what an arc fault is. For welding, it’s a side effect of the arc needed to heat metal. For PV systems, it’s certainly not something you want to find in the middle of your array, but any good string array is going to have enough voltage to do it without trying hard.

An arc fault is just a spark jumping a gap. This can be caused by all sorts of conditions, but it’s usually found at a bad connection. The arc vaporizes bits of metal from the contacts and puts out an insane amount of heat, making things more likely to arc in the future. And, worse, on a DC system, an arc won’t self extinguish - because a DC system has no zero crossings of voltage or current. Imagine a little arc welder, welding away inside one of your connections until you’ve melted the wire or set things around it on fire, and you have the worst case of an arc fault in a PV system. They’re not something to mess with, and they will damage something - in a hurry.

Unfortunately, arc fault detectors can also be triggered by other things - interference, a hot cell that is still working fine but is “a bit weird,” and probably a wide range of other stuff that I’ve not considered. And I’ll talk about some of that later.

But, if you’re getting an arc fault warning on your system, until you’re damned sure it’s not an actual fault, you should assume that you’re seeing something real, and go about troubleshooting it as a real warning.

How Do I Know It’s Faulting?

Another rather substantial point of annoyance, in my book, is that SMA inverters do not show arc faults to the user login. They only show it to the installer login. I don’t know why, because this is one of those “kind of a big deal…” errors - if it’s real. It’s certainly a loss of production, and if real, it indicates an actual problem in the system.

On my faulting inverter, this is what the log for the day looks like from the user login - nothing particularly interesting beyond some (legitimate) power cycles. No cause for concern, right?

But if you log in with the installer login, you’ve got… well, this is just the top of the log.

Yeaaaaaaah. Probably should look into that. You’ll also see these messages in the Sunny Portal logbook, though - and you should have access to that. If your installer was decent.

But, this sort of shutdown also leaves a rather distinctive pattern in the inverter production curves. If you have deep V-cuts in your production log, it’s the inverter shutting down for some period of time, and arc faults are a likely cause. It could also be grid voltage/frequency issues, but you’ll need access to the installer login logs (or the Sunny Portal logbook) to see the real cause. If you’re seeing these, though, call your installer and have them take a look.

See the other little “Vs” that aren’t actually shutting all the way down, but reflect a brief drop in production? Those are bad news too - I only ever see them in the proximity of the arc faults. They’re not a cloud or anything - I can compare against my other array and discover that only one array is impacted. A very real perk of having two identical arrays is that I can just diff the two against each other - they’re close enough that a cloud that impacts one will impact the other as well.
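If you can export interval power data for both arrays, that comparison can be roughed out in a few lines of Python. This is only a sketch under assumptions: it expects two equal-length lists of power samples in watts taken at the same timestamps, and the function name `find_dips`, the 80% ratio, and the 500W low-light cutoff are made-up values for illustration, not anything from SMA or Sunny Portal.

```python
def find_dips(suspect_w, reference_w, min_ratio=0.8, min_power_w=500):
    """Return sample indices where the suspect array lags the reference array.

    Both inputs are power samples in watts, taken at the same timestamps.
    The thresholds are illustrative guesses, not tuned values.
    """
    dips = []
    for i, (s, r) in enumerate(zip(suspect_w, reference_w)):
        # Skip low-light samples, where the ratio between arrays is noisy.
        if r < min_power_w:
            continue
        # A cloud should hit both arrays; a drop on only one is suspicious.
        if s / r < min_ratio:
            dips.append(i)
    return dips


# Invented sample data: the suspect array drops out at samples 2 and 3.
reference = [3000, 3100, 3200, 3250, 3300, 3300]
suspect = [2980, 3090, 0, 1500, 3280, 3310]
dips = find_dips(suspect, reference)
print(dips)
```

Anything this flags on a clear day is worth lining up against the installer log or the Sunny Portal logbook to see whether an arc fault event landed at the same time.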

On a particularly bad day, this is about 10kWh lost to inverter shutdowns - it will start coming back online and fault out again. There’s a reason I’ve been trying to track these down! Again, this is a perfectly clear, sunny day.

Strings vs Microinverters or Optimizers

Most of the advice in this post is focused around strings of solar panels - long sets of (typically) 10-12 panels in series, with no microinverters, optimizers, or any other sort of “active electronics” on each panel. While there’s no way to meet NEC 2017+ rapid shutdown requirements with this configuration on a roof mount system, there’s no requirement for rapid shutdown on a ground mount system - and if the panels are well positioned such that they all get equal sun, there’s no good reason to have per-panel electronics either. You should spend about as much on cost-efficient ground mounts as you would on the per-panel electronics, so the cost of a ground mount system shouldn’t be much higher than a roof mount system if you’re doing the work yourself.

If you have microinverters (or probably even DC optimizers), hunting panel problems or connection problems is radically easier if you’ve marked which inverter is where in your system - because you can trace the connection problem to a single inverter/panel combination and focus your efforts there.

But with a string of panels (say, 12 panels), there’s an awful lot of places that things can go wrong - typically 13 MC4 connections, and a bunch of diodes in the panel junction boxes too. Anything, at any of these points, can cause problems - and odds are decent that, at some point in the life of your system, one of them will get grumpy. Or you’re lucky and they won’t. But if they do, it’s useful to understand the various failure modes and how to remedy them!
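The count of 13 comes from simple arithmetic: 12 panels in series have 11 panel-to-panel joints, plus one more at each end where the string meets the extension cables. A minimal sketch of that count follows, assuming three bypass diodes per panel (the typical figure, and only an assumption for your particular panels) and ignoring any extra joints along the run back to the inverter. The function name is hypothetical.

```python
def series_string_counts(n_panels, diodes_per_panel=3):
    """Count the places a simple series string can develop a fault.

    Assumes one mated MC4 joint between each pair of panels, one at each
    end of the string, and a fixed number of bypass diodes per panel.
    """
    mc4_joints = n_panels + 1
    bypass_diodes = n_panels * diodes_per_panel
    return mc4_joints, bypass_diodes


joints, diodes = series_string_counts(12)
print(joints, diodes)
```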

My System Layout (Electrically)

My system is simple enough, electrically. I have four “main strings” of 12 panels each on two A-frames, with one side pointing east and the other west. Both strings from an A-frame feed a single Sunny Boy 6.0-US inverter, and those are (on paper) absolutely identical. Then I have a separate string of 6 panels, facing south, feeding a Sunny Boy 3.0-US.

Nothing complex, and the system simply doesn’t deal with partial shading outside early morning or late evening, when the two arrays shade each other somewhat (but it shades everything, so production just drops off almost entirely).

I’ve documented the build in some detail, if you’re interested in learning more about it.

Ignore Intermittent Errors!

If your system has thrown an arc fault error for the first time and recovered, my totally awful advice you should definitely not listen to is to simply ignore it. Nothing is more frustrating than trying to track down a problem that shows up once every six months or less, and the chances of you actually finding the error are exceedingly slim unless it’s something super obvious - at which point it will be failing regularly enough to track down.

One of two things will happen: It will get worse, or it will not happen again. If the first, your job is easier. If the second, well… you’ve not spent time chasing phantoms.

The best kind of problem to have is the sort where nothing works at all - because you can go through and check connections until you find something that doesn’t work. A good voltmeter is helpful here, though if you have an open connection, a visual inspection is likely to discover it.

Much more frustrating to track down are the exceedingly intermittent problems. In my case, I got an arc fault error or two on one of my strings, on one of my arrays. The first one was well over a year back. I checked connections, found a screw connection that wasn’t fully tight, figured I was done with it, and then… they kept happening this spring, but so infrequently as to be virtually impossible to do anything useful about.

In this case, I wasn’t particularly concerned about any sort of real damage (beyond what was causing the problems) - the arc fault detection shuts things down at the first hint of a series arc, so… I did my usual thing for intermittent faults that I don’t expect to do real damage: Absolutely nothing.

Once I got into the hot weather this summer, I started seeing multiple arc fault warnings a day some days, and associated loss of production (the inverter trips off for 10 minutes per arc fault detection, with both strings dropping out), so I set about to find the problem and fix it. If the problem will show up every afternoon for a couple hours, running it to ground is an awful lot… if not easier, at least slightly less maddening.
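To get a rough feel for what those 10 minute trips cost, here is a back-of-the-envelope sketch. The 10 minute lockout comes from the behavior described above; the trip count and the average power during the outage are invented inputs, and the function name is hypothetical. It ignores the ramp back up after each restart, so treat the result as a floor rather than a measurement.

```python
def lost_energy_kwh(trips, avg_power_kw, trip_minutes=10):
    """Estimate energy lost to inverter shutdowns.

    trips: number of arc fault shutdowns in the day
    avg_power_kw: what the inverter would have produced during the outages
    trip_minutes: lockout time per trip
    """
    return trips * trip_minutes / 60 * avg_power_kw


# Invented example: 12 trips on an inverter that would have averaged 5kW.
estimate = lost_energy_kwh(trips=12, avg_power_kw=5.0)
print(f"{estimate:.1f} kWh")
```

With those made-up inputs, a dozen trips through the middle of a sunny day is enough to land in the same neighborhood as the roughly 10kWh bad day mentioned earlier.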

Solar String Safety

If you’re running down arc faults, be safe. You’ll notice that MC4 connectors are marked, “Do Not Disconnect Under Load.” Various other bits of solar equipment are marked the same. The reason? If you pop a connection open under 350-450VDC, with 8-10A flowing, you will get an arc going! And that will be a bad day for your eyes if you’re up close - assuming it doesn’t light other stuff on fire. You’ve got a welding arc, without any of the stuff intended for use with welding.

You must cut current to a string (with something rated to break DC current) before you work with it! Inverter disconnects on the Sunny Boys are rated to do this. And it’s a really good idea to pop any other disconnects you’ve got. If you’re working on those disconnects, go disconnect one of the MC4 connections in the string once current is zero. At that point, there’s no path for current to flow, so you should avoid anything particularly interesting. You don’t want a surprise solar short circuit if you’re working on panels during the day.

The safest option, of course, is to work at night. You may have a bit of voltage lying around from the moon or ambient light, but it’s reasonably unlikely to do anything terribly nasty. However, if you’re not comfortable working with high voltage systems and reasoning about disconnects and how to keep yourself safe… you probably shouldn’t be doing this sort of thing. Sorry. I try to keep my blog reasonably safe in terms of projects, but screwing around with 600V, multi-kW solar strings just isn’t the place to learn electronics work. Call your installer.

Thermal Imaging

My next bit of advice here is that your life will be radically easier if you have access to a thermal imaging device. You can usually rent them for a day from various rental places, some libraries have them available, and I don’t care that it costs $100 or so to rent, you’re likely to actually find your problem far faster. The technique is simple: “Look for the things that are far hotter than the rest of the similar things while the array is operating and go investigate those hot things.”
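If you end up jotting down spot temperatures for a row of similar things (junction boxes, MC4 connectors), the "far hotter than the rest" rule can be written down as a comparison against the median. This is a sketch, not a calibrated procedure: the 10 degree margin, the labels, and the function name `hot_spots` are assumptions for illustration, and readings should be taken under similar sun and load to be comparable.

```python
from statistics import median


def hot_spots(temps_c, margin_c=10.0):
    """Return labels whose temperature exceeds the group median by margin_c.

    temps_c maps a label (e.g. a panel position) to a reading in Celsius.
    The median is used so one blazing hot item does not drag the baseline up.
    """
    baseline = median(temps_c.values())
    return sorted(name for name, t in temps_c.items() if t - baseline > margin_c)


# Invented junction box readings from one string.
readings = {
    "panel01": 52.0,
    "panel02": 53.5,
    "panel03": 51.0,
    "panel04": 78.0,
    "panel05": 54.0,
}
flagged = hot_spots(readings)
print(flagged)
```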

I’ll have some examples of this going forward, because I finally found the problem with one, after having spent quite a while having zero success running the location of the fault down. The bad error messages didn’t exactly help…

I’ll be reviewing my unit, an HTI HT-19, in a few weeks.

Fuses and DC Wiring

For no good reason beyond “They’re the easiest to check and probably have some screw terminals,” if you’re hunting these problems, it’s worth checking the assorted connections in the system you can easily access. For me, this is mostly the fuse boxes at the A-frame end of the system - there are a lot of connections in this little box, and the first time I saw an arc fault warning, I found a “looser than I’d prefer it” screw on one of these fuse holders. I tightened it, and the problems went away. Solved, right?

So, check all this stuff. Pop your fuses, look for carbon arc traces or corrosion. Check your screws. Tug on wires going into them (you did read the above section and depower the array, right?).

While you’re doing this, reseat the wires into your spring connectors at the inverter. Remove the connector, shove a small screwdriver into the hole above the wire spring, and you’ll release tension and be able to pull the wire out. Check for any indications of a problem, and shove the wire back in. Again, it’s not likely to be your problem, but it’s quick and easy to check.

And reseat your AC side, too. It’s rare, but sometimes a fault on the AC side can be coupled over (wires are antennas!) and trigger the DC side arc detectors. Just reseat everything you can get at.

Oh, and while I was getting photos, a ground wasp decided to say hi!

MC4 Connectors

Fortunately and unfortunately, the most common place for a system to develop a fault is in the MC4 connectors. Unfortunately, you have a lot of these, and they’re not all interoperable. Fortunately, they’re easy to replace with the right tools.

If you’ve only ever wired solar panels with pre-crimped MC4 connectors, you’ve probably never torn one apart. They’re fairly simple, but, unfortunately, there are a lot of “MC4 Compatible” clones on the market that just don’t quite mate properly in the long term.

Each end of the connector is a two piece plastic housing, with a metal crimp terminal contained within. When seated, the terminals press into each other, and they’re rated to 20-30A - well beyond the typical 10A short circuit current of a PV panel. But, if the metal oxidizes, tolerances are loose, or things aren’t properly seated, they can cause problems.

The plastic bits simply exist to give you a way to grip and connect the core metal bits. Ideally, they keep the water out so the connections don’t corrode - although not all are quite tightly sealed enough to do this. An o-ring seals the joint, and the screw-on caps compress glands at both ends firmly around the PV wire.

Inside, the metal terminals should be firmly crimped around the PV wire - it’s a crimped, not soldered, connection.

Inside the female connector, springy contacts press hard against the male connector and transfer the current. Given too much heat from a weak connection, these springs can collapse or corrode, and then you’ve got bigger problems.

Take your MC4 tool and pop all of your connections apart (not under load!), inspect them for damage/corrosion (a good head mounted flashlight is useful here) or signs of arcing, and put them back together. Personally, I think it’s worth putting them back together having rotated one 180 degrees, so that if there is a bad spot in the metal you can’t see, it won’t mate with the corresponding bad spot. Watch for connectors that are really hard to get apart - they may be slightly welded from arcing. Or they’re just tight and the O-ring is stuck. Just look for anything weird.

Of course, if you have access to a thermal imager, inspect the connections while the array is running at peak current. Look for any that are abnormally hot - and if they are, just replace both sides.

One place to inspect closely is anywhere “MC4 Compatible” connectors from potentially different manufacturers join. On a string system, this is likely to be where the panel string joins the PV extender cables, unless they came from the same place. NEC 2020 requires that both halves of mating connectors either be from the same manufacturer, or be “rated interoperable” - which probably won’t happen, so make sure they’re the same manufacturer. I have my problems with the NEC, but this one is exceedingly sane, and based on actual fire investigations. So, if you’re not sure, you might re-terminate your end-of-panel-string MC4 connector and use the same brand on your PV extension wire run to the fuses. If you’re assembling a new array, just assume they don’t match and put new ends on at that point.

Junction Boxes/Bypass Diodes

The next place to look (and where I found my problem) is the junction boxes on the back of the panels. These typically contain a few connections and the bypass diodes. A fault here will stand out on the thermal imaging like you wouldn’t believe - though an IR thermometer should also get you something of interest if it has a narrow enough beam to miss the blazing hot panels behind the junction box. Depending on the nature of the fault here, it may silently be losing production energy as heat, it might be a loose connection arcing, or it might be a diode doing something goofy and putting noise on your lines that’s getting detected as an arc fault.

Inside these boxes, you will, in some configuration or another, have three diodes (possibly more if you have half cell panels). These can fail in a few different ways, and while it’s hard to detect one failed open without some external test equipment, the other cases (bad connection or a shorted diode) will both stand out as “far hotter than normal.”

This is a normal junction box from my array. Notice how the junction box is far cooler than the panel material.

This is one of several that’s running rather hotter than the rest. The plastic isn’t terribly thermally conductive, so they just feel “really hot, like all the rest” to the fingers, but the thermal imager doesn’t lie - it’s running abnormally hot. The junction box is the hottest thing in the image, and one half of the box is far hotter than the rest of it. This is a box worth inspecting more closely.

Depending on the failure modes involved, you may also see something in the panel face thermal images. I found that a third of one of my south facing panels was running hotter than the rest. I initially thought it was a shorted diode, but it seems to have been an open connection in the (hot running…) junction box, as a shorted section will “checkerboard” rather more than “just run a bit hot.” But either fault - shorted cells or open connections - will show up incredibly clearly in thermal. Again, I can’t feel the difference with my hand here (I had no idea there was a fault in the south facing array!), but it’s glaringly obvious in thermal.

When a section of panels is shorted, the variation in the silicon will show a distinctive pattern, more along these lines. If you see a third of a panel doing this, that would indicate a shorted diode. A uniformly hot strip (or panel) just means that the region isn’t producing anything. Yes, a panel producing energy runs cooler than one that’s just sitting in the sun!

I found my fault in a junction box. It was quite disturbing, really. What can be done? That’s a post for two weeks from now - but, in the words of the Mythbusters, “Well, there’s your problem!” However, I assure you, this horror show isn’t fatal to the panel!

The Panels

If there’s nothing wrong in your junction boxes, start inspecting the panels. Here, you’re really going to need some sort of thermal imager - there’s no reasonable way to feel for hot cells on a blazing hot panel, and you don’t want to be shading the panels while you’re testing, either.

A fault in the interconnect ribbons can show up as an arc fault (probably because it’s actually arcing internally), and I’ve heard from several sources that a particularly bad “hot cell” (a cell running far hotter than the neighbors from one of a variety of failure modes) can also trip arc fault detection. If you suspect a panel in this step, try bypassing it to see if the problem goes away.

Unfortunately, any fault this far into a panel is just going to require replacing the panel - there’s no good way to repair a faulty panel at this level.

Bypassing Panels

If you think you’ve found a problem, the next step is to verify it. If you have a number of strings in parallel joined together, you may want to shorten all the strings, but if you have (as I do) strings connected to separate MPPT inputs on the inverter, you can just bypass a panel or set of connections and see if the fault goes away. The Sunny Boy inverters can handle a range of voltages and different strings need not agree. I’ve run “down two panels” while testing, and other than the reduced production, the system operates just fine.

You’ll need about a 10’ chunk of PV extension wire with MC4 connectors, and the process is simple - just bypass a panel. Or, if you’re suspecting some connections, bypass two panels. This way, not only do you skip faults in the panels, you skip an entire connector in the middle that could be causing the problem.

If you don’t have a thermal imager and can’t find anything obvious visually, this is likely to be your next step as well - you can isolate faults this way. If the array is misbehaving daily and runs fine for a week with a set of panels bypassed, it’s a really good time to dig far deeper into those panels and see what’s going on.
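Working through a string two panels at a time is easier to keep straight with a written plan. The sketch below lays out the bypass windows for a string and picks out the ones worth a closer look once you have results. It assumes you bypass one window per trial and record whether the fault showed up during that trial; the function names and the recorded results are hypothetical.

```python
def bypass_windows(n_panels, width=2):
    """List 1-based (first, last) panel positions to bypass, one per trial."""
    windows = []
    for first in range(1, n_panels + 1, width):
        last = min(first + width - 1, n_panels)
        windows.append((first, last))
    return windows


def suspects(trial_results):
    """Return the windows whose bypass made the fault go away.

    trial_results maps a window to True if a fault was still seen
    while that window was bypassed.
    """
    return [window for window, fault_seen in trial_results.items() if not fault_seen]


plan = bypass_windows(12)
print(plan)

# Invented results: the fault only stayed away with panels 7 and 8 bypassed.
results = {window: True for window in plan}
results[(7, 8)] = False
print(suspects(results))
```

Bypassing two at a time covers the connector between them as well, which is why the default width here is 2 rather than 1.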

Annoyingly (almost certainly by design), most panels don’t come with long enough leads to actually skip a panel without the extension cable. It’s a nice sanity check during install, though - it does make it virtually impossible to wire things wrong with keyed connectors!


Hopefully you’ve found the problem by now, because I’m now off into “rare but not yet impossible” sorts of faults.

Wires are antennas. Antennas are wires. Interference can travel between wires. If there’s some strong source of interference in the area, it’s possible that your arc faults are a result of that. Some big motor switching nearby could potentially do it, or if you have a fault on the AC side wiring that’s popping a bit, your DC side might pick it up.

The Inverter

Finally, it’s possible that your inverter’s arc fault detection has just gone haywire. I initially thought this was the case after finding no faults at all on string A, disconnecting string A entirely, and still getting faults reported on string A. Then I talked to a different tech who told me about the error message nonsense, and I was able to find a fault on string B.

A good test here, if you’ve got multiple inverters and strings, is to just swap the inverters and communication modules around. For the SMA Sunny Boy inverters, it doesn’t take long at all to swap the power section and comms board - I swapped my two inverters in about 20 minutes. SMA’s suggestion was to move the comms boards (the interface boards in the lower section, black thing in the center here) with the power units, otherwise they’d have to resync and it would be a pain, but a couple screws does everything.

If the fault follows the inverter, there’s a fine chance the fault is with the inverter. If the fault follows the panel array, well… look harder, because it’s not the inverter.

Disabling Arc Fault Detection

Finally, if you have a ground mount array (as I do), you’ll see and occasionally hear advice to the tune of, “Well, you’re not required to have arc fault detection, just disable it and see what happens.” This is an option. And it is a reasonable option, if you’re entirely out of ideas, have convinced yourself the entire array is in perfect condition, nothing is getting the slightest bit hot, and you can’t figure anything else out. Yes, I just said you should have a thermal imager before you try this.

If you’re going to do this, treat it as a “Give the fault a chance to get worse while looking for it” sort of thing, not a “Eh, whatever, it’ll be fine…” response to a troublesome fault. If you do have an arc fault, and you ignore it, something will get very, very hot - and you probably don’t want your array on fire.

If you are absolutely out of ideas, cannot find anything, have thermally imaged the whole array, front and back, and have zero ideas, then, yes. Turn it off as a step in troubleshooting, and see what you can find. It’s likely to make the problem stand out, for sure!

Final Thoughts

You may very well sense some frustration in this post, and you’d be entirely right. I’m pretty well pissed at SMA that they (a) have specific-but-wrong error messages, and (b) never bothered to document this anywhere a mere mortal can find. I spent a LOT of time hunting a fault in the string it very specifically told me was faulting, even though that wasn’t the string that was faulting. A single support page saying, “Yes, we’re sorry, but this error message really means any string!” would have saved me an awful lot of sanity and quite a few hours on hold with support - and even the L2 support doesn’t consistently know this is a problem!

But what this really boils down to is simple. If you’re in charge of maintaining string arrays - personally, professionally, as a hobby, whatever - you need a thermal imager. Solar troubleshooting without one is damned hard. Solar troubleshooting with one is easy mode. Find the hotter thing, fix the hotter thing. Repeat until the system is happy.

Now, what am I doing about these funky junction boxes? For that, you’ll have to wait two weeks! And, yes, I plan to review the thermal imager as well, because it’s just as cool as I hoped it was, and is useful for a wide range of things beyond solar.

And I’d ask, if this has been helpful to you or you’ve found something else responsible for a fault in your array, please, add a comment. There’s just not much about actual arc fault troubleshooting on the internet!

