16. March 2010 · Categories: Rant, Techie

So last night I spoke of some issues I was having with the FreeNAS installation I’ve been working on. Things aren’t much better, and so far they’re even less conclusive. At this point I seem to be discovering more issues than I’m resolving.

    Let me state my issues more precisely:

  • After large transfers (particularly with files >4GB in size) are initiated, the network connection hangs before the transfer finishes; I’ve never finished a 50+GB transfer yet
  • I cannot access the fileserver in the least: no webGUI, no Transmission webGUI, no SSH
  • Pinging the fileserver results in 100% packet loss
  • The terminal on the fileserver box itself is still functional; I can reboot the system and issue other commands just fine
  • Restarting the network interface restores functionality (ifconfig re0 down followed by ifconfig re0 up)
  • This was first seen using AFP for file transfers; NFS fared no better yesterday evening, nor did CIFS/SMB
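
The ping check and interface restart from the list above could be tied together in a small script. What follows is only a minimal Python sketch under stated assumptions: the interface name `re0` comes from the list, the function names `parse_packet_loss`, `needs_bounce`, and `bounce_commands` are hypothetical, and the sample summary line is made up for illustration rather than copied from a real log. It parses the "% packet loss" summary that `ping` prints and builds the two `ifconfig` commands, but deliberately does not execute them.

```python
import re


def parse_packet_loss(ping_output: str) -> float:
    """Return the packet-loss percentage from ping's summary line."""
    match = re.search(r"(\d+(?:\.\d+)?)% packet loss", ping_output)
    if match is None:
        raise ValueError("no packet-loss summary found in ping output")
    return float(match.group(1))


def needs_bounce(ping_output: str, threshold: float = 100.0) -> bool:
    """True when the reported packet loss is at or above the threshold."""
    return parse_packet_loss(ping_output) >= threshold


def bounce_commands(iface: str) -> list[list[str]]:
    """Commands that take an interface down and then bring it back up."""
    return [["ifconfig", iface, "down"], ["ifconfig", iface, "up"]]


# Illustrative summary line (an assumption, not output captured from the server).
sample = "5 packets transmitted, 0 packets received, 100.0% packet loss"

if needs_bounce(sample):
    for cmd in bounce_commands("re0"):
        # Only print the commands; running them needs root on the server console.
        print(" ".join(cmd))
```

Since the server is unreachable over the network when it hangs, something like this would have to run on the fileserver itself (pinging the router, say) rather than from a client machine.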

    Hardware operating the machine (gutted from old machines, but runs stably):

  • Asus M4A78LT-M LE motherboard
  • Athlon II X3 2.6GHz CPU unleashed to run as a (reported) Phenom II X4 B25 2.7GHz CPU
  • 2GB DDR3 1066MHz RAM
  • 6x SATAII Seagate drives
  • Netgear GA311 Gigabit NIC

Going to the FreeNAS forums in search of some help wasn’t very productive. As with many community-driven projects, the forum regulars really emphasize searching and self-education first. So, in trying to identify my problem, the “Q: When I tried drag and drop or copy and paste the server started transfer/copy then reboots/freezes/hangs. Is there a way to fix this?” question seemed most akin to my problems. Looking through the responses, I knew I wasn’t dumb enough to share the root folder. I ran a memory tester overnight, and all thirteen passes completed error-free. This morning before I left for work, I ran a stress tester on the CPU (as it’s an unlocked CPU…lord knows what state that fourth core was in) and it crunched 350+TFLOP-worth without any failed iterations, so it did not look like a CPU issue either. That basically left me wondering whether it’s the NIC or the software crapping out.

I stumbled across some information earlier this evening suggesting that BitTorrent causes stability issues with FreeNAS. After shutting down the BitTorrent service, I finally broke the 8-10GB mark I had been stuck at for a while. I made a solid transfer of ~23GB of mixed files to the ZFS pool without a hitch! I bumped it up to a ~60GB transfer, and it hung just after passing the 50GB mark. Unlike before, the webGUI was still responsive and I could SSH into the box¹. After restarting, I said to hell with it (the logs weren’t really telling me anything) and decided to switch to an old 10/100 Netgear FA312 NIC to see whether the problem is specific to the gigabit card.

I dumped a 95+GB transfer from the Finder into one of the ZFS pools. We’re currently sitting around 32GB transferred and still rolling at ~10.5MB/s. I’m not a huge fan of taking this huge hit from the ~40MB/s I was getting over 1000BASE-T, but 100BASE-TX is better than nothing if it will stably transfer everything. If the FA312 manages to survive this, then I’ve obviously got a few issues on my hands:

  1. I need to tweak the Netgear GA311 drivers/interface commands to get it to survive these enormous transfer loads
  2. If tweaking the drivers/interface doesn’t work, I may have a defective card
  3. I need to figure out whether Transmission (the BitTorrent service) destabilizing the box is a bug or a misconfiguration of the service
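
To put the speed hit in perspective, here is a rough back-of-the-envelope calculation in Python using the figures quoted above (95GB total, 32GB done, ~10.5MB/s versus ~40MB/s). It is a sketch only: the helper name `transfer_hours` is hypothetical, and it assumes decimal units (1GB = 1000MB) and a constant transfer rate, neither of which is guaranteed in practice.

```python
def transfer_hours(size_gb: float, rate_mb_per_s: float) -> float:
    """Hours needed to move size_gb at a steady rate_mb_per_s.

    Assumes decimal units (1 GB = 1000 MB) and a constant rate.
    """
    seconds = size_gb * 1000 / rate_mb_per_s
    return seconds / 3600


total_gb = 95      # size of the transfer
done_gb = 32       # transferred so far
fast_rate = 40.0   # approximate MB/s on the gigabit card
slow_rate = 10.5   # approximate MB/s on the 10/100 card

print(f"whole transfer at {fast_rate} MB/s: {transfer_hours(total_gb, fast_rate):.1f} h")
print(f"whole transfer at {slow_rate} MB/s: {transfer_hours(total_gb, slow_rate):.1f} h")
print(f"remaining at {slow_rate} MB/s: {transfer_hours(total_gb - done_gb, slow_rate):.1f} h")
```

Under those assumptions the whole 95GB takes roughly 2.5 hours on the slow card versus about 0.7 hours on the gigabit card, with about 1.7 hours left on the current run.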

I obviously have some work to do to get this box stabilized and running at (expected) spec. Times like this make me either a) wish I had played around with Unix and BSD more back in the day, or b) wish I weren’t such a nerd with this much attention to detail.

¹ Logging in over SSH lagged considerably, and even once logged in, the response to typed characters was like molasses in North Dakota. An smbd process was stuck in sbwait for lord knows how long. Killing the process didn’t help; I eventually had to reboot the machine.
