Friday, July 22, 2016

Feedly:Errata Security. My Raspberry Pi cluster



from Errata Security


So I accidentally ordered too many Raspberry Pis. Therefore, I built a small cluster out of them. I thought I'd write up a parts list for others wanting to build a cluster.

To start with, here are some pics of the cluster. What you see is a stack of 7 RPis. At the bottom of the stack is a USB multiport charger and also an Ethernet hub. You see USB cables coming out of the charger to power the RPis, and out the other side you see Ethernet cables connecting the RPis to a network. I've included the mouse and keyboard in the picture to give you a sense of perspective.

Here is the same stack turned around, seen from the other side. Out the bottom left you see three external cables: one Ethernet cable to my main network, and power cables for the USB charger and Ethernet hub. You can see that the USB charger is nicely tied down to the frame, but that the Ethernet hub is just sort of jammed in there somehow.

The concept is to get things as cheap as possible, on a per-unit basis. Otherwise, one might as well just buy more expensive computers. My parts list for a 7x Pi cluster is:

$35.00/unit Raspberry Pi

$6.50/unit stacking case from Amazon

$5.99/unit micro SD flash from Newegg

$4.30/unit power supply from Amazon

$1.41/unit Ethernet hub from Newegg

$0.89/unit 6 inch and 1-foot micro USB cable from Monoprice

$0.57/unit 1 foot Ethernet cable from Monoprice

...or $54.65 per unit (or $383 for the entire cluster), or around 50% more than the base Raspberry Pis alone. This is getting a bit expensive, as Newegg always has cheap Android tablets on closeout for $30 to $50.
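For anyone pricing out their own stack, here is a minimal Python sketch of that arithmetic. The prices are the per-unit figures from the parts list; the variable names are mine. Note that the listed prices add up to $54.66, a cent off the figure quoted above, presumably because the shared parts (charger, hub) were divided by seven and rounded.

```python
from decimal import Decimal

# Per-unit prices as given in the parts list (shared parts already divided by 7).
PARTS = {
    "Raspberry Pi": Decimal("35.00"),
    "stacking case": Decimal("6.50"),
    "micro SD flash": Decimal("5.99"),
    "power supply": Decimal("4.30"),
    "Ethernet hub": Decimal("1.41"),
    "micro USB cable": Decimal("0.89"),
    "Ethernet cable": Decimal("0.57"),
}
UNITS = 7  # size of the stack

per_unit = sum(PARTS.values(), Decimal("0"))
cluster_total = per_unit * UNITS
# How much the extras add on top of the bare board, as a fraction.
overhead = float(per_unit / PARTS["Raspberry Pi"] - 1)

print(f"per unit: ${per_unit}")
print(f"cluster:  ${cluster_total}")
print(f"extras add {overhead:.0%} to the bare board price")
```

Swapping in your own prices (or a different `UNITS`) shows quickly how much of the budget goes to everything that isn't the board itself.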

So here's a discussion of the parts.

Raspberry Pi 2

These are old boards I'd ordered a while back. They are up to RPi3 now with slightly faster processors and WiFi/Bluetooth on board, neither of which is useful for a cluster. The RPi2 has four CPU cores each running at 900 MHz, as opposed to the RPi3, which has four 1.2 GHz cores. If you order a Raspberry Pi now, it'll be the newer, better one.


The case

You'll notice that the RPis are mounted on acrylic sheets, which are in turn held together with standoffs/spacers. This is a relatively expensive option.

A cheaper solution would be just to buy the spacers/standoffs yourself. They are a little hard to find, because the screws need to fit the 2.9mm holes, which are unusually tiny. Such spacers/standoffs are usually made of brass, but you can also find nylon ones. For the ends, you need some washers and screws. This will bring the price down to about $2/unit -- or a lot cheaper if you are buying in bulk for a lot of units.

The micro-SD

The absolute cheapest micro SD cards I could find were $2.95/unit for 4 GB, or half the price of the ones I bought. But the ones I chose are 4x the size and 2x the speed. RPi distros are getting large enough that they no longer fit well on 4 GB cards, and are even approaching 8 GB. Thus, 16 GB cards are the best choice, especially when I could get them for $6/unit. By the time you read this, the price of flash will have changed up or down. I search on Newegg, because that's the easiest way to focus on the cheapest. Most cards should work, but check http://ift.tt/XsZp5v to avoid any known bad chips.

Note that different cards have different speeds, which can have a major impact on performance. You probably don't care for a cluster, but if you are buying a card for a development system, get the faster ones. The Samsung EVO cards are a good choice for something fast.

USB Charging Hub

What we want here is a charger, not a hub. Both can work, but the charger works better.

A normal hub is about connecting all your USB devices to your desktop/laptop. That's not what the RPi uses it for -- the connector is just for power. The RPi is just leveraging the fact that there are already lots of USB power cables/chargers out there, so that it doesn't have to invent a custom one.

USB hubs can supply some power to the RPi, enough to boot it. However, under load, or when you connect further USB devices to the RPi, there may not be enough power available. You might be able to run a couple of RPis from a normal hub, but when you've got all seven running (as in this stack), there might not be enough power. Power problems can outright crash the devices, but worse, they can lead to things like corrupt writes to the flash drives, slowly corrupting the system until it fails.

Luckily, in the last couple of years we've seen suppliers of multiport chargers. These are designed for families (and workplaces) that have a lot of phones and tablets to charge. They can charge high-capacity batteries on all ports -- supplying much more power than your RPi will ever need.

If you want to go ultra cheap, then cheap hubs at $1/port may be adequate. Chargers cost around $4/port.

The charger I chose in particular is the Bolse 60W 7-port charger. I need exactly 7 ports. More ports would be nicer, in case I needed to power something else along with the stack, but this Bolse unit has the nice property that it fits snugly within the stack. The frame came with extra spacers which I could screw together to provide room. I then used zip ties to hold it firmly in place.

Ethernet hub

The RPis only have 100 Mbps Ethernet. Therefore, you don't need a gigabit hub, which you'd normally get, but can choose a 100 Mbps hub instead: it's cheaper, smaller, and lower power. The downside is that while each RPi only does 100 Mbps, combined they will do 700 Mbps, which the hub can't handle.

I got a $10 hub from Newegg. As you can see, it fits within the frame, though not well. Every gigabit hub I've seen is bigger and could not fit this way.

Note that I have a couple extra RPis, but I only built a 7-high stack, because of the Ethernet hub. Hubs have only 8 ports, one of which is needed for the uplink. That leaves 7 devices. I'd have to upgrade to an unwieldy 16-port hub if I wanted more ports, which wouldn't fit the nice clean case I've got.

For a gigabit option, Ethernet switches will cost between $23 and $35. That $35 option is a "smart" switch that supports not only gigabit, but also a web-based configuration tool, VLANs, and some other high-end features. If I paid more for a switch, I'd probably go with the smart/managed one.

Cables (Ethernet, USB)

Buying cables is expensive, as everyone knows who's bought an Apple cable for $30. But buying in bulk from specialty sellers can reduce the price to under $1/cable.

The chief buying factor is length. We want short cables that will just barely be long enough. In the pictures above, the Ethernet cables are 1-foot, as are two of the USB cables. The colored USB cables are 6-inches. I got these off Amazon because they looked cool, but now I'm regretting it.

The easiest, cheapest, and highest quality place to buy cables is Monoprice.com. It allows you to easily select the length and color.

To reach everything in this stack, you'll need 1-foot cables, though 6-inch cables will work for some (but not all) of the USB devices. If, instead of putting the hubs on the bottom, I'd put them in the middle of the stack, then 6-inch cables would've worked better -- but I didn't think that'd look as pretty.

Power consumption

The power consumption of the entire stack is 13.3 watts while it's idle. The Ethernet hub by itself was 1.3 watts (so low because it's 100-mbps instead of gigabit).

So, rounding up, that's about 2 watts per RPi while idle.

In previous power tests, it's an extra 2 to 3 watts per RPi while doing heavy computations, so for the entire stack, that can start consuming a significant amount of power. I mention this because people think in terms of a low-power alternative to Intel's big CPUs, but in truth, once you've gotten enough RPis in a cluster to equal the computational power of an Intel processor, you'll probably be consuming more electricity.
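To put rough numbers on that, here is a small Python sketch using only the measurements above. It assumes the 2 to 3 extra watts apply per RPi, and it lumps the charger's own losses into the per-RPi idle figure, since I only measured the stack as a whole.

```python
# Measurements from this post.
STACK_IDLE_W = 13.3        # whole stack, idle
HUB_IDLE_W = 1.3           # Ethernet hub by itself
NUM_PIS = 7
EXTRA_LOAD_W = (2.0, 3.0)  # extra draw per RPi under heavy computation

# Idle draw attributed to each RPi (includes its share of charger losses).
idle_per_pi = (STACK_IDLE_W - HUB_IDLE_W) / NUM_PIS

def stack_under_load(extra_per_pi):
    """Estimated draw of the whole stack when every RPi is busy."""
    return HUB_IDLE_W + NUM_PIS * (idle_per_pi + extra_per_pi)

low, high = (stack_under_load(extra) for extra in EXTRA_LOAD_W)

print(f"idle per RPi: {idle_per_pi:.2f} W")
print(f"stack under load: {low:.1f} W to {high:.1f} W")
```

So the stack roughly doubles its draw under load, landing somewhere around 27 to 34 watts by this estimate.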

The operating system

I grabbed the latest Raspbian image and installed it on one of the RPis. I then removed the card, copied the files off (cp -a), reformatted it to use the f2fs flash file system, then copied the files back on. I then made an image of the card (using dd), then wrote that image to 6 other cards. I then logged into each one and renamed them rpi-a1, ..., rpi-a7. (Security note: this means they all have the same SSH host private key, but I don't care).
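The imaging step is easy to get wrong (dd will happily overwrite the wrong disk), so here is a hedged Python sketch that only builds and prints the dd commands and hostnames as a dry run; it executes nothing. The device path `/dev/sdX`, the image name `rpi.img`, the `bs=4M` block size, and the helper name `clone_plan` are all placeholders of mine, not what I actually typed.

```python
import shlex

def clone_plan(source_dev, image, target_devs, prefix="rpi-a"):
    """Build dd commands to image one card and write it to the others.

    Returns (commands, hostnames). Nothing is executed here.
    """
    # Read the prepared card into an image file once...
    commands = [["dd", f"if={source_dev}", f"of={image}", "bs=4M"]]
    # ...then write that image to each remaining card.
    for dev in target_devs:
        commands.append(["dd", f"if={image}", f"of={dev}", "bs=4M"])
    # One hostname per card, counting the source card as well.
    hostnames = [f"{prefix}{i}" for i in range(1, len(target_devs) + 2)]
    return commands, hostnames

# Placeholder device: each target card is inserted into the same reader in turn.
commands, hostnames = clone_plan("/dev/sdX", "rpi.img", ["/dev/sdX"] * 6)

for cmd in commands:
    print(shlex.join(cmd))
print(", ".join(hostnames))
```

Printing the plan first and then running the commands by hand keeps a human in the loop for the part where a typo in the device name destroys data.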

About flash file systems

The micro SD flash has a bit of wear leveling, but not enough. A lot of RPi servers I've installed in the past have failed after a few months with corrupt drives. I don't know why, but I suspect it's because the flash is wearing out.

Thus, I installed f2fs, a wear leveling file system designed especially for this sort of situation. We'll see if that helps at all.

One big thing is to make sure atime is disabled, a massively brain dead feature inherited from 1980s Unix that writes to the disk every time you read from a file.

I notice that the green LED on the RPi, indicating disk activity, flashes very briefly once per second (so quickly you'll miss it unless you look closely at the light). I used iotop -a to find out what was causing it. I think it's just a hardware feature and not related to disk activity. On the other hand, it's worth tracking down what writes might be happening in the background that will affect flash lifetime.

What I found was that there is some kernel thread that writes rarely to the disk, and an "f2fs garbage collector" that's cleaning up the disk for wear leveling. I saw nothing that looked like it was writing regularly to the disk.

What to use it for?

So here's the thing about an RPi cluster -- it's technically useless. If you run the numbers, it's got less compute power and higher power consumption than a normal desktop/laptop computer. Thus, an entire cluster of them will still perform slower than laptops/desktops.

Thus, the point of a cluster is to have something to play with, to experiment with, not that it's the best form of computation. The point of individual RPis is not that they have better performance/watt -- but that you don't need as much performance but want a package with very low watts.

With that said, I should do some password cracking benchmarks with them, compared across CPUs and GPUs, measuring power consumption. That'll be a topic for a later post.

In the meantime, I will be using these, though as individual computers rather than as a "cluster". There are lots of services I want to run, but I don't want to run a full desktop running VMware. I'd rather control individual devices.

Conclusion

I'm not sure what I'm going to do with my little RPi stack/cluster, but I wanted to document everything about it so that others can replicate it if they want to.
