I’ve been working on a Reddit-based site that’ll be hosted at Amazon EC2 and, as you probably know, Reddit currently only provides VMWare and Virtualbox images for testing and development. EC2 is an ideal test environment because I can boot an image, play geek god with my own Reddit clone and then shut it down as needed – all in a matter of minutes.
But first I needed a Reddit-ready Amazon EC2 AMI, so that making mistakes in between hacks wouldn’t require restarting the whole painstaking installation process over and over. I decided to share my Reddit AMI so that more developers will use the Reddit system; hopefully more sites will crop up using this impressive code base. I remember the excitement of reading the Slashdot code back in the day and how much I learned from that experience, especially since it was a system which handled millions of concurrent users with very few, if any, errors.
Log into your Amazon Web Services account, click on the EC2 tab, then on the AMI menu on the left. On the “Viewing:” filter, select “All Images” (this will take a while to load). Search for “reddit” and you’ll find a public AMI with the ID below.
Current AMI ID: ami-56e81c3f
SSH Username: ec2-user
Password: reddit
Disclaimer: You must tighten the security of this AMI if you wish to use it in a production environment. This AMI is provided as-is, 100% free of charge, no guarantees offered.
Note: You’ll need to open TCP port 9090 on your security group to view the site remotely. Then access http://YOUR_AMAZON_PUBLIC_DNS:9090/
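Once the port is open, a quick check from your own machine saves some guesswork. The sketch below is only an illustration: the function names reddit_url and check_reddit are made up for this example, and it assumes curl is installed locally.

```shell
# Build the URL of the Reddit instance; the port defaults to 9090 as in the note above.
reddit_url() {
    host="$1"
    port="${2:-9090}"
    printf 'http://%s:%s/\n' "$host" "$port"
}

# Print the HTTP status code returned by the instance.
# curl prints 000 if nothing answers within 10 seconds.
check_reddit() {
    curl -s -o /dev/null -m 10 -w '%{http_code}\n' "$(reddit_url "$@")"
}
```

Run check_reddit YOUR_AMAZON_PUBLIC_DNS. A 000 usually means the security group is still blocking the port or the server isn't running yet.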
If everything goes ok, you should be staring at your very own Reddit like the Reddit admins see it! Unlimited internets and endless karma. But feels kinda lonely, don’t you think?
Note: You need to configure it further to get it fully functional: modify your template and finish the installation. This AMI is just a starting point.

The first decision to be made was whether to download an official image for VMWare or Virtualbox and convert it to an AMI somehow, or create my own by building and installing the needed dependencies. By virtue of laziness, I went for the image conversion route. What could possibly go wrong?
First, I downloaded the VMWare virtual machine image, which is kindly provided by the Reddit folks here.
I found this article via Yahoo! and followed the instructions.
A coffee mug later, I was surprised by a message long forgotten in the days of cheap Terabyte disks: “Out of space on device”.
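It turns out "cat *.raw >> output.raw" is NOT a very good idea: output.raw itself matches *.raw, so cat can end up reading back what it has just written until the disk fills up. So watch your step when you copy and paste instructions from the web.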
After substituting output.raw with output.img and waiting about 15 minutes, the commands finally returned and I found a delicate 21 gigabyte elephant staring back at me. There was no way I could upload that to Amazon EC2, even bzip2'd down to 10. I didn’t dig much into why the qemu-img command did this, but it seems that every vmdk chunk is exploded into a 2 gigabyte image, so when you concatenate them all you add up all that unused image space into one large file. I guess the vmdk conversion route works best if you have a single-file VMWare image.
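In hindsight, a quick size check before concatenating would have spared me the “Out of space” surprise. Here is a minimal sketch of such a check. The helper names total_bytes, free_bytes and fits_in are made up for this example, and this is not what I actually ran.

```shell
# Total apparent size, in bytes, of the files given as arguments.
total_bytes() {
    total=0
    for f in "$@"; do
        size=$(wc -c < "$f")
        total=$((total + size))
    done
    echo "$total"
}

# Free space, in bytes, on the filesystem that holds the given directory.
free_bytes() {
    df -Pk "$1" | awk 'NR == 2 { printf "%.0f\n", $4 * 1024 }'
}

# Succeeds only if the files in $2... fit into the free space of directory $1.
fits_in() {
    target_dir="$1"
    shift
    [ "$(total_bytes "$@")" -le "$(free_bytes "$target_dir")" ]
}
```

For example, fits_in . *.raw || echo "not enough room here" tells you up front whether the concatenated image can fit in the current directory.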
Having hacked at the image conversion attempt for a couple of hours, I finally went down the conservative route: download and install packages. This is pretty boring stuff, so let’s get this step out of the way already.
Run sudo su - and proceed with the following steps as root.
Note: Even if you’re only a regular Amazon customer, you can still quickly sign up for AWS and get your EC2 instance running in minutes.
Below is the command history I used to get a base system together.
The Python module configuration wasn’t as straightforward as I wished, mainly because Reddit uses old versions of some packages which have been superseded by incompatible ones. WebHelpers, for example, needs to be 0.3 – you can’t go with the newer versions. The most recent WebHelpers do not include the webhelpers.rails.* subdirectory.
yum install python-setuptools
yum install python-imaging
wget http://www.cython.org/release/Cython-0.13.tar.gz
tar xzvf Cython-0.13.tar.gz
cd Cython-0.13
python setup.py install
yum install libevent
yum install libevent-devel
wget http://memcached.googlecode.com/files/memcached-1.4.5.tar.gz
tar xzvf memcached-1.4.5.tar.gz
cd memcached-1.4.5
./configure --prefix=/usr
make
make install
wget http://launchpad.net/libmemcached/1.0/0.44/+download/libmemcached-0.44.tar.gz
tar xzvf libmemcached-0.44.tar.gz
cd libmemcached-0.44
export CFLAGS="${CFLAGS} -march=i486" # without this, you'll probably run into undefined references to `__sync_fetch_and_add_4'
./configure --prefix=/usr
make
make install
# the biggest package is also the simplest!
yum install postgresql-server
yum install postgresql
# libs, pg_config script et al
yum install postgresql-devel
# libpqxx requires a build
# Find current links at http://pqxx.org/development/libpqxx/wiki/DownloadPage
wget http://pqxx.org/download/software/libpqxx/libpqxx-3.1.tar.gz
tar xzvf libpqxx-3.1.tar.gz
cd libpqxx-3.1
./configure --prefix=/usr
make
make install
cd ..
mkdir package
wget http://cr.yp.to/daemontools/daemontools-0.76.tar.gz
tar xzvf daemontools-0.76.tar.gz
cd admin/daemontools-0.76/
package/install
Here you may run into the following issue:
./load envdir unix.a byte.a
/usr/bin/ld: errno: TLS definition in /lib/libc.so.6 section .tbss mismatches non-TLS reference in envdir.o
/lib/libc.so.6: could not read symbols: Bad value
collect2: ld returned 1 exit status
make: *** [envdir] Error 1
Which, of course, you can fix by editing compile/error.h
Substitute extern int errno;
With #include <errno.h>
Rerun package/install – you should be fine now.
(What, no patch? Pardon the laziness once again, it’s a one line fix so you probably don’t need a patch for that.)
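If you'd rather script the edit than open an editor, the same substitution can be done with sed. This is just a sketch: it assumes GNU sed (for the -i flag) and that the declaration sits on a line of its own, and the function name fix_errno_decl is made up for this example.

```shell
# Replace the hand-rolled errno declaration with the system header, in place.
fix_errno_decl() {
    sed -i 's/^extern int errno;$/#include <errno.h>/' "$1"
}
```

From the daemontools directory, fix_errno_decl compile/error.h followed by package/install does the same thing as the manual edit above.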
# Some required XML libs
yum install libxml2
yum install libxml2-devel
yum install libxslt
yum install libxslt-devel
# Now let’s get Erlang installed. We’ll need it for RabbitMQ Server
# First, we need curses
yum install ncurses-devel
# This will take a while…
wget http://www.erlang.org/download/otp_src_R14B_erts-5.8.1.1.tar.gz # source is over 57MBytes – HUGE!
tar xzvf otp_src_R14B_erts-5.8.1.1.tar.gz
cd otp_src_R14B
./configure --prefix=/usr
make
make install
# Get RabbitMQ Server from rpm package
wget http://www.rabbitmq.com/releases/rabbitmq-server/v2.1.0/rabbitmq-server-2.1.0-1.noarch.rpm
rpm --nodeps -i rabbitmq-server-2.1.0-1.noarch.rpm # unfortunately we need nodeps because we installed erlang from source
Head over to the Cassandra site, click on Download for the mirror list, copy the link location of the closest mirror and, as usual:
wget CLOSESTMIRROR
tar xvzf apache-cassandra-0.6.5-bin.tar.gz
cd apache-cassandra-0.6.5
mkdir -p /var/log/cassandra
chown -R `whoami` /var/log/cassandra
mkdir -p /var/lib/cassandra
chown -R `whoami` /var/lib/cassandra
# Fire Cassandra up
./bin/cassandra -f
Pfeeew. Now we’re done with the pre-requisites.
Now, let’s follow the Reddit setup instructions to get an initial instance up and running.
I’ve run virtual Apache servers since 1997, and my usual setup is to have public files under /www/sites, but you may choose any other location. Basically, this varies with every UNIX distribution and every system I’ve seen, so feel free to adopt your own strategy here.
cd r2
sudo python setup.py develop # look, ma, no hands!
Here you may run into trouble with lxml-2.3beta1.tar.gz. Libxml2 provides /usr/include/libxml2 but its subdirectory libxml is also searched for by C programs as in #include <libxml/xmlversion.h> Unless configure was clever enough to add -I/usr/include/libxml2, this step will fail to include the needed file. lxml 2.3 happens to fail here on my system.
There’s probably more than one way of fixing this; I just went with the simplest.
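One possible approach, and not necessarily the one used here, is to hand the compiler the include directory yourself. The sketch below assumes the build honors the CFLAGS environment variable. The function name libxml_include_flag is made up for this example.

```shell
# Print a -I flag for the first directory that really contains libxml/xmlversion.h.
# Returns non-zero if none of the given directories has it.
libxml_include_flag() {
    for dir in "$@"; do
        if [ -f "$dir/libxml/xmlversion.h" ]; then
            printf '%s\n' "-I$dir"
            return 0
        fi
    done
    return 1
}
```

Something like CFLAGS="$(libxml_include_flag /usr/include/libxml2)" python setup.py develop would then pass the missing include path along, provided the header lives where libxml2-devel usually puts it.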
Let’s move on. Now you’ll run into an issue with pycassa, that is because “pycassa has moved to http://github.com/pycassa/pycassa“. So, we’ll build this one by hand.
Back to /www/sites/r/reddit.com/r2 and we retry python setup.py install
We now find that it does not detect the newer pycassa. We must edit setup.py to replace the old URL by hand and then retry. Easy. Edit setup.py using your favorite editor (</tongueincheek>), search for pycassa and substitute in the URL above.
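For those who prefer not to open an editor at all, the URL swap can also be scripted. This is a sketch under assumptions: it needs GNU sed for -i, the function name swap_url is made up, and OLD_URL stands for whatever pycassa URL you find in your copy of setup.py (grep -n pycassa setup.py will show it).

```shell
# Replace every occurrence of one URL with another, in place.
# "|" is the sed delimiter, so the slashes in the URLs need no escaping.
# Note: the old URL is treated as a regular expression, so its dots match any character.
swap_url() {
    file="$1"
    old="$2"
    new="$3"
    sed -i "s|$old|$new|g" "$file"
}
```

Usage would be swap_url setup.py "OLD_URL" "http://github.com/pycassa/pycassa", with OLD_URL replaced by the actual string from your setup.py.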
setup.py should now complete successfully. Now onto the almighty make – it should run without issues.
Create a postgres user if you don’t have one. Then we follow the Reddit instructions and create the user and set directory permissions accordingly.
# start and configure rabbitmq for the 1st time
/sbin/service rabbitmq-server start # start RabbitMQ queue service
rabbitmqctl add_vhost /
rabbitmqctl add_user reddit reddit
rabbitmqctl set_permissions -p / reddit ".*" ".*" ".*"
Now we need to work on the Cassandra configuration for the “permacache”. Follow the quick instructions from the “Set up Cassandra” section on the Reddit guide.
If all goes well now, you should now be able to start Reddit.
paster serve --reload example.ini http_port=9090 (I used 9090 because Cassandra took 8080.)
The Reddit system is tested daily by millions of users. It’s a great software system for you to learn more about the RabbitMQ message queue server, Pylons, the amazing Cassandra distributed storage database and improve your overall web development foo, based on a tried and tested code base.
Open source gives us the opportunity to see the inner workings of systems which handle billions of requests daily, something that was completely out of the reach of the student and neophyte back in the 1980s and early 90s. Reading the Reddit source and playing with it live is an opportunity for you to learn or improve your Python skills and to pick up enterprise-level architecture strategies that work.
Reading dead code is one thing. Reading the same code which is currently handling a billion requests somewhere on the WWW is an entirely different experience. It’s something which is always on my mind when I study the Apache sources: this code handles over 60% of all the HTTP requests out there every second. Having access to such code is of incalculable value.
I hope you have fun and learn tons from playing with the Reddit system.
While reading ultramarathoner Dean Karnazes’ latest book, I came across an interesting dieting tip: eat everything the Neanderthal Man ate, and you won’t get fat. This caught my attention because just a while ago I had read Dr. Mark Hyman’s book Ultrametabolism, where he describes a whole food diet which will increase your metabolic rate and keep you burning fat even as you rest.
The logic behind the Neanderthal diet is simple: our digestive system hasn’t changed much in the past 130,000 years. The difference is that nowadays we don’t need to hunt for our food anymore. We just drive our gasoline-powered cars to the nearest McDonald’s and easily inject 2000 calories of energy into our system in just a couple of minutes. No chasing wild animals, no climbing trees after some fruit. The result is obvious: we’re getting fat.
Our ancestors had to spend hours hunting and working hard at obtaining a total average of 2000 calories per day. Today, one milkshake has more sugar than we could hope to find on all the fruit a Neanderthal man could eat.
The human mind and body are incapable of absorbing everything the market offers us. We get more and more of everything, in ever larger extra plus sized portions. All while our body would have been satisfied with a tiny portion of it.
I was surprised to find out that the Neanderthal Man’s diet really does work. I was more alert, I lost weight while eating everything I wanted to eat, and everything seemed to work better in my body. It’s also very simple: you may eat anything the Neanderthal Man could, including grains, vegetables, all meats, fruits and so on. These types of foods do not allow you to overeat, because our body is tuned to be satisfied by just the right amount of them. When a certain amount of fiber and nutrients has been ingested, the body sends you a clear signal that you’re satisfied. Before you add anything to your plate, ask yourself “did this exist 130 thousand years ago?” If the answer is yes, you may go ahead and eat all you want of it.
So, what does any of this have to do with capitalism?
If you look closely, the Neanderthal Man’s diet analogy is perfect for the world we live in.
It is impossible to absorb the surreal amount of information and new products being released every instant. We are becoming physically and mentally obese. We must cultivate the awareness that it is impossible to possess and consume everything that is shown to us at the speed at which it is thrown at us.
It seems like there is an insatiable urge for permanent showbiz. Super-athletes must break their records at every competition and they must live under unreal behavioral restrictions. Stock markets must soar every session or heads will roll, sales have to increase every period, goals must be met under the most stringent deadlines. All these are examples of artificial expectations that are absolutely contrary to what nature tells us to do. In the natural world, everything oscillates, natural progress is slow and constant, and there is no schedule or deadline that can make a tree develop any faster than its DNA was programmed to. Attempts to artificially speed up nature almost always end up being catastrophic.
So why are we trying to speed up our own nature when that is impossible? Everything in nature happens on its own time, completely ignoring the excesses of modern capitalism. We are becoming mentally and physically obese – and we have been submitting ourselves to an artificial and very unhealthy lifestyle for a very long time. For thousands of years, in fact.
Planet Earth is screaming hints at us. Birds are dying, fish are dying, the air is polluted, all while the TV is telling us to use less water. In the meantime, British Petroleum pollutes more water than humanity could consume.
We must hit the brakes, urgently.
The idea is to practice a sort of Neanderthal Capitalism, more primitive and slower, under which we can develop healthily and have time to absorb all that the market has to offer, without getting caught running in the hamster wheel. As with the Neanderthal Diet, no sacrifice is required: just act naturally.
Here are some ideas:
That is part of the recipe for Neanderthal Capitalism.
PS. Someone just wrote me an email about this article (same day it was published, yay!). The email said: “if Neanderthal was so good, why is he extinct?” Good question. I don’t know.