4th November 2020

Running an Ethereum Node on Kubernetes is Easy

On Nov. 11, 2020, people who were running an outdated version of Geth, or who relied on a vendor (Infura in this case), found themselves on a forked Ethereum chain, which in practice is similar to a server going offline. See the post mortem for details.

No need to fear: keeping an Ethereum node properly updated using "the cloud" is a lot easier than people think.

Kubernetes is a "container orchestration platform", which is a fancy term for keeping docker containers running. Despite whatever complaints you've heard, it makes running and upgrading software like Ethereum painless: just update the version in the yaml file and apply it.

You'll just need a big enough machine to run the Geth client. It currently needs ~650GB of disk and that keeps growing, so provision at least 1TB, and I'd recommend at least 8GB of RAM. On Digital Ocean, a memory-optimized 16GB instance costs $80/mo, plus another $100/mo for the 1TB of disk. AWS and GCP might be slightly more expensive (they charge for the Kubernetes cluster). You can expect Geth to finish an initial fast sync in a few hours; the rest of the data will be backfilled over days/weeks.

If you don't have a Kubernetes cluster yet, setting one up is usually "one-click" on Digital Ocean, AWS, or GCP, though it may take a few minutes before the cluster comes online. On your local machine (e.g. laptop), install the kubectl command-line tool and authenticate with the cluster. Most tutorials look daunting, but it really is this easy.
You'll run the following command any time you update geth.yaml (Kubernetes keeps state on the server, so it knows from the type and name whether this "object" needs to be created or updated when you run apply):

kubectl apply -f geth.yaml

where the contents of geth.yaml (it should be human-readable) are:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: geth-mainnet-full
spec:
  serviceName: geth-mainnet-full
  replicas: 1
  selector:
    matchLabels:
      app: geth-mainnet-full
  template:
    metadata:
      labels:
        app: geth-mainnet-full
    spec:
      containers:
        - name: geth-mainnet-full
          image: ethereum/client-go:v1.9.23
          args:
            [
              "--http",
              "--http.addr=0.0.0.0",
              "--http.vhosts=geth-mainnet-full",
              "--http.api=eth,net,web3,txpool",
              "--ws",
              "--ws.addr=0.0.0.0",
              "--datadir=/data",
            ]
          env:
          ports:
            - containerPort: 8545
              name: gethrpc
            - containerPort: 30303
              name: gethdiscovery
          volumeMounts:
            - name: data
              mountPath: "/data"
          resources:
            limits:
              memory: 12000Mi
            requests:
              memory: 10000Mi
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: do-block-storage
        resources:
          requests:
            storage: 1000Gi
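
The serviceName field of a StatefulSet refers to a headless Service that you create yourself, and geth.yaml as written does not define one. If you want other pods in the same namespace to reach the node at the geth-mainnet-full hostname (which is what the --http.vhosts flag suggests), a Service along these lines might do. This is a minimal sketch, not part of the original setup: the helper name geth_service_manifest is hypothetical, and exposing only port 8545 is an assumption.

```shell
# Hypothetical helper: print a headless Service manifest matching the
# StatefulSet's serviceName and pod label.
geth_service_manifest() {
  cat <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: geth-mainnet-full
spec:
  clusterIP: None            # headless: DNS resolves straight to the pod
  selector:
    app: geth-mainnet-full   # same label as the pod template
  ports:
    - name: gethrpc
      port: 8545
EOF
}

# Usage (assumes kubectl is already authenticated with the cluster):
#   geth_service_manifest | kubectl apply -f -
```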

To view geth logs in Kubernetes

Run:

kubectl get pods

to see it running. Then copy the name of the pod from the first column and substitute it for $NAME_OF_POD in the following command:

kubectl logs $NAME_OF_POD
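
Because StatefulSet pods are named after the StatefulSet plus an ordinal, you can usually skip the lookup: with one replica the pod is geth-mainnet-full-0. A small sketch for following the log, where the helper names and the 100-line tail are my own choices and the KUBECTL variable exists only so the command can be swapped out:

```shell
KUBECTL="${KUBECTL:-kubectl}"

# StatefulSet pods are named <statefulset-name>-<ordinal>, starting at 0.
geth_pod_name() {
  echo "${1:-geth-mainnet-full}-${2:-0}"
}

# Follow the pod's log, starting from the last 100 lines.
tail_geth_logs() {
  $KUBECTL logs --follow --tail=100 "$(geth_pod_name "$@")"
}

# Usage:
#   tail_geth_logs
```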

To upgrade

Simply edit the image field (using your favorite text editor), changing v1.9.23 to whatever version you need. Then run:

kubectl apply -f geth.yaml

or edit it live on Kubernetes:

EDITOR=vim kubectl edit statefulset geth-mainnet-full

(Applying a file or editing live updates the desired "state", which triggers a restart of the pod.)
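
If you'd rather script the version bump than open an editor, something like the following could work. It is only a sketch: set_geth_version is a hypothetical helper, it assumes the image line looks exactly like the one in geth.yaml above, and NEW_VERSION in the usage comment is a placeholder for whatever release tag you actually want.

```shell
# Hypothetical helper: rewrite the ethereum/client-go image tag in a manifest.
#   $1 = new tag (e.g. the value of NEW_VERSION), $2 = manifest path
set_geth_version() {
  local version="$1" manifest="${2:-geth.yaml}"
  local tmp status=0
  tmp="$(mktemp)" || return 1
  # Replace whatever follows "image: ethereum/client-go:" on that line,
  # then copy the result back over the original file.
  { sed "s|\(image: ethereum/client-go:\).*|\1${version}|" "$manifest" > "$tmp" \
      && cat "$tmp" > "$manifest"; } || status=$?
  rm -f "$tmp"
  return "$status"
}

# Usage (kubectl diff previews the change before it is applied):
#   set_geth_version "$NEW_VERSION"
#   kubectl diff -f geth.yaml
#   kubectl apply -f geth.yaml
```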

To reboot

(if for whatever reason the instance is unresponsive)

kubectl rollout restart statefulset geth-mainnet-full
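
The restart command returns right away, so it can be handy to wait for the rollout to finish. A sketch, with a hypothetical helper name and an arbitrary 10-minute default timeout:

```shell
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: restart the StatefulSet, then block until the rollout
# completes or the timeout expires.
#   $1 = StatefulSet name, $2 = timeout (both optional)
restart_and_wait() {
  local target="statefulset/${1:-geth-mainnet-full}"
  $KUBECTL rollout restart "$target" \
    && $KUBECTL rollout status "$target" --timeout="${2:-10m}"
}

# Usage:
#   restart_and_wait
```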

To "ssh in"

kubectl exec -it geth-mainnet-full-0 -- sh
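
The same exec mechanism can run a one-off command instead of a shell, which is a quick way to check how full the data volume is getting. This sketch assumes the container image ships a df that accepts -P (worth verifying on your image); the helper names are hypothetical.

```shell
KUBECTL="${KUBECTL:-kubectl}"

# Read `df -P` output on stdin and print the capacity column of the first
# filesystem row (line 2; line 1 is the header).
df_use_percent() {
  awk 'NR == 2 { print $5 }'
}

# Hypothetical helper: report how full /data is inside the pod.
geth_disk_usage() {
  $KUBECTL exec "${1:-geth-mainnet-full-0}" -- df -P /data | df_use_percent
}

# Usage:
#   geth_disk_usage
```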

To proxy to your local machine (laptop)

kubectl port-forward statefulset/geth-mainnet-full 8545:8545
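
With the port-forward running in another terminal, you can talk to the node over JSON-RPC. The sketch below uses curl and the standard eth_blockNumber and eth_syncing methods; the helper names are hypothetical. It targets 127.0.0.1 rather than localhost on the assumption that Geth's --http.vhosts check (set to geth-mainnet-full in geth.yaml) lets IP-address hosts through but may reject other hostnames.

```shell
RPC_URL="${RPC_URL:-http://127.0.0.1:8545}"

# Send a parameterless JSON-RPC call and print the raw JSON response.
rpc_call() {
  curl -s -X POST -H 'Content-Type: application/json' \
    --data "{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"$1\",\"params\":[]}" \
    "$RPC_URL"
}

# Extract a string-valued "result" field from a JSON-RPC response on stdin.
rpc_result() {
  sed -n 's/.*"result" *: *"\([^"]*\)".*/\1/p'
}

# Convert a 0x-prefixed hex quantity to decimal.
hex_to_dec() {
  printf '%d\n' "$1"
}

# Usage:
#   rpc_call eth_syncing
#   hex_to_dec "$(rpc_call eth_blockNumber | rpc_result)"
```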

If you prefer a GUI to monitor your Kubernetes cluster status and tail logs, see https://infra.app/.
